
Table of Contents

    10 July 2022, Volume 42 Issue 7
    Artificial intelligence
    Machine reading comprehension model based on event representation
    Yuanlong WANG, Xiaomin LIU, Hu ZHANG
    2022, 42(7):  1979-1984.  DOI: 10.11772/j.issn.1001-9081.2021050719

    To truly understand a piece of text, it is essential to grasp its main clues during reading comprehension. Aiming at clue-related questions in machine reading comprehension, a machine reading comprehension method based on event representation was proposed. Firstly, a textual event graph covering event representation, event element extraction and event relation extraction was built from the reading material with the help of clue phrases. Secondly, after considering the temporal and emotional elements of events as well as the importance of each word in the document, the TextRank algorithm was used to select the events related to the clues. Finally, the answers to the questions were constructed from the selected clue events. Experimental results show that on a test set of 339 collected clue-related questions, the proposed method outperforms the TextRank-based sentence ranking method on the BiLingual Evaluation Understudy (BLEU) and Consensus-based Image Description Evaluation (CIDEr) metrics; specifically, the BLEU-4 score is increased by 4.1 percentage points and the CIDEr score by 9 percentage points.
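    The abstract does not give implementation details of the event selection step; the sketch below only illustrates TextRank-style scoring over an event similarity graph, using simple word-overlap similarity in place of the paper's time, emotion and word-importance features (the example events are placeholders).

```python
# Minimal sketch of TextRank-style event scoring: build a similarity graph over
# events and rank them with weighted PageRank. Jaccard word overlap stands in
# for the paper's richer similarity features.
import networkx as nx

def rank_events(events, top_k=3):
    def similarity(a, b):
        wa, wb = set(a.split()), set(b.split())
        return len(wa & wb) / (len(wa | wb) or 1)   # Jaccard overlap

    graph = nx.Graph()
    graph.add_nodes_from(range(len(events)))
    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            w = similarity(events[i], events[j])
            if w > 0:
                graph.add_edge(i, j, weight=w)
    scores = nx.pagerank(graph, weight="weight")    # TextRank = weighted PageRank
    order = sorted(scores, key=scores.get, reverse=True)
    return [events[i] for i in order[:top_k]]

print(rank_events([
    "team wins the final match",
    "coach praises the team after the match",
    "weather was cloudy in the city",
]))
```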

    Capsule network knowledge graph embedding model based on relational memory
    Heng CHEN, Siyi WANG, Zhengguang LI, Guanyu LI, Xin LIU
    2022, 42(7):  1985-1992.  DOI: 10.11772/j.issn.1001-9081.2021050764

    As a semantic knowledge base, Knowledge Graph (KG) uses structured triples to store real-world entities and their relationships. In order to infer the missing real triples in a knowledge graph, and considering the strong triple representation ability of the relational memory network and the powerful feature processing ability of the capsule network, a capsule network knowledge graph embedding model based on relational memory was proposed. First, the embedding vectors were formed by encoding the potential dependencies between entities and relationships together with some important information. Then, the embedding vectors were convolved with filters to generate different feature maps, which were recombined into corresponding capsules. Finally, the connections from the parent capsule to the child capsule were specified through the compression function and dynamic routing, and the confidence of the current triple was estimated by the inner product score between the child capsule and the weight. Link prediction results show that compared with the CapsE model, on the Mean Reciprocal Rank (MRR) and Hit@10 evaluation indicators, the proposed model achieves improvements of 7.95% and 2.2 percentage points respectively on WN18RR dataset, and of 3.82% and 2 percentage points respectively on FB15K-237 dataset. Experimental results show that the proposed model can more accurately infer the relationship between the head entity and the tail entity.
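    As a reference for the capsule mechanics mentioned above, the following is a minimal NumPy sketch of the squash compression function and one dynamic-routing loop; dimensions, iteration count and the random inputs are illustrative assumptions, not the paper's configuration.

```python
# Minimal NumPy sketch of the capsule "squash" compression function and a
# dynamic-routing loop between child and parent capsules.
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    norm_sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, iterations=3):
    # u_hat: predictions of child capsules for parent capsules,
    # shape (num_child, num_parent, dim)
    num_child, num_parent, _ = u_hat.shape
    b = np.zeros((num_child, num_parent))                     # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum per parent
        v = squash(s)                                         # parent capsule outputs
        b += np.einsum("ijk,jk->ij", u_hat, v)                # agreement update
    return v

u_hat = np.random.randn(8, 4, 16)
print(dynamic_routing(u_hat).shape)   # (4, 16)
```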

    Real-time semantic segmentation method based on squeezing and refining network
    Juan WANG, Xuliang YUAN, Minghu WU, Liquan GUO, Zishan LIU
    2022, 42(7):  1993-2000.  DOI: 10.11772/j.issn.1001-9081.2021050812

    Aiming at the problem that current semantic segmentation algorithms can hardly strike a balance between real-time reasoning and high-precision segmentation, a Squeezing and Refining Network (SRNet) was proposed to improve the real-time performance of reasoning and the accuracy of segmentation. Firstly, One-Dimensional (1D) dilated convolution and a bottleneck-like structure unit were introduced into the Squeezing and Refining (SR) unit, which greatly reduced the amount of calculation and the number of parameters of the model. Secondly, a multi-scale Spatial Attention (SA) fusion module was introduced to make efficient use of the spatial information of shallow-layer features. Finally, the encoder was formed by stacking SR units, and two SA units were used to form the decoder. Simulation results show that SRNet obtains 68.3% Mean Intersection over Union (MIoU) on Cityscapes dataset with only 30 MB parameters and 8.8×10⁹ FLoating-point OPerations (FLOPs). Besides, the model reaches a forward reasoning speed of 12.6 Frames Per Second (FPS) with input pixel size of 512×1024×3 on a single NVIDIA Titan RTX card. Experimental results imply that the designed lightweight model SRNet reaches a good balance between accurate segmentation and real-time reasoning, and is suitable for scenarios with limited computing power and power consumption.

    Named entity recognition method combining multiple semantic features
    Yayao ZUO, Haoyu CHEN, Zhiran CHEN, Jiawei HONG, Kun CHEN
    2022, 42(7):  2001-2008.  DOI: 10.11772/j.issn.1001-9081.2021050861

    Aiming at the common non-linear relationships between characters in languages, and in order to capture richer semantic features, a Named Entity Recognition (NER) method based on Graph Convolutional Network (GCN) and self-attention mechanism was proposed. Firstly, leveraging the effective character feature extraction ability of deep learning methods, the GCN was used to learn the global semantic features between characters, and the Bidirectional Long Short-Term Memory network (BiLSTM) was used to extract the context-dependent features of the characters. Secondly, the above features were fused, and a self-attention mechanism was introduced to calculate their internal importance. Finally, the Conditional Random Field (CRF) was used to decode the optimal coding sequence from the fused features as the entity recognition result. Experimental results show that compared with the method that only uses BiLSTM or CRF, the proposed method has the recognition precision increased by 2.39% and 15.2% respectively on MicroSoft Research Asia (MSRA) dataset and Biomedical Natural Language Processing/Natural Language Processing in Biomedical Applications (BioNLP/NLPBA) 2004 dataset, indicating that this method has good sequence labeling capability on both Chinese and English datasets, and has strong generalization capability.

    Sensitive information detection method based on attention mechanism-based ELMo
    Cheng HUANG, Qianrui ZHAO
    2022, 42(7):  2009-2014.  DOI: 10.11772/j.issn.1001-9081.2021050877

    In order to solve the problems of low accuracy and poor generalization of traditional sensitive information detection methods such as keyword character matching and phrase-level sentiment analysis, a sensitive information detection method based on Attention mechanism-based Embedding from Language Model (A-ELMo) was proposed. Firstly, quick matching based on a trie tree was performed to significantly reduce the comparison of useless words, thereby greatly improving query efficiency. Secondly, an Embedding from Language Model (ELMo) was constructed for context analysis, and dynamic word vectors were used to fully represent contextual characteristics and achieve high scalability. Finally, the attention mechanism was combined to enhance the model's ability to identify sensitive features and further improve the detection rate of sensitive information. Experiments were carried out on real datasets composed of multiple network data sources. The results show that the accuracy of the proposed sensitive information detection method is improved by 13.3 percentage points compared with that of the phrase-level sentiment analysis-based method, and by 43.5 percentage points compared with that of the keyword matching-based method, verifying that the proposed method has advantages in enhancing the identification of sensitive features and improving the detection rate of sensitive information.
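    A minimal sketch of the trie-based quick matching step is given below; the sensitive word list and text are placeholders, and the rest of the A-ELMo pipeline is not reproduced.

```python
# Minimal sketch of trie-based keyword pre-filtering: build a character trie
# over the sensitive word list, then scan the text for matches.
class Trie:
    def __init__(self, words):
        self.root = {}
        for w in words:
            node = self.root
            for ch in w:
                node = node.setdefault(ch, {})
            node["#"] = w                      # end-of-word marker

    def scan(self, text):
        hits = []
        for i in range(len(text)):
            node = self.root
            for ch in text[i:]:
                if ch not in node:
                    break
                node = node[ch]
                if "#" in node:
                    hits.append((i, node["#"]))
        return hits

trie = Trie(["password", "secret key"])
print(trie.scan("the secret key and password are stored here"))
```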

    Deep hashing retrieval algorithm based on meta-learning
    Yaru HAN, Lianshan YAN, Tao YAO
    2022, 42(7):  2015-2021.  DOI: 10.11772/j.issn.1001-9081.2021040660

    With the development of mobile Internet technology, the scale of image data is getting larger and larger, and large-scale image retrieval has become an urgent problem. Due to its fast retrieval speed and very low storage consumption, the hashing algorithm has received extensive attention from researchers. Deep learning based hashing algorithms need a certain amount of high-quality training data to train the model and improve retrieval performance. However, the existing hashing methods usually ignore the problem of class imbalance in the dataset, which may reduce retrieval performance. Aiming at this problem, a deep hashing retrieval algorithm based on a meta-learning network was proposed, which can automatically learn the weighting function directly from the data. The weighting function is a Multi-Layer Perceptron (MLP) with only one hidden layer. Under the guidance of a small amount of unbiased meta data, the parameters of the weighting function were optimized and updated simultaneously with the model parameters during training. The updating equations of the meta-learning network parameters can be interpreted as increasing the weights of samples that are consistent with the meta data and reducing the weights of samples that are not. With the proposed algorithm, the impact of imbalanced data on image retrieval was effectively reduced and the robustness of the model was improved. A large number of experiments were conducted on widely used benchmark datasets such as CIFAR-10. The results show that the mean Average Precision (mAP) of the hashing algorithm based on meta-learning network is the highest under large imbalance ratios; especially, under the condition of imbalance ratio = 200, the mAP of the proposed algorithm is 0.54 percentage points, 30.93 percentage points and 48.43 percentage points higher than those of the central similarity quantization algorithm, the Asymmetric Deep Supervised Hashing (ADSH) algorithm and the Fast Scalable Supervised Hashing (FSSH) algorithm respectively.
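    The weighting function itself is simple enough to sketch: a one-hidden-layer MLP mapping a per-sample loss to a weight in (0, 1). The PyTorch code below is an illustrative sketch under that description; the hidden size and the meta-data-guided update loop are assumptions.

```python
# Minimal PyTorch sketch of a learnable sample-weighting function: an MLP with
# one hidden layer that maps each sample's loss to a weight in (0, 1).
import torch
import torch.nn as nn

class WeightingMLP(nn.Module):
    def __init__(self, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),          # weight for each sample in (0, 1)
        )

    def forward(self, per_sample_loss):
        # per_sample_loss: shape (batch,) -> weights of shape (batch,)
        return self.net(per_sample_loss.unsqueeze(1)).squeeze(1)

weight_fn = WeightingMLP()
losses = torch.rand(16)                       # placeholder per-sample training losses
weighted_loss = (weight_fn(losses) * losses).mean()
weighted_loss.backward()                      # meta data would guide this update in the paper
```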

    Analysis and improvement of AdaBoost’s sample weight and combination coefficient
    Liang ZHU, Hua XU, Jinhai CHENG, Shen ZHU
    2022, 42(7):  2022-2029.  DOI: 10.11772/j.issn.1001-9081.2021050726

    Aiming at the problems of low linear combination efficiency of base classifiers and too much attention to hard examples in the Adaptive Boosting (AdaBoost) algorithm, two improved algorithms based on margin theory, named sample Weight and Parameterization of Improved AdaBoost (WPIAda) and sample Weight and Parameterization of Improved AdaBoost-Multitude (WPIAda.M), were proposed. Firstly, both WPIAda and WPIAda.M divided the updates of sample weights into four situations, which increased the weights of samples whose margin changed from positive to negative to suppress the negative movement of the margin and reduce the number of samples with a margin of zero. Secondly, according to the error rates of the base classifiers and the distribution of the sample weights, a new method to solve the coefficients of base classifiers was given by WPIAda.M, thereby improving the combination efficiency of base classifiers. On 10 UCI datasets, compared with algorithms such as WLDF_Ada (dfAda), skAda and SWA-Adaboost (swaAda), WPIAda and WPIAda.M had the test error reduced by 7.46 and 7.64 percentage points on average respectively, and the Area Under Curve (AUC) increased by 11.65 and 11.92 percentage points respectively. Experimental results show that WPIAda and WPIAda.M can effectively reduce the attention paid to hard examples, and WPIAda.M can integrate base classifiers more efficiently, so that both algorithms further improve classification performance.
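    The margin that both algorithms monitor is the normalized, label-weighted vote of the base classifiers. The sketch below only shows how that margin is computed; the paper's four weight-update cases and coefficient solving method are not reproduced.

```python
# Minimal sketch of the per-sample margin used in margin-theory analyses of
# AdaBoost: the normalized, label-weighted vote of the base classifiers.
import numpy as np

def sample_margins(base_preds, alphas, y):
    # base_preds: (T, n) predictions in {-1, +1}; alphas: (T,); y: (n,) labels
    votes = (alphas[:, None] * base_preds).sum(axis=0)
    return y * votes / np.abs(alphas).sum()      # margins lie in [-1, 1]

base_preds = np.sign(np.random.randn(5, 8))
alphas = np.random.rand(5)
y = np.sign(np.random.randn(8))
print(sample_margins(base_preds, alphas, y))
```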

    Lightweight face recognition method based on deep residual network
    Huaiqing HE, Jianqing YAN, Kanghua HUI
    2022, 42(7):  2030-2036.  DOI: 10.11772/j.issn.1001-9081.2021050880

    Since deep residual networks suffer from problems such as complex network structure and high time cost in face recognition applications on small mobile devices, a lightweight model based on deep residual network was proposed. Firstly, by simplifying and optimizing the structure of the deep residual network and combining the knowledge transfer method, a lightweight residual network (student network) was reconstructed from the deep residual network (teacher network), which reduced the structural complexity of the network while ensuring accuracy. Then, in the student network, the parameters of the model were reduced by decomposing the standard convolution, thereby reducing the time complexity of the feature extraction network. Experimental results show that on four different datasets, namely LFW (Labeled Faces in the Wild), VGG-Face (Visual Geometry Group Face), AgeDB (Age Database) and CFP-FP (Celebrities in Frontal Profile with Frontal-Profile), with recognition accuracy close to that of mainstream face recognition methods, the proposed model achieves an inference time of 16 ms per image, and the speed is increased by 10% to 20%. Therefore, the proposed model effectively improves inference speed with almost no loss of recognition accuracy.
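    The decomposition of a standard convolution mentioned above is commonly realized as a depthwise plus pointwise pair. The PyTorch sketch below illustrates that factorization and its parameter saving; channel sizes are illustrative, not the paper's configuration.

```python
# Minimal PyTorch sketch of decomposing a standard convolution into a
# depthwise + pointwise pair and comparing parameter counts.
import torch
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, k=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch, bias=False),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1, bias=False),                               # pointwise
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

standard = nn.Conv2d(64, 128, 3, padding=1, bias=False)
separable = depthwise_separable(64, 128)
n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(standard), n_params(separable))   # the separable pair uses far fewer parameters
x = torch.randn(1, 64, 56, 56)
print(separable(x).shape)                        # torch.Size([1, 128, 56, 56])
```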

    Face liveness detection based on InceptionV3 and feature fusion
    Ruijie YANG, Guilin ZHENG
    2022, 42(7):  2037-2042.  DOI: 10.11772/j.issn.1001-9081.2021050814

    Aiming at the photo spoofing problem that often occurs in identity verification, a face liveness detection model based on InceptionV3 and feature fusion, called InceptionV3 and Feature Fusion (InceptionV3_FF), was proposed. Firstly, the InceptionV3 model was pretrained on ImageNet dataset. Secondly, the shallow, middle and deep features of the image were obtained from different layers of the InceptionV3 model. Thirdly, these features were fused to obtain the final features. Finally, a fully connected layer was used to classify the features, achieving end-to-end training. The InceptionV3_FF model was evaluated on NUAA dataset and a self-made STAR dataset. Experimental results show that the proposed InceptionV3_FF model achieves accuracies of 99.96% and 98.85% on NUAA dataset and STAR dataset respectively, which are higher than those of the InceptionV3 transfer learning and transfer fine-tuning models. Compared with Nonlinear Diffusion-CNN (ND-CNN), Diffusion Kernel (DK), Heterogeneous Kernel-Convolutional Neural Network (HK-CNN) and other models, the InceptionV3_FF model has higher accuracy on NUAA dataset and has certain advantages. The InceptionV3_FF model only takes 4 ms to recognize a single image randomly selected from the dataset. The face liveness detection system consisting of the InceptionV3_FF model and OpenCV can distinguish real and fake faces.

    Video playback speed recognition based on deep neural network
    Rongyuan CHEN, Jianmin YAO, Qun YAN, Zhixian LIN
    2022, 42(7):  2043-2051.  DOI: 10.11772/j.issn.1001-9081.2021050799

    Most current video playback speed recognition algorithms have poor extraction accuracy and many model parameters. Aiming at these problems, a dual-branch lightweight video playback speed recognition network was proposed. Firstly, this network was a Three-Dimensional (3D) convolutional network constructed on the basis of the SlowFast dual-branch network architecture. Secondly, in order to deal with the large number of parameters and floating-point operations of the S3D-G (Separable 3D convolutions network with Gating mechanism) network in video playback speed recognition tasks, a lightweight adjustment of the network structure was carried out. Finally, the Efficient Channel Attention (ECA) module was introduced into the network structure to generate the channel range corresponding to the focused content through channel attention, which helped to improve the accuracy of video feature extraction. In experiments, the proposed network was compared with S3D-G and SlowFast networks on the Kinetics-400 dataset. Experimental results show that with similar accuracy, the proposed network reduces both model size and the number of model parameters by about 96% compared with the SlowFast network, and the number of floating-point operations is reduced to 5.36 GFLOPs, which means the running speed is increased significantly.
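    For reference, the ECA module referred to above is typically global average pooling followed by a small 1D convolution across channels and a sigmoid gate. The PyTorch sketch below shows this in its 2D form; the kernel size and the 3D adaptation used in the paper are not reproduced.

```python
# Minimal PyTorch sketch of an Efficient Channel Attention (ECA) block:
# global average pooling, a 1D convolution over the channel axis, and a
# sigmoid gate that rescales the input feature map.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                         # x: (N, C, H, W)
        y = x.mean(dim=(2, 3))                    # global average pooling -> (N, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # 1D conv across channels
        return x * self.sigmoid(y)[:, :, None, None]

x = torch.randn(2, 64, 14, 14)
print(ECA()(x).shape)                             # torch.Size([2, 64, 14, 14])
```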

    Group activity recognition based on partitioned attention mechanism and interactive position relationship
    Bo LIU, Linbo QING, Zhengyong WANG, Mei LIU, Xue JIANG
    2022, 42(7):  2052-2057.  DOI: 10.11772/j.issn.1001-9081.2021060904

    Group activity recognition is a challenging task in complex scenes, which involves the interactions and relative spatial position relationships of a group of people in the scene. Current group activity recognition methods either lack fine-grained design or do not take full advantage of the interactive features among individuals. Therefore, a network framework based on a partitioned attention mechanism and interactive position relationships was proposed, which further considered the semantic features of individual limbs and explored the relationship between interaction feature similarity and behavior consistency among individuals. Firstly, the original video sequences and optical flow image sequences were used as the input of the network, and a partitioned attention feature module was introduced to refine the limb motion features of individuals. Secondly, the spatial positions and interactive distances were taken as individual interaction features. Finally, the individual motion features and spatial position relation features were fused as the features of the nodes of the group scene undirected graph, and a Graph Convolutional Network (GCN) was adopted to further capture the activity interactions in the global scene, thereby recognizing the group activity. Experimental results show that this framework achieves 92.8% and 97.7% recognition accuracy on two group activity recognition datasets (CAD (Collective Activity Dataset) and CAE (Collective Activity Extended Dataset)). Compared with Actor Relationship Graph (ARG) and Confidence Energy Recurrent Network (CERN) on CAD dataset, this framework has the recognition accuracy improved by 1.8 percentage points and 5.6 percentage points respectively. At the same time, the results of the ablation experiment show that the proposed algorithm achieves better recognition performance.

    Human activity recognition based on progressive neural architecture search
    Zhenyu WANG, Lei ZHANG, Wenbin GAO, Weiming QUAN
    2022, 42(7):  2058-2064.  DOI: 10.11772/j.issn.1001-9081.2021050798

    Concerning the sensor data based activity recognition problem, a deep Convolutional Neural Network (CNN) was used to perform activity recognition on the public OPPORTUNITY sensor dataset, and an improved Progressive Neural Architecture Search (PNAS) algorithm was proposed. Firstly, in the process of neural network model design, without manual selection of a suitable topology, the PNAS algorithm was used to design the optimal topology in order to maximize the F1 score. Secondly, a Sequential Model-Based Optimization (SMBO) strategy was used, in which the structure space was searched in order of increasing complexity, while a surrogate function was learned to guide the search of the structure space. Finally, the top 20 models with the best performance in the search process were fully trained on OPPORTUNITY dataset, and the best performing one was selected as the searched optimal architecture. The F1 score of the optimal architecture searched in this way reaches 93.08% on OPPORTUNITY dataset, which is 1.34% and 1.73% higher than those of the optimal architecture searched by the evolutionary algorithm and of DeepConvLSTM respectively, indicating that the proposed method can improve previously manually designed architectures and is feasible and effective.

    EfficientNet based dual-branch multi-scale integrated learning for pedestrian re-identification
    Tianhao QIU, Shurong CHEN
    2022, 42(7):  2065-2071.  DOI: 10.11772/j.issn.1001-9081.2021050852

    In order to deal with the problem of low pedestrian re-identification rate in video images caused by small target pedestrians, occlusions and variable pedestrian postures, a dual-branch multi-scale integrated learning method was established based on the efficient network EfficientNet. Firstly, the EfficientNet-B1 (EfficientNet-Baseline1) network was used as the backbone structure. Secondly, a weighted Bidirectional Feature Pyramid Network (BiFPN) branch was used to integrate the extracted global features at different scales, obtaining global features with different semantic information to improve the identification rate of small target pedestrians. Thirdly, a PCB (Part-based Convolutional Baseline) branch was used to extract deep local features to mine non-significant information of pedestrians and reduce the influence of pedestrian occlusion and posture variability on the identification rate. Finally, in the training stage, the pedestrian features extracted by the two branch networks were fed to the Softmax loss function to obtain different sub-loss functions, which were added for joint representation. In the test stage, the obtained global features and deep local features were spliced and fused, and the Euclidean distance was calculated to obtain the pedestrian re-identification matching results. The Rank-1 accuracy of this method reaches 95.1% and 89.1% on Market1501 and DukeMTMC-Reid datasets respectively, which is 3.9 percentage points and 2.3 percentage points higher than that of the original backbone structure respectively. Experimental results show that the proposed model effectively improves the accuracy of pedestrian re-identification.
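    The test-stage matching described above reduces to concatenating global and local features and ranking gallery samples by Euclidean distance; the NumPy sketch below illustrates that step with placeholder feature dimensions and random data.

```python
# Minimal sketch of test-stage re-identification matching: fuse global and
# local feature vectors by concatenation and rank the gallery by Euclidean
# distance to the query.
import numpy as np

def rank_gallery(query_global, query_local, gallery_global, gallery_local):
    q = np.concatenate([query_global, query_local])                # fused query feature
    g = np.concatenate([gallery_global, gallery_local], axis=1)    # fused gallery features
    dists = np.linalg.norm(g - q, axis=1)                          # Euclidean distances
    return np.argsort(dists)                                       # best match first

rng = np.random.default_rng(0)
order = rank_gallery(rng.normal(size=256), rng.normal(size=512),
                     rng.normal(size=(10, 256)), rng.normal(size=(10, 512)))
print(order[:3])
```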

    Music genre classification algorithm based on attention spectral-spatial feature
    Wanjun LIU, Jiaming WANG, Haicheng QU, Libing DONG, Xinyu CAO
    2022, 42(7):  2072-2077.  DOI: 10.11772/j.issn.1001-9081.2021050740

    In order to improve the ability of deep convolutional neural networks to extract genre features from music spectrograms, a music genre classification algorithm model based on attention spectral-spatial features, namely DCNN-SSA (Deep Convolutional Neural Network Spectral Spatial Attention), was proposed. In DCNN-SSA, the genre features of different music Mel spectrograms were effectively annotated in the spatial domain, and the network structure was changed to improve the feature extraction effect while ensuring the effectiveness of the model, thereby improving the accuracy of music genre classification. Firstly, the original audio signals were Mel-filtered to effectively capture the sound intensity and rhythm changes of the music by simulating the filtering operation of the human ear, and the generated Mel spectrograms were cut and input into the network. Then, the model was enhanced in genre feature extraction by deepening the number of network layers, changing the convolution structure and adding a spatial attention mechanism. Finally, through multiple batches of training and validation on the dataset, the features of music genres were extracted and learned effectively, and a model that can effectively classify music genres was obtained. Experimental results on GTZAN dataset show that compared with other deep learning models, the music genre classification algorithm based on spatial attention increases the classification accuracy by 5.36 to 10.44 percentage points and improves the convergence of the model.
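    The Mel-filtering step can be sketched with librosa as below; the file path, sampling rate and spectrogram parameters are placeholders rather than the paper's settings.

```python
# Minimal sketch of producing a log-scaled Mel spectrogram from an audio clip,
# the kind of input representation the network consumes.
import librosa
import numpy as np

y, sr = librosa.load("example_track.wav", sr=22050, duration=30.0)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                     hop_length=512, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)   # log-scaled Mel spectrogram
print(mel_db.shape)                             # (n_mels, frames), fed to the CNN
```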

    Data science and technology
    Outlier detection algorithm based on autoencoder and ensemble learning
    Yiyang GUO, Jiong YU, Xusheng DU, Shaozhi YANG, Ming CAO
    2022, 42(7):  2078-2087.  DOI: 10.11772/j.issn.1001-9081.2021050743
    Asbtract ( )   HTML ( )   PDF (2364KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    The outlier detection algorithm based on autoencoder is easy to over-fit on small- and medium-sized datasets, and the traditional outlier detection algorithms based on ensemble learning do not optimize or select the base detectors, resulting in low detection accuracy. Aiming at the above problems, an Ensemble learning and Autoencoder-based Outlier Detection (EAOD) algorithm was proposed. Firstly, the outlier values and outlier label values of the data objects were obtained by randomly changing the connection structure of the autoencoder to generate different base detectors. Secondly, a local region around each object was constructed according to the Euclidean distances between data objects calculated by the nearest neighbor algorithm. Finally, based on the similarity between the outlier values and the outlier label values, the base detectors with strong detection ability in the region were selected and combined, and the combined object outlier value was used as the final outlier value judged by the EAOD algorithm. In the experiments, compared with the AutoEncoder (AE) algorithm, the proposed algorithm has the Area Under receiver operating characteristic Curve (AUC) and Average Precision (AP) scores increased by 8.08 percentage points and 9.17 percentage points respectively on Cardio dataset; compared with the Feature Bagging (FB) ensemble learning algorithm, the proposed algorithm has the detection time cost reduced by 21.33% on Mnist dataset. Experimental results show that the proposed algorithm has good detection performance and real-time performance under unsupervised learning.
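    The building block of the base detectors is an autoencoder whose reconstruction error serves as an outlier score. The PyTorch sketch below shows that scoring; the architecture, training loop and the random structural variation across detectors are illustrative assumptions.

```python
# Minimal PyTorch sketch of using an autoencoder's reconstruction error as an
# outlier score: larger reconstruction error = more anomalous object.
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, dim, hidden=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.dec(self.enc(x))

x = torch.randn(256, 16)                              # placeholder data objects
model = AE(16)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):                                  # short illustrative training loop
    opt.zero_grad()
    loss = ((model(x) - x) ** 2).mean()
    loss.backward()
    opt.step()
outlier_scores = ((model(x) - x) ** 2).mean(dim=1)    # per-object reconstruction error
print(outlier_scores.topk(5).indices)                 # indices of the most anomalous objects
```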

    Hyperspectral clustering algorithm by double dimension-reduction based on super-pixel and anchor graph
    Xingjin LAI, Zhiyuan ZHENG, Xiaoyan DU, Sha XU, Xiaojun YANG
    2022, 42(7):  2088-2093.  DOI: 10.11772/j.issn.1001-9081.2021050825
    Asbtract ( )   HTML ( )   PDF (1709KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Traditional spectral clustering algorithms are difficult to apply to large-scale hyperspectral images, and the existing improved spectral clustering algorithms are not effective in processing large-scale hyperspectral images. To address these problems, a hyperspectral clustering algorithm based on double dimension-reduction of super-pixels and anchor graph was proposed to reduce the complexity of the clustering data, that is, the computational cost of the clustering process, thereby improving the clustering performance in many aspects. Firstly, Principal Component Analysis (PCA) was performed on the hyperspectral image data, and dimension-reduction was carried out on the data based on super-pixel segmentation according to the regional characteristics of the hyperspectral image. Then, the anchor points of the data obtained in the previous step were selected following the idea of constructing an anchor graph, and the adjacent anchor graph was constructed to achieve double dimension-reduction for spectral clustering. At the same time, in order to avoid manual parameter tuning during the operation of the algorithm, a kernel-free anchor graph construction method without the Gaussian kernel was used to achieve automatic graph construction. Experimental results on Indian Pines dataset and Salinas dataset show that the proposed algorithm can improve the overall clustering effect while guaranteeing availability and low time consumption, thus verifying that the proposed algorithm can improve the quality and performance of clustering.

    Cyber security
    Attribute based encryption scheme based on elliptic curve cryptography and supporting revocation
    Jingyu SUN, Jiayu ZHU, Ziqiang TIAN, Guozhen SHI, Chuanjiang GUAN
    2022, 42(7):  2094-2103.  DOI: 10.11772/j.issn.1001-9081.2021040602

    In scenarios where the resources of cloud terminal users are limited, traditional attribute based encryption schemes suffer from high computing cost and cannot achieve real-time revocation. In order to realize safe and efficient sharing of cloud data, an attribute based encryption scheme based on the Elliptic Curve Cryptography (ECC) algorithm and supporting fine-grained revocation was proposed. In the scheme, the relatively lightweight scalar multiplication on the elliptic curve was used to replace the bilinear pairing with higher computational cost in traditional attribute based encryption schemes, thereby reducing the computational cost of users during decryption, improving the efficiency of the system and making the scheme more suitable for resource constrained cloud terminal user scenarios. In order to reduce the redundant attributes embedded in the ciphertext and shorten the ciphertext length, the more expressive and computationally efficient Ordered Binary Decision Diagram (OBDD) structure was used to describe the user-defined access policy. An attribute group composed of the users holding an attribute was established for each attribute, and a unique user attribute group key was generated for each member of the group. When attribute revocation occurred, the minimum subset cover technique was used to generate a new attribute group for the remaining members of the group to realize real-time fine-grained attribute revocation. Security analysis shows that the proposed scheme has indistinguishability under chosen plaintext attacks, forward security and backward security. Performance analysis shows that the proposed scheme outperforms the (t, n) threshold secret sharing scheme and the Linear Secret Sharing Scheme (LSSS) in terms of access structure expressiveness and computing capability, and its decryption efficiency meets the need of resource constrained cloud terminal users.
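    The lightweight operation the scheme relies on, elliptic-curve scalar multiplication, can be sketched with the classic double-and-add method; the toy curve and point below are for illustration only and are not a production curve or part of the proposed scheme.

```python
# Minimal sketch of elliptic-curve scalar multiplication (double-and-add) over
# a toy prime field; curve parameters are illustrative, not cryptographic.
P_MOD = 97                      # small prime modulus (toy example)
A, B = 2, 3                     # curve y^2 = x^3 + 2x + 3 over GF(97)

def point_add(P, Q):
    if P is None: return Q      # None represents the point at infinity
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return (x3, (m * (x1 - x3) - y1) % P_MOD)

def scalar_mult(k, P):
    result, addend = None, P
    while k:
        if k & 1:
            result = point_add(result, addend)   # add when the bit is set
        addend = point_add(addend, addend)       # double each step
        k >>= 1
    return result

G = (3, 6)                                       # a point on the toy curve
print(scalar_mult(13, G))
```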

    Internet of things access control model based on blockchain and edge computing
    Jie ZHANG, Shanshan XU, Lingyun YUAN
    2022, 42(7):  2104-2111.  DOI: 10.11772/j.issn.1001-9081.2021040626

    The emergence of edge computing has expanded the scope of the Internet of Things (IoT) cloud-terminal architecture. While reducing the transmission and processing delays of massive data on terminal devices, it also brings new security issues. Aiming at the data security and management issues between IoT edge nodes and massive heterogeneous devices, and considering that blockchain technology is widely used in the security management of data in distributed systems, an IoT access control model, Smart Contract for Attribute-Based Access Control (SC-ABAC), was proposed based on blockchain and edge computing. Firstly, an IoT access control architecture integrated with edge computing was proposed, and SC-ABAC was designed by combining smart contracts with Attribute-Based Access Control (ABAC). Then, an optimization of the Proof of Work (PoW) consensus algorithm and the access control management flow of SC-ABAC were given. Experimental results show that the time consumed by the proposed model increases linearly with the number of accesses under continuous access to the block, the Central Processing Unit (CPU) utilization rate is stable, and CPU security is good during the continuous access process. In this model, the time consumption of calling contracts in the query process increases only linearly with the number of calls, the time consumptions of the policy addition and judgment processes are both constant, and the optimized consensus mechanism consumes about 18.37 percentage points less time than PoW consensus per 100 blocks. Therefore, the proposed model can provide decentralized, fine-grained and dynamic access control management in the IoT environment, and can reach consensus faster in a distributed system to ensure data consistency.

    Improved consensus algorithm based on binomial swap forest and HotStuff
    Chunming TANG, Yuqing CHEN, Zidi ZHANG
    2022, 42(7):  2112-2117.  DOI: 10.11772/j.issn.1001-9081.2021040659

    Aiming at the problems of Byzantine Fault Tolerant (BFT) consensus mechanisms in the blockchain such as high communication complexity, complex view change and poor scalability, a consensus algorithm based on binomial swap forest and HotStuff named HSP (HotStuff Plus) consensus algorithm was proposed. In order to realize signature batch verification and signature aggregation, the Boneh-Lynn-Shacham (BLS) signature algorithm was adopted; in order to reduce the communication complexity of the system, threshold signature technology was adopted; in order to reduce the communication complexity during view change, the consensus process adopted a three-phase confirmation method; in order to reduce the number of communications between the primary and secondary nodes and reduce the pressure on the primary node when aggregating signatures, an improved binomial swap forest technology was adopted. Test results show that when the total number of system nodes is 64 and the request and reply are both 256 bytes, the throughput of HSP consensus algorithm is 33.8% higher than that of HotStuff consensus mechanism, and the consensus delay of HSP consensus algorithm is 16.4% lower than that of HotStuff consensus mechanism. It can be seen that HSP consensus algorithm has better performance when the number of nodes is large.

    Intrusion detection system with dynamic weight loss function based on internet of things platform
    Ning DONG, Xiaorong CHENG, Mingquan ZHANG
    2022, 42(7):  2118-2124.  DOI: 10.11772/j.issn.1001-9081.2021040692

    With the increasing number of Internet of Things (IoT) access devices and the lack of security awareness concerning IoT devices among network management and maintenance staff, attacks in IoT environments and on IoT devices spread gradually. In order to strengthen network security in the IoT environment, an intrusion detection dataset based on an IoT platform was used, the Convolutional Neural Network (CNN) + Long Short-Term Memory (LSTM) network was adopted as the model architecture, CNN was used to extract spatial features of the data and LSTM to extract temporal features, the cross-entropy loss function was improved into a dynamic weight cross-entropy loss function, and an Intrusion Detection System (IDS) for the IoT environment was built. Experiments were designed and analyzed with accuracy, precision, recall and F1-Measure as evaluation metrics. Experimental results show that compared with the model using the traditional cross-entropy loss function, the proposed model using the dynamic weight loss function under the CNN-LSTM architecture improves the F1-Measure for Address Resolution Protocol (ARP) samples in the dataset by 47 percentage points, and by 2 to 10 percentage points for the other minority class samples, which verifies that the dynamic weight loss function can enhance the model's ability to discriminate minority class samples and that this method can improve the IDS's ability to judge minority class attack samples.
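    One plausible form of a dynamic weight cross-entropy loss is to recompute class weights from the class frequencies of each batch; the PyTorch sketch below shows that idea, not the paper's exact weighting rule.

```python
# Minimal PyTorch sketch of a class-weighted cross-entropy loss whose weights
# are recomputed per batch so that rarer classes receive larger weights.
import torch
import torch.nn.functional as F

def dynamic_weight_ce(logits, targets, num_classes):
    counts = torch.bincount(targets, minlength=num_classes).float()
    weights = counts.sum() / (counts + 1.0)          # rarer classes get larger weights
    weights = weights / weights.sum() * num_classes  # normalize around 1
    return F.cross_entropy(logits, targets, weight=weights)

logits = torch.randn(32, 5)
targets = torch.randint(0, 5, (32,))
print(dynamic_weight_ce(logits, targets, num_classes=5))
```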

    Malware propagation model based on characteristic behavior detection in P2P networks
    Hanlun LI, Jianguo REN
    2022, 42(7):  2125-2131.  DOI: 10.11772/j.issn.1001-9081.2021040625

    Concerning the problem that existing malware propagation models lack a mechanism for real-time detection of new malware and dynamic sharing of prevention and control information between nodes in Peer-to-Peer (P2P) networks, a detection-propagation model was established based on malware characteristic behavior detection technology. Firstly, based on the classic Susceptible-Infected-Recovered (SIR) propagation model, broadcast nodes were introduced (broadcast nodes refer to special nodes that generate prevention and control information after successfully detecting files containing malware and continuously send this message to neighbor nodes). The model with broadcast nodes can effectively reduce the risk of nodes themselves being infected through detection technology and can restrain the spread of malware in the network by dynamically sharing malware information between nodes. Then, the equilibrium points were calculated and the propagation threshold of the model was obtained by the next generation matrix theory. Finally, the local stability and global stability of the equilibrium points of the model were proved by the Hurwitz criterion and by constructing a Lyapunov function. Experimental results show that when the propagation threshold is less than 1, compared with the degraded SIR model, under detection rates of 0.5, 0.7 and 0.9, the proposed detection-propagation model has the total number of infected nodes at the peak point decreased by 41.37%, 48.23% and 48.64% respectively. Therefore, the detection-propagation model based on characteristic behavior detection technology can restrain the rapid propagation of malware in the network at an early stage, and the higher the detection rate, the better the containment effect.
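    For context, the classic SIR model that the detection-propagation model extends can be integrated numerically as sketched below; the broadcast-node terms of the paper are not reproduced and the parameters are illustrative.

```python
# Minimal NumPy sketch of numerically integrating the classic SIR model
# (susceptible -> infected -> recovered) with simple Euler steps.
import numpy as np

def simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, steps=200, dt=0.5):
    s, i, r = s0, i0, 0.0
    history = []
    for _ in range(steps):
        new_inf = beta * s * i * dt       # susceptible -> infected
        new_rec = gamma * i * dt          # infected -> recovered
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append((s, i, r))
    return np.array(history)

traj = simulate_sir()
print(f"peak infected fraction: {traj[:, 1].max():.3f}")
```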

    Advanced computing
    Public transportation epidemic monitoring system based on edge computing
    Huiwen XIA, Zhongyu ZHAO, Zhuoer WANG, Qingyong ZHANG, Feng PENG
    2022, 42(7):  2132-2138.  DOI: 10.11772/j.issn.1001-9081.2021050727
    Asbtract ( )   HTML ( )   PDF (1577KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In view of the existing monitoring systems' inability to cope with the problems of cross-infection and difficult traceability in the epidemic environment, a design scheme for a public transportation monitoring system based on edge computing was proposed. Firstly, a graph database was established to store passenger and ride information, and a dual database model was used to prevent the blocking caused by index building, thereby achieving a balance between insertion efficiency and search efficiency. Then, in the extraction of vehicle and human image information, the HSV (Hue Saturation Value) color space was used to preprocess the image, and a three-dimensional face model was established to improve the recognition accuracy of the neural network. When the subject wore a mask, the feature point information could be regressed from the visible nose tip feature points, lower jaw feature points and unobstructed nose bridge feature points. Finally, k-hop search was used to find close contacts quickly. In the feature comparison test, the correct rates of this model are 99.44% and 99.23% on BioID dataset and PubFig dataset respectively, and the false negative rates of the model on the two datasets are both less than 0.01%. In the graph search efficiency test, there is no big difference between the graph database and the relational database when searching at a shallow level, while the graph database is more efficient as the search goes deeper. After verifying the theoretical feasibility, the actual environment of buses and bus stops was simulated. In the test, the proposed system achieves a recognition accuracy of 99.98% with an average recognition time of about 21 ms, which meets the requirements of epidemic monitoring. The proposed system design can meet the special needs of public safety during an epidemic, and can realize the functions of person recognition, route recording and potential contact search, which effectively ensures public transportation safety.
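    The k-hop search for potential contacts amounts to a depth-limited breadth-first traversal of the contact graph; the sketch below illustrates it on a placeholder adjacency map standing in for the graph database query.

```python
# Minimal sketch of a k-hop breadth-first search over a contact graph for
# finding potential close contacts of a starting person.
from collections import deque

def k_hop_contacts(graph, start, k):
    seen, frontier = {start}, deque([(start, 0)])
    contacts = set()
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nb in graph.get(node, ()):           # neighbors = co-riders / shared trips
            if nb not in seen:
                seen.add(nb)
                contacts.add(nb)
                frontier.append((nb, depth + 1))
    return contacts

graph = {"p1": ["p2", "p3"], "p2": ["p4"], "p3": [], "p4": ["p5"]}
print(k_hop_contacts(graph, "p1", k=2))          # {'p2', 'p3', 'p4'}
```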

    Alternately optimizing algorithm based on Brownian movement and gradient information
    Linxiu SHA, Fan NIE, Qian GAO, Hao MENG
    2022, 42(7):  2139-2145.  DOI: 10.11772/j.issn.1001-9081.2021050839
    Asbtract ( )   HTML ( )   PDF (2126KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the problems that swarm intelligence optimization algorithms easily fall into local optima, have low population diversity during optimization and struggle with high-dimensional functions, an Alternately Optimizing Algorithm based on Brownian-movement and Gradient-information (AOABG) was proposed. First, a strategy of alternating between global and local optimization was used: the algorithm switched to local search when the solution was improving and to global search when it was worsening. Then, a uniformly distributed random walk based on gradient information was introduced into the local search, and a Brownian-motion random walk based on the position of the optimal solution was introduced into the global search. The proposed AOABG algorithm was compared with Harris Hawk Optimization (HHO), Sparrow Search Algorithm (SSA) and Special Forces Algorithm (SFA) on 10 test functions. When the dimension of the test functions is 2 or 10, the mean and standard deviation of AOABG's 100 final optimization results on the 10 test functions are better than those of HHO, SSA and SFA. When the test functions are 30-dimensional, except for the Levy function, where HHO performs better than AOABG but the means of the two are of the same order of magnitude, AOABG performs best on the other nine test functions, with an improvement of 4.64% to 94.89% in average optimization results compared with the above algorithms. Experimental results show that the AOABG algorithm has faster convergence speed, better stability and higher accuracy in high-dimensional function optimization.

    Network and communications
    Data naming mechanism of low earth orbit satellite mega-constellation for internet of things
    Hongqiu LUO, Shengbo HU
    2022, 42(7):  2146-2154.  DOI: 10.11772/j.issn.1001-9081.2021050744

    The Low Earth Orbit (LEO) satellite mega-constellation based on Information Centric Networking (ICN) is a suitable network architecture to support Internet of Things (IoT), and the data naming is one of the basic problems in ICN. Concerning the requirements of transmission with low latency and data distribution with high throughput of IoT, a data naming mechanism of LEO satellite mega-constellation for IoT based on ICN was proposed. Firstly, a flat integrated structure fusing hierarchy, multi-component and Hash was adopted by the proposed data naming mechanism. Then, the prefix tags were used to describe hierarchical names to meet the need for fast multi-source retrieval of inner-network functions. Finally, a simulation platform of LEO satellite mega-constellation for IoT was designed and developed based on Network Simulator 3 (NS-3) to test the performance of the proposed data naming mechanism. The test and simulation results show that, compared with the traditional Internet Protocol (IP)-based system structure, the proposed data naming mechanism can provide higher Quality of Service (QoS) such as high throughput and low latency to LEO satellite mega-constellation for IoT.

    Computer software technology
    Pattern mining and reuse method for user behaviors of Android applications
    Qun MAO, Weiwei WANG, Feng YOU, Ruilian ZHAO, Zheng LI
    2022, 42(7):  2155-2161.  DOI: 10.11772/j.issn.1001-9081.2021040652
    Asbtract ( )   HTML ( )   PDF (1206KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Software testing is an effective way to ensure the quality of Android applications. Understanding the functions of Android applications is the basis of the Android testing process. It aims to deeply explore the application’s business logic and reveal its functional defects, playing an important role in testing. User behavior patterns can assist testers in understanding an Android application’s functions, thereby improving test efficiency. Based on the idea “similar Android applications share user behavior patterns”, a user behavior pattern mining and reuse method was proposed to reduce the cost of Android application testing and improve the testing efficiency. Specifically, for the Android application under test, the user behavior patterns from a similar Android application were mined. Then, the semantic-based event fuzzy matching strategy was used to search the corresponding events for the application under test, and the Graphical User Interface (GUI) model based optimal path selection strategy was used to generate target event sequences for the application under test, thereby achieving user behavior pattern reuse across similar applications. The experiments were conducted on 32 user behavior patterns of three categories of Android applications. The results show that 87.4% of user behavior patterns can be completely reused on similar Android applications, and the reused user behavior patterns can effectively cover 90.2% of important states in applications under test. Thus, the proposed method provides effective support for the testing of Android applications.

    Multimedia computing and computer simulation
    Animation video generation model based on Chinese impressionistic style transfer
    Wentao MAO, Guifang WU, Chao WU, Zhi DOU
    2022, 42(7):  2162-2169.  DOI: 10.11772/j.issn.1001-9081.2021050836

    At present, Generative Adversarial Network (GAN) has been used for image animation style transformation. However, most existing GAN-based animation generation models mainly focus on the extraction and generation of realistic styles, targeting Japanese and American animations, and pay very little attention to the transfer of the impressionistic style of Chinese-style animations, which limits the application of GAN in the domestic animation production market. To solve this problem, a new Chinese-style animation GAN model, namely Chinese Cartoon GAN (CCGAN), was proposed for the automatic generation of animation videos with Chinese impressionistic style by integrating this style into the GAN model. Firstly, by adding inverted residual blocks into the generator, a lightweight deep neural network model was constructed to reduce the computational cost of video generation. Secondly, in order to extract and transfer the characteristics of Chinese impressionistic style, such as sharp image edges, abstract content structure and stroke lines with ink texture, a gray-scale style loss and a color reconstruction loss were constructed in the generator to constrain the high-level semantic consistency in style between the real images and the Chinese-style sample images. Moreover, in the discriminator, a gray-scale adversarial loss and an edge-promoting adversarial loss were constructed to constrain the reconstructed images to keep the same edge characteristics as the sample images. Finally, the Adam algorithm was used to minimize the above loss functions to realize style transfer, and the reconstructed images were combined into videos. Experimental results show that, compared with current representative style transfer models such as CycleGAN and CartoonGAN, the proposed CCGAN can effectively learn the Chinese impressionistic style from Chinese-style animations such as Chinese Choir and significantly reduces the computational cost, indicating that CCGAN is suitable for the rapid generation of animation videos in large quantities.

    Low-texture monocular visual simultaneous localization and mapping algorithm based on point-line feature fusion
    Gaofeng PAN, Yuan FAN, Yu RU, Yuchao GUO
    2022, 42(7):  2170-2176.  DOI: 10.11772/j.issn.1001-9081.2021050749

    When the image is blurred due to rapid camera movement or in low-texture scenes, a Simultaneous Localization And Mapping (SLAM) algorithm using only point features struggles to track and extract enough feature points, resulting in poor positioning accuracy and matching robustness; in case of false matching, the system may even fail to work. To solve this problem, a low-texture monocular SLAM algorithm based on point-line feature fusion was proposed. Firstly, line features were added to enhance system stability and to compensate for the insufficient point feature extraction in low-texture scenes. Then, the idea of weighting was introduced for selecting the numbers of point and line features to extract, and the weights of point and line features were allocated reasonably according to the richness of the scene. Since the proposed algorithm runs in low-texture scenes, line features were set as the main features and point features as auxiliary features. Experimental results on the TUM indoor dataset show that compared with existing point-line feature algorithms, the proposed algorithm can effectively improve the matching precision of line features, has the trajectory error reduced by about 9 percentage points, and has the feature extraction time reduced by 30 percentage points. As a result, the added line features play a positive and effective role in low-texture scenes and improve the overall accuracy and reliability of the data.

    Non-uniform rational B spline curve fitting of particle swarm optimization algorithm solving optimal control points
    Rongli GAI, Shouchuan GAO, Mingxia LI
    2022, 42(7):  2177-2183.  DOI: 10.11772/j.issn.1001-9081.2021050777

    In order to maintain high precision of parameter curve fitting on the basis of compressed data, a high-precision Non-Uniform Rational B-Spline (NURBS) curve fitting method was proposed based on feature point extraction, least squares approximation and a particle swarm optimization algorithm solving for the optimal control points. Firstly, feature points were extracted from all discrete data points based on the inflection points and curvature extreme points. Then, the feature points were approximated by the least squares method, and the initial control points were calculated according to the resulting linear system of equations. Finally, the initial particle population was constructed from the position coordinates of the initial control points, a fitness function was established to measure the errors between the discrete data points and the fitted curve, and the positions of the initial control points were iteratively optimized by the particle swarm optimization algorithm until the maximum number of iterations was reached. The results of experimental verification on blade and butterfly section prototypes show that the amount of data to be fitted is compressed to 25/117 and 120/283 of the original respectively by the proposed method. Compared with the high-accuracy method based on adding auxiliary control points, the proposed method achieves 57.1% and 22.9% higher fitting accuracy respectively, indicating that it is strongly competitive among existing curve fitting methods.
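    The particle swarm step that refines the control point positions follows the standard velocity-position update; the NumPy sketch below shows that loop with a placeholder fitness function in place of the point-to-curve fitting error, and illustrative hyperparameters.

```python
# Minimal NumPy sketch of the particle swarm update used to refine control
# point coordinates; the fitness function is a placeholder for the
# point-to-curve fitting error.
import numpy as np

def pso(fitness, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos += vel
        vals = np.array([fitness(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Placeholder fitness: squared distance to a target control-point layout.
target = np.linspace(0, 1, 10)
best, err = pso(lambda p: np.sum((p - target) ** 2), dim=10)
print(err)
```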

    Point cloud registration algorithm based on residual attention mechanism
    Tingwei QIN, Pengcheng ZHAO, Pinle QIN, Jianchao ZENG, Rui CHAI, Yongqi HUANG
    2022, 42(7):  2184-2191.  DOI: 10.11772/j.issn.1001-9081.2021071319

    Aiming at the low accuracy and poor robustness of traditional point cloud registration algorithms and the resulting difficulty of delivering accurate radiotherapy to cancer patients before and after treatment, an Attention Dynamic Graph Convolutional Neural Network Lucas-Kanade (ADGCNNLK) was proposed. Firstly, a residual attention mechanism was added to the Dynamic Graph Convolutional Neural Network (DGCNN) to effectively utilize the spatial information of the point cloud and reduce information loss. Then, the DGCNN with the residual attention mechanism was used to extract point cloud features; this process not only captured the local geometric features of the point cloud while maintaining permutation invariance, but also aggregated the information semantically, thereby improving registration efficiency. Finally, the extracted feature points were mapped to a high-dimensional space, and the classic image iterative registration algorithm LK (Lucas-Kanade) was used for node registration. Experimental results show that compared with Iterative Closest Point (ICP), Globally optimal ICP (Go-ICP) and PointNetLK, the proposed algorithm has the best registration effect with or without noise. In the case without noise, compared with PointNetLK, the proposed algorithm has the rotation mean squared error reduced by 74.61% and the translation mean squared error reduced by 47.50%; in the case with noise, compared with PointNetLK, the rotation mean squared error is reduced by 73.13% and the translation mean squared error by 44.18%, indicating that the proposed algorithm is more robust than PointNetLK. The proposed algorithm has been applied to the registration of human point cloud models of cancer patients before and after radiotherapy, assisting doctors in treatment and realizing precise radiotherapy.

    Camouflaged object detection based on progressive feature enhancement aggregation
    Xiangyue TAN, Xiao HU, Jiaxin YANG, Junjiang XIANG
    2022, 42(7):  2192-2200.  DOI: 10.11772/j.issn.1001-9081.2021060900

    Camouflaged Object Detection (COD) aims to detect objects hidden in complex environments. The existing COD algorithms ignore the influence of feature expression and fusion methods on detection performance when combining multi-level features. Therefore, a COD algorithm based on progressive feature enhancement aggregation was proposed. Firstly, multi-level features were extracted through the backbone network. Then, in order to improve the expression ability of features, an enhancement network composed of Feature Enhancement Module (FEM) was used to enhance the multi-level features. Finally, Adjacency Aggregation Module (AAM) was designed in the aggregation network to achieve information fusion between adjacent features to highlight the features of the camouflaged object area, and a new Progressive Aggregation Strategy (PAS) was proposed to aggregate adjacent features in a progressive way to achieve effective multi-level feature fusion while suppressing noise. Experimental results on 3 public datasets show that the proposed algorithm achieves the best performance on 4 objective evaluation indexes compared with 12 state-of-the-art algorithms, especially on COD10K dataset, the weighted F-measure and the Mean Absolute Error (MAE) of the proposed algorithm reach 0.809 and 0.037 respectively. It can be seen that the proposed algorithm achieves better performance on COD tasks.

    Lightweight object detection algorithm based on improved YOLOv4
    Zhifeng ZHONG, Yifan XIA, Dongping ZHOU, Yangtian YAN
    2022, 42(7):  2201-2209.  DOI: 10.11772/j.issn.1001-9081.2021050734

    The YOLOv4 (You Only Look Once version 4) object detection network has a complex structure, many parameters, demanding training configuration requirements, and a low Frames Per Second (FPS) rate in real-time detection. In order to solve these problems, a lightweight object detection algorithm based on YOLOv4, named ML-YOLO (MobileNetv3Lite-YOLO), was proposed. Firstly, MobileNetv3 was used to replace the backbone feature extraction network of YOLOv4, greatly reducing the number of backbone parameters through the depthwise separable convolutions in MobileNetv3. Then, a simplified weighted Bi-directional Feature Pyramid Network (Bi-FPN) structure was used to replace the feature fusion network of YOLOv4, so that the attention mechanism in Bi-FPN optimized the object detection accuracy. Finally, the final prediction boxes were generated through the YOLOv4 decoding algorithm to realize object detection. Experimental results on the VOC (Visual Object Classes) 2007 dataset show that the mean Average Precision (mAP) of the ML-YOLO algorithm reaches 80.22%, which is 3.42 percentage points lower than that of the YOLOv4 algorithm and 2.82 percentage points higher than that of the YOLOv5m algorithm; at the same time, the model size of ML-YOLO is only 44.75 MB, 199.54 MB smaller than that of YOLOv4 and only 2.85 MB larger than that of YOLOv5m. These results show that the proposed ML-YOLO model greatly reduces the model size compared with YOLOv4 while maintaining high detection accuracy, indicating that the algorithm can meet the lightweight and accuracy requirements of object detection on mobile or embedded devices.
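
    The parameter saving mentioned above comes mainly from depthwise separable convolution, the building block of MobileNetv3. The PyTorch sketch below shows this standard block rather than the paper's exact configuration; the Hardswish activation and channel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A per-channel (depthwise) 3x3 convolution followed by a 1x1 pointwise
    convolution, replacing a dense 3x3 convolution at a fraction of the
    parameter count."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.Hardswish()  # activation used in parts of MobileNetv3

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

if __name__ == "__main__":
    block = DepthwiseSeparableConv(32, 64)
    print(block(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```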

    Application of anisotropic non-maximum suppression in industrial target detection
    Shiwen ZHANG, Chunhua DENG, Junwen ZHANG
    2022, 42(7):  2210-2218.  DOI: 10.11772/j.issn.1001-9081.2021040648

    In certain fixed industrial application scenarios, target detection algorithms can tolerate very few missed detections. However, when the recall is increased, non-overlapping spurious boxes tend to be generated at regular positions around the target. The traditional Non-Maximum Suppression (NMS) strategy mainly suppresses multiple overlapping detections of the same target and cannot solve this problem. To this end, an anisotropic NMS method was designed that applies different suppression strategies in different directions around the target, effectively eliminating the regular spurious boxes. The target shape and the regular spurious boxes in a fixed industrial scene are often correlated; to help anisotropic NMS execute accurately in different directions, a ratio Intersection over Union (IoU) loss function was designed to guide the model to fit the shape of the target. In addition, an automatic labeling dataset augmentation method was used for regular targets, which reduced the workload of manual labeling and enlarged the dataset. Experimental results show that the proposed method has a significant effect on the roll groove detection dataset, and when applied to the YOLO (You Only Look Once) series of algorithms, it improves detection precision without reducing speed. The algorithm has been successfully applied to the production line of a cold rolling mill that automatically grabs rolls.
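
    For reference, the sketch below implements plain greedy NMS over axis-aligned boxes in NumPy. The anisotropic variant described above would replace the single IoU test with direction-dependent suppression rules, whose exact form is not given in the abstract, so only the classic baseline is shown.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5) -> list:
    """Plain greedy NMS over boxes given as (x1, y1, x2, y2)."""
    order = scores.argsort()[::-1]          # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection of the top box with the remaining boxes.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]        # drop boxes overlapping the kept one
    return keep

if __name__ == "__main__":
    b = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
    s = np.array([0.9, 0.8, 0.7])
    print(nms(b, s))  # [0, 2]: the second box overlaps the first too much
```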

    Real-time traffic sign detection algorithm based on improved YOLOv3
    Dawei ZHANG, Xuchong LIU, Wei ZHOU, Zhuhui CHEN, Yao YU
    2022, 42(7):  2219-2226.  DOI: 10.11772/j.issn.1001-9081.2021050731

    Aiming at the slow detection and low recognition accuracy of road traffic signs in Chinese intelligent driving assistance systems, an improved road traffic sign detection algorithm based on YOLOv3 (You Only Look Once version 3) was proposed. Firstly, MobileNetv2 was introduced into YOLOv3 as the basic feature extraction network to construct the object detection module MN-YOLOv3 (MobileNetv2-YOLOv3), and two down-up links were added to the MN-YOLOv3 backbone for feature fusion, thereby reducing the model parameters and improving both the running speed of the detection module and the information fusion of the multi-scale feature maps. Then, according to the shape characteristics of traffic sign objects, the K-Means++ algorithm was used to generate the initial anchor cluster centers, and the DIOU (Distance Intersection Over Union) loss function was introduced to combine DIOU with Non-Maximum Suppression (NMS) for bounding box regression. Finally, the Region Of Interest (ROI) and the context information were unified by ROI Align and merged to enhance the object feature expression. Experimental results show that the proposed algorithm performs better: its mean Average Precision (mAP) on the CSUST (ChangSha University of Science and Technology) Chinese Traffic Sign Detection Benchmark (CCTSDB) dataset reaches 96.20%. Compared with the Faster R-CNN (Faster Region Convolutional Neural Network), YOLOv3 and Cascaded R-CNN detection algorithms, the proposed algorithm has better real-time performance and higher detection accuracy, and is more robust to various environmental changes.
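
    The DIOU term mentioned above penalizes the normalized distance between box centres in addition to their overlap. The small Python sketch below computes DIoU = IoU - d^2/c^2 for two axis-aligned boxes, following the published DIoU definition; it is a worked illustration rather than the paper's training code.

```python
def diou(box_a, box_b):
    """Distance-IoU between two boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between box centres and
    c is the diagonal of the smallest box enclosing both."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union for the plain IoU term.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared centre distance d^2 and squared enclosing-box diagonal c^2.
    d2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return iou - d2 / c2

if __name__ == "__main__":
    print(diou((0, 0, 2, 2), (1, 1, 3, 3)))  # IoU term minus centre-distance penalty
```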

    Image character editing method based on improved font adaptive neural network
    Shangwang LIU, Xinming ZHANG, Fei ZHANG
    2022, 42(7):  2227-2238.  DOI: 10.11772/j.issn.1001-9081.2021050882

    As an international language, English appears on many public occasions, as do Chinese pinyin characters in Chinese-language environments. When such characters appear in an image, especially an image with a complex style, it is difficult to edit and modify them directly. To solve this problem, an image character editing method based on an improved character generation network, Font Adaptive Neural network (FANnet), was proposed. Firstly, a saliency detection algorithm based on Histogram Contrast (HC) was used to improve the Character Adaptive Detection (CAD) model and accurately extract the image characters selected by the user. Secondly, a binary image of the target character, almost consistent with the font of the source character, was generated by FANnet. Then, the color of the source characters was transferred effectively to the target characters by the proposed Colors Distribute-based Local (CDL) transfer model based on color complexity discrimination. Finally, target editable characters highly consistent with the font structure and color variation of the source characters were generated, achieving the purpose of character editing. Experimental results show that, on the MSRA-TD500, COCO-Text and ICDAR datasets, the average Structural SIMilarity (SSIM), Peak Signal-to-Noise Ratio (PSNR) and Normalized Root Mean Square Error (NRMSE) of the proposed method are 0.776 5, 18.321 1 dB and 0.435 8 respectively; compared with the Scene Text Editor using Font Adaptive Neural Network (STEFANN) algorithm, SSIM and PSNR are increased by 18.59% and 14.02% and NRMSE is decreased by 2.97%, and compared with the multi-modal few-shot font style transfer model Multi-Content GAN (MC-GAN) (with 1 input character), SSIM and PSNR are increased by 30.24% and 23.92% and NRMSE is decreased by 4.68%. For image characters with complex font structures and color gradient distributions in real scenes, the editing effect of the proposed method is also good. The proposed method can be applied to image reuse, automatic computer correction of image characters and restoration of image text information.
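
    Histogram Contrast (HC) saliency, used above to locate the characters selected by the user, scores each pixel by how much its value differs from the rest of the image, weighted by how often the other values occur. The NumPy sketch below is a simplified grayscale version (the original HC method works on color statistics); the bin count and normalization are illustrative assumptions.

```python
import numpy as np

def hc_saliency_gray(img: np.ndarray) -> np.ndarray:
    """Histogram-contrast saliency for a grayscale uint8 image (H x W).

    Each grey level's saliency is its average absolute difference from all
    other grey levels, weighted by how often those levels occur; the map is
    then normalised to [0, 1]."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    freq = hist / hist.sum()
    levels = np.arange(256, dtype=np.float64)
    contrast = np.abs(levels[:, None] - levels[None, :])  # (256, 256)
    level_saliency = contrast @ freq                       # sum_j freq_j * |c_k - c_j|
    sal = level_saliency[img]
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

if __name__ == "__main__":
    demo = np.zeros((64, 64), dtype=np.uint8)
    demo[24:40, 24:40] = 200                 # bright square on a dark background
    sal = hc_saliency_gray(demo)
    print(sal[30, 30] > sal[0, 0])           # True: the rare bright region is more salient
```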

    Pixel classification-based multiscale UAV aerial object rotational tracking algorithm
    Yuanliang XUE, Guodong JIN, Lining TAN, Jiankun XU
    2022, 42(7):  2239-2247.  DOI: 10.11772/j.issn.1001-9081.2021040689

    Aiming at the problem that the upright tracking box limits tracking accuracy when dealing with scale changes, similar objects and aspect ratio changes during Unmanned Aerial Vehicle (UAV) tracking, a pixel classification-based multiscale UAV aerial object rotational tracking algorithm was proposed. Firstly, MS-ResNet (MultiScale ResNet-50) was designed to extract multiscale features of the object. Then, a pixel binary classification module was designed on the multi-channel response map with orthogonal characteristics to further refine the results of the classification and regression branches. Meanwhile, to improve the pixel classification accuracy, the concurrent spatial and channel "Squeeze & Excitation" (scSE) module was used to filter the object features in the spatial and channel domains. Finally, a rotational tracking box fitting the actual size of the object was generated based on the pixel classification to avoid contamination of positive samples. Experimental results show that the proposed algorithm achieves a success rate of 60.7% and a precision of 79.5% on the UAV tracking dataset UAV123, which are 5 percentage points and 2.7 percentage points higher than those of the Siamese Region Proposal Network (SiamRPN) respectively, and its speed reaches 67.5 FPS, meeting real-time requirements. The proposed algorithm has good scale adaptation, discrimination ability and robustness, and can effectively cope with UAV tracking tasks.
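
    The scSE module mentioned above recalibrates features along both the channel and the spatial axes. The PyTorch sketch below follows the commonly published scSE formulation (channel excitation from a globally pooled descriptor plus a 1x1-convolution spatial gate); the reduction ratio and the additive combination of the two branches are assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn as nn

class SCSE(nn.Module):
    """Concurrent spatial and channel squeeze-and-excitation (scSE).

    The channel branch re-weights channels from a globally pooled descriptor;
    the spatial branch re-weights pixels with a 1x1 convolution; the two
    recalibrated maps are summed."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.cse(x) + x * self.sse(x)

if __name__ == "__main__":
    print(SCSE(64)(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```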

    Ship detection algorithm based on improved RetinaNet
    Wenjun FAN, Shuguang ZHAO, Lizheng GUO
    2022, 42(7):  2248-2255.  DOI: 10.11772/j.issn.1001-9081.2021050831

    At present, deep learning-based target detection technology has achieved remarkable results in ship detection of Synthetic Aperture Radar (SAR) images. However, the detection of small target ships and densely arranged ships near shore remains poor. To solve this problem, a new ship detection algorithm based on improved RetinaNet was proposed. On the basis of the traditional RetinaNet algorithm, firstly, the convolutions in the residual blocks of the feature extraction network were replaced with grouped convolutions, increasing the network width and improving the feature extraction ability of the network. Then, an attention mechanism was added to the last two stages of the feature extraction network so that the network focuses more on the target area, improving the target detection ability. Finally, Soft Non-Maximum Suppression (Soft-NMS) was added to the algorithm to reduce the missed detection rate for densely arranged ships near shore. Experimental results on the High-Resolution SAR Images Dataset (HRSID) and the SAR Ship Detection Dataset (SSDD) show that the proposed algorithm effectively improves the detection of small target ships and near-shore ships, and is superior in detection precision and speed to excellent object detection models such as Faster Region-based Convolutional Neural Network (Faster R-CNN), You Only Look Once version 3 (YOLOv3) and CenterNet.
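
    Soft-NMS, used above to keep densely arranged near-shore ships from being suppressed, decays the scores of overlapping boxes instead of deleting them. The NumPy sketch below implements the Gaussian-decay variant; sigma and the score threshold are illustrative assumptions.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Gaussian Soft-NMS: decay overlapping boxes' scores by exp(-iou^2/sigma)
    rather than discarding them; boxes are (x1, y1, x2, y2)."""
    boxes = boxes.astype(np.float64).copy()
    scores = scores.astype(np.float64).copy()
    picked = []
    idx = np.arange(len(scores))
    while idx.size > 0:
        top = idx[np.argmax(scores[idx])]
        picked.append(int(top))
        idx = idx[idx != top]
        if idx.size == 0:
            break
        x1 = np.maximum(boxes[top, 0], boxes[idx, 0])
        y1 = np.maximum(boxes[top, 1], boxes[idx, 1])
        x2 = np.minimum(boxes[top, 2], boxes[idx, 2])
        y2 = np.minimum(boxes[top, 3], boxes[idx, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_t = (boxes[top, 2] - boxes[top, 0]) * (boxes[top, 3] - boxes[top, 1])
        area_r = (boxes[idx, 2] - boxes[idx, 0]) * (boxes[idx, 3] - boxes[idx, 1])
        iou = inter / (area_t + area_r - inter)
        scores[idx] *= np.exp(-(iou ** 2) / sigma)  # Gaussian score decay
        idx = idx[scores[idx] > score_thr]          # drop boxes with negligible scores
    return picked

if __name__ == "__main__":
    b = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
    s = np.array([0.9, 0.8, 0.7])
    print(soft_nms(b, s))  # all boxes survive; the overlapping one is re-ranked, not removed
```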

    Frontier and comprehensive applications
    Credit risk prediction model based on borderline adaptive SMOTE and Focal Loss improved LightGBM
    Hailong CHEN, Chang YANG, Mei DU, Yingyu ZHANG
    2022, 42(7):  2256-2264.  DOI: 10.11772/j.issn.1001-9081.2021050810

    Aiming at the problem that dataset imbalance in credit risk assessment affects the prediction performance of models, a credit risk prediction model based on Borderline Adaptive Synthetic Minority Oversampling TEchnique (BA-SMOTE) and Focal Loss-Light Gradient Boosting Machine (FLLightGBM) was proposed. Firstly, on the basis of Borderline Synthetic Minority Oversampling TEchnique (Borderline-SMOTE), an adaptive idea and a new interpolation method were introduced, so that a different number of new samples was generated for each minority sample on the border and the positions of the new samples were closer to the original minority sample, thereby balancing the dataset. Secondly, the Focal Loss function was used to improve the loss function of the LightGBM (Light Gradient Boosting Machine) algorithm, and the improved algorithm was used to train the new dataset to obtain the final BA-SMOTE-FLLightGBM model constructed from the BA-SMOTE method and the FLLightGBM algorithm. Finally, credit risk prediction was performed on the Lending Club dataset. Experimental results show that compared with other imbalanced classification algorithms, namely RUSBoost (Random Under-Sampling with adaBoost), CUSBoost (Cluster-based Under-Sampling with adaBoost), KSMOTE-AdaBoost (K-means clustering SMOTE with AdaBoost) and AK-SMOTE-Catboost (AllKnn-SMOTE-Catboost), the constructed model achieves significant improvements of 9.0%-31.3% and 5.0%-14.1% on the two evaluation indicators G-mean and AUC (Area Under Curve) respectively. These results verify that the proposed model has a better default prediction effect in credit risk assessment.
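
    BA-SMOTE builds on the basic SMOTE idea of interpolating new minority samples toward their minority-class neighbours. The NumPy sketch below shows only this basic interpolation step, not the borderline selection or adaptive allocation that BA-SMOTE adds; k and the sample counts are illustrative assumptions.

```python
import numpy as np

def smote_like_oversample(minority, n_new, k=5, rng=None):
    """Generate synthetic minority samples by interpolating each chosen sample
    toward one of its k nearest minority neighbours (basic SMOTE interpolation)."""
    rng = np.random.default_rng(0) if rng is None else rng
    minority = np.asarray(minority, dtype=np.float64)
    dist = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    knn = np.argsort(dist, axis=1)[:, 1:k + 1]       # neighbours, excluding self
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))              # pick a minority sample
        j = knn[i, rng.integers(k)]                  # pick one of its neighbours
        lam = rng.random()                           # interpolation coefficient
        synthetic.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.asarray(synthetic)

if __name__ == "__main__":
    X_min = np.random.default_rng(1).normal(size=(20, 4))
    print(smote_like_oversample(X_min, n_new=10).shape)  # (10, 4)
```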

    Stock market volatility prediction method based on graph neural network with multi-attention mechanism
    Xiaohan LI, Jun WANG, Huading JIA, Liu XIAO
    2022, 42(7):  2265-2273.  DOI: 10.11772/j.issn.1001-9081.2021081487

    The stock market is an essential element of the financial market, so the study of stock market volatility plays a significant role in controlling financial market risks effectively and improving returns on investment, and it has attracted widespread attention from both academia and related industries. However, the stock market is affected by many factors; facing its multi-source and heterogeneous information, it is challenging to mine and fuse such data efficiently. To fully capture the influence of different information and information interaction on stock price changes, a graph neural network based on a multi-attention mechanism was proposed to predict price fluctuation in the stock market. First of all, a relationship dimension was introduced to construct heterogeneous subgraphs for the transaction data and news text of the stock market, and a multi-attention mechanism was adopted to fuse the graph data. Then, the graph neural network Gated Recurrent Unit (GRU) was applied to perform graph classification, and on this basis the volatility of three important indexes, the Shanghai Composite Index, the Shanghai and Shenzhen 300 Index and the Shenzhen Component Index, was predicted. Experimental results show that, in terms of heterogeneous information characteristics, compared with stock market transaction data, stock market news information has a lagged influence on stock volatility; in terms of heterogeneous information fusion, compared with algorithms such as Support Vector Machine (SVM), Random Forest (RF) and Multiple Kernel k-Means (MKKM) clustering, the proposed method improves the prediction accuracy by 17.88 percentage points, 30.00 percentage points and 38.00 percentage points respectively; at the same time, a quantitative investment simulation was performed according to the model's trading strategy.
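
    The multi-attention fusion described above can be pictured as attention-weighted pooling over heterogeneous feature vectors. The NumPy sketch below shows single-head scaled dot-product attention over a small set of source embeddings; the dimensions and the single-head simplification are assumptions, not the paper's architecture.

```python
import numpy as np

def scaled_dot_attention(query, keys, values):
    """Single-head scaled dot-product attention used to weight and fuse a set
    of feature vectors (e.g. trading-data and news-text embeddings).

    query: (d,), keys/values: (n, d). Returns the fused (d,) vector."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)        # similarity of each source to the query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over the n sources
    return weights @ values                   # attention-weighted fusion

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=8)
    k = rng.normal(size=(5, 8))               # 5 heterogeneous information sources
    print(scaled_dot_attention(q, k, k).shape)  # (8,)
```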

    Spatial-temporal prediction model of urban short-term traffic flow based on grid division
    Haiqi WANG, Zhihai WANG, Liuke LI, Haoran KONG, Qiong WANG, Jianbo XU
    2022, 42(7):  2274-2280.  DOI: 10.11772/j.issn.1001-9081.2021050838

    Accurate traffic flow prediction is very important for helping traffic management departments take effective traffic control and guidance measures and for helping travelers plan routes reasonably. Aiming at the problem that traditional deep learning models do not fully consider the spatial-temporal characteristics of traffic data, a CNN-LSTM prediction model based on an attention mechanism, namely STCAL (Spatial-Temporal Convolutional Attention-LSTM network), was established under the theoretical frameworks of the Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) unit, combined with the spatial-temporal characteristics of urban traffic flow. Firstly, a fine-grained grid division method was used to construct the spatial-temporal matrix of traffic flow. Secondly, the CNN model was used as a spatial component to extract the spatial characteristics of urban traffic flow in different periods. Finally, the LSTM model based on the attention mechanism was used as a dynamic temporal component to capture the temporal characteristics and trend variability of traffic flow and to realize the prediction. Experimental results show that compared with the Gated Recurrent Unit (GRU) and the Spatio-Temporal Residual Network (ST-ResNet), the STCAL model reduces the Root Mean Square Error (RMSE) by 17.15% and 7.37% respectively, reduces the Mean Absolute Error (MAE) by 22.75% and 9.14% respectively, and increases the coefficient of determination (R2) by 11.27% and 2.37% respectively. It is also found that the model predicts highly regular weekdays better than weekends and performs best for the weekday morning peak, showing that it can provide a basis for monitoring short-term traffic flow changes in urban regions.
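
    The grid division step above turns raw point records into a (time, row, column) flow tensor by counting records per cell and time slot. The NumPy sketch below shows one plausible way to build such a spatial-temporal matrix; the grid resolution, coordinate ranges and record format are illustrative assumptions.

```python
import numpy as np

def grid_flow_matrix(records, lat_edges, lon_edges, n_slots):
    """Aggregate raw trip records into a (time, rows, cols) traffic-flow tensor
    by counting records that fall into each grid cell per time slot.

    records: array of (time_slot, lat, lon) rows with time_slot in [0, n_slots)."""
    rows, cols = len(lat_edges) - 1, len(lon_edges) - 1
    flow = np.zeros((n_slots, rows, cols))
    for t in range(n_slots):
        pts = records[records[:, 0] == t]
        hist, _, _ = np.histogram2d(pts[:, 1], pts[:, 2],
                                    bins=[lat_edges, lon_edges])
        flow[t] = hist
    return flow

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    recs = np.column_stack([rng.integers(0, 4, 1000),
                            rng.uniform(30.0, 30.1, 1000),
                            rng.uniform(120.0, 120.1, 1000)])
    lat_e = np.linspace(30.0, 30.1, 11)   # 10 x 10 grid over the study area
    lon_e = np.linspace(120.0, 120.1, 11)
    print(grid_flow_matrix(recs, lat_e, lon_e, 4).shape)  # (4, 10, 10)
```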

    Collaborative optimization of automated guided vehicle scheduling and path planning considering conflict and congestion
    Houming FAN, Shuang MU, Lijun YUE
    2022, 42(7):  2281-2291.  DOI: 10.11772/j.issn.1001-9081.2021050819

    In order to solve the problems of Automated Guided Vehicle (AGV) scheduling and conflict-free path planning in automated container terminals, an AGV conflict and congestion resolution strategy was proposed to generate conflict-free paths. Firstly, considering the capacity of the buffer brackets in the container yard as well as the constraints of no congestion on the operation paths and no conflict at the nodes, a two-stage mixed integer programming model was established with the goals of minimizing the maximum completion time and the AGV transportation time. Then, an improved adaptive genetic algorithm and a Dijkstra algorithm based on the conflict and congestion resolution strategy were designed to obtain the AGV scheduling scheme and conflict-free paths. The results of numerical examples show that, compared with the genetic algorithm, the improved adaptive genetic algorithm reduces the average solution time by 13.56% and the average gap rate of the objective function by 9.01%. Compared with the park-and-wait strategy, the conflict and congestion resolution strategy reduces the congestion rate of the horizontal transportation area by 67.6% and the AGV waiting time by 66.7%. It can be seen that the proposed algorithm has higher solution quality and faster speed, and the effectiveness of the proposed strategy is verified.
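
    The path-planning stage above is built on Dijkstra's shortest-path algorithm, with conflict and congestion checks layered on top. The Python sketch below shows only the underlying Dijkstra routine on a weighted adjacency-list graph; the conflict-resolution logic itself is not reproduced.

```python
import heapq

def dijkstra(graph, source):
    """Shortest path lengths from `source` on a weighted graph given as
    {node: [(neighbour, weight), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                          # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

if __name__ == "__main__":
    g = {"A": [("B", 2), ("C", 5)], "B": [("C", 1), ("D", 4)], "C": [("D", 1)]}
    print(dijkstra(g, "A"))  # {'A': 0, 'B': 2, 'C': 3, 'D': 4}
```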

    Fireworks algorithm for location-routing problem of simultaneous pickup and delivery with time window
    Yaping LIU, Huizhen ZHANG, Li ZHANG, Youyou LIU
    2022, 42(7):  2292-2300.  DOI: 10.11772/j.issn.1001-9081.2021040697

    With the rapid development of e-commerce and the popularity of the Internet, exchanging and returning goods has become more convenient, so customers' demands show the characteristics of timeliness, variety, small batches, and exchanges and returns. Aiming at the capacitated Location-Routing Problem with Simultaneous Pickup and Delivery (LRPSPD) and considering the characteristics of customers' diversified demands, a mathematical model of LRPSPD with Time Window (LRPSPDTW) was established. An Improved FireWorks Algorithm (IFWA) was used to solve the model, with corresponding neighborhood operations carried out for the firework explosion and mutation. The performance of the fireworks algorithm was evaluated on benchmark LRPSPD instances, and the correctness and effectiveness of the proposed model and algorithm were verified by a large number of numerical experiments. Experimental results show that, compared with the Branch and Cut (B&C) algorithm, the average error between the results of IFWA and the standard solutions is reduced by 0.33 percentage points. The proposed algorithm shortens the time needed to find the optimal solution and provides a new approach for solving location-routing problems.
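
    In the classic fireworks algorithm that IFWA builds on, better solutions are given more explosion sparks but a smaller explosion amplitude. The NumPy sketch below computes this standard spark/amplitude allocation for a minimisation problem; it does not include the problem-specific neighborhood operations described above, and the spark budget and amplitude cap are illustrative assumptions.

```python
import numpy as np

def spark_allocation(fitness, total_sparks=50, max_amplitude=40.0, eps=1e-12):
    """Classic fireworks-algorithm allocation for minimisation: lower-fitness
    (better) fireworks receive more sparks but a smaller explosion amplitude."""
    fitness = np.asarray(fitness, dtype=np.float64)
    y_max, y_min = fitness.max(), fitness.min()
    sparks = total_sparks * (y_max - fitness + eps) / ((y_max - fitness).sum() + eps)
    amps = max_amplitude * (fitness - y_min + eps) / ((fitness - y_min).sum() + eps)
    return np.maximum(sparks.round().astype(int), 1), amps

if __name__ == "__main__":
    n_sparks, amplitudes = spark_allocation([3.0, 10.0, 25.0])
    print(n_sparks, amplitudes.round(2))  # the best firework gets the most sparks
```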

    Artifacts sensing generative adversarial network for low-dose CT denoising
    Zefang HAN, Xiong ZHANG, Hong SHANGGUAN, Xinglong HAN, Jing HAN, Gang FENG, Xueying CUI
    2022, 42(7):  2301-2310.  DOI: 10.11772/j.issn.1001-9081.2021040700

    In recent years, Generative Adversarial Network (GAN) has become a new research hotspot in Low-Dose Computed Tomography (LDCT) artifact suppression because of its performance advantages. Because artifacts are irregularly distributed and strongly correlated with normal tissue, the denoising performance of existing GAN-based denoising networks is limited. Aiming at this problem, an LDCT denoising algorithm based on an artifacts-sensing GAN was proposed. Firstly, an artifacts-direction-sensing generator was designed: on the basis of a U-residual encoding-decoding structure, an Artifacts Direction Sensing Sub-module (ADSS) was added to improve the generator's sensitivity to artifact direction features. Secondly, an Attention Discriminator (AttD) was designed to improve the ability to distinguish noise and artifacts. Finally, loss functions corresponding to the network functions were designed; through the cooperation of multiple loss functions, the denoising performance of the network was improved. Experimental results show that compared with the High-Frequency Sensitive GAN (HFSGAN), the proposed denoising algorithm improves the average Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) by 4.9% and 2.8% respectively, and has a good artifact suppression effect.
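
    The "cooperation of multiple loss functions" mentioned above can be illustrated by a generator objective that mixes an adversarial term with a pixel-fidelity term. The PyTorch sketch below is a generic example of such a weighted combination; the specific loss terms and weights used in the paper are not reproduced, and the function and variable names here are assumptions.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake_logits, denoised, clean, w_adv=1e-3, w_pix=1.0):
    """Generic GAN generator objective: an adversarial term (fool the
    discriminator) plus a pixel-wise fidelity term, combined by fixed weights."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))  # adversarial term
    pix = F.l1_loss(denoised, clean)                     # fidelity term
    return w_adv * adv + w_pix * pix

if __name__ == "__main__":
    fake_logits = torch.randn(4, 1)          # discriminator output on denoised images
    denoised = torch.rand(4, 1, 64, 64)
    clean = torch.rand(4, 1, 64, 64)
    print(generator_loss(fake_logits, denoised, clean).item())
```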
