
Table of Contents

    10 April 2022, Volume 42 Issue 4
The 36th CCF National Conference of Computer Applications (CCF NCCA 2020)
    Research progress on binary code similarity search
    Bing XIA, Jianmin PANG, Xin ZHOU, Zheng SHAN
    2022, 42(4):  985-998.  DOI: 10.11772/j.issn.1001-9081.2021071267

    With the rapid development of the Internet of Things (IoT) and the industrial Internet, cyberspace security research has received increasing attention from industry and academia. Because source code is often unavailable, binary code similarity search has become a key core technology for vulnerability mining and malicious code analysis. Firstly, the basic concepts of binary code similarity search and the framework of a binary code similarity search system were introduced. Secondly, the development status of syntactic, semantic and pragmatic similarity search techniques for binary code was discussed. Then, the existing solutions were summarized and compared from the perspectives of binary hashing, instruction sequences, graph structures, basic block semantics, feature learning, debugging information recovery and advanced function semantics recognition. Finally, future development directions of binary code similarity search were discussed and prospected.

    Survey of high utility pattern mining methods based on positive and negative utility division
    Ni ZHANG, Meng HAN, Le WANG, Xiaojuan LI, Haodong CHENG
    2022, 42(4):  999-1010.  DOI: 10.11772/j.issn.1001-9081.2021071268

    High Utility Pattern Mining (HUPM) is one of the emerging research topics in data science. It considers both the unit profit and the quantity of items in a transaction database to extract more useful information. Traditional HUPM methods assume that the utility value of every item is positive, but in practical applications the utility values of some items may be negative (for example, a product sold at a loss has a negative profit), and pattern mining with negative items is as important as pattern mining with only positive items. Firstly, the relevant concepts of HUPM were explained, and examples of positive and negative utilities were given. Then, the HUPM methods were divided into positive-utility and negative-utility perspectives: the positive-utility methods were further divided by static and dynamic database settings, while the negative-utility methods were organized by key technology into Apriori-based, tree-based, utility-list-based and array-based approaches; the HUPM methods were discussed and summarized from these different aspects. Finally, the shortcomings of the existing HUPM methods and future research directions were given.

    Review of applications of natural language processing in text sentiment analysis
    Yingjie WANG, Jiuqi ZHU, Zumin WANG, Fengbo BAI, Jian GONG
    2022, 42(4):  1011-1020.  DOI: 10.11772/j.issn.1001-9081.2021071262

    Text sentiment analysis has gradually become an important part of Natural Language Processing (NLP) in fields such as recommendation systems, acquisition of user sentiment information, and public opinion reference for governments and enterprises. The methods in the field of sentiment analysis were compared and summarized through literature research. Firstly, the sentiment analysis methods were surveyed along the dimensions of time and method. Then, the main methods and application scenarios of sentiment analysis were summarized and compared. Finally, the advantages and disadvantages of each method were analyzed. The analysis shows that, facing different task scenarios, there are mainly three kinds of sentiment analysis methods: sentiment analysis based on a sentiment lexicon, sentiment analysis based on machine learning, and sentiment analysis based on deep learning, and methods based on a multi-strategy mixture have become the trend of improvement. The literature survey shows that there is still room for improvement in the techniques and methods of text sentiment analysis, which has a large market and broad development prospects in e-commerce, psychotherapy and public opinion monitoring.

    Survey of clustering based on deep learning
    Yongfeng DONG, Yahan DENG, Yao DONG, Yacong WANG
    2022, 42(4):  1021-1028.  DOI: 10.11772/j.issn.1001-9081.2021071275

    Clustering is a technique for finding the internal structure of data and is a basic problem in many data-driven applications. Clustering performance depends largely on the quality of data representation. In recent years, deep learning has been widely used in clustering tasks because of its powerful feature extraction ability, in order to learn better feature representations and significantly improve clustering performance. Firstly, traditional clustering methods were introduced. Then, representative deep learning based clustering methods were presented according to their network structures, their open problems were pointed out, and the applications of deep learning based clustering in different fields were described. Finally, the development of deep learning based clustering was summarized and prospected.

    Review of key technology and application of wearable electroencephalogram device
    Jing QIN, Fali SUN, Fang HUI, Zumin WANG, Bing GAO, Changqing JI
    2022, 42(4):  1029-1035.  DOI: 10.11772/j.issn.1001-9081.2021071277

    A wearable ElectroEncephaloGram (EEG) device is a wireless EEG system for daily real-time monitoring. Such devices have developed rapidly and been widely applied because of their portability, real-time performance, non-invasiveness and low cost. A system of this kind is mainly composed of hardware parts, such as the signal acquisition, signal processing, micro-control, communication and power supply modules, and software parts, such as the mobile terminal and cloud storage modules. The key technologies of wearable EEG devices were discussed. Firstly, improvements to the EEG signal acquisition module were explained. In addition, the signal preprocessing modules of wearable EEG devices were compared in terms of signal denoising, artifact processing and feature extraction technology. Then, the advantages and disadvantages of machine learning and deep learning classification algorithms were analyzed, and the application fields of wearable EEG devices were summarized. Finally, future development trends of the key technologies of wearable EEG devices were proposed.

    Review of mechanical fault diagnosis technology based on convolutional neural network
    Zumin WANG, Zhihao ZHANG, Jing QIN, Changqing JI
    2022, 42(4):  1036-1043.  DOI: 10.11772/j.issn.1001-9081.2021071266

    Because traditional mechanical fault diagnosis methods struggle with the uncertainty of manual feature extraction, a large number of deep learning based feature extraction methods have been proposed, which greatly promotes the development of mechanical fault diagnosis. As a typical representative of deep learning, Convolutional Neural Networks (CNNs) have made significant progress in image classification, object detection, image semantic segmentation and other fields, and a large body of literature applies them to mechanical fault diagnosis. Based on the published literature, and in order to further understand how CNN methods are used for mechanical fault diagnosis, the relevant theories of CNNs were first briefly introduced; then, the applications of CNNs in mechanical fault diagnosis were summarized from aspects such as data input type, transfer learning and prediction. Finally, the development directions of CNNs and their applications in mechanical fault diagnosis were prospected.

    Review of image classification algorithms based on convolutional neural network
    Changqing JI, Zhiyong GAO, Jing QIN, Zumin WANG
    2022, 42(4):  1044-1049.  DOI: 10.11772/j.issn.1001-9081.2021071273

    Convolutional Neural Network (CNN) is one of the important research directions in deep learning based computer vision at present. It performs well in applications such as image classification, segmentation and object detection, and its powerful feature learning and representation capabilities have attracted increasing attention from researchers. However, CNN still has problems such as incomplete feature extraction and overfitting during training. Aiming at these issues, the development of CNN, classical CNN models and their components were introduced, and methods to solve the above issues were presented. By reviewing the current status of research on CNN models in image classification, suggestions were provided for the further development and research directions of CNN.

    Knowledge representation learning method incorporating entity description information and neighbor node features
    Shoulong JIAO, Youxiang DUAN, Qifeng SUN, Zihao ZHUANG, Chenhao SUN
    2022, 42(4):  1050-1056.  DOI: 10.11772/j.issn.1001-9081.2021071227

    Knowledge graph representation learning aims to map entities and relations into a low-dimensional dense vector space. Most existing models focus on learning the structural features of triples while ignoring the semantic information of entity relations within triples and the entity description information outside triples, so their knowledge representation ability is poor. In response to this problem, a knowledge representation learning method named BAGAT (knowledge representation learning based on BERT model And Graph Attention network) was proposed by fusing multi-source information. Firstly, the entity target nodes and neighbor nodes of the triples were constructed by combining knowledge graph features, and Graph Attention Network (GAT) was used to aggregate the semantic representation of the triple structure. Then, the Bidirectional Encoder Representations from Transformers (BERT) word vector model was used to embed the entity description information. Finally, both representations were mapped into the same vector space for joint knowledge representation learning. Experimental results show that BAGAT achieves a large improvement over other models. On the Hits@1 and Hits@10 indicators of the public dataset FB15K-237, BAGAT is 25.9 and 22.0 percentage points higher respectively than the translation model TransE (Translating Embeddings), and 1.8 and 3.5 percentage points higher respectively than the graph neural network model KBGAT (Learning attention-based embeddings for relation prediction in knowledge graphs), indicating that a multi-source representation incorporating entity description information and the semantic information of the triple structure obtains stronger representation learning capability.
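
    As a rough illustration of the joint representation step described above, the following minimal sketch (all shapes, weights and names hypothetical, with a random vector standing in for a BERT description embedding) aggregates neighbor semantics with single-head graph attention and maps a description vector into the same space:

```python
# Minimal BAGAT-style joint embedding sketch (hypothetical shapes and names).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_aggregate(h_target, h_neighbors, W, a):
    """Single-head graph attention over one target node's neighbors."""
    z_t = W @ h_target
    z_n = h_neighbors @ W.T                      # (num_neighbors, d_out)
    scores = np.array([a @ np.concatenate([z_t, z]) for z in z_n])
    scores = np.where(scores > 0, scores, 0.2 * scores)   # LeakyReLU, as in GAT
    alpha = softmax(scores)                      # attention weights
    return alpha @ z_n                           # weighted neighbor sum

rng = np.random.default_rng(0)
d_in, d_out, d_bert = 16, 8, 32
W = rng.normal(size=(d_out, d_in))
a = rng.normal(size=2 * d_out)
P = rng.normal(size=(d_out, d_bert))             # maps BERT space -> GAT space

h_entity = rng.normal(size=d_in)
h_neighbors = rng.normal(size=(5, d_in))
bert_desc = rng.normal(size=d_bert)              # stand-in for a BERT sentence vector

structural = gat_aggregate(h_entity, h_neighbors, W, a)
descriptive = P @ bert_desc                      # description mapped to same space
joint = structural + descriptive                 # joint representation for scoring
print(joint.shape)                               # (8,)
```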

    Recommendation system based on non-sampling collaborative knowledge graph network
    Wenjing JIANG, Xi XIONG, Zhongzhi LI, Binyong LI
    2022, 42(4):  1057-1064.  DOI: 10.11772/j.issn.1001-9081.2021071255

    Knowledge Graph (KG) can effectively extract information by efficiently organizing massive data, so recommendation methods based on knowledge graphs have been widely studied and applied. Aiming at the sampling error of graph neural networks in knowledge graph modeling, a Non-sampling Collaborative Knowledge graph Network (NCKN) was proposed. Firstly, a non-sampling knowledge dissemination module was designed, in which linear aggregators of different sizes were used in a single convolutional layer to capture deep-level information and achieve efficient non-sampling pre-computation. Then, in order to distinguish the contributions of neighbor nodes, an attention mechanism was introduced into the dissemination process. Finally, the collaborative signal of user interactions and the knowledge embeddings were combined in a collaborative dissemination module to better describe user preferences. The performance of NCKN in Click Through Rate (CTR) prediction and Top-k recommendation was evaluated on three real datasets. Experimental results show that compared with the mainstream algorithms RippleNet (Ripple Network) and KGCN (Knowledge Graph Convolutional Network), the accuracy of NCKN in CTR prediction increases by 2.71% and 4.60% respectively; in Top-k prediction, its accuracy increases by 5.26% and 3.91% on average respectively. The proposed method not only solves the sampling error problem of graph neural networks in knowledge graph modeling, but also improves the accuracy of the recommendation model.
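
    The key point of non-sampling pre-computation can be sketched as follows: because the aggregators are linear, multi-hop neighborhood features can be computed exactly over the full graph ahead of time, with no neighbor sampling and hence no sampling error (a toy sketch; graph, sizes and hop count are hypothetical):

```python
# Sketch of non-sampling linear propagation: precompute A_hat^k @ X exactly.
import numpy as np

def normalized_adj(A):
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

rng = np.random.default_rng(1)
A = (rng.random((6, 6)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T                   # symmetric, no self-loops
X = rng.normal(size=(6, 4))                      # entity features

A_norm = normalized_adj(A)
hops = [X]
for _ in range(3):                               # capture 1..3-hop information
    hops.append(A_norm @ hops[-1])
H = np.concatenate(hops, axis=1)                 # fused multi-size receptive fields
print(H.shape)                                   # (6, 16)
```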

    Knowledge graph embedding model based on improved Inception structure
    Xiaopeng YU, Ruhan HE, Jin HUANG, Junjie ZHANG, Xinrong HU
    2022, 42(4):  1065-1071.  DOI: 10.11772/j.issn.1001-9081.2021071265

    Knowledge Graph Embedding (KGE) maps entities and relations into a low-dimensional continuous vector space and uses machine learning methods to implement relational data applications such as knowledge analysis, reasoning and completion. With ConvE (Convolution Embedding) as a representative, Convolutional Neural Networks (CNNs) have been applied to knowledge graph embedding to capture the interaction information of entities and relations, but standard convolution has insufficient ability to capture such feature interactions, and its feature expression ability is low. Aiming at this problem, an improved Inception structure was proposed, and a knowledge graph embedding model named InceE was constructed on top of it. Firstly, hybrid dilated convolution replaced standard convolution to improve the ability to capture feature interaction information. Secondly, a residual network structure was used to reduce the loss of feature information. Experiments were carried out on the Kinship, FB15k and WN18 datasets to verify the effectiveness of InceE on link prediction. Compared with the ArcE and QuatRE models on the Kinship and FB15k datasets, the Hit@1 of InceE increased by 1.6 and 1.5 percentage points; compared with ConvE on the three datasets, the Hit@1 of InceE increased by 6.3, 20.8 and 1.0 percentage points. The experimental results show that InceE has a stronger ability to capture feature interaction information.
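
    A minimal sketch of an improved-Inception block in this spirit is given below (channel widths and dilation rates are hypothetical simplifications, not the paper's exact configuration): parallel branches use different dilation rates (hybrid dilated convolution) and a residual connection limits feature information loss.

```python
import torch
import torch.nn as nn

class HybridDilatedInception(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        # Same kernel, different dilations -> different interaction ranges.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 3)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return self.act(self.fuse(out) + x)      # residual connection

# A reshaped entity-relation embedding "image", as in ConvE-style models.
x = torch.randn(8, 32, 10, 20)
print(HybridDilatedInception(32)(x).shape)       # torch.Size([8, 32, 10, 20])
```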

    Popular science text classification model enhanced by knowledge graph
    Wangjing TANG, Bin XU, Meihan TONG, Meihuan HAN, Liming WANG, Qi ZHONG
    2022, 42(4):  1072-1078.  DOI: 10.11772/j.issn.1001-9081.2021071278

    Popular science text classification aims to classify popular science articles according to a popular science classification system. Concerning the problem that popular science articles often exceed 1 000 words, which makes it hard for the model to focus on key points and causes poor performance of traditional classification models, a long text classification model combining a knowledge graph to perform two-level screening was proposed to reduce the interference of topic-irrelevant information and improve classification performance. Firstly, a four-step method was used to construct a knowledge graph for the popular science domains. Then, this knowledge graph was used as distant supervision to filter out irrelevant information by training sentence filters. Finally, an attention mechanism was used to further screen the information of the filtered sentence set, completing the attention-based topic classification model. Experimental results on the constructed Popular Science Classification Dataset (PSCD) show that the text classification model enhanced with domain knowledge graph information achieves a higher F1-Score: compared with the TextCNN model and the BERT (Bidirectional Encoder Representations from Transformers) model, its F1-Score is increased by 2.88 and 1.88 percentage points respectively, verifying the effectiveness of the knowledge graph for long text information screening.
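
    The two-level screening idea can be illustrated with a toy sketch (entities, sentences and vectors are all hypothetical stand-ins, and the learned sentence filter is reduced to a simple entity-mention test): level 1 keeps sentences that mention domain knowledge graph entities; level 2 re-weights the survivors with attention over a topic query.

```python
import numpy as np

kg_entities = {"neural network", "black hole", "vaccine"}   # toy domain KG

def level1_filter(sentences):
    """Keep sentences containing at least one KG entity mention."""
    return [s for s in sentences if any(e in s.lower() for e in kg_entities)]

def level2_attention(sent_vecs, topic_query):
    scores = sent_vecs @ topic_query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ sent_vecs                   # attention-pooled document vector

sentences = ["A vaccine trial started.", "The weather was nice.",
             "Black hole imaging uses interferometry."]
kept = level1_filter(sentences)                  # drops the off-topic sentence
rng = np.random.default_rng(2)
vecs = rng.normal(size=(len(kept), 8))           # stand-in sentence encodings
doc_vec = level2_attention(vecs, rng.normal(size=8))
print(len(kept), doc_vec.shape)                  # 2 (8,)
```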

    Long- and short-term recommendation model and updating method based on knowledge graph preference attention network
    Junhua GU, Shuai FAN, Ningning LI, Suqi ZHANG
    2022, 42(4):  1079-1086.  DOI: 10.11772/j.issn.1001-9081.2021071242

    Current research on knowledge graph based recommendation mainly focuses on model establishment and training, but in practical applications the model must be updated regularly by incremental methods to adapt to the changing preferences of new and old users. Most existing models only use users’ long-term interest representations for recommendation and ignore short-term interests; the aggregation methods used to obtain item vector representations from neighborhood entities lack interpretability; and catastrophic forgetting occurs during model updating. To address these problems, a Knowledge Graph Preference ATtention network based Long- and Short-term recommendation (KGPATLS) model and its updating method were proposed. Firstly, an aggregation method using a preference attention network and a user representation combining users’ long- and short-term interests were proposed in the KGPATLS model. Then, in order to alleviate catastrophic forgetting during model updating, an incremental updating method Fusing Predict Sampling and Knowledge Distillation (FPSKD) was proposed. The proposed model and updating method were tested on the MovieLens-1M and Last.FM datasets. Compared with the best baseline model, Knowledge Graph Convolutional Network (KGCN), KGPATLS improves the Area Under Curve (AUC) by 2.2% and 1.4% and the Accuracy (Acc) by 2.5% and 2.9% on the two datasets respectively. Compared with three baseline incremental updating methods on the two datasets, the AUC and Acc of FPSKD are better than those of Fine Tune and Random Sampling, while its training time is reduced to about one eighth and one quarter of that of Full Batch respectively. Experimental results verify the performance of the KGPATLS model and show that FPSKD can update the model efficiently while maintaining model performance.
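
    The knowledge-distillation side of such an incremental update can be sketched as follows (layer sizes, the 0.5 trade-off weight and the linear stand-in models are all hypothetical): the frozen old model's predictions regularize the updated model so that learning from incremental data does not erase old behavior.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

old_model = nn.Linear(16, 2)                     # stands in for the previous model
new_model = nn.Linear(16, 2)
new_model.load_state_dict(old_model.state_dict())
for p in old_model.parameters():
    p.requires_grad_(False)                      # teacher is frozen

opt = torch.optim.Adam(new_model.parameters(), lr=1e-3)
x = torch.randn(32, 16)                          # incremental interaction batch
y = torch.randint(0, 2, (32,))

for _ in range(10):
    logits = new_model(x)
    ce = F.cross_entropy(logits, y)              # fit the new data
    kd = F.kl_div(F.log_softmax(logits, dim=1),  # stay close to old predictions
                  F.softmax(old_model(x), dim=1), reduction="batchmean")
    loss = ce + 0.5 * kd                         # 0.5: hypothetical trade-off weight
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```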

    Knowledge graph attention network fusing collaborative filtering information
    Junhua GU, Rui WANG, Ningning LI, Suqi ZHANG
    2022, 42(4):  1087-1092.  DOI: 10.11772/j.issn.1001-9081.2021071269

    Since Knowledge Graph (KG) can alleviate the data sparsity and cold start problems of collaborative filtering algorithms, it has been widely studied and applied in the recommendation field. Many existing KG-based recommendation models confuse the collaborative filtering information in the user-item bipartite graph with the association information between entities in the KG, so the learned user and item vectors cannot accurately express the characteristics of users and items, and wrong information may even be introduced to interfere with recommendation. Regarding these issues, a model called KG Attention Network fusing Collaborative Filtering information (KGANCF) was proposed. Firstly, the collaborative filtering information of users and items was extracted from the user-item bipartite graph by the collaborative filtering layer of the network, avoiding interference from the entity information of the KG. Then, a graph attention mechanism was applied in the KG attention embedding layer to extract the attribute information closely related to users and items from the KG. Finally, the collaborative filtering information and the attribute information of the KG were merged at the prediction layer to obtain the final vector representations of users and items, and the users’ scores on items were predicted. Experiments were carried out on the MovieLens-20M and Last.FM datasets. Compared with Collaborative Knowledge-aware Attentive Network (CKAN), KGANCF improves F1-score by 1.1 percentage points and Area Under Curve (AUC) by 0.6 percentage points on MovieLens-20M, and improves F1-score by 3.3 percentage points and AUC by 8.5 percentage points on Last.FM. Experimental results show that KGANCF can effectively improve the accuracy of recommendation results, and it is significantly better than the CKE (Collaborative Knowledge base Embedding), KGCN (Knowledge Graph Convolutional Network), KGAT (Knowledge Graph Attention Network) and CKAN models on datasets with sparse KGs.

    Knowledge graph recommendation model with multiple time scales and feature enhancement
    Suqi ZHANG, Xinxin WANG, Shiyao SHE, Junhua GU
    2022, 42(4):  1093-1098.  DOI: 10.11772/j.issn.1001-9081.2021071241

    Aiming at the problem that existing knowledge graph recommendation models consider neither the periodic features of users nor the influence of the items to be recommended on users’ recent interests, a knowledge graph recommendation model with Multiple Time scales and Feature Enhancement (MTFE) was proposed. Firstly, a Long Short-Term Memory (LSTM) network was used to mine the user’s periodic features on different time scales and integrate them into the user representation. Then, an attention mechanism was used to mine the features in the items to be recommended that are strongly correlated with the user’s recent features, and these enhanced features were integrated into the item representation. Finally, a scoring function was used to calculate the user’s ratings of the items to be recommended. The proposed model was compared with the PER (Personalized Entity Recommendation), CKE (Collaborative Knowledge base Embedding), LibFM, RippleNet, KGCN (Knowledge Graph Convolutional Network) and CKAN (Collaborative Knowledge-aware Attentive Network) knowledge graph recommendation models on the real datasets Last.FM, MovieLens-1M and MovieLens-20M. Experimental results show that compared with the model with the best prediction performance, MTFE improves the F1 value by 0.78, 1.63 and 1.92 percentage points and the Area Under ROC Curve (AUC) metric by 3.94, 2.73 and 1.15 percentage points on the three datasets respectively. In summary, the proposed model achieves a better recommendation effect than the comparison knowledge graph recommendation models.

    Sentiment analysis based on sentiment lexicon and stacked residual Bi-LSTM network
    Haoran LUO, Qing YANG
    2022, 42(4):  1099-1107.  DOI: 10.11772/j.issn.1001-9081.2021071179

    Sentiment analysis, as a subdivision of Natural Language Processing (NLP), has evolved through sentiment lexicon based, machine learning based and deep learning based methods. Aiming at the low accuracy and overfitting of generic deep learning models used as text classifiers for Web reviews in a specific field, and at the low coverage and heavy workload of compiling a sentiment lexicon, a sentiment analysis model based on a sentiment lexicon and a stacked residual Bidirectional Long Short-Term Memory (Bi-LSTM) network was proposed. Firstly, the sentiment words in the lexicon were designed to cover the professional vocabulary of the research field of "educational robot", making up for the Bi-LSTM model’s insufficient accuracy in analyzing such texts. Then, Bi-LSTM and SnowNLP were used to reduce the workload of compiling the lexicon. The memory gate and forget gate structures of the Long Short-Term Memory (LSTM) network ensured that the relevance between preceding and following words in the review text was fully considered while selected words were forgotten, avoiding gradient explosion during back propagation. By introducing the stacked residual Bi-LSTM, the model was deepened to 8 layers while the residual connections avoided the "degradation" problem caused by stacking LSTM layers. Finally, the score weights of the two parts were set and adjusted appropriately, and the Sigmoid activation function was used to normalize the total score to the interval [0,1]; according to the division into [0,0.5] and (0.5,1], negative and positive sentiments were represented respectively, completing the sentiment classification. Experimental results show that the sentiment classification accuracy of the proposed model on the review dataset about "educational robot" is about 4.5 percentage points higher than that of the standard LSTM model and about 2.0 percentage points higher than that of the BERT (Bidirectional Encoder Representations from Transformers) model. In conclusion, the proposed model generalizes the combination of a sentiment lexicon and a deep learning classifier: by modifying the sentiment words in the lexicon and appropriately adjusting the number of layers and the structure of the deep learning model, it can be applied to accurate sentiment analysis of shopping reviews for all kinds of goods on e-commerce platforms, helping enterprises understand consumers’ shopping psychology and market demand, and providing consumers with a reference standard for the quality of goods.
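
    The final score-fusion step described above can be sketched as follows (the lexicon entries and the 0.4/0.6 weights are hypothetical): a lexicon score and a deep-model score are combined, squashed with Sigmoid into [0,1], and thresholded at 0.5 for negative/positive sentiment.

```python
import math

def lexicon_score(text, lexicon):
    """Sum polarity values of matched sentiment words."""
    return sum(v for w, v in lexicon.items() if w in text)

def classify(text, model_score, lexicon, w_lex=0.4, w_model=0.6):
    total = w_lex * lexicon_score(text, lexicon) + w_model * model_score
    prob = 1.0 / (1.0 + math.exp(-total))        # Sigmoid to [0,1]
    return "positive" if prob > 0.5 else "negative", prob

lexicon = {"intuitive": 1.5, "laggy": -2.0}      # toy domain lexicon entries
print(classify("the robot app feels laggy", model_score=-0.3, lexicon=lexicon))
```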

    Text sentiment analysis method combining generalized autoregressive pre-training language model and recurrent convolutional neural network
    Lie PAN, Cheng ZENG, Haifeng ZHANG, Chaodong WEN, Rusong HAO, Peng HE
    2022, 42(4):  1108-1115.  DOI: 10.11772/j.issn.1001-9081.2021071180

    Traditional machine learning methods fail to fully exploit semantic and association information when classifying the sentiment polarity of online review text. Although existing deep learning methods can extract semantic and contextual information, the process is often one-way, and there are deficiencies in obtaining the deep semantic information of review text. Aiming at the above problems, a text sentiment analysis method combining the generalized autoregressive pretraining language model XLNet and a Recurrent Convolutional Neural Network (RCNN) was proposed. Firstly, XLNet was used to represent the text features; by introducing the segment-level recurrence mechanism and relative positional encoding, the contextual information of review text was fully considered, effectively improving the expressiveness of the text features. Then, RCNN was used to train the text features in both directions and extract the contextual semantic information of the text at a deeper level, improving the overall performance on the sentiment analysis task. Experiments with the proposed method were carried out on three public datasets: weibo-100k, waimai-10k and ChnSentiCorp. The results show that the accuracy reaches 96.4%, 91.8% and 92.9% respectively, which proves the effectiveness of the proposed method on the sentiment analysis task.

    News topic text classification method based on BERT and feature projection network
    Haifeng ZHANG, Cheng ZENG, Lie PAN, Rusong HAO, Chaodong WEN, Peng HE
    2022, 42(4):  1116-1124.  DOI: 10.11772/j.issn.1001-9081.2021071257

    Concerning the lack of standard wording, fuzzy semantics and feature sparsity in news topic text, a news topic text classification method based on Bidirectional Encoder Representations from Transformers (BERT) and Feature Projection network (FPnet) was proposed. The method has two implementation modes. In mode 1, multi-layer fully connected features were extracted from the BERT output of the news topic text, and the final extracted text features were purified with the feature projection method to strengthen classification. In mode 2, the feature projection network was fused into the hidden layers of the BERT model, so that the classification features were enhanced and purified through hidden-layer feature projection. Experimental results on the Toutiao, Sohu News, THUCNews-L and THUCNews-S datasets show that both modes outperform the baseline BERT method in accuracy and macro-averaged F1 value, reaching the highest accuracies of 86.96%, 86.17%, 94.40% and 93.73% respectively, which proves the feasibility and effectiveness of the proposed method.
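
    One common formulation of feature-projection purification is sketched below (whether it matches this paper's exact FPnet variant is an assumption): remove from a feature vector its component along a learned "common feature" direction, keeping only the class-discriminative residual.

```python
import numpy as np

def purify(f, c):
    """Project f onto the orthogonal complement of common-feature direction c."""
    c_unit = c / np.linalg.norm(c)
    return f - (f @ c_unit) * c_unit

rng = np.random.default_rng(3)
f = rng.normal(size=768)                         # e.g. a BERT [CLS] feature
c = rng.normal(size=768)                         # common (non-discriminative) feature
f_pure = purify(f, c)
print(abs(f_pure @ c))                           # ~0: common component removed
```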

    Attention mechanism based Stack-CNN model to support Chinese medical questions and answers
    Teng TENG, Haiwei PAN, Kejia ZHANG, Xuelian MU, Ximing ZHANG, Weipeng CHEN
    2022, 42(4):  1125-1130.  DOI: 10.11772/j.issn.1001-9081.2021071272

    Most current Chinese question-answer matching technologies require word segmentation first, and segmenting Chinese medical text requires maintaining medical dictionaries to reduce the impact of segmentation errors on subsequent tasks. However, maintaining dictionaries requires a lot of manpower and domain knowledge, which makes word segmentation a persistent challenge. At the same time, existing Chinese medical question-answer matching methods model the question and the answer separately, without considering the relationship between the keywords contained in the question and those in the answer. Therefore, an Attention mechanism based Stack Convolutional Neural Network (Att-StackCNN) model was proposed to solve the Chinese medical question-answer matching problem. Firstly, character embedding was used to encode the questions and answers to obtain their respective character embedding matrices. Then, an attention matrix was constructed from the two character embedding matrices to obtain the respective feature attention mapping matrices. After that, a Stack Convolutional Neural Network (Stack-CNN) model was used to perform convolution on the above matrices simultaneously to obtain the respective semantic representations of the questions and answers. Finally, the similarity was calculated and used to compute the max-margin loss for updating the network parameters. On the cMedQA dataset, the Top-1 accuracy of the proposed model was about 1 percentage point higher than that of the Stack-CNN model and about 0.5 percentage points higher than that of the Multi-CNNs model. Experimental results show that the Att-StackCNN model can improve the matching of Chinese medical questions and answers.
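
    The attention-matrix construction can be sketched as follows (sizes hypothetical): each entry measures the match between one question character and one answer character, and row/column softmaxes give each side an attention feature map before the stacked convolutions.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(4)
Q = rng.normal(size=(12, 64))                    # question char embeddings
A = rng.normal(size=(30, 64))                    # answer char embeddings

M = Q @ A.T                                      # (12, 30) attention matrix
q_attn = softmax(M, axis=1) @ A                  # answer-aware question features
a_attn = softmax(M, axis=0).T @ Q                # question-aware answer features
print(q_attn.shape, a_attn.shape)                # (12, 64) (30, 64)
```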

    Improved federated weighted average algorithm
    Changyin LUO, Junyu WANG, Xuebin CHEN, Chundi MA, Shufen ZHANG
    2022, 42(4):  1131-1136.  DOI: 10.11772/j.issn.1001-9081.2021071264

    Aiming at the problem that the improved federated averaging algorithm based on the analytic hierarchy process is affected by subjective factors when measuring data quality, an improved federated weighted averaging algorithm was proposed to process multi-source data from the perspective of data quality. Firstly, the training samples were divided into pre-training and pre-testing samples. Then, the accuracy of the initial global model on the pre-training data was used as the quality weight of each data source. Finally, the quality weights were introduced into the federated averaging algorithm to re-update the weights of the global model. Simulation results show that the model trained by the improved federated weighted averaging algorithm achieves higher accuracy than the model trained by the traditional federated averaging algorithm, improved by 1.59% and 1.24% on equally and unequally divided datasets respectively. At the same time, compared with the traditional multi-party data retraining method, although the accuracy of the proposed model is slightly reduced, the security of the data and the model is improved.
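
    A minimal sketch of the quality-weighted averaging step (parameter vectors and accuracy values hypothetical): each source's weight is its initial-model accuracy on the pre-training samples, normalized, replacing the purely size-based weights of plain FedAvg.

```python
import numpy as np

def quality_weighted_average(client_params, quality_accuracies):
    w = np.asarray(quality_accuracies, dtype=float)
    w = w / w.sum()                              # normalize quality weights
    return sum(wi * p for wi, p in zip(w, client_params))

# Three clients' model parameter vectors and their measured data-quality scores.
params = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.4, 3.0])]
accs = [0.91, 0.88, 0.62]                        # accuracy of initial global model
print(quality_weighted_average(params, accs))
```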

    Ensemble classification algorithm based on dynamic weighting function
    Le WANG, Meng HAN, Xiaojuan LI, Ni ZHANG, Haodong CHENG
    2022, 42(4):  1137-1147.  DOI: 10.11772/j.issn.1001-9081.2021071259

    In data stream ensemble classification, to make the classifiers adapt to constantly changing data streams and to adjust the weights of base classifiers so that an appropriate classifier set is selected, an ensemble classification algorithm based on a dynamic weighting function was proposed. Firstly, a new weighting function was proposed to adjust the weights of the base classifiers, which were trained with continually updated data blocks. Then, the weighting function was used to make a reasonable selection of candidate classifiers. Finally, the incremental nature of decision trees was applied to the base classifiers to realize the classification of the data stream. Extensive experiments show that the performance of the proposed algorithm is not affected by block size; compared with the AUE2 algorithm, the average number of leaves is reduced by 681.3, the average number of nodes is reduced by 1 192.8, and the average tree depth is reduced by 4.42, while the accuracy is relatively improved and the time consumption is reduced. Experimental results show that the algorithm can not only guarantee accuracy but also save a lot of memory and time when classifying data streams.
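
    The block-wise dynamic weighting idea can be sketched as follows (the exponential weighting function and the error values are hypothetical stand-ins for the paper's function): each base classifier's weight decays with its error on the newest block, and only the top-weighted candidates stay in the ensemble.

```python
import math

def update_weights(ensemble, block, k=5):
    for clf in ensemble:
        err = clf["error_fn"](block)             # error on the newest data block
        clf["weight"] = math.exp(-2.0 * err)     # hypothetical weighting function
    ensemble.sort(key=lambda c: c["weight"], reverse=True)
    return ensemble[:k]                          # keep an appropriate classifier set

ensemble = [{"name": f"tree{i}", "error_fn": lambda b, i=i: 0.1 * i}
            for i in range(8)]
selected = update_weights(ensemble, block=None, k=3)
print([c["name"] for c in selected])             # ['tree0', 'tree1', 'tree2']
```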

    Sparse subspace clustering method based on random blocking
    Qi ZHANG, Bochuan ZHENG, Zheng ZHANG, Huanhuan ZHOU
    2022, 42(4):  1148-1154.  DOI: 10.11772/j.issn.1001-9081.2021071271

    Aiming at the large clustering error of Sparse Subspace Clustering (SSC) methods, an SSC method based on random blocking was proposed. First, the original problem dataset was divided randomly into several subsets to construct several sub-problems. Then, after obtaining the coefficient matrices of the sub-problems by the sparse subspace Alternating Direction Method of Multipliers (ADMM), these coefficient matrices were expanded to the size of the original problem and integrated into one coefficient matrix. Finally, a similarity matrix was calculated from the integrated coefficient matrix, and the clustering result of the original problem was obtained using a Spectral Clustering (SC) algorithm. The SSC method based on random blocking reduces the subspace clustering error by 3.12 percentage points on average compared with the best among the SSC, Stochastic Sparse Subspace Clustering via Orthogonal Matching Pursuit with Consensus (S3COMP-C), scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit (SSCOMP), SC and K-Means algorithms, and its mutual information, Rand index and entropy are all significantly better than those of the comparison algorithms. Experimental results show that the SSC method based on random blocking can significantly reduce subspace clustering error and improve clustering performance.
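
    The random-blocking pipeline can be sketched as below; note the per-block solver here is a Lasso self-representation standing in for the paper's ADMM step, and the data, block count and regularization are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import Lasso

def block_self_representation(Xb):
    """Sparse coefficients expressing each point by the others in its block."""
    n = Xb.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        lasso = Lasso(alpha=0.05, max_iter=5000)
        lasso.fit(Xb[others].T, Xb[i])           # x_i ~ sum_j c_j x_j, j != i
        C[i, others] = lasso.coef_
    return C

rng = np.random.default_rng(5)
X = rng.normal(size=(60, 10))                    # 60 points in R^10
idx = rng.permutation(60)
C_full = np.zeros((60, 60))
for block in np.array_split(idx, 3):             # random blocks -> sub-problems
    Cb = block_self_representation(X[block])
    C_full[np.ix_(block, block)] = Cb            # expand back to full size

W = np.abs(C_full) + np.abs(C_full).T            # symmetric similarity matrix
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(W + 1e-6)
print(labels[:10])
```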

    Influence maximization algorithm based on node coverage and structural hole
    Jie YANG, Mingyang ZHANG, Xiaobin RUI, Zhixiao WANG
    2022, 42(4):  1155-1161.  DOI: 10.11772/j.issn.1001-9081.2021071256

    Influence maximization is one of the important issues in social network analysis. It aims to identify a small group of seed nodes such that, when these nodes act as initial spreaders, information is spread to as many of the remaining nodes in the network as possible. Existing heuristic algorithms based on network topology usually consider only a single network centrality and fail to comprehensively combine node characteristics with network topology, so their performance is unstable and easily affected by the network structure. To solve this problem, an influence maximization algorithm based on Node Coverage and Structural Hole (NCSH) was proposed. Firstly, the coverages and network constraint coefficients of all nodes were calculated. Then, seeds were selected by the principle of maximum coverage gain; if multiple nodes had the same gain, the seed was selected by the principle of minimum network constraint coefficient. These steps were repeated until all seeds were selected. NCSH maintains good performance on six real networks under different numbers of seeds and different spreading probabilities: it achieves 3.8% higher node coverage on average than the similar NCA (Node Coverage Algorithm), and 43% lower time consumption than the similar SHDD (maximization algorithm based on Structure Hole and DegreeDiscount). The experimental results show that NCSH can effectively solve the influence maximization problem.
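
    A minimal sketch of this selection rule follows, assuming Burt's constraint (as implemented in networkx) is the intended structural-hole measure: pick by maximum coverage gain, break ties by minimum network constraint coefficient.

```python
import networkx as nx

def select_seeds(G, k):
    constraint = nx.constraint(G)                # structural-hole indicator
    covered, seeds = set(), []
    for _ in range(k):
        def gain(v):
            return len(({v} | set(G[v])) - covered)
        best = max((v for v in G if v not in seeds),
                   key=lambda v: (gain(v), -constraint.get(v, float("inf"))))
        seeds.append(best)
        covered |= {best} | set(G[best])
    return seeds

G = nx.karate_club_graph()
print(select_seeds(G, 3))
```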

    Overlapping community detection algorithm combining K-shell and label entropy
    Jing CHEN, Jiangchuan LIU, Nana WEI
    2022, 42(4):  1162-1169.  DOI: 10.11772/j.issn.1001-9081.2021071183

    In order to solve the insufficient stability and poor accuracy of label propagation algorithms, an Overlapping Community detection algorithm combining K-shell and label Entropy in Label Propagation (OCKELP) was proposed. Firstly, the K-shell algorithm was used to reduce the label initialization time, and an update order based on label entropy was used to improve the stability of the algorithm. Secondly, a comprehensive influence measure was introduced for label selection, fusing community-level information and local node information to improve the accuracy of the algorithm. Compared with the Community Overlap PRopagation Algorithm (COPRA), Overlapping community detection in complex networks based on Multi Kernel Label Propagation (OMKLP) and the Speaker-listener Label Propagation Algorithm (SLPA), OCKELP achieves modularity improvements of up to about 68.64%, 53.99% and 42.29% respectively on real network datasets. It also has obvious advantages over the other three algorithms in the Normalized Mutual Information (NMI) value on artificial network datasets, and as the number of communities to which overlapping nodes belong increases, it can still uncover the real community structure.
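
    A toy sketch of the initialization and ordering idea (the propagation rule itself is omitted, and the label distributions below are hypothetical stand-ins built from neighbor shell values): K-shell values seed the labels, and nodes are updated in ascending order of label entropy so that "certain" nodes stabilize before ambiguous ones.

```python
import math
import networkx as nx

def label_entropy(label_dist):
    ps = [p for p in label_dist.values() if p > 0]
    return -sum(p * math.log(p) for p in ps)

G = nx.karate_club_graph()
shells = nx.core_number(G)                       # K-shell decomposition
# Toy label distributions: each node initially mixes its neighbors' shell ids.
dists = {}
for v in G:
    labels = [shells[u] for u in G[v]]
    dists[v] = {l: labels.count(l) / len(labels) for l in set(labels)}

update_order = sorted(G, key=lambda v: label_entropy(dists[v]))
print(update_order[:5])                          # lowest-entropy nodes first
```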

    Enterprise portrait construction method based on label layering and deepening modeling
    Xingshuo DING, Xiang LI, Qian XIE
    2022, 42(4):  1170-1177.  DOI: 10.11772/j.issn.1001-9081.2021071248

    Label modeling is the basic task of label system construction and portrait construction. Traditional label modeling methods have difficulty processing fuzzy labels, extract labels unreasonably, and cannot effectively integrate multi-modal entities and multi-dimensional relationships. Aiming at these problems, an enterprise portrait construction method based on label layering and deepening modeling, called EPLLD (Enterprise Portrait of Label Layering and Deepening), was proposed. Firstly, multi-characteristic information was extracted through multi-source information fusion, and the fuzzy labels of enterprises (such as labels in the wholesale and retail industries that cannot fully summarize enterprise characteristics) were counted and screened. Secondly, a professional domain lexicon was established for feature expansion, and the BERT (Bidirectional Encoder Representations from Transformers) language model was combined for multi-feature extraction. Thirdly, Bi-directional Long Short-Term Memory (BiLSTM) was used to obtain fuzzy label deepening results. Finally, keywords were extracted with TF-IDF (Term Frequency-Inverse Document Frequency), TextRank and the Latent Dirichlet Allocation (LDA) model to achieve label layering and deepening modeling. Experimental analysis on the same enterprise dataset shows that the precision of EPLLD in the fuzzy label deepening task is 91.11%, which is higher than those of 8 label processing methods such as BiLSTM+Attention and BERT+Deep CNN.

    Method for discovering important nodes in food safety standard reference network based on multi-attribute comprehensive evaluation
    Zhigang HAO, Li QIN
    2022, 42(4):  1178-1185.  DOI: 10.11772/j.issn.1001-9081.2021071245

    Aiming at how to use the food safety standard reference network to find, among the many national food safety standards, the key standards that greatly affect food safety inspection and detection, a method for discovering the important nodes in the food safety standard reference network based on multi-attribute comprehensive evaluation was proposed. Firstly, the importance of standard nodes was evaluated using degree centrality, closeness centrality and betweenness centrality from social network analysis, as well as the Web page importance evaluation algorithm PageRank. Secondly, the Analytic Hierarchy Process (AHP) was used to calculate the weight of each evaluation index, and a multi-attribute decision-making method based on TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) was used to comprehensively evaluate the importance of standard nodes and find the important ones. Thirdly, the important nodes obtained by the comprehensive evaluation and those obtained by the degree-based evaluation were deleted from their respective reference networks, and the connectivity of the networks after deletion was tested: the worse the connectivity, the more important the deleted nodes. Finally, the Louvain community detection algorithm was used to test network connectivity, that is, to find the communities of the network nodes; the more nodes left outside the communities, the worse the network connectivity. Experimental results show that after deleting the important nodes found by the multi-attribute comprehensive evaluation method, more nodes are left outside the communities than with the degree-based evaluation method, proving that the proposed method better finds the important nodes in the reference network. The method helps standard makers quickly grasp the core contents and key nodes when revising and updating standards, and plays a guiding role in the construction of the national food safety standard system.
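
    The multi-attribute evaluation can be sketched as below (a random graph stands in for the reference network, and the fixed weights stand in for AHP-derived ones): four centrality-style indicators per node are combined with TOPSIS to rank node importance.

```python
import numpy as np
import networkx as nx

G = nx.gnp_random_graph(30, 0.15, seed=7, directed=True)
M = np.column_stack([
    list(nx.degree_centrality(G).values()),
    list(nx.closeness_centrality(G).values()),
    list(nx.betweenness_centrality(G).values()),
    list(nx.pagerank(G).values()),
])
w = np.array([0.2, 0.2, 0.3, 0.3])               # stand-in for AHP-derived weights

R = M / np.linalg.norm(M, axis=0)                # vector-normalize each indicator
V = R * w
best, worst = V.max(axis=0), V.min(axis=0)       # ideal / anti-ideal solutions
d_best = np.linalg.norm(V - best, axis=1)
d_worst = np.linalg.norm(V - worst, axis=1)
closeness = d_worst / (d_best + d_worst)         # TOPSIS relative closeness
print(np.argsort(closeness)[::-1][:5])           # top-5 important nodes
```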

    Harris hawks optimization algorithm based on chemotaxis correction
    Cheng ZHU, Xuhua PAN, Yong ZHANG
    2022, 42(4):  1186-1193.  DOI: 10.11772/j.issn.1001-9081.2021071244

    Focusing on the slow convergence of the Harris Hawks Optimization (HHO) algorithm and its tendency to fall into local optima, an improved HHO algorithm called Chemotaxis Correction HHO (CC-HHO) was proposed. Firstly, the state of the convergence curve was identified by calculating the decline rate and change weight of the optimal solution. Secondly, the chemotaxis correction mechanism of the Bacterial Foraging Optimization (BFO) algorithm was introduced into the local search stage to improve optimization accuracy. Thirdly, the law of energy consumption was integrated into the updating of the escape energy factor and the jump distance to balance exploration and exploitation. Fourthly, elite selection over different combinations of the optimal and sub-optimal solutions was used to improve the universality of the global search. Finally, when the search fell into a local optimum, the escape energy was perturbed to force a jump out of it. The performance of the improved algorithm was tested on ten benchmark functions. The results show that the search accuracy of CC-HHO on unimodal functions is better than those of the Gravitational Search Algorithm (GSA), Particle Swarm Optimization (PSO) algorithm, Whale Optimization Algorithm (WOA) and four other improved HHO algorithms by more than ten orders of magnitude, and by more than one order of magnitude on multimodal functions; with search stability improved by more than 10% on average, the proposed algorithm converges significantly faster than the above comparison algorithms, with a more obvious convergence trend. Experimental results show that CC-HHO effectively improves the efficiency and robustness of the original algorithm.
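
    A BFO-style chemotaxis refinement step can be sketched as follows (step size, trial and swim counts are hypothetical, and the toy sphere function stands in for a benchmark): from the current best, take small random-direction steps and keep them only while they improve fitness, sharpening local search accuracy.

```python
import numpy as np

rng = np.random.default_rng(8)

def chemotaxis(x, fitness, step=0.05, trials=10, swims=4):
    best, best_f = x.copy(), fitness(x)
    for _ in range(trials):
        d = rng.normal(size=x.size)
        d /= np.linalg.norm(d)                   # random unit "tumble" direction
        cand = best + step * d
        for _ in range(swims):                   # keep "swimming" while improving
            cand_f = fitness(cand)
            if cand_f >= best_f:
                break
            best, best_f = cand.copy(), cand_f
            cand = best + step * d
    return best, best_f

sphere = lambda x: float(np.sum(x ** 2))         # toy minimization objective
print(chemotaxis(np.array([0.8, -0.6, 0.3]), sphere))
```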

    Solving dynamic traveling salesman problem by deep reinforcement learning
    Haojie CHEN, Jiangting FAN, Yong LIU
    2022, 42(4):  1194-1200.  DOI: 10.11772/j.issn.1001-9081.2021071253

    Designing a unified framework that solves combinatorial optimization problems without hand-designed heuristic algorithms has become a research hotspot in machine learning. Mature techniques currently target mainly static combinatorial optimization problems, while combinatorial optimization problems with dynamic changes are not yet fully solved. To solve the above problems, a lightweight model called Dy4TSP (Dynamic model for Traveling Salesman Problems) was proposed, which combines a multi-head attention mechanism with distributed reinforcement learning to solve the traveling salesman problem on a dynamic graph. Firstly, the node representation vectors from a graph convolutional neural network were processed by a prediction network based on the multi-head attention mechanism. Then, a distributed reinforcement learning algorithm was used to quickly predict the possibility that each node in the graph appears in the optimal solution, and the optimal solution spaces of the problems under different possibilities were comprehensively explored. Finally, the trained model generated action decision sequences meeting the specific reward function in real time. The model was evaluated on three typical combinatorial optimization problems, and the experimental results show that its solution quality is 0.15 to 0.37 units higher than that of the open source solver LKH3 (Lin-Kernighan-Helsgaun 3), and significantly better than those of the latest algorithms such as Graph Attention Network with Edge Embedding (EGATE). On other dynamic traveling salesman problems, the model reaches an optimal path gap of 0.1 to 1.05, with slightly better results.

    Network intrusion detection algorithm based on sparrow search algorithm and improved particle swarm optimization algorithm
    Bing GAO, Ya ZHENG, Jing QIN, Qijie ZOU, Zumin WANG
    2022, 42(4):  1201-1206.  DOI: 10.11772/j.issn.1001-9081.2021071276

    Aiming at the insufficient adaptive ability of network intrusion detection models, the large-scale fast search ability of the Sparrow Search Algorithm (SSA) was introduced into the Particle Swarm Optimization (PSO) algorithm, and a network intrusion detection algorithm based on SSA and improved PSO (SSAPSO) was proposed. In the algorithm, by optimizing the hard-to-set parameters of the Light Gradient Boosting Machine (LightGBM) algorithm, PSO converged quickly while maintaining optimization accuracy, and an optimal network intrusion detection model was obtained. Simulation results show that SSAPSO converges faster than the basic PSO algorithm on four benchmark functions. Compared with the Categorical features+gradient Boosting (CatBoost) algorithm, SSAPSO-optimized LightGBM (SSAPSO-LightGBM) improves the accuracy, recall, precision and F1_score by 15.12%, 3.25%, 21.26% and 12.25% respectively on the KDDCUP99 dataset. Compared with the plain LightGBM algorithm, SSAPSO-LightGBM improves the detection accuracy for Normal traffic and for Remote-to-Login (R2L), User-to-Root (U2R) and Probing (PROBE) attacks on the above dataset by 0.61%, 3.14%, 4.24%, 1.04% and 5.03% respectively.
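
    One plausible shape of such an SSA-PSO hybrid search loop is sketched below on a toy objective (a stand-in for cross-validated LightGBM error, which would slot in as the fitness function; all coefficients, population sizes and the producer-update rule are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(9)
fitness = lambda p: (p[0] - 0.1) ** 2 + (p[1] - 31) ** 2 / 1e3   # toy "CV error"

n, dim = 20, 2                                   # e.g. (learning_rate, num_leaves)
lo, hi = np.array([0.01, 8.0]), np.array([0.3, 256.0])
X = rng.uniform(lo, hi, size=(n, dim))
V = np.zeros((n, dim))
pbest, pbest_f = X.copy(), np.array([fitness(x) for x in X])

for t in range(50):
    g = pbest[pbest_f.argmin()]                  # global best position
    order = np.argsort([fitness(x) for x in X])
    producers = order[: n // 5]                  # SSA: best fifth explore widely
    X[producers] *= np.exp(-rng.random((len(producers), 1)))
    # PSO velocity update for the rest, pulled toward pbest and gbest.
    rest = order[n // 5:]
    r1, r2 = rng.random((len(rest), dim)), rng.random((len(rest), dim))
    V[rest] = (0.7 * V[rest] + 1.5 * r1 * (pbest[rest] - X[rest])
               + 1.5 * r2 * (g - X[rest]))
    X[rest] += V[rest]
    X = np.clip(X, lo, hi)
    f = np.array([fitness(x) for x in X])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = X[improved], f[improved]

print(pbest[pbest_f.argmin()], pbest_f.min())    # near (0.1, 31)
```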

    Fault diagnosis method based on improved one-dimensional convolutional and bidirectional long short-term memory neural networks
    Yongfeng DONG, Yuehua SUN, Lichao GAO, Peng HAN, Haipeng JI
    2022, 42(4):  1207-1215.  DOI: 10.11772/j.issn.1001-9081.2021071243

    Aiming at the slow model convergence and low diagnosis accuracy caused by strongly noisy time-series fault diagnosis data in industrial settings, an improved one-Dimensional Convolutional and Bidirectional Long Short-Term Memory (1DCNN-BiLSTM) neural network fault diagnosis method was proposed. The method includes preprocessing of fault vibration signals, automatic feature extraction, and vibration signal classification. Firstly, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) was used to preprocess the original vibration signal. Secondly, a 1DCNN-BiLSTM dual-channel model was constructed, and the processed signal was fed into a Bidirectional Long Short-Term Memory (BiLSTM) channel and a one-Dimensional Convolutional Neural Network (1DCNN) channel to fully extract the temporal correlation features, the locally spatial non-correlation features and the weak periodic patterns of the signal. Thirdly, to cope with the strong noise in the signal, the Squeeze and Excitation Network (SENet) module was improved and applied to the two different channels. Finally, the features extracted from the two channels were fused in a fully connected layer, and accurate identification of equipment faults was realized with a Softmax classifier. Experimental comparison and verification on the Case Western Reserve University bearing dataset show that, after applying the improved SENet module to both the 1DCNN channel and the stacked BiLSTM channel, the dual-channel model achieves the highest diagnosis accuracy of 96.87% with fast convergence, outperforming traditional single-channel models and effectively improving the efficiency of equipment fault diagnosis.
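
    A minimal sketch of such a dual-channel architecture follows (channel widths, kernel sizes and a plain SE gate are hypothetical simplifications of the paper's improved SENet module and layer stack):

```python
import torch
import torch.nn as nn

class SE1d(nn.Module):
    def __init__(self, c, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, c), nn.Sigmoid())
    def forward(self, x):                        # x: (batch, channels, length)
        s = self.fc(x.mean(dim=2))               # squeeze: global average pool
        return x * s.unsqueeze(2)                # excitation: channel re-weighting

class DualChannelNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv1d(1, 16, 64, stride=8, padding=28),
                                 nn.ReLU(), SE1d(16), nn.AdaptiveAvgPool1d(1))
        self.lstm = nn.LSTM(input_size=32, hidden_size=32,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(16 + 64, n_classes)
    def forward(self, x):                        # x: (batch, 1, 2048) signal
        c = self.cnn(x).squeeze(2)               # (batch, 16)
        seq = x.view(x.size(0), -1, 32)          # split signal into 64 steps
        out, _ = self.lstm(seq)
        l = out[:, -1, :]                        # (batch, 64) last BiLSTM state
        return self.fc(torch.cat([c, l], dim=1))

print(DualChannelNet()(torch.randn(4, 1, 2048)).shape)   # torch.Size([4, 10])
```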

    Secure offloading optimization of wireless powered mobile edge computing system
    Xuling ZENG, Taoshen LI, Jian GONG, Lijun DU
    2022, 42(4):  1216-1224.  DOI: 10.11772/j.issn.1001-9081.2021071254

    Aiming at the problem of malicious eavesdropping nodes in an energy-limited multi-user Mobile Edge Computing (MEC) system, a joint Wireless Power Transfer (WPT) and MEC secure partial computation offloading scheme was proposed. To minimize the energy consumption of the system Access Point (AP), the AP energy transmission covariance matrix, local CPU frequencies, user offloading bits, user offloading time allocation and user transmission power were jointly optimized under computing delay, secure offloading and energy harvesting constraints. Since the AP energy consumption minimization problem is non-convex, the original problem was first transformed into a convex problem by the Difference of Convex functions Algorithm (DCA), and then the optimal solution was obtained in semi-closed form by the Lagrange duality method. When the number of computation tasks is 5×10^5 bits, the energy consumption of the secure partial offloading scheme is reduced by 61.3% and 84.4% compared with local computing and secure full offloading respectively; when the distance to the eavesdropping node exceeds 25 m, the energy consumed by the secure partial offloading scheme is much less than that of local computing and secure full offloading. Simulation results show that the proposed scheme can effectively reduce AP energy consumption and enhance the system performance gain while ensuring secure offloading at the physical layer.

    Transmission control protocol congestion control switching scheme based on scenario change
    Hanguang LAI, Qing LI, Yong JIANG
    2022, 42(4):  1225-1234.  DOI: 10.11772/j.issn.1001-9081.2021050722

    Aiming at the problem that the performance of lightweight learning-based congestion control algorithms drops sharply in some scenarios, a transmission control protocol congestion control switching scheme based on scenario change was proposed. Firstly, the real-time network environment was simulated by the scheme. Then, the scenario was identified according to the real-time environment parameters. Finally, the current congestion control algorithm was switched to the relatively optimal lightweight learning-based congestion control algorithm for that scenario. Experimental results show that, compared with the original schemes using a single congestion control algorithm, such as congestion control based on measuring Bottleneck Bandwidth and Round-trip propagation time (BBR) and Performance-oriented Congestion Control (PCC), the proposed scheme significantly improves network performance, with a total throughput increase of more than 5% and a total delay reduction of more than 10%.
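
    The switching logic can be sketched as a scenario classifier plus a lookup table (the thresholds and the scenario-to-algorithm mapping below are hypothetical; the actual installation of the chosen algorithm into the transport stack is not shown):

```python
SCENARIO_TO_CCA = {
    "high_bdp": "bbr",                           # long fat pipes
    "lossy_wireless": "pcc",                     # random (non-congestion) loss
    "shallow_buffer": "cubic",                   # fallback classical algorithm
}

def identify_scenario(rtt_ms, loss_rate, bandwidth_mbps):
    if bandwidth_mbps * rtt_ms > 5000:           # large bandwidth-delay product
        return "high_bdp"
    if loss_rate > 0.01:
        return "lossy_wireless"
    return "shallow_buffer"

def pick_algorithm(rtt_ms, loss_rate, bandwidth_mbps):
    return SCENARIO_TO_CCA[identify_scenario(rtt_ms, loss_rate, bandwidth_mbps)]

print(pick_algorithm(rtt_ms=120, loss_rate=0.001, bandwidth_mbps=100))  # bbr
```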

    Data sharing method of industrial internet of things based on federal incremental learning
    Jing LIU, Zhihong DONG, Zheyu ZHANG, Zhigang SUN, Haipeng JI
    2022, 42(4):  1235-1243.  DOI: 10.11772/j.issn.1001-9081.2021071182
    Abstract ( )   HTML ( )   PDF (763KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    In view of the large amount of newly generated data in the Industrial Internet of Things (IIOT) and the imbalance of data at the factory sub-ends, a data sharing method of IIOT based on Federal Incremental Learning (FIL-IIOT) was proposed. Firstly, the industry federation model was distributed to the factory sub-ends as the local initial model. Then, a federal sub-end optimization algorithm was proposed to dynamically adjust the participating subset. Finally, the incremental weights of the factory sub-ends were calculated through the federal incremental learning algorithm, thereby quickly integrating the new state data with the original industry federation model. Experimental results on the Case Western Reserve University (CWRU) bearing failure dataset show that the proposed FIL-IIOT makes the accuracy of bearing fault diagnosis reach 93.15%, which is 6.18 percentage points and 2.59 percentage points higher than those of the Federated Averaging (FedAvg) algorithm and the FIL-IIOT of Non Increment (FIL-IIOT-NI) method, respectively. The proposed method meets the need for continuous optimization of the industry federation model based on industrial incremental data.
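
    A minimal sketch of the aggregation idea, under the assumption that factory sub-ends are weighted by their share of newly collected samples and blended with the existing industry federation model; the blending factor and weighting rule are illustrative, not the paper's exact federal incremental algorithm:

```python
import numpy as np

def fed_incremental_avg(global_w, client_ws, new_sample_counts, blend=0.5):
    """Aggregate client models, weighting each by its share of new samples,
    then blend the aggregate into the existing industry federation model."""
    counts = np.asarray(new_sample_counts, dtype=float)
    alphas = counts / counts.sum()
    update = sum(a * w for a, w in zip(alphas, client_ws))
    return (1 - blend) * global_w + blend * update

w_global = np.zeros(3)                        # stand-in for model parameters
w_clients = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])]
print(fed_incremental_avg(w_global, w_clients, new_sample_counts=[300, 100]))
```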

    Fast failure recovery method based on local redundant hybrid code
    Jingyu LIU, Qiuxia NIU, Xiaoyan LI, Qiaoshuo SHI, Youxi WU
    2022, 42(4):  1244-1252.  DOI: 10.11772/j.issn.1001-9081.2021111917
    Abstract ( )   HTML ( )   PDF (926KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    The parity blocks of a Maximum-Distance-Separable (MDS) code are all global parity blocks, so the length of the reconstruction chain increases with the expansion of the storage system, and the reconstruction performance gradually decreases. Aiming at the above problems, a new type of Non-Maximum-Distance-Separable (Non-MDS) code called local redundant hybrid code Code-LM(sc) was proposed. Firstly, two types of local parity blocks, the horizontal parity block in the strip-set and the horizontal-diagonal parity block, were added to the strip-sets to reduce the length of the reconstruction chain, and the parity layout of the local redundant hybrid code was designed. Then, four reconstruction formulations of the lost data blocks were designed according to the generation rules of the parity blocks and the common blocks existing in the reconstruction chains of different data blocks. Finally, double-disk failures were divided into three situations depending on the distances of the strip-sets where the failed disks are located, and the corresponding reconstruction methods were designed. Theoretical analysis and experimental results show that with the same storage scale, compared with Row-Diagonal Parity (RDP), the reconstruction time of Code-LM(sc) for single-disk failure and double-disk failure can be reduced by 84% and 77% respectively; compared with V2-Code, the reconstruction time of Code-LM(sc) for single-disk failure and double-disk failure can be reduced by 67% and 73% respectively. Therefore, the local redundant hybrid code can support fast recovery from disk failures and improve the reliability of the storage system.
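
    The basic primitive behind such local parity blocks is XOR parity within a stripe, as in the toy sketch below (block and stripe sizes are illustrative):

```python
import numpy as np

def horizontal_parity(blocks):
    """XOR all data blocks of one stripe into a single local parity block."""
    parity = np.zeros_like(blocks[0])
    for block in blocks:
        parity ^= block
    return parity

def rebuild_lost_block(surviving_blocks, parity):
    """Any single lost block equals the XOR of the parity with the survivors."""
    return horizontal_parity(surviving_blocks) ^ parity

stripe = [np.random.randint(0, 256, 8, dtype=np.uint8) for _ in range(4)]
p = horizontal_parity(stripe)
assert (rebuild_lost_block(stripe[1:], p) == stripe[0]).all()
```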

    Facial expression recognition algorithm based on combination of improved convolutional neural network and support vector machine
    Guifang QIAO, Shouming HOU, Yanyan LIU
    2022, 42(4):  1253-1259.  DOI: 10.11772/j.issn.1001-9081.2021071270
    Abstract ( )   HTML ( )   PDF (1504KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    In view of the problems of current Convolutional Neural Networks (CNN) that use end-layer features to recognize facial expressions, such as complex model structure, too many parameters and unsatisfactory recognition performance, an optimization algorithm based on the combination of an improved CNN and Support Vector Machine (SVM) was proposed. First, the network model was designed with the idea of successive convolutions to obtain more nonlinear activations. Then, an adaptive Global Average Pooling (GAP) layer was used to replace the fully connected layer of the traditional CNN to reduce the network parameters. Finally, in order to improve the generalization ability of the model, an SVM classifier instead of the traditional Softmax function was used to realize expression recognition. Experimental results show that the proposed algorithm achieves recognition accuracies of 73.4% and 98.06% on the Fer2013 and CK+ datasets respectively, which is 2.2 percentage points higher than the traditional LeNet-5 algorithm on the Fer2013 dataset. Moreover, the network model has a simple structure, fewer parameters and good robustness.
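
    A compact sketch of the CNN-feature-plus-SVM pipeline, assuming a tiny stand-in network, dummy 48×48 face crops and random labels purely for illustration:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

features = nn.Sequential(                     # tiny stand-in for the improved CNN
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),  # successive convolutions
    nn.AdaptiveAvgPool2d(1))                  # GAP replaces the fully connected layer

def extract(x):                               # x: (batch, 1, 48, 48) face crops
    with torch.no_grad():
        return features(x).flatten(1).numpy()

X = extract(torch.randn(64, 1, 48, 48))       # dummy data for illustration
y = np.random.randint(0, 7, 64)               # 7 expression classes
svm = SVC(kernel="rbf").fit(X, y)             # SVM head instead of Softmax
print(svm.predict(X[:3]))
```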

    Homologous spectrogram feature fusion with self-attention mechanism for bird sound classification
    Zhihua LIU, Wenjie CHEN, Aibin CHEN
    2022, 42(4):  1260-1268.  DOI: 10.11772/j.issn.1001-9081.2021071258
    Abstract ( )   HTML ( )   PDF (1376KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    At present, it is difficult for most deep learning models to classify bird sound under complex background noise. Because bird sound is continuous in the time domain and has high-low characteristics in the frequency domain, a fusion model of homologous spectrogram features was proposed for bird sound classification under complex background noise. Firstly, a Convolutional Neural Network (CNN) was used to extract the Mel-spectrogram features of bird sound. Then, the time domain and frequency domain dimensions of the same Mel-spectrogram feature were separately compressed to 1 by specific convolution and down-sampling operations, so that a frequency domain feature with only the high-low characteristics and a time domain feature with only the continuity characteristics were obtained; on this basis, the Mel-spectrogram was also processed in both domains to obtain a time-frequency domain feature with both continuity and high-low characteristics. Then, the self-attention mechanism was applied to the obtained time domain, frequency domain and time-frequency domain features to strengthen their respective characteristics. Finally, the decision-fused results of these three homologous spectrogram features were used for bird sound classification. The proposed model was used for audio classification of 8 bird species from the Xeno-canto website and achieved the best result in the comparison experiments with a Mean Average Precision (MAP) of 0.939. The experimental results show that the proposed model can alleviate the poor classification performance of bird sound under complex background noise.
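
    The axis-compression step can be sketched as follows, using plain average pooling as a stand-in for the paper's learned convolution and down-sampling, together with a single-head dot-product self-attention over the remaining axis; shapes are illustrative:

```python
import torch

mel = torch.randn(1, 1, 128, 400)             # (batch, channel, mel_bins, frames)

freq_feature = mel.mean(dim=3)                # collapse time: (1, 1, 128)
time_feature = mel.mean(dim=2)                # collapse frequency: (1, 1, 400)

def self_attention(f):
    """Single-head dot-product self-attention over the remaining axis."""
    x = f.transpose(1, 2)                     # (batch, steps, channels)
    scores = torch.softmax(x @ x.transpose(1, 2) / x.shape[-1] ** 0.5, dim=-1)
    return scores @ x                         # re-weighted feature sequence

print(self_attention(time_feature).shape)     # torch.Size([1, 400, 1])
```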

    Cascaded cross-domain feature fusion for virtual try-on
    Xinrong HU, Junyu ZHANG, Tao PENG, Junping LIU, Ruhan HE, Kai HE
    2022, 42(4):  1269-1274.  DOI: 10.11772/j.issn.1001-9081.2021071274
    Abstract ( )   HTML ( )   PDF (1058KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Virtual try-on technologies based on the image synthesis mask strategy can better retain the details of the clothing when the warped clothing is fused with the human body. However, because the position and structure of the human body and the clothing are difficult to align during the try-on process, the try-on result is likely to contain severe occlusion, affecting the visual effect. In order to solve the occlusion in the try-on process, a U-Net based generator was proposed, in which a cascaded spatial attention module and a channel attention module were added to the U-Net decoder, thereby achieving cross-domain fusion between the local features of the warped clothing and the global features of the human body. Specifically, first, the clothing was warped according to the target human body pose by predicting the Thin Plate Spline (TPS) transformation with a convolutional network. Then, the person representation information and the warped clothing were input into the proposed generator, and the mask image of the corresponding clothing area was obtained to render the intermediate result. Finally, the mask synthesis strategy was used to composite the warped clothing with the intermediate result to obtain the final try-on result. Experimental results show that the proposed method can not only reduce occlusion but also enhance image details. Compared with the Characteristic-Preserving Virtual Try-On Network (CP-VTON) method, the proposed method increases the average Peak Signal-to-Noise Ratio (PSNR) of the generated images by 10.47%, decreases the average Fréchet Inception Distance (FID) by 47.28%, and increases the average Structural SIMilarity (SSIM) by 4.16%.
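
    The final mask-synthesis step reduces to a standard alpha-composite of the warped clothing onto the intermediate rendering, as in this minimal sketch (array shapes are illustrative):

```python
import numpy as np

def mask_composite(warped_cloth, intermediate, mask):
    """final = mask * warped clothing + (1 - mask) * intermediate rendering."""
    mask = np.clip(mask, 0.0, 1.0)[..., None]   # (H, W) -> (H, W, 1)
    return mask * warped_cloth + (1.0 - mask) * intermediate

H, W = 256, 192
cloth = np.random.rand(H, W, 3)                 # warped clothing image
person = np.random.rand(H, W, 3)                # intermediate try-on result
soft_mask = np.random.rand(H, W)                # clothing-area mask from the generator
print(mask_composite(cloth, person, soft_mask).shape)   # (256, 192, 3)
```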

    Automatic detection algorithm for underground target based on adaptive double threshold
    Haifeng LI, Bifan ZHAO, Jinyi HOU, Huaichao WANG, Zhongcheng GUI
    2022, 42(4):  1275-1283.  DOI: 10.11772/j.issn.1001-9081.2021071263
    Abstract ( )   HTML ( )   PDF (1999KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    When using the Bscan images generated by Ground Penetrating Radar (GPR) to detect underground targets, current deep learning based object detection network models have some problems, such as a high demand for training samples, long time consumption, inability to distinguish the significance of targets, and difficulty in identifying complex targets. To solve the above problems, a double threshold segmentation algorithm based on the histogram was proposed. Firstly, based on the distribution characteristics of the GPR image histogram of underground targets, two thresholds for underground target segmentation were calculated quickly from the histogram. Then, a combined classifier model with Support Vector Machine (SVM) and LeNet was used to classify the segmentation results. Finally, the classification results were integrated and the accuracy values were counted. Compared with traditional threshold segmentation algorithms such as the Otsu and iterative methods, the structure of the underground target segmentation results obtained by the proposed algorithm is more complete and almost free of noise. On the real dataset, the average recognition accuracy of the proposed algorithm reaches more than 90%, which is more than 40% higher than that of the algorithm using a single classifier. The experimental results show that salient and non-salient underground targets can be effectively segmented at the same time, and the combined classifier can obtain better classification results. The proposed algorithm is suitable for automatic detection and recognition of underground targets with small sample datasets.
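
    A hedged sketch of histogram-based double-threshold segmentation: pick a low and a high threshold from the amplitude histogram and keep pixels outside the interval as strong reflections. The percentile rule below is an illustrative stand-in for the paper's derivation from the histogram's distribution characteristics:

```python
import numpy as np

def double_threshold(img, low_pct=5, high_pct=95):
    """Keep pixels outside [t_low, t_high]: strong negative/positive echoes."""
    t_low, t_high = np.percentile(img, [low_pct, high_pct])
    mask = (img <= t_low) | (img >= t_high)
    return mask, (t_low, t_high)

bscan = np.random.randn(256, 512)               # dummy Bscan amplitudes
mask, (lo, hi) = double_threshold(bscan)
print(f"thresholds: {lo:.2f}, {hi:.2f}; segmented pixels: {mask.sum()}")
```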

    Wildlife object detection combined with solving method of long-tail data
    Qianzhou CAI, Bochuan ZHENG, Xiangyin ZENG, Jin HOU
    2022, 42(4):  1284-1291.  DOI: 10.11772/j.issn.1001-9081.2021071279
    Abstract ( )   HTML ( )   PDF (4784KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Wild animal object detection based on infrared camera images is conducive to the research and protection of wild animals. Because of the large differences in the numbers of individuals of different wildlife species, wildlife datasets collected by infrared cameras suffer from the long-tail data problem of unevenly distributed sample numbers across species, which hinders the overall performance improvement of object detection neural network models. In order to solve the problem of low object detection accuracy caused by long-tail wildlife data, a method based on two-stage learning and re-weighting was proposed and applied to wildlife object detection based on YOLOv4-Tiny. Firstly, a new wildlife dataset with obvious long-tail characteristics was collected, labelled and constructed. Secondly, a two-stage method based on transfer learning was used to train the neural network: in the first stage, the classification loss function was trained without weighting; in the second stage, two improved re-weighting methods were proposed, and the weights obtained in the first stage were used as the pre-training weights for re-weighting training. Finally, the method was tested on the wildlife test set. Experimental results show that the proposed long-tail data solving method achieves 60.47% and 61.18% mAP (mean Average Precision) with the cross-entropy loss function and the focal loss function as the classification loss respectively, which is 3.30 percentage points and 5.16 percentage points higher than those of the no-weighting method under the two loss functions, and 2.14 percentage points higher than that of the unimproved effective sample weighting method under the focal loss function. This shows that the proposed method can improve the object detection performance of the YOLOv4-Tiny network on wildlife datasets with long-tail characteristics.
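
    One common re-weighting baseline of this kind is the class-balanced loss built on the effective number of samples; the sketch below shows that baseline (the paper proposes improved variants, which are not reproduced here):

```python
import torch
import torch.nn.functional as F

def class_balanced_weights(samples_per_class, beta=0.999):
    """Weights inversely proportional to the effective number of samples."""
    n = torch.as_tensor(samples_per_class, dtype=torch.float)
    eff_num = 1.0 - beta ** n
    weights = (1.0 - beta) / eff_num             # rarer classes get larger weights
    return weights / weights.sum() * len(samples_per_class)

counts = [5000, 800, 60]                         # head, middle and tail species
w = class_balanced_weights(counts)
logits, labels = torch.randn(4, 3), torch.tensor([0, 1, 2, 2])
print(w, F.cross_entropy(logits, labels, weight=w).item())
```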

    Safety helmet wearing detection algorithm based on improved YOLOv5
    Jin ZHANG, Peiqi QU, Cheng SUN, Meng LUO
    2022, 42(4):  1292-1300.  DOI: 10.11772/j.issn.1001-9081.2021071246
    Abstract ( )   HTML ( )   PDF (7633KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the problems of strong interference and low detection precision in existing safety helmet wearing detection, a safety helmet detection algorithm based on an improved YOLOv5 (You Only Look Once version 5) model was proposed. Firstly, for the problem of varying safety helmet sizes, the K-Means++ algorithm was used to redesign the anchor box sizes and match them to the corresponding feature layers. Secondly, the multi-spectral channel attention module was embedded in the feature extraction network, so that the network was able to learn the weight of each channel autonomously and enhance the information dissemination between features, thereby strengthening the network's ability to distinguish foreground from background. Finally, images of different sizes were input randomly during the training iterations to enhance the generalization ability of the algorithm. Experimental results show that on the self-built safety helmet wearing detection dataset, the proposed algorithm achieves a mean Average Precision (mAP) of 96.0%, an Average Precision (AP) of 96.7% for workers wearing safety helmets, and an AP of 95.2% for workers not wearing safety helmets. Compared with the YOLOv5 algorithm, the proposed algorithm increases the mAP of safety helmet wearing detection by 3.4 percentage points, meeting the accuracy requirement of safety helmet wearing detection in construction scenarios.
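
    The anchor-redesign step can be sketched by clustering ground-truth box widths and heights with k-means++ seeding and assigning the sorted clusters to feature layers by scale; the box data here is random and the layer assignment is the usual YOLO convention, not the paper's exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans

wh = np.abs(np.random.randn(500, 2)) * 80 + 10   # dummy (width, height) pairs

km = KMeans(n_clusters=9, init="k-means++", n_init=10, random_state=0).fit(wh)
anchors = km.cluster_centers_[np.argsort(km.cluster_centers_.prod(axis=1))]

# smallest three anchors -> high-resolution layer, largest -> low-resolution layer
for name, group in zip(("P3/8", "P4/16", "P5/32"), np.split(anchors, 3)):
    print(name, np.round(group, 1))
```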

    Cyber security
    Low-rate denial-of-service attack detection method under software defined network environment
    Xiangju LIU, Xiaobao LU, Xianjin FANG, Linsong SHANG
    2022, 42(4):  1301-1307.  DOI: 10.11772/j.issn.1001-9081.2021061100
    Abstract ( )   HTML ( )   PDF (610KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Low-rate Denial of Service (LDoS) attack is an improved form of Denial of Service (DoS) attack, which is difficult to detect due to its low average attack rate and strong concealment. To solve this difficulty, an LDoS attack detection method based on the Weighted Mean-Shift K-Means algorithm (WMS-Kmeans) was proposed under the architecture of Software Defined Network (SDN). Firstly, by obtaining the flow table information of OpenFlow switches, the six-tuple characteristics of LDoS attack traffic in the SDN environment were analyzed and extracted. Then, the mean absolute percentage error was used as the weight of the Euclidean distance in mean-shift clustering, and the resulting cluster centers were used as the initial centers of K-Means to cluster the flow table, so as to realize the detection of LDoS attacks. Experimental results show that the proposed method has high detection performance against LDoS attacks in the SDN environment, with an average detection rate of 99.29%, an average false alarm rate of 1.97% and an average missed alarm rate of 0.69%.
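
    The two-stage clustering can be sketched as follows: run mean-shift first and reuse its centres to initialise K-Means on the flow-table features. The random feature matrix is a placeholder, and the plain Euclidean metric stands in for the paper's error-weighted distance:

```python
import numpy as np
from sklearn.cluster import MeanShift, KMeans

flows = np.vstack([np.random.randn(100, 6),      # benign six-tuple flow features
                   np.random.randn(20, 6) + 4])  # LDoS-like outlying flows

centres = MeanShift().fit(flows).cluster_centers_
km = KMeans(n_clusters=len(centres), init=centres, n_init=1).fit(flows)
print("cluster sizes:", np.bincount(km.labels_))
```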

    Multimedia computing and computer simulation
    Time-frequency domain CT reconstruction algorithm based on convolutional neural network
    Kunpeng LI, Pengcheng ZHANG, Hong SHANGGUAN, Yanling WANG, Jie YANG, Zhiguo GUI
    2022, 42(4):  1308-1316.  DOI: 10.11772/j.issn.1001-9081.2021050876
    Abstract ( )   HTML ( )   PDF (3307KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Concerning the problems of artifacts and loss of image details in images analytically reconstructed with time-domain filters, a new time-frequency domain Computed Tomography (CT) reconstruction algorithm based on Convolutional Neural Network (CNN) was proposed. Firstly, a CNN-based filter network was constructed in the frequency domain to achieve frequency-domain filtering of the projection data. Secondly, the back-projection operator was used to perform domain conversion on the frequency-domain filtering result to obtain a reconstructed image, and a network was constructed in the image domain to process the image from the back-projection layer. Finally, a multi-scale structural similarity loss function was introduced on the basis of the minimum mean square error loss function to form a composite loss function, which reduced the blurring effect of the neural network on the result image and preserved the details of the reconstructed image. The image domain network and the projection domain filter network worked together to obtain the final reconstruction result. The effectiveness of the proposed algorithm was verified on a clinical dataset. Compared with the Filtered Back Projection (FBP) algorithm, the Total Variation (TV) algorithm and the image domain Residual Encoder-Decoder CNN (RED-CNN) algorithm, when the number of projections is 180 or 90, the proposed algorithm achieves the reconstructed images with the highest Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) and the least Normalized Mean Square Error (NMSE); when the number of projections is 360, the proposed algorithm is second only to the TV algorithm. The experimental results show that the proposed algorithm can improve the quality of reconstructed CT images, and it is feasible and effective.
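
    The composite loss can be sketched as a weighted sum of a structural-similarity term and mean squared error. A single-scale SSIM with uniform windows stands in for the paper's multi-scale version, and the mixing weight alpha is an assumption:

```python
import torch
import torch.nn.functional as F

def ssim(x, y, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-scale SSIM with uniform averaging windows."""
    pool = lambda t: F.avg_pool2d(t, window, stride=1)
    mx, my = pool(x), pool(y)
    vx, vy = pool(x * x) - mx * mx, pool(y * y) - my * my
    cxy = pool(x * y) - mx * my
    s = ((2 * mx * my + c1) * (2 * cxy + c2)) / \
        ((mx * mx + my * my + c1) * (vx + vy + c2))
    return s.mean()

def composite_loss(pred, target, alpha=0.84):
    return alpha * (1 - ssim(pred, target)) + (1 - alpha) * F.mse_loss(pred, target)

a, b = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(composite_loss(a, b).item())
```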
