Journal of Computer Applications

Summarization of natural language generation

LI Xueqing, WANG Shi, WANG Zhujun, ZHU Junwu

2021, 41(5): 1227-1235. DOI: 10.11772/j.issn.1001-9081.2020071069

Natural Language Generation (NLG) technologies use artificial intelligence and linguistic methods to automatically generate understandable natural language texts. NLG reduces the difficulty of communication between humans and computers, is widely used in machine news writing, chatbots and other fields, and has become one of the research hotspots of artificial intelligence. Firstly, the current mainstream methods and models of NLG were listed, and their advantages and disadvantages were compared in detail. Then, for three NLG technologies (text-to-text, data-to-text and image-to-text), the application fields, existing problems and current research progress were summarized and analyzed respectively. Furthermore, the common evaluation methods of the above generation technologies and their application scopes were described. Finally, the development trends and research difficulties of NLG technologies were given.

Review of pre-trained models for natural language processing tasks

LIU Ruiheng, YE Xia, YUE Zengying

2021, 41(5): 1236-1246. DOI: 10.11772/j.issn.1001-9081.2020081152

In recent years, deep learning technology has developed rapidly. In Natural Language Processing (NLP) tasks, with text representation technology rising from the word level to the document level, the unsupervised pre-training method using a large-scale corpus has been proved to be able to effectively improve the performance of models in downstream tasks. Firstly, according to the development of text feature extraction technology, typical models were analyzed from word level and document level. Secondly, the research status of the current pre-trained models was analyzed from the two stages of pre-training target task and downstream application, and the characteristics of the representative models were summed up. Finally, the main challenges faced by the development of pre-trained models were summarized and the prospects were proposed.

Review of event causality extraction based on deep learning

WANG Zhujun, WANG Shi, LI Xueqing, ZHU Junwu

2021, 41(5): 1247-1255. DOI: 10.11772/j.issn.1001-9081.2020071080

Causality extraction is a kind of relation extraction task in Natural Language Processing (NLP), which mines event pairs with causality from text by constructing an event graph, and plays an important role in applications in finance, security, biology and other fields. Firstly, concepts such as event extraction and causality were introduced, and the evolution of mainstream methods and the common datasets of causality extraction were described. Then, the current mainstream causality extraction models were listed. Based on the detailed analysis of pipeline based models and joint extraction models, the advantages and disadvantages of various methods and models were compared. Furthermore, the experimental performance and related experimental data of the models were summarized and analyzed. Finally, the research difficulties and future key research directions of causality extraction were given.

Event description generation based on generative adversarial network

SUN Heli, SUN Yuzhu, ZHANG Xiaoyun

2021, 41(5): 1256-1261. DOI: 10.11772/j.issn.1001-9081.2020081242

In Event-Based Social Networks (EBSNs), automatically generating descriptions of social events helps organizers avoid poor, overlong or inaccurate descriptions and makes it easier to produce rich, accurate and attractive ones. In order to automatically generate text that is sufficiently similar to true event descriptions, a Generative Adversarial Network (GAN) model named GAN_PG was proposed to generate event descriptions. In the GAN_PG model, the Variational Auto-Encoder (VAE) was used as the generator, and a neural network with the Gated Recurrent Unit (GRU) was used as the discriminator. In model training, the Policy Gradient (PG) method in reinforcement learning was used as reference, and a reasonable reward function was designed to train the generator to generate event descriptions. Experimental results showed that the BLEU-4 value of the event descriptions generated by GAN_PG reached 0.67, which proved that the event description generation model GAN_PG can generate event descriptions sufficiently similar to natural language in an unsupervised way.

Image description generation algorithm based on improved attention mechanism

LI Wenhui, ZENG Shangyou, WANG Jinjin

2021, 41(5): 1262-1267. DOI: 10.11772/j.issn.1001-9081.2020071078

Image description expresses the global information contained in an image in sentences. It requires that the image description generation model can extract image information and express the extracted information in sentences. The traditional model is based on Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), which can realize image-to-sentence translation to a certain extent. However, this model has low accuracy and low training speed when extracting key information of the image. To solve this problem, an image description generation model with an improved attention mechanism, based on CNN and Long Short-Term Memory (LSTM) network, was proposed. VGG19 and ResNet101 were used as the feature extraction networks, and group convolution was introduced into the attention mechanism to replace the traditional fully connected operation, so as to improve the evaluation indices. The model was trained on the public datasets Flickr8K and Flickr30K and validated by various evaluation indices: BLEU (Bilingual Evaluation Understudy), ROUGE_L (Recall-Oriented Understudy for Gisting Evaluation), CIDEr (Consensus-based Image Description Evaluation) and METEOR (Metric for Evaluation of Translation with Explicit Ordering). Experimental results show that compared with the model with the traditional attention mechanism, the proposed model improves the accuracy of the image description task and is better than the traditional model on all four evaluation indices.

Multimodal sentiment analysis based on feature fusion of attention mechanism-bidirectional gated recurrent unit

LAI Xuemei, TANG Hong, CHEN Hongyu, LI Shanshan

2021, 41(5): 1268-1274. DOI: 10.11772/j.issn.1001-9081.2020071092

Aiming at the problem that the cross-modality interaction and the impact of the contribution of each modality on the final sentiment classification results are not considered in multimodal sentiment analysis of video, a multimodal sentiment analysis model of Attention Mechanism based feature Fusion-Bidirectional Gated Recurrent Unit (AMF-BiGRU) was proposed. Firstly, Bidirectional Gated Recurrent Unit (BiGRU) was used to consider the interdependence between utterances in each modality and obtain the internal information of each modality. Secondly, through the cross-modality attention interaction network layer, the internal information of the modalities was combined with the interaction between modalities. Thirdly, an attention mechanism was introduced to determine the attention weight of each modality, and the features of the modalities were effectively fused together. Finally, the sentiment classification results were obtained through the fully connected layer and softmax layer. Experiments were conducted on the open CMU-MOSI (CMU Multimodal Opinion-level Sentiment Intensity) and CMU-MOSEI (CMU Multimodal Opinion Sentiment and Emotion Intensity) datasets. The experimental results show that compared with traditional multimodal sentiment analysis methods (such as Multi-Attention Recurrent Network (MARN)), the AMF-BiGRU model improves the accuracy and F1-Score on the CMU-MOSI dataset by 6.01% and 6.52% respectively, and the accuracy and F1-Score on the CMU-MOSEI dataset by 2.72% and 2.30% respectively. The AMF-BiGRU model can effectively improve the performance of multimodal sentiment classification.
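
The attention-weighted fusion step described in this abstract can be illustrated with a minimal sketch. It is not the AMF-BiGRU implementation: the function names, the two-dimensional feature vectors and the relevance scores are all hypothetical, and in a real model the scores would be learned rather than fixed.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_modalities(features, scores):
    """Weighted sum of per-modality feature vectors (all of equal length)."""
    weights = softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return weights, fused

# Hypothetical features for text, audio and video, with made-up scores.
text, audio, video = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
weights, fused = fuse_modalities([text, audio, video], [2.0, 0.5, 0.5])
```

The modality with the highest score dominates the fused vector, while equal scores reduce the fusion to a plain average.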

Opinion spam detection based on hierarchical heterogeneous graph attention network

ZHANG Rong, ZHANG Xianguo

2021, 41(5): 1275-1281. DOI: 10.11772/j.issn.1001-9081.2020081190

Aiming at the problem that the non-semantic features of reviews cannot be fully utilized in opinion spam detection, a hierarchical attention mechanism and heterogeneous graph attention network based model, Hierarchical Heterogeneous Graph Attention Network (HHGAN), was proposed. Firstly, the hierarchical attention mechanism was used to learn the word-level and sentence-level document representations to focus on the capturing of the words and sentences that were important to the opinion spam detection. Then, the learned document representations were used as nodes, and the non-semantic features in reviews were selected as meta-paths to construct a heterogeneous graph attention network with a double-layer attention mechanism. Finally, a Multi-Layer Perceptron (MLP) was designed to distinguish the categories of reviews. Experimental results on datasets of restaurant and hotel extracted from yelp.com show that the F1 values of the HHGAN model reach 0.942 and 0.923 respectively, which are better than those of the traditional Convolutional Neural Network (CNN) model and other benchmark models of neural network.

Self-adaptive multi-measure unsupervised feature selection method with structured graph optimization

LIN Junchao, WAN Yuan

2021, 41(5): 1282-1289. DOI: 10.11772/j.issn.1001-9081.2020071099

Unsupervised feature selection attracts much attention in the field of machine learning, and is very important for dimensionality reduction and classification of high-dimensional data. The similarity between data points can be measured by several different criteria, which makes the similarity measure criteria inconsistent between different data points. At the same time, in existing methods, the similarity matrices are mostly obtained by neighbor allocation, so that the number of connected components is usually not ideal. To address these two problems, a Self-Adaptive Multi-measure unsupervised feature selection with Structured Graph Optimization (SAM-SGO) method was proposed, which regards the similarity matrix as a variable instead of a preset value. By adaptively fusing different measure functions into a unified measure, various measure methods could be synthesized, the similarity matrix of data was obtained adaptively, and the relationships between data points were captured more accurately. In order to obtain an ideal graph structure, a constraint was imposed on the rank of the similarity matrix to optimize the local structure of the graph and simplify the calculation. In addition, the graph based dimensionality reduction problem was incorporated into the proposed adaptive multi-measure problem, and the sparsity-inducing l_{2,0} regularization constraint was introduced to obtain the sparse projection used for feature selection. Experiments on several standard datasets demonstrate the effectiveness of SAM-SGO. Compared with the Local Learning-based Clustering Feature Selection (LLCFS), Dependence Guided Unsupervised Feature Selection (DGUFS) and Structured Optimal Graph Feature Selection (SOGFS) methods proposed in recent years, the clustering accuracy of this method is improved by about 3.6 percentage points on average.
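
To make the role of the l_{2,0} constraint concrete: the l_{2,0} norm of a projection matrix counts its non-zero rows, and each non-zero row corresponds to a retained feature. The sketch below only illustrates that definition. The matrix W, the tolerance and the function names are assumptions for illustration, not part of SAM-SGO.

```python
import math

def row_norms(W):
    """Euclidean norm of every row of W (one row per original feature)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]

def l20_norm(W, tol=1e-12):
    """Number of rows whose Euclidean norm is non-zero (above tol)."""
    return sum(1 for n in row_norms(W) if n > tol)

def selected_features(W, tol=1e-12):
    """Indices of the features that a row-sparse projection keeps."""
    return [i for i, n in enumerate(row_norms(W)) if n > tol]

# Hypothetical projection from 4 features to 2 dimensions.
W = [[0.5, -0.2],
     [0.0, 0.0],
     [0.1, 0.3],
     [0.0, 0.0]]
kept = selected_features(W)
```

Constraining l20_norm(W) to a chosen value k therefore forces exactly k features to be selected.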

Simultaneous feature selection optimization based on improved spotted hyena optimizer algorithm

JIA Heming, JIANG Zichao, LI Yao, SUN Kangjian

2021, 41(5): 1290-1298. DOI: 10.11772/j.issn.1001-9081.2020081192

Aiming at the disadvantages of traditional Support Vector Machine (SVM) in wrapper feature selection (low classification accuracy, redundant feature subset selection and poor computational efficiency), a meta-heuristic optimization algorithm was used to simultaneously optimize SVM and feature selection. In order to improve the classification effect of SVM and the ability of feature subset selection, firstly, the Spotted Hyena Optimizer (SHO) algorithm was improved by using the adaptive Differential Evolution (DE) algorithm, chaotic initialization and tournament selection strategy, so as to enhance its local search ability as well as improve its optimization efficiency and solution accuracy; secondly, the improved algorithm was applied to the simultaneous optimization of feature selection and SVM parameter adjustment; finally, a feature selection simulation experiment was carried out on the UCI datasets, and the classification accuracy, the number of selected features, the fitness value and the running time were used to comprehensively evaluate the optimization performance of the proposed algorithm. Experimental results show that the simultaneous optimization mechanism of the improved algorithm can reduce the number of selected features while keeping high classification accuracy, and that compared with the traditional algorithms, this algorithm is more suitable for solving the problem of wrapper feature selection and has good application value.

Mixed precision neural network quantization method based on Octave convolution

ZHANG Wenye, SHANG Fangxin, GUO Hao

2021, 41(5): 1299-1304. DOI: 10.11772/j.issn.1001-9081.2020071106

Deep neural networks with 32-bit weights require a lot of computing resources, making it difficult to deploy large-scale deep neural networks in scenarios with limited computing power (such as edge computing). In order to solve this problem, a plug-and-play neural network quantization method was proposed to reduce the computational cost of large-scale neural networks while avoiding a significant drop in model performance. Firstly, the high-frequency and low-frequency components of the input feature map were separated based on Octave convolution. Secondly, convolution kernels with different bit widths were respectively applied to the high- and low-frequency components for the convolution operation. Thirdly, the high- and low-frequency convolution results were quantized to the corresponding bit widths by using different activation functions. Finally, the feature maps with different precisions were mixed to obtain the output of the layer. Experimental results verify the effectiveness of the proposed method on model compression. When the model was compressed to 1+8 bit(s), the accuracy of the proposed method dropped by less than 3 percentage points on the CIFAR-10/100 datasets; moreover, the proposed method compressed the ResNet50 structure based model to 1+4 bit(s) with an accuracy higher than 70% on the ImageNet dataset.
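
As a rough illustration of what quantizing to a given bit width means, the sketch below applies plain uniform quantization to values in [0, 1] and sign binarization for the 1-bit case. Both are generic textbook schemes chosen for illustration; the abstract does not state which quantizers or activation functions the paper uses, and the function names are hypothetical.

```python
def quantize_unit(x, bits):
    """Uniformly quantize x, clipped to [0, 1], onto 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1          # number of steps between 0 and 1
    x = min(max(x, 0.0), 1.0)         # clip to the representable range
    return round(x * levels) / levels

def binarize(w):
    """1-bit quantization of a weight to +1 or -1 by its sign."""
    return 1.0 if w >= 0 else -1.0

# The rounding error of k-bit uniform quantization is at most half a step.
value = 0.3
q8 = quantize_unit(value, 8)   # fine grid, small error
q1 = quantize_unit(value, 1)   # only the two levels 0.0 and 1.0 remain
```

A mixed-precision scheme in this spirit would apply a coarse quantizer to one frequency branch and a finer one to the other, trading accuracy for cost per branch.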

Data augmentation method based on improved deep convolutional generative adversarial networks

GAN Lan, SHEN Hongfei, WANG Yao, ZHANG Yuejin

2021, 41(5): 1305-1313. DOI: 10.11772/j.issn.1001-9081.2020071059

In order to solve the difficulty of training on small sample data in deep learning and increase the training efficiency of DCGAN (Deep Convolutional Generative Adversarial Network), an improved DCGAN algorithm was proposed to perform the augmentation of small sample data. In the method, the Wasserstein distance was first used to replace the loss in the original model. Then, spectral normalization was added to the generation network and the discrimination network to acquire a stable network structure. Finally, the optimal noise input dimension of samples was obtained by maximum likelihood estimation and experimental estimation, so that the generated samples became more diversified. Experimental results on three datasets (MNIST, CelebA and Cartoon) indicated that the improved DCGAN could generate samples with higher definition and recognition rate than the original DCGAN. In particular, the average recognition rates on these datasets were improved by 8.1%, 16.4% and 16.7% respectively, and several definition evaluation indices on the datasets were increased to different degrees, suggesting that the method can realize small sample data augmentation effectively.
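
For intuition about the Wasserstein distance mentioned above, the one-dimensional case between two equally sized samples has a closed form: sort both samples and average the absolute differences of matched values. The sketch below shows only this special case. GAN training estimates the distance with a critic network instead, and the function name is an assumption.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical samples of equal size."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan matches sorted values in order.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

same = wasserstein_1d([0, 1, 2], [2, 1, 0])   # identical samples in a different order
shifted = wasserstein_1d([0, 0], [1, 1])      # every point moved by one unit
```

Unlike divergences that saturate when two distributions do not overlap, this distance keeps growing with the shift between samples, which is the property usually cited for its use as a GAN loss.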

Convolution robust principal component analysis

WANG Xin, ZHU Haohua, LIU Guangcan

2021, 41(5): 1314-1318. DOI: 10.11772/j.issn.1001-9081.2020081169

Robust Principal Component Analysis (RPCA) is a classical high-dimensional data analysis method, which can recover original data from noisy observation samples. However, the premise that RPCA can work is that the target data has a low rank matrix structure, so that RPCA cannot effectively deal with non-low rank data in practical applications. It is found that the convolution matrices of image and video are usually low rank, although the data matrices of them may not be low rank. According to this principle, a new method called Convolution Robust Principal Component Analysis (CRPCA) was proposed to use the low rank property of convolution matrix to constrain the original data structure, so as to achieve accurate data recovery. The calculation process of CRPCA model is a convex optimization problem, which was solved by Alternating Direction Method of Multipliers (ADMM). Experimental results on synthetic data vectors, real data images and video sequences show that the proposed method is superior to other algorithms such as RPCA, Generalized Robust Principal Component Analysis (GRPCA) and Kernel Robust Principal Component Analysis (KRPCA) in dealing with non-low rank problems.
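
ADMM solvers for RPCA-type problems typically alternate between a low-rank update and a sparse update, and the sparse update is an element-wise soft-thresholding (the proximal operator of the l1 norm). The sketch below shows only that operator. Whether the CRPCA solver uses exactly this step is an assumption, as the abstract does not give the update rules.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def shrink_matrix(M, tau):
    """Apply soft-thresholding to every entry of a matrix given as nested lists."""
    return [[soft_threshold(x, tau) for x in row] for row in M]

# Small entries are treated as noise and zeroed; large ones are kept but shrunk.
residual = [[3.0, -0.2],
            [0.5, -4.0]]
sparse_part = shrink_matrix(residual, 1.0)
```

The threshold tau controls how large an entry must be before it is attributed to sparse corruption rather than to the structured part of the data.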

Weakly supervised fine-grained image classification algorithm based on attention-attention bilinear pooling

LU Xinwei, YU Pengfei, LI Haiyan, LI Hongsong, DING Wenqian

2021, 41(5): 1319-1325. DOI: 10.11772/j.issn.1001-9081.2020071105

With the rapid development of artificial intelligence, the purpose of image classification is not only to identify the major categories of objects, but also to classify images of the same category into more detailed subcategories. In order to effectively discriminate small differences between categories, a fine-grained classification algorithm was proposed based on Attention-Attention Bilinear Pooling (AABP). Firstly, the pre-trained Inception V3 model was applied to extract the global image features, and the local attention regions on the feature maps were predicted with depthwise separable convolution. Then, the Weakly Supervised Data Augmentation Network (WS-DAN) was applied to feed the augmented images back into the network, so as to enhance the generalization ability of the network and prevent overfitting. Finally, the linear fusion of the further extracted attention features was performed in the AABP network to improve the classification accuracy. Experimental results show that this method achieves an accuracy of 88.51% and a top-5 accuracy of 97.65% on the CUB-200-2011 dataset, an accuracy of 89.77% and a top-5 accuracy of 99.27% on the Stanford Cars dataset, and an accuracy of 93.5% and a top-5 accuracy of 97.96% on the FGVC-Aircraft dataset.
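
The accuracy and top-5 accuracy figures quoted above follow the usual top-k definition: a prediction counts as correct when the true class is among the k highest-scoring classes. A minimal sketch with made-up scores (the data and the function name are hypothetical):

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Three samples, three classes; each row holds the per-class scores.
scores = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.2],
          [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
```

Top-k accuracy can never be lower than top-1 accuracy, which is why the top-5 figures in the abstract are consistently the higher ones.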

Light-weight road image semantic segmentation algorithm based on deep learning

HU Die, FENG Ziliang

2021, 41(5): 1326-1331. DOI: 10.11772/j.issn.1001-9081.2020081181

In order to solve the problem that road image semantic segmentation models in deep learning have a huge number of parameters and complex calculation, and are not suitable for deployment on mobile terminals for real-time segmentation, a lightweight symmetric U-shaped encoder-decoder image semantic segmentation network constructed with depthwise separable convolution, namely MUNet, was introduced. First, a U-shaped encoder-decoder network was designed; then, the sparse short connection design was added in the convolution blocks; at last, the attention mechanism and Group Normalization (GN) method were introduced to reduce the number of model parameters and the amount of calculation while improving the segmentation accuracy. For the CamVid dataset of road images, after 1,000 rounds of training, the Mean Intersection over Union (MIoU) of the segmentation results of MUNet was 61.92% when the test image was cropped to a size of 720×720. Experimental results show that compared with common image semantic segmentation networks such as Pyramid Scene Parsing Network (PSPNet), RefineNet, Global Convolutional Network (GCN) and DeepLabv3+, MUNet has fewer parameters and less calculation with better segmentation performance.
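
The MIoU metric reported above averages, over classes, the ratio of correctly labelled pixels to the union of predicted and ground-truth pixels for that class. The following sketch computes it on flattened label lists. The toy data is made up, and skipping classes absent from both prediction and ground truth is one common convention, assumed here.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes for flat lists of pixel labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither list
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Six pixels, three classes: per-class IoU is 1/3, 2/3 and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, 3)
```

Because each class contributes equally, MIoU penalizes poor segmentation of small classes more than plain pixel accuracy does.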

Yak face recognition algorithm of parallel convolutional neural network based on transfer learning

CHEN Zhengtao, HUANG Can, YANG Bo, ZHAO Li, LIAO Yong

2021, 41(5): 1332-1336. DOI: 10.11772/j.issn.1001-9081.2020071126

In order to realize accurate management of yaks during the process of yak breeding, it is necessary to recognize the identities of the yaks. Yak face recognition is a feasible method of yak identification. However, the existing yak face recognition algorithms based on neural networks suffer from problems such as too many features in the yak face dataset and long training time of the neural networks. Therefore, based on the method of transfer learning and combined with the Visual Geometry Group (VGG) network and Convolutional Neural Network (CNN), a Parallel CNN (Parallel-CNN) algorithm was proposed to identify the facial information of yaks. Firstly, the existing VGG16 network was used to perform transfer learning on the yak face image data and extract the yaks' facial information features for the first time. Then, dimensional transformation was performed on the extracted features at different levels, and the processed features were fed into the Parallel-CNN for the secondary feature extraction. Finally, two separate fully connected layers were used to classify the yak face images. Experimental results showed that Parallel-CNN was able to recognize yak faces with different angles, illuminations and poses. On the test dataset with 90 000 yak face images of 300 yaks, the recognition accuracy of the proposed algorithm reached 91.2%. The proposed algorithm can accurately recognize the identities of the yaks, and can help the yak farm to realize the intelligent management of the yaks.

Adaptive affinity propagation clustering algorithm based on universal gravitation

WANG Zhihe, CHANG Xiaoqing, DU Hui

2021, 41(5): 1337-1342. DOI: 10.11772/j.issn.1001-9081.2020071130

Focused on the problems that the Affinity Propagation (AP) clustering algorithm is sensitive to the parameter Preference, is not suitable for sparse data, and produces incorrectly clustered sample points in the clustering results, an algorithm named Adaptive Affinity Propagation clustering based on universal gravitation (GA-AP) was proposed. Firstly, the gravitational search mechanism was introduced into the traditional AP algorithm in order to perform global optimization of the sample points. Secondly, on the basis of global optimization, the correctly clustered and incorrectly clustered sample points in each cluster were found through the information entropy and the Adaptive Boosting (AdaBoost) algorithm, and the weights of the sample points were calculated. Each sample point was updated by the corresponding weight, so that the similarity, Preference value, attractiveness and membership degree were updated, and re-clustering was performed. The above steps were repeated until the maximum number of iterations was reached. Simulation experiments on nine datasets show that compared with the Affinity Propagation clustering based on Adaptive Attribute Weighting (AFW_AP) algorithm, the AP algorithm, the K-means clustering (K-means) algorithm and the Fuzzy C-Means (FCM) algorithm, the proposed algorithm increases the average values of Purity, F-measure and Accuracy (ACC) by up to 0.69, 71.74% and 98.5% respectively. Experimental results show that the proposed algorithm reduces the dependence on Preference and improves the clustering effect, especially the accuracy of clustering results for sparse datasets.
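
Of the evaluation measures named above, Purity is the simplest to state: each cluster is credited with its most frequent true class, and the credited points are divided by the total number of points. A minimal sketch with made-up assignments (the data and the function name are hypothetical):

```python
from collections import Counter

def purity(clusters, labels):
    """Cluster purity: share of points belonging to the majority class of their cluster."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, []).append(y)
    # Count the size of the majority class inside each cluster.
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in by_cluster.values())
    return majority / len(labels)

# Two clusters of three points; each contains two points of its majority class.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "a"]
score = purity(clusters, labels)
```

Purity lies between 0 and 1 and trivially reaches 1 when every point forms its own cluster, which is why it is normally reported together with measures such as F-measure.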

Time series clustering based on new robust similarity measure

LI Guorong, YE Jimin, ZHEN Yuanting

2021, 41(5): 1343-1347. DOI: 10.11772/j.issn.1001-9081.2020071142

For time series data with outliers, a Robust Generalized Cross-Correlation measure (RGCC) between time series based on robust estimation of correlation coefficient was proposed. First, a robust correlation coefficient was introduced to replace Pearson correlation coefficient to calculate the covariance matrix between time series data. Second, the determinant of the new covariance matrix was used to construct a similarity measure between two time series named RGCC. Finally, the distance matrix between the time series was calculated based on this measure, and the matrix was used as the input of the clustering algorithm to cluster the data. Time series clustering simulation experiments showed that for time series data with outliers, the clustering results based on RGCC were obviously closer to the real ones compared to the clustering results based on the original Generalized Cross-Correlation measure (GCC). It can be seen that the proposed new robust similarity measure is fully applicable to time series data with outliers.

Hybrid recommendation model based on heterogeneous information network

LIN Yixing, TANG Hua

2021, 41(5): 1348-1355. DOI: 10.11772/j.issn.1001-9081.2020081340

Current personalized recommendation platforms have a wide range of data sources and many data types. With the data sparsity of the platform being an important factor that affects the performance of the recommendation system, the recommendation system faces many challenges: how to mine structured and unstructured data of the platform to discover more features, improve the accuracy of recommendations in data-sparse scenarios, alleviate the cold start problem, and make recommendations interpretable. Therefore, for the personalized scenario of recommending Items for Users, the Heterogeneous Information Network (HIN) was used to build the association relationships between objects in the recommendation platform, and the Meta-Graph was used to describe the association paths between objects and calculate the User-Item similarity matrices under different paths; the FunkSVD matrix decomposition algorithm was adopted to calculate the implicit features of Users and Items, and for the unstructured data, with text as an example, the Convolutional Neural Network (CNN) technology was used to mine the text features of the data; after splicing the features obtained by the two methods, a Factorization Machine (FM) incorporating historical average scores of Users and Items was used to predict Users' scores for Items. In the experiment, based on the public dataset Yelp, the proposed hybrid recommendation model, the single recommendation model based on Meta-Graph, the FM Recommendation model (FMR) and the FunkSVD based recommendation model were established and trained. Experimental results show that the proposed hybrid recommendation model has good validity and interpretability, and that its recommendation accuracy is greatly improved compared with the comparison models.
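
The FunkSVD step mentioned above factorizes the sparse User-Item matrix into latent user and item vectors by stochastic gradient descent over the observed entries only. The sketch below is a generic minimal version. The hyperparameters, the toy ratings and the function names are assumptions, and the paper applies the decomposition to similarity matrices derived from Meta-Graphs rather than to raw ratings.

```python
import random

def funk_svd(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Learn latent factors P (users) and Q (items) from (user, item, value) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Hypothetical observed entries of a 3 x 3 User-Item matrix.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = funk_svd(ratings, n_users=3, n_items=3)
```

The learned rows of P and Q are the implicit features that the abstract describes being spliced with the text features before the Factorization Machine.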

PageRank-based talent mining algorithm based on Web of Science

LI Chong, WANG Yuchen, DU Weijing, HE Xiaotao, LIU Xuemin, ZHANG Shibo, LI Shuren

2021, 41(5): 1356-1360. DOI: 10.11772/j.issn.1001-9081.2020081206

The high-level paper is one of the symbolic achievements of excellent scientific talents. Focusing on the "Web of Science (WOS)" hot research disciplines, on the basis of constructing the Neo4j semantic network graph of academic papers and mining active scientific research communities, the PageRank-based talent mining algorithm was used to realize the mining of outstanding scientific research talents in the scientific research communities. Firstly, the existing talent mining algorithms were studied and analyzed in detail. Secondly, combined with the WOS data, the PageRank-based talent mining algorithm was optimized and implemented by adding consideration factors such as the paper publication time factor, the author's order descending model, the influence of surrounding author nodes on this node, the number of citations of the paper. Finally, experiments and verifications were carried out based on the paper data of the communities of the hot discipline computer science in the past five years. The results show that community-based mining is more targeted, and can quickly find representative excellent and potential talents in various disciplines, and the improved algorithm is more effective and objective.
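
For reference, the baseline that the paper modifies is standard PageRank, which can be computed by power iteration. The sketch below implements only that baseline on a made-up citation graph. The additional factors listed in the abstract (publication time, author order, neighbouring authors, citation counts) are not modelled, and the node names are hypothetical.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Standard PageRank by power iteration; graph maps each node to the nodes it links to."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = graph[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for w in targets:
                    new[w] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                share = damping * rank[v] / n
                for w in nodes:
                    new[w] += share
        rank = new
    return rank

# Hypothetical citation graph: A and B cite C, and C cites D.
graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
rank = pagerank(graph)
```

Nodes that are cited by others, directly or through a chain, end up with higher scores than nodes nobody cites, which is the property a talent-ranking variant builds on.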

Patent clustering method based on functional effect

MA Jianhong, CAO Wenbin, LIU Yuangang, XIA Shuang

2021, 41(5): 1361-1366. DOI: 10.11772/j.issn.1001-9081.2020081203

At present, patents are divided according to their domains, and cross-domain patent clustering can be realized based on the functional effect, which is of great significance in enterprise innovation design. Accurate extraction of patent functional effects and fast acquisition of optimal clustering results are the key tasks. Therefore, a Functional Effect Information-Joint (FEI-Joint) model combining Enhanced Language Representation with Informative Entities (ERNIE) and Convolutional Neural Network (CNN) was proposed to extract the functional effects of patent documents, and the Self-Organizing Map (SOM) algorithm was improved into an Early Reject based Class Merge Self-Organizing Map (ERCM-SOM) to realize patent clustering based on functional effect. The FEI-Joint model was compared with Term-Frequency-Inverse-Document-Frequency (TF-IDF), Latent Dirichlet Allocation (LDA) and CNN in terms of the clustering effect after feature extraction, and the results show that the F-measure value of the proposed model was clearly higher than those of the other models. Compared with the K-Means algorithm and the SOM algorithm, the ERCM-SOM algorithm has a higher F-measure value, and its running time is significantly shorter than that of the SOM algorithm. Compared with patent classification using the International Patent Classification (IPC), the clustering method based on functional effect can achieve cross-domain patent clustering, which lays a foundation for designers to learn from design methods in other domains.

Verifiable and secure outsourcing for large matrix full rank decomposition

DU Zhiqiang, ZHENG Dong, ZHAO Qinglan

2021, 41(5): 1367-1371. DOI: 10.11772/j.issn.1001-9081.2020081237

Focused on the problems of no protection for the number of zero elements in original matrix and no verification for the result returned by cloud in outsourcing algorithm of matrix full rank decomposition, a verifiable and secure outsourcing scheme of matrix full rank decomposition was proposed. Firstly, in the phase of encryption, a dense invertible matrix was constructed by using the Sherman-Morrison formula for encryption. Secondly, in the phase of cloud computing, the cloud computing of the full rank decomposition for the encryption matrix was required. And when the results of full rank decomposition for encryption matrix (a column full rank matrix and a row full rank matrix) were obtained, the cloud computing of the left inverse of the column full rank matrix and the right inverse of the row full rank matrix was required respectively. Thirdly, in the phase of verification, the client not only needed to verify whether these two matrices returned by cloud are row-full-rank or column-full-rank respectively, but also needed to verify whether the multiplication of these two matrices is equal to the encryption matrix. Finally, if the verification was passed, the client was able to use the private key to perform the decryption. In the protocol analysis, the proposed scheme is proved to satisfy correctness, security, efficiency, and verifiability. At the same time, when the dimension of the selected original matrix is 512×512, with different densities of non-zero elements in the matrix, the entropy of the encryption matrix calculated by this scheme is identically equal to 18, indicating that the scheme can protect the number of zero elements effectively. Experimental results show the effectiveness of the proposed scheme.
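
The Sherman-Morrison formula used in the encryption phase gives the inverse of a rank-one update without a general matrix inversion. For the special case of updating the identity, (I + u v^T)^(-1) = I - u v^T / (1 + v^T u), provided 1 + v^T u is non-zero. The sketch below checks this identity with exact fractions. How the paper actually builds its key matrix is not given in the abstract, so the vectors and function names are purely illustrative.

```python
from fractions import Fraction

def identity_plus_outer(u, v):
    """Dense matrix K = I + u v^T."""
    n = len(u)
    return [[Fraction(int(i == j)) + u[i] * v[j] for j in range(n)] for i in range(n)]

def sherman_morrison_inverse(u, v):
    """Inverse of I + u v^T via the Sherman-Morrison formula."""
    n = len(u)
    denom = 1 + sum(a * b for a, b in zip(u, v))
    if denom == 0:
        raise ValueError("I + u v^T is singular")
    return [[Fraction(int(i == j)) - Fraction(u[i] * v[j]) / denom
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

u, v = [1, 2, 3], [4, 5, 6]
K = identity_plus_outer(u, v)
K_inv = sherman_morrison_inverse(u, v)
product = matmul(K, K_inv)   # should be the 3 x 3 identity
```

The same multiply-and-compare idea underlies the verification phase described in the abstract, where the client checks that the product of the two returned factors equals the encrypted matrix.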

Intrusion detection model based on combination of dilated convolution and gated recurrent unit

ZHANG Quanlong, WANG Huaibin

2021, 41(5): 1372-1377. DOI: 10.11772/j.issn.1001-9081.2020071082

Intrusion detection models based on machine learning play a vital role in protecting network environments. To address the problem that existing network intrusion detection models cannot fully learn the data features of network intrusions, deep learning theory was applied to intrusion detection, and a deep network model with automatic feature extraction was proposed. In this model, dilated convolution was used to enlarge the receptive field and extract high-level features, the Gated Recurrent Unit (GRU) was used to extract long-term dependencies between the retained features, and a Deep Neural Network (DNN) was then used to fully learn the data features. Compared with classical machine learning classifiers, this model has a higher detection rate. Experiments on the KDD CUP99, NSL-KDD and UNSW-NB15 datasets show that the model outperforms the other classifiers, with accuracies of 99.78% on KDD CUP99, 99.53% on NSL-KDD and 93.12% on UNSW-NB15.
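
To illustrate how dilation enlarges the receptive field without adding parameters, here is a small one-dimensional sketch in plain Python. It is not the authors' network; the function names, kernel sizes and dilation rates are assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form.

    Kernel taps are applied to samples that are `dilation` positions apart.
    """
    span = (len(kernel) - 1) * dilation + 1  # input samples covered by one output
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated layers sharing one kernel size."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel of size 3, a single layer with dilation 2 covers 5 input samples while still using 3 weights, and stacking dilations 1, 2 and 4 covers 15 samples.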

End-to-end security solution for message queue telemetry transport protocol based on proxy re-encryption

GU Zhengchuan, GUO Yuanbo, FANG Chen

2021, 41(5): 1378-1385. DOI: 10.11772/j.issn.1001-9081.2020060985

Aiming at the lack of built-in security mechanism in Message Queue Telemetry Transport (MQTT) protocol to protect communication information between the Internet of Things (IoT) devices, as well as the problem that the credibility of MQTT broker is questioned in the new concept of zero trust security, a new solution based on proxy re-encryption for implementing secure end-to-end data transmission between publisher and subscriber in MQTT communication was proposed. Firstly, the Advanced Encryption Standard (AES) was used to symmetrically encrypt the transmitted data for ensuring the confidentiality of the data during the transmission process. Secondly, the proxy re-encryption algorithm that defines the MQTT broker as a semi-honest participant was adopted to encrypt the session key used by the AES symmetric encryption, so as to eliminate the implicit trust of the MQTT broker. Thirdly, the computation of re-encryption key generation was transferred from clients to a trusted third party for the applicability of the proposed scheme in resource-constrained IoT devices. Finally, Schnorr signature algorithm was employed to digitally sign the messages for the authenticity, integrity and non-repudiation of the data source. Compared with the existing MQTT security schemes, the proposed scheme acquires the end-to-end security features of MQTT communication at the expense of the computation and communication overhead equivalent to that of the lightweight security scheme without end-to-end security.
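
The signing step can be sketched with a textbook Schnorr signature over a small prime-order subgroup. This is a toy illustration only: the group parameters `P`, `Q` and `G`, the SHA-256 challenge and all function names are assumptions, not the scheme's actual parameters, and the group is far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters (assumed, insecure): P = 2*Q + 1 with P and Q prime,
# and G = 4 generating the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def _challenge(r, message):
    """Hash the commitment r together with the message into Z_Q."""
    digest = hashlib.sha256(str(r).encode() + b"|" + message).digest()
    return int.from_bytes(digest, "big") % Q


def keygen():
    """Return (private key x, public key y = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def sign(x, message):
    """Return the signature (e, s) with s = k + x*e mod Q."""
    k = secrets.randbelow(Q - 1) + 1      # fresh per-signature nonce
    r = pow(G, k, P)
    e = _challenge(r, message)
    return e, (k + x * e) % Q


def verify(y, message, signature):
    """Recompute r = G^s * y^(-e) mod P and compare the challenges."""
    e, s = signature
    r = (pow(G, s, P) * pow(y, (Q - e) % Q, P)) % P
    return _challenge(r, message) == e
```

In an MQTT setting the publisher would sign the payload before publishing and the subscriber would call `verify` with the publisher's public key; how keys are distributed is outside this sketch.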

Encrypted traffic classification method based on data stream

GUO Shuai, SU Yang

2021, 41(5): 1386-1391. DOI: 10.11772/j.issn.1001-9081.2020071073

To classify encrypted traffic in current networks quickly and identify it accurately, a new feature extraction method for data streams was proposed. Based on the characteristics of sequential data and the rules of the SSL (Secure Sockets Layer) handshake protocol, an end-to-end one-dimensional convolutional neural network model was adopted, and five-tuples were used to label the data streams. By selecting the data stream representation, the number of data packets and the length of feature bytes, the key field positions for sample classification were located more accurately, and features with little impact on classification were removed. As a result, the input for a single data stream was reduced from 784 bytes to 529 bytes, a 32% reduction in length, and 12 encrypted traffic service types were classified with an accuracy of 95.5%. These results show that the proposed method can reduce the input feature dimension and improve data processing efficiency while maintaining the accuracy reported in current research.
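
As a rough illustration of the preprocessing described above, the sketch below groups packets into streams by five-tuple and keeps a fixed-length byte prefix per stream. The packet dictionary layout, the `max_packets` cut-off and the zero padding are assumptions; only the 529-byte length comes from the abstract. For reference, (784 - 529) / 784 is about 0.325.

```python
from collections import defaultdict


def five_tuple(packet):
    """Stream label: source IP and port, destination IP and port, protocol."""
    return (packet["src_ip"], packet["src_port"],
            packet["dst_ip"], packet["dst_port"], packet["proto"])


def build_streams(packets, max_packets=3, max_bytes=529):
    """Group packets by five-tuple and keep a fixed-length byte prefix per stream.

    max_packets is a hypothetical cut-off on how many leading packets are used.
    """
    streams = defaultdict(list)
    for pkt in packets:
        streams[five_tuple(pkt)].append(pkt["payload"])
    features = {}
    for key, payloads in streams.items():
        raw = b"".join(payloads[:max_packets])[:max_bytes]
        features[key] = raw.ljust(max_bytes, b"\x00")  # zero-pad short streams
    return features
```

Each value in the returned dictionary is a 529-byte sequence that could be fed to a one-dimensional convolutional network; note that this sketch treats the two directions of a connection as separate streams.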

Two-stage task offloading strategy based on game theory in cloud-edge environment

WANG Yijie, FAN Jiafei, WANG Chenyu

2021, 41(5): 1392-1398. DOI: 10.11772/j.issn.1001-9081.2020071091

Mobile Edge Computing (MEC) provides an effective solution to the conflict between computationally intensive applications and resource-constrained mobile devices. However, most studies on MEC offloading only consider resource allocation between mobile devices and MEC servers, and ignore the huge computing resources in cloud computing centers. In order to make full use of cloud and MEC resources, a task offloading strategy based on cloud-edge collaboration was proposed. Firstly, the task offloading problem of the cloud-edge servers was transformed into a game problem. Then, the existence and uniqueness of the Nash Equilibrium (NE) in this game were proved, and the solution to the game was obtained. Finally, a two-stage task offloading algorithm based on game theory was proposed to solve for the task offloading scheme, and the performance of the algorithm was evaluated. Simulation results show that the total overhead of the proposed algorithm is 72.8%, 47.9% and 2.65% lower than those of local execution, cloud server execution and MEC server execution, respectively. The numerical results confirm that the proposed strategy achieves higher energy efficiency and lower task offloading overhead, and scales well as the number of mobile devices increases.

Consensus of two-layer multi-agent systems subjected to cyber-attack

WANG Yunyan, HU Aihua

2021, 41(5): 1399-1405. DOI: 10.11772/j.issn.1001-9081.2020081159

The consensus problem of two-layer multi-agent systems subjected to cyber-attacks was studied. For two-layer multi-agent systems composed of a leaders' layer and a followers' layer, the following situation was considered: the neighboring agents in the leaders' layer were cooperative, the adjacent agents in the followers' layer were cooperative or competitive, and there was a restraining relationship between some corresponding agents in the leaders' layer and the followers' layer. The consensus problem among the nodes of the leaders' layer, the followers' layer and the two-layer multi-agent systems subjected to cyber-attack was discussed. Based on Linear Matrix Inequality (LMI), Lyapunov stability theory and graph theory, sufficient criteria were given for consensus between the nodes in the leaders' layer, bipartite consensus between the nodes in the followers' layer, and node-to-node bipartite consensus between the nodes in the two-layer multi-agent systems. Finally, numerical simulation examples were given in which the two-layer multi-agent systems subjected to cyber-attack reached consensus, which verified the validity of the proposed criteria.

Vehicle number optimization approach of autonomous vehicle fleet driven by multi-spatio-temporal distribution task

ZHENG Liping, WANG Jianqiang, ZHANG Yuzhao, DONG Zuofan

2021, 41(5): 1406-1411. DOI: 10.11772/j.issn.1001-9081.2020081183

A stochastic optimization method was proposed to solve the vehicle number allocation problem of the minimum autonomous vehicle fleet driven by spatio-temporal multi-tasks of terminal delivery. Firstly, the influence of service time and waiting time on the route planning of the autonomous vehicle fleet was analyzed to build the shortest route model, and the service sequence network was constructed based on the two-dimensional spatio-temporal network. Then, the vehicle number allocation problem was converted into a network maximum flow problem through network transformation, and a minimum fleet model was established with the goal of minimizing the number of vehicles in the fleet. Finally, the Dijkstra-Dinic algorithm, which combines the Dijkstra algorithm and the Dinic algorithm, was designed according to the model features to solve the problem. Simulation experiments were carried out on service networks of four different scales. The results show that, under different successful service rates, the minimum size of the autonomous vehicle fleet is positively correlated with the scale of the service network, and that it decreases as the waiting time increases and gradually stabilizes. The One-stop operator introduced into the proposed algorithm greatly improves the search efficiency, and the proposed model and algorithm are suitable for calculating the minimum vehicle fleet in large-scale service networks.
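
The abstract does not give the network transformation itself. One standard way to reduce a minimum-fleet problem to maximum flow, shown here purely as an assumed illustration, is the minimum path cover: link task i to task j when one vehicle can serve j after i, compute a maximum bipartite matching with Dinic's algorithm, and subtract it from the number of tasks. The task format `(start, end)`, the constant `gap` between tasks and all names are hypothetical, and the Dijkstra shortest-route stage is omitted.

```python
from collections import deque


class Dinic:
    """Dinic's maximum-flow algorithm on an adjacency-list residual graph."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge: [to, capacity, reverse index]

    def add_edge(self, u, v, cap):
        self.graph[u].append([v, cap, len(self.graph[v])])
        self.graph[v].append([u, 0, len(self.graph[u]) - 1])

    def _bfs(self, s, t):
        # Build the level graph over edges with remaining capacity.
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap, _ in self.graph[u]:
                if cap > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u, t, pushed):
        # Send one augmenting path along the level graph.
        if u == t:
            return pushed
        while self.it[u] < len(self.graph[u]):
            edge = self.graph[u][self.it[u]]
            v, cap, rev = edge
            if cap > 0 and self.level[v] == self.level[u] + 1:
                flow = self._dfs(v, t, min(pushed, cap))
                if flow > 0:
                    edge[1] -= flow
                    self.graph[v][rev][1] += flow
                    return flow
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        total = 0
        while self._bfs(s, t):
            self.it = [0] * self.n
            while True:
                flow = self._dfs(s, t, float("inf"))
                if flow == 0:
                    break
                total += flow
        return total


def min_fleet(tasks, gap=0):
    """Minimum vehicles for tasks given as (start, end), via minimum path cover.

    Assumption: a vehicle can serve task j after task i if end_i + gap <= start_j.
    """
    n = len(tasks)
    source, sink = 2 * n, 2 * n + 1
    net = Dinic(2 * n + 2)
    for i in range(n):
        net.add_edge(source, i, 1)      # task i as a predecessor
        net.add_edge(n + i, sink, 1)    # task i as a successor
        for j in range(n):
            if i != j and tasks[i][1] + gap <= tasks[j][0]:
                net.add_edge(i, n + j, 1)
    return n - net.max_flow(source, sink)
```

For example, `min_fleet([(0, 2), (3, 5), (1, 4), (6, 8)], gap=1)` chains the first, second and fourth tasks onto one vehicle and needs a second vehicle only for the task that overlaps them.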

Coevolutionary ant colony optimization algorithm for mixed-variable optimization problem

WEI Mingyan, CHEN Yu, ZHANG Liang

2021, 41(5): 1412-1418. DOI: 10.11772/j.issn.1001-9081.2020081200

For Mixed-Variable Optimization Problem (MVOP) containing both continuous and categorical variables, a coevolution strategy was proposed to search the mixed-variable decision space, and a Coevolutionary Ant Colony Optimization Algorithm for MVOP (CACOA_MV) was developed. In CACOA_MV, the continuous and categorical sub-populations were generated by using the continuous and discrete Ant Colony Optimization (ACO) strategies respectively, the sub-vectors of continuous and categorical variables were evaluated with the help of cooperators, and the continuous and categorical sub-populations were respectively updated to realize the efficient coevolutionary search in the mixed-variable decision space. Furthermore, the ability of global exploration to the categorical variable solution space was improved by introducing a smoothing mechanism of pheromone, and a "best+random cooperators" restart strategy facing the coevolution framework was proposed to enhance the efficiency of coevolutionary search. By comparing with the Mixed-Variable Ant Colony Optimization (ACO_MV) algorithm and the Success History-based Adaptive Differential Evolution algorithm with linear population size reduction and Ant Colony Optimization (L-SHADE_ACO), it is demonstrated that CACOA_MV is able to perform better local exploitation, so as to improve approximation quality of the final results in the target space; the comparison with the set-based Differential Evolution algorithm with Mixed-Variables (DE_MV) shows that CACOA_MV is able to better approximate the global optimal solutions in the decision space and has better global exploration ability. In conclusion, CACOA_MV with the coevolutionary strategy can keep a balance between global exploration and local exploitation, which results in better optimization ability.

Orthogonal matching pursuit hybrid precoding algorithm based on improved intelligent water drop

LIU Ziyan, MA Shanshan, BAI He

2021, 41(5): 1419-1424. DOI: 10.11772/j.issn.1001-9081.2020071116

Focused on the problems of high hardware cost and high system overhead in the millimeter-Wave Massive Multi-Input Multi-Output (mmWave Massive MIMO) system, an Orthogonal Matching Pursuit based on improved Intelligent Water Drop (IWD-OMP) hybrid precoding algorithm was proposed. Firstly, the precoding matrix was solved based on the Orthogonal Matching Pursuit (OMP) algorithm. Secondly, the improved Intelligent Water Drop (IWD) algorithm was adopted to calculate the global optimal index vector in the matrix. Finally, because this method did not need to construct the candidate matrix in advance, system resources were saved and the complexity of matrix calculation was reduced. Experimental results demonstrate that, with 128 transmitting antennas and compared with the OMP algorithm, the proposed method improves the system achievable sum rate by about 7.71% when the signal-to-noise ratio is 28 dB, and reduces the bit error rate by about 19.77% when the signal-to-noise ratio is 8 dB. In addition, the proposed precoding algorithm is robust to imperfect Channel State Information (CSI) in real channel environments: when the signal-to-noise ratio is 28 dB, the system achievable sum rate with imperfect CSI is only about 1.08% lower than that with perfect CSI.

Selected mapping method with embedded side information to reduce PAPR of FBMC signals

XIA Yujie, SHI Yongpeng, GAO Ya, SUN Peng

2021, 41(5): 1425-1431. DOI: 10.11772/j.issn.1001-9081.2020081346

To solve the problems of the poor reduction performance of Filter Bank MultiCarrier (FBMC) signals' Peak-to-Average Power Ratio (PAPR) and the high Side Information Error Rate (SIER) of the existing Selected Mapping (SLM) method to reduce PAPR signals, an SLM method with embedded Side Information (SI) was presented to reduce PAPR. At the transmitter, a group of phase rotation vectors with embedded SI were designed, and the candidate data blocks were generated by multiplying the phase rotation vectors with the transmitting data blocks. By using the outputs of Inverse Discrete Fourier Transform (IDFT) of the real and imaginary components of the candidate data blocks, the candidate FBMC signals based on cyclic time shift were designed and the candidate signal with the lowest PAPR was selected and transmitted. At the receiver, by using the difference between the phase rotations of the SI subcarrier data, a low-complexity SI detector unrelated to modulation order of transmitted symbols was proposed. Simulation results show that the proposed method can effectively reduce the PAPR of FBMC signals at the transmitter and obtain good SI detection and Bit Error Rate (BER) performances at the receiver.

Image generation based on conditional-Wasserstein generative adversarial network

GUO Maozu, YANG Qiannan, ZHAO Lingling

2021, 41(5): 1432-1437. DOI: 10.11772/j.issn.1001-9081.2020071138

Generative Adversarial Network (GAN) can automatically generate target images, and is of great significance to the generation of building arrangements for similar blocks. However, the existing model training process suffers from low accuracy of generated images, mode collapse, and low training efficiency. To solve these problems, a Conditional-Wasserstein Generative Adversarial Network (C-WGAN) model for image generation was proposed. First, the feature correspondence between the real sample and the target sample was identified by the model, and then the target sample was generated according to the identified feature correspondence. The Wasserstein distance was used to measure the distance between the distributions of two image features in the model, which stabilized the GAN training environment and avoided mode collapse during training, so as to improve the accuracy of the generated images and the training efficiency. Experimental results show that, compared with the original Conditional Generative Adversarial Network (CGAN) and pix2pix models, the proposed model increases the Peak Signal-to-Noise Ratio (PSNR) by up to 6.82% and 2.19% respectively, and that it converges faster for the same number of training rounds. The proposed model can therefore not only improve the accuracy of image generation effectively, but also increase the convergence speed of the network.
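
For intuition about the distance used here, the one-dimensional case has a closed form: for two empirical samples of equal size, the Wasserstein-1 distance is the mean absolute difference between their sorted values. The sketch below shows only that special case, with a hypothetical function name; a WGAN estimates the distance between high-dimensional distributions with a critic network instead.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size empirical samples on the real line."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # Optimal transport in 1-D matches the i-th smallest of xs to the i-th smallest of ys.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Unlike divergences that saturate when two distributions do not overlap, this quantity keeps growing with the shift between the samples: shifting every point by 1 gives a distance of exactly 1.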

Image super-resolution reconstruction method based on accelerated residual network

LIANG Min, WANG Haorong, ZHANG Yao, LI Jie

2021, 41(5): 1438-1444. DOI: 10.11772/j.issn.1001-9081.2020091520

To solve the problems of numerous network parameters and high computational complexity in image super-resolution reconstruction with deep network architectures, an image super-resolution reconstruction method based on accelerated residual network was proposed. Firstly, a residual network was constructed to reconstruct the high-frequency residual information between the low-resolution image and the high-resolution image, so as to reduce the transmission of redundant information through the deep network and improve the reconstruction efficiency. Secondly, the dimensionality of the extracted low-resolution feature map was reduced by the feature shrinking layer to realize fast mapping with fewer network parameters. Thirdly, the dimensionality of the high-resolution feature map was increased by the feature expanding layer so that the high-frequency residual information could be reconstructed from richer information. Finally, the residual and low-resolution images were summed to obtain the reconstructed high-resolution image. Experimental results show that the mean Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) obtained by the proposed method are 0.57 dB and 0.0133 higher than those obtained by Super-Resolution using Convolutional Neural Network (SRCNN) respectively, and 0.45 dB and 0.0067 higher than those obtained by Intermediate Supervision Convolutional Neural Network (ISCNN). In terms of reconstruction speed, taking the Urban100 dataset as an example, the proposed method is 1.5 to 42 times faster than the existing methods. In addition, when applied to the super-resolution reconstruction of motion-blurred images, the method performs better than image Super-Resolution using Very Deep convolutional network (VDSR). The proposed method achieves better reconstruction quality with fewer network parameters and provides a new idea for image super-resolution reconstruction.
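
The PSNR figures quoted above follow the usual definition, PSNR = 10 log10(MAX^2 / MSE). A minimal sketch for images stored as nested lists of pixel values is given below; the function name and the default peak value of 255 (8-bit images) are assumptions.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized 2-D images."""
    flat_ref = [p for row in reference for p in row]
    flat_rec = [p for row in reconstructed for p in row]
    if len(flat_ref) != len(flat_rec) or not flat_ref:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_rec)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.57 dB corresponds to the mean squared error falling by a factor of about 10^(0.057), roughly 1.14.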

Fractal image compression based on gray-level co-occurrence matrix and simultaneous orthogonal matching pursuit

YANG Mengmeng, ZHANG Aihua

2021, 41(5): 1445-1449. DOI: 10.11772/j.issn.1001-9081.2020071132

Focused on the high computational complexity and long encoding time problems in the traditional fractal image compression, an orthogonalized fractal encoding algorithm based on texture features of gray-level co-occurrence matrix was proposed. Firstly, from the perspective of feature extraction and image retrieval, the similarity measurement matrix between range blocks and domain blocks was established to transform the global search into the local search, so as to reduce the codebook. Then, by defining a new normalized block as the new gray-level description feature, the transformation process between blocks was simplified. Finally, the concept of Simultaneous Orthogonal Matching Pursuit (SOMP) sparse decomposition orthogonalized fractal encoding was introduced, so that the gray-level matching between blocks was transformed into solving the corresponding sparse coefficient matrix, which realized the matching relationship between one range block and multiple domain blocks. Experimental results show that compared with Sparse Fractal Image Compression (SFIC) algorithm, the proposed algorithm can save about 88% of the encoding time on average without reducing the quality of image reconstruction; compared with the sum of double cross eigenvalues algorithm, the proposed algorithm can significantly shorten coding time while maintaining better reconstruction quality.

Human behavior recognition algorithm based on skeletal temporal divergence feature

TIAN Zhiqiang, DENG Chunhua, ZHANG Junwen

2021, 41(5): 1450-1457. DOI: 10.11772/j.issn.1001-9081.2020081178

Human behavior recognition is an important basic technology in fields such as intelligent monitoring, human-computer interaction and robotics. Graph Convolutional Networks (GCNs) achieve excellent performance in skeleton-based human behavior recognition. However, the following problems exist in research on human behavior recognition using GCNs: 1) the human skeleton points are represented by coordinates, which lack detailed information about the movement of the skeleton points; 2) in some videos, the motion amplitude of the human skeleton is too small, so that the representation information of the key skeleton points is not obvious. To address these problems, firstly, a temporal divergence model of skeleton points was designed to describe the movement states of the skeleton points, which amplified the between-class variances of different human behaviors. In addition, an attention mechanism for temporal divergence features was designed to highlight the key skeleton points and further expand the between-class variances. Finally, a two-stream fusion model was constructed based on the complementarity between the spatial data characteristics of the original skeleton and the temporal divergence characteristics. The proposed algorithm achieved accuracies of 82.9% and 83.7% under the two partitioning strategies of the authoritative human behavior dataset NTU-RGB+D, which were 1.3 percentage points and 0.5 percentage points higher than those of Adaptive Graph Convolutional Network (AGCN) respectively. The improvement in accuracy on the dataset proves the effectiveness of the algorithm.

Fitness action recognition method based on human skeleton feature encoding

GUO Tianxiao, HU Qingrui, LI Jianwei, SHEN Yanfei

2021, 41(5): 1458-1464. DOI: 10.11772/j.issn.1001-9081.2020071113

Fitness action recognition is the core of the intelligent fitness system. In order to improve the accuracy and speed of fitness action recognition, and to reduce the influence of the global displacement of fitness actions on the recognition results, a fitness action recognition method based on human skeleton feature encoding was proposed. The method includes three steps: firstly, a simplified human skeleton model was constructed, and the joint point coordinates of the skeleton model were extracted through human pose estimation; secondly, the action feature region was extracted by using the human central projection method in order to eliminate the influence of global displacement on action recognition; finally, the feature region was encoded as a feature vector and input to a multi-classifier to realize action recognition, and the length of the feature vector was optimized to improve the recognition rate and speed. Experimental results show that the proposed method achieves a recognition rate of 97.24% on the self-built fitness dataset with 28 types of fitness actions, which verifies the effectiveness of the method in recognizing different types of fitness actions; on the public KTH and Weizmann datasets, the recognition rates of the proposed method are 91.67% and 90% respectively, higher than those of other similar methods.

Multi-threshold segmentation of forest fire images based on modified symbiotic organisms search algorithm

JIA Heming, LI Yao, JIANG Zichao, SUN Kangjian

2021, 41(5): 1465-1470. DOI: 10.11772/j.issn.1001-9081.2020081221

To solve the problems that the computational complexity of traditional multi-threshold segmentation methods grows with the number of thresholds and that their efficiency on a given image is very low, a multi-threshold segmentation method based on the Symbiotic Organisms Search (SOS) algorithm combined with the Kapur entropy threshold was proposed. Firstly, Elite Opposition-Based Learning (EOBL) was added into the symbiotic stage of the SOS algorithm, so as to solve the problem that traditional SOS algorithms tend to fall into local optima when dealing with complex optimization problems. Then, the Levy flight mechanism was introduced to expand the search range of the SOS algorithm and enhance the randomness of its search trajectory. Finally, the resulting Modified Symbiotic Organisms Search (MSOS) algorithm was applied to find the optimal threshold values for forest fire images. Experimental results show that, compared with other optimization algorithms such as the Particle Swarm Optimization (PSO) algorithm, Harmony Search Algorithm (HSA) and Bat Algorithm (BA), the MSOS algorithm is superior in segmenting images, and is therefore valuable for practical engineering problems.
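
The objective that the search algorithm maximizes can be sketched as follows: Kapur's criterion splits the gray-level histogram at the thresholds and sums the Shannon entropies of the resulting classes. The function names are hypothetical, and the exhaustive single-threshold search stands in for the metaheuristic only to show what is being optimized.

```python
import math


def _class_entropy(probs):
    """Shannon entropy of one class after renormalising its probabilities."""
    weight = sum(probs)
    if weight == 0:
        return 0.0
    return -sum((p / weight) * math.log(p / weight) for p in probs if p > 0)


def kapur_entropy(hist, thresholds):
    """Kapur objective: sum of class entropies for a histogram split at the thresholds."""
    total = sum(hist)
    probs = [h / total for h in hist]
    bounds = [0] + sorted(thresholds) + [len(hist)]
    return sum(_class_entropy(probs[lo:hi]) for lo, hi in zip(bounds, bounds[1:]))


def best_single_threshold(hist):
    """Exhaustive search over one threshold; the first maximiser is returned."""
    return max(range(1, len(hist)), key=lambda t: kapur_entropy(hist, [t]))
```

With k thresholds on a 256-level histogram the exhaustive search space grows roughly as 256^k, which is the cost that motivates replacing it with a population-based search such as SOS.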

Application of improved DeepLabV3+ model in mural segmentation

CAO Jianfang, TIAN Xiaodong, JIA Yiming, YAN Minmin

2021, 41(5): 1471-1476. DOI: 10.11772/j.issn.1001-9081.2020071101

Aiming at the problems of blurred target boundaries and low image segmentation efficiency in the segmentation of ancient murals, a multi-class image segmentation model fused with a lightweight convolutional neural network, named MC-DM (Multi-Class DeepLabV3+MobileNetV2 (Mobile Networks Version 2)), was proposed. In the model, the DeepLabV3+ architecture and the MobileNetV2 network were combined, and the spatial pyramid structure of DeepLabV3+ was utilized to perform multi-scale fusion of the convolutional features of the mural to reduce the loss of image details during segmentation. First of all, the features of the input image were extracted by MobileNetV2 to ensure the accurate extraction of image information while reducing time consumption. Secondly, the image features were processed through dilated convolution, so that the receptive field was expanded and more semantic information was obtained without changing the number of parameters. Finally, bilinear interpolation was utilized to up-sample the output feature image to obtain a pixel-level prediction segmentation map, so that the accuracy of image segmentation was ensured to the greatest extent. A dataset of 1,000 scanned mural pictures was used for testing in the JetBrains PyCharm Community Edition 2019 environment. Experimental results show that the MC-DM model improves training accuracy by 1% compared with the traditional SegNet (Segment Network)-based image segmentation model and by 2% compared with the image segmentation model based on PSPNet (Pyramid Scene Parsing Network), and that the Peak Signal-to-Noise Ratio (PSNR) of the MC-DM model is 3 to 8 dB higher on average than those of the comparison models, which verifies the effectiveness of the model in mural segmentation. The proposed model provides a new idea for the segmentation of ancient mural images.

Survey of unmanned aerial vehicle cooperative control

MA Ziyu, HE Ming, LIU Zujun, GU Lingfeng, LIU Jintao

2021, 41(5): 1477-1483. DOI: 10.11772/j.issn.1001-9081.2020081314

Unmanned Aerial Vehicle (UAV) cooperative control means that a group of UAVs, relying on inter-aircraft communication and with swarm intelligence at the core, complete a common mission through rational division of labor and cooperation. A UAV swarm is a multi-agent system in which many UAVs with a certain degree of independence carry out various tasks based on local rules. Compared with a single UAV, a UAV swarm has great advantages such as high efficiency, high flexibility and high reliability. In view of the latest developments of UAV cooperative control technology in recent years, firstly, the application prospects of multi-UAV technology were illustrated with examples from civil and military use. Then, the differences and development statuses of the three mainstream cooperative control methods, namely consensus control, flocking control and formation control, were compared and analyzed. Finally, some suggestions on delay, obstacle avoidance and endurance in cooperative control were given to support future research and development of UAV cooperative control.

High-accuracy localization algorithm based on fusion of two-dimensional code vision and laser lidar

LUAN Jianing, ZHANG Wei, SUN Wei, ZHANG Ao, HAN Dong

2021, 41(5): 1484-1491. DOI: 10.11772/j.issn.1001-9081.2020081162

Traditional laser localization algorithms such as the Monte Carlo localization algorithm have low accuracy and poor resistance to robot kidnapping, while traditional two-dimensional code localization algorithms require a complex environmental layout and strictly limit the robot's trajectory. In order to solve these problems, a mobile robot localization algorithm based on two-dimensional code vision and lidar data was proposed. Firstly, computer vision was used by the robot to detect two-dimensional codes in the test environment, and the poses of the detected two-dimensional codes were transformed to map coordinates and fused to generate the prior pose information. Then, the optimized pose was obtained by point cloud alignment with the generated information as the initial pose. At the same time, an odometry-vision supervising mechanism was introduced to handle environmental problems such as missing or wrongly recognized two-dimensional codes and to ensure the smoothness of the poses. Finally, experimental results on a mobile robot show that, compared with the classical Adaptive Monte Carlo Localization (AMCL) algorithm, the proposed algorithm reduces the average error of lidar sampling points by 92% and the average time per pose calculation by 88%, and that it solves the robot kidnapping problem effectively. This algorithm can be applied to indoor robots such as storage robots.

Self-generated deep neural network based 4D trajectory prediction

LI Xujuan, PI Jianyong, HUANG Feixiang, JIA Haipeng

2021, 41(5): 1492-1499. DOI: 10.11772/j.issn.1001-9081.2020081198

Because existing 4-Dimensional (4D) trajectory prediction is not real-time and suffers from iterative error, an Automatically generated Conditional Variational Auto-Encoder (AutoCVAE) was proposed. It predicts the future trajectory directly in an encoding-decoding form, and allows the number of observations and the prediction step to be selected flexibly. The method was driven by preprocessed Automatic Dependent Surveillance-Broadcast (ADS-B) data, with the reduction of the prediction error as the goal. By means of Bayesian optimization, the model structure was searched within a predefined search space. The hyperparameter values in each iteration were chosen with reference to the previous evaluation results, so that the structure of each new model moved closer to the target, and ultimately a high-precision 4D trajectory prediction model based on ADS-B data was obtained. In the experiments, the proposed model was able to predict the trajectory quickly and accurately in real time, with the Mean Absolute Error (MAE) of both latitude and longitude less than 0.03 degrees, the altitude MAE under 30 m, the time error at each time point not exceeding 10 s, and the prediction delay of each trajectory batch within 0.2 s.

Selection of express freight transportation schemes based on rough set over two universes

WANG Xiaorong, ZHANG Yuzhao, ZHANG Zhenjiang

2021, 41(5): 1500-1505. DOI: 10.11772/j.issn.1001-9081.2020071123

Aiming at the problem of express freight scheme decision under multiple uncertain factors, an express freight scheme decision model and decision rules based on the intuitionistic fuzzy rough set over two universes were proposed. Based on the intuitionistic fuzzy rough set theory over two universes, a fuzzy approximate space over two universes for express freight scheme decision was determined. The consumption degrees of fixed cost, transportation cost, transfer cost, carbon emission, transfer time and other transportation indices were regarded as intuitionistic fuzzy numbers, the intuitionistic fuzzy relation between evaluation indices and transportation schemes was used to calculate the lower approximation set and upper approximation set, and the maximum intuitionistic index and Hamming closeness degree were introduced to determine the transportation scheme decision rules. Taking an express freight transportation line from Lanzhou to Beijing as the example, the optimal transportation scheme was selected from the 9 modes of transportation combining road, ordinary-speed railway and air according to the decision rules. Sensitivity analysis of transportation cost and transfer cost was performed to verify the accuracy of the results. The two optimal transportation schemes finally selected show the applicability of the intuitionistic fuzzy rough set over two universes to such problems.
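The abstract does not give the formula for the Hamming closeness degree. As a sketch, one common choice is assumed here: one minus the normalized Hamming distance between two intuitionistic fuzzy sets, where each element is a pair (membership, non-membership) and the hesitation degree is 1 minus their sum. The scheme names and values are made up for illustration.

```python
def hamming_closeness(a, b):
    """Closeness of two intuitionistic fuzzy sets given as lists of (mu, nu) pairs.

    Assumed definition: 1 - (1 / 2n) * sum(|d_mu| + |d_nu| + |d_pi|),
    where pi = 1 - mu - nu is the hesitation degree.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sets must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return 1.0 - total / (2.0 * len(a))

def closest_scheme(schemes, ideal):
    """Return the name of the scheme whose evaluation is closest to the ideal."""
    return max(schemes, key=lambda name: hamming_closeness(schemes[name], ideal))

# Hypothetical evaluations of two schemes on two indices.
example_schemes = {
    "road": [(0.6, 0.3), (0.5, 0.4)],
    "rail": [(0.8, 0.1), (0.7, 0.2)],
}
example_ideal = [(1.0, 0.0), (1.0, 0.0)]
```

Under this definition, a scheme identical to the ideal has closeness 1, and a scheme with fully opposite memberships has closeness 0.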

Collaborative scheduling of rail-mounted gantry crane and container truck in hybrid operation mode of rail-water intermodal terminal

LI Shuyi, HAN Xiaolong

2021, 41(5): 1506-1513. DOI: 10.11772/j.issn.1001-9081.2020071075

In a container rail-water intermodal terminal, the railway operation area is the essential node linking rail transportation and water transportation, and its efficiency influences the efficiency of container rail-water transportation. Firstly, the features of the "ships, trains" operation mode and the "ships, yard, trains" operation mode were analyzed and compared, and a hybrid operation mode was proposed based on the actual operation of container rail-water intermodal terminals. Next, with the goal of minimizing the completion time of the rail-mounted gantry cranes, a mixed integer programming model was developed. The model considered the allowable operating time window constraints of trains and ships, as well as realistic constraints such as the interference and safety margin between rail-mounted gantry cranes and the continuous operation and waiting time of rail-mounted gantry cranes and container trucks. Aiming at the insufficient local search ability of the genetic algorithm, a Hybrid Genetic Algorithm (HGA) was proposed by combining heuristic rules with the genetic algorithm to solve the collaborative scheduling problem of rail-mounted gantry cranes and container trucks. Experimental results verified the effectiveness of the proposed model and the hybrid algorithm. Finally, experiments were designed to analyze the impact of the number of containers, the proportion of quayside containers, the number of rail-mounted gantry cranes and the number of container trucks on the completion time of rail-mounted gantry cranes and container trucks. It is found that, for the same number of containers, the number of rail-mounted gantry cranes should be increased as the proportion of quayside containers increases in order to reduce the completion time.

Battery state-of-charge prediction method based on one-dimensional convolutional neural network combined with long short-term memory network

NI Shuiping, LI Huifang

2021, 41(5): 1514-1521. DOI: 10.11772/j.issn.1001-9081.2020071097

Focused on the issues of accuracy and stability in battery State-Of-Charge (SOC) prediction and of vanishing gradients in deep neural networks, a battery SOC prediction method was proposed that combines a one-Dimensional Convolutional Neural Network (1D CNN) with a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), named the 1D CNN-LSTM (1D CNN combined with LSTM) model. The current, voltage and resistance of the battery were mapped to the target value SOC by the 1D CNN-LSTM model. Firstly, a one-dimensional convolutional layer was used to extract high-level data features from the sample data and make full use of the feature information of the input data. Secondly, an LSTM layer was used to save the historical input information, so as to effectively prevent the loss of important information. Finally, the prediction results of the battery SOC were output through a fully connected layer. The proposed model was trained with experimental data from multiple charge-discharge cycles of the battery, the prediction effects of the 1D CNN-LSTM model under different hyperparameter settings were analyzed and compared, and the weight coefficients and bias parameters of the model were adjusted through training, so that the optimal model setting was determined. Experimental results show that the 1D CNN-LSTM model predicts battery SOC accurately and stably. The Mean Absolute Error (MAE), Mean Square Error (MSE) and maximum prediction error of this model are 0.4027%, 0.0029% and 0.99% respectively.
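To illustrate what the one-dimensional convolutional layer computes, the following sketch slides a single filter over a multichannel time series (for example current, voltage and resistance per time step) and applies a ReLU. It is an assumption-laden illustration only: the kernel size, the activation and the function name are not taken from the paper, real filter weights would be learned during training, and the LSTM and fully connected layers are not shown.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Apply one 1D convolution filter without padding, followed by ReLU.

    sequence: list of time steps, each a list of channel values.
    kernel:   list of k rows, each holding one weight per channel.
    Returns one activation per valid window position.
    """
    k = len(kernel)
    outputs = []
    for start in range(len(sequence) - k + 1):
        acc = bias
        # Weighted sum over the k time steps and all channels in the window.
        for offset in range(k):
            for value, weight in zip(sequence[start + offset], kernel[offset]):
                acc += value * weight
        outputs.append(max(0.0, acc))  # ReLU activation (assumed)
    return outputs
```

For a sequence of T time steps and a kernel of size k, the output has T - k + 1 values, which would form one feature channel of the input to the LSTM layer.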

Drug-target association prediction algorithm based on graph convolutional network

XU Guobao, CHEN Yuanxiao, WANG Ji

2021, 41(5): 1522-1526. DOI: 10.11772/j.issn.1001-9081.2020081186

Traditional drug-target association prediction based on biological experiments can hardly meet the demand of pharmaceutical research because of its low efficiency and high cost. In order to solve the problem, a novel Graph Convolution for Drug-Target Interactions (GCDTI) algorithm was proposed. In GCDTI, graph convolution and auto-encoder technology were combined through semi-supervised learning to construct an encoding layer for integrating node features and a decoding layer for predicting full-link interactive networks. At the same time, graph convolution was used to build a latent factor model and effectively utilize the high-dimensional attribute information of drugs and targets for end-to-end learning. In this method, the input characteristic information could be combined with the known interaction network without preprocessing, which showed that the graph convolution layer of the model was able to effectively fuse the input data and node characteristics. Compared with other advanced methods, GCDTI has the highest prediction accuracy and average Area Under Receiver Operating Characteristic (ROC) Curve (AUC) (0.9246±0.0048), and has strong robustness. Experimental results show that GCDTI, with its end-to-end learning architecture, has the potential to be a reliable predictive method when large amounts of drug and target data need to be predicted.
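How a graph convolution layer fuses node features with the interaction network can be sketched with the widely used propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W). Whether GCDTI uses exactly this rule is an assumption; the abstract does not specify it. The code below is a plain-Python illustration with hypothetical function names, not the authors' model.

```python
import math

def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, features, weights):
    """One graph convolution layer: neighbor averaging, projection, ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in projected]
```

Each output row mixes a node's own features with those of its neighbors, which is how attribute information and the known interaction network end up in one embedding.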

Remaining useful life prediction of DA40 aircraft carbon brake pads based on bidirectional long short-term memory network

XU Meng, WANG Yakun

2021, 41(5): 1527-1532. DOI: 10.11772/j.issn.1001-9081.2020071125

Aircraft brake pads play a very important role in the process of aircraft braking. Accurately predicting the Remaining Useful Life (RUL) of aircraft brake pads is of great significance for reducing braking faults and saving human and material resources. Aiming at the non-stationary and nonlinear characteristics of the aircraft brake pad wear sequence, a model for predicting the RUL of aircraft brake pads based on the Bidirectional Long Short-Term Memory (BiLSTM) network was proposed, namely the VMD-BiLSTM model. Firstly, Variational Mode Decomposition (VMD) was used to decompose the original wear sequence into several sub-sequences with different frequencies and bandwidths to reduce the non-stationarity of the sequence. Then, BiLSTM neural network prediction models were constructed for the decomposed sub-sequences. Finally, the prediction values of the sub-sequences were superimposed to obtain the final prediction result of the brake pad wear value, so as to realize the life prediction of the brake pads. The simulation results show that the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE) of the VMD-BiLSTM model are 0.466 and 0.898% respectively, both of which are better than those of the comparison models, verifying the superiority of the VMD-BiLSTM model.
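The final superposition step and the two reported error measures can be sketched as below. The VMD decomposition and the BiLSTM predictors themselves are not implemented here; the sketch only assumes that each sub-sequence model has already produced a prediction list of equal length. Function names are illustrative.

```python
import math

def superimpose(sub_predictions):
    """Sum the per-sub-sequence predictions element-wise into one forecast."""
    return [sum(values) for values in zip(*sub_predictions)]

def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, `superimpose([[1.0, 2.0], [0.5, 0.5]])` combines two sub-sequence forecasts into the wear forecast that would then be scored with `rmse` and `mape` against the measured wear values.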

Construction of fracture reduction robot system based on cyber-physical systems

FU Zhuoxin, SUN Hao, CHEN Jianwen, GUO Yue, CHEN Jin

2021, 41(5): 1533-1538. DOI: 10.11772/j.issn.1001-9081.2020071133

To solve the problems of secondary injury, muscle dysfunction, stiffness of the affected limb, damage to the blood supply, and poor dynamic performance of postoperative correction in traditional reduction methods of fracture treatment (such as manual reduction, traction reduction, and surgical reduction), a Cyber-Physical and Human System (CPHS) was proposed to guide the reduction movement of the robot. First of all, the composition of the cyber-physical system of the parallel robot was illustrated from aspects of the CPHS such as digital twin, information perception, system integration, surgical procedure, and simulated reduction. The high positioning accuracy and repeatability of the robot were effectively combined with minimally invasive methods to guide doctors through a series of operations such as simulation planning and intraoperative monitoring. Secondly, following the clinical fracture reduction process, reduction experiments were performed under robot operation on 5 groups of simulated fracture cases with different fracture postures. Finally, the residual displacement and angle errors in each experimental group were calculated after the reduction operation and compared with the corresponding data of the traditional reduction methods. Experimental results show that the CPHS fracture reduction robot has obvious advantages in fracture reduction and patient postoperative rehabilitation compared with the traditional reduction methods.
