Artificial intelligence

Online federated incremental learning algorithm for blockchain

LUO Changyin, CHEN Xuebin, MA Chundi, WANG Junyu

Journal of Computer Applications 2021, 41 (2): 363-371. DOI: 10.11772/j.issn.1001-9081.2020050609

Because traditional data processing technology generalizes poorly and does not take multi-source data security into account, a blockchain-oriented online federated incremental learning algorithm was proposed. Ensemble learning and incremental learning were applied within the federated learning framework: the stacking ensemble algorithm was used to integrate the local models, and the model parameters from the training phase were uploaded to the blockchain and synchronized quickly. As a result, the accuracy of the constructed global model fell by only 1%, while security in the training and storage stages was improved, the costs of data storage and model parameter transmission were reduced, and the risk of data leakage caused by model gradient updating was lowered. Experimental results show that the accuracy of the model is over 91.5% and its variance is lower than 10^-5. Compared with the traditional model trained on integrated data, the proposed model is slightly less accurate, but it improves the security of both the data and the model while still guaranteeing model accuracy.

Path planning of mobile robots based on ion motion-artificial bee colony algorithm

WEI Bo, YANG Rong, SHU Sihao, WAN Yong, MIAO Jianguo

Journal of Computer Applications 2021, 41 (2): 379-383. DOI: 10.11772/j.issn.1001-9081.2020060794

For the path planning of mobile robots in storage environments, a path planning method based on the Ion Motion-Artificial Bee Colony (IM-ABC) algorithm was proposed. To improve the convergence speed and search ability of the traditional Artificial Bee Colony (ABC) algorithm in path planning, the method updates the swarm with a strategy that simulates ion motion. Firstly, in the early stage of the algorithm, the anion-cation cross search of the ion motion algorithm was used to update the leading bees and following bees, so as to guide the direction of population evolution and greatly improve the exploitation ability of the population. Secondly, in the late stage of the algorithm, to avoid the local optimum caused by premature convergence in the early stage, the leading bees adopted random search and the following bees used reverse roulette to select honey sources and expand population diversity. Finally, an adaptive floral fragrance concentration was introduced into the global update mechanism to improve the sampling method, yielding the IM-ABC algorithm. Benchmark function tests and simulation experiments show that, compared with the traditional ABC algorithm, the IM-ABC algorithm converges rapidly, reduces the number of iterations by 58.3% and improves the optimization performance by 12.6%, indicating its high planning efficiency.
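
The abstract does not define "reverse roulette", so the following is only a minimal sketch under one assumed reading: a honey source is selected with probability proportional to the inverse of its fitness, so that weaker sources are visited more often and diversity increases. The function names and the requirement that fitness values be strictly positive are assumptions made for illustration, not details from the paper.

```python
import random


def reverse_roulette_probabilities(fitness):
    """Selection probabilities proportional to 1/fitness (assumed definition).

    All fitness values must be strictly positive.
    """
    inverse = [1.0 / f for f in fitness]
    total = sum(inverse)
    return [value / total for value in inverse]


def reverse_roulette_select(fitness, rng):
    """Spin the wheel once and return the index of the chosen honey source."""
    probabilities = reverse_roulette_probabilities(fitness)
    threshold = rng.random()
    cumulative = 0.0
    for index, probability in enumerate(probabilities):
        cumulative += probability
        if threshold < cumulative:
            return index
    return len(fitness) - 1  # guard against floating-point round-off


if __name__ == "__main__":
    rng = random.Random(0)
    print(reverse_roulette_probabilities([1.0, 2.0, 4.0]))
    print(reverse_roulette_select([1.0, 2.0, 4.0], rng))
```

In an ordinary roulette wheel the probabilities would be proportional to the fitness itself; inverting them is what pushes the following bees toward less-exploited sources in this reading.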

Survey of sentiment analysis based on image and text fusion

MENG Xiangrui, YANG Wenzhong, WANG Ting

Journal of Computer Applications 2021, 41 (2): 307-317. DOI: 10.11772/j.issn.1001-9081.2020060923

With the continuous improvement of information technology, the amount of image-text data carrying sentiment orientation on social platforms is growing rapidly, and sentiment analysis based on image and text fusion has attracted wide attention. Sentiment analysis methods built on a single modality can no longer meet the demands of multi-modal data. Focusing on the technical problems of image and text sentiment feature extraction and fusion, this survey first lists the widely used image-text sentiment analysis datasets and introduces the extraction methods for text features and image features. It then concentrates on the current modes of fusing image features with text features and briefly describes the problems that remain in image-text sentiment analysis. Finally, it summarizes future research directions in sentiment analysis and discusses their prospects. To give a deeper understanding of image-text fusion technology, the literature research method was adopted to review studies of image-text sentiment analysis, which helps to compare different fusion methods and to identify more valuable research schemes.

Multi-graph neural network-based session perception recommendation model

NAN Ning, YANG Chengyi, WU Zhihao

Journal of Computer Applications 2021, 41 (2): 330-336. DOI: 10.11772/j.issn.1001-9081.2020060805

Session-based recommendation algorithms rely mainly on information from the target session and fail to make full use of collaborative information from other sessions. To solve this problem, a Multi-Graph neural network-based Session Perception recommendation (MGSP) model was proposed. Firstly, an Item-Transition Graph (ITG) and a Collaborative Relation Graph (CRG) were constructed from the target session and all sessions in the training set. Based on these two graphs, a Graph Neural Network (GNN) was applied to aggregate node information and obtain two types of node representations. Then, a two-layer attention module modelled the two types of node representations to obtain the session-level representation. Finally, an attention mechanism was used to fuse the information into the final session representation, from which the next interaction item was predicted. Comparison experiments were carried out in two scenarios, e-commerce and civil aviation. Experimental results show that the proposed algorithm is superior to the best benchmark model, improving the indicators by more than 1 percentage point on the e-commerce dataset and more than 3 percentage points on the civil aviation dataset, which verifies the effectiveness of the proposed model.

Knowledge reasoning method based on differentiable neural computer and Bayesian network

SUN Jianqiang, XU Shaohua

Journal of Computer Applications 2021, 41 (2): 337-342. DOI: 10.11772/j.issn.1001-9081.2020060843

To address the problems that the Artificial Neural Network (ANN) has limited memory capability for Knowledge Graph (KG) oriented knowledge reasoning and that the KG cannot deal with uncertain knowledge, a reasoning method named DNC-BN was proposed based on the Differentiable Neural Computer (DNC) and the Bayesian Network (BN). Firstly, with a Long Short-Term Memory (LSTM) network as the controller, the output vector and the interface vector of the network were obtained at each moment by processing the input vector and the read vector obtained from the memory. Then, the read and write heads were used to realize the interaction between the controller and the memory: the read weights were used to calculate the weighted average of the data to obtain the read vector, and the write operation was performed by combining the erase vector and the write vector with the write weights, so as to modify the memory matrix. Finally, based on the probabilistic inference mechanism, the BN was used to judge the inference relationship between nodes, and the KG was completed. In the experiments, DNC-BN achieved a Mean Rank of 2615 and a Hits@10 of 0.528 on the WN18RR dataset, and a Mean Rank of 202 and a Hits@10 of 0.519 on the FB15k-237 dataset. Experimental results show that the proposed method performs well in KG-oriented knowledge reasoning.
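
The read and write operations described above can be sketched as follows. This is a minimal illustration that assumes the standard DNC formulation with a single write head (each memory row is scaled by one minus its write weight times the erase vector, then the weighted write vector is added); the abstract gives no equations, and the function names are hypothetical.

```python
def write_memory(memory, write_weights, erase_vec, write_vec):
    """Return a new memory matrix after one erase-then-add write.

    memory: list of rows; write_weights: one weight per row;
    erase_vec and write_vec: one entry per column.
    """
    return [
        [m * (1.0 - w * e) + w * v for m, e, v in zip(row, erase_vec, write_vec)]
        for row, w in zip(memory, write_weights)
    ]


def read_memory(memory, read_weights):
    """Read vector: the average of the memory rows weighted by the read weights."""
    width = len(memory[0])
    return [
        sum(w * row[j] for w, row in zip(read_weights, memory))
        for j in range(width)
    ]


if __name__ == "__main__":
    memory = [[1.0, 2.0], [3.0, 4.0]]
    memory = write_memory(memory, [1.0, 0.0], [1.0, 0.0], [5.0, 6.0])
    print(memory)
    print(read_memory(memory, [0.5, 0.5]))
```

In the full model the weights and vectors are emitted by the LSTM controller through the interface vector; here they are passed in by hand to keep the sketch self-contained.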

Obstacle avoidance path planning algorithm of quad-rotor helicopter based on Bayesian estimation and region division traversal

WANG Jialiang, LI Shuhua, ZHANG Haitao

Journal of Computer Applications 2021, 41 (2): 384-389. DOI: 10.11772/j.issn.1001-9081.2020060962

To improve the real-time performance of image-based obstacle avoidance for quad-rotor helicopters, an obstacle avoidance path planning algorithm was proposed based on Bayesian estimation and region division traversal. Firstly, Bayesian estimation was used to preprocess the video images collected by the quad-rotor helicopter. Secondly, obstacle probability analysis was performed to select key frames from the video images, so as to maximize the real-time performance of the helicopter. Finally, background difference was carried out on the selected frames to identify obstacles, and a pixel traversal algorithm based on region division was implemented to improve the accuracy of obstacle identification. Experimental results show that the proposed algorithm improves the real-time performance of quad-rotor helicopter obstacle avoidance while preserving the ability to identify obstacles, and that the maximum distance between the ideal trajectory and the actual flight trajectory of the quad-rotor helicopter is 25.6 cm, while the minimum distance is 0.2 cm. The proposed obstacle avoidance path planning algorithm provides an efficient way for quad-rotor helicopters to avoid obstacles using video images collected by a camera.

Unmanned aerial vehicle path planning based on improved genetic algorithm

HUANG Shuzhao, TIAN Junwei, QIAO Lu, WANG Qin, SU Yu

Journal of Computer Applications 2021, 41 (2): 390-397. DOI: 10.11772/j.issn.1001-9081.2020060797

To solve the problems of the traditional genetic algorithm, such as slow convergence, a tendency to fall into local optima, unsmooth planned paths and high cost, an Unmanned Aerial Vehicle (UAV) path planning method based on an improved Genetic Algorithm (GA) was proposed. The selection, crossover and mutation operators of the genetic algorithm were improved to plan a smooth and effective flight path. Firstly, an environment model suitable for UAV field information acquisition was established, and a more complex and accurate mathematical model for this scene was built by considering the objective function and constraints of the UAV. Secondly, a hybrid non-multi-string selection operator, an asymmetric mapping crossover operator and a heuristic multi-mutation operator were proposed to find the optimal path and expand the search range of the population. Finally, a cubic B-spline curve was used to smooth the planned path, so as to obtain a smooth flight path and reduce the calculation time of the algorithm. Experimental results show that, compared with the traditional GA, the proposed algorithm reduces the cost value by 68% and the number of convergence iterations by 67%; compared with the Ant Colony Optimization (ACO) algorithm, it reduces the cost value by 55% and the number of convergence iterations by 58%. A large number of comparison experiments indicate that the proposed algorithm converges best when the crossover rate equals the reciprocal of the chromosome size. Tests in different environments show that the proposed algorithm adapts well to its environment and is suitable for path planning in complex environments.
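
The cubic B-spline smoothing step can be illustrated with a short sketch. It assumes a uniform cubic B-spline whose control points are the planned waypoints, with the first and last waypoint repeated so that the curve starts and ends exactly on them; the abstract does not say which knot vector or end condition the authors used, and the function names are hypothetical.

```python
def bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline segment at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_path(waypoints, samples_per_segment=10):
    """Sample a smooth curve guided by a non-empty list of waypoints."""
    # Repeating each end point three times in total clamps the curve to it.
    points = [waypoints[0]] * 2 + list(waypoints) + [waypoints[-1]] * 2
    curve = []
    for i in range(len(points) - 3):
        for s in range(samples_per_segment):
            curve.append(bspline_point(*points[i:i + 4], s / samples_per_segment))
    curve.append(tuple(float(c) for c in waypoints[-1]))
    return curve


if __name__ == "__main__":
    path = [(0, 0), (2, 3), (5, 3), (7, 0)]
    for point in smooth_path(path, samples_per_segment=3):
        print(point)
```

Because each curve point is a convex combination of four neighbouring waypoints, the smoothed path stays inside the convex hull of the planned path, although it generally does not pass through the interior waypoints.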

Pedestrian attribute recognition based on two-domain self-attention mechanism

WU Rui, LIU Yu, FENG Kai

Journal of Computer Applications 2021, 41 (2): 372-378. DOI: 10.11772/j.issn.1001-9081.2020060850

Because different attributes place different requirements on feature granularity and feature dependence in pedestrian attribute recognition tasks, a pedestrian attribute recognition model was proposed based on a two-domain self-attention mechanism composed of a spatial self-attention mechanism and a channel self-attention mechanism. Firstly, ResNet50 was used as the backbone network to extract features carrying a certain amount of semantic information. Then, the features were fed into a two-branch network to extract, respectively, self-attention features with spatial dependence and semantic relevance and global features of the overall information. Finally, the features of the two branches were concatenated, and Batch Normalization (BN) and a weighted loss were used to reduce the impact of imbalanced pedestrian attribute samples. Experimental results on the two pedestrian attribute datasets PETA and RAP show that the proposed model improves the mean accuracy by 3.91 percentage points and 4.05 percentage points respectively compared with the benchmark model, and is highly competitive among existing pedestrian attribute recognition models. The proposed model can be used for the structural description of pedestrians in monitoring scenarios, so as to improve the accuracy and efficiency of pedestrian analysis and retrieval tasks.

Relation extraction model via attention-based graph convolutional network

WANG Xiaoxia, QIAN Xuezhong, SONG Wei

Journal of Computer Applications 2021, 41 (2): 350-356. DOI: 10.11772/j.issn.1001-9081.2020081310

To address the low utilization of sentence dependency tree information and the poor feature extraction in relation extraction tasks, an Attention-guided Gate perceptual Graph Convolutional Network (Att-Gate-GCN) model was proposed. Firstly, a soft pruning strategy based on the attention mechanism was used to assign weights to the edges of the dependency tree, thus mining the effective information in the dependency tree while filtering out useless information. Secondly, a gate perceptual Graph Convolutional Network (GCN) structure was constructed, which increases feature perception through the gating mechanism to obtain more robust relation features, and combines the local and non-local dependency features in the dependency tree to further extract key information. Finally, the key information was fed into the classifier to obtain the relation category label. Experimental results indicate that, compared with the original graph convolutional network relation extraction model, the proposed model increases the F1 score by 2.2 percentage points on the SemEval2010-Task8 dataset and 3.8 percentage points on the KBP37 dataset, showing that it makes full use of the effective information and improves relation extraction ability.

Analysis of double-channel Chinese sentiment model integrating grammar rules

QIU Ningjia, WANG Xiaoxia, WANG Peng, WANG Yanchun

Journal of Computer Applications 2021, 41 (2): 318-323. DOI: 10.11772/j.issn.1001-9081.2020050723

To address the problem that ignoring grammar rules reduces classification accuracy in sentiment analysis of Chinese text, a double-channel Chinese sentiment classification model integrating grammar rules was proposed, namely CB_Rule (grammar Rules of CNN and Bi-LSTM). First, grammar rules were designed to extract information with more explicit sentiment tendencies, and the semantic features were extracted by using the local perception property of the Convolutional Neural Network (CNN). After that, because context may be overlooked when the rules are applied, a Bi-directional Long Short-Term Memory (Bi-LSTM) network was used to extract global features containing contextual information, which were fused with the local features to supplement them, so that the sentiment tendency information in the features of the CNN model was improved. Finally, the improved features were fed into the classifier to judge the sentiment tendency, and the Chinese sentiment model was constructed. The proposed model was compared with R-Bi-LSTM (Bi-LSTM for Chinese sentiment analysis combined with grammar Rules) and the SCNN model (a travel review sentiment analysis model that combines Syntactic rules and CNN) on a Chinese e-commerce review text dataset. Experimental results show that the accuracy of the proposed model is higher by 3.7 percentage points and 0.6 percentage points respectively, indicating that the proposed CB_Rule model classifies well.

Biomedical named entity recognition with graph network based on syntactic dependency parsing

XU Li, LI Jianhua

Journal of Computer Applications 2021, 41 (2): 357-362. DOI: 10.11772/j.issn.1001-9081.2020050738

Existing biomedical named entity recognition methods do not use the syntactic information in the corpus, resulting in low precision. To solve this problem, a biomedical named entity recognition model with a graph network based on syntactic dependency parsing was proposed. Firstly, a Convolutional Neural Network (CNN) was used to generate character vectors, which were concatenated with word vectors and then sent to a Bidirectional Long Short-Term Memory (BiLSTM) network for training. Secondly, syntactic dependency parsing was performed on the corpus sentence by sentence, and the adjacency matrix was constructed. Finally, the output of the BiLSTM and the adjacency matrix constructed by syntactic dependency parsing were sent to a Graph Convolutional Network (GCN) for training, and a graph attention mechanism was introduced to optimize the feature weights of adjacent nodes to obtain the model output. On the JNLPBA dataset and the NCBI-disease dataset, the proposed model reached F1 scores of 76.91% and 87.80% respectively, which were 2.62 and 1.66 percentage points higher than those of the baseline model. Experimental results show that the proposed method can effectively improve the performance of the model in the biomedical named entity recognition task.
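
The step of turning a dependency parse into an adjacency matrix can be sketched as below. The input format (one head index per token, -1 for the root), the undirected edges and the added self-loops are assumptions chosen because they are common in GCN pipelines; the abstract does not specify these details, and the function name is hypothetical.

```python
def dependency_adjacency(heads, self_loops=True):
    """Build a symmetric adjacency matrix for one parsed sentence.

    heads[i] is the index of the syntactic head of token i, or -1 for the root.
    """
    n = len(heads)
    adjacency = [[0] * n for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adjacency[child][head] = 1
            adjacency[head][child] = 1
    if self_loops:
        # Self-loops let each token keep its own features during aggregation.
        for i in range(n):
            adjacency[i][i] = 1
    return adjacency


if __name__ == "__main__":
    # "BRCA1 regulates transcription": token 1 is the root,
    # tokens 0 and 2 both depend on it.
    for row in dependency_adjacency([1, -1, 1]):
        print(row)
```

One such matrix is built per sentence, matching the sentence-level parsing unit described in the abstract, and is passed to the graph layers together with the BiLSTM outputs.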

Personalized social event recommendation method integrating user historical behaviors and social relationships

SUN Heli, XU Tong, HE Liang, JIA Xiaolin

Journal of Computer Applications 2021, 41 (2): 324-329. DOI: 10.11772/j.issn.1001-9081.2020050666

To improve the recommendation of social events in Event-based Social Networks (EBSN), a personalized social event recommendation method combining the historical behaviors and social relationships of users was proposed. Firstly, deep learning technology was used to build a user model from two aspects: the user's historical behaviors and the potential social relationships between users. Then, when modeling user preferences, a negative vector representation of user preferences was introduced, and an attention weight layer was used to assign different weights to different events in the user's historical behaviors and to different friends in the user's social relationships according to the candidate recommendation event. At the same time, the various characteristics of events and groups were considered. Finally, extensive experiments were carried out on real datasets. Experimental results show that this personalized social event recommendation method outperforms the compared Deep User Modeling framework for Event Recommendation (DUMER) and the Deep Interest Network (DIN) model combined with the attention mechanism in terms of Hits Ratio (HR), Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR).

Semantic segmentation method based on edge attention model

SHE Yulong, ZHANG Xiaolong, CHENG Ruoqin, DENG Chunhua

Journal of Computer Applications 2021, 41 (2): 343-349. DOI: 10.11772/j.issn.1001-9081.2020050725

The liver is the main metabolic organ of the human body. At present, the main problems of machine learning in the semantic segmentation of liver images are as follows: 1) the inferior vena cava, soft tissue and blood vessels lie in the middle of the liver, sometimes together with necrosis or hepatic fissures; 2) the boundary between the liver and some adjacent organs is blurred and difficult to distinguish. To solve these problems, the Edge Attention Model (EAM) and the Edge Attention Net (EANet) were proposed within an Encoder-Decoder framework. In the encoder, the residual network ResNet34 pre-trained on ImageNet and the EAM were used to fully capture the detailed feature information of the liver edge; in the decoder, the deconvolution operation and the proposed EAM were used to extract features from the useful information, thereby obtaining the semantic segmentation map of the liver image. Finally, smoothing was applied to segmentation images with a lot of noise. Comparison experiments with AHCNet were conducted on three datasets. On the 3Dircadb dataset, the Volumetric Overlap Error (VOE) and Relative Volume Difference (RVD) of EANet decreased by 1.95 percentage points and 0.11 percentage points respectively, and the DICE accuracy increased by 1.58 percentage points. On the Sliver07 dataset, the VOE, Maximum Surface Distance (MSD) and Root Mean Square Surface Distance (RMSD) of EANet decreased by approximately 1 percentage point, 3.3 mm and 0.2 mm respectively. On a clinical MRI liver image dataset from a hospital, the VOE and RVD of EANet decreased by 0.88 percentage points and 0.31 percentage points respectively, and the DICE accuracy increased by 1.48 percentage points. Experimental results indicate that the proposed EANet segments liver images well.
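
For readers unfamiliar with the overlap metrics reported above, the sketch below computes VOE, RVD and the Dice coefficient from two binary masks. It assumes the commonly used definitions (VOE as one minus intersection over union, RVD as the signed volume difference relative to the ground truth, Dice as twice the intersection over the summed volumes); the abstract does not spell out its formulas, and the function name and flat-list mask format are illustrative only.

```python
def overlap_metrics(prediction, truth):
    """Return (VOE, RVD, Dice) for two equally long, non-empty binary masks."""
    intersection = sum(1 for p, t in zip(prediction, truth) if p and t)
    union = sum(1 for p, t in zip(prediction, truth) if p or t)
    predicted_volume = sum(1 for p in prediction if p)
    true_volume = sum(1 for t in truth if t)

    voe = 1.0 - intersection / union
    rvd = (predicted_volume - true_volume) / true_volume
    dice = 2.0 * intersection / (predicted_volume + true_volume)
    return voe, rvd, dice


if __name__ == "__main__":
    predicted_mask = [1, 1, 1, 0]
    reference_mask = [0, 1, 1, 1]
    print(overlap_metrics(predicted_mask, reference_mask))
```

Lower VOE and RVD magnitudes and a higher Dice value indicate a better segmentation, which is why the abstract reports decreases for the first two and an increase for the last.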

Imputation algorithm for hybrid information system of incomplete data analysis approach based on rough set theory

PENG Li, ZHANG Haiqing, LI Daiwei, TANG Dan, YU Xi, HE Lei

Journal of Computer Applications 2021, 41 (3): 677-685. DOI: 10.11772/j.issn.1001-9081.2020060894

To address the poor imputation capability of the ROUgh Set Theory based Incomplete Data Analysis Approach (ROUSTIDA) on real-world Hybrid Information Systems (HIS) that contain discrete attributes (e.g., integer, string and enumeration), continuous attributes (e.g., floating point) and missing attributes, a Rough Set Theory based Hybrid Information System for Missing Data Imputation Approach (RSHISMIS) was proposed. Firstly, following the idea of partitioning by decision attribute equivalence classes, the HIS was divided to avoid the decision rule conflicts that might occur after imputation. Secondly, a hybrid distance matrix was defined to reasonably quantify the similarity between objects, in order to select the samples with imputation capability and to overcome ROUSTIDA's inability to handle continuous attributes. Thirdly, the nearest-neighbor idea was incorporated to solve the problem that ROUSTIDA cannot impute data with the same missing attribute when the attribute values of non-discriminant objects conflict. Finally, experiments were conducted on 10 UCI datasets, and the proposed method was compared with classical algorithms including ROUSTIDA, K Nearest Neighbor Imputation (KNNI), Random Forest Imputation (RFI) and Matrix Factorization (MF). Experimental results show that the proposed method outperforms ROUSTIDA by 81% on average in recall and by 5% to 53% in precision. Meanwhile, the method reduces the Normalized Root Mean Square Error (NRMSE) by up to 0.12 compared with ROUSTIDA. In addition, the classification accuracy of the method is 7% higher on average than that of ROUSTIDA, and is also better than those of the imputation algorithms KNNI, RFI and MF.

Reinforced automatic summarization model based on advantage actor-critic algorithm

DU Xixi, CHENG Hua, FANG Yiquan

Journal of Computer Applications 2021, 41 (3): 699-705. DOI: 10.11772/j.issn.1001-9081.2020060837

In long-text automatic summarization, extractive summary models tend to be redundant, while abstractive summary models often lose key information, produce inaccurate summaries and repeat generated content. To solve these problems, a Reinforced Automatic Summarization model based on the Advantage Actor-Critic algorithm (A2C-RLAS) for long text was proposed. Firstly, the key sentences of the original text were extracted by an extractor based on a hybrid neural network combining a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). Then, the key sentences were refined by a rewriter based on the copy mechanism and the attention mechanism. Finally, the Advantage Actor-Critic (A2C) algorithm in reinforcement learning was used to train the entire network, with the semantic similarity between the rewritten summary and the reference summary, measured by BERTScore (Evaluating Text Generation with Bidirectional Encoder Representations from Transformers), used as the reward to guide the extraction process and improve the quality of the sentences selected by the extractor. Experimental results on the CNN/Daily Mail dataset show that, compared with models such as the Reinforcement Learning-based Extractive Summarization (Refresh) model, the Recurrent Neural Network based sequence model for extractive summarization (SummaRuNNer) and the Distributional Semantics Reward (DSR) model, A2C-RLAS produces final summaries that are more accurate in content, more fluent in language and markedly less redundant, and it also improves both the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and BERTScore indicators. Compared with the Refresh model and the SummaRuNNer model, the ROUGE-L value of the A2C-RLAS model is increased by 6.3% and 10.2% respectively; compared with the DSR model, the F1 value of the A2C-RLAS model is increased by 30.5%.
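
The role of the reward in advantage actor-critic training can be shown with a small numeric sketch. It assumes a single episode-level reward (standing in for the BERTScore of the rewritten summary) and a single critic estimate, and it uses plain floats instead of a deep learning framework; the function name and the example numbers are invented for illustration and do not come from the paper.

```python
def a2c_losses(log_probs, reward, value_estimate):
    """Return (advantage, actor_loss, critic_loss) for one sampled extraction.

    log_probs: log-probabilities of the sentences the extractor selected.
    reward: scalar reward for the resulting summary (assumed episode-level).
    value_estimate: the critic's prediction of that reward.
    """
    # The advantage says how much better the outcome was than the critic expected.
    advantage = reward - value_estimate
    # Policy-gradient term: raise the probability of choices with positive advantage.
    actor_loss = -sum(log_probs) * advantage
    # The critic is regressed toward the observed reward.
    critic_loss = advantage**2
    return advantage, actor_loss, critic_loss


if __name__ == "__main__":
    print(a2c_losses([-0.5, -1.5], reward=0.9, value_estimate=0.6))
```

In a real training loop the advantage would be treated as a constant when differentiating the actor loss, so that only the critic loss updates the value estimate.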

Salient object detection based on difference of Gaussian feature network

HOU Yunlong, ZHU Lei, CHEN Qin, LYU Suidong

Journal of Computer Applications 2021, 41 (3): 706-713. DOI: 10.11772/j.issn.1001-9081.2020060957

As a cue with a physiological basis, the center-surround contrast theory has been widely used in traditional saliency detection models. However, this theory is rarely applied explicitly in models based on the deep Convolutional Neural Network (CNN). To introduce the classic center-surround contrast theory into deep CNN, a salient object detection model based on a Difference of Gaussian (DoG) feature network was proposed. Firstly, a Difference of Gaussian Pyramid (DGP) structure was constructed on deep features of multiple scales to perceive the locally prominent features of the salient object in an image. Then, the obtained difference features were used to perform weighted selection on the deep features with rich semantic information. Finally, accurate extraction of the salient object was achieved. In addition, the Gaussian smoothing process was implemented with standard one-dimensional convolution in the proposed network design, so as to reduce the computational complexity while allowing end-to-end training of the network. A comparison of the proposed model with six salient object detection algorithms on four public datasets shows that the proposed model achieves the best performance in the quantitative evaluation of Mean Absolute Error (MAE) and maximum F-measure. In particular, on the DUTS-TE dataset, the maximum F-measure and the MAE of the proposed model reach 0.885 and 0.039 respectively. Experimental results show that the proposed model detects salient objects well in complex natural scenes.
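
The idea of computing a Difference of Gaussian with one-dimensional convolutions can be sketched as follows. A two-dimensional Gaussian blur is separable, so it can be applied as a row pass followed by a column pass with the same one-dimensional kernel. The sketch works on a plain 2D list rather than on deep feature maps inside a network, and the kernel radius, border handling and function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Convolve every row with the kernel, replicating border values."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(w * row[min(max(x + k - radius, 0), width - 1)] for k, w in enumerate(kernel))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma, radius):
    """Separable blur: one 1D pass along rows, one along columns."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))


def difference_of_gaussian(image, sigma_center, sigma_surround, radius=2):
    """Center response (narrow blur) minus surround response (wide blur)."""
    center = gaussian_blur(image, sigma_center, radius)
    surround = gaussian_blur(image, sigma_surround, radius)
    return [[c - s for c, s in zip(rc, rs)] for rc, rs in zip(center, surround)]


if __name__ == "__main__":
    image = [[0.0] * 5 for _ in range(5)]
    image[2][2] = 1.0  # a single bright point on a dark background
    for row in difference_of_gaussian(image, 0.5, 2.0):
        print([round(v, 3) for v in row])
```

For a kernel with k taps, the two 1D passes cost about 2k multiplications per pixel instead of k squared for a full 2D kernel, which is the complexity saving the abstract alludes to.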

Human action recognition method based on low-rank action information and multi-scale convolutional neural network

JIANG Li, HUANG Shijian, YAN Wenjuan

Journal of Computer Applications 2021, 41 (3): 721-726. DOI: 10.11772/j.issn.1001-9081.2020060958

Because traditional methods of acquiring action information in human action recognition require cumbersome steps and various assumptions, and considering the superior performance of the Convolutional Neural Network (CNN) in image and video processing, a human action recognition method based on Low-rank Action Information (LAI) and a Multi-scale Convolutional Neural Network (MCNN) was proposed. Firstly, the action video was divided into several segments, and the LAI of each segment was extracted by low-rank learning on that segment; the LAI of all segments was then concatenated along the time axis to obtain the LAI of the whole video, which effectively captured the action information in the video while avoiding cumbersome extraction steps and various assumptions. Secondly, an MCNN model was designed according to the characteristics of LAI. In the model, multi-scale convolution kernels were used to obtain the action characteristics of LAI under different receptive fields, and carefully designed convolution, pooling and fully connected layers were used to further refine the characteristics and finally output the action categories. The performance of the proposed method was verified on the two benchmark databases KTH and HMDB51, and three groups of comparison experiments were designed and carried out. Experimental results show that the recognition rates of the proposed method are 97.33% and 72.05% on the two databases respectively, which are at least 0.67 and 1.15 percentage points higher than those of the Two-Fold Transformation (TFT) and Deep Temporal Embedding Network (DTEN) methods. The proposed method can further promote the wide application of action recognition technology in security, human-computer interaction and other fields.

Face frontalization generative adversarial network algorithm based on face feature map symmetry

LI Hongxia, QIN Pinle, YAN Hanmei, ZENG Jianchao, BAO Qianyue, CHAI Rui

Journal of Computer Applications 2021, 41 (3): 714-720. DOI: 10.11772/j.issn.1001-9081.2020060779

At present, research on face frontalization mainly addresses face yaw, and pays less attention to frontalizing side faces affected by yaw and pitch at the same time in real scenes such as surveillance video. To address this problem, together with the incomplete identity information retained in frontal face images generated from multi-angle side faces, a Generative Adversarial Network (GAN) based on feature map symmetry and a periocular feature preserving loss was proposed. Firstly, based on the prior of face symmetry, a feature map symmetry module was proposed. A face key point detector was used to detect the position of the nasal tip, and the feature map extracted by the encoder was mirrored about the nasal tip, so as to alleviate the lack of facial information at the feature level. Then, drawing on the idea of periocular recognition, the periocular feature preserving loss was added to the existing identity preserving method for generated images, in order to train the generator to produce realistic and identity-preserving frontal face images. Experimental results show that the facial details of the images generated by the proposed algorithm are well preserved, and that the average Rank-1 recognition rate over faces at all angles under pitch in the CAS-PEAL-R1 dataset is 99.03%, indicating that the algorithm can effectively solve the frontalization problem of multi-angle side faces.

Text correction and completion method in continuous sign language recognition

LONG Guangyu, CHEN Yiqiang, XING Yunbing

Journal of Computer Applications 2021, 41 (3): 694-698. DOI: 10.11772/j.issn.1001-9081.2020060798

Abstract （390）

PDF （877KB）（915）

Aiming at the semantic ambiguity and chaotic word order in the text results of video-based continuous sign language recognition, a two-step method was proposed to convert the sign language text of the continuous sign language recognition result into fluent and understandable Chinese text. In the first step, natural sign language rules and the N-gram language model (N-gram) were used to order the text of the continuous sign language recognition results. In the second step, a Bidirectional Long Short-Term Memory (Bi-LSTM) network model was trained on the Chinese universal quantifier dataset to solve the quantifier-free problem of sign language grammar, so as to improve the fluency of texts. The absolute accuracy and the proportion of the longest correct subsequences were adopted as the evaluation indexes of text ordering. Experimental results show that the text ordering results of the proposed method have an absolute accuracy of 77.06% and a proportion of the longest correct subsequences of 86.55%, and that the accuracy of quantifier completion is 97.23%. The proposed method can effectively improve the smoothness and intelligibility of the text results of continuous sign language recognition. It has been successfully applied to video-based continuous sign language recognition, improving the barrier-free communication experience between hearing-impaired and hearing people.
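As a rough illustration of N-gram based text ordering, the sketch below scores every permutation of a short token list with an add-one smoothed bigram model and keeps the best one. The toy English corpus, the smoothing choice and the exhaustive search are assumptions made for illustration; the sign language rules used in the first step of the method are not modeled here.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized sentences with boundary marks."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def reorder(tokens, unigrams, bigrams):
    """Return the permutation of `tokens` that the bigram model scores highest."""
    best = max(permutations(tokens), key=lambda p: log_prob(p, unigrams, bigrams))
    return list(best)


# Hypothetical toy corpus standing in for fluent target-language text.
corpus = [["i", "eat", "apple"], ["i", "drink", "water"], ["you", "eat", "rice"]]
unigrams, bigrams = train_bigram(corpus)
ordered = reorder(["apple", "i", "eat"], unigrams, bigrams)
print(ordered)
```

Exhaustive search grows factorially with sentence length, so a practical system would likely restrict candidates with rules or use beam search instead.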

YOLOv3 compression and acceleration based on ZYNQ platform

GUO Wenxu, SU Yuanqi, LIU Yuehu

Journal of Computer Applications 2021, 41 (3): 669-676. DOI: 10.11772/j.issn.1001-9081.2020060994

Abstract （931）

PDF （1391KB）（1226）

Object detection networks with high accuracy are hard to deploy directly on end devices such as vehicles and drones because of their greatly increased parameter counts and computational cost. To solve this problem, by considering both network compression and computation acceleration, a new compression scheme for residual networks was proposed to compress YOLOv3 (You Only Look Once v3), and the compressed network was then accelerated on the ZYNQ platform. Firstly, a network compression algorithm containing both network pruning and network quantization was proposed. For network pruning, a strategy for the residual structure was introduced to divide pruning into two granularities: channel pruning and residual connection pruning, which overcame the limitations of channel pruning on residual connections and further reduced the number of model parameters. For network quantization, a relative entropy-based simulated quantization was used to quantize the parameters channel by channel, and online statistics of the parameter distribution and of the information loss caused by parameter quantization were collected, so as to help choose the best quantization strategy and reduce the precision loss during quantization. Secondly, an 8-bit convolution acceleration module was designed and optimized on the ZYNQ platform, which optimized the on-chip cache structure and accelerated the compressed YOLOv3 in combination with the Winograd algorithm. Experimental results show that the proposed solution achieves a smaller model size and higher accuracy (7 percentage points higher) than YOLOv3 tiny. Meanwhile, the hardware acceleration method on the ZYNQ platform achieves a higher energy efficiency ratio than other platforms, thus helping the actual deployment of YOLOv3 and other residual networks on ZYNQ end devices.
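Channel-by-channel 8-bit quantization can be sketched as follows. This is a simplified illustration that uses a max-abs scale per channel; it does not implement the relative entropy-based choice of quantization strategy described above, and all names and values are hypothetical.

```python
def quantize_channel(weights, num_bits=8):
    """Symmetric linear quantization of one channel using a max-abs scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the signed range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_channel(q, scale):
    """Map integer levels back to approximate floating-point weights."""
    return [v * scale for v in q]


def quantize_per_channel(kernel):
    """Quantize each channel with its own scale; returns a list of (q, scale)."""
    return [quantize_channel(channel) for channel in kernel]


# Hypothetical kernel with two channels of very different magnitude.
kernel = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
quantized = quantize_per_channel(kernel)
for channel, (q, scale) in zip(kernel, quantized):
    restored = dequantize_channel(q, scale)
    print(q, scale, max(abs(w - r) for w, r in zip(channel, restored)))
```

Giving each channel its own scale keeps the small-magnitude channel from collapsing to a few integer levels, which is the usual motivation for per-channel rather than per-layer quantization.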

Co-training algorithm combining improved density peak clustering and shared subspace

LYU Jia, XIAN Yan

Journal of Computer Applications 2021, 41 (3): 686-693. DOI: 10.11772/j.issn.1001-9081.2020071095

Abstract （307）

PDF （2185KB）（368）

During the iterations of a co-training algorithm, the added unlabeled samples may lack useful information, and the labels assigned to the same sample by multiple classifiers may be inconsistent, which leads to the accumulation of classification errors. To solve these problems, a co-training algorithm combining improved density peak clustering and shared subspace was proposed. Firstly, two base classifiers were obtained by the complementation of attribute sets. Secondly, an improved density peak clustering was performed based on the siphon balance rule. Then, starting from the cluster centers, the unlabeled samples with high mutual neighbor degrees were selected in a progressive manner and labeled by the two base classifiers. Finally, the final categories of the samples with inconsistent labels were determined by the shared subspace obtained by the multi-view non-negative matrix factorization algorithm. In the proposed algorithm, the unlabeled samples that better represent the spatial structure were selected by the improved density peak clustering and the mutual neighbor degree, and samples given different labels by the two classifiers were corrected via the shared subspace, solving the problem of low classification accuracy caused by sample misclassification. The algorithm was validated by comparisons in multiple experiments on 9 UCI datasets, and experimental results show that the proposed algorithm has the highest classification accuracy on 7 datasets and the second highest classification accuracy on the other 2 datasets.
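For orientation, the sketch below computes the two quantities used by basic density peak clustering: the local density of each point and its distance to the nearest point of higher density. It is the plain algorithm with a hard cutoff, not the improved variant based on the siphon balance rule, and the cutoff value and sample points are assumptions.

```python
import math


def density_peaks(points, cutoff):
    """Return (rho, delta) for each point, as in basic density peak clustering."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser point gets its largest distance to any point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Two hypothetical clusters, each with one clearly densest point.
points = [
    (0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
    (10, 10), (11, 10), (9, 10), (10, 11),
]
rho, delta = density_peaks(points, cutoff=1.2)
gamma = [r * d for r, d in zip(rho, delta)]
centers = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2]
print(sorted(centers))
```

Points with both high density and a large distance to any denser point stand out as cluster centers, from which sample selection can then proceed outward.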

Accurate object tracking algorithm based on distance weighting overlap prediction and ellipse fitting optimization

WANG Ning, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2021, 41 (4): 1100-1105. DOI: 10.11772/j.issn.1001-9081.2020060869

Abstract （351）

PDF （2560KB）（298）

In order to solve the problems of Discriminative Correlation Filter (DCF) tracking algorithms, such as model drift, rough scale estimation and tracking failure when the tracked object undergoes rotation or non-rigid deformation, an accurate object tracking algorithm based on Distance Weighting Overlap Prediction and Ellipse Fitting Optimization (DWOP-EFO) was proposed. Firstly, both the overlap and the center distance between bounding boxes were used as the basis for evaluating dynamic anchor boxes, which narrowed the spatial distance between the prediction result and the object region and eased the model drift problem. Secondly, in order to further improve the tracking accuracy, a lightweight object segmentation network was applied to segment the object from the background, and the ellipse fitting algorithm was applied to optimize the segmentation contour and output a stable rotated bounding box, achieving accurate estimation of the object scale. Finally, a scale-confidence optimization strategy was used to gate the output so that only scale results with high confidence were emitted. The proposed algorithm can alleviate the problem of model drift and improve the robustness and accuracy of the tracker. Experiments were conducted on two widely used evaluation datasets, Visual Object Tracking challenge (VOT2018) and Object Tracking Benchmark (OTB100). Experimental results demonstrate that the proposed algorithm improves the Expected-Average-Overlap (EAO) index by 2.2 percentage points compared with Accurate Tracking by Overlap Maximization (ATOM) and by 1.9 percentage points compared with Learning Discriminative Model Prediction for tracking (DiMP). Meanwhile, on the evaluation dataset OTB100, the proposed algorithm outperforms ATOM by 1.3 percentage points on the success rate index and performs especially well on the non-rigid deformation attribute. The proposed algorithm runs at over 25 frame/s on average on the evaluation datasets, achieving real-time tracking.
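One way to combine overlap and center distance when scoring a candidate box is sketched below. It uses a DIoU-style score (IoU minus the squared center distance normalized by the squared diagonal of the enclosing box), which is an assumption chosen for illustration; the abstract does not give the exact weighting used by DWOP-EFO.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def distance_weighted_overlap(a, b):
    """IoU penalized by normalized squared center distance (DIoU-style score)."""
    cax, cay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    cbx, cby = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    center_sq = (cax - cbx) ** 2 + (cay - cby) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    penalty = center_sq / diag_sq if diag_sq > 0 else 0.0
    return iou(a, b) - penalty


# Hypothetical target and two candidates with the same IoU.
target = (0, 0, 4, 4)
centered = (1, 1, 3, 3)   # same center as the target
shifted = (0, 0, 2, 2)    # same IoU, but off-center
print(iou(target, centered), iou(target, shifted))
print(distance_weighted_overlap(target, centered), distance_weighted_overlap(target, shifted))
```

Both candidates overlap the target equally, yet the centered one scores higher, which shows how a distance term can break ties that plain overlap cannot.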

Overview of information extraction of free-text electronic medical records

CUI Bowen, JIN Tao, WANG Jianmin

Journal of Computer Applications 2021, 41 (4): 1055-1063. DOI: 10.11772/j.issn.1001-9081.2020060796

Abstract （696）

PDF （1090KB）（1274）

Information extraction technology can extract the key information in free-text electronic medical records, supporting hospital information management and subsequent information analysis. Therefore, the main process of information extraction from free-text electronic medical records was briefly introduced, the research results of the past few years on single extraction and joint extraction methods for the three most important types of information (named entity, entity assertion and entity relation) were studied, and the methods, datasets, and final effects of these results were compared and summarized. In addition, an analysis of the features, advantages and disadvantages of several popular new methods, a summary of commonly used datasets in the field of information extraction of free-text electronic medical records, and an analysis of the current status and research directions of related fields in China were carried out.

Auto-encoder based multi-view attributed network representation learning model

FAN Wei, WANG Huimin, XING Yan

Journal of Computer Applications 2021, 41 (4): 1064-1070. DOI: 10.11772/j.issn.1001-9081.2020061006

Abstract （336）

PDF （1029KB）（484）

Most traditional network representation learning methods cannot consider the rich structure information and attribute information in the network at the same time, resulting in poor performance on subsequent tasks such as classification and clustering. In order to solve this problem, an Auto-Encoder based Multi-View Attributed Network Representation learning model (AE-MVANR) was proposed. Firstly, the topological structure information of the network was transformed into the Topological Structure View (TSV), and the co-occurrence frequencies of the same attributes between nodes were calculated to construct the Attributed Structure View (ASV). Then, the random walk algorithm was used to obtain a series of node sequences on the two views separately. Finally, by inputting all the generated sequences into an auto-encoder model for training, node representation vectors that integrate structure information and attribute information were obtained. Extensive classification and clustering experiments on several real-world datasets were carried out. The results demonstrate that AE-MVANR outperforms both the widely used network representation learning method based solely on structure information and the one based on both network structure information and node attribute information. Specifically, for the classification results of the proposed model, the maximum increase of accuracy is 43.75%, and for the clustering results of the proposed model, the maximum increase of Normalized Mutual Information (NMI) is 137.95%, the maximum increase of Silhouette Coefficient is 1314.63% and the maximum decrease of Davies Bouldin Index (DBI) is 45.99%.
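The sequence generation step can be sketched with uniform random walks run separately on each view. The two small adjacency dictionaries below are hypothetical stand-ins for the TSV and ASV, and the walk length, number of walks and uniform transition rule are assumptions; a weighted view would need weighted sampling instead.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node of one view."""
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical views over the same four nodes.
topology_view = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
attribute_view = {"a": ["d"], "b": ["c"], "c": ["b"], "d": ["a"]}

topology_walks = random_walks(topology_view, walk_length=5, walks_per_node=2)
attribute_walks = random_walks(attribute_view, walk_length=5, walks_per_node=2, seed=1)
sequences = topology_walks + attribute_walks  # input sequences for the encoder
print(len(sequences), sequences[0])
```

Pooling the walks from both views into one training set is what lets a single encoder see structural and attribute neighborhoods together.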

Robust multi-view clustering algorithm based on adaptive neighborhood

LI Xingfeng, HUANG Yuqing, REN Zhenwen, LI Yihong

Journal of Computer Applications 2021, 41 (4): 1093-1099. DOI: 10.11772/j.issn.1001-9081.2020060828

Abstract （375）

PDF （1021KB）（716）

Since the existing adaptive neighborhood based multi-view clustering algorithms do not consider noise and the loss of consensus graph information, a Robust Multi-View Graph Clustering (RMVGC) algorithm based on adaptive neighborhood was proposed. Firstly, to avoid the influence of noise and outliers on the data, the Robust Principal Component Analysis (RPCA) model was used to learn multiple clean low-rank data from the original data. Secondly, adaptive neighborhood learning was employed to directly fuse the multiple clean low-rank data to obtain a clean consensus affinity graph, thus reducing the information loss in the process of graph fusion. Experimental results demonstrate that the Normalized Mutual Information (NMI) of the proposed algorithm RMVGC is improved by 5.2, 1.36, 27.2, 4.66 and 5.85 percentage points, respectively, compared to the current popular multi-view clustering algorithms on MRSCV1, BBCSport, COIL20, ORL and UCI digits datasets. Meanwhile, in the proposed algorithm, the local structure of data is maintained, the robustness to noise in the original data is enhanced, and the quality of the affinity graph is improved, so that the proposed algorithm has great clustering performance on multi-view datasets.
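For readers unfamiliar with the NMI metric reported above, a minimal sketch follows. It normalizes mutual information by the arithmetic mean of the two entropies, which is one common convention; the abstract does not state which normalization was used, so that choice is an assumption.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same samples."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization; 1.0 if both are single-cluster."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0
    return 2 * mutual_information(a, b) / (h_a + h_b)


truth = [0, 0, 1, 1]
print(nmi(truth, [1, 1, 0, 0]))  # same partition with swapped cluster names
print(nmi(truth, [0, 1, 0, 1]))  # partition independent of the truth
```

Because NMI depends only on how samples are grouped, renaming the clusters does not change the score, which makes it suitable for comparing clustering results against ground-truth classes.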

Weight allocation and case base maintenance method of case-based reasoning classifier

YAN Aijun, WEI Zhiyuan

Journal of Computer Applications 2021, 41 (4): 1071-1077. DOI: 10.11772/j.issn.1001-9081.2020071016

Abstract （264）

PDF （871KB）（787）

As feature weight allocation and case base maintenance have an important influence on the performance of the Case-Based Reasoning (CBR) classifier, a CBR algorithm model named Ant lion and Expectation maximization of Gaussian mixture model CBR (AGECBR) was proposed, in which the Ant Lion Optimizer (ALO) was used to allocate weights and the Expectation Maximization algorithm of Gaussian Mixture Model (GMMEM) was used for case base maintenance. Firstly, the ALO was used to allocate the feature weights. In this process, the classification accuracy of CBR was used as the fitness function of the ALO to iteratively optimize the feature weights, so as to achieve the optimized allocation of feature weights. Secondly, the expectation maximization algorithm of Gaussian mixture model was used to perform clustering analysis on the cases in the case base, and the noise cases and redundant cases in the base were deleted, so as to realize the maintenance of the case base. The experiments were carried out on the UCI standard datasets, on which AGECBR has an average classification accuracy 3.83-5.44 percentage points higher than Back Propagation (BP), k-Nearest Neighbor (kNN) and other classification algorithms. Experimental results show that the proposed method can effectively improve the accuracy of CBR classification.
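The role of feature weights in case retrieval, and the use of classification accuracy as a fitness value, can be illustrated with the sketch below. It assumes weighted Euclidean distance, majority voting over the k nearest cases and a tiny hand-made case base; the ALO search itself is not implemented, only the fitness function it would evaluate.

```python
import math
from collections import Counter


def weighted_distance(x, y, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def cbr_classify(query, case_base, weights, k=3):
    """Retrieve the k most similar cases and return their majority label."""
    nearest = sorted(
        case_base, key=lambda case: weighted_distance(query, case[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fitness(weights, case_base, validation, k=3):
    """Classification accuracy for a weight vector, usable as optimizer fitness."""
    hits = sum(
        cbr_classify(x, case_base, weights, k) == label for x, label in validation
    )
    return hits / len(validation)


# Hypothetical data: only the first feature separates the classes.
case_base = [
    ((1.0, 1.0), "A"), ((1.2, 2.0), "A"), ((0.9, 1.5), "A"),
    ((3.0, 9.0), "B"), ((3.1, 8.5), "B"), ((2.9, 9.5), "B"),
]
validation = [((1.1, 9.0), "A"), ((3.0, 1.2), "B")]
print(fitness((1.0, 1.0), case_base, validation))  # uniform weights
print(fitness((1.0, 0.0), case_base, validation))  # second feature ignored
```

An optimizer such as ALO would search the weight space for vectors that maximize this fitness, instead of comparing two hand-picked candidates as done here.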

Topic-expanded emotional conversation generation based on attention mechanism

YANG Fengrui, HUO Na, ZHANG Xuhong, WEI Wei

Journal of Computer Applications 2021, 41 (4): 1078-1083. DOI: 10.11772/j.issn.1001-9081.2020071063

Abstract （593）

PDF （937KB）（1049）

A growing number of studies focus on emotional conversation generation. However, the existing studies tend to focus only on emotional factors and ignore the relevance and diversity of topics in dialogues, as well as the emotional tendency closely related to topics, which may reduce the quality of generated responses. Therefore, a topic-expanded emotional conversation generation model that integrated topic information and emotional factors was proposed. Firstly, in this model, the conversation context was globally encoded, the topic model was introduced to obtain the global topic words, and the external affective dictionary was used to obtain the global affective words. Secondly, in the fusion module, the topic words were expanded by semantic similarity and the topic-related affective words were extracted by dependency syntax analysis. Finally, the context, topic words and affective words were input into a decoder based on the attention mechanism to prompt the decoder to generate topic-related emotional responses. Experimental results show that the model can generate rich and emotion-related responses. Compared with the model Topic-Enhanced Emotional Conversation Generation (TE-ECG), the proposed model has an average increase of 16.3% and 15.4% in unigram diversity (distinct-1) and bigram diversity (distinct-2) respectively; and compared with Seq2SeqA (Sequence to Sequence model with Attention), the proposed model has an average increase of 26.7% and 28.7% in distinct-1 and distinct-2 respectively.
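The distinct-1 and distinct-2 diversity metrics can be computed as the ratio of unique n-grams to total n-grams over the generated responses. The sketch below pools counts over the whole set of responses, which is a common convention but an assumption here, and the sample responses are hypothetical.

```python
def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over tokenized responses."""
    total = 0
    unique = set()
    for tokens in responses:
        # zip over n shifted copies of the token list yields its n-grams.
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


responses = [["i", "am", "happy"], ["i", "am", "sad"]]
print(distinct_n(responses, 1))  # 4 unique unigrams out of 6
print(distinct_n(responses, 2))  # 3 unique bigrams out of 4
```

Higher values mean less repetition across responses, so a model that always produces the same generic reply scores close to zero.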

β-distribution reduction based on discernibility matrix in interval-valued decision systems

LI Leitao, ZHANG Nan, TONG Xiangrong, YUE Xiaodong

Journal of Computer Applications 2021, 41 (4): 1084-1092. DOI: 10.11772/j.issn.1001-9081.2020040563

Abstract （299）

PDF （935KB）（316）

At present, the scale of interval-valued data is growing rapidly. When classic attribute reduction methods are used to process such data, the data must first be preprocessed, which leads to the loss of original information. To solve this problem, a β-distribution reduction algorithm for the interval-valued decision system was proposed. Firstly, the concept and the reduction target of β-distribution of the interval-valued decision system were given, and the related theories were proved. Then, the discernibility matrix and discernibility function of β-distribution reduction were constructed for the above reduction target, and the β-distribution reduction algorithm of the interval-valued decision system was proposed. Finally, 14 UCI datasets were selected for experimental verification. On the Statlog dataset, when the similarity threshold is 0.6 and the number of objects is 100, 200, 400, 600 and 846 respectively, the average reduction length of the β-distribution reduction algorithm is 1.6, 2.2, 1.4, 2.4 and 2.6 respectively, the average reduction length of the Distribution Reduction Algorithm based on Discernibility Matrix (DRADM) is 2.0, 3.0, 3.0, 4.0 and 4.0 respectively, and the average reduction length of the Maximum Distribution Reduction Algorithm based on Discernibility Matrix (MDRADM) is 2.0, 3.0, 3.0, 4.0 and 3.0 respectively. The effectiveness of the proposed β-distribution reduction algorithm is verified by the experimental results.

Weakly supervised fine-grained image classification algorithm based on attention-attention bilinear pooling

LU Xinwei, YU Pengfei, LI Haiyan, LI Hongsong, DING Wenqian

Journal of Computer Applications 2021, 41 (5): 1319-1325. DOI: 10.11772/j.issn.1001-9081.2020071105

Abstract （371）

PDF （1945KB）（1040）

With the rapid development of artificial intelligence, the purpose of image classification is not only to identify the major categories of objects, but also to classify images of the same category into more detailed subcategories. In order to effectively discriminate small differences between categories, a fine-grained classification algorithm based on Attention-Attention Bilinear Pooling (AABP) was proposed. Firstly, the pre-trained Inception V3 model was applied to extract the global image features, and the local attention regions on the feature map were predicted with depthwise separable convolution. Then, the Weakly Supervised Data Augmentation Network (WS-DAN) was applied to feed the augmented images back into the network, so as to enhance the generalization ability of the network and prevent overfitting. Finally, the linear fusion of the further extracted attention features was performed in the AABP network to improve the classification accuracy. Experimental results show that this method achieves accuracy of 88.51% and top-5 accuracy of 97.65% on CUB-200-2011 dataset, accuracy of 89.77% and top-5 accuracy of 99.27% on Stanford Cars dataset, and accuracy of 93.5% and top-5 accuracy of 97.96% on FGVC-Aircraft dataset.

Multimodal sentiment analysis based on feature fusion of attention mechanism-bidirectional gated recurrent unit

LAI Xuemei, TANG Hong, CHEN Hongyu, LI Shanshan

Journal of Computer Applications 2021, 41 (5): 1268-1274. DOI: 10.11772/j.issn.1001-9081.2020071092

Abstract （972）

PDF （960KB）（1330）

Aiming at the problem that cross-modality interaction and the contribution of each modality to the final sentiment classification results are not considered in multimodal sentiment analysis of video, a multimodal sentiment analysis model of Attention Mechanism based feature Fusion-Bidirectional Gated Recurrent Unit (AMF-BiGRU) was proposed. Firstly, Bidirectional Gated Recurrent Unit (BiGRU) was used to consider the interdependence between utterances in each modality and obtain the internal information of each modality. Secondly, through the cross-modality attention interaction network layer, the internal information of the modalities was combined with the interaction between modalities. Thirdly, an attention mechanism was introduced to determine the attention weight of each modality, and the features of the modalities were effectively fused together. Finally, the sentiment classification results were obtained through the fully connected layer and softmax layer. Experiments were conducted on the open CMU-MOSI (CMU Multimodal Opinion-level Sentiment Intensity) and CMU-MOSEI (CMU Multimodal Opinion Sentiment and Emotion Intensity) datasets. The experimental results show that compared with traditional multimodal sentiment analysis methods (such as Multi-Attention Recurrent Network (MARN)), the AMF-BiGRU model has the accuracy and F1-Score on CMU-MOSI dataset improved by 6.01% and 6.52% respectively, and the accuracy and F1-Score on CMU-MOSEI dataset improved by 2.72% and 2.30% respectively. The AMF-BiGRU model can effectively improve the performance of multimodal sentiment classification.
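The third step, weighting each modality by attention before fusion, can be sketched as a softmax over per-modality scores followed by a weighted sum. The dot-product scoring, the fixed query vector and the feature values below are assumptions for illustration; in a trained model the scoring parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Weight each modality by softmax(dot(feature, query)) and sum the features."""
    names = list(features)
    scores = [sum(f * q for f, q in zip(features[name], query)) for name in names]
    weights = softmax(scores)
    fused = [
        sum(w * features[name][d] for w, name in zip(weights, names))
        for d in range(len(query))
    ]
    return dict(zip(names, weights)), fused


# Hypothetical utterance-level features for three modalities.
features = {"text": [1.0, 0.0], "audio": [0.0, 1.0], "video": [0.5, 0.5]}
weights, fused = attention_fuse(features, query=[2.0, 0.0])
print(weights)
print(fused)
```

The fused vector would then be passed to the fully connected and softmax layers for classification, with the attention weights indicating how much each modality contributed.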

Project Articles