Journal of Computer Applications

Stylistic multiple features mining based on attention network

WU Haiyan, LIU Ying

2020, 40(8): 2171-2181. DOI: 10.11772/j.issn.1001-9081.2019122204

To address the difficulty of mining the features of different registers from large-scale corpora, which requires considerable professional knowledge and manpower, a method for automatically mining the features that distinguish registers was proposed. First, each register was represented by words, parts of speech, punctuation marks and their bigrams, syntactic structures, and multiple combined features. Then, a model combining the attention mechanism with a Multi-Layer Perceptron (MLP) (i.e. an attention network) was used to classify texts into three registers: novel, news and textbook. In this process, the important features that help to distinguish the registers were extracted automatically. Finally, through further analysis of these features, the characteristics of the different registers and some linguistic conclusions were obtained. Experimental results show that novels, news and textbooks differ significantly in words, topic words, word dependencies, parts of speech, punctuation and syntactic structures, which implies that differences in communication objects, purposes, contents and environments naturally lead to diversity in the use of words, parts of speech, punctuation and syntactic structures.

Entity recognition and relation extraction model for coal mine

ZHANG Xinyi, FENG Shimin, DING Enjie

2020, 40(8): 2182-2188. DOI: 10.11772/j.issn.1001-9081.2019122255

In view of the problems of term nesting, polysemy and error propagation between extraction subtasks, a deep attention model framework was proposed. First, an annotation strategy was used to jointly learn the two subtasks of knowledge extraction, solving the problem of error propagation. Second, a projection method combining multiple word vector information was proposed to alleviate the polysemy problem in term extraction in the coal mine field. Third, a deep feature extraction network was designed, and a deep attention model and two model enhancement schemes were proposed to fully extract the semantic information. Finally, the classification layer of the model was analyzed to simplify the model as far as possible while preserving the extraction effect. Experimental results show that, compared with the best model of coding-decoding structure, the proposed model has the F1-score increased by 1.5 percentage points and the model training speed improved by nearly 3 times. The proposed model can effectively complete the two knowledge extraction subtasks in the coal mine field: term extraction and term relationship extraction.

Fake news content detection model based on feature aggregation

HE Hansen, SUN Guozi

2020, 40(8): 2189-2193. DOI: 10.11772/j.issn.1001-9081.2019122114

To address the problem that classification models for fake news content detection cannot achieve good detection performance and good generalization performance at the same time, a model based on feature aggregation was proposed, namely CCNN (Center-Cluster-Neural-Network). First, the global temporal features of the text were extracted by a Bi-directional Long Short-Term Memory (Bi-LSTM) network, and the word or phrase features within a window were extracted by a Convolutional Neural Network (CNN). Second, a feature aggregation layer trained with dual center loss was placed after the CNN pooling layer. Finally, the features from the Bi-LSTM and the CNN were concatenated into a vector along the depth direction and fed to the fully connected layer, and the final classification result was output by the model trained with a uniform loss function (uniform-sigmoid). Experimental results show that the proposed model achieves an F1 value of 80.5%, with a difference of 1.3% between the training and validation sets. Compared with traditional models such as Support Vector Machine (SVM), Naïve Bayes (NB) and Random Forest (RF), the proposed model has the F1 value increased by 9%-14%; compared with neural network models such as Long Short-Term Memory (LSTM) and FastText, the proposed model has the generalization performance increased by 1.3%-2.5%. It can be seen that the proposed model improves classification performance while maintaining generalization ability, so the overall performance is enhanced.

Fake review detection model based on vertical ensemble Tri-training

YIN Chunyong, ZHU Yuhang

2020, 40(8): 2194-2201. DOI: 10.11772/j.issn.1001-9081.2019112046

Fake reviews mislead users and damage their interests, and large-scale manual labeling of reviews is too costly. To address these problems, a fake review detection model based on Vertical Ensemble Tri-Training (VETT) was proposed, which uses the classification models generated in previous iterations to improve detection accuracy. In the model, user behavior characteristics were combined with review text characteristics for feature extraction. In the VETT algorithm, the iterative process was divided into two parts: vertical ensemble within the group and horizontal ensemble between groups. The in-group ensemble constructs an original classifier from the previous iterative models of the classifier, and the inter-group ensemble trains the three original classifiers through the traditional process to obtain the second-generation classifiers of the iteration, thereby improving the accuracy of the labels. Compared with the Co-training, Tri-training, PU learning based on Area Under Curve (PU-AUC) and Vertical Ensemble Co-training (VECT) algorithms, the VETT algorithm has the maximum F1 value increased by 6.5, 5.08, 4.27 and 4.23 percentage points respectively. Experimental results show that the proposed VETT algorithm has better classification performance.

Aspect-based sentiment analysis with self-attention gated graph convolutional network

CHEN Jiawei, HAN Fang, WANG Zhijie

2020, 40(8): 2202-2206. DOI: 10.11772/j.issn.1001-9081.2019122154

Aspect-based sentiment analysis tries to estimate different emotional tendencies expressed in different aspects of a sentence. Aiming at the problem that the existing network model based on Recurrent Neural Network (RNN) combined with attention mechanism has too many training parameters and lacks explanation of related syntax constraints and long distance word dependence mechanism, a self-attention gated graph convolutional network was proposed, namely MSAGCN. First, the multi-headed self-attention mechanism was used to encode context words and targets, thus capturing semantic associations within the sentence. Then, a graph convolutional network was established on the sentence's dependency tree to obtain syntactic information and word dependencies. Finally, the sentiment of the specific target was obtained through the GTRU (Gated Tanh-ReLU Unit). Compared with the baseline model, the proposed model has the accuracy and F1 improved by 1%-3.3% and 1.4%-6.3% respectively. At the same time, the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model was also applied to the current task to further improve the model effect. Experimental results verify that the proposed model can better grasp the emotional tendencies of user reviews.
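
The gating step named in this abstract can be illustrated with a minimal sketch. It assumes the usual definition of a Gated Tanh-ReLU Unit, an element-wise product of a tanh branch and a ReLU branch; the function name `gtru` and the two input vectors are hypothetical, and how the paper computes the two pre-activations is not stated in the abstract.

```python
import math


def gtru(content, gate):
    """Element-wise Gated Tanh-ReLU Unit: tanh(content) * relu(gate).

    `content` and `gate` are assumed to be pre-activation vectors of equal
    length (for example, sentence features and aspect-conditioned features).
    """
    if len(content) != len(gate):
        raise ValueError("inputs must have the same length")
    return [math.tanh(c) * max(0.0, g) for c, g in zip(content, gate)]
```

A negative gate value switches the corresponding feature off entirely, which is how the unit filters out features unrelated to the target.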

Sentiment prediction of small sample abstract painting image based on feature fusion

BAI Ruyi, GUO Xiaoying, JIA Chunhua

2020, 40(8): 2207-2213. DOI: 10.11772/j.issn.1001-9081.2019122169

Painting image sentiment prediction is a research hotspot in affective computing. At present, there are few sources of abstract paintings and a small sample size; most of its sentiment analysis uses low-level features of the image, and the accuracy is not high. To resolve these problems, a sentiment prediction of small sample abstract painting image based on feature fusion was proposed. First, the relationship between the basic elements of abstract painting (point, line, plane and color) and human emotions in abstract art theory was analyzed, and according to these theories, the low-level features of abstract painting image were quantified. Second, the transfer learning algorithm was adopted to obtain the parameters from large sample data in the pre-training network, and these parameters were transferred to the target model, and then the target model was fine-tuned on the small sample data to obtain the high-level features of the image. Finally, the low-level and high-level features were fused linearly, and the multi-class Support Vector Machine (SVM) was used to achieve the sentiment prediction of abstract painting image. The experiments were carried out on three small sample abstract painting datasets, and the proposed method was compared with the methods of directly using low-level features. The results show that the classification accuracy of the proposed algorithm is improved, confirming its effectiveness in sentiment research of small sample abstract painting.

Dynamic weighted siamese network tracking algorithm

XIONG Changzhen, LI Yan

2020, 40(8): 2214-2218. DOI: 10.11772/j.issn.1001-9081.2019122195

In order to improve the tracking accuracy of fast online target tracking and segmentation algorithm, a dynamic weighted siamese network tracking algorithm was proposed. First, the template features extracted from the initial frame and the template features extracted from each frame were learned and fused to improve the generalization ability of the tracker. Second, in the process of obtaining the target mask by the mask branch, the features were fused in a weighting method, so as to reduce the interference caused by redundant features and improve the tracking accuracy. The algorithm was evaluated on the VOT2016 and VOT2018 datasets. The results show that the proposed algorithm has the expected average overlap rate of 0.450 and 0.390 respectively, the accuracy of 0.649 and 0.618 respectively, and the robustness of 0.205 and 0.267 respectively, all of which are higher than those of baseline algorithm. The tracking speed of the proposed algorithm is 34 frame/s, which meets the requirements of real-time tracking. The proposed algorithm effectively improves the tracking accuracy, and completes the tracking task well in a complex tracking environment.

Multi-domain convolutional neural network based on self-attention mechanism for visual tracking

LI Shengwu, ZHANG Xuande

2020, 40(8): 2219-2224. DOI: 10.11772/j.issn.1001-9081.2019122139

In order to solve the model drift problem of Multi-Domain convolutional neural Network (MDNet) when the target moves rapidly and the appearance changes drastically, a Multi-Domain convolutional neural Network based on Self-Attention (SAMDNet) was proposed to improve the performance of the tracking network from the dimensions of channel and space by introducing the self-attention mechanism. First, the spatial attention module was used to selectively aggregate the weighted sum of features at all positions to all positions in the feature map, so that the similar features were related to each other. Then, the channel attention module was used to selectively emphasize the importance of interconnected channels by aggregating all feature maps. Finally, the final feature map was obtained by fusion. In addition, in order to solve the problem of inaccurate classification of the network model caused by the existence of many similar sequences with different attributes in training data of MDNet algorithm, a composite loss function was constructed. The composite loss function was composed of a classification loss function and an instance discriminant loss function. First of all, the classification loss function was used to calculate the classification loss value. Second, the instance discriminant loss function was used to increase the weight of the target in the current video sequence and suppress its weight in other sequences. Lastly, the two losses were fused as the final loss of the model. The experiments were conducted on two widely used testing benchmark datasets OTB50 and OTB2015. Experimental results show that the proposed algorithm improves success rate index by 1.6 percentage points and 1.4 percentage points respectively compared with the champion algorithm MDNet of the 2015 Visual-Object-Tracking challenge (VOT2015). The results also show that the precision rate and success rate of the proposed algorithm exceed those of the Continuous Convolution Operators for Visual Tracking (CCOT) algorithm, and the precision rate index of it on OTB50 is also superior to the Efficient Convolution Operators (ECO) algorithm, which verifies the effectiveness of the proposed algorithm.

Object detection of Gaussian-YOLO v3 implanting attention and feature intertwine modules

LIU Dan, WU Yajuan, LUO Nanchao, ZHENG Bochuan

2020, 40(8): 2225-2230. DOI: 10.11772/j.issn.1001-9081.2020010030

Wrong object detection may lead to serious accidents, so high-precision object detection is very important in autonomous driving. An object detection method of Gaussian-YOLO v3 combining attention and feature intertwine module was proposed, in which several specific feature maps were mainly improved. First, the attention module was added to the feature map to learn the weight of each channel autonomously, enhancing the key features and suppressing the redundant features, so as to enhance the network ability to distinguish foreground object and background. Second, at the same time, different channels of the feature map were intertwined to obtain more representative features. Finally, the features obtained by the attention and feature intertwine modules were fused to form a new feature map. Experimental results show that the proposed method achieves mAP (mean Average Precision) of 20.81% and F₁ score of 18.17% on BDD100K dataset, and has the false alarm rate decreased by 3.5 percentage points, reducing the false alarm rate effectively. It can be seen that the detection performance of the proposed method is better than those of YOLO v3 and Gaussian-YOLO v3.

Recognition of two-person interaction behavior based on key gestures

YANG Wenlu, YU Mengmeng, XIE Hong

2020, 40(8): 2231-2235. DOI: 10.11772/j.issn.1001-9081.2019122223

Two-person interaction behavior recognition is widely applied, but its efficiency is low. To address this problem, a method of two-person interaction behavior recognition based on key gestures was proposed. First, the key frames were extracted by comparing the differences between frames. Second, the key gestures in the key frames were determined by using the variance and spatial relationship of the angle changes of the bone points. Then, the key gestures were represented by features such as joint distance, angle, and joint motion, and every key gesture was expressed as a feature matrix. Finally, the combination with the best recognition rate was selected by comparing different combinations of dimension reduction and classification methods. The proposed recognition method was evaluated on the SBU interaction dataset and a self-built interaction dataset, and its recognition rate reached 92.47% and 94.14% respectively. Experimental results show that representing actions by extracting the features of key gestures to form feature matrices can effectively improve the recognition of two-person interaction behavior.

Behavior recognition method based on two-stream non-local residual network

ZHOU Yun, CHEN Shurong

2020, 40(8): 2236-2240. DOI: 10.11772/j.issn.1001-9081.2020010041

The traditional Convolutional Neural Network (CNN) can only extract local features for human behaviors and actions, which leads to low recognition accuracy for similar behaviors. To resolve this problem, a two-stream Non-Local Residual Network (NL-ResNet) based behavior recognition method was proposed. First, the RGB (Red-Green-Blue) frame and the dense optical flow graph of the video were extracted, which were used as the inputs of spatial and temporal flow networks, respectively, and a pre-processing method combining corner cropping and multiple scales was used to perform data enhancement. Second, the residual blocks of the residual network were used to extract local appearance features and motion features of the video respectively, then the global information of the video was extracted by the non-local CNN module connected after the residual block, so as to achieve the crossover extraction of local and global features of the network. Finally, the two branch networks were classified more accurately by A-softmax loss function, and the recognition results after weighted fusion were output. The method makes full use of global and local features to improve the representation capability of the model. On UCF101 dataset, NL-ResNet achieves a recognition accuracy of 93.5%, which is 5.5 percentage points higher compared to the original two-stream network. Experimental results show that the proposed model can better extract behavior features, and effectively improve the behavior recognition accuracy.

Network situation prediction method based on deep feature and Seq2Seq model

LIN Zhixing, WANG Like

2020, 40(8): 2241-2247. DOI: 10.11772/j.issn.1001-9081.2020010010

In view of the problem that most existing network situation prediction methods are unable to mine the deep information in the data and need to manually extract and construct features, a deep feature network situation prediction method named DFS-Seq2Seq (Deep Feature Synthesis-Sequence to Sequence) was proposed. First, the data produced by network streams, logs and system events were cleaned, and the deep feature synthesis algorithm was used to automatically synthesize the deep relation features. Then the synthesized features were extracted by the AutoEncoder (AE). Finally, the data was estimated by using the Seq2Seq (Sequence to Sequence) model constructed by Long Short-Term Memory (LSTM). Through a well-designed experiment, the proposed method was verified on the public dataset Kent2016. Experimental results show that when the depth is 2, compared with four classification models including Support Vector Machine (SVM), Bayes, Random Forest (RF) and LSTM, the proposed method has the recall rate increased by 7.4%, 11.5%, 6.5% and 3.0%, respectively. It is verified that DFS-Seq2Seq can effectively identify dangerous events in network authentication and effectively predict network situation in practice.

Clustering tendency analysis algorithm based on data stream

FAN Zhongxin

2020, 40(8): 2248-2254. DOI: 10.11772/j.issn.1001-9081.2020010057

Focusing on the issues that sampling-based clustering tendency analysis algorithms produce an unstable and one-sided clustering tendency index, and that clustering tendency parameters need to be computed repeatedly because the algorithms do not suit the batch incremental property of data streams, an improved Clustering Tendency Index analysis algorithm based on Minimum Distance Connected Graph (MDCG) was proposed, namely MDCG-CTI, which performs overall analysis on all data. First, MDCG was built with complexity optimization by using stack depth-first traversal to update the nearest path of incremental data; then the clustering tendency index was computed to determine the judgment threshold of clustering; finally, the proposed algorithm was integrated with batch incremental Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Experimental results on self-built datasets show that the proposed algorithm determines clusterability more accurately than existing algorithms for single-cluster data and data with a large amount of noise. On the large datasets pendigits and avila, the proposed algorithm has the time consumption reduced by 38% and 42% compared to Spectral Visual Assessment of cluster Tendency (SpecVAT); meanwhile, the proposed algorithm combined with batch incremental DBSCAN has average accuracy of clustering increased by 6% and 11% and time consumption of clustering reduced by 7% and 8% compared to SpecVAT combined with batch incremental DBSCAN. It can be seen that the proposed algorithm not only determines clustering tendency nonparametrically and accurately, but also improves effectiveness and operational efficiency of incremental clustering.

Focused crawler method combining ontology and improved Tabu search for meteorological disaster

LIU Jingfa, GU Yaoping, LIU Wenjie

2020, 40(8): 2255-2261. DOI: 10.11772/j.issn.1001-9081.2019122238

Considering the problems that the traditional focused crawler is easy to fall into local optimum and has insufficient topic description, a focused crawler method combining Ontology and Improved Tabu Search (On-ITS) was proposed. First, the topic semantic vector was calculated by ontology semantic similarity, and the Web page text feature vector was constructed by Hyper Text Markup Language (HTML) Web page text feature position weighting. Then, the vector space model was used to calculate the topic relevance of Web pages. On this basis, in order to analyze the comprehensive priority of link, the topic relevance of the link anchor text and the PR (PageRank) value of Web page to the link were calculated. In addition, to avoid the crawler falling into local optimum, the focused crawler based on ITS was designed to optimize the crawling queue. Experimental results of the focused crawler on the topics of rainstorm disaster and typhoon disaster show that, under the same environment, the accuracy of the On-ITS method is higher than those of the contrast algorithms by maximum of 58% and minimum of 8%, and other evaluation indicators of the proposed algorithm are also very excellent. On-ITS focused crawler method can effectively improve the accuracy of obtaining domain information and catch more topic-related Web pages.
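
The vector space model step mentioned in this abstract is commonly realized as cosine similarity between a topic vector and a page vector. The sketch below assumes that reading; the function name `cosine_relevance` and the sparse term-to-weight dictionaries are illustrative, and the ontology-based topic weights and HTML position weights used by the authors are not reproduced here.

```python
import math


def cosine_relevance(topic_vec, page_vec):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    shared_terms = set(topic_vec) & set(page_vec)
    dot = sum(topic_vec[t] * page_vec[t] for t in shared_terms)
    norm_topic = math.sqrt(sum(w * w for w in topic_vec.values()))
    norm_page = math.sqrt(sum(w * w for w in page_vec.values()))
    if norm_topic == 0.0 or norm_page == 0.0:
        return 0.0  # an empty vector is treated as irrelevant
    return dot / (norm_topic * norm_page)
```

A crawler could then compare the returned score against a threshold to decide whether a page counts as topic-related; the threshold value is a tuning choice not given in the abstract.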

Terrorist attack organization prediction method based on feature selection and hyperparameter optimization

XIAO Yuelei, ZHANG Yunjiao

2020, 40(8): 2262-2267. DOI: 10.11772/j.issn.1001-9081.2019122141

Aiming at the difficulty of finding terrorist attack organizations and the imbalance of terrorist attack data samples, a terrorist attack organization prediction method based on feature selection and hyperparameter optimization was proposed. First, by taking advantage of Random Forest (RF) in dealing with imbalanced data, the backward feature selection was carried out through the RF iteration. Second, four mainstream classifiers including Decision Tree (DT), RF, Bagging and XGBoost were used to classify and predict terrorist attack organizations, and the Bayesian optimization method was used to optimize the hyperparameters of these classifiers. Finally, the Global Terrorism Database (GTD) was used to evaluate the classification prediction performance of these classifiers on the majority class samples and minority class samples. Experimental results show that the proposed method improves the classification and prediction performance of terrorist attack organizations, and the classification and prediction performance is the best when using RF and Bagging, with the accuracy of 0.8239 and 0.8316 respectively. Especially for minority class samples, the classification and prediction performance when using RF and Bagging is significantly improved.

Two-channel dynamic data encryption strategy in cloud computing environment

LYU Jiayu, ZHU Zhirong, YAO Zhiqiang

2020, 40(8): 2268-2273. DOI: 10.11772/j.issn.1001-9081.2020010113

In the case of limited mobile device performance, a Two-channel Dynamic Encryption Strategy (TDES) based on greedy algorithm was proposed to perform selective encryption of data packets, so as to maximize the total privacy weight of packets in a limited time. First, the data packets were roughly classified into two categories according to their privacy weights. Then, the weight ranking table was calculated from the privacy weight and the encryption time of the different data packets and sorted in descending order. The two types of data packets corresponded to two transmission channels, and the packet with the maximum privacy weight was encrypted for transmission until the end of the transmission time. Finally, the remaining time inside each channel was checked, and the transmission channels of some packets were adjusted until the remaining time was less than the encryption time of any packet. The simulation of packet transmission tests shows that, compared with Dynamic Data Encryption Strategy (D2ES) and the greedy algorithm under the same time limit, the total privacy weight of the proposed strategy was increased by 9.5% and 10.3%, and the running time of the proposed strategy was reduced by 10.8% and 8.5%. Experimental results verify that the proposed TDES has shorter computation time and higher efficiency, and can well balance data security and equipment performance.
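
The greedy selection idea in this abstract can be sketched as follows. This is a single-channel simplification under stated assumptions, not the TDES procedure itself: packets are ranked by the ratio of privacy weight to encryption time (one plausible reading of the "weight ranking table"), and the names `Packet` and `select_for_encryption` are hypothetical. The two-channel split and the final channel adjustment step are omitted.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    name: str
    privacy_weight: float
    encrypt_time: float  # must be positive


def select_for_encryption(packets, time_budget):
    """Greedily pick packets to encrypt within a time budget.

    Packets are ranked by privacy weight per unit of encryption time
    (an assumed ranking rule), highest first.
    """
    ranked = sorted(
        packets,
        key=lambda p: p.privacy_weight / p.encrypt_time,
        reverse=True,
    )
    chosen, remaining = [], time_budget
    for packet in ranked:
        if packet.encrypt_time <= remaining:
            chosen.append(packet)
            remaining -= packet.encrypt_time
    return chosen, remaining
```

The loop stops adding packets once the remaining time is smaller than the encryption time of every unselected packet, which mirrors the stopping condition described in the abstract.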

Improved consensus mechanism of blockchain based on proof-of-work and proof-of-stake

WU Mengyu, ZHU Guosheng, WU Shanchao

2020, 40(8): 2274-2278. DOI: 10.11772/j.issn.1001-9081.2019122206

At present, the Proof-of-Work (PoW) mechanism of blockchain wastes a lot of computational power and electric power, and the Proof-of-Stake (PoS) mechanism is prone to bifurcation and makes the rich get richer due to the nothing-at-stake and unlimited growth of rights and interests, which cannot guarantee the stability of blockchain. In order to overcome the shortcomings of the two above mechanisms, an improved consensus mechanism of blockchain named Proof of Work and Stake (PoWaS) was proposed based on PoW and PoS. First, the difficulty of hashing calculation was reduced and the maximum value of difficulty was limited to reduce the computational power and electric power resources spent in searching for nonces. Second, the upper limits for the effective coin holding time and the coin age were set to prevent the problem of unlimited wealth of the rich caused by the infinite growth of the coin age. Third, with the concept of credit value adopted, the credit value was assigned to each node, and the credit value was increased or decreased based on node behaviors. Finally, the competition waiting time was added, and the time spent searching for nonces, coin age, and credit value were used to calculate a value named pStake. The node with the largest pStake gained the right to pack and account. A blockchain of PoWaS consensus mechanism with six nodes was built to carry out experiments. Experimental results show that the PoWaS can reduce the waste of computational power, speed up the block mining and balance the competition of accounting rights.

Lightweight detection technology of typosquatting based on visual features

ZHU Yi, NING Zhenhu, ZHOU Yihua

2020, 40(8): 2279-2285. DOI: 10.11772/j.issn.1001-9081.2019111952

Recently, botnets, domain name hijacking, phishing websites and other typosquatting attacks are more and more frequent, seriously threatening the security of society and individuals. Therefore, the typosquatting detection is an important part of network protection. The current typosquatting detections mainly focus on public domain names, and the detection methods are mainly based on edit distance which is difficult to fully reflect the visual characteristics of domain names. In addition, using the related information of the given domains for determination can help to increase the detection efficiency, but it also introduces a large additional cost. Based on this, a lightweight detection strategy only based on domain name strings was adopted for typosquatting detection. By comprehensively considering the influence of character locations, character similarities and operation types on the vision of domain names, the edit distance algorithm based on visual characteristics was proposed. According to the characteristics of typosquatting, firstly the domain names were preprocessed, then different weights were given to the characters according to their positions, character similarities and operation types, and finally, the typosquatting determination was performed by calculating the edit distance value. Experimental results show that compared with the detection method based on edit distance, the typosquatting lightweight detection method based on visual features has the F1 value increased by 5.98% and 13.56% respectively when the threshold value is 1 and 2, which proves that the proposed method has a good detection effect.
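
A weighted edit distance of the kind described here can be sketched with the standard dynamic-programming recurrence. The similar-character pairs, the 0.3 substitution cost and the insertion and deletion costs below are illustrative assumptions, not the weights from the paper, and position-dependent weighting is left out to keep the sketch short.

```python
# Hypothetical pairs of visually similar characters.
SIMILAR_PAIRS = {frozenset("o0"), frozenset("l1"), frozenset("i1"), frozenset("il")}


def substitution_cost(a, b):
    """Cheaper substitution when two characters look alike (assumed costs)."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in SIMILAR_PAIRS:
        return 0.3
    return 1.0


def visual_edit_distance(source, target, insert_cost=1.0, delete_cost=1.0):
    """Edit distance with per-operation and per-character-pair costs."""
    previous = [j * insert_cost for j in range(len(target) + 1)]
    for i, src_char in enumerate(source, start=1):
        current = [i * delete_cost]
        for j, tgt_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + delete_cost,       # delete src_char
                current[j - 1] + insert_cost,    # insert tgt_char
                previous[j - 1] + substitution_cost(src_char, tgt_char),
            ))
        previous = current
    return previous[-1]
```

With these costs, a look-alike domain such as `g00gle.com` ends up much closer to `google.com` than a plain edit distance would report, which is the property a visually motivated detector relies on.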

Secure communication scheme of unmanned aerial vehicle system based on MAVLink protocol

ZHANG Linghao, WANG Sheng, ZHOU Hui, CHEN Yifan, GUI Shenglin

2020, 40(8): 2286-2292. DOI: 10.11772/j.issn.1001-9081.2019122160

MAVLink is a lightweight communication protocol between Unmanned Aerial Vehicle (UAV) and Ground Control Station (GCS). It defines a set of bi-directional messages between UAV and GCS, including UAV states and GCS control commands. However, the MAVLink protocol lacks sufficient security mechanisms, and its security vulnerabilities may cause serious threats and hidden dangers. To resolve these problems, a secure communication scheme for the UAV system based on the MAVLink protocol was proposed. First, the connection requests were broadcast by the UAV constantly and alternately; then the public key was sent to the UAV by the GCS, the DH algorithm was used by both sides to negotiate a shared key, and the AES algorithm was used to encrypt the MAVLink message packages, achieving identity authentication. If the UAV did not receive the public key sent by the GCS within the specified time, or a decryption error on a MAVLink message package happened, the UAV would actively disconnect and generate a new public key to rebroadcast the connection request. In addition, concerning the security problem of the UAV system being maliciously tampered with, the system firmware was self-checked during booting. Finally, based on the formal verification platform UPPAAL, it was proved that the proposed scheme has the security properties of liveness, connectability and connection uniqueness. Results of the communication process between UAV PX4 1.6.0 and GCS QGroundControl 3.5.0 show that the proposed secure communication scheme can prevent malicious eavesdropping, message tampering, man-in-the-middle attacks and other malicious attacks in the communication process between UAV and GCS, and solves the security vulnerabilities of the MAVLink protocol well with little effect on UAV performance.
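
The key negotiation step can be illustrated with a toy Diffie-Hellman exchange. The sketch below uses a deliberately tiny textbook group (p = 23, g = 5) so the arithmetic is easy to follow; it is not secure, the function names are hypothetical, and hashing the shared secret with SHA-256 to obtain a 32-byte key is an assumption rather than the scheme from the paper. The AES encryption of MAVLink packets is not shown.

```python
import hashlib
import secrets

# Toy parameters for illustration only; real deployments use large standardized groups.
P = 23
G = 5


def dh_keypair(private=None):
    """Return (private, public) where public = G^private mod P."""
    if private is None:
        private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def dh_shared_key(own_private, peer_public):
    """Derive a 32-byte key from the shared secret (assumed derivation)."""
    shared_secret = pow(peer_public, own_private, P)
    return hashlib.sha256(shared_secret.to_bytes(2, "big")).digest()
```

In this picture the UAV and the GCS each call `dh_keypair`, exchange only the public values, and call `dh_shared_key` with the peer's public value; both sides arrive at the same key without ever transmitting it.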

Computation offloading strategy based on particle swarm optimization in mobile edge computing

LUO Bin, YU Bo

2020, 40(8): 2293-2298. DOI: 10.11772/j.issn.1001-9081.2019122200

Computation offloading is one of the means to reduce delay and save energy in Mobile Edge Computing (MEC). Through reasonable offloading decisions, industrial costs can be greatly reduced. Aiming at the problems of long delay and high energy consumption after the deployment of MEC servers in the industrial production line, a computation offloading strategy based on Particle Swarm Optimization (PSO), namely PSAO, was proposed. First, the actual problem was modeled as a delay model and an energy consumption model. Since the strategy targets delay-sensitive applications, the model was transformed into a delay minimization problem under energy consumption constraints, and a penalty function was used to balance delay and energy consumption. Second, the computation offloading decision vector was obtained by PSO, and each computation task was allocated to the corresponding MEC server through centralized control. Finally, through simulation experiments, the delay data of the local offloading strategy, the MEC baseline offloading strategy, the Artificial Fish Swarm Algorithm (AFSA) based offloading strategy and PSAO were compared and analyzed. The average total delay of PSAO was much lower than those of the other three offloading strategies, and PSAO reduced the total cost of the original system by 20%. Experimental results show that the proposed strategy can effectively reduce the delay in MEC and balance the loads of MEC servers.

Fast ambiguity resolution method based on lattice theory

WANG Shouhua, WU Lirong, JI Yuanfa, SUN Xiyan

2020, 40(8): 2299-2304. DOI: 10.11772/j.issn.1001-9081.2019122126

In order to take into account the compatibility and interoperability of the future Global Navigation Satellite System (GNSS), and to solve the problem of low resolution efficiency for multi-frequency, multi-mode and high-dimensional ambiguity in conventional methods, a Closest Lattice Point (CLP) search algorithm based on lattice theory was proposed to search for the integer ambiguity values. First, the ambiguity search was transformed into the closest lattice point search problem of the known lattice points in the lattice. Then, through lattice basis reduction, lattice basis vectors that were as short as possible and mutually orthogonal were obtained. Finally, the CLP search algorithm was used to search for the optimal ambiguity parameters. The results of simulation and testing data verify that the proposed CLP search algorithm is theoretically more efficient and reliable in resolving ambiguity parameters than the classical Least squares AMBiguity Decorrelation Adjustment (LAMBDA) and Modified LAMBDA (MLAMBDA) algorithms, and that its search time per parameter is stable at 0.01 seconds, which means that the CLP search algorithm remains stable and reliable even in the high-dimensional case.

Yin-Yang-pair optimization algorithm based on chaos search and intricate operator

XU Qiuyan, MA Liang, LIU Yong

2020, 40(8): 2305-2312. DOI: 10.11772/j.issn.1001-9081.2020010089

To solve the premature convergence problem of the basic Yin-Yang-Pair Optimization (YYPO) algorithm, chaos search was introduced to the algorithm to explore more areas based on the ergodicity of chaos, so as to improve the global exploration capability. Besides, based on the intricate operator of I Ching, opposition-based learning was adopted to search for the opposite solutions to the current ones in order to improve the local exploitation ability. Parallel programming was also incorporated into the algorithm to make full use of computing resources such as multi-core processors. Benchmark functions were used for numerical experiments to test the performance of the improved YYPO algorithm combined with chaos search and intricate operator, namely CSIOYYPO. Experimental results show that, compared with YYPO algorithms including the basic YYPO algorithm and the adaptive YYPO algorithm as well as other intelligent optimization algorithms, the CSIOYYPO algorithm has higher calculation accuracy and faster convergence.
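As a rough illustration of the opposition-based learning step, the sketch below uses the standard definition of an opposite point, x' = lower + upper - x in each dimension. How the paper couples this with the intricate operator is not described in the abstract, so the function names and the "keep the better of the two" rule are assumptions.

```python
def opposite_solution(x, lower, upper):
    """Return the opposite point of x inside the box [lower, upper] (standard OBL)."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def better_of_pair(x, lower, upper, objective):
    """Evaluate x and its opposite; keep whichever minimizes the objective."""
    x_opp = opposite_solution(x, lower, upper)
    return x if objective(x) <= objective(x_opp) else x_opp


def sphere(v):
    """Simple benchmark objective: sum of squares, minimum at the origin."""
    return sum(c * c for c in v)


candidate = [4.0, -3.0]
kept = better_of_pair(candidate, [-5.0, -5.0], [1.0, 1.0], sphere)
```

Here the opposite of `[4.0, -3.0]` in the box `[-5, 1]` per dimension is `[-8.0, -1.0]`, so the original candidate is kept because its objective value is lower.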

Improved community evolution relationship analysis method for dynamic graphs

LUO Xiangyu, LI Jianan, LUO Xiaoxia, WANG Jia

2020, 40(8): 2313-2318. DOI: 10.11772/j.issn.1001-9081.2020010072

The community evolution relationships extracted by the traditional adjacent time slice analysis cannot fully describe the entire community evolution process in dynamic graphs. Therefore, an improved community evolution relationship analysis method was proposed. First, the community events were defined, and the evolution states of the community were described according to the occurred community events. Then, the event matching was performed on two communities within different time slices to obtain community evolution relationships. Results of comparison with the traditional methods show that the total number of community events detected by the proposed method is more than twice that revealed by the traditional method, which proves that the proposed method can provide more useful information for describing the evolution process of communities in dynamic graphs.

Nonlinear systems identification based on structural adaptive filtering method

FENG Zikai, CHEN Lijia, LIU Mingguo, YUAN Meng’en

2020, 40(8): 2319-2326. DOI: 10.11772/j.issn.1001-9081.2019111996

In order to solve the problems of high identification limitation and low identification rate in nonlinear system identification with fixed structure and parameters, a Subsystem-based Structural Adaptive Filtering (SSAF) method for nonlinear system identification was proposed by introducing structural adaptation into the optimization of identification. In this method, multiple subsystems with a linear-nonlinear hybrid structure were cascaded to form the model. The linear part is a first-order or second-order Infinite Impulse Response (IIR) digital filter with uncertain parameters, and the nonlinear part is a static nonlinear function. In the initial stage, the parameters of the subsystems were randomly generated, the generated subsystems were connected randomly according to the set connection rules, and the effectiveness of the nonlinear system was guaranteed by a connection mechanism with no feedback branches. An Adaptive Multiple-Elites-guided Composite Differential Evolution with a shift mechanism (AMECoDEs) algorithm was used for loop optimization of the adaptive model until the optimal structure and parameters, that is, the global optimum, were found. The simulation results show that AMECoDEs performs well on nonlinear test functions and real data sets, with high identification rate and good convergence rate. Compared with the Focused Time Lagged Recurrent Neural Network (FTLRNN), the number of parameters used in SSAF is reduced to 1/10, and the accuracy of fitness is improved by 7%, which proves the effectiveness of the proposed method.

Classification model for class imbalanced traffic data

LIU Dan, YAO Lishuang, WANG Yunfeng, PEI Zuofei

2020, 40(8): 2327-2333. DOI: 10.11772/j.issn.1001-9081.2019122241

In network traffic classification, traditional models classify minority classes poorly and cannot be updated frequently and in a timely manner. In order to solve these problems, a network Traffic Classification Model based on Ensemble Learning (ELTCM) was proposed. First, in order to reduce the impact of the class imbalance problem, feature metrics biased towards minority classes were defined according to the class distribution information, and the weighted symmetric uncertainty and Approximate Markov Blanket (AMB) were used to reduce the dimensionality of network traffic features. Then, early concept drift detection was introduced to enhance the model's ability to cope with the changes in traffic features as the network changed. At the same time, incremental learning was used to improve the flexibility of model update training. Experimental results on real traffic datasets show that compared with the Internet Traffic Classification based on C4.5 Decision Tree (DTITC) and the Classification Model for Concept Drift Detection based on ErrorRate (ERCDD), the proposed ELTCM has the average overall accuracy increased by 1.13% and 0.26% respectively, and its classification performance on minority classes is higher than that of both models. ELTCM has high generalization ability, and can effectively improve the classification performance of minority classes without sacrificing the overall classification accuracy.
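For orientation, the sketch below computes the plain (unweighted) symmetric uncertainty between a discrete feature and the class label, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The class-distribution weighting that the paper adds is not specified in the abstract and is not reproduced here; the function names are illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def symmetric_uncertainty(feature, labels):
    """Unweighted symmetric uncertainty in [0, 1]; 0 means independent, 1 means fully dependent."""
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to measure
    h_xy = entropy(list(zip(feature, labels)))
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


su_dependent = symmetric_uncertainty([0, 0, 1, 1], ["a", "a", "b", "b"])
su_independent = symmetric_uncertainty([0, 0, 1, 1], ["a", "b", "a", "b"])
```

Features with a symmetric uncertainty close to zero with respect to the class label are natural candidates for removal during dimensionality reduction.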

Radio frequency identification anti-collision algorithm based on Logistic mapping

LIU Yan, ZHANG Yu

2020, 40(8): 2334-2339. DOI: 10.11772/j.issn.1001-9081.2019122121

Concerning the low tag recognition throughput caused by frame length limitation in the Dynamic Frame Slot Aloha (DFSA) algorithm, a Logistic mapping based DFSA (Logistic-DFSA) algorithm was proposed. First, the sequence generated by logistic mapping was used as the spreading code, and the spread spectrum technology was combined with the DFSA algorithm to realize the parallel recognition of multiple tags within one slot. Second, the influence of frame length, spreading code length and the number of tags on throughput in the recognition process was analyzed, and the optimal frame length and spreading code length were obtained. Finally, based on the number of remaining tags after a frame, a repeating frame algorithm with all tags recognizable was proposed. Simulation results show that compared with the DFSA algorithm, the Logistic-DFSA algorithm reduces the total number of slots for tag recognition by 98.3% and increases the system throughput by 162%. Therefore, the Logistic-DFSA algorithm can greatly reduce the total number of slots, improve the system throughput, and effectively identify tags within the range of the reader.
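A minimal sketch of how a logistic map can yield a bipolar spreading code is given below. The map is x(n+1) = mu * x(n) * (1 - x(n)); the choice mu = 4, the 0.5 threshold used to binarize the sequence, and the seed values are assumptions for illustration, not parameters reported in the paper.

```python
def logistic_spreading_code(x0, length, mu=4.0):
    """Generate a +1/-1 chip sequence from the logistic map seeded with x0 in (0, 1)."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    chips = []
    x = x0
    for _ in range(length):
        x = mu * x * (1.0 - x)                # one logistic-map iteration
        chips.append(1 if x > 0.5 else -1)    # threshold the chaotic value into a chip
    return chips


def correlate(code_a, code_b):
    """Normalized correlation between two equal-length chip sequences."""
    return sum(a * b for a, b in zip(code_a, code_b)) / len(code_a)


code_tag1 = logistic_spreading_code(0.3, 64)
code_tag2 = logistic_spreading_code(0.31, 64)
```

A receiver despreads by correlating the received chips with each tag's code; a sequence correlated with itself gives exactly 1.0, while different seeds are expected to give values closer to zero.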

Location nearest neighbor query method for social network based on differential privacy

JIN Bo, ZHANG Zhiyong, ZHAO Ting

2020, 40(8): 2340-2344. DOI: 10.11772/j.issn.1001-9081.2019122220

Concerning the problem of personal location privacy leakage when querying the nearest neighbor location in social networks, a geo-indistinguishability mechanism was used to add random noise to the location data, and a privacy budget allocation method was proposed. First, the spatial regions were divided into grids, and personalized privacy budget allocation was performed according to the user's location hits in different regions. Then, in order to solve the problem of the low hit rate of neighbor queries on the perturbed location dataset, a Combined Incremental Neighbor Query (CINQ) algorithm was proposed to expand the search range of the demand space, and the combination query was used to filter out redundant data. Simulation results show that compared with the SpaceTwist algorithm, the CINQ algorithm has the query hit rate increased by 13.7 percentage points. Experimental results verify that the CINQ algorithm effectively solves the problem of low query hit rate caused by the location perturbation of the query target, and that it is suitable for neighbor queries on perturbed locations in social network applications.
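Geo-indistinguishability is commonly realized with planar Laplace noise, and the sketch below assumes that mechanism: the angle is uniform and the radius follows a Gamma distribution with shape 2 and scale 1/epsilon. Coordinates are treated as planar values in arbitrary distance units, and the per-grid budget allocation from the paper is not modeled; `epsilon` here is simply a caller-supplied budget.

```python
import math
import random


def planar_laplace_noise(epsilon, rng=random):
    """Draw a 2D noise vector from the planar Laplace distribution with budget epsilon."""
    theta = rng.uniform(0.0, 2.0 * math.pi)      # direction is uniform on the circle
    radius = rng.gammavariate(2.0, 1.0 / epsilon)  # radial density: eps^2 * r * exp(-eps * r)
    return radius * math.cos(theta), radius * math.sin(theta)


def perturb_location(x, y, epsilon, rng=random):
    """Return the reported (noisy) location for the true planar location (x, y)."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return x + dx, y + dy


rng = random.Random(0)
reported = perturb_location(100.0, 200.0, epsilon=0.5, rng=rng)
```

A smaller `epsilon` gives stronger privacy and larger displacement; the expected displacement is 2 / epsilon, which is what degrades the hit rate of neighbor queries on perturbed data.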

Image super-resolution reconstruction based on spherical moment matching and feature discrimination

LIN Jing, HUANG Yuqing, LI Leimin

2020, 40(8): 2345-2350. DOI: 10.11772/j.issn.1001-9081.2019122142

Due to the instability of network training, image super-resolution reconstruction based on Generative Adversarial Network (GAN) suffers from mode collapse. To solve this problem, a Spherical double Discriminator Super-Resolution Generative Adversarial Network (SDSRGAN) based on spherical geometric moment matching and feature discrimination was proposed, and the stability of network training was improved by adopting geometric moment matching and discrimination of high-frequency features. First of all, the generator was used to produce a reconstructed image through feature extraction and upsampling. Second, the spherical discriminator was used to map image features to high-dimensional spherical space, so as to make full use of higher-order statistics of feature data. Third, a feature discriminator was added to the traditional discriminator to extract high-frequency features of the image, so as to reconstruct both the characteristic high-frequency component and the structural component. Finally, game training between the generator and the double discriminator was carried out to improve the quality of the image reconstructed by the generator. Experimental results show that the proposed algorithm converges effectively, its network can be trained stably, and it achieves a Peak Signal-to-Noise Ratio (PSNR) of 31.28 dB and a Structural SIMilarity (SSIM) of 0.872. Compared with Bicubic, Super-Resolution Residual Network (SRResNet), Fast Super-Resolution Convolutional Neural Network (FSRCNN), Super-Resolution using a Generative Adversarial Network (SRGAN), and Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) algorithms, the reconstructed image of the proposed algorithm has more precise structural texture characteristics. The proposed algorithm provides a double discrimination method combining spherical moment matching and feature discrimination for the research of image super-resolution based on GAN, which is feasible and effective in practical applications.
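For readers unfamiliar with the PSNR figure quoted above, the following sketch computes it from the usual definition, PSNR = 10 log10(MAX^2 / MSE). Images are represented as flat lists of pixel values and `max_value` defaults to 255 for 8-bit data; both are simplifying assumptions, and the paper's exact evaluation protocol (color space, cropping) is not stated in the abstract.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


value_db = psnr([10, 10], [11, 9], max_value=10)
```

With a mean squared error of 1 and a peak value of 10, the example evaluates to 20 dB.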

Application of deep learning to 3D model reconstruction of single image

ZHANG Hao, ZHANG Qiang, SHAO Siyu, DING Haibin

2020, 40(8): 2351-2357. DOI: 10.11772/j.issn.1001-9081.2020010070

To solve the problem that the reconstructed 3D model of a single image has high uncertainty, a network model based on depth image estimation, spherical projection mapping and 3D generative adversarial network was proposed. Firstly, the depth image of the input image was obtained by the depth estimator, which was helpful for the further analysis of the image. Secondly, the obtained depth image was converted into a 3D model by spherical projection mapping. Finally, a 3D generative adversarial network was utilized to judge the authenticity of the reconstructed 3D model, so as to obtain a 3D model closer to reality. In comparison experiments with the LVP (Learning View Priors for 3D reconstruction) algorithm, the proposed model has the Intersection-over-Union (IoU) increased by 20.1% and the Chamfer Distance (CD) decreased by 13.2%. Theoretical analysis and simulation results show that the proposed model has good generalization ability in the 3D model reconstruction of a single image.

Magnetic resonance image reconstruction algorithm via non-convex total variation regularization

SHEN Marui, LI Jincheng, ZHANG Ya, ZOU Jian

2020, 40(8): 2358-2364. DOI: 10.11772/j.issn.1001-9081.2019122187

To solve the problems of incomplete reconstruction, blurred boundary and residual noise in Magnetic Resonance (MR) image reconstruction, a non-convex total variation regularization reconstruction model based on L₂ regularization was proposed. First, the Moreau envelope and the minmax-concave penalty function were used to construct the non-convex regularization of the L₂ norm, which was then applied to the total variation regularization to construct a sparse reconstruction model based on isotropic non-convex total variation regularization. The proposed non-convex regularization was able to effectively avoid the underestimation of larger non-zero elements in convex regularization, so as to reconstruct the edge contour of the target more effectively. At the same time, it was able to guarantee the global convexity of the objective function under certain conditions. Therefore, the Alternating Direction Method of Multipliers (ADMM) could be used to solve the model. Simulation experiments were carried out to reconstruct several MR images under different sampling templates and sampling rates. Experimental results show that compared with several typical image reconstruction methods, the proposed model has better performance and lower relative error, and its Peak Signal-to-Noise Ratio (PSNR) is 4 dB higher than that of the traditional reconstruction method based on the non-convex regularization of the L₁ norm; in addition, the visual quality of the reconstructed images is improved significantly, and the edge details of the original images are effectively preserved.

Multi-exposure image fusion based on local features of scene

LI Weizhong

2020, 40(8): 2365-2371. DOI: 10.11772/j.issn.1001-9081.2019122077

Focusing on the problems of low image quality and low efficiency of existing multi-exposure image fusion algorithms, a multi-exposure image fusion algorithm based on local features of scene was proposed. Firstly, the image sequence with different exposures was divided into regular patches, with neighbouring patches overlapping by some pixels. For static scenes, the weight for each patch was calculated based on local variance, local visibility and local saliency; for dynamic scenes, in addition to the three features described above, local similarity was also used to remove ghosting caused by moving objects. Then, the optimal patches were obtained based on the weighted sum method. Finally, the output patches were fused together and the pixels in overlapping regions were averaged to obtain the final fusion result. With 12 sets of exposure sequences of different natural scenes, the proposed algorithm was compared with 7 existing pixel-based and feature-based algorithms in subjective and objective aspects. Experimental results demonstrate that the proposed algorithm preserves more details and obtains good visual effects in both static scenes and dynamic scenes. At the same time, the proposed algorithm also maintains high computational efficiency.

Design and implementation of adaptive compensation enhancement for low-light video images

YANG Jiayi, CHEN Yong

2020, 40(8): 2372-2377. DOI: 10.11772/j.issn.1001-9081.2020010046

It is difficult to identify video images with low contrast in a low-light environment, so an adaptive contrast compensation enhancement algorithm was proposed. First, the average gray level was extracted from the feature parameters of video images in the low-light environment; then a mathematical model of human visual contrast resolution compensation based on the grayscale difference of the original images was established, and the three primary colors of the true-color image were compensated separately by proportional integration. Second, a compensation threshold was set so that photopic vision was compensated linearly up to full bandwidth when the compensation degree was lower than the just noticeable difference of photopic vision. Finally, an automatic optimization model of the compensation proportion coefficient was established based on subjective image quality evaluation and image feature parameters, and was embedded in a DirectShow video processing system for adaptive video image enhancement. Experimental results show that the video enhancement system has good real-time performance, can effectively mine scotopic vision information, and can be widely applied in different scenes.

Single image shadow removal method based on multistage generative adversarial network

ZHANG Shuping, WU Wen, WAN Yi

2020, 40(8): 2378-2385. DOI: 10.11772/j.issn.1001-9081.2019122146

Traditional deep learning shadow removal methods often change the pixels in non-shadow areas and cannot obtain results with smooth boundary transition. In order to solve these problems, a new multistage shadow removal framework based on Generative Adversarial Network (GAN) was proposed. Firstly, the shadow mask and shadow matte of the input image were generated by a multitask-driven generator via a shadow detection subnet and a shadow matte generation subnet respectively. Secondly, under the guidance of the shadow mask and shadow matte, an umbra module and a penumbra module were designed respectively to remove different types of shadows successively. Thirdly, a new composite loss function dominated by least squares loss was created to obtain a better result. Compared with state-of-the-art shadow removal methods based on deep learning, the proposed method has the Balanced Error Rate (BER) reduced by 4.39% on average, the Structural SIMilarity index (SSIM) improved by 0.44% on average, and the Root Mean Square Error (RMSE) reduced by 13.32% on average. Experimental results show that the boundary transition of the shadow removal result of the proposed method is smoother.

Small-array speech enhancement based on noise cancellation and beamforming

LONG Chao, ZENG Qingning, LUO Ying

2020, 40(8): 2386-2391. DOI: 10.11772/j.issn.1001-9081.2019122106

In order to improve the speech enhancement effect of small microphone arrays, a small-array speech enhancement method was proposed by combining the Array Crosstalk Resistant Adaptive Noise Cancellation (ACRANC) method with the BeamForming (BF) method. Firstly, ACRANC subsystems were constructed to obtain multiple channels of enhanced speech signals. Then, the proposed Adaptive Mode Control (AMC) algorithm and the Delay And Sum (DAS) beamforming method were applied to the enhanced speech signals to further improve the enhancement effect of multi-channel speech signals. The computational complexity of the proposed method was estimated, and it was verified that the proposed method can be realized in real time with common chips. Experimental results in actual environments show that the speech enhancement effect of the proposed method is better than that of the ACRANC method alone.
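The DAS stage can be pictured with the short sketch below, which assumes integer sample delays that are already known for each channel. The ACRANC and AMC stages, as well as delay estimation, are outside its scope, and the function name and argument layout are illustrative.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average the aligned signals.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[i] is how many samples channel i lags the reference.
    """
    length = len(channels[0])
    output = [0.0] * length
    for signal, delay in zip(channels, delays):
        for t in range(length):
            source = t + delay            # advance the lagging channel
            if 0 <= source < length:      # samples shifted past the edge are dropped
                output[t] += signal[source]
    return [value / len(channels) for value in output]


mic_reference = [0.0, 1.0, 0.0, 0.0, 0.0]
mic_delayed = [0.0, 0.0, 1.0, 0.0, 0.0]   # same pulse, arriving one sample later
beamformed = delay_and_sum([mic_reference, mic_delayed], [0, 1])
```

After alignment the pulse adds coherently across channels, whereas uncorrelated noise would average down, which is why the beamformer is applied on top of the per-channel noise cancellation.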

Rectal tumor segmentation method based on improved U-Net model

GAO Haijun, ZENG Xiangyin, PAN Dazhi, ZHENG Bochuan

2020, 40(8): 2392-2397. DOI: 10.11772/j.issn.1001-9081.2020030318

In the diagnosis of rectal cancer, if the rectal tumor area can be automatically and accurately segmented from Computed Tomography (CT) images, it will help doctors make a more accurate and rapid diagnosis. Aiming at the problem of rectal tumor segmentation, an automatic segmentation method of rectal tumor based on an improved U-Net model was proposed. Firstly, sub-coding modules were embedded in different levels of the U-Net encoder to improve the feature extraction ability of the model. Secondly, by comparing the optimization performances of different optimizers, the most suitable optimizer was determined to train the model. Finally, data augmentation was performed on the training set to make the model more fully trained, so as to improve the segmentation performance. Experimental results show that compared with the U-Net, Y-Net and FocusNetAlpha network models, the segmentation region obtained by the improved model is closer to the real tumor region, and the segmentation performance of this model for small objects is more prominent; at the same time, the proposed model is superior to the other three models on three evaluation indexes (precision, recall and Dice coefficient), and can effectively segment the rectal tumor area.
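The three evaluation indexes named above have standard definitions for binary masks, sketched below with masks flattened to lists of 0 and 1. Returning 0.0 when a denominator is zero is a convention chosen here for the sketch, not something stated in the paper.

```python
def segmentation_metrics(predicted, truth):
    """Return (precision, recall, dice) for two equal-length binary masks."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Dice = 2|A ∩ B| / (|A| + |B|), expressed with the confusion counts.
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, dice


precision, recall, dice = segmentation_metrics([1, 1, 0, 0], [1, 0, 1, 0])
```

In this toy case one pixel is correctly segmented, one is a false positive and one is missed, so all three values equal 0.5.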

Overview of relief distribution optimization models based on mathematical programming and their solving algorithms

CAO Cejun, GAO Xuehong

2020, 40(8): 2398-2409. DOI: 10.11772/j.issn.1001-9081.2020010102

To improve the utilization of relief, reduce various losses and alleviate the suffering of survivors, how to use mathematical programming methods to optimize relief distribution strategy is a critical and urgent issue. The current status of mathematical programming models for relief distribution was analyzed according to the criteria of objective quantity and intergovernmental relationships. An overview of the algorithms for solving the relief distribution optimization models was conducted. The future directions of the relief distribution optimization problem were summarized. Research indicates that an optimization support framework for relief distribution needs to be established, and that the following developments are necessary: extending the relief distribution optimization model from single-objective programming to multiple-objective programming; changing the research viewpoint from horizontal intergovernmental relationships to vertical ones; moving from relief distribution optimization under certain conditions to optimization under uncertain conditions; moving from the traditional relief distribution problem to one that incorporates the idea of sustainable development; designing solutions with heuristic algorithms rather than exact ones; and applying new Information and Communications Technology (ICT) such as big data, digital twin and blockchain to develop relief distribution models.

Modeling and solving of high-dimensional multi-objective adaptive allocation for emergency relief supplies

YAN Huajian, ZHANG Guofu, SU Zhaopin, LIU Yang

2020, 40(8): 2410-2419. DOI: 10.11772/j.issn.1001-9081.2020010045

To seek a good balance between efficiency and fairness in emergency relief supply allocation, a high-dimensional multi-objective adaptive allocation algorithm based on two-dimensional integer encoding was developed. First of all, a high-dimensional multi-objective optimization model was constructed with the consideration of total emergency response time, panic degree of the victims, unsatisfactory degree of relief supplies, fairness of supply allocation, loss of the victims, and total cost of emergency response. Then, two-dimensional integer encoding and Adaptive Individual Repair (AIR) were adopted to resolve potential emergency resource conflicts. Finally, the shift-based density estimation and Strength Pareto Evolutionary Algorithm 2 (SPEA2) were introduced to design a high-dimensional multi-objective allocation algorithm for disaster relief supplies. Simulation results show that compared with Encoding Repair and Non-dominated Sorting based Differential Evolution algorithm (ERNS-DE) and Greedy-Search-based Multi-Objective Genetic Algorithm (GSMOGA), the proposed algorithm had coverage values increased by 34.87%, 100% and 23.59%, 100% in two emergency environments, respectively. Moreover, the hypervolume values of the proposed algorithm were much higher than those of the two comparison algorithms. Experimental results verify that the proposed model and algorithm allow decision makers to select emergency schemes according to actual emergency needs, and have better flexibility and efficiency.

CliqueNet flight delay prediction model based on clique random connection

QU Jingyi, CAO Lei, CHEN Min, DONG Liang, CAO Yexiu

2020, 40(8): 2420-2427. DOI: 10.11772/j.issn.1001-9081.2019112061

Aiming at the current high delay rate of the civil aviation transportation industry, and the fact that the high-precision delay prediction problem can hardly be solved by traditional algorithms, a randomly connected Clique Network (CliqueNet) based flight delay prediction model was proposed. Firstly, the flight data and related weather data were fused by the model. Then, the improved network model was used to extract features from the fused dataset. Finally, the softmax classifier was used to predict the flight departure delay of all levels with high precision. The main features of the model include random connection of clique feature layers and the introduction of a Channel-wise and Spatial Attention Residual (CSAR) block to the transition layer. The former transmits the feature information through more effective connections, and the latter double-calibrates the feature information on the channel and spatial dimensions to improve accuracy. Experimental results show that the prediction accuracy on the fused data is improved by 0.5% and 1.3% respectively with the introduction of random connection and the CSAR block, and the final accuracy of the new model reaches 93.40%.

Deep learning-based on-road obstacle detection method

PENG Yuhui, ZHENG Weihong, ZHANG Jianfeng

2020, 40(8): 2428-2433. DOI: 10.11772/j.issn.1001-9081.2019122227

Concerning the problems of 3D point cloud data processing and on-road obstacle detection based on Light Detection And Ranging (LiDAR), a deep learning-based on-road obstacle detection method was proposed. First, the statistical filtering algorithm was applied to eliminate the outliers from the original point cloud, improving the roughness of point clouds. Then, an end-to-end deep neural network named VNMax was proposed, in which max pooling was used to optimize the structure of the Region Proposal Network (RPN) and an improved target detection layer was built. Finally, training and testing experiments were performed on the KITTI dataset. The results show that filtering effectively reduces the average distance between the points in the point cloud. For car localization on the easy, moderate and hard detection tasks of the KITTI dataset, the average precisions of the proposed method are improved by 11.30 percentage points, 6.02 percentage points and 3.89 percentage points, respectively, compared with those of VoxelNet. Experimental results show that the statistical filtering algorithm is still an effective 3D point cloud data processing method, and the max pooling module can improve the learning performance and object localization ability of the deep neural network.
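One common form of statistical point cloud filtering is sketched below: a point is dropped when its mean distance to its k nearest neighbors exceeds the global mean by more than a chosen multiple of the standard deviation. The abstract does not give the paper's exact rule or parameters, so `k`, `std_ratio` and the brute-force neighbor search are assumptions, and the cloud must contain more than `k` points.

```python
import math
import statistics


def statistical_filter(points, k=3, std_ratio=1.0):
    """Remove points whose mean k-nearest-neighbor distance is unusually large."""
    mean_distances = []
    for i, p in enumerate(points):
        # Brute-force neighbor search; adequate for a sketch, too slow for real LiDAR scans.
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_distances.append(sum(distances[:k]) / k)
    threshold = statistics.fmean(mean_distances) + std_ratio * statistics.pstdev(mean_distances)
    return [p for p, d in zip(points, mean_distances) if d <= threshold]


cube_corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
cloud = cube_corners + [(100, 100, 100)]   # one far-away outlier
filtered = statistical_filter(cloud)
```

The eight cube corners each have three neighbors at distance 1 and are kept, while the isolated point is removed.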

Robotic grasping system based on improved single shot multibox detector algorithm

HAN Xin, YU Yongwei, DU Liuqing

2020, 40(8): 2434-2440. DOI: 10.11772/j.issn.1001-9081.2019122234

Concerning the problem that automobile part recycling factories cannot achieve accurate grasping, which affects production efficiency, due to poor part detection under actual complex working conditions, a robotic grasping system based on an improved Single Shot multibox Detector (SSD) algorithm was proposed to realize the detection, classification, location and grasping of the target parts. First, the target parts were detected by the improved SSD model, obtaining the part location and class information. Second, through Kinect camera calibration and hand-eye calibration, the pixel coordinate system was transformed into the robot world coordinate system to locate the parts in the robot spatial coordinate system. Third, the target part grasping task was completed by robot forward and inverse kinematic modeling and trajectory planning. Finally, validation experiments of the whole integrated system on part detection, classification, location and grasping were carried out. Experimental results show that under complex working conditions, the average part grasping success rate of the proposed system reaches 95%, which meets the actual production demand of part grasping.

Networked cane system for blind people based on K-nearest neighbor and dynamic time warping algorithms

XIA Lunteng, ZHANG Li

2020, 40(8): 2441-2448. DOI: 10.11772/j.issn.1001-9081.2020010122

To address the safety and monitoring problems of blind people during travel, a networked cane system for blind people based on machine learning algorithms was designed. The system integrates multiple functions, including obstacle avoidance, positioning, alarm and communication. First, infrared obstacle avoidance and ultrasonic ranging obstacle avoidance were designed as the basic functions of the system; they detect road conditions and obstacles during the daily travel of the blind and provide real-time voice and motor vibration reminders. Second, a remote help function was added, which can send help text messages and place phone calls to specified mobile numbers. In addition, a Global Positioning System (GPS) function, an accelerometer-gyroscope attitude angle calculation function, and an abnormal attitude alarm function based on the K-Nearest Neighbor (KNN) and Dynamic Time Warping (DTW) algorithms were added, and the resulting data are transferred to a cloud server for storage. Finally, a WeChat mini program was used in place of a native app as the monitoring interface, providing functions such as one-click alarm, weather query and safety information for the blind. Test results show that the proposed system achieves an attitude recognition success rate of 86% and improves accuracy by nearly 31% compared with the attitude angle system. The networked cane system can greatly improve the safety of the blind during travel, allowing them to ask for help in time when an accident occurs, and enables both safety monitoring and position monitoring of the blind user's posture.
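
How KNN and DTW can be combined for attitude recognition is sketched below, assuming each attitude recording is a one-dimensional sequence of sensor readings and that labelled template sequences are available. The function names, the absolute-difference local cost and the majority vote are illustrative assumptions; the abstract does not give these details.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_classify(query, templates, k=3):
    """Label a sequence by majority vote among its k DTW-nearest templates.

    templates: list of (sequence, label) pairs.
    """
    ranked = sorted(templates, key=lambda item: dtw_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

DTW lets sequences of different lengths or speeds be compared, which is why it pairs naturally with a nearest-neighbour vote over recorded motion templates.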

Soft fault detection for flapping wing micro aerial vehicle based on multistep neural network observer

WANG Sipeng, DU Changping, YE Zhixian, SONG Guanghua, ZHENG Yao

2020, 40(8): 2449-2454. DOI: 10.11772/j.issn.1001-9081.2020010107

Because the small initial variation amplitude of a soft fault leads to low detection efficiency in fault detection algorithms based on the traditional neural network observer, a soft fault detection algorithm for the Flapping Wing Micro Aerial Vehicle (FWMAV) based on a multistep neural network observer and an adaptive threshold was proposed. Firstly, a multistep prediction observer model was constructed, whose time delay prevents the observer from being contaminated by faulty data. Secondly, the window width of the multistep observer was tested and analyzed using actual FWMAV flight data. Thirdly, an adaptive threshold strategy was proposed to perform fault detection on the observer residuals with the assistance of the residual chi-square detection algorithm. Finally, the proposed algorithm was verified and analyzed using actual FWMAV flight data. Experimental results show that, compared with the fault detection algorithm based on the traditional neural network observer, the proposed algorithm increases the soft fault detection speed by 737.5% and the soft fault detection accuracy by 96.1%. The proposed algorithm can therefore effectively improve the speed and accuracy of soft fault detection for FWMAV.
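
The abstract does not specify the adaptive threshold rule. The sketch below is only one plausible reading: a scalar chi-square style statistic is computed from each observer residual and compared with a threshold that adapts to the mean and spread of recent statistics. The function name and the parameters `sigma`, `window`, `k` and `floor` are all hypothetical.

```python
import math
from collections import deque


def first_fault_index(residuals, sigma=1.0, window=5, k=3.0, floor=1.0):
    """Return the index of the first residual flagged as a fault, or None.

    Illustrative rule only: the threshold is mean + k * std of the last
    `window` statistics, plus a fixed floor to avoid triggering on noise.
    """
    history = deque(maxlen=window)
    for i, r in enumerate(residuals):
        stat = (r / sigma) ** 2  # scalar chi-square style test statistic
        if len(history) == window:
            mean = sum(history) / window
            std = math.sqrt(sum((h - mean) ** 2 for h in history) / window)
            if stat > mean + k * std + floor:
                return i
        history.append(stat)
    return None
```

Because the threshold tracks recent residual behaviour rather than a fixed constant, a slowly growing fault is judged against the current noise level, which is the general motivation for adaptive thresholds.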

Brain network feature identification algorithm for Alzheimer's patients based on MRI image

ZHU Lin, YU Haitao, LEI Xinyu, LIU Jing, WANG Ruofan

2020, 40(8): 2455-2459. DOI: 10.11772/j.issn.1001-9081.2019122105

To address the subjectivity and high misdiagnosis risk of manually identifying Alzheimer's Disease (AD) from brain imaging, a method for automatic identification of AD by constructing a brain network from Magnetic Resonance Imaging (MRI) images was proposed. Firstly, MRI images were superimposed and divided into structural blocks, and the Structural SIMilarity (SSIM) between every pair of structural blocks was calculated to construct the network. Then, complex network theory was used to extract structural parameters, which served as the input of a machine learning algorithm to identify AD automatically. The analysis found that classification was best with two parameters as input, in particular node betweenness and edge betweenness. Further study found that classification was best when the MRI image was divided into 27 structural blocks, with the accuracy reaching 91.04% for the weighted network and 94.51% for the unweighted network. The experimental results show that a complex network built from the structural similarity of MRI blocks can identify AD with high accuracy.
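
The network construction step can be sketched as follows, assuming each structural block has been flattened into a list of intensities of equal length. The SSIM here is the single-window (global) form of the standard formula rather than a sliding-window implementation, and the optional `threshold` used to obtain an unweighted network is an assumption, since the abstract does not say how edges were binarized.

```python
def ssim(x, y, data_range=255.0):
    """Global SSIM between two equally sized blocks given as flat lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # usual stabilizing constants
    c2 = (0.03 * data_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )


def build_network(blocks, threshold=None):
    """Return a symmetric matrix of pairwise SSIM values between blocks.

    With a threshold, edges are binarized to give an unweighted network.
    """
    n = len(blocks)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = ssim(blocks[i], blocks[j])
            if threshold is not None:
                w = 1.0 if w >= threshold else 0.0
            weights[i][j] = weights[j][i] = w
    return weights
```

Graph parameters such as node betweenness and edge betweenness would then be computed on this matrix and passed to the classifier.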

Staging and lesion detection of diabetic retinopathy based on deep convolution neural network

XIE Yunxia, HUANG Haiyu, HU Jianbin

2020, 40(8): 2460-2464. DOI: 10.11772/j.issn.1001-9081.2019122198

In Diabetic Retinopathy (DR) images, the resolution is very high, the lesion features are scattered and hard to capture, and positive, negative, hard and easy samples are imbalanced, all of which limit DR staging accuracy. Therefore, a DR staging method combining an improved Faster Region-based Convolutional Neural Network (Faster R-CNN) with subgraph segmentation was proposed. First, subgraph segmentation was used to remove the interference of the optic disc region with lesion recognition. Second, a deep residual network was used for feature extraction to address the difficulty of capturing features when lesions occupy only a small proportion of the high-resolution fundus image. Finally, the Online Hard Example Mining (OHEM) method was used to address the imbalance between positive, negative, hard and easy samples during the generation of Regions of Interest (ROI). In DR staging experiments on EyePACS, an internationally open dataset, the accuracy of the proposed method reached 94.83% in stage 0, 86.84% in stage 1, 94.00% in stage 2, 87.21% in stage 3 and 82.96% in stage 4. Experimental results show that the improved Faster R-CNN can efficiently stage DR images and automatically label the lesions.
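
The core idea of OHEM, keeping only the highest-loss ROIs for the backward pass, can be shown with a minimal sketch. The function name and the plain list of per-ROI losses are illustrative assumptions; a detector would apply this to the losses produced by its ROI head.

```python
def select_hard_examples(losses, keep):
    """Return the indices of the `keep` ROIs with the largest loss, in ascending order."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:keep])
```

For instance, with per-ROI losses `[0.1, 2.3, 0.05, 1.7, 0.4]` and `keep=2`, only ROIs 1 and 3 would contribute gradients, so easy examples no longer dominate training.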

Handwritten Chinese character recognition based on two dimensional principal component analysis and convolutional neural network

ZHENG Yanbin, HAN Mengyun, FAN Wenxin

2020, 40(8): 2465-2471. DOI: 10.11772/j.issn.1001-9081.2020010081

With the rapid growth of computing power, the accumulation of training data and improvements in nonlinear activation functions, the Convolutional Neural Network (CNN) achieves good performance in handwritten Chinese character recognition. To address the slow speed of CNN in this task, Two Dimensional Principal Component Analysis (2DPCA) and CNN were combined to recognize handwritten Chinese characters. Firstly, 2DPCA was used to extract the projection eigenvectors of handwritten Chinese characters. Secondly, the obtained projection eigenvectors were assembled into an eigenmatrix. Thirdly, the eigenmatrix was used as the input of the CNN. Finally, the softmax function was used for classification. Compared with the model based on AlexNet, the proposed method reduces the running time by 78%; compared with the models based on ACNN and DCNN, it reduces the running time by 80% and 73%, respectively. Experimental results show that the proposed method can reduce the running time of handwritten Chinese character recognition without reducing the recognition accuracy.
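
The 2DPCA step can be sketched in a few lines, assuming images are given as nested lists of equal size. For brevity this sketch finds only the leading projection axis by power iteration; a full 2DPCA keeps several leading eigenvectors and stacks the projections into the eigenmatrix. All function names are illustrative.

```python
import math


def image_covariance(images):
    """Image covariance matrix G: mean of (A - mean)^T (A - mean), size cols x cols."""
    count, rows, cols = len(images), len(images[0]), len(images[0][0])
    mean = [[sum(img[r][c] for img in images) / count for c in range(cols)]
            for r in range(rows)]
    cov = [[0.0] * cols for _ in range(cols)]
    for img in images:
        for r in range(rows):
            row = [img[r][c] - mean[r][c] for c in range(cols)]
            for a in range(cols):
                for b in range(cols):
                    cov[a][b] += row[a] * row[b] / count
    return cov


def top_eigenvector(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    return vec


def project(image, axis):
    """Project each image row onto the axis, giving one feature column."""
    return [sum(p * w for p, w in zip(row, axis)) for row in image]
```

Unlike ordinary PCA, the image is never flattened into a vector, so the covariance matrix is only as large as the image width, which is the source of the speed advantage.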

Improved traffic sign recognition algorithm based on YOLO v3 algorithm

JIANG Jinhong, BAO Shengli, SHI Wenxu, WEI Zhenkun

2020, 40(8): 2472-2478. DOI: 10.11772/j.issn.1001-9081.2020010062

To address the large number of parameters, poor real-time performance and low accuracy of traffic sign recognition algorithms based on deep learning, an improved traffic sign recognition algorithm based on YOLO v3 was proposed. First, depthwise separable convolution was introduced into the feature extraction layer of YOLO v3, decomposing the convolution into a depthwise convolution and a pointwise convolution so that intra-channel and inter-channel convolution are separated; this greatly reduces the number of parameters and the amount of computation while maintaining high accuracy. Second, the Mean Square Error (MSE) loss was replaced by the GIoU (Generalized Intersection over Union) loss, which turns the evaluation criterion itself into a loss and thereby resolves problems of the MSE loss such as optimization inconsistency and scale sensitivity. The Focal loss was also added to the loss function to address the severe imbalance between positive and negative samples; by reducing the weight of simple background classes, it makes the new algorithm focus more on detecting foreground classes. Applying the new algorithm to traffic sign recognition on the TT100K (Tsinghua-Tencent 100K) dataset shows that its mean Average Precision (mAP) reaches 89%, which is 6.6 percentage points higher than that of YOLO v3; its number of parameters is only about 1/5 of that of the original YOLO v3, and its Frames Per Second (FPS) is 60% higher than that of YOLO v3. The proposed algorithm improves detection speed and accuracy while reducing the number of model parameters and the amount of computation.
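
The three ingredients named above can be illustrated with a short sketch: the parameter count of a depthwise separable convolution versus a standard one (biases ignored), the GIoU of two axis-aligned boxes, and the binary focal loss. Box format `(x1, y1, x2, y2)` with positive area and the defaults `alpha=0.25`, `gamma=2.0` are assumptions for illustration, not settings reported in the paper.

```python
import math


def conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution layer (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    return kernel * kernel * c_in + c_in * c_out


def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2); the GIoU loss is 1 - giou."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for a predicted probability p and a target of 0 or 1."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For a 3x3 layer with 256 input and 256 output channels, the standard convolution has 589824 weights and the separable version 67840. Unlike plain IoU, GIoU stays informative for non-overlapping boxes, because the enclosing-box term still varies with their distance.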
