Journal of Computer Applications

Decision tree improvement method for imbalanced data

WANG Wei, XIE Yaobin, YIN Qing

2019, 39(3): 623-628. DOI: 10.11772/j.issn.1001-9081.2018071513

Asbtract ( )

PDF (1053KB) ( )

References | Related Articles | Metrics

Focusing on the problem that serious imbalance between abnormal data and normal data in anomaly detection will lead to performance degradation of decision tree, three improved methods for C4.5 decision tree were proposed, which are C4.5+δ, UDE (Uniform Distribution Entropy) and IDEF (Improved Distribution Entropy Function). Firstly, it was deduced that the attribute selection criterion of C4.5 tends to choose the ones with imbalanced splitting. Secondly, why imbalanced splitting decreases the accuracy of anomaly (minority) detection was analyzed. Thirdly, the attribute selection criterion-information gain ratio of C4.5 was improved by introducing relaxation factor and uniform distribution entropy, or substituting distribution entropy function. Finally, three improved decision trees were verified on WEKA platform and NSL-KDD dataset. Experimental results show that three proposed improved methods can increase the accuracy of anomaly detection. Compared with C4.5, the accuracies of C4.5+7, UDE and IDEF on KDDTest-21 dataset are improved by 3.16, 3.02 and 3.12 percentage points respectively, which are better than the methods using Rényi entropy or Tsallis entropy as splitting criterion. Furthermore, using improved decision trees to detect anomalies in the industrial control system can not only improve the recall ratio of anomalies, but also reduce false positive rate.

NIBoost: new imbalanced dataset classification method based on cost sensitive ensemble learning

WANG Li, CHEN Hongmei, WANG Shengwu

2019, 39(3): 629-633. DOI: 10.11772/j.issn.1001-9081.2018071598

Asbtract ( )

PDF (858KB) ( )

References | Related Articles | Metrics

The problem of misclassification of minority class samples appears frequently when classifying massive amount of imbalanced data in real life with traditional classification algorithms, because most of these algorithms only suit balanced class distribution or samples with same misclassification cost. To overcome this problem, a classification algorithm for imbalanced dataset based on cost sensitive ensemble learning and oversampling-New Imbalanced Boost (NIBoost) was proposed. Firstly, the oversampling algorithm was used to add a certain number of minority samples to balance the dataset in each iteration, and the classifier was trained on the new dataset. Secondly, the classifier was used to classify the dataset to obtain the predicted class label of each sample and the classification error rate of the classifier. Finally, the weight coefficient of the classifier and new weight of each sample were calculated according to the classification error rate and the predicted class labeles. Experimental results on UCI datasets with decision tree and Naive Bayesian used as weak classifier algorithm show that when decision tree was used as the base classifier of NIBoost, compared with RareBoost algorithm, the F-value is increased up to 5.91 percentage points, the G-mean is increased up to 7.44 percentage points, and the AUC is increased up to 4.38 percentage points. The experimental results show that the proposed algorithm has advantages on imbalanced data classification problem.

Collaborative filtering recommendation algorithm based on tag weight

LEI Man, GONG Qin, WANG Jichao, WANG Baoqun

2019, 39(3): 634-638. DOI: 10.11772/j.issn.1001-9081.2018071521

Asbtract ( )

PDF (830KB) ( )

References | Related Articles | Metrics

Aiming at the problem that the recommendation accuracy is not good enough due to the similarity calculation in traditional collaborative filtering recommendation algorithm, a collaborative filtering recommendation algorithm based on the similarity measurement method of tag weight was proposed. Firstly, the calculation of tag weights in existing algorithm was improved to construct a user-tag weight matrix and an item-tag weight matrix. Secondly, as the recommendation system is based on the user-centered recommendation, the most accurate evaluation and demand of the users were obtained by constructing a user-item association matrix. Finally, according to the user-item bipartite graph, the similarity between users based on the label weight was calculated by the material diffusion algorithm, and the recommendation lists were generated for the target users. The experimental results show that compared with UITGCF (a hybrid Collaborative Filtering recommendation algorithm by combining the diffusion on User-Item-Tag Graph and users' personal interest model), when the sparsity environment is 0.1, the recall, accuracy, F1 score of the proposed algorithm were respectively increased by 14.69%, 9.44% and 17.23%. When the recommendation item number is 10, the three indicators respectively were increased by 17.99%, 8.98%, and 16.27%. The results show that the collaborative filtering recommendation algorithm based on tag weight effectively improves the recommendation results.

Suggestion sentence classification method based on PU learning

ZHANG Pu, LIU Chang, LI Xiao

2019, 39(3): 639-643. DOI: 10.11772/j.issn.1001-9081.2018081759

Asbtract ( )

PDF (880KB) ( )

References | Related Articles | Metrics

As a new research task, suggestion mining has important application value. Since traditional suggestion sentence classification methods have problems like complex rules, large labeling workload, high feature dimension and data sparsity, a PU (Positive and Unlabeled)-based suggestion sentence classification method was proposed. Firstly, some suggestion sentences were selected from an unlabeled review set by using a simple rule to form a positive example set; then a reliable negative example set was constructed by Spy technique in the feature space of autoencoder neural network to reduce the feature dimension and alleviate data sparsity; finally, Multi-Layer Perceptron (MLP) was trained by the positive example set and the reliable negative example set to classify the remaining unlabeled samples. On a Chinese dataset, the F1 value and the accuracy of the proposed method, reached 81.98% and 82.67% respectively. The experimental results show that the proposed method can classify suggestion sentences effectively without manually labelling the data.

SVD-CNN barrage text classification algorithm combined with improved active learning

QIU Ningjia, CONG Lin, ZHOU Sicheng, WANG Peng, LI Yanfang

2019, 39(3): 644-650. DOI: 10.11772/j.issn.1001-9081.2018081757

Asbtract ( )

PDF (1109KB) ( )

References | Related Articles | Metrics

For the loss of much semantic information in dimension reduction of text features when using pooling layer of the traditional Convolutional Network (CNN) model, a Convolutional Neural Network model based on Singular Value Decomposition algorithm (SVD-CNN) was proposed. Firstly, an improved Active Learning algorithm based on Density Center point sampling (DC-AL) was used to tag samples contributing a lot to the classification model, obtaining a high-quality model training set at a low tagging cost. Secondly, an SVD-CNN barrage text classification model was established by combining SVD algorithm, and SVD was used to replace the traditional CNN model pooling layer for feature extraction and dimension reduction, then the barrage text classification task was completed on these bases. Finally, the model parameters were optimized by using Partial Sampling Gradient Descent algorithm (PSGD). In order to verify the effectiveness of the improved algorithm, multiple barrage data sample sets were used in the comparison experiments between the proposed model and the common text classification model. The experimental results show that the improved algorithm can better preserve semantic features of the text, ensure the stability of training process and improve the convergence speed of the model. In summary, the proposed algorithm has better classification performance than traditional algorithms on multiple barrage texts.

Network representation learning algorithm based on improved random walk

WANG Wentao, HUANG Ye, WU Lintao, KE Xuan, TANG Wan

2019, 39(3): 651-655. DOI: 10.11772/j.issn.1001-9081.2018071509

Asbtract ( )

PDF (817KB) ( )

References | Related Articles | Metrics

Existing Word2vec-based Network Representation Learning (NRL) algorithms use a Random Walk (RW) to generate node sequence. The RW tends to select nodes with larger degrees, so that the node sequence can not reflect the network structure information well, decreasing the performance of the algorithm. To solve the problem, a new network representation learning algorithm based on improved random walk was proposed. Firstly, RLP-MHRW (Remove self-Loop Probability for Metropolis-Hastings Random Walk) was used to generate node sequence. This algorithm would not favor nodes with larger degrees while forming a node sequence, so that the obtained sequence can efficiently reflect the network structure information. Then, the node sequence was put into Skip-gram model to obtain the node representation vector. Finally, the performance of the network representation learning algorithm was measured by a link prediction task. Contrast experiment has been performed in four real network datasets. Compared with LINE (Large-scale Information Network Embedding) and node2vec on arXiv ASTRO-PH, the AUC (Area Under Curve) value of link prediction has increased by 8.9% and 3.5% respectively, and so do the other datasets. Experimental results show that RLP-MHRW can effectively improve the performance of the network representation learning algorithm based on Word2vec.

New simplified model of discounted {0-1} knapsack problem and solution by genetic algorithm

YANG Yang, PAN Dazhi, LIU Yi, TAN Dailun

2019, 39(3): 656-662. DOI: 10.11772/j.issn.1001-9081.2018071580

Asbtract ( )

PDF (1164KB) ( )

References | Related Articles | Metrics

Current Discounted {0-1} Knapsack Problem (D{0-1}KP) model takes the discounted relationship as a new individual, so the repair method must be adopted in the solving process to repair the individual coding, making the model have less solving methods. In order to solve the problem of single solving method, by changing the binary code expression in the model, an expression method with discounted relationship out of individual code was proposed. Firstly, if and only if each involved individual encoding value was one (which means the product was one), the discounted relationship was established. According to this setting, a Simplified Discounted {0-1} Knapsack Problem (SD{0-1}KP) model was established. Then, an improved genetic algorithm-FG (First Gentic algorithm) was proposed based on Elitist Reservation Strategy (EGA) and GREedy strategy (GRE) for SD{0-1}KP model. Finally, combining penalty function method, a high precision penalty function method-SG (Second Genetic algorithm) for SD{0-1}KP was proposed. The results show that the SD{0-1}KP model can fully cover the problem domain of D{0-1}KP. Compared with FirEGA (First Elitist reservation strategy Genetic Algorithm), the two algorithms proposed have obvious advantages in solving speed. And SG algorithm introduces the penalty function method for the first time, which enriches the solving methods of the problem.

Krill herd algorithm based on dynamic pressure control operator

SHEN Ying, HUANG Zhangcan, TAN Qing, LIU Ning

2019, 39(3): 663-667. DOI: 10.11772/j.issn.1001-9081.2018081661

Asbtract ( )

PDF (786KB) ( )

References | Related Articles | Metrics

Aiming at the problem that basic Krill Herd (KH) algorithm has poor local search ability and insufficient exploitation capacity on complex function optimization problems, a Krill Herd algorithm based on Dynamic Pressure Control operator (DPCKH) was proposed. A new dynamic pressure control operator was added to the basic krill herd algorithm, which made it more effective on complex function optimization problems. The dynamic pressure control operator quantified the induction effects of several different outstanding individuals on the target individual through Euclidean distance, accelerating the production of new krill individuals near the excellent individuals and improving the local exploration ability of krill individuals. Compared to ACO (Ant Colony Optimization) algorithm, DE algorithm, KH algorithm, KHLD (Krill Herd with Linear Decreasing step) algorithm and PSO (Particle Swarm Optimization) algorithm on 7 benchmark functions, DPCKH algorithm has stronger local exporatioin and exploitation ability.

Attention mechanism based pedestrian trajectory prediction generation model

SUN Yasheng, JIANG Qi, HU Jie, QI Jin, PENG Yinghong

2019, 39(3): 668-674. DOI: 10.11772/j.issn.1001-9081.2018081645

Asbtract ( )

PDF (1160KB) ( )

References | Related Articles | Metrics

Aiming at that Long Short Term Memory (LSTM) has only one pedestrian considered in isolation and cannot realize prediction with various possibilities, an attention mechanism based generative model for pedestrian trajectory prediction called AttenGAN was proposed to construct pedestrian interaction model and predict multiple reasonable possibilities. The proposed model was composed of a generator and a discriminator. The generator predicted multiple possible future trajectories according to pedestrian's past trajectory probability while the discriminator determined whether the trajectories were really existed or generated by the discriminator and gave feedback to the generator, making predicted trajectories obtained conform social norm more. The generator consisted of an encoder and a decoder. With other pedestrians information obtained by the attention mechanism as input, the encoder encoded the trajectories of the pedestrian as an implicit state. Combined with Gaussian noise, the implicit state of LSTM in the encoder was used to initialize the implicit state of LSTM in the decoder and the decoder decoded it into future trajectory prediction. The experiments on ETH and UCY datasets show that AttenGAN can provide multiple reasonable trajectory predictions and can predict the trajectory with higher accuracy compared with Linear, LSTM, S-LSTM (Social LSTM) and S-GAN (Social Generative Adversarial Network) models, especially in scenes of dense pedestrian interaction. Visualization of predicted trajectories obtained by the generator indicated the ability of this model to capture the interaction pattern of pedestrians and jointly predict multiple reasonable possibilities.

Location prediction method of mobile user based on Adaboost-Markov model

YANG Zhen, WANG Hongjun

2019, 39(3): 675-680. DOI: 10.11772/j.issn.1001-9081.2018071506

Asbtract ( )

PDF (1000KB) ( )

References | Related Articles | Metrics

To solve the problem that Markov model has poor prediction accuracy and sparse matching in location prediction, a mobile user location prediction method based on Adaboost-Markov model was proposed. Firstly, the original trajectory data was preprocessed by a trajectory division method based on angle offset and distance offset to extract feature points, and density clustering algorithm was used to cluster the feature points into interest regions of the user, then the original trajectory data was discretized into a trajectory sequence composed of interest regions. Secondly, according to the matching degree of prefix trajectory sequence and historical trajectory pattern tree, the model order k was adaptively determined. Finally, Adaboost algorithm was used to assign the corresponding weight coefficients according to the importance degree of 1 to k order Markov models to form a multi-order fusion Markov model, realizing the prediction of future interest regions of the mobile user. The experimental results on a large-scale real user trajectory dataset show that the average prediction accuracy of Adaboost-Markov model is improved by 20.83%, 11.3%, and 5.38% respectively compared with the first-order Markov model, the second-order Markov model, and the multi-order fusion Markov model with average weight coefficient, and the proposed model has good universality and multi-step prediction performance.

Multi-Agent path planning algorithm based on ant colony algorithm and game theory

ZHENG Yanbin, WANG Linlin, XI Pengxue, FAN Wenxin, HAN Mengyun

2019, 39(3): 681-687. DOI: 10.11772/j.issn.1001-9081.2018071601

Asbtract ( )

PDF (1115KB) ( )

References | Related Articles | Metrics

A two-stage path planning algorithm was proposed for multi-Agent path planning. Firstly, an improved ant colony algorithm was used to plan an optimal path for each Agent from the starting point to the target point without colliding with the static obstacles in the environment. The reverse learning method was introduced to an improved ant colony algorithm to initialize the ant positions and increase the global search ability of the algorithm. The adaptive inertia weighted factor in the particle swarm optimization algorithm was used to adjust the pheromone intensity Q value to make it adaptively change to avoid falling into local optimum. The pheromone volatilization factor ρ was adjusted to speed up the iteration of the algorithm. Then, if there were dynamic collisions between multiple Agents, the game theory was used to construct a dynamic obstacle avoidance model between them, and the virtual action method was used to solve the game and select multiple Nash equilibria, making each Agent quickly learn the optimal Nash equilibrium. The simulation results show that the improved ant colony algorithm has a significant improvement in search accuracy and search speed compared with the traditional ant colony algorithm. And compared with Mylvaganam's multi-Agent dynamic obstacle avoidance algorithm, the proposed algorithm reduces the total path length and improves the convergence speed.

Pedestrian visual positioning algorithm for underground roadway based on deep learning

HAN Jianghong, YUAN Jiaxuan, WEI Xing, LU Yang

2019, 39(3): 688-694. DOI: 10.11772/j.issn.1001-9081.2018071501

Asbtract ( )

PDF (1079KB) ( )

References | Related Articles | Metrics

The self-driving mine locomotive needs to detect and locate pedestrians in front of it in the underground roadway in real-time. Non-visual methods such as laser radar are costly, while traditional visual methods based on feature extraction cannot solve the problem of poor illumination and uneven light in the laneway. To solve the problem, a pedestrian visual positioning algorithm for underground roadway based on deep learning was proposed. Firstly, the overall structure of the system based on deep learning network was given. Secondly, a multi-layer Convolutional Neural Network (CNN) for object detection was built to calculate the two-dimensional coordinates and the size of bounding box of pedestrians in visual field of the self-driving locomotive. Thirdly, the third-dimensional distance between the pedestrian in the image and the locomotive was calculated by polynomial fitting. Finally, the model was trained, verified and tested through real sample sets. Experimental results show that the accuracy of the proposed algorithm reaches 94%, the speed achieves 25 frames per second, and the distance detection error is less than 4%, thus efficient and real-time laneway pedestrian visual positioning is realized.

Quality evaluation of face image based on convolutional neural network

LI Qiuzhen, LUAN Chaoyang, WANG Shuangxi

2019, 39(3): 695-699. DOI: 10.11772/j.issn.1001-9081.2018071588

Asbtract ( )

PDF (821KB) ( )

References | Related Articles | Metrics

Aiming at the low recognition rate caused by low quality of face images in the process of face recognition, a face image quality evaluation model based on convolutional neural network was proposed. Firstly, an 8-layer convolutional neural network model was built to extract deep semantic information of face image quality. Secondly, face images were collected in unconstrained environment, and were filtered by traditional image processing method and manual selecting, then the dataset obtained was used to train the model parameters. Thirdly, by accelerating training on GPU (Graphics Processing Unit), the mapping relationship of fitted face images to categories was obtained. Finally, the input probability of high-quality image category was taken as the image quality score, and the face image quality scoring mechanism was established. Experimental results show that compared with VGG-16 network, the precision rate of the proposed model is reduced by 0.21 percentage points, but the scale of the parameters is reduced by 98%, which greatly improves the efficiency of the model. At the same time, the proposed model has strong discriminant ability in aspects such as face blur, illumination, posture and occlusion. Therefore, the proposed model can be applied to real-time face recognition system to improve the accuracy of the system without affecting the efficiency.

Multi-class vehicle detection in surveillance video based on deep learning

XU Zihao, HUANG Weiquan, WANG Yin

2019, 39(3): 700-705. DOI: 10.11772/j.issn.1001-9081.2018071587

Asbtract ( )

PDF (976KB) ( )

References | Related Articles | Metrics

Since performance of traditional machine learning methods of detecting vehicles in traffic surveillance video is influenced by objective factors such as video quality, shooting angle and weather, which results in complex preprocessing, hard generalization and poor robustness, combined with dilated convolution, feature pyramid and focal loss, two deep learning models which are improved Faster R-CNN (Faster Regions with Convolutional Neural Network) and SSD (Single Shot multibox Detector) model were proposed for vehicle detection. Firstly, a dataset was composed of 851 labeled images captured from the surveillance video at different time. Secondly, improved and original models were trained under same training strategies. Finally, average accuracy of each model were calculated to evaluate. Experimental results show that compared with original Faster R-CNN and SSD, the average accuracies of the improved models improve 0.8 percentage points and 1.7 percentage points respectively. Both deep learning methods are more suitable for vehicle detection in complicated situation than traditional methods. The former has higher accuracy and slower speed, which is more suitable for video off-line processing, while the latter has lower accuracy and higher speed, which is more suitable for video real-time detection.

Non-negative local sparse coding algorithm based on elastic net and histogram intersection

WAN Yuan, ZHANG Jinghui, CHEN Zhiping, MENG Xiaojing

2019, 39(3): 706-711. DOI: 10.11772/j.issn.1001-9081.2018071483

Asbtract ( )

PDF (1007KB) ( )

References | Related Articles | Metrics

To solve the problems that group effect is neglected when selecting dictionary bases in sparse coding models, and distance between a features and a dictionary base can not be effectively measured by Euclidean distance, Non-negative Local Sparse Coding algorithm based on Elastic net and Histogram intersection (EH-NLSC) was proposed. Firstly, with elastic-net model introduced in the optimization function to remove the restriction on selected number of dictionary bases, multiple groups of correlation features were selected and redundant features were eliminated, improving the discriminability and effectiveness of the coding. Then, histogram intersection was introduced in the locality constraint of the coding, and the distance between the feature and the dictionary base was redefined to ensure that similar features share their local bases. Finally, multi-class linear Support Vector Machine (SVM) was adopted to realize image classification. The experimental results on four public datasets show that compared with LLC (Locality-constrained Linear Coding for image classification) and NENSC (Non-negative Elastic Net Sparse Coding), the classification accuracy of EH-NLSC is increased by 10 percentage points and 9 percentage points respectively on average, proving its effectiveness in image representation and classification.

Gait recognition method based on multiple classifier fusion

HUAN Zhan, CHEN Xuejie, LYU Shiyun, GENG Hongyang

2019, 39(3): 712-718. DOI: 10.11772/j.issn.1001-9081.2018071638

Asbtract ( )

PDF (1202KB) ( )

References | Related Articles | Metrics

To improve the performance of gait recognition based on smartphone accelerometer, a recognition method based on Multiple Classifier Fusion (MCF) was proposed. Firstly, as the gait features extracted from the existing methods were relatively simple, the speed variation of the relative gradual acceleration extracted from each single gait cycle and the acceleration variation per unit time were taken as two new types of features (16 in total). Secondly, combing the new features with the frequently-used time domain and frequency domain features to form a new feature set, which could be used to train multiple classifiers with excellent recognition effect and short training time. Finally, Multiple Scale Voting (MSV) was used to fuse the output of the multiple classifiers to obtain the final classification result. To test the performance of the proposed method, the gait data of 32 volunteers were collected. Experimental results show that the recognition rate of new features for a single classifier is increased by 5.95% on average, and the final recognition rate of MSV fusion algorithm is 97.78%.

Survey of frequent pattern mining over data streams

HAN Meng, DING Jian

2019, 39(3): 719-727. DOI: 10.11772/j.issn.1001-9081.2018081712

Asbtract ( )

PDF (1510KB) ( )

References | Related Articles | Metrics

Advanced applications such as fraud detection and trend learning lead to the development of frequent pattern mining over data streams. Data stream mining has to face more problems than static data mining like spatio-temporal constraint and combinatorial explosion of itemsets. In the paper, the existing frequent pattern mining algorithms over data streams were reviewed, and some classical algorithms and some newest algorithms were analyzed. According to the completeness of pattern set, frequent patterns of data stream could be divided into complete patterns and compressed patterns. Compressed patterns include closed frequent patterns, maximal frequent patterns, top-k frequent patterns and combinations of them. Between them, only closed frequent patterns are losslessly compressed. And constrained frequent pattern mining was used to narrow the result set obtained, satisfying the user's demand more. Algorithms based on sliding window model and time decay model were used to better handle recent transactions which occupy an important position in data stream mining. Moreover, two of the common algorithms, sequential pattern mining and high utility pattern mining algorithms were introduced. At last, further research direction of frequent pattern mining over data streams were discussed.

Association rule mining algorithm for Hopfield neural network based on threshold adaptive memristor

YU Yongbin, QI Minhui, Nyima Tashi, WANG Lin

2019, 39(3): 728-733. DOI: 10.11772/j.issn.1001-9081.2018071497

Asbtract ( )

PDF (980KB) ( )

References | Related Articles | Metrics

Aiming at the inaccurate mining results of the Maximum Frequent Itemset mining algorithm based on Hopfield Neural Network (HNNMFI), an improved association rule mining algorithm for Hopfield neural network based on current ThrEshold Adaptive Memristor (TEAM) model was proposed. Firstly, TEAM model was used to design and implement synapses whose weights were set and updated by the ability of that threshold memristor continuously changes memristance value with square-wave voltage, and the input of association rule mining algorithm was self-adapted by the neural network. Secondly, the energy function was improved to align with standard energy function, and the memristance values were used to represent the weights, then the weights and bias were amplified. Finally, an algorithm of generating association rules from the maximum frequent itemsets was designed. A total of 1000 simulation experiments using 10 random transaction sets with size less than 30 were performed. Experimental results show that compared with HNNMFI algorithm, the proposed algorithm improves the result accuracy of association mining by more than 33.9%, which indicates that the memristor can effectively improve the result accuracy of Hopfield neural network in association rule mining.

Feature selection based on maximum conditional and joint mutual information

MAO Yingchi, CAO Hai, PING Ping, LI Xiaofang

2019, 39(3): 734-741. DOI: 10.11772/j.issn.1001-9081.2018081694

Asbtract ( )

PDF (1284KB) ( )

References | Related Articles | Metrics

In the analysis process of high-dimensional data such as image data, genetic data and text data, when samples have redundant features, the complexity of the problem is greatly increased, so it is important to reduce redundant features before data analysis. The feature selection based on Mutual Information (MI) can reduce the data dimension and improve the accuracy of the analysis results, but the existing feature selection methods cannot reasonably eliminate the redundant features because of the single standard. To solve the problem, a feature selection method based on Maximum Conditional and Joint Mutual Information (MCJMI) was proposed. Joint mutual information and conditional mutual information were both considered when selecting features with MCJMI, improving the feature selection constraint. Exerimental results show that the detection accuracy is improved by 6% compared with Information Gain (IG) and minimum Redundancy Maximum Relevance (mRMR) feature selection; 2% compared with Joint Mutual Information (JMI) and Joint Mutual Information Maximisation (JMIM); and 1% compared with LW index with Sequence Forward Search algorithm (SFS-LW). And the stability of MCJMI reaches 0.92, which is better than JMI, JMIM and SFS-LW. In summary the proposed method can effectively improve the accuracy and stability of feature selection.

Sparse non-negative matrix factorization based on kernel and hypergraph regularization

YU Jianglan, LI Xiangli, ZHAO Pengfei

2019, 39(3): 742-749. DOI: 10.11772/j.issn.1001-9081.2018071617

Asbtract ( )

PDF (1229KB) ( )

References | Related Articles | Metrics

Focused on the problem that when traditional Non-negative Matrix Factorization (NMF) is applied to clustering, robustness and sparsity are not considered at the same time, which leads to low clustering performance, a sparse Non-negative Matrix Factorization algorithm based on Kernel technique and HyperGraph regularization (KHGNMF) was proposed. Firstly, on the basis of inheriting good performance of kernel technique, L_2,1 norm was used to improve F-norm of standard NMF, and hyper-graph regularization terms were added to preserve inherent geometric structure information among the original data as much as possible. Secondly, L_2,1/2 pseudo norm and L_1/2 regularization terms were merged into NMF model as sparse constraints. Finally, a new algorithm was proposed and applied to image clustering. The experimental results on six standard datasets show that KHGNMF can improve clustering performance (accuracy and normalized mutual information) by 39% to 54% compared with nonlinear orthogonal graph regularized non-negative matrix factorization, and the sparsity and robustness of the proposed algorithm are increased and the clustering effect is improved.

Efficient identity-based multi-identity fully homomorphic encryption scheme

TU Guangsheng, YANG Xiaoyuan, ZHOU Tanping

2019, 39(3): 750-755. DOI: 10.11772/j.issn.1001-9081.2018081669

Asbtract ( )

PDF (903KB) ( )

References | Related Articles | Metrics

Focusing on the issue that the traditional Identity-Based Fully Homomorphic Encryption scheme (IBFHE) cannot perform homomorphic operations on ciphertexts under different IDentities (ID), a hierarchical identity-based multi-identity fully homomorphic encryption scheme based on Learning With Error (LWE) problem was proposed. In the proposed scheme, the transformation mechanism of identity-based multi-identity homomorphic encryption scheme ([CM15] scheme) proposed by Clear et al. (CLEAR M, McGOLDRICK C. Multi-identity and multi-key leveled FHE from learning with errors. Proceedings of the 2015 Annual Cryptology Conference, LNCS 9216. Berlin:Springer, 2015:630-656) in 2015 was combined with Identity-Based Encryption (IBE) scheme proposed by Cash et al. (CASH D, HOFHEINZ D, KILTZ E, et al. Bonsai trees, or how to delegate a lattice basis. Proceedings of the 2010 Annual International Conference on the Theory and Applications of Cryptographic Techniques, LNCS 6110. Berlin:Springer, 2010:523-552) in 2010 ([CHKP10] scheme), guranteeing IND-ID-CPA (INDistinguishability of IDentity-based encryption under Chosen-Plaintext Attack) security in the random oracle model and realizing ciphertext homomorphic operation under different identities, so the application of this scheme was more promising. Compared with[CM15] scheme, the proposed scheme has advantages in terms of public key scale, private key scale, ciphertext size, and hierarchical properties, and has a wide application prospect.

Reversible data hiding scheme in encrypted videos based on vector histogram shifting

NIU Ke, ZHANG Shuo, YANG Xiaoyuan

2019, 39(3): 756-762. DOI: 10.11772/j.issn.1001-9081.2018071604

Asbtract ( )

PDF (1032KB) ( )

References | Related Articles | Metrics

Aiming at the problem of low embedding capacity and poor invisibility in compressed domain video hiding algorithm, a reversible steganography scheme for H.264/AVC encryption domain was proposed. Firstly, the reference frame interval parameter was determined by the embedded capacity and the carrier size, and whether the cover was encrypted was determined by the need. Then, an embedded key was generated according to the number of video frames to be embedded. Finally, the reversible information embedding on motion vector was realized by the vector histogram shifting in the compressed video. The proposed scheme overcame the distortion accumulation effect due to motion vector modification by specifying a decoding reference frame and is compatible with motion vector-based video encryption algorithms. Video decryption and information extraction depend on the decryption key and the embedded key respectively, which are separated from each other. The information can be extracted in the video ciphertext domain or the decrypted plaintext domain and has no influence on video cover recovery. As security of the information depends on the embedded key, the length of the key can be controlled as needed with the maximum length equal to the number of frames in which the information can be embedded. Experimental results show that the proposed scheme has low computational complexity and high security, and can adjust capacity and invisibility according to embedded load. Compared with BCH code reversible embedding scheme, the PSNR (Peak Signal-to-Noise Ratio) value increases by 3 to 5 dB and the average embedded capacity increases by 5 to 10 times.

Crowdsourcing location data collection for local differential privacy

HUO Zheng, ZHANG Kun, HE Ping, WU Yanbin

2019, 39(3): 763-768. DOI: 10.11772/j.issn.1001-9081.2018071541

Asbtract ( )

PDF (922KB) ( )

References | Related Articles | Metrics

To solve the problem of privacy leakage in crowdsourced location data collection, a locally differentially private location data collection method with crowdsourcing was proposed. Firstly, a Voronoi diagram constructed by point-by-point insertion method was used to partition the road network space. Secondly, a random disturbance satisfying local differential privacy was used to disturb the original location data in each Voronoi grid. Thirdly, a designed spatial range query method was applied to noisy datasets to get the unbiased estimation of the actual result. Finally, experiments were carried out on spatial range queries to compare the proposed algorithm with PTDC (Privacy-preserving Trajectory Data Collection) algorithm. The results show that the query error rate is no more than 40%, and less than 20%in the best situation, and the running time is less than 8 seconds, which are better than those of PTDC algorithm while the proposed method has a higher degree of privacy preserving.

Intrusion detection based on improved sparse denoising autoencoder

GUO Xudong, LI Xiaomin, JING Ruxue, GAO Yuzhuo

2019, 39(3): 769-773. DOI: 10.11772/j.issn.1001-9081.2018071627

Asbtract ( )

PDF (833KB) ( )

References | Related Articles | Metrics

In order to solve the problem that traditional intrusion detection methods can not effectively solve instrusion data in high-dimensional networks, an intrusion detection method based on Stacked Sparse Denosing Autoencoder (SSDA) network was proposed. Firstly, SSDA was used to perform dimensionality reduction on the intrusion data. Then, the highly abstracted low-dimensional data was used as input data of softmax classifier to realize intrusion detection. Finally, in order to improve original intrusion data decoding ability of the network and intrusion detection ability of the model, an Improved model based on SSDA (ISSDA) was proposed, with new constraints added to the autoencoder. The experimental results show that compared with SSDA, ISSAD's detection accuracy of four types of attacks was improved by about 5%, and the false positive rate of ISSAD was also effectively reduced.

Provable radio frequency identification authentication protocol with scalability

SHI Zhicai, WANG Yihan, ZHANG Xiaomei, CHEN Shanshan, CHEN Jiwei

2019, 39(3): 774-778. DOI: 10.11772/j.issn.1001-9081.2018081648

Asbtract ( )

PDF (817KB) ( )

References | Related Articles | Metrics

The popular Radio Frequency IDentification (RFID) tags are some passive ones and they only have very limited computing and memory resources, which makes it difficult to solve the security, privacy and scalability problems of RFID authentication protocols. Based on Hash function, a security-provable lightweight authentication protocol was proposed. The protocol ensures the confidentiality and privacy of the sessions during the authentication process by Hashing and randomizing. Firstly, the identity of a tag was confirmed by its pseudonym and was preserved from leaking to any untrusted entity such as a reader. Secondly, only one Hashing computation was needed to confirm a tag's identity in the backend server, and the searching time to the tag's identity was limited to a constant by using the identifier to construct a Hash table. Finally, after each authentication, the secrecy and pseudonym of the tag were updated to ensure forward security of the protocol. It is proved that the proposed protocol satisfies scalability, forward security and anonymity demands and can prevent eavesdropping, tracing attack, replay attack and de-synchronization attack. The protocol only needs Hash function and pseudorandom generating operation for the tag, therefore it is very suitable to low-cost RFID systems.

Authentication scheme for smart grid communication based on elliptic curve cryptography

LIU Xindong, XU Shuishuai, CHEN Jianhua

2019, 39(3): 779-783. DOI: 10.11772/j.issn.1001-9081.2018071486

Asbtract ( )

PDF (801KB) ( )

References | Related Articles | Metrics

To ensure the security and reliability of communication in the smart grid, more and more authentication protocols have been applied in the communication process. For the authentication protocol proposed by Mahmood et al. (MAHMOOD K, CHAUDHRY S A, NAQVI H, et al. An elliptic curve cryptography based lightweight authentication scheme for smart grid communication. Future Generation Computer Systems. 2018,81:557-565), some defects were pointed out. For example, this protocol can be easily attacked by internal privileged personnel, is lack of password replacement phase and unfriendly to users, in which unique username cannot be guaranteed, even a formula error exists. To improve this protocol, an authentication protocol based on elliptic curve was proposed. Firstly, a login phase between the user and the device was added in the improved protocol. Secondly, elliptic curve cryptography puzzle was used to realize information exchange. Finally, the password replacement phase was added. Through the formal analysis by BAN (Burrows-Abadi-Needha) logic, the improved protocol is safe and feasible, which can resist internal personnel attacks, has password replacement and unique username, and is more friendly to users.

Optimization of virtual resource deployment strategy in container cloud

LI Qirui, PENG Zhiping, CUI Delong, HE Jieguang

2019, 39(3): 784-789. DOI: 10.11772/j.issn.1001-9081.2018081662

Asbtract ( )

PDF (1119KB) ( )

References | Related Articles | Metrics

Aiming at high energy consumption of data center in container cloud, a virtual resource deployment strategy based on host selection algorithm with Power Full (PF) was proposed. Firstly, the allocation and migration scheme of virtual resource in container cloud was proposed and the significant impact of host selection strategy on energy consumption of data center was found. Secondly, by studying the mathematical relationship between the utilization of host and the utilization of containers, between the utilization of host and the utilization of virtual machines and between the utilization of host and energy consumption of data center, a mathematical model of the energy consumption of data center in container cloud was constructed and an optimization objective function was defined. Finally, the function of host's energy consumption was simulated using linear interpolation method, and a host selection algorithm with PF was proposed according to the clustering property of the objects. Simulation results show that compared with First Fit (FF), Least Full (LF) and Most Full (MF), the energy consumption of the proposed algorithm is averagely reduced by 45%,53% and 49% respectively in the computing service of regular tasks and different host clusters; is averagely reduced by 56%,46% and 58% respectively in the computing service of regular tasks and same host cluster; is averagely reduced by 32%,24% and 12% respectively in the computing service of irregular tasks and different host clusters. The results indicate that the proposed algorithm realizes reasonable virtual resource deployment in container cloud, and has advantage in data center energy saving.

k nearest neighbor query based on parallel ant colony algorithm in obstacle space

GUO Liangmin, ZHU Ying, SUN Liping

2019, 39(3): 790-795. DOI: 10.11772/j.issn.1001-9081.2018081647

Asbtract ( )

PDF (932KB) ( )

References | Related Articles | Metrics

To solve the problem of k nearest neighbor query in obstacle space, a k nearest neighbor Query method based on improved Parallel Ant colony algorithm (PAQ) was proposed. Firstly, ant colonies with different kinds of pheromones were utilized to search k nearest neighbors in parallel. Secondly, a time factor was added as a condition of judging path length to directly show the searching time of ants. Thirdly, the concentration of initial pheromone was redefined to avoid the blind searching of ants. Finally, visible points were introduced to divide the obstacle path into multiple Euclidean paths, meawhile the heuristic function was improved and the visible points were selected by ants to conduct probability transfer making ants search in more proper direction and prevent the algorithm from falling into local optimum early. Compared to WithGrids method, with number of data points less than 300, the running time for line segment obstacle is averagely reduced by about 91.5%, and the running time for polygonal obstacle is averagely reduced by about 78.5%. The experimental results show that the running time of the proposed method has obvious advantage on small-scale data, and the method can process polygonal obstacles.

Optimization of fractional PID controller parameters based on improved PSO algorithm

JIN Tao, DONG Xiucheng, LI Yining, REN Lei, FAN Peipei

2019, 39(3): 796-801. DOI: 10.11772/j.issn.1001-9081.2018081698

Asbtract ( )

PDF (931KB) ( )

References | Related Articles | Metrics

Aiming at poor control effect of Fractional Order Proportional-Integral-Derivative (FOPID) controller and the characteristics of wide range and high complexity of parameter tuning for FOPID controller, an improved Particle Swarm Optimization (PSO) method was proposed to optimize the parameters of FOPID controller. In the proposed algorithm, the upper and lower limits of inertial weight coefficients in PSO were defined and decreased nonlinearly with the iteration times in form of Gamma function, meanwhile, the inertia weight coefficients and learning factors of particles were dynamically adjusted according to the fitness value of particles, making the particles keep reasonable motion inertia and learning ability, and improving self-adaptive ability of the particles. Simulation experiments show that the improved PSO algorithm has faster convergence rate and higher convergence accuracy than the standard PSO algorithm in optimizing the parameters of FOPID controller, which makes the FOPID controller obtain better comprehensive performance.

Review of network background traffic classification and identification

ZOU Tengkuan, WANG Yuying, WU Chengrong

2019, 39(3): 802-811. DOI: 10.11772/j.issn.1001-9081.2018071552

Asbtract ( )

PDF (1686KB) ( )

References | Related Articles | Metrics

Internet traffic classification is a process of identifying network applications and classifying corresponding traffic, which is considered as the most basic function of modern network management and security system. And application-related traffic classification is the basic technology of recent network security. Traditional traffic classification methods include port-based prediction methods and payload-based depth detection methods. In current network environment, there are some practical problems in traditional methods, such as dynamic ports and encryption applications. Therefore, Machine Learning (ML) technology based on traffic statistics is used to classify and identify traffic. Machine learning can realize centralized automatic search by using provided traffic data and describe useful structural patterns, which is helpful to intelligently classify traffic. Initially, Naive Bayes method was used to identify and classify network traffic classification, performing well on specific flows with accuracy over 90%, while on traffic such as peer-to-peer transmission network traffic (P2P) with accuracy only about 50%. Then, methods such as Support Vector Machine (SVM) and Neural Network (NN) were used, and neural network method could make accuracy of overall network classification reach 80% or more. A number of studies show that the use of a variety of machine learning methods and their improvements can improve the accuracy of traffic classification.

Load balancing opportunistic routing protocol for power line communication network in smart grids

LI Zhuhong, ZHAO Canming, YAN Long, ZHANG Xinming

2019, 39(3): 812-816. DOI: 10.11772/j.issn.1001-9081.2018071457

Asbtract ( )

PDF (790KB) ( )

References | Related Articles | Metrics

To solve load balancing problem of Power Line Communication (PLC) network in Smart Grid (SG), an adaptive opportunistic routing protocol named LBORP (Load Balancing Opportunistic Routing Protocol) was proposed. All the candidate forwarding nodes receiving a packet in LBORP had the opportunity to participate in packet forwarding. As a result, packet forwarding was no longer limited to one routing path, which avoided load imbalance caused by traffic with only one link for traffic to pass. And forwarding priority of the candidate forwarding nodes considered not only the distance from forwarding nodes to destination node, but also instability of PLC links and change of traffic. Besides, an implicit acknowledgment scheme was adopted in LBORP, further reducing the end-to-end delay of the proposed protocol. In simulation experiment, compared to PLC-TR (Power Line Communication-Tree Routing) and PLC-OR (Power Line Communication-Opportunistic Routing), LBORP reduced delay by 19.7% and 45.8% respectively and reduced packet loss rate by 23.4% and 32.5% respectively. Experimental results show that LBORP can achieve network load balancing, improve network reliability and reduce end-to-end delay.

Reliable beacon-based and density-aware distance localization algorithm for wireless sensor network

QIAN Kaiguo, BU Chunfen, WANG Yujian, SHEN Shikai

2019, 39(3): 817-823. DOI: 10.11772/j.issn.1001-9081.2018071661

Asbtract ( )

PDF (1083KB) ( )

References | Related Articles | Metrics

Traditional DV-Hop localization algorithm and Amorphous algorithm for Wireless Sensor Network (WSN) can not meet practical application with lower localization accuracy due to defects of colinearity of beacons, range ambiguity and the distance error caused by path deviation. Especially, in the node heterogeneously distributed application scenario, the problem becomes more serious. So, a Reliable beacon-based and Density-aware distance Localization Algorithm for WSN (RDLA) was proposed to improve localization accuracy. Firstly, hop threshold and reliability function of approximate equilateral triangle were employed to select the beacon nodes with small error to avoid collinear problem. Secondly, node density-aware hop distance estimation method was used to solve range ambiguity problem, and distances were cumulatived along the Shortest Hop Path (SHP) from unknown node to three beacons. This distance was amended to straight-line distance. Finally, two-dimensional hyperbolic calculation method was adopted to determine locations of unknown nodes and improve node location accuracy. The extensive simulation results by Matlab R2012a show that the Average Localization Error (ALE) of RDLA is lower than that of DV-Hop algorithm and its improvement algorithms in node uniform distribution network. Remarkably, RDLA is tremendously superior to the others with the lowest ALE in node non-uniform distribution network and C shape network, in which, the ALE is almost controlled below 28%.

Improved centroid localization algorithm based on dynamic loss factor and weight

REN Xiaokui, LI Feng, CHENG Lin

2019, 39(3): 824-828. DOI: 10.11772/j.issn.1001-9081.2018081674

Asbtract ( )

PDF (810KB) ( )

References | Related Articles | Metrics

Aiming at the problem that the positioning accuracy of wireless sensor network nodes is affected by the environment and the error weight factor, a centroid positioning algorithm was proposed to dynamically correct path loss factor and error weight factor. At earlier stage, a dynamic loss factor was obtained by weighting correction according to the actual measurement and path loss model; at later stage, the weight factor matrix was constructed by dividing rectangular region. Firstly, the location of the unknown node was estimated by introducing dynamic loss factor into the traditional weighted centroid localization algorithm. Then, the error weight factor matrix was queried to determine the optimal weight factor. Finally, the unknown node location was recalculated. The experimental results show that the improved algorithm reduces the average error and the minimum error, and the positioning accuracy is 58% higher than ordinary centroid algorithm, 21% higher than dynamic correction centroid algorithm, and 11% higher than dynamic weighted centroid algorithm.

Optimization of data retransmission algorithm in information centric networking

XIN Yingying, LIU Xiaojuan, FANG Chunlin, LUO Huan

2019, 39(3): 829-833. DOI: 10.11772/j.issn.1001-9081.2018071492

Asbtract ( )

PDF (786KB) ( )

References | Related Articles | Metrics

Aiming at the problem of low network bandwidth utilization rate of the original data recovery mechanism in Information Centric Networking (ICN), a Network Coding based Real-time Data Retransmission (NC-RDR) algorithm was proposed. Firstly, the lost data packets in the network were counted according to the real-time status of the network. Then, network coding was combined into ICN, and the statistical lost data packets were combinatorially coded. Finally, the encoded data packets were retransmitted to the receiver. The simulation results show that compared with NC-MDR (Network Coding based Multicast Data Recovery) algorithm, in the transmission bandwidth aspect, the average number of transmissions was reduced by about 30%. In ICN, the proposed algorithm can effectively reduce the number of data re-transmissions, improveing network transmission efficiency.

Influence maximization algorithm on internal decay based on community structure in social network

SUN Zili, PENG Jian, TONG Bo

2019, 39(3): 834-838. DOI: 10.11772/j.issn.1001-9081.2018081695

Asbtract ( )

PDF (837KB) ( )

References | Related Articles | Metrics

The existing network spread model ignores the information attenuation in the process of information spread, and the traditional influence maximization algorithm cannot effectively use the community structure to improve the influence spread range. To solve these problems, an algorithm of Influence Maximization on Internal Decay (IMID) based on community structure was proposed. Firstly, the community structure of a whole social network was divided and the influence range of nodes in the community was evaluated. Then, with spread probability of association points between the communities considered, the attenuation degree of information spread between nodes was calculated. Experimental and analysis results show that the proposed algorithm not only reduces the time complexity, but also obtains the influence transmission range near that of greedy algorithm, with influence coverage over 90%. Therefore, with several nodes selected as the initial nodes between the core seed node set and connected communities, information will be widely spread in the network at the minimum cost.

Adaptive beamforming scheme of massive MIMO based on antenna grouping in high-speed railway environment

XI Haozhe, WANG Ruifeng

2019, 39(3): 839-844. DOI: 10.11772/j.issn.1001-9081.2018081778

Asbtract ( )

PDF (843KB) ( )

References | Related Articles | Metrics

Aiming at the throughput of high-speed railway Multiple Input Multiple Output (MIMO) system has not been fully improved, an adaptive beam transmission scheme based on antenna grouping was proposed. Firstly, the train position information was predicted by the Base Station (BS), and the beamforming technology was introduced into high-speed railway environment to establish a high-speed Massive MIMO three-dimensional model. Secondly, it was verified that in BS antenna grouping situation, the throughput of a sub-beam and its corresponding number of transmit antennas satisfied nonlinear relationship and the number change of sub-beam antennas did not affect the throughput of other beams. Finally, based on the above, an adaptive beamforming scheme based on antenna grouping was used to adjust the number of beams required and the number of transmit antennas required by the sub-beams when the train runed at different locations to ensure optimal system throughput at all the locations. The computer simulation results show that compared with the traditional single-beam, dual-beam and eight-beam schemes, the proposed scheme achieves 87.9%, 62.3%, and 50.6% improvement respectively in system throughput when the distance between the train and the BS is less than 125 m, achieves a similar system throughput of single beamforming when the distance is more than 125 m. The experimental results show that the proposed scheme has best system throughput whether the train is far away from or close to the BS, and is better adapted to high-speed railway environment.

Test case generation method based on improved bacterial foraging optimization algorithm

WANG Shuyan, WANG Rui, SUN Jiaze

2019, 39(3): 845-850. DOI: 10.11772/j.issn.1001-9081.2018081692

Asbtract ( )

PDF (881KB) ( )

References | Related Articles | Metrics

Aiming at the low efficiency of test case automatic generation technology, an IMproved Bacterial Foraging Optimization Algorithm (IM-BFOA) was proposed with introduction of Knet map. Firstly, Kent map was used to increase the diversity of the initial population and global search of bacteria. Secondly, the step size of chemotaxis stage in the algorithm was adaptively designed to make it more rational in the process of bacterial chemotaxis. Finally, a fitness function was constructed according to the program under test to accelerate the optimization of test data. The experimental results show that compared with Genetic Algorithm (GA), Particle Swarm Optimization (PSO) algorithm and standard Bacterial Foraging Optimization Algorithm (BFOA), the proposed algorithm is the best in terms of iterations number and running time with the guarantee of coverage and has high efficiency of test case generation.

Process model mining method for multi-concurrent 2-loops of triangles

SUN Huiming, DU Yuyue

2019, 39(3): 851-857. DOI: 10.11772/j.issn.1001-9081.2018081651

Asbtract ( )

PDF (1014KB) ( )

References | Related Articles | Metrics

To mine the process model including multi-concurrent 2-loops of triangles in incomplete logs, an AlphaMatch algorithm based on extended Alpha algorithm was proposed. Two activities in triangle structure could be correctly matched in 2-loops of triangles by AlphaMatch in the log without repeated activity sequence, thus the process model with multi-concurrent 2-loops of triangles could be mined. Firstly, the activities in 2-loops of triangles were divided into two categories according to the number of activities. Then, a matrix of head and tail position of the activities was constructed to match the two categories and a footprint matrix was constructed to show the relationship between activities. Finally, a large number of experiments were carried out on ProM platform from model correctness, mining efficiency, fitness and precison. Experimental results show that the Petri net model including multi-concurrent 2-loops of triangles can be mined efficiently by the proposed algorithm.

3D model shape optimization method based on optimal minimum spanning tree

HAN Li, LIU Shuning, YU Bing, XU Shengsi, TANG Di

2019, 39(3): 858-863. DOI: 10.11772/j.issn.1001-9081.2018081710

Asbtract ( )

PDF (975KB) ( )

References | Related Articles | Metrics

For the efficient shape analysis of massive, heterogeneous and complex 3D models, an optimization method for 3D model shape based on optimal minimum spanning tree was proposed. Firstly, a model description based on 3D model Minimum Spanning Tree (3D-MST) was constructed. Secondly, local optimization was realized by topology and geometry detection and combination of bilateral filtering and entropy weight distribution, obtaining optimized MST representation of the model. Finally, the shape analysis and similarity detection of the model were realized by optimized Laplacian spectral characteristics and Thin Plate Spline (TPS). The experimental results show that the proposed method not only effectively preserves shape features of the model, but also effectively realizes sparse optimization representation of the complex model, improving the efficiency and robustness of geometric processing and shape retrieval.

Multi-mode filtering object tracking algorithm based on monocular suboptimal parallax under unknown environment

HUANG Shuai, FU Guangyuan, WU Ming, YUE Min

2019, 39(3): 864-868. DOI: 10.11772/j.issn.1001-9081.2018071535

Asbtract ( )

PDF (748KB) ( )

References | Related Articles | Metrics

Under unknown environment, Simultaneous Localization, Mapping and Object Tracking (SLAMOT) based on monocular vision needs sufficient parallax to meet the observability condition of object tracking. Focused on the uncertainty of target motion and the unknown of the system on target motion mode, a multi-mode filtering target tracking algorithm based on monocular suboptimal parallax was proposed. Firstly, the direction in which the target uncertain ellipsoid projection area changed the most was selected as the suboptimal parallax direction and was used as the robot parallax control direction. Then, multi-mode filtering algorithm was used to calculate the probability of different motion modes of the target, estimating the target state of different motion modes. Finally, the target state was estimated according to the probabilistic weighting of each motion mode. The simulation results show that the residual error of suboptimal disparity algorithm is 0.16 m when the parallax velocity is 0.3 m/s, meanwhile the residual means of heuristic algorithm, multimode filtering algorithm, traditional Extended Kalman Filter (EKF) algorithm are 0.25 m, 0.06 m and 0.16 m respectively. Besides, when the parallax speed is small, the proposed algorithm also can satisfy the observability condition of target tracking, having important engineering application value.

3D-HEVC robust video watermarking algorithm based on depth map

CAO Haiyan, FENG Gui, HAN Xue, FANG Dingbang, HUANG Xinda

2019, 39(3): 869-873. DOI: 10.11772/j.issn.1001-9081.2018081676

Asbtract ( )

PDF (750KB) ( )

References | Related Articles | Metrics

Aiming at insufficient robustness of depth map in multi-view 3D video with depth format, a 3D robust video watermarking algorithm based on depth map was proposed. Firstly, a depth map was divided into 4×4 nonoverlapped blocks, the mean square error of each block of pixels was calculated, and a threshold was set to distinguish the texture block and the flat block. Secondly, the block energy value of the texture block was calculated, and a threshold was set according to the calculated result to selectively embed the watermark bits. Finally, the transformed and quantized DC coefficients of each block were obtained and used to construct a 3×3 invertible matrix, then QR decomposition was performed on the invertible matrix and the watermark was embedded in the decomposed Q matrix. The proposed algorithm guarantees that the average Peak Signal to Noise Ratio (PSNR) is a constant, and the average Bit Error Rate (BER) under re-encoding attack with different Quantization Parameter (QP) values (25, 30, 35, 40) is 14.9%. Experimental results show that the algorithm has good robustness and embedding capacity, and has little impact on the quality of video.

Novel image segmentation method with noise based on One-class SVM

SHANG Fangxin, GUO Hao, LI Gang, ZHANG Ling

2019, 39(3): 874-881. DOI: 10.11772/j.issn.1001-9081.2018071494

Asbtract ( )

PDF (1642KB) ( )

References | Related Articles | Metrics

To deal with poor robustness in strong noise environment, weak adaptability to complex mixed noise that appear in the existing unsupervised image segmentation models, an improved noise-robust image segmentation model based on One-class SVM (Support Vector Machine) method was proposed. Firstly, a data outlier detection mechanism was constructed based on One-class SVM. Secondly, an outlier degree was introduced into the energy function, so that more accurate image information could be obtained by the proposed model under multiple noise intensities and the failure of weight-descend mechanism in strong noise environment was avoided. Finally, the segmentation contour was driven to the target edge by minimizing the energy function. In noise image segmentation experiments, the proposed model could obtain ideal segmentation results with different types and intensities of noise. Under F₁-score metric, the proposed model is 0.2 to 0.3 higher than LCK (Local Correntropy-based K-means) model, and has better stability in strong noise environments. The segmentation convergence time of the proposed model is only slightly longer than that of LCK model by about 0.1 s. Experimental results show that the proposed model is more robust to probabilistic, extreme values and mixed noise without significantly increase of segmentation time, and can segment natural images with noise.

Image segmentation for complex voids and pits of bridge based on combination of HSI color space and gray fluctuation

YAO Xuelian, HE Fuqiang, PING An, LUO Hong, WAN Silu

2019, 39(3): 882-887. DOI: 10.11772/j.issn.1001-9081.2018081688

Asbtract ( )

PDF (968KB) ( )

References | Related Articles | Metrics

As the voids and pits image of bridge often has uneven illumination and multi-background interference problems, an image segmentation algorithm for complex voids and pits of bridge was proposed based on HSI color space and gray fluctuation. Firstly, S-component gray curve was plotted and all the potential peaks and troughs of the curve were searched, then the height differences between adjacent peaks and thoughs were calculated. Secondly, partial height differences were selected based on the standard deviation of gray pixel difference value. Finally the threshold segmentation of image was finished by searching the best threshold based on the standard deviation of partial height differences. Experimental results show that the proposed algorithm has better segmentation effect and real-time performance than OTSU, Niblack and Tsallis entropy method.

Railway fastener classification model based on sLDA combined with global and local constraints

YANG Fei, LUO Jianqiao, LI Bailin

2019, 39(3): 888-893. DOI: 10.11772/j.issn.1001-9081.2018081767

Asbtract ( )

PDF (1088KB) ( )

References | Related Articles | Metrics

Aiming at the ignorance of target structure in test topic distribution due to the lack of annotation in supervised Latent Dirichlet Allocation (sLDA) model, a sLDA fastener image classification model combined with global and local constraints (glc-LDA) was proposed. Firstly, the training images were manually labeled, and the training topic distribution with structural information was learned in sLDA model. Then, the test topic distribution was calculated to obtain the image category probabilities as global constraints, the topic similarities of training sub-blocks and test sub-blocks as local constraints. Finally, updated test topic distribution was obtained by weighted summation of training topic distribution with the product of global and local constraints as updated weights. The image category labels were obtained in Softmax classifier by the updated topics. The experimental results show that the proposed algorithm can express the structural information of fastener and compared with sLDA model, the distinction of each category of fastener images is enhanced, and the false detection rate is reduced by 55%.

Reconstructed NMF single channel speech enhancement algorithm based on perceptual masking

LI Yansheng, LIU Yuan, ZHANG Yi

2019, 39(3): 894-898. DOI: 10.11772/j.issn.1001-9081.2018071489

Asbtract ( )

PDF (830KB) ( )

References | Related Articles | Metrics

Aiming at the problem of noise residual in Non-negative Matrix Factorization (NMF) speech enhancement algorithm in low Signal-to-Noise Ratio (SNR) unsteady environment, a Perceptual Masking-based reconstructed NMF (PM-RNMF) single-channel speech enhancement algorithm was proposed. Firstly, psychoacoustic masking features were applied to NMF speech enhancement algorithms. Secondly, different masking thresholds were used for different frequencies to establish an adaptive perceptual masking gain function, and the residual noise energy and speech distortion energy were constrained by the thresholds. Finally, Speech Presence Probability (SPP) was combined to realize perceptual gain correction, the NMF algorithm was reconstructed and a new objective function was established. The simulation results show that under three kinds of unsteady noise environments with different SNR, the average Perceptual Evaluation of Speech Quality (PESQ) of PM-RNMF algorithm is improved by 0.767, 0.474 and 0.162 respectively and the average Signal-to-Distortion Ratio (SDR) is increased by 2.785, 1.197 and 0.948 respectively compared with NMF, RNMF (Reconstructive NMF) and PM-DNN (Perceptual Masking-Deep Neural Network) algorithms. Experimental results show that PM-RNMF has better noise reduction effect in both low frequency and high frequency.

Discovery algorithm for overlapping enterprise community with kernel based on node mapping

LU Zhigang, HU Xinchen

2019, 39(3): 899-906. DOI: 10.11772/j.issn.1001-9081.2018071628

Asbtract ( )

PDF (1254KB) ( )

References | Related Articles | Metrics

As most existing enterprise community discovery algorithms focus on homogenous market environment, without reflecting the participation of some enterprises in multiple supply chain operations, a core community representation model based on node mapping relationship, Map-Community, was proposed. By constructing two different role nodes and their different mapping relationships, the ownership community of a enterprise was determined. Based on this representation model, Node Mapping Algorithm (NMA) with approximately-linear time-space complexity was proposed. Firstly, filtering operation was used to obtain the biconnected core graph in the topology diagram of the supply chain network. Secondly, mapping degree was introduced to select the core enterprise nodes. Thirdly, local expansion was performed according to the mapping judgment rules. Finally, the local community structure was extended to the global network by backtracking and overlapping areas were discovered. In the LFR (Lancichinetti-Fortunato-Radicchi) network application experiment, NMA shows low sensitivity to threshold change and is superior to LFM (Local Fitness Maximization), COPRA (Community Overlap PRopagation Algorithm) and GCE (Greedy Clique Expansion) in terms of practicality. Simulation was carried out in the enterprise social network, and the meaning of distribution effect was summarized by the community division. The experimental results verify the feasibility of this algorithm for overlapping enterprise community discovery and its performance advantages in discovery quality.

Traffic behavior spectrum analysis method based on regional road live data

HUANG Fengyu, WU Yefu, CHEN Jingren, WU Bing

2019, 39(3): 907-912. DOI: 10.11772/j.issn.1001-9081.2018081699

Asbtract ( )

PDF (906KB) ( )

References | Related Articles | Metrics

Aiming at the problem that the characteristic and evaluation indexes for reasearch of traffic behavior spectrum are incompleted both at home and abroad and quantitative analysis cannot be performed in the research, the corresponding characteristic and evaluation indexes were defined to establish a complete traffic behavior spectrum system with quantitative analysis of regional traffic behavior data. Firstly, based on the characteristics of traffic behavior, an improved Analytic Hierarchy Process (AHP) was used to classify the traffic order types. Secondly, Real-Time System Integration (RTSI) algorithm with multi-data fusion was used to comprehensively evaluate the traffic safety of a certain road. Finally, a traffic behavior spectrum analysis tool was developed, calculating traffic safety index of a road section according to the traffic live data, and analyzing traffic behavior in the section more completely.

Optimal path control of parallel truss manipulator based on visual grasping

YANG Jidong, SUN Zhaoqi, WANG Feilong

2019, 39(3): 913-917. DOI: 10.11772/j.issn.1001-9081.2018071586

Asbtract ( )

PDF (873KB) ( )

References | Related Articles | Metrics

To solve the problem of component grasping path planning, a hybrid optimization algorithm based on ant colony algorithm and tabu search algorithm was proposed to minimize the time. Firstly, the problem of grasping components based on machine vision was defined as Traveling Salesman Problem (TSP) with precedence constraint. Secondly, comprehensive effects of the sizes of components and the grasping and placement process on path planning were analyzed, and the path selection probability and tabu region were improved adaptively. Thirdly, on the one hand, 2-opt local optimization, pheromone punishment and reward mechanism were introduced to improve the search ability of ants; on the other hand, pheromone evaporation factor was improved adaptively to increase the adaptability of ants. Finally, for the basic algorithm and the improved hybrid optimization algorithm, the performance index and grasping time were compared and analyzed by simulation experiment and platform experiment. The experimental results show that, compared with Ant Colony Optimization (ACO) algorithm and Tabu Search (TS) algorithm, the average iteration times of the proposed hybrid optimization algorithm is reduced by about 50%, and other performances are superior to the other algorithms. The results of platform test also show that the hybrid optimization algorithm is superior to random results and basic algorithm, and it can realize component grasping task quickly.

Cooperative evolution method for blockchain mining pool based on adaptive zero-determinant strategy

FAN Li, ZHENG Hong, HUANG Jianhua, LI Zhongcheng, JIANG Yahui

2019, 39(3): 918-923. DOI: 10.11772/j.issn.1001-9081.2018071619

Asbtract ( )

PDF (834KB) ( )

References | Related Articles | Metrics

At present, the most common way for bitcoin mining is miners joining in a pool. However, there is a phenomenon that the mining pools penetrate each other, which will result in a decrease in the miners' income of the attacked pools, and a reduction in computing power of the attacking pools. Therefore, the overall computing power of the bitcoin system is reduced. Aiming at the problem of mutual attack and non-cooperative mining between mining pools, an Adaptive Zero-Determinant strategy (AZD) was proposed to promote the cooperation of miners. The strategy adopted the idea of comparing expected payoff with cooperation and defection in the next round then choosing a strategy with high payoff. Firstly, miners' payoff in the next round under two situations could be predicted by the combination of Temporal Difference Learning Method (TD(λ)) and Zero-Determinant strategy (ZD). Secondly, by comparing the cooperation payoff with defection payoff in the next round, a more favorable strategy was chosen for miners by Decision Making Process (DMP), so the cooperation probability and defection probability in the next round were changed correspondingly. Finally, through the iterative implementation of AZD strategy, the ming pools in the network would cooperate with each other and mine actively. Simulation results show that compared with adaptive strategy, AZD strategy increases the speed of converging cooperation probability to 1 by 36.54%, compared with ZD strategy, it improves the stability by 50%. This result indicates that AZD strategy can effectively promote the cooperation of miners, improve the convergence rate of cooperation and ensure the stable income of mining pools.

Abnormal time series data detection of gas station by Seq2Seq model based on bidirectional long short-term memory

TAO Tao, ZHOU Xi, MA Bo, ZHAO Fan

2019, 39(3): 924-929. DOI: 10.11772/j.issn.1001-9081.2018081681

Asbtract ( )

PDF (936KB) ( )

References | Related Articles | Metrics

Time series data of gas station contains multi-dimensional information of fueling behavior, but the data of specific gas station are sparse. The existing abnormal data detection algorithms are not suitable for gas station time series data, because many pseudo outliers are mined and many real abnormal points are missed. To solve the problems, an abnormal detection method based on deep learning was proposed to detect vehicles with abnormal fueling. Firstly, feature extraction was performed on data collected from the gas station through an automatic encoder. Then, a deep learning model Seq2Seq with embedding Bidirectional Long Short-Term Memory (Bi-LSTM) was used to predict the fueling behavior. Finally, the threshold of outliers was defined by comparing the predicted value and the original value. The experiments on a fueling dataset and a credit card fraud dataset verify the effectiveness of the proposed method. Compared with the existing methods, the Root Mean Squared Error (RMSE) of the proposed method is decreased by 21.1% on the fueling dataset, and abnormal detection accuracy of the proposed method is improved by 1.4% on the credit card fraud dataset. Therefore, the proposed method can be applied to detect vehicles with abnormal fueling behavior, improving the management and operational efficiency of gas station.

Cardiac arrhythmia detection algorithm based on deep long short-term memory neural network model

YANG Shuo, PU Baoming, LI Xiangze, WANG Shuai, CHANG Zhanguo

2019, 39(3): 930-934. DOI: 10.11772/j.issn.1001-9081.2018081677

Asbtract ( )

PDF (762KB) ( )

References | Related Articles | Metrics

Aiming at the problems of inaccurate feature extraction and high complexity of traditional ElectroCardioGram (ECG) detection algorithms based on morphological features, an improved Long Short-Term Memory (LSTM) neural network was proposed. Based on the advantage of traditional LSTM model in time series data processing, the proposed model added reverse and depth calculations which avoids extraction of waveform features artificially and strengthens learning ability of the network. And supervised learning was performed in the model according to the given heart beat sequences and category labels, realizing the arrhythmia detection of unknown heart beats. The experimental results on the arrhythmia datasets in MIT-BIH database show that the overall accuracy of the proposed method reaches 98.34%. Compared with support vector machine, the accuracy and F1 value of the model are both improved.

Table of Content