Journal of Computer Applications

Graph convolutional network model using neighborhood selection strategy

CHEN Kejia, YANG Zeyu, LIU Zheng, LU Hao

2019, 39(12): 3415-3419. DOI: 10.11772/j.issn.1001-9081.2019071281

The composition of neighborhoods is crucial for spatial domain-based Graph Convolutional Network (GCN) models. To solve the problem that structural influence is not considered when ordering the neighborhoods of nodes, a novel neighborhood selection strategy was proposed to obtain an improved GCN model. Firstly, the structurally important neighborhoods were collected for each node and the core neighborhoods were selected hierarchically. Secondly, the features of the nodes and their core neighborhoods were organized into a matrix. Finally, the matrix was fed to a deep Convolutional Neural Network (CNN) for semi-supervised learning. The experimental results on the Cora, Citeseer and Pubmed citation network datasets show that the proposed model achieves higher accuracy in node classification tasks than the model based on classical graph embedding and four state-of-the-art GCN models. As a spatial domain-based GCN, the proposed model can be effectively applied to the learning tasks of large-scale networks.
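
The three steps above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: node degree stands in for "structural importance", neighbors are gathered hop by hop (one reading of "hierarchically"), and short neighborhoods are zero-padded. The names `core_neighbors` and `neighborhood_matrix` are hypothetical.

```python
def core_neighbors(adj, node, k, max_hops=2):
    """Pick up to k neighbors of `node`: nearer hops first, higher degree first."""
    seen = {node}
    frontier = [node]
    chosen = []
    for _ in range(max_hops):
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    next_frontier.append(v)
        # Assumption: degree is the importance score; ties are broken by node id.
        next_frontier.sort(key=lambda v: (-len(adj[v]), v))
        chosen.extend(next_frontier)
        if len(chosen) >= k:
            break
        frontier = next_frontier
    return chosen[:k]


def neighborhood_matrix(adj, features, node, k):
    """Stack the node's features and its core neighbors' features into (k + 1) rows."""
    dim = len(features[node])
    rows = [features[node]]
    rows += [features[v] for v in core_neighbors(adj, node, k)]
    # Zero-pad when fewer than k neighbors are reachable within max_hops.
    rows += [[0.0] * dim for _ in range(k + 1 - len(rows))]
    return rows


# Toy graph given as an adjacency list, with 2-dimensional node features.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, 0.5]}
matrix = neighborhood_matrix(adj, features, 3, 4)
```

Every node yields a matrix of the same fixed shape, which is what allows an ordinary CNN to consume it.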

Image classification learning via unsupervised mixed-order stacked sparse autoencoder

YANG Donghai, LIN Minmin, ZHANG Wenjie, YANG Jingmin

2019, 39(12): 3420-3425. DOI: 10.11772/j.issn.1001-9081.2019061107

Most current image classification methods use supervised or semi-supervised learning to reduce image dimensionality. However, both require images that carry label information. For the dimensionality reduction and classification of unlabeled images, a mixed-order feature stacked sparse autoencoder was proposed to realize unsupervised dimensionality reduction and classification learning. Firstly, a serial stacked sparse autoencoder network with three hidden layers was constructed. Each hidden layer was trained separately, and the output of each hidden layer was used as the input of the next one, realizing feature extraction and dimensionality reduction of the image data. Secondly, the features of the first and second hidden layers of the trained stacked autoencoder were concatenated and fused to form a matrix containing mixed-order features. Finally, a support vector machine was used to classify the image features after dimensionality reduction, and the accuracy was evaluated. The proposed method was compared with seven baseline algorithms on four open image datasets. The experimental results show that the proposed method can extract features from unlabeled images, realize image classification learning, reduce classification time and improve image classification accuracy.

Multi-scale attribute granule based quick positive region reduction algorithm

CHEN Manru, ZHANG Nan, TONG Xiangrong, DONGYE Shenglong, YANG Wenjing

2019, 39(12): 3426-3433. DOI: 10.11772/j.issn.1001-9081.2019049238

In the classical heuristic attribute reduction algorithm for the positive region, the attribute with the maximum dependency degree on the current positive region is added to the selected feature attribute subset in each iteration. This leads to a large number of iterations and low efficiency, and makes the algorithm hard to apply to feature selection on high-dimensional and large-scale datasets. To solve these problems, the monotonic relationship between the positive regions in a decision system was studied, a formal description of the Multi-Scale Attribute Granule (MSAG) was given, and a Multi-scale Attribute Granule based Quick Positive Region reduction algorithm (MAG-QPR) was proposed. Each MSAG contains several attributes and can provide a large positive region for the selected feature attribute subset. As a result, adding an MSAG in each iteration reduces the number of iterations and lets the selected feature attribute subset approach the positive region resolving ability of the full condition attribute set more quickly. Therefore, the computational efficiency of the heuristic attribute reduction algorithm for the positive region is improved. Experiments were conducted on 8 UCI datasets. On the Lung Cancer, Flag and German datasets, the running time acceleration ratios of MAG-QPR to the general improved Feature Selection algorithm based on the Positive Approximation-Positive Region (FSPA-PR), the general improved Feature Selection algorithm based on the Positive Approximation-Shannon's Conditional Entropy (FSPA-SCE), the Backward Greedy Reduction Algorithm for positive region Preservation (BGRAP) and the Backward Greedy Reduction Algorithm for Generalized decision preservation (BGRAG) are 9.64, 15.70, 5.03 and 2.50; 3.93, 7.55, 1.69 and 4.57; and 3.61, 6.49, 1.30 and 9.51, respectively. The experimental results show that the proposed MAG-QPR algorithm improves efficiency and has better classification accuracy.
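
For readers unfamiliar with rough sets, the positive region and dependency degree that the algorithm relies on can be sketched as follows. This shows only the standard definitions on a toy decision table. It is not the MAG-QPR algorithm, and the table, attribute names and function names are made up for illustration.

```python
from collections import defaultdict


def positive_region(rows, cond_attrs, decision):
    """Indices of rows whose equivalence class under cond_attrs has a single decision value."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in cond_attrs)].append(i)
    pos = set()
    for members in blocks.values():
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def dependency(rows, cond_attrs, decision):
    """Dependency degree: fraction of rows that lie in the positive region."""
    return len(positive_region(rows, cond_attrs, decision)) / len(rows)


# Hypothetical decision table: condition attributes a, b, c and decision attribute d.
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
    {"a": 0, "b": 0, "c": 0, "d": "yes"},
]

pos_b = positive_region(table, ["b"], "d")
pos_bc = positive_region(table, ["b", "c"], "d")
pos_all = positive_region(table, ["b", "a", "c"], "d")
```

The positive region never shrinks as attributes are added, which is the monotonic relationship the abstract mentions. Adding several attributes in one step (here `a` and `c` together) reaches the full positive region in fewer iterations than adding them one at a time.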

Customer purchasing power prediction of Google store based on deep LightGBM ensemble learning model

YE Zhiyu, FENG Aimin, GAO Hang

2019, 39(12): 3434-3439. DOI: 10.11772/j.issn.1001-9081.2019071305

Ensemble learning models such as Light Gradient Boosting Machine (LightGBM) mine the data only once, and can neither automatically refine the granularity of data mining nor dig deeper to obtain more of the potential internal correlation information in the data. To solve these problems, a deep LightGBM ensemble learning model was proposed, which consists of two steps: sliding window and deepening. Firstly, the sliding window allowed the ensemble learning model to automatically refine the granularity of data mining, so as to further mine the potential internal correlation information in the data and give the model a certain representation learning ability. Secondly, building on the sliding window, the deepening step was used to further improve the representation learning ability of the model. Finally, the dataset was processed with feature engineering. The experimental results on the Google store dataset show that the prediction accuracy of the proposed deep ensemble learning model is 6.16 percentage points higher than that of the original ensemble learning model. The proposed method can automatically refine the granularity of data mining, so as to obtain more potential information in the dataset. Moreover, compared with the traditional deep neural network, the deep LightGBM ensemble learning model has fewer parameters and, as a non-neural network, better interpretability.
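
The sliding-window step can be pictured with a small sketch. The window size, the stride and the function name `sliding_windows` are assumptions; the abstract does not say how windows are formed or how each window is passed to LightGBM, so only the window generation is shown.

```python
def sliding_windows(features, size, stride=1):
    """Split one feature vector into overlapping sub-vectors of length `size`."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    last_start = len(features) - size
    return [features[i:i + size] for i in range(0, last_start + 1, stride)]


# A 5-feature sample scanned with a window of 3 features.
sample = [0.2, 1.5, 3.0, 0.7, 2.1]
windows = sliding_windows(sample, size=3)
```

Each sub-vector exposes a finer-grained view of the same sample, so a model trained on the windows can pick up local feature interactions that a single pass over the full vector might miss.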

Siamese detection network based real-time video tracking algorithm

DENG Yang, XIE Ning, YANG Yang

2019, 39(12): 3440-3444. DOI: 10.11772/j.issn.1001-9081.2019081427

Currently, in the field of video tracking, typical Siamese network based algorithms only locate the center point of the target, which results in poor localization of fast-deforming objects. Therefore, a real-time video tracking algorithm based on a Siamese detection network, called Siamese-FC Region-convolutional neural network (SiamRFC), was proposed. SiamRFC can directly predict the center position of the target, thus dealing with the rapid deformation. Firstly, the position of the center point of the target was obtained by judging the similarity. Then, the idea of object detection was used to regress the optimal position by selecting from a series of candidate boxes. Experimental results show that SiamRFC performs well on the VOT2015, VOT2016 and VOT2017 test sets.

Crowd counting model based on multi-scale multi-column convolutional neural network

LU Jingang, ZHANG Li

2019, 39(12): 3445-3449. DOI: 10.11772/j.issn.1001-9081.2019081437

To improve the poor performance of crowd counting in surveillance videos and images caused by scale and perspective variation, a crowd counting model named Multi-scale Multi-column Convolutional Neural Network (MsMCNN) was proposed. Before extracting features with MsMCNN, the dataset was processed with a Gaussian filter to obtain the ground-truth density maps of the images, and data augmentation was performed. With the multi-column convolutional neural network structure as the backbone, MsMCNN firstly extracted feature maps from multiple columns at multiple scales. Then, MsMCNN generated the estimated density map by combining feature maps with the same resolution in the same column. Finally, crowd counting was realized by integrating the estimated density map. To verify the effectiveness of the proposed model, experiments were conducted on the Shanghaitech and UCF_CC_50 datasets. Compared to the classic methods Crowdnet, Multi-column Convolutional Neural Network (MCNN), Cascaded Multi-Task Learning (CMTL) and Scale-adaptive Convolutional Neural Network (SaCNN), the Mean Absolute Error (MAE) of MsMCNN decreases by at least 10.6 on Part_A of the Shanghaitech dataset and by at least 24.5 on UCF_CC_50, and the Mean Squared Error (MSE) decreases by at least 1.8 and 29.3 respectively. Furthermore, MsMCNN also achieves better results on Part_B of the Shanghaitech dataset. MsMCNN pays more attention to the combination of shallow features and the combination of multi-scale features in the feature extraction process, which effectively reduces the loss of accuracy caused by scale and perspective variation and improves the performance of crowd counting.
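
The relation between a density map and the head count can be made concrete with a minimal sketch. It assumes head annotations given as (row, column) points and a fixed Gaussian width. The function names and the `sigma` value are illustrative and are not taken from the paper.

```python
import math


def density_map(points, height, width, sigma=1.5):
    """Place one Gaussian blob per annotated head; each blob is normalized to sum to 1."""
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2)) for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        for y in range(height):
            for x in range(width):
                dmap[y][x] += weights[y][x] / total
    return dmap


def count_from_density(dmap):
    """Integrating (summing) the density map gives the crowd count."""
    return sum(map(sum, dmap))


def mean_absolute_error(predicted, actual):
    """MAE over a set of images, the first metric quoted in the abstract."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


heads = [(2, 3), (5, 5), (7, 1)]
estimated_count = count_from_density(density_map(heads, 10, 10))
```

Because every blob integrates to one, the sum of the map recovers the number of annotated heads, and a network trained to regress such maps can be scored by comparing summed predictions with true counts.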

Lake extraction algorithm based on three-dimensional convolutional neural network

XU Shanshan, YAN Chao, GAO Linming

2019, 39(12): 3450-3455. DOI: 10.11772/j.issn.1001-9081.2019081436

To address the low accuracy of lake contour extraction from two-dimensional images in existing algorithms for analyzing the geometric information of lakes, a lake extraction algorithm based on a three-dimensional convolutional neural network was proposed. Firstly, based on flatness information, candidate lakes were located in the laser scanning point clouds, and the candidate points were organized as voxels to serve as the input of the neural network. Meanwhile, the non-lake areas were filtered out of the candidate areas by the deep learning technique. Then, based on the chain-code algorithm, the contours of lakes were extracted from the point clouds and their geometric information was analyzed. The experimental results show that the accuracy of the proposed algorithm in extracting lakes from laser scanning point clouds is 96.34%, and that, compared with the existing extraction algorithm for two-dimensional images, the proposed algorithm can calculate and analyze the shape information of lakes, which provides convenience for lake monitoring and management.

Segmentation model of neonatal punctate white matter lesion based on refined deep residual U-Net

LIU Yalong, LI Jie, WANG Ying, WU Saifei, ZOU Pei

2019, 39(12): 3456-3461. DOI: 10.11772/j.issn.1001-9081.2019049101

The tiny lesion area and the large difference between samples of neonatal punctate white matter lesion make it difficult to detect and segment the lesion. To solve the problem, a refined deep residual U-Net was proposed to realize fine semantic segmentation of the lesion. Firstly, a Magnetic Resonance Imaging (MRI) image was cut into small patches. Secondly, the deep features of multiple layers of each image patch were extracted by the residual U-Net. Then, the features were fused and the probability map of the lesion distribution of each image patch was obtained. Finally, the stitched probability map was optimized by a fully-connected conditional random field to obtain the final segmentation results. The performance of the algorithm was evaluated on a dataset provided by a cooperating hospital. The results show that, using only unimodal T1 sequence data, the proposed model segments the edge of the lesion more precisely and shows prominent anti-interference ability. The model achieves a Dice similarity coefficient of 62.51%, a sensitivity of 69.76% and a specificity of 99.96%, with the modified Hausdorff distance reduced to 33.67.
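
The three overlap metrics quoted above follow the standard definitions, sketched here on flattened binary masks. The toy masks are invented and the function name is illustrative. The sketch assumes each mask contains at least one lesion pixel and one background pixel, so no denominator is zero.

```python
def segmentation_metrics(pred, truth):
    """Return (Dice, sensitivity, specificity) for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    dice = 2 * tp / (2 * tp + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of lesion pixels that were found
    specificity = tn / (tn + fp)   # fraction of background pixels left alone
    return dice, sensitivity, specificity


predicted_mask = [1, 1, 0, 0, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 0, 0, 1, 1]
dice, sensitivity, specificity = segmentation_metrics(predicted_mask, true_mask)
```

With tiny lesions the background dominates the image, which is why a specificity close to 100% can coexist with a much lower Dice coefficient.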

Novel learning algorithm combining support vector machine and semi-supervised K-means

DU Yang, JIANG Zhen, FENG Lujie

2019, 39(12): 3462-3466. DOI: 10.11772/j.issn.1001-9081.2019050813

Semi-supervised learning can effectively improve the generalization performance of an algorithm by combining a few labeled samples with a large number of unlabeled samples. The traditional semi-supervised Support Vector Machine (SVM) algorithm introduces unlabeled sample dependencies into the objective function to drive the decision surface through the low-density region, but it often brings problems such as high computational complexity and local optimal solutions. At the same time, the semi-supervised K-means algorithm faces the problem of how to effectively use the supervised information to initialize and update the centroids. To solve these problems, a novel learning algorithm, Semi-supervised K-means Assisted SVM (SKAS), was proposed. Firstly, an improved semi-supervised K-means algorithm was proposed, which was improved in two aspects: distance measurement and centroid iteration. Then, a fusion algorithm was designed to combine the semi-supervised K-means algorithm with SVM in order to further improve performance. The experimental results on six UCI datasets show that the proposed method outperforms the current advanced semi-supervised SVM and semi-supervised K-means algorithms on five datasets and has the highest average accuracy.
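
One common way to use supervised information in K-means is to seed each centroid with the mean of the labeled samples of a class and keep those samples pinned to their class. The sketch below shows that baseline idea with plain Euclidean distance. It is not the paper's improved distance measure or centroid update, and all names and data are illustrative.

```python
def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(labeled, unlabeled, iterations=10):
    """labeled: list of (point, class_id) pairs; unlabeled: list of points."""
    classes = sorted({c for _, c in labeled})
    # Supervised initialization: one centroid per class, from the labeled samples.
    centroids = {c: mean([p for p, k in labeled if k == c]) for c in classes}
    assignment = []
    for _ in range(iterations):
        assignment = [min(classes, key=lambda c: sq_dist(p, centroids[c])) for p in unlabeled]
        updated = {}
        for c in classes:
            members = [p for p, k in labeled if k == c]  # labeled samples stay in their class
            members += [p for p, a in zip(unlabeled, assignment) if a == c]
            updated[c] = mean(members)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignment


labeled = [([0.0, 0.0], 0), ([10.0, 10.0], 1)]
unlabeled = [[1.0, 1.0], [0.0, 2.0], [9.0, 9.0], [11.0, 10.0]]
centroids, assignment = seeded_kmeans(labeled, unlabeled)
```

The resulting cluster assignments could then serve as extra evidence alongside an SVM's predictions; how the two are fused in SKAS is not described in the abstract.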

Non-unidimensional community detection algorithm based on network representation learning

CHEN Wanjie, SHENG Yiqiang

2019, 39(12): 3467-3475. DOI: 10.11772/j.issn.1001-9081.2019061009

Focusing on the issue that existing community detection algorithms have difficulty handling the multidimensionality of networks, a non-unidimensional community detection algorithm based on network representation learning was proposed. The algorithm considered the difference between nodes in two dimensions: node attribute features and network structure features. Firstly, the node transition probability was calculated according to the node attribute similarity. The length of the random walk path of a network node was set according to the six degrees of separation theory of the small world model. After the walking path of a node was obtained by selecting its neighbor nodes according to the transition probability, the walking paths were used to train a neural network model to obtain the network feature vectors. The similarity between the network feature vectors of connected nodes was then set as the weight of the connecting edge, and the community partition was completed based on the Louvain algorithm. Finally, experiments were conducted on two datasets, Facebook and Giraffe, with the Louvain algorithm based on the initial network structure and the unidimensional community detection algorithms as comparison algorithms. Experimental results show that on the Giraffe dataset, compared to the Louvain algorithm, the community detection algorithm based on the node attribute has the modularity increased by 2.7%, the community detection algorithm based on the network structure has the modularity increased by 3.0%, and the proposed non-unidimensional community detection algorithm has the modularity increased by 3.7%. The proposed algorithm focuses on the multidimensionality of the network and effectively improves the modularity of community detection.
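
The attribute-biased random walk can be sketched as follows. The assumptions are: Jaccard similarity over attribute sets as the similarity measure, a small `eps` so that dissimilar neighbors remain reachable, and a path of six nodes as one reading of the six-degree setting. None of these specifics come from the abstract, and the function names are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard similarity of two attribute sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def transition_probs(adj, attrs, u, eps=1e-6):
    """Probability of stepping from u to each neighbor, proportional to attribute similarity."""
    weights = [jaccard(attrs[u], attrs[v]) + eps for v in adj[u]]
    total = sum(weights)
    return [w / total for w in weights]


def attribute_walk(adj, attrs, start, length=6, rng=None):
    """Random walk of `length` nodes whose steps favor attribute-similar neighbors."""
    rng = rng or random.Random(0)
    path = [start]
    while len(path) < length and adj[path[-1]]:
        u = path[-1]
        probs = transition_probs(adj, attrs, u)
        path.append(rng.choices(adj[u], weights=probs)[0])
    return path


adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
attrs = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}, "d": {"z", "w"}}
walk = attribute_walk(adj, attrs, "a")
probs_from_a = transition_probs(adj, attrs, "a")
```

Walks generated this way play the role of sentences for a skip-gram style embedding model, so nodes that are both close in the graph and similar in attributes end up with similar vectors.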

Short text automatic summarization method based on dual encoder

DING Jianli, LI Yang, WANG Jialiang

2019, 39(12): 3476-3481. DOI: 10.11772/j.issn.1001-9081.2019050800

To address the insufficient use of semantic information and the poor summarization precision of current generative text summarization methods, a text summarization method based on a dual encoder was proposed. Firstly, the dual encoder was used to provide richer semantic information for the Sequence to Sequence (Seq2Seq) architecture, and the attention mechanism with dual channel semantics and the decoder with empirical distribution were optimized. Then, position embedding and word embedding were merged, and Term Frequency-Inverse Document Frequency (TF-IDF), Part Of Speech (POS) and key Score (Soc) were added to the word embedding, so that the word embedding dimension was optimized. The proposed method aims to optimize the traditional sequence mapping of Seq2Seq and the word feature representation, enhance the model's semantic understanding, and improve the quality of the summarization. The experimental results show that the proposed method improves performance in the Rouge evaluation system by 10 to 13 percentage points compared with the traditional Recurrent Neural Network method with attention (RNN+atten) and the Multi-layer Bidirectional Recurrent Neural Network method with attention (Bi-MulRNN+atten). The proposed method thus understands the text more accurately, generates better summaries, and has good application prospects.
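
Of the word-level features listed above, TF-IDF is the easiest to make concrete. The sketch uses the plain, unsmoothed variant (term frequency times the natural log of inverse document frequency) on pre-tokenized toy documents. The variant the paper uses and the way the score is appended to the embedding are not stated in the abstract.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Per-document dict of term -> tf * ln(N / df), for pre-tokenized documents."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    ["flight", "delay", "flight"],
    ["flight", "cancel"],
    ["weather", "delay"],
]
scores = tf_idf(docs)
```

A term that is frequent in one document but rare across the collection scores high, which gives the encoder a hint about which words are likely to matter for the summary.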

Human behavior recognition algorithm based on three-dimensional residual dense network

GUO Mingxiang, SONG Quanjun, XU Zhannan, DONG Jun, XIE Chengjun

2019, 39(12): 3482-3489. DOI: 10.11772/j.issn.1001-9081.2019061056

Concerning the problem that existing algorithms for human behavior recognition cannot fully utilize the multi-level spatio-temporal information of the network, a human behavior recognition algorithm based on a three-dimensional residual dense network was proposed. Firstly, the proposed network adopted three-dimensional residual dense blocks as its building blocks; these blocks extracted the hierarchical features of human behavior through densely-connected convolutional layers. Secondly, the local dense features of human behavior were learned by the local feature aggregation adaptive method. Thirdly, a residual connection module was adopted to facilitate the flow of feature information and mitigate the difficulty of training. Finally, after multi-level local feature extraction was realized by concatenating multiple three-dimensional residual dense blocks, an aggregation adaptive method for global features was proposed to learn the features of all network layers for human behavior recognition. In summary, the proposed algorithm improves the extraction of multi-level spatio-temporal features, and highly discriminative features are learned through local and global feature aggregation, which enhances the expressive ability of the model. The experimental results on the benchmark datasets KTH and UCF-101 show that the recognition rate (top-1 recognition accuracy) of the proposed algorithm reaches 93.52% and 57.35% respectively, which outperforms that of the Three-Dimensional Convolutional neural network (C3D) algorithm by 3.93 percentage points and 13.91 percentage points respectively. The proposed algorithm framework has excellent robustness and transfer learning ability, and can effectively handle multiple video behavior recognition tasks.

Infrared human target recognition method based on multi-feature dimensionality reduction and transfer learning

WANG Xin, ZHANG Xin, NING Chen

2019, 39(12): 3490-3495. DOI: 10.11772/j.issn.1001-9081.2019060982

To address the poor recognition accuracy and robustness for human targets caused by serious interference under infrared imaging conditions, an infrared human target recognition method based on multi-feature dimensionality reduction and transfer learning was proposed. Firstly, to solve the problem of incomplete information when traditional infrared human target feature extraction methods extract only a single feature, different kinds of heterogeneous features were extracted to fully exploit the characteristics of infrared human targets. Secondly, to provide an efficient and compact feature description for subsequent recognition, principal component analysis was used to reduce the dimensionality of the fused heterogeneous features. Finally, to solve problems such as poor generalization performance, caused by the lack of labeled human target samples in infrared images as well as the distributional and semantic deviations between the training samples and testing samples, an effective infrared human target classifier based on transfer learning was presented, which was able to greatly improve the generalization performance and the target recognition accuracy. The experimental results show that the recognition accuracy of the method on the infrared human target dataset exceeds 94%, which is better and more stable than that of methods using a single feature such as Histogram of Oriented Gradients (HOG) or Intensity Self Similarity (ISS) for feature representation, and of methods trained with traditional non-transfer classifiers such as Support Vector Machine (SVM) or K-Nearest Neighbors (KNN). Therefore, the method improves the performance of infrared human target recognition in real complex scenes.

Indoor crowd detection network based on multi-level features and hybrid attention mechanism

SHEN Wenxiang, QIN Pinle, ZENG Jianchao

2019, 39(12): 3496-3502. DOI: 10.11772/j.issn.1001-9081.2019061075

To handle the diversity of target scale and pose in indoor crowds and the confusion of head targets with surrounding objects, a new Network based on Multi-level Features and hybrid Attention mechanism for indoor crowd detection (MFANet) was proposed. It is composed of three parts: a feature fusion module, a multi-scale dilated convolution pyramid feature decomposition module, and a hybrid attention module. Firstly, by combining the information of shallow features and intermediate layer features, a fusion feature containing context information was formed to compensate for the lack of semantic information and the weak classification ability for small targets in the shallow feature map. Then, because dilated convolution enlarges the receptive field without adding parameters, it was used to perform multi-scale decomposition on the fusion features to form a new small target detection branch, allowing the network to locate and detect multi-scale targets. Finally, the local fusion attention module was used to integrate the global pixel correlation spatial attention and channel attention, enhancing the features that contribute most to the key information in order to better distinguish targets from the background. The experimental results show that the proposed method achieves an accuracy of 0.94, a recall rate of 0.91 and an F1 score of 0.92 on the indoor monitoring scene dataset SCUT-HEAD. All three are significantly better than those of other algorithms currently used for indoor crowd detection.
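
As a quick consistency check on the reported figures: if the 0.94 value is read as precision (an assumption, since the abstract calls it accuracy), the F1 score follows from the standard harmonic-mean formula.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract; treating 0.94 as precision is an assumption.
reported_f1 = round(f1_score(0.94, 0.91), 2)
```

Under that reading, the three reported numbers are mutually consistent after rounding to two decimals.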

Human skeleton key point detection method based on OpenPose-slim model

WANG Jianbing, LI Jun

2019, 39(12): 3503-3509. DOI: 10.11772/j.issn.1001-9081.2019050954

The OpenPose model, originally used for the detection of key points in the human skeleton, can greatly shorten the detection cycle while maintaining the accuracy of the Regional Multi-Person Pose Estimation (RMPE) model and the Mask Region-based Convolutional Neural Network (Mask R-CNN) model, which were proposed in 2017 and had near-optimal detection performance at that time. However, the OpenPose model suffers from a low parameter sharing rate, high redundancy, long running time and an overly large model size. To solve these problems, a new OpenPose-slim model was proposed. In the proposed model, the network width was reduced, the number of convolution block layers was decreased, the original parallel structure was changed into a sequential structure, and the Dense connection mechanism was added inside each module. The processing was mainly divided into three modules: 1) the position coordinates of human skeleton key points were detected in the key point localization module; 2) the key point positions were connected into limbs in the key point association module; 3) limb matching was performed to obtain the contour of the human body in the limb matching module. The processing stages are closely correlated. The experimental results on the MPII dataset, the Common Objects in COntext (COCO) dataset and the AI Challenger dataset show that the best structure uses four localization modules and two association modules, with the Dense connection mechanism inside each module. Compared with the OpenPose model, the test cycle of the proposed model is shortened to nearly 1/6, the parameter size is reduced by nearly 50%, and the model size is reduced to nearly 1/27.

Disguised voice detection method based on inverted Mel-frequency cepstral coefficient

LIN Xiaodan, QIU Yingqiang

2019, 39(12): 3510-3514. DOI: 10.11772/j.issn.1001-9081.2019050870

Voice disguise through pitch shifting is commonly used to conceal the identity of a speaker, and the many available voice changers make it easy to apply. To determine both whether a speech signal has been pitch-shifted and how it was modified (pitch-raised or pitch-lowered), the traces left by electronic voice disguise in the signal spectrum, especially in the high frequency region, were analyzed, and an electronic disguised voice detection method based on statistical moment features derived from the Inverted Mel-Frequency Cepstral Coefficient (IMFCC) was proposed. Firstly, the IMFCC and its first-order difference were extracted from each voice frame. Then, their statistical means were calculated. Finally, based on these statistical features, a Support Vector Machine (SVM) multi-class classifier was designed to identify the original voice, the pitch-raised voice and the pitch-lowered voice. The experimental results on the TIMIT and NIST voice datasets show that the proposed method performs satisfactorily in detecting original, pitch-raised and pitch-lowered voice signals. Compared with the baseline system that uses MFCC for feature construction, the method with the proposed features significantly increases the recognition rate of the disguise operation. The method also outperforms the Convolutional Neural Network (CNN) based framework when limited training data is available. Extensive experiments demonstrate that the proposed method generalizes well across different datasets and different disguising methods.

Environmental sound classification method based on Mel-frequency cepstral coefficient, deep convolution and Bagging

WANG Tianrui, BAO Qianyue, QIN Pinle

2019, 39(12): 3515-3521. DOI: 10.11772/j.issn.1001-9081.2019040678

The traditional environmental sound classification model does not fully extract the features of environmental sound, and the fully connected layer of a conventional neural network easily causes over-fitting when the network is used for environmental sound classification. To solve these problems, an environmental sound classification method combining Mel-Frequency Cepstral Coefficient (MFCC), deep convolution and the Bagging algorithm was proposed. Firstly, for the original audio file, the MFCC feature model was established by using pre-emphasis, windowing, discrete Fourier transform, Mel filter transformation and discrete cosine transform. Secondly, the feature model was input into the deep convolutional network for a second stage of feature extraction. Finally, based on reinforcement learning, the Bagging algorithm was adopted to integrate the linear discriminant analyzer, Support Vector Machine (SVM), softmax regression and eXtreme Gradient Boosting (XGBoost) models, which predicted from the network output by voting. The experimental results show that the proposed method can effectively improve the feature extraction ability for environmental sound and the resistance of the deep network to over-fitting in environmental sound classification.
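
The final voting step can be sketched independently of the individual classifiers. The sketch assumes simple unweighted majority voting over per-model label predictions. The abstract does not say whether votes are weighted, and the labels and predictions below are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """predictions: one list of labels per model, all of equal length; returns the voted labels."""
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of four base classifiers on three audio clips.
model_outputs = [
    ["dog", "rain", "siren"],
    ["dog", "rain", "drill"],
    ["cat", "rain", "siren"],
    ["dog", "wind", "siren"],
]
final_labels = majority_vote(model_outputs)
```

Voting lets the ensemble absorb the isolated mistakes of any single classifier, which is one way the combined model can resist the over-fitting of an individual network head.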

On-line path planning method of fixed-wing unmanned aerial vehicle

LIU Jia, QIN Xiaolin, XU Yang, ZHANG Lige

2019, 39(12): 3522-3527. DOI: 10.11772/j.issn.1001-9081.2019050863

By combining a fuzzy particle swarm optimization algorithm based on receding horizon control with an improved artificial potential field, an on-line path planning method for fixed-wing Unmanned Aerial Vehicles (UAV) in uncertain environments was proposed. Firstly, minimum circumscribed circle fitting was performed on the convex polygonal obstacles. Then, for static obstacles, the path planning problem was transformed into a series of on-line sub-problems in the time domain window, and the fuzzy particle swarm algorithm was applied to optimize and solve the sub-problems in real time, realizing static obstacle avoidance. When there were dynamic obstacles in the environment, the improved artificial potential field was used to accomplish dynamic obstacle avoidance by adjusting the path. To meet the dynamic constraints of fixed-wing UAV, a collision detection method for fixed-wing UAV was proposed to judge in advance whether the obstacles were real threat sources and to reduce the flight cost by decreasing the turning frequency and range. The simulation results show that the proposed method can effectively improve the planning speed, stability and real-time obstacle avoidance ability of fixed-wing UAV path planning, and that it overcomes the tendency of the traditional artificial potential field method to fall into local optima.
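
For context, the classical artificial potential field that the method improves on can be sketched in two dimensions: the goal attracts, and each circular obstacle repels within an influence distance of its boundary. This is the textbook formulation, not the improved field of the paper. The gains `k_att` and `k_rep` and the `influence` distance are arbitrary illustrative values.

```python
import math


def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, influence=5.0):
    """Net force at `pos`: attraction to `goal` plus repulsion from (x, y, radius) circles."""
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy, radius in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        center_dist = math.hypot(dx, dy)
        d = center_dist - radius  # distance to the obstacle boundary
        if 0 < d < influence:
            magnitude = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += magnitude * dx / center_dist
            fy += magnitude * dy / center_dist
    return fx, fy


free_force = apf_force((0.0, 0.0), (10.0, 0.0), [])
blocked_force = apf_force((0.0, 0.0), (10.0, 0.0), [(3.0, 0.0, 1.0)])
```

When an obstacle sits directly between the vehicle and the goal, attraction and repulsion act along the same line and can cancel out. That is the local-optimum trap of the traditional method mentioned in the abstract.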

Medical image fusion algorithm based on generative adversarial residual network

GAO Yuan, WU Fan, QIN Pinle, WANG Lifang

2019, 39(12): 3528-3534. DOI: 10.11772/j.issn.1001-9081.2019050937

In traditional medical image fusion, the fusion rules and parameters must be set manually using prior knowledge, which leads to uncertain fusion results and poor expression of detail. To solve these problems, a Computed Tomography (CT)/Magnetic Resonance (MR) image fusion algorithm based on an improved Generative Adversarial Network (GAN) was proposed. Firstly, the network structures of the generator and discriminator were improved. In the design of the generator network, residual blocks and shortcut connections were used to deepen the network structure, so as to better capture deep image information. Then, the down-sampling layer of the traditional network was removed to reduce the information loss during image transmission, batch normalization was replaced with layer normalization to better retain the source image information, and the depth of the discriminator network was increased to improve the network performance. Finally, the CT image and the MR image were concatenated and input into the generator network to obtain the fused image, the network parameters were continuously optimized through the loss function, and the model most suitable for medical image fusion was trained to generate high-quality images. The experimental results show that the proposed algorithm is superior to the Discrete Wavelet Transformation (DWT) algorithm, the NonSubsampled Contourlet Transform (NSCT) algorithm, the Sparse Representation (SR) algorithm and the Sparse Representation of classified image Patches (PSR) algorithm on Mutual Information (MI), Information Entropy (IE) and Structural SIMilarity (SSIM). The final fused image has rich texture and details. At the same time, the influence of human factors on the stability of the fusion effect is avoided.

Automatic recognition algorithm of cervical lymph nodes using adaptive receptive field mechanism

QIN Pinle, LI Pengbo, ZHANG Ruiping, ZENG Jianchao, LIU Shijie, XU Shaowei

2019, 39(12): 3535-3540. DOI: 10.11772/j.issn.1001-9081.2019061069

Deep learning network models applied to medical image target detection have only a fixed receptive field and cannot effectively detect cervical lymph nodes with obvious morphological and scale differences. To solve this problem, a new recognition algorithm based on an adaptive receptive field mechanism was proposed, applying deep learning to the automatic recognition of cervical lymph nodes in complete three-dimensional medical images for the first time. Firstly, the semi-random sampling method was used to crop the medical sequence images to generate grid-based local image blocks and the corresponding truth labels. Then, the DeepNode network based on the adaptive receptive field mechanism was constructed and trained with the local image blocks and labels. Finally, the trained DeepNode network model was used for prediction: with the whole sequence of images as input, the cervical lymph node recognition results for the whole sequence were obtained quickly and end-to-end. On the cervical lymph node dataset, the DeepNode network achieves a recall of 98.13% and a precision of 97.38% with only 29 false positives per scan, and its time consumption is relatively short. The analysis of the experimental results shows that, compared with current algorithms such as the combination of two-dimensional and three-dimensional convolutional neural networks, general three-dimensional object detection and weakly supervised localization-based recognition, the proposed algorithm realizes the automatic recognition of cervical lymph nodes and obtains the best recognition results. The algorithm is end-to-end, simple and efficient, can be easily extended to three-dimensional target detection tasks for other medical images, and can be applied to clinical diagnosis and treatment.

Pneumothorax detection and localization in X-ray images based on dense convolutional network

LUO Guoting, LIU Zhiqin, ZHOU Ying, WANG Qingfeng, CHENG Jiezhi, LIU Qiyu

2019, 39(12): 3541-3547. DOI: 10.11772/j.issn.1001-9081.2019050884

There are two main problems in pneumothorax detection in X-ray images. First, pneumothorax usually overlaps with tissues such as ribs and clavicles in X-ray images, which easily causes missed diagnosis, and the performance of the existing pneumothorax detection methods remains to be improved. Second, convolutional neural network-based algorithms cannot indicate the suspicious pneumothorax area and therefore lack interpretability. To address these problems, a novel method combining Dense convolutional Network (DenseNet) and gradient-weighted class activation mapping was proposed. Firstly, a large-scale chest X-ray dataset named PX-ray was constructed for model training and testing. Secondly, the output node of DenseNet was modified and a sigmoid function was added after the fully connected layer to classify the chest X-ray images. In the training process, the weight of the cross entropy loss function was set to alleviate the problem of data imbalance and improve the accuracy of the model. Finally, the parameters of the last convolutional layer of the network and the corresponding gradients were extracted, and the pneumothorax areas were roughly located by gradient-weighted class activation mapping. The experimental results show that the proposed method has a detection accuracy of 95.45%, has Area Under Curve (AUC), sensitivity and specificity all higher than 0.9, outperforms the classic algorithms VGG19, GoogLeNet and ResNet, and realizes the visualization of the pneumothorax area.
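
The class-weighting step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state how the weight was chosen, so the `pos_weight` parameter and the function names below are assumptions made only for illustration: the loss term of the rarer positive class is simply scaled up.

```python
import math

def sigmoid(z):
    """Logistic function applied to the network output (logit)."""
    return 1.0 / (1.0 + math.exp(-z))

def weighted_bce(y_true, y_prob, pos_weight=1.0, eps=1e-12):
    """Mean binary cross entropy with the positive-class term scaled by pos_weight.

    pos_weight is a hypothetical parameter: values above 1 make errors on
    positive (e.g. pneumothorax) samples cost more than errors on negatives.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

With `pos_weight=2.0`, a positive sample predicted at probability 0.5 contributes twice the loss of a negative sample predicted at the same probability.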

Handwritten numeral recognition under edge intelligence background

WANG Jianren, MA Xin, DUAN Ganglong, XUE Hongquan

2019, 39(12): 3548-3555. DOI: 10.11772/j.issn.1001-9081.2019050869

With the rapid development of edge intelligence, existing convolutional network models for handwritten numeral recognition have become less and less suitable for edge deployment and its reduced computing power, and suffer from problems such as poor generalization on small samples and high network training costs. Drawing on the classic structure of Convolutional Neural Network (CNN), the Leaky_ReLU algorithm, the dropout algorithm, the genetic algorithm and the ideas of adaptive and mixed pooling, a handwritten numeral recognition model based on the improved convolutional neural network LeNet-DL was constructed. The proposed model was compared with LeNet, LeNet+sigmoid, AlexNet and other algorithms on the large-sample MNIST dataset and the small-sample REAL dataset. The improved network has a large-sample recognition accuracy of up to 99.34%, a performance improvement of about 0.83%, and a small-sample recognition accuracy of up to 78.89%, a performance improvement of about 8.34%. The experimental results show that, compared with the traditional CNN, the LeNet-DL network has lower training cost, better performance and stronger generalization ability on both large-sample and small-sample datasets.
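
Two of the building blocks named in the abstract, Leaky ReLU and mixed pooling, can be sketched in a few lines. The slope `alpha` and the mixing factor `lam` are illustrative assumptions, and mixed pooling is shown in its common form as a blend of max and average pooling; the abstract does not give the exact variant used in LeNet-DL.

```python
def leaky_relu(x, alpha=0.01):
    """Pass positive inputs through and scale negative inputs by a small slope."""
    return x if x > 0 else alpha * x

def mixed_pool(window, lam=0.5):
    """Blend max pooling and average pooling over one pooling window.

    lam = 1.0 gives pure max pooling, lam = 0.0 gives pure average pooling.
    """
    return lam * max(window) + (1.0 - lam) * (sum(window) / len(window))
```

For the window `[1, 2, 3, 6]` and `lam=0.5`, the maximum is 6 and the mean is 3, so the pooled value is 4.5.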

Attribute revocation and verifiable outsourcing supported multi-authority attribute-based encryption scheme

MING Yang, HE Baokang

2019, 39(12): 3556-3562. DOI: 10.11772/j.issn.1001-9081.2019061019

To address the large decryption overhead of data users and the lack of effective attribute revocation in the Multi-Authority Attribute-Based Encryption (MA-ABE) access control scheme in cloud storage, an MA-ABE scheme supporting attribute revocation and verifiable outsourcing was proposed. Firstly, the data user's decryption overhead was markedly reduced and the integrity of the data was verified by using verifiable outsourcing technology. Then, bilinear mapping was used to protect the access policy, preventing the identity of the data owner from leaking. Finally, the version key of each attribute was used to realize immediate attribute revocation. The security analysis shows that the proposed scheme is secure under the decisional q-bilinear Diffie-Hellman exponent assumption in the standard model, achieves forward security and is able to resist collusion attacks. The performance analysis shows that the proposed scheme has great advantages in terms of functionality and computational cost. Therefore, this scheme is more suitable for the multi-authority attribute-based encryption environment in cloud storage.

Revocable identity-based encryption scheme with outsourcing decryption and member revocation

WANG Zhanjun, MA Haiying, WANG Jinhua, LI Yan

2019, 39(12): 3563-3568. DOI: 10.11772/j.issn.1001-9081.2019071215

To address the low key updating efficiency and high decryption cost of Revocable Identity-Based Encryption (RIBE), which make it unsuitable for lightweight devices, an RIBE scheme with Outsourcing Decryption and member revocation (RIBE-OD) was proposed. Firstly, a full binary tree was created and a random degree-one polynomial was picked for each node of the tree. Then, by combining the IBE scheme based on the exponential inverse model with the full subtree method, the degree-one polynomials were used to create the private keys of all users and the update keys of the unrevoked users, so that revoked users lost their decryption ability because they could not obtain update keys. Next, the private key generation algorithm was modified with the outsourcing decryption technique and a ciphertext transformation algorithm was added, so that the majority of the decryption computation was securely outsourced to cloud servers and lightweight devices were able to decrypt ciphertexts with only a small amount of simple computation. Finally, the proposed scheme was proved secure under the Decisional Bilinear Diffie-Hellman Inversion (DBDHI) assumption. Compared with the Boldyreva-Goyal-Kumar (BGK) scheme, the proposed scheme not only improves the efficiency of key updating by 85.7%, but also reduces the decryption cost of lightweight devices to one elliptic curve exponentiation, so it is suitable for lightweight devices to decrypt ciphertexts.
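
As background for the binary-tree revocation step, the following sketch shows the node-selection rule commonly used with subtree-based revocation in the RIBE literature: pick the smallest set of tree nodes that covers every unrevoked leaf and no revoked leaf. It is a generic illustration under the assumption of a heap-indexed full binary tree (root 1, leaves `num_leaves` to `2 * num_leaves - 1`); the function name is hypothetical and no cryptographic operation of the proposed scheme is reproduced.

```python
def ku_nodes(num_leaves, revoked_leaves):
    """Return the tree nodes for which update keys would be issued.

    num_leaves must be a power of two; revoked_leaves holds leaf positions
    in the range 0 .. num_leaves - 1.
    """
    root = 1
    marked = set()  # every node on a path from a revoked leaf to the root
    for leaf in revoked_leaves:
        node = num_leaves + leaf
        while node >= root:
            marked.add(node)
            node //= 2
    if not marked:
        return {root}  # nobody revoked: the root covers all users
    cover = set()
    for node in marked:
        if node < num_leaves:  # internal node: inspect both children
            for child in (2 * node, 2 * node + 1):
                if child not in marked:
                    cover.add(child)
    return cover
```

With four users and the second user revoked, the selected nodes are the first user's leaf and the subtree root that covers the last two users, so the revoked user shares no node with the update set.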

Improved decision diagram for attribute-based access control policy evaluation and management

LUO Xiaofeng, YANG Xingchun, HU Yong

2019, 39(12): 3569-3574. DOI: 10.11772/j.issn.1001-9081.2019040603

The Multi-data-type Interval Decision Diagram (MIDD) approach expresses and handles the critical marks of attributes incorrectly, and expresses and handles obligations and advices ambiguously, resulting in inconsistent node expression and increased processing complexity. Aiming at these problems, some improvements and extensions were proposed. Firstly, the graph nodes in MIDD, which take the entity attribute as the unit, were converted to nodes that take the element as the unit, so that the elements of attribute-based access control policy were able to be represented accurately, and the problem of handling the critical marks was solved. Secondly, the obligations and advices were treated as elements and expressed by nodes. Finally, the combining algorithms of rules and policies were added to the decision nodes, so that the Policy Decision Point (PDP) was able to use them to make decisions on access requests. The analysis results show that the time and space complexity of the proposed approach is similar to that of the original approach. The comparative simulation of the two approaches shows that when each attribute has only one subsidiary attribute (the most general application situation), the difference in average decision time per access request between the two approaches is at the 0.01 μs level. This confirms the complexity analysis and indicates that the performances of the two approaches are similar. Simulation on the number of subsidiary attributes shows that, even with 10 subsidiary attributes (very rare in practical applications), the difference in average decision time between the two approaches remains at the same order of magnitude. The proposed approach not only ensures the correctness, consistency and convenience of the original approach, but also extends its application scope from eXtensible Access Control Markup Language (XACML) policy to general attribute-based access control policies.

Dynamic cloud data audit model based on nest Merkle Hash tree block chain

ZHOU Jian, JIN Yu, HE Heng, LI Peng

2019, 39(12): 3575-3583. DOI: 10.11772/j.issn.1001-9081.2019040764

Cloud storage is popular with users for its high scalability, high reliability and low-cost data management. However, safeguarding the integrity of cloud data is an important security problem. Currently, providing public auditing services based on a semi-trusted third party is the most popular and effective cloud data integrity audit scheme, but it still has shortcomings such as single point of failure, computing power bottlenecks and inefficient positioning of erroneous data. Aiming at these defects, a dynamic cloud data audit model based on block chain was proposed. Firstly, a distributed network and a consensus algorithm were used to establish a block chain audit network with multiple audit entities to solve the problems of single point of failure and computing power bottlenecks. Then, while guaranteeing the reliability of the block chain, the chameleon Hash algorithm and the nest Merkle Hash Tree (MHT) structure were introduced to realize the dynamic operation of cloud data tags in the block chain. Finally, by using the nest MHT structure and auxiliary path information, the efficiency of erroneous data positioning was increased when errors occurred in the audit procedure. The experimental results show that, compared with the semi-trusted third-party cloud data dynamic audit scheme, the proposed model significantly improves the audit efficiency, reduces the time cost of dynamic data operations and increases the efficiency of erroneous data positioning.
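
The role of the auxiliary path in a Merkle Hash Tree can be shown with a minimal sketch. It uses a plain (not nested) MHT over SHA-256 and does not include the chameleon hash; the function names and the rule of pairing an odd node with itself are assumptions made for illustration, not details taken from the paper.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return all tree levels, leaves first; an odd last node is paired with itself."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def auth_path(levels, index):
    """Sibling hashes from leaf to root, each tagged with whether the sibling is on the left."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(block, path, root):
    """Recompute the root from one block and its auxiliary path."""
    node = h(block)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

An auditor holding only the root can check a single block with a logarithmic number of hashes, and a mismatch along the path narrows down which subtree contains the erroneous block.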

Solving random constraint satisfaction problems based on tabu search algorithm

LI Feilong, ZHAO Chunyan, FAN Rumeng

2019, 39(12): 3584-3589. DOI: 10.11772/j.issn.1001-9081.2019050834

A novel algorithm based on tabu search and combined with simulated annealing was proposed to solve the random Constraint Satisfaction Problem (CSP) with growing domain. Firstly, tabu search was used to obtain a set of initial heuristic assignments; that is, a set of candidate solutions was constructed from a randomly initialized feasible solution through its neighborhood, and the tabu table was then used to move the candidate solutions in the direction of minimizing the objective function value. If the obtained optimal assignment was not a solution of the problem, it was used as the initial heuristic assignment, and simulated annealing was then performed to correct the set of assignments until the global optimal solution was obtained. The numerical experiments demonstrate that the proposed algorithm can effectively find solutions of the problem when approaching its theoretical phase transition threshold, and that it shows obvious superiority over other local search algorithms. The proposed algorithm can be applied to the algorithm design of random CSP.

Computation offloading policy for delay-sensitive Internet of things applications

GUO Mian, LI Qiqi

2019, 39(12): 3590-3596. DOI: 10.11772/j.issn.1001-9081.2019050891

The large network transmission delay and high energy consumption in cloud computing, as well as the limited computing resources in edge servers, are the bottlenecks for the development of delay-sensitive Internet of Things (IoT) applications. In order to improve the Quality of Service (QoS) of IoT applications while achieving green computing for computing systems, an edge-cloud cooperation Drift-plus-Penalty-based Computation Offloading (DPCO) policy was proposed. Firstly, by constructing the IoT-Edge-Cloud model, the business model, the transmission delay and computation delay of the computation jobs, and the computation energy and transmission energy generated by the system were modeled mathematically. Then, with the system energy consumption and the average job delay as the optimization objectives and the queueing stability of the edge servers as the constraint, the edge-cloud cooperation computation offloading optimization model was built. After that, with the optimization objectives as the penalty function, the drift-plus-penalty function properties of the computation offloading optimization model were analyzed based on Lyapunov stability theory. Finally, DPCO was proposed based on the above results: the long-term energy consumption per unit time and the average system delay were reduced by selecting, in every time slot, the computation offloading policy that minimizes the current drift-plus-penalty function. In comparison with the Light Fog Processing (LFP), benchmark Edge Computing (EC) and Cloud Computing (CC) policies, DPCO consumes the lowest system energy, which is 2/3 of that of the CC policy; DPCO also provides the shortest average job delay, which is 1/5 of that of the CC policy. The experimental results show that DPCO can efficiently reduce the energy consumption of the edge-cloud computing system, shorten the end-to-end delay of computation jobs, and satisfy the QoS requirements of delay-sensitive IoT applications.
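
The per-slot rule of a drift-plus-penalty policy can be sketched generically. In each slot the controller picks the action that minimizes `V * penalty + Q * (arrival - service)`, where `Q` is the current queue backlog. The action fields, the single edge queue and the weight `v` are illustrative assumptions and do not reproduce the DPCO model itself.

```python
def choose_action(queue, actions, v):
    """Pick the action that minimizes V * penalty + Q * (arrival - service).

    Each action is a dict with hypothetical keys:
      penalty - cost to be minimized in this slot (e.g. energy plus delay)
      arrival - work added to the edge queue by this action
      service - work removed from the edge queue in this slot
    """
    return min(
        actions,
        key=lambda a: v * a["penalty"] + queue * (a["arrival"] - a["service"]),
    )

def update_queue(queue, action):
    """Standard queue recursion: the backlog never drops below zero."""
    return max(queue + action["arrival"] - action["service"], 0.0)
```

A small backlog favors the cheap action even though it grows the queue; once the backlog is large, the drift term dominates and the controller switches to the action that drains the queue, which is how the rule trades cost against stability.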

Multi-objective optimization algorithm for virtual machine placement under cloud environment

LIN Kaiqing, LI Zhihua, GUO Shujie, LI Shuangli

2019, 39(12): 3597-3603. DOI: 10.11772/j.issn.1001-9081.2019050808

Virtual Machine Placement (VMP) is the core of virtual machine consolidation and is a multi-objective optimization problem with multiple resource constraints. An efficient VMP algorithm can significantly reduce energy consumption, improve resource utilization and guarantee Quality of Service (QoS). Concerning the problems of high energy consumption and low resource utilization in data centers, a Discrete Bat Algorithm-based Virtual Machine Placement (DBA-VMP) algorithm was proposed. Firstly, a multi-objective constrained optimization model was established for VMP, with minimum energy consumption and maximum resource utilization as the optimization objectives. Then, by emulating the pheromone sharing mechanism of artificial ant colonies in the foraging process, a pheromone feedback mechanism was introduced into the bat algorithm, and the bat algorithm was improved and discretized. Finally, the improved discrete bat algorithm was used to solve for the Pareto optimal solutions of the model. The experimental results show that, compared with other multi-objective optimization algorithms for VMP, the proposed algorithm can effectively reduce energy consumption and improve resource utilization, and achieves an optimal balance between the two under the premise of guaranteeing QoS.
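
The notion of a Pareto optimal solution used above can be made concrete with a short sketch. Each candidate placement is represented here by a tuple of objective values that are all to be minimized (for example energy consumption and wasted resource share); this representation is an assumption for illustration, and the bat algorithm itself is not shown.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Given the candidates `(10, 0.3)`, `(12, 0.2)` and `(11, 0.4)`, the third is worse than the first in both objectives and is discarded, while the first two remain because each is better in one objective.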

Review of network protocol recognition techniques

FENG Wenbo, HONG Zheng, WU Lifa, FU Menglin

2019, 39(12): 3604-3614. DOI: 10.11772/j.issn.1001-9081.2019050949

Since the protocol classification of network traffic is a prerequisite for protocol analysis and network management, network protocol recognition techniques were researched and reviewed. Firstly, the target of network protocol recognition was described, and the general process of protocol recognition was analyzed. The practical requirements for protocol recognition were discussed, and the criteria for evaluating protocol recognition methods were given. Then, the research status of network protocol recognition techniques was summarized in two categories, packet-based protocol recognition methods and flow-based protocol recognition methods, and the various techniques used for protocol recognition were analyzed and compared. Finally, considering the defects of current protocol recognition methods and the practical application requirements, the research trend of protocol recognition techniques was forecasted.

Application protocol recognition method based on convolutional neural network

FENG Wenbo, HONG Zheng, WU Lifa, LI Yihao, LIN Peihong

2019, 39(12): 3615-3621. DOI: 10.11772/j.issn.1001-9081.2019060977

To solve the problems in traditional network protocol recognition methods, such as the difficulty of manual feature extraction and low recognition accuracy, an application protocol recognition method based on Convolutional Neural Network (CNN) was proposed. Firstly, the raw network data was divided according to Transmission Control Protocol (TCP) connection or User Datagram Protocol (UDP) interaction, and the network flows were extracted. Secondly, each network flow was converted into a two-dimensional matrix through data preprocessing to facilitate the CNN analysis. Then, a CNN model was trained using the training set to extract protocol features automatically. Finally, the trained CNN model was used to recognize the application network protocols. The experimental results show that the overall recognition accuracy of the proposed method is about 99.70%, so it can effectively recognize the application protocols.
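
The preprocessing step that turns a flow into a two-dimensional matrix can be sketched as follows. The abstract does not give the matrix size or the padding rule, so the 28 x 28 default, the zero padding and the scaling to the range 0 to 1 are assumptions chosen only to make the idea concrete.

```python
def flow_to_matrix(payload: bytes, side: int = 28):
    """Truncate or zero-pad a flow's bytes to side*side and reshape them row by row.

    Each byte is scaled to [0, 1] so the matrix can be fed to a CNN like a
    single-channel image.
    """
    n = side * side
    data = payload[:n].ljust(n, b"\x00")  # cut long flows, pad short ones
    return [[data[r * side + c] / 255.0 for c in range(side)] for r in range(side)]
```

Fixing the input size this way lets flows of any length share one network input shape, at the cost of discarding bytes beyond the first `side * side`.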

Data-aided time-domain joint auto-correlation and cross-correlation frequency offset estimation method

WANG Sixiu

2019, 39(12): 3622-3627. DOI: 10.11772/j.issn.1001-9081.2019040584

Considering the problems of low accuracy and high complexity of frequency offset estimation in data-aided burst data communications, a data-aided time-domain joint auto-correlation and cross-correlation frequency offset estimation method was proposed. Firstly, a frequency offset estimation Cramer-Rao Bound (CRB) based on a general data frame structure was derived, and a CRB with a simpler form was introduced as the performance bound of the estimation algorithm. Then, in the auto-correlation frequency offset estimation, an auto-correlation algorithm with a large estimation range and a low signal-to-noise ratio threshold was obtained using the auto-correlation operator and the exponent approximation of a complex signal; in the cross-correlation frequency offset estimation, a cross-correlation algorithm with low complexity and high accuracy was obtained by means of the cross-correlation operator and the principle of auto-correlation estimation. The simulation results show that the proposed method can estimate carrier frequency offsets as large as half of the symbol rate with near-CRB performance. Compared to the classic M&M (Mengali & Morelli) algorithm, its estimation accuracy is improved by five times, and its complexity, measured in real multiplication operations, is linear in the pilot length, which makes it suitable for the engineering applications of burst data communications.
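
The basic principle behind auto-correlation frequency offset estimation can be shown with a textbook single-lag estimator. After the known pilots are removed, the samples rotate by a constant phase per symbol, and the angle of the lag-`m` auto-correlation gives the offset as `f = arg(R(m)) / (2 * pi * m * T)`. This is a generic sketch, not the joint estimator of the paper; the function name and parameters are assumptions.

```python
import cmath
import math

def autocorr_cfo(received, pilots, symbol_period, lag=1):
    """Estimate the carrier frequency offset in Hz from known pilot symbols.

    The unambiguous range is |f| < 1 / (2 * lag * symbol_period), which for
    lag = 1 is half of the symbol rate.
    """
    # Remove the pilot modulation so that only the offset rotation remains.
    z = [r * p.conjugate() for r, p in zip(received, pilots)]
    # Sum of lag products; its phase is 2*pi*f*lag*T for a noiseless signal.
    acc = sum(z[k + lag] * z[k].conjugate() for k in range(len(z) - lag))
    return cmath.phase(acc) / (2.0 * math.pi * lag * symbol_period)
```

A larger lag improves accuracy in noise but shrinks the unambiguous range, which is the usual trade-off that motivates combining a coarse wide-range stage with a finer second stage.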

Bandwidth control mechanism for Docker container network based on traffic control

WANG Zhiwei, YANG Chao

2019, 39(12): 3628-3632. DOI: 10.11772/j.issn.1001-9081.2019040765

As the Docker container lacks the ability to limit network bandwidth resources, a bandwidth control mechanism based on Traffic Control (TC) was proposed for the Docker container network. Firstly, based on the real-time monitoring mechanism of the CGroups file system, the Virtual File System (VFS) of the Linux kernel was used as a medium to pass the network control parameters set when the Docker container was created to the Linux kernel controller TC. Then, the Intermediate Functional Block device (IFB) module was introduced to achieve uplink and downlink bandwidth control, and the parameters (rate, ceil and prio) were used to achieve idle bandwidth sharing and container priority control. Finally, the specific network limitations were enforced by controlling TC, and flexible network resource control between containers was realized. The experimental results show that the proposed mechanism can effectively limit the actual container bandwidth within a 2% fluctuation range in the exclusive bandwidth scenario, and can precisely limit the network bandwidth of the container with an average error of 0.5% in the shared idle bandwidth scenario. Meanwhile, the mechanism can flexibly manage resources based on priorities. With the advantage of providing a more native interface for Docker and requiring no additional tools, this mechanism can provide a convenient and effective solution for fine-grained elastic network resource control on Docker-based cloud platforms.
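
For readers unfamiliar with the `rate`, `ceil` and `prio` parameters and with IFB, the sketch below builds the kind of `tc` command lines typically used by hand for HTB shaping and for redirecting incoming traffic to an IFB device. It only constructs argument lists and does not run them. The paper passes parameters to TC through CGroups and VFS instead, so this is an illustration of the underlying TC concepts rather than of the proposed mechanism; the device names, handle numbers and helper names are assumptions, and the IFB device is assumed to exist already.

```python
def htb_limit_cmds(dev, rate_mbit, ceil_mbit, prio, classid="1:10"):
    """Commands that attach an HTB qdisc and one class with rate, ceil and prio.

    rate is the guaranteed bandwidth, ceil the maximum the class may borrow
    up to when bandwidth is idle, and a lower prio value is served first.
    """
    default_minor = classid.split(":")[1]
    return [
        ["tc", "qdisc", "add", "dev", dev, "root", "handle", "1:",
         "htb", "default", default_minor],
        ["tc", "class", "add", "dev", dev, "parent", "1:", "classid", classid,
         "htb", "rate", f"{rate_mbit}mbit", "ceil", f"{ceil_mbit}mbit",
         "prio", str(prio)],
    ]

def ifb_redirect_cmds(dev, ifb="ifb0"):
    """Commands that redirect traffic arriving on dev to an IFB device,
    where it can then be shaped like outgoing traffic."""
    return [
        ["ip", "link", "set", "dev", ifb, "up"],
        ["tc", "qdisc", "add", "dev", dev, "handle", "ffff:", "ingress"],
        ["tc", "filter", "add", "dev", dev, "parent", "ffff:", "protocol", "ip",
         "u32", "match", "u32", "0", "0",
         "action", "mirred", "egress", "redirect", "dev", ifb],
    ]
```

Calling `htb_limit_cmds(ifb, ...)` on the IFB device after the redirect is the usual way to apply the same rate and ceil limits to the incoming direction.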

Hybrid defect prediction model based on network representation learning

LIU Chengbin, ZHENG Wei, FAN Xin, YANG Fengyu

2019, 39(12): 3633-3638. DOI: 10.11772/j.issn.1001-9081.2019061028

Aiming at the problem of the dependence between software system modules, a hybrid defect prediction model based on network representation learning was constructed by analyzing the network structure of the software system. Firstly, the software system was converted into a software network on a module-by-module basis. Then, a network representation technique was used to perform unsupervised learning on the system structural feature of each module in the software network. Finally, the system structural features and the semantic features learned by the convolutional neural network were combined to construct a hybrid defect prediction model. The experimental results show that the hybrid defect prediction model has better defect prediction effects on three Apache open source projects, poi, lucene and synapse, and its F1 index is 3.8%, 1.0% and 4.1% higher, respectively, than that of the optimal model based on Convolutional Neural Network (CNN). Software network structure feature analysis provides an effective research idea for the construction of defect prediction models.

Flowchart automatic generation algorithm base on Sugiyama

LIANG Bai'ou

2019, 39(12): 3639-3643. DOI: 10.11772/j.issn.1001-9081.2019050909

In order to solve the problem of low efficiency of flowchart drawing and better guarantee the consistency of software model, document and code, an algorithm for automatic generation of flowchart was proposed. Firstly, by analyzing the C/C++ source code in reverse, the Token list of the code was extracted, and the Scope tree was created to realize the flowchart generation. At the same time, a method for regulating the annotation of code functions was proposed, improving the comprehensibility of the flowchart. Finally, the readable flowchart was generated after the automatic layout of flowchart by applying the Sugiyama layout algorithm and completing and improving the coordinate designation step. In the actual application process, with the use of the proposed algorithm, the efficiency of writing software design documents is effectively improved and the consistency of the software model, document and code is guaranteed.

Vehicle-based image super-resolution reconstruction based on weight quantification and information compression

XU Dezhi, SUN Jifeng, LUO Shasha

2019, 39(12): 3644-3649. DOI: 10.11772/j.issn.1001-9081.2019050804

In the intelligent driving field, it is necessary to obtain high-quality super-resolution images under the condition of limited memory. Therefore, a vehicle-based image super-resolution reconstruction algorithm based on weighted eight-bit binary quantization was proposed. Firstly, the information compression module was designed based on the eight-bit binary quantization convolution, reducing the internal redundancy, enhancing the information flow in the network, and improving the reconstruction rate. Then, the whole network was composed of a feature extraction module, a plurality of stacked information compression modules and an image reconstruction module, and the information of the interpolated super-resolution space was fused with the image reconstructed from the low-resolution space, improving the network expression ability without increasing the complexity of the model. Finally, the entire network structure in the algorithm was trained based on the Generative Adversarial Network (GAN) framework, giving the image a better subjective visual effect. The experimental results show that the Peak Signal-to-Noise Ratio (PSNR) of the proposed algorithm for the reconstructed vehicle-based image is 0.22 dB higher than that of Super-Resolution using GAN (SRGAN), its generated model size is reduced to 39% of that of the Laplacian pyramid Networks for fast and accurate Super-Resolution (LapSRN), and its reconstruction speed is 7.57 times that of LapSRN.

Component substitution-based fusion method for remote sensing images via improving spatial detail extraction scheme

WANG Wenqing, LIU Han, XIE Guo, LIU Wei

2019, 39(12): 3650-3658. DOI: 10.11772/j.issn.1001-9081.2019061063

Concerning the spatial and spectral distortions caused by the local spatial dissimilarity between the multispectral and panchromatic images, a component substitution-based remote sensing image fusion method was proposed via improving the spatial detail extraction scheme. Different from the classical spatial detail extraction methods, the proposed method synthesized a high-resolution intensity image to replace the panchromatic image in spatial detail extraction, with the aim of acquiring spatial detail information that matches the multispectral image. Firstly, according to the manifold consistency between the low-resolution intensity image and the high-resolution intensity image, a locally linear embedding-based reconstruction method was used to reconstruct the first high-resolution intensity image. Secondly, after decomposing the low-resolution intensity image and the panchromatic image with the wavelet technique respectively, the low-frequency information of the low-resolution intensity image and the high-frequency information of the panchromatic image were retained, and the inverse wavelet transformation was performed to reconstruct the second high-resolution intensity image. Thirdly, sparse fusion was performed on the two high-resolution intensity images to acquire the high-quality intensity image. Finally, the synthesized high-resolution intensity image was input into the component substitution-based fusion framework to obtain the fused image. The experimental results show that, compared with eleven other fusion methods, the proposed method produces fused images with higher spatial resolution and lower spectral distortion. For the proposed method, the mean values of the objective evaluation indexes, namely correlation coefficient, root mean squared error, Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS), spectral angle mapper and quaternion theory-based quality index, on three groups of GeoEye-1 fused images are 0.9439, 24.3479, 2.7643, 3.9376 and 0.9082 respectively. These values are better than those of the other eleven fusion methods. The proposed method can efficiently reduce the effect of local spatial dissimilarity on the performance of the component substitution-based fusion framework.

Real-time multi-face landmark localization algorithm based on deep residual and feature pyramid neural network

XIE Jinheng, ZHANG Yansheng

2019, 39(12): 3659-3664. DOI: 10.11772/j.issn.1001-9081.2019040600

Most face landmark detection algorithms include two steps, face detection and face landmark localization, which increases the processing time. Aiming at this problem, a one-step and real-time algorithm for multi-face landmark localization was proposed. Firstly, the corresponding heatmaps were generated from the face landmark coordinates as data labels. Then, a deep residual network was used to realize the early feature extraction of the image, and a feature pyramid network was used to fuse the features representing receptive fields of different scales at different network depths. Finally, based on intermediate supervision, multiple landmark prediction networks were cascaded to realize one-step coarse-to-fine facial landmark regression without face detection. While localizing with high accuracy, a forward propagation of the proposed algorithm takes only about 0.0075 s (133 frames per second), satisfying the requirement of real-time facial landmark localization. The proposed algorithm achieves a mean error of 6.06% and a failure rate of 11.70% on the Wider Facial Landmarks in-the-Wild (WFLW) dataset.
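
Turning landmark coordinates into heatmap labels is commonly done by placing a two-dimensional Gaussian at each landmark. The abstract does not specify the kernel, so the Gaussian form, the `sigma` value and the function names below are assumptions for illustration.

```python
import math

def landmark_heatmap(width, height, cx, cy, sigma=1.5):
    """Return a height x width grid whose values peak at 1.0 on the landmark (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def heatmap_argmax(heatmap):
    """Recover the (x, y) position of the strongest response."""
    best = max((v, x, y) for y, row in enumerate(heatmap) for x, v in enumerate(row))
    return best[1], best[2]
```

At inference time the same argmax step converts each predicted heatmap back into a landmark coordinate, which is why no separate face detector is needed to define a crop.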

Plant image segmentation method under bias light based on convolutional neural network

ZHANG Wenbin, ZHU Min, ZHANG Ning, DONG Le

2019, 39(12): 3665-3672. DOI: 10.11772/j.issn.1001-9081.2019040637

To solve the problems of low precision and poor generalization performance of traditional image segmentation algorithms on plant images under bias light in plant factories, a method based on neural network and deep learning for accurately segmenting plant images under artificial bias light in plant factories was proposed. With this method, the segmentation accuracy on the original test set of bias light plant images is 91.89%, far superior to that of other segmentation algorithms such as Fully Convolutional Network (FCN), clustering, thresholding and region growing. In addition, this method has better segmentation effect and generalization performance than the above methods on plant images under lights of different colors. The experimental results show that the proposed method can significantly improve the accuracy of plant image segmentation under bias light, and can be applied to practical plant factory projects.

Direction-perception feature recognition on mesh model

GUO Yihui, HUANG Chenghui, ZHONG Xueling, LU Jiyuan

2019, 39(12): 3673-3677. DOI: 10.11772/j.issn.1001-9081.2019050799

Existing feature detection methods have difficulty extracting features on the smooth regions of mesh models and cannot recognize feature vertices distributed only along one specific direction. To solve these problems, a direction-perception method of feature recognition on mesh models was proposed. Firstly, the changes of the normal vectors of the surfaces adjacent to each mesh vertex were detected in the x, y and z directions separately. With a suitable threshold set, if the change of the normal vectors exceeded the threshold in any direction, the vertex was recognized as a feature vertex. Then, to address the problem that existing mesh model feature detection algorithms cannot recognize the terraced field structure distributed only along the z-axis of a three-dimensional medical model, the algorithm detected the change of the normal vectors of the adjacent surfaces along the z-axis direction only, and recognized the vertex as a terraced field structure vertex once the change exceeded the threshold. The abnormal terraced field structures were successfully separated from the normal structures of the human body. The experimental results show that, compared with the dihedral angle method, the proposed method can identify the features of the mesh model better under the same conditions. The proposed method solves the problem that the dihedral angle method cannot effectively identify the feature vertices on smooth regions without obvious broken lines, solves the problem that existing mesh model feature detection algorithms cannot distinguish the abnormal terraced field structures from the normal human body structures due to the lack of direction detection ability, and lays a foundation for the subsequent digital geometry processing of the medical model.
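
The per-direction threshold test described above can be sketched as follows. The abstract does not define how the "change" of the adjacent-surface normals is measured, so this sketch assumes it is the spread (maximum minus minimum) of each normal component over the faces around a vertex; the function names are hypothetical.

```python
def axis_changes(normals):
    """Spread of the x, y and z normal components over the adjacent faces.

    Assumption: "change" is taken as max minus min of each component;
    the abstract does not give the exact measure.
    """
    return tuple(
        max(n[axis] for n in normals) - min(n[axis] for n in normals)
        for axis in range(3)
    )


def is_feature_vertex(normals, threshold):
    """Feature vertex: the change exceeds the threshold in any direction."""
    return any(change > threshold for change in axis_changes(normals))


def is_terraced_vertex(normals, threshold):
    """Terraced-structure vertex: only the z-direction change is tested."""
    return axis_changes(normals)[2] > threshold
```

Testing each axis separately is what allows the z-only variant to pick out structures that vary along one direction while ignoring changes along the other two.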

Railway crew rostering plan based on improved ant colony optimization algorithm

WANG Dongxian, MENG Xuelei, HE Guoqiang, SUN Huiping, WANG Xidong

2019, 39(12): 3678-3684. DOI: 10.11772/j.issn.1001-9081.2019061118

In order to improve the quality and efficiency of railway crew rostering plan arrangement, the problem of crew rostering plan arrangement was abstracted as a Multi-Traveling Salesman Problem (MTSP) with a single base and mid-way rest taken into account. A single-circulation mathematical model of the crew rostering plan was established, aiming at the shortest rostering period and the most balanced distribution of redundant connection time between crew routings, and a new amended heuristic ant colony optimization algorithm was proposed for the model. Firstly, a solution space satisfying the spatial-temporal constraints was constructed, and the pheromone concentration was set for the crew routing nodes and the continued paths respectively. Then, the amended heuristic information was adopted to make the ants start at the crew routing order and go through all the crew routings. Finally, the optimal crew rostering plan was selected from the different crew rostering schemes. The proposed model and algorithm were tested on the data of the intercity railway from Guangzhou to Shenzhen. The comparison with the plan arranged by particle swarm optimization shows that, under the same model conditions, the crew rostering plan arranged by the amended heuristic ant colony optimization algorithm reduces the average monthly man-hours by 8.5%, shortens the rostering period by 9.4%, and has a crew overwork rate of 0. The designed model and algorithm can compress the crew rostering cycle, reduce the crew cost, balance the workload, and avoid crew overwork.

Car-following model for intelligent connected vehicles based on multiple headway information fusion

JI Yi, SHI Xin, ZHAO Xiangmo

2019, 39(12): 3685-3690. DOI: 10.11772/j.issn.1001-9081.2019050902

In order to further enhance the stability of traffic flow, based on the classical Optimal Velocity Changes with Memory (OVCM) model, a novel car-following model for intelligent connected vehicles based on Multiple Headway Optimal Velocity and Acceleration (MHOVA) was proposed. Firstly, the optimal velocity changes of k leading cars were introduced with the weight γ, and the acceleration of the nearest leading car was considered with the weight ω. Then, the critical stability conditions of traffic flow were obtained from the proposed model by linear stability analysis. Finally, numerical simulations and analyses were carried out in Matlab on parameters such as the velocity and headway of a fleet under disturbance. Simulation results show that, in the simulation of the starting and stopping processes of the fleet, the proposed model takes less time than OVCM to reach the stable state of the fleet. In the simulation of a disturbance to the fleet on the annular road, if both ω and k are set reasonably, the proposed model produces smaller fluctuations in velocity and headway than the Full Velocity Difference (FVD) model, OVCM and the Multiple Headway Optimal Velocity (MHOV) model. In particular, when ω is 0.3 and k is 5, the minimum upward and downward fluctuations of vehicle velocity can be 0.67% and 0.47% respectively. Consequently, the proposed model can better absorb traffic disturbance and enhance the driving stability of the fleet.

Ship behavior recognition method based on multi-scale convolution

WANG Lilin, LIU Jun

2019, 39(12): 3691-3696. DOI: 10.11772/j.issn.1001-9081.2019050896

Ship behavior recognition by human supervision in complex marine environments is inefficient. In order to solve the problem, a new ship behavior recognition method based on multi-scale convolutional neural network was proposed. Firstly, massive ship driving data were obtained from the Automatic Identification System (AIS), and the discriminative ship behavior trajectories were extracted. Secondly, according to the characteristics of the trajectory data, the behavior recognition network for ship trajectory data was designed and implemented with multi-scale convolution, and feature channel weighting and the Long Short-Term Memory (LSTM) network were used to improve the accuracy of the algorithm. The experimental results on a ship behavior dataset show that the proposed recognition network achieves 92.1% recognition accuracy for ship trajectories of specific length, which is 5.9 percentage points higher than that of the traditional convolutional neural network. In addition, the stability and convergence speed of the proposed network are significantly improved. The proposed method can effectively improve the ship behavior recognition accuracy, and provide efficient technical support for the marine regulatory authority.
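
As a rough illustration of the multi-scale convolution idea above, the sketch below runs one trajectory signal through 1D kernels of several widths and stacks the outputs as channels. The kernel sizes, the fixed averaging weights (standing in for learned weights) and the function names are assumptions for illustration; the abstract does not describe the actual network layers.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(weight * padded[i + j] for j, weight in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(3, 5, 7)):
    """Apply kernels of several widths and return one output channel per width.

    Averaging kernels are placeholders for weights a network would learn.
    """
    return [
        conv1d_same(signal, [1.0 / size] * size)
        for size in kernel_sizes
    ]
```

Short kernels respond to local changes in the trajectory, while wider kernels summarize longer stretches, so stacking them gives later layers several temporal scales at once.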

Shipping monitoring event recognition based on three-dimensional convolutional neural network

WANG Zhongjie, ZHANG Hong

2019, 39(12): 3697-3702. DOI: 10.11772/j.issn.1001-9081.2019050916

Traditional machine learning algorithms perform poorly on the recognition and classification of large-volume shipping monitoring video, and previous three-Dimensional (3D) convolution has low recognition accuracy. To address these problems, a new VGG-Inception 3D Convolutional neural network (VIC3D) model was proposed to realize intelligent recognition of the real-time monitoring video of shipping goods. The model is based on the 3D convolutional neural network and combines the popular Visual Geometry Group (VGG) network structure with GoogleNet's Inception network structure, introducing the Inception module into a VGG-16 3D convolutional network. Firstly, the video data acquired from the camera were processed into images. Then, the video frame sequences obtained by equal-interval frame fetching were classified by category, and the training set and the testing set were constructed. Under the same operating environment and the same training mode, the combined VIC3D model and the original models were trained separately. Finally, the various models were compared based on the test results on the testing set. The experimental results show that the VIC3D model improves the recognition accuracy over the original models: its recognition accuracy is 11.1 percentage points higher than that of the Group-constrained Convolutional Recurrent Neural Network (GCRNN) model, with the time required for each recognition reduced by 1.349 s, and 14.6 percentage points and 4.2 percentage points higher than those of the two C3D models respectively. The VIC3D model can be effectively applied to shipping video surveillance projects.
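
The equal-interval frame fetching step above can be sketched as a small index computation. The function name `sample_frame_indices` and the choice of truncating to the lower frame index are assumptions for illustration; the abstract does not specify the interval or the clip length.

```python
def sample_frame_indices(total_frames, num_samples):
    """Return num_samples frame indices spread at equal intervals over a video.

    Assumption: each index is the floor of i * (total_frames / num_samples).
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Fewer frames than requested: keep every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]
```

For a 100-frame clip and 5 samples this selects frames 0, 20, 40, 60 and 80, giving a fixed-length sequence that still spans the whole clip.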

Application of multimodal network fusion in classification of mild cognitive impairment

WANG Xin, GAO Yuan, WANG Bin, SUN Jie, XIANG Jie

2019, 39(12): 3703-3708. DOI: 10.11772/j.issn.1001-9081.2019050901

Since early Mild Cognitive Impairment (MCI) is very likely to go undiagnosed when assessed with medical diagnostic cognitive scales, a multimodal network fusion method for the aided diagnosis and classification of MCI was proposed. The complex network analysis method based on graph theory has been widely used in the field of neuroimaging, but imaging technologies based on different modalities reveal different effects of brain diseases on the network topology of the brain. Firstly, the Diffusion Tensor Imaging (DTI) and resting-state functional Magnetic Resonance Imaging (rs-fMRI) data were used to construct the fusion network of brain functional and structural connections. Then, the topological properties of the fusion network were analyzed by One-way ANalysis of VAriance (ANOVA), and the attributes with significant differences were selected as the classification features. Finally, the one way cross validation of Support Vector Machines (SVM) was used for the classification of the healthy group and the MCI group, and the accuracy was estimated. The experimental results show that the classification accuracy of the proposed method reaches 94.44%, which is significantly higher than that of the single-modal data methods. Many brain regions of the MCI patients diagnosed by the proposed method, such as the cingulate gyrus, the superior temporal gyrus and parts of the frontal and parietal lobes, show significant differences, which is basically consistent with existing research results.
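
The one-way ANOVA used above for feature selection reduces to an F statistic per topological attribute. The sketch below computes that statistic for one attribute measured in several groups; the function name and the toy values are illustrative assumptions, and converting F to a p-value for the significance test is left out.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups.

    F = (between-group sum of squares / (k - 1))
        / (within-group sum of squares / (n - k)),
    where k is the number of groups and n the total sample count.
    """
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, means)
        for value in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

An attribute whose group means differ strongly relative to the within-group scatter gets a large F value, which is the basis for keeping it as a classification feature.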
