
Table of Contents

    10 November 2021, Volume 41 Issue 11
    Artificial intelligence
    Improved high-dimensional many-objective evolutionary algorithm based on decomposition
    Gangzhu QIAO, Rui WANG, Chaoli SUN
    2021, 41(11):  3097-3103.  DOI: 10.11772/j.issn.1001-9081.2020121895

    In the reference vector based high-dimensional many-objective evolutionary algorithms, the random selection of parent individuals will slow down the speed of convergence, and the lack of individuals assigned to some reference vectors will weaken the diversity of population. In order to solve these problems, an Improved high-dimensional Many-Objective Evolutionary Algorithm based on Decomposition (IMaOEA/D) was proposed. Firstly, when a reference vector was assigned at least two individuals in the framework of decomposition strategy, the parent individuals were selected for reproduction of offspring according to the distance from the individual assigned to the reference vector to the ideal point, so as to increase the search speed. Then, for the reference vector that was not assigned at least two individuals, the point with the smallest distance from the ideal point along the reference vector was selected from all the individuals, so that at least two individuals and the reference vector were associated. Meanwhile, by guaranteeing one individual was related to each reference vector after environmental selection, the diversity of population was ensured. The proposed method was tested and compared with other four high-dimensional many-objective optimization algorithms based on decomposition on the MaF test problem sets with 10 and 15 objectives. Experimental results show that, the proposed algorithm has good optimization ability for high-dimensional many-objective optimization problems: the optimization results of the proposed algorithm on 14 test problems of the 30 test problems are better than those of the other four comparison algorithms. Especially, the proposed algorithm has certain advantage on the degradation problem optimization.
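
    To make the association and parent-selection rule described above concrete, a minimal Python sketch is given below (an illustration written for this listing, not the authors' code; array shapes and variable names are assumptions). It assigns each individual to the reference vector with the smallest angle and, for every vector that has members, picks the member closest to the ideal point as a parent.

        import numpy as np

        def associate_and_pick(objs, ref_vectors, ideal):
            # objs: (N, M) objective values, ref_vectors: (V, M), ideal: (M,) ideal point.
            f = objs - ideal                                    # translate objectives to the ideal point
            f_norm = np.linalg.norm(f, axis=1, keepdims=True) + 1e-12
            r_norm = np.linalg.norm(ref_vectors, axis=1, keepdims=True) + 1e-12
            cosine = (f / f_norm) @ (ref_vectors / r_norm).T    # (N, V) cosine to each reference vector
            assigned = cosine.argmax(axis=1)                    # largest cosine = smallest angle
            parents = {}
            for v in range(len(ref_vectors)):
                members = np.where(assigned == v)[0]
                if members.size > 0:
                    dist = np.linalg.norm(f[members], axis=1)   # distance to the ideal point
                    parents[v] = int(members[dist.argmin()])    # closest member serves as a parent
            return assigned, parents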

    Structure-fuzzy multi-class support vector machine algorithm based on pinball loss
    Kai LI, Jie LI
    2021, 41(11):  3104-3112.  DOI: 10.11772/j.issn.1001-9081.2021010062

    The Multi-Class Support Vector Machine (MSVM) has the defects such as strong sensitivity to noise, instability to resampling data and lower generalization performance. In order to solve the problems, the pinball loss function, sample fuzzy membership degree and sample structural information were introduced into the Simplified Multi-Class Support Vector Machine (SimMSVM) algorithm, and a structure-fuzzy multi-class support vector machine algorithm based on pinball loss, namely Pin-SFSimMSVM, was proposed. Experimental results on synthetic datasets, UCI datasets and UCI datasets adding different proportions of noise show that, the accuracy of the proposed Pin-SFSimMSVM algorithm is increased by 0~5.25 percentage points compared with that of SimMSVM algorithm. The results also show that the proposed algorithm not only has the advantages of avoiding indivisible areas of multi-class data and fast calculation speed, but also has good insensitivity to noise and stability to resampling data. At the same time, the proposed algorithm considers the fact that different data samples play different roles in classification and the important prior knowledge contained in the data, so that the classifier training is more accurate.
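
    For reference, a minimal sketch of the pinball loss mentioned above is given below (an illustrative stand-in rather than the paper's implementation; tau denotes the assumed quantile parameter). Setting tau to 0 recovers the ordinary hinge loss, while tau > 0 also penalizes correctly classified points lying far inside the margin, which is what gives the loss its noise insensitivity.

        import numpy as np

        def pinball_loss(y, f_x, tau=0.5):
            # y in {-1, +1}, f_x is the decision value; u is the hinge-style slack 1 - y * f(x).
            u = 1.0 - y * f_x
            return np.where(u >= 0, u, -tau * u)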

    Artificial bee colony algorithm based on multi-population combination strategy
    Wenxia LI, Linzhong LIU, Cunjie DAI, Yu LI
    2021, 41(11):  3113-3119.  DOI: 10.11772/j.issn.1001-9081.2021010064

    In view of the disadvantages of the standard Artificial Bee Colony (ABC) algorithm such as weak development ability and slow convergence, a new ABC algorithm based on multi-population combination strategy was proposed. Firstly, the different-dimensional coordination and multi-dimensional matching update mechanisms were introduced into the search equation. Then, two combination strategies were designed for the hire bee and the follow bee respectively. The combination strategy was composed of two sub-strategies focusing on breadth exploration and depth development respectively. In the follow bee stage, the population was divided into free subset and non-free subset, and different sub-strategies were adopted by the individuals belonging to different subsets to balance the exploration and development ability of algorithm. The 15 benchmark functions were used to compare the proposed improved ABC algorithm with the standard ABC algorithm and other three improved ABC algorithms. The results show that the proposed algorithm has better optimization performance in both low-dimensional and high-dimensional problems.

    Community detection method based on tensor modeling and evolutionary K-means clustering
    Jicheng CHEN, Hongchang CHEN
    2021, 41(11):  3120-3126.  DOI: 10.11772/j.issn.1001-9081.2021010043

    Most traditional community detection methods are limited to single-relational networks, and their applicability and accuracy are relatively poor. In order to solve these problems, a community detection method for multiple relationship networks was proposed. Firstly, the multiple relational network was modeled with a third-order adjacency tensor, in which each slice of the tensor represented the adjacency matrix corresponding to one type of relationship between participants. From the perspective of data representation, interpreting the multiple relational network as a third-order tensor makes it natural to use factorization as the learning method. Then, RESCAL decomposition was used as a relational learning tool to reveal the unique implicit representation of participants. Finally, the evolutionary K-means clustering algorithm was applied to the results obtained in the previous step to determine the community structure in multiple dimensions. The experiments were conducted on a synthetic dataset and two public datasets. The experimental results show that, compared with the Contextual Information-based Community Detection (CICD) method, the Memetic method and the Local Spectral Clustering (LSC) method, the proposed method has purity at least 5 percentage points higher, Overlapping Normalized Mutual Information (ONMI) at least 2 percentage points higher, and F score at least 3 percentage points higher. The results also show that the proposed method converges quickly.
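
    The third-order adjacency tensor used in the first step can be illustrated with the short sketch below (written for this listing under assumed undirected, unweighted relations; it is not the paper's code): each frontal slice is the adjacency matrix of one relation type.

        import numpy as np

        def build_adjacency_tensor(n_nodes, relations):
            # relations: one list of (i, j) edges per relation type.
            T = np.zeros((n_nodes, n_nodes, len(relations)))
            for k, edges in enumerate(relations):
                for i, j in edges:
                    T[i, j, k] = 1.0
                    T[j, i, k] = 1.0    # assumed undirected relations
            return T                    # slice T[:, :, k] is the adjacency matrix of relation k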

    Open set fuzzy domain adaptation algorithm via progressive separation
    Xiaolong LIU, Shitong WANG
    2021, 41(11):  3127-3131.  DOI: 10.11772/j.issn.1001-9081.2021010061

    The aim of domain adaptation is to use the knowledge in a labeled (source) domain to improve the classification performance of a model on an unlabeled (target) domain, and this approach has achieved good results. However, in open realistic scenes, the target domain usually contains unknown classes that are not observed in the source domain, which is called the open set domain adaptation problem. Traditional domain adaptation algorithms are powerless in such a challenging setting. Therefore, an open set fuzzy domain adaptation algorithm via progressive separation was proposed. Firstly, based on an open set fuzzy domain adaptation algorithm with membership degree introduced, a method of separating the known-class and unknown-class samples in the target domain step by step was explored. Then, only the known classes separated from the target domain were aligned with the source domain, so as to reduce the distribution difference between the two domains and perform the fuzzy adaptation. The negative transfer effect caused by the mismatch between unknown and known classes was well reduced by the proposed algorithm. Results of six domain transformation experiments on the Office dataset show that the proposed algorithm significantly improves image classification accuracy compared with traditional domain adaptation algorithms, and verify that the proposed algorithm can gradually enhance the accuracy and robustness of the domain adaptation classification model.

    Multi-head attention memory network for short text sentiment classification
    Yu DENG, Xiaoyu LI, Jian CUI, Qi LIU
    2021, 41(11):  3132-3138.  DOI: 10.11772/j.issn.1001-9081.2021010040

    With the development of social networks, analyzing the sentiments of massive texts in social networks has important social value. Different from ordinary text classification, short text sentiment classification needs to mine the implicit sentiment semantic features, so it is very difficult and challenging. In order to obtain higher-level short text sentiment semantic features, a new Multi-head Attention Memory Network (MAMN) was proposed for sentiment classification of short texts. Firstly, n-gram feature information and the Ordered Neurons Long Short-Term Memory (ON-LSTM) network were used to improve the multi-head self-attention mechanism to fully extract the internal relationships of the text context, so that the model was able to obtain richer text feature information. Secondly, the multi-head attention mechanism was adopted to optimize the multi-hop memory network structure, so as to expand the depth of the model and mine higher-level contextual internal semantic relations at the same time. A large number of experiments were carried out on the Movie Review dataset (MR) and the Stanford Sentiment Treebank (SST)-1 and SST-2 datasets. The experimental results show that, compared with the baseline models based on Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) structures and some latest works, the proposed MAMN achieves better classification results, and the importance of the multi-hop structure in performance improvement is verified.
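
    As background for the multi-head self-attention mechanism referred to above, a minimal scaled dot-product sketch is shown below (a generic illustration with assumed shapes, omitting the n-gram and ON-LSTM improvements described in the abstract).

        import numpy as np

        def multi_head_self_attention(X, Wq, Wk, Wv, n_heads):
            # X: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_model); output projection omitted.
            seq_len, d_model = X.shape
            d_head = d_model // n_heads
            Q, K, V = X @ Wq, X @ Wk, X @ Wv
            heads = []
            for h in range(n_heads):
                s = slice(h * d_head, (h + 1) * d_head)
                scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)             # (seq_len, seq_len)
                w = np.exp(scores - scores.max(axis=-1, keepdims=True))
                w /= w.sum(axis=-1, keepdims=True)                         # softmax over positions
                heads.append(w @ V[:, s])
            return np.concatenate(heads, axis=-1)                          # (seq_len, d_model)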

    Storyline extraction method from Weibo news based on graph convolutional network
    Xujian ZHAO, Chongwei WANG
    2021, 41(11):  3139-3144.  DOI: 10.11772/j.issn.1001-9081.2021030451

    As a key platform for people to acquire and disseminate news events, Weibo hides rich event information. Extracting storylines from Weibo data provides users with an intuitive way to accurately understand event evolution. However, the data sparseness and lack of context make it difficult to extract storylines from Weibo data. Therefore, two consecutive tasks for extracting storylines automatically from Weibo data were introduced: 1) events were modeled by propagation impact of Weibo, and the primary events were extracted; 2) the heterogeneous event graph was built based on the event features, and an Event Graph Convolution Network (E-GCN) model was proposed to improve the learning ability of implicit relations between events, so as to predict story branches of the events and link the events. The proposed method was evaluated from the perspectives of story branch and storyline on real datasets. In story branch generation evaluation, the results show that compared with Bayesian model, Steiner tree and Story forest, the proposed method has the F1 value higher by 28 percentage points, 20 percentage points and 27 percentage points on Dataset1 respectively, and higher by 19 percentage points, 12 percentage points and 22 percentage points on Dataset2 respectively. In storyline extraction evaluation, the results show that compared with Story timeline, Steiner tree and Story forest, the proposed method has the correct edge accuracy higher by 33 percentage points, 23 percentage points and 17 percentage points on Dataset1 respectively, and higher by 12 percentage points, 3 percentage points and 9 percentage points on Dataset2 respectively.
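
    The graph convolution used by E-GCN on the heterogeneous event graph follows the standard propagation rule; a minimal single-layer sketch is given below (a generic Kipf-and-Welling-style layer with assumed shapes, not the authors' implementation).

        import numpy as np

        def gcn_layer(A, H, W):
            # A: (N, N) event-graph adjacency matrix; H: (N, F) node features; W: (F, F_out) weights.
            A_hat = A + np.eye(len(A))                     # add self-loops
            d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
            D = np.diag(d_inv_sqrt)
            return np.maximum(D @ A_hat @ D @ H @ W, 0.0)  # ReLU(D^-1/2 (A+I) D^-1/2 H W)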

    Neural machine translation corpus expansion method based on language similarity mining
    Can LI, Yating YANG, Yupeng MA, Rui DONG
    2021, 41(11):  3145-3150.  DOI: 10.11772/j.issn.1001-9081.2020122039

    Concerning the lack of tagged data resources in machine translation tasks for low-resource languages, a new neural machine translation corpus expansion method based on language similarity mining was proposed. Firstly, Uyghur and Kazakh were treated as a similar language pair and their corpora were mixed. Then, Byte Pair Encoding (BPE), syllable segmentation, and BPE based on syllable segmentation were applied to the mixed corpus respectively to explore the similarity between Kazakh and Uyghur in depth. Finally, the “Begin-Middle-End (BME)” sequence tagging method was introduced to tag the segmented syllables in the corpus in order to eliminate some ambiguities caused by syllable input. Experimental results on the CWMT2015 Uyghur-Chinese parallel corpus and Kazakh-Chinese parallel corpus show that, compared with the ordinary model without special corpus processing and the model trained on the BPE-processed corpus, the proposed method increases the Bilingual Evaluation Understudy (BLEU) score by 9.66 and 4.55 respectively for Uyghur-Chinese translation, and by 9.44 and 4.36 respectively for Kazakh-Chinese translation. The proposed scheme achieves cross-language neural machine translation from Uyghur and Kazakh to Chinese, improves the translation quality of Uyghur-Chinese and Kazakh-Chinese machine translation, and can be applied to corpus processing of Uyghur and Kazakh.
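
    The “Begin-Middle-End (BME)” tagging of segmented syllables can be pictured with the small sketch below (an illustration with an assumed tag convention for single-syllable words; the paper's exact format may differ).

        def bme_tag(word_syllables):
            # word_syllables: list of syllable strings belonging to one word.
            n = len(word_syllables)
            if n == 1:
                return [word_syllables[0] + "/B"]          # assumed convention for one-syllable words
            tags = ["B"] + ["M"] * (n - 2) + ["E"]
            return [s + "/" + t for s, t in zip(word_syllables, tags)]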

    Text feature selection method based on Word2Vec word embedding and genetic algorithm for biomarker selection in high-dimensional omics
    Yang ZHANG, Xiaoning WANG
    2021, 41(11):  3151-3155.  DOI: 10.11772/j.issn.1001-9081.2020122032

    Text feature is the key part of natural language processing. Concerning the problems of high dimensionality and sparseness of text features, a text feature selection method based on Word2Vec word embedding and Genetic AlgoRithm for Biomarker selection in high-dimensional Omics (GARBO) was proposed, so as to facilitate the subsequent text classification tasks. Firstly, the data input form was optimized, and the Word2Vec word embedding method was used to transform the text into the word vectors similar to gene expression. Then, the gene expression simulated by the high-dimensional word vectors was iteratively evolved. Finally, the random forest classifier was used to classify the text after feature selection. The experiments were conducted on the Chinese comment dataset to verify the proposed method. The experimental results show that, the optimized GARBO feature selection method is effective in text feature selection, successfully reducing 300-dimensional features to 50-dimensional features with more value, and has the classification accuracy reached 88%. Compared with other filtering type text feature selection methods, the proposed method can effectively reduce the dimension of text features and improve the effect of text classification.

    Answer selection model based on dynamic attention and multi-perspective matching
    Zhichao LI, Tohti TURDI, Hamdulla ASKAR
    2021, 41(11):  3156-3163.  DOI: 10.11772/j.issn.1001-9081.2021010027

    Current mainstream neural networks cannot satisfy full expression of sentences and full information interaction between sentences at the same time when processing answer selection tasks. In order to solve these problems, an answer selection model based on Dynamic Attention and Multi-Perspective Matching (DAMPM) was proposed. Firstly, the pre-trained Embeddings from Language Models (ELMo) was introduced to obtain word vectors containing simple semantic information. Secondly, a filtering mechanism was used in the attention layer to effectively remove the noise in sentences, so that better sentence representations of question and answer sentences were obtained. Thirdly, multiple matching strategies were introduced in the matching layer at the same time to complete the information interaction between sentence vectors. Then, the sentence vectors output from the matching layer were spliced by the Bidirectional Long Short-Term Memory (BiLSTM) network. Finally, the similarity of the spliced vectors was calculated by a classifier, and the semantic correlation between question and answer sentences was acquired. The experimental results on the Text REtrieval Conference Question Answering (TRECQA) dataset show that, compared with the Dynamic-Clip Attention Network (DCAN) method, one of the baseline models based on the comparison-aggregation framework, the proposed DAMPM improves both the Mean Average Precision (MAP) and the Mean Reciprocal Rank (MRR) by 1.6 percentage points. The experimental results on the Wiki Question Answering (WikiQA) dataset show that the two performance indices of DAMPM are 0.7 percentage points and 0.8 percentage points higher than those of DCAN respectively. Overall, the proposed DAMPM performs better than the baseline methods.

    Session-based recommendation model of multi-granular graph neural network
    Junwei REN, Cheng ZENG, Siyu XIAO, Jinxia QIAO, Peng HE
    2021, 41(11):  3164-3170.  DOI: 10.11772/j.issn.1001-9081.2021010060

    Session-based recommendation aims to predict the user’s next click behavior based on the click sequence information of the current user’s anonymous session. Most of the existing methods realize recommendations by modeling the item information of the user’s session click sequence and learning the vector representation of the items. As a kind of coarse-grained information, the item category information can aggregate the items and can be used as an important supplement to the item information. Based on this, a Session-based Recommendation model of Multi-granular Graph Neural Network (SRMGNN) was proposed. Firstly, the embedded vector representations of items and item categories in the session sequence were obtained by using the Graph Neural Network (GNN), and the attention information of users was captured by using the attention network. Then, the items and item category information given by different weight values of attention were fused and input into the Gated Recurrent Unit (GRU). Finally, through GRU, the item time sequence information of the session sequence was learned, and the recommendation list was given. Experiments performed on the public Yoochoose dataset and Diginetica dataset verify the advantages of the proposed model with the addition of item category information, and show that the model has better effect compared with all the eight models such as Short-Term Attention/Memory Priority (STAMP), Neural Attentive session-based RecomMendation (NARM), GRU4REC on the evaluation indices Precision@20 and Mean Reciprocal Rank (MRR)@20.

    IPTV video-on-demand recommendation model based on capsule network
    Mingwei GAO, Nan SANG, Maolin YANG
    2021, 41(11):  3171-3177.  DOI: 10.11772/j.issn.1001-9081.2021010047

    In Internet Protocol Television (IPTV) applications, a television terminal is usually shared by several family members. Existing recommendation algorithms have difficulty analyzing the different interests and preferences of family members from the historical data of the terminal. In order to meet the video-on-demand requirements of multiple members under the same terminal, a capsule network-based IPTV video-on-demand recommendation model, namely CapIPTV, was proposed. Firstly, a user interest generation layer was designed on the basis of the capsule network routing mechanism, which took the historical behavior data of the terminal as input, and the interest expressions of different family members were obtained through the clustering characteristic of the capsule network. Then, the attention mechanism was adopted to dynamically assign different attention weights to different interest expressions. Finally, the interest vectors of different family members and the expression vectors of video-on-demand items were extracted, and their inner product was calculated to obtain the Top-N preference recommendation. Experimental results on both the public dataset MovieLens and a real radio and television dataset IPTV show that the proposed CapIPTV outperforms the other 5 similar recommendation models in terms of Hit Rate (HR), Recall and Normalized Discounted Cumulative Gain (NDCG).
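
    The capsule network routing mechanism behind the user interest generation layer is the standard routing-by-agreement procedure; a compact sketch is given below (generic dynamic routing with assumed tensor shapes, not the CapIPTV code).

        import numpy as np

        def squash(v, eps=1e-8):
            n2 = np.sum(v * v, axis=-1, keepdims=True)
            return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + eps)

        def dynamic_routing(u_hat, n_iter=3):
            # u_hat: (n_in, n_out, d) predictions from behaviour capsules to interest capsules.
            b = np.zeros(u_hat.shape[:2])
            for _ in range(n_iter):
                c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)    # coupling coefficients
                s = (c[..., None] * u_hat).sum(axis=0)                  # weighted sum per output capsule
                v = squash(s)                                           # (n_out, d) interest capsules
                b += np.einsum('iod,od->io', u_hat, v)                  # agreement update
            return v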

    Multiply distortion type judgement method based on multi-scale and multi-classifier convolutional neural network
    Junhua YAN, Ping HOU, Yin ZHANG, Xiangyang LYU, Yue MA, Gaofei WANG
    2021, 41(11):  3178-3184.  DOI: 10.11772/j.issn.1001-9081.2020121894

    It is difficult to judge the distortion types of multiply distorted images. In order to solve this problem, based on the idea of deep learning multi-label classification, a new multiply distortion type judgement method based on a multi-scale and multi-classifier Convolutional Neural Network (CNN) was proposed. Firstly, an image block containing high-frequency information was obtained from the image, and the image block was input into convolution layers with different receptive fields to extract the shallow feature maps of the image. Then, the shallow feature maps were input into the structure of each sub-classifier for deep feature extraction and fusion, and the fused features were judged by a Sigmoid classifier. Finally, the judgment results of the different sub-classifiers were fused to obtain the multiply distortion type of the image. Experimental results show that, on the Natural Scene Mixed Disordered Images Database (NSMDID), the average judgment accuracy of the proposed method reaches 91.4% over the different multiply distortion types in the images, and is above 96.8% for most of them, illustrating that the proposed method can effectively judge the types of distortion in multiply distorted images.

    Improved algorithm of generative adversarial network based on arbitration mechanism
    Guihui CHEN, Huikang LIU, Zhongbing LI, Jiao PENG, Shaotian WANG, Jinyu LIN
    2021, 41(11):  3185-3191.  DOI: 10.11772/j.issn.1001-9081.2020122040

    Concerning the lack of flexibility in the adversarial training of the Deep Convolutional Generative Adversarial Network (DCGAN), and the inflexible optimization and unclear convergence state of the Binary Cross-Entropy loss (BCE loss) function used in DCGAN, an improved algorithm of Generative Adversarial Network (GAN) based on an arbitration mechanism was proposed. In this algorithm, the proposed arbitration mechanism was added on the basis of DCGAN. Firstly, the network structure of the proposed improved algorithm was composed of a generator, a discriminator and an arbiter. Secondly, adversarial training was conducted by the generator and discriminator according to the training plan, and the abilities to generate images and to verify the authenticity of images were strengthened respectively according to the characteristics learned from the dataset. Thirdly, the arbiter was formed from the generator and the discriminator after the last round of adversarial training together with a metric score calculation module, and the adversarial training results of the generator and the discriminator were measured by this arbiter and fed back into the training plan. Finally, a winning limit was added to the network structure to improve the stability of model training, and the Circle loss function was used to replace the BCE loss function, which made the model optimization process more flexible and the convergence state clearer. Experimental results show that the proposed algorithm has a good generation effect on both architectural and face datasets. On the Large-scale Scene UNderstanding (LSUN) dataset, the proposed algorithm has the Fréchet Inception Distance (FID) index decreased by 1.04% compared with the original DCGAN algorithm; on the CelebA dataset, the proposed algorithm has the Inception Score (IS) index increased by 4.53% compared with the original DCGAN algorithm. The images generated by the proposed algorithm have better diversity and higher quality.

    Health index construction and remaining useful life prediction of mechanical axis based on action cycle degradation similarity measurement
    Yubin ZHOU, Hong XIAO, Tao WANG, Wenchao JIANG, Meng XIONG, Zhongtang HE
    2021, 41(11):  3192-3199.  DOI: 10.11772/j.issn.1001-9081.2021010046

    Aiming at the problems of low detection efficiency and accuracy in the health management process of industrial robot axes, a new Health Index (HI) construction method based on action cycle degradation similarity measurement was proposed for the context of mechanical axis operation monitoring big data, and robot Remaining Useful Life (RUL) prediction was carried out by combining the Long Short-Term Memory (LSTM) network. Firstly, MPdist was used to focus on the similarity features of sub-cycle sequences between different action cycles of the mechanical axis, and the deviation distance between normal cycle data and degradation cycle data was calculated, so that the HI was constructed. Then, the LSTM network model was trained on the HI set, and the mapping relationship between HI and RUL was established. Finally, the MPdist-LSTM hybrid model was used to automatically calculate the RUL and give early warning in time. Experiments were carried out on a six-axis industrial robot of a company, and about 15 million records were collected. The monotonicity, robustness and trend of the HI, and the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-Square (R2), Error Range (ER), Early Prediction (EP) and Late Prediction (LP) of the RUL were tested. The proposed method was compared with methods such as Dynamic Time Warping (DTW), Euclidean Distance (ED) and Time Domain Eigenvalue (TDE) combined with LSTM, and MPdist combined with RNN and LSTM. The experimental results show that, compared with the other methods, the proposed method has HI monotonicity and trend higher by at least 0.07 and 0.13 respectively, higher RUL prediction accuracy, and smaller ER, which verifies the effectiveness of the proposed method.

    Surface defect detection method based on auto-encoding and knowledge distillation
    Taiheng LIU, Zhaoshui HE
    2021, 41(11):  3200-3205.  DOI: 10.11772/j.issn.1001-9081.2020121974

    The traditional surface defect detection methods can only detect obvious defect contours with high contrast or low noise. In order to solve the problem, a surface defect detection method based on auto-encoding and knowledge distillation was proposed to accurately locate and classify the defects that appeared in the input images captured from the actual industrial environment. Firstly, a new Cascaded Auto-Encoder (CAE) architecture was designed to segment and locate defects, whose purpose was to convert the input original image into the CAE-based prediction mask. Secondly, the threshold module was used to binarize the prediction results, thereby obtaining the accurate defect contour. Then, the defect area extracted and cropped by the defect area detector was regarded as the input of the next module. Finally, the defect areas of the CAE segmentation results were classified by knowledge distillation. Experimental results show that, compared with other surface defect detection methods, the proposed method has the best comprehensive performance, and its average accuracy of defect detection is 97.00%. The proposed method can effectively segment the smaller defects with blurred edges, and meet the engineering requirements for real-time segmentation and detection of item surface defects.

    Defect target detection for printed matter based on Siamese-YOLOv4
    Haojie LOU, Yuanlin ZHENG, Kaiyang LIAO, Hao LEI, Jia LI
    2021, 41(11):  3206-3212.  DOI: 10.11772/j.issn.1001-9081.2020121958

    In printing industry production, using You Only Look Once version 4 (YOLOv4) directly to detect printing defect targets has low accuracy and requires a large number of training samples. In order to solve these problems, a defect target detection method for printed matter based on Siamese-YOLOv4 was proposed. Firstly, a strategy of image segmentation and random parameter change was used to enhance the dataset. Then, a Siamese similarity detection network was added to the backbone network, and the Mish activation function was introduced into the similarity detection network to calculate the similarity of image blocks. After that, the regions with similarity below the threshold were regarded as defect candidate regions. Finally, the candidate region images were trained to achieve precise positioning and classification of defect targets. Experimental results show that the detection precision of the proposed Siamese-YOLOv4 model is better than those of the mainstream target detection models. On the printing defect dataset, the Siamese-YOLOv4 network has a detection precision for satellite ink droplet defects of 98.6%, a detection precision for dirty spots of 97.8%, and a detection precision for print lack of 93.9%; its mean Average Precision (mAP) reaches 96.8%, which is 6.5 percentage points, 6.4 percentage points, 14.9 percentage points and 10.6 percentage points higher respectively than those of the YOLOv4 algorithm, the Faster Regional Convolutional Neural Network (Faster R-CNN) algorithm, the Single Shot multibox Detector (SSD) algorithm and the EfficientDet algorithm. The proposed Siamese-YOLOv4 model has low false positive rate and miss rate in the defect detection of printed matter, and improves the detection precision by calculating the similarity of image blocks through the similarity detection network, proving that the proposed defect detection method can be applied to printing quality inspection and thereby improve the defect detection level of printing enterprises.

    Drowsiness recognition algorithm based on human eye state
    Lin SUN, Yubo YUAN
    2021, 41(11):  3213-3218.  DOI: 10.11772/j.issn.1001-9081.2020122058

    Most of the existing drowsiness recognition algorithms are based on machine learning or deep learning, without considering the relationship between the sequence of human eye closed states and drowsiness. In order to solve this problem, a drowsiness recognition algorithm based on human eye state was proposed. Firstly, a human eye segmentation and area calculation model was proposed: based on 68 facial feature points, the eye area was segmented according to the maximal polygon formed by the eye feature points, and the total number of eye pixels was used to represent the size of the eye area. Secondly, the area of the human eye in its maximally open state was calculated, a key frame selection algorithm was used to select the 4 frames that best represent the eye-open state, and the eye-opening threshold was calculated based on the eye areas in these 4 frames and in the maximally open state; an eye closure degree score model was thereby constructed to determine the closed state of the human eye. Finally, according to the eye closure degree score sequence of the input video, a drowsiness recognition model was constructed based on continuous multi-frame sequence analysis. Drowsiness state recognition was conducted on two commonly used international datasets, the Yawning Detection Dataset (YawDD) and the NTHU-DDD dataset. Experimental results show that the recognition accuracy of the proposed algorithm is more than 80% on both datasets, and in particular above 94% on YawDD. The proposed algorithm can be applied to driver status detection during driving, learner status analysis in class, and so on.
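
    The eye closure degree score and the continuous multi-frame decision can be pictured with the short sketch below (an illustration with assumed thresholds, not the paper's exact scoring model).

        def eye_closure_scores(areas, max_area):
            # Per-frame closure score: 0 = fully open, 1 = fully closed.
            return [max(0.0, min(1.0, 1.0 - a / max_area)) for a in areas]

        def is_drowsy(scores, closed_thresh=0.8, min_consecutive=15):
            # Flag drowsiness when enough consecutive frames score as "closed".
            run = 0
            for s in scores:
                run = run + 1 if s >= closed_thresh else 0
                if run >= min_consecutive:
                    return True
            return False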

    Application of Inception-v3 model integrated with transfer learning in dynasty identification of ancient murals
    Jianfang CAO, Minmin YAN, Yiming JIA, Xiaodong TIAN
    2021, 41(11):  3219-3227.  DOI: 10.11772/j.issn.1001-9081.2020121924

    Aiming at the problems of small quantity, poor quality, difficulty in feature extraction, and similarity of mural text and painting style of ancient mural images, an Inception-v3 model integrated with transfer learning was proposed to identify and classify the dynasties of ancient murals. Firstly, the Inception-v3 model was pre-trained on the ImageNet dataset to obtain the migration model. After fine-tuning the parameters of the migration model on the small mural dataset, the high-level features were extracted from the mural images. Then, the feature representation ability was enhanced by adding two fully connected layers, and the color histogram and Local Binary Pattern (LBP) texture histogram were used to extract the artistic features of murals. Finally, the high-level features were combined with the artistic features, and the Softmax classifier was used to perform the dynasty classification of murals. Experimental results show that, the training process of the proposed model was stable. On the constructed small mural dataset, the proposed model has the final accuracy of 88.70%, the recall of 88.62%, and the F1-score of 88.58%. Each evaluation index above of the proposed model is better than those of the classic network models such as AlexNet and Visual Geometry Group Net (VGGNet). Compared with LeNet-5, AlexNet-S6 and other improved convolutional neural network models, the proposed model has the accuracy of each dynasty category improved by at least 7 percentage points on average. It can be seen that the proposed model has strong generalization ability, is not prone to overfitting, and can effectively identify the dynasty to which the murals belong.
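
    A minimal transfer-learning sketch in the spirit of the above (using the public Keras InceptionV3 weights; the number of dynasty classes, the added layers and the fusion with colour/LBP features are assumptions, not the paper's exact configuration) could look as follows.

        import tensorflow as tf

        n_classes = 4   # assumed number of dynasty categories
        base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                                 input_shape=(299, 299, 3), pooling="avg")
        base.trainable = False                      # freeze; top blocks can be unfrozen for fine-tuning
        x = tf.keras.layers.Dense(1024, activation="relu")(base.output)   # two added fully connected layers
        x = tf.keras.layers.Dense(512, activation="relu")(x)
        outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
        model = tf.keras.Model(base.input, outputs)
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])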

    Road abandoned object detection algorithm based on optimized instance segmentation model
    Yue ZHANG, Liang ZHANG, Fei XIE, Jiale YANG, Rui ZHANG, Yijian LIU
    2021, 41(11):  3228-3233.  DOI: 10.11772/j.issn.1001-9081.2021010073

    In the field of traffic safety, the road abandoned objects easily cause traffic accidents and become potential traffic safety hazards. Focusing on the problems of low recognition rate and poor detection effect for different abandoned objects of traditional road abandoned object detection methods, a road abandoned object detection algorithm based on the optimized instance segmentation model CenterMask was proposed. Firstly, the residual network ResNet50 optimized by dilated convolution was used as the backbone neural network to extract image features and carry out the multi-scale processing. Then, the Fully Convolutional One-Stage (FCOS) target detector optimized by Distance Intersection over Union (DIoU) function was used to realize the detection and classification of road abandoned objects. Finally, the spatial attention-guided mask was used as the mask segmentation branch to realize the object shape segmentation, and the model training was realized by the transfer learning method. Experimental results show that, the detection rate of the proposed algorithm for road abandoned objects is 94.82%, and compared with the common instance segmentation algorithm Mask Region-Convolutional Neural Network (Mask R-CNN), the proposed road abandoned object detection algorithm has the Average Precision (AP) increased by 8.10 percentage points in bounding box detection.
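
    The Distance Intersection over Union (DIoU) used to optimize the FCOS detector is defined as the IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box; a small sketch is shown below (a generic implementation, not the paper's code).

        def diou(box_a, box_b):
            # Boxes are (x1, y1, x2, y2); returns IoU - d^2 / c^2.
            xa1, ya1, xa2, ya2 = box_a
            xb1, yb1, xb2, yb2 = box_b
            iw = max(0.0, min(xa2, xb2) - max(xa1, xb1))
            ih = max(0.0, min(ya2, yb2) - max(ya1, yb1))
            inter = iw * ih
            union = (xa2 - xa1) * (ya2 - ya1) + (xb2 - xb1) * (yb2 - yb1) - inter
            iou = inter / union if union > 0 else 0.0
            d2 = ((xa1 + xa2 - xb1 - xb2) ** 2 + (ya1 + ya2 - yb1 - yb2) ** 2) / 4.0   # centre distance^2
            cw = max(xa2, xb2) - min(xa1, xb1)
            ch = max(ya2, yb2) - min(ya1, yb1)
            c2 = cw ** 2 + ch ** 2                                                      # enclosing diagonal^2
            return iou - d2 / c2 if c2 > 0 else iou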

    Single shot multibox detector recognition method for aerial targets of unmanned aerial vehicle
    Huaiyu ZHU, Bo LI
    2021, 41(11):  3234-3241.  DOI: 10.11772/j.issn.1001-9081.2021010026

    Unmanned Aerial Vehicle (UAV) aerial images have a wide field of vision, and the targets in the images are small and have blurred boundaries. And the existing Single Shot multibox Detector (SSD) target detection model is difficult to accurately detect small targets in aerial images. In order to effectively solve the problem that the original model is easy to have missed detection, based on Feature Pyramid Network (FPN), a new SSD model based on continuous upsampling was proposed. In the improved SSD model, the input image size was adjusted to 320×320, the Conv3_3 feature layer was added, the high-level features were upsampled, and features of the first five layers of VGG16 network were fused by using feature pyramid structure, so as to enhance the semantic representation ability of each feature layer. Meanwhile, the size of anchor box was redesigned. Training and verification were carried out on the open aerial dataset UCAS-AOD. Experimental results show that, the improved SSD model has 94.78% in mean Average Precision (mAP) of different categories, and compared with the existing SSD model, the improved SSD model has the accuracy increased by 17.62%, including 4.66% for plane category and 34.78% for car category.

    Object detection method based on radar and camera fusion
    Jie GAO, Yuan ZHU, Ke LU
    2021, 41(11):  3242-3250.  DOI: 10.11772/j.issn.1001-9081.2021020327

    In automatic driving perception systems, multi-sensor fusion is usually used to improve the reliability of the perception results. Aiming at the object detection task in a fusion perception system, an object detection method based on radar and camera fusion, namely Priori and Radar Region Proposal Network (PRRPN), was proposed, with the aim of using radar measurements and the object detection results of the previous frame to improve the generation of region proposals in the image detection network and thereby improve object detection performance. Firstly, the objects detected in the previous frame were associated with the radar points in the current frame to pre-classify the radar points. Then, the pre-classified radar points were projected into the image, and the corresponding prior region proposals and radar region proposals were obtained according to the radar distance and Radar Cross Section (RCS) information. Finally, regression and classification of the object bounding boxes were performed according to the region proposals. In addition, PRRPN and the Region Proposal Network (RPN) were fused to carry out object detection. The newly released nuScenes dataset was adopted to test and evaluate the three detection methods. Experimental results show that, compared with RPN, the proposed PRRPN can not only detect objects faster but also increase the average detection accuracy of small objects by 2.09 percentage points. Compared with using PRRPN or RPN alone, the method fusing the proposed PRRPN and RPN has the average detection accuracy increased by 2.54 percentage points and 0.34 percentage points respectively.
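
    The projection of pre-classified radar points into the image, mentioned in the second step, follows the usual pinhole camera model; a minimal sketch is given below (the extrinsic and intrinsic matrices are assumed inputs, and points behind the camera should be filtered out in practice).

        import numpy as np

        def project_radar_to_image(points_radar, T_cam_radar, K):
            # points_radar: (N, 3) in the radar frame; T_cam_radar: (4, 4) radar-to-camera transform;
            # K: (3, 3) camera intrinsics. Returns (N, 2) pixel coordinates and camera-frame depths.
            pts_h = np.hstack([points_radar, np.ones((len(points_radar), 1))])
            pts_cam = (T_cam_radar @ pts_h.T).T[:, :3]
            depth = pts_cam[:, 2]
            uv = (K @ pts_cam.T).T
            uv = uv[:, :2] / depth[:, None]
            return uv, depth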

    Cyber security
    Hierarchical file access control scheme with identity-based multi-conditional proxy re-encryption
    Li LI, Hongfei YANG, Xiuze DONG
    2021, 41(11):  3251-3256.  DOI: 10.11772/j.issn.1001-9081.2020121998

    In view of the problems of traditional file sharing schemes, such as easy leakage of files, difficult control of file destination, and complex access control, as well as the application requirements of cloud file hierarchical classification management and sharing, a hierarchical file access control scheme with identity-based multi-conditional proxy re-encryption was proposed. Firstly, the permission level of file was taken as the condition of ciphertext generation, and the trusted hierarchical management unit was introduced to determine and manage the user levels. Secondly, the re-encryption key of user’s hierarchical access permission was generated, which solved the problem that the identity-based conditional proxy re-encryption scheme only restricts the re-encryption behavior of proxy servers, and lacks the limitation of the user’s permission. Meanwhile, the burden of client was reduced, which means only encryption and decryption operations were needed for users. The results of comparison and analysis of different schemes show that, compared with the existing access control schemes, the proposed scheme has obvious advantages, it can complete the update of the user’s access permission without the direct participation of users, and has the characteristic of uploader anonymity.

    Digital rights management scheme based on secret sharing in blockchain environment
    Xiaoqiong PANG, Ting YANG, Wenjun CHEN, Yunting WANG, Tianye LIU
    2021, 41(11):  3257-3265.  DOI: 10.11772/j.issn.1001-9081.2021010024

    In order to meet the requirements of secure storage and effective distribution of the content encryption key in digital rights protection, a new digital rights protection scheme based on secret sharing in a blockchain environment was proposed, including 4 protocols: system initialization, content encryption, license authorization and content decryption. Pedersen’s verifiable secret sharing scheme and the Attribute-Based Encryption (ABE) algorithm were used to protect and distribute the content encryption key. Content providers were freed from the task of managing content encryption keys, which ensured the security and flexibility of key management. In addition, the blockchain-based digital rights protection scheme has the characteristics of information openness and transparency, and is tamper-resistant. Security analysis shows that the proposed scheme is safe and feasible in the blockchain environment; simulation results show that the proposed scheme can achieve rights protection of digital content at low cost.
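
    The secret-sharing idea behind the key distribution can be illustrated with a basic (t, n) Shamir sketch over a prime field (Pedersen’s verifiable scheme used in the paper additionally publishes commitments so that shares can be checked; that part is omitted here, and the modulus is an assumption of this toy example).

        import random

        P = 2**127 - 1   # prime modulus of this toy field

        def share_secret(secret, t, n):
            # Split `secret` into n shares; any t of them reconstruct it.
            coeffs = [secret % P] + [random.randrange(P) for _ in range(t - 1)]
            f = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
            return [(x, f(x)) for x in range(1, n + 1)]

        def reconstruct(shares):
            # Lagrange interpolation at x = 0 over the prime field.
            secret = 0
            for i, (xi, yi) in enumerate(shares):
                num, den = 1, 1
                for j, (xj, _) in enumerate(shares):
                    if i != j:
                        num = (num * -xj) % P
                        den = (den * (xi - xj)) % P
                secret = (secret + yi * num * pow(den, P - 2, P)) % P
            return secret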

    Adaptive secure outsourced attribute-based encryption scheme with keyword search
    Lifeng GUO, Qianli WANG
    2021, 41(11):  3266-3273.  DOI: 10.11772/j.issn.1001-9081.2020121987

    In order to solve the problems of high computational cost of Attribute-Based Encryption (ABE) scheme and low efficiency of data search in cloud servers simultaneously, an Outsourced Attribute-Based Encryption scheme with Keyword Search (OABE-KS) was proposed. Firstly, the outsourced computation technology was used for reducing the local computing cost of encryption and decryption users to the constant level. Then, the indexes and trapdoors of the corresponding keywords were generated by the encryption user and the decryption user respectively, and the cloud server was used to match them. After that, the successful matching results would be returned to the decryption user by the cloud server. The adaptive security of the proposed scheme was proved under the composite order group. According to the experimental analysis, when the number of attributes changes from 10 to 100, the running time of each stage of the proposed scheme is basically unchanged, showing that the running time of the proposed scheme in each stage does not vary with the number change of attributes. Experimental results show that, the proposed scheme is suitable for the application on resource-limited devices and is not affected by the number of attributes in practical applications.

    Cascaded quasi-cyclic moderate-density parity-check code based public key scheme for resisting reaction attack
    Guangfu WU, Ziheng DAI
    2021, 41(11):  3274-3280.  DOI: 10.11772/j.issn.1001-9081.2021010023

    The McEliece Public Key Cryptography (PKC) scheme based on Quasi-Cyclic Moderate-Density Parity-Check (QC-MDPC) codes is a promising scheme to resist quantum attacks with a small key size, so the key is easy to store. However, a reaction attack currently poses a great threat to its security. The attacker selects some special error patterns to encrypt numerous messages and obtains decoding-failure feedback from the receiver, then cracks the private key by analyzing the relationship between the decoding failure rate and the private key structure. This attack is called a key recovery attack. In response to this attack, a new public key scheme cascading QC-MDPC codes and fountain codes was proposed. In the scheme, the “rateless characteristic” of fountain codes was used to generate abundant encrypted packets which substituted for the Automatic Repeat-reQuest (ARQ) structure, so that the attacker was not able to obtain the feedback information. The analysis results show that the proposed scheme can effectively resist the key recovery attack and also guarantee security under other attacks.

    Three-factor anonymous authentication and key agreement protocol
    Ping ZHANG, Yiqiao JIA, Jiechang WANG, Nianfeng SHI
    2021, 41(11):  3281-3287.  DOI: 10.11772/j.issn.1001-9081.2021010005

    To ensure the information security of communication between two parties, many Authenticated Key Agreement (AKA) protocols have been proposed and applied in practical scenarios. However, the existing three-factor protocols have security vulnerabilities, such as being vulnerable to smart card loss attacks and password guessing attacks, and some even ignore anonymity. In order to solve the problems, a new three-factor anonymous authentication and key agreement protocol was proposed. In the proposed protocol, smart card, password and biometric authentication technology were integrated, the password and biometric characteristic update phase, the update and distribution phase of the smart card were added, and the Computational Diffie-Hellman (CDH) assumption on the elliptic curve was used for information interaction so as to realize secure communications. The security of the proposed protocol was proved by using the random oracle model. Compared with similar protocols, the analysis results show that the proposed protocol can prevent many attacks such as smart card loss attacks and replay attacks, realizes more comprehensive functions such as anonymity and free updating of password, and has higher computing and communication efficiency.

    Searchable encryption scheme based on splittable inverted index
    Xiaoling SUN, Guang YANG, Yanping SHEN, Qiuge YANG, Tao CHEN
    2021, 41(11):  3288-3294.  DOI: 10.11772/j.issn.1001-9081.2021010112

    For quick retrieval of encrypted data in the cloud environment, an efficient searchable encryption scheme for batch data processing scenarios was proposed. Firstly, two inverted indexes were built by the client: a file index used to store the file-keyword mapping, and an initially empty search index used to store the keyword-file mapping. Then, these two indexes were submitted to the cloud server. The search index was gradually updated and constructed by the cloud according to the search tokens and file indexes during the user’s searches, and the search results of the searched keywords were recorded in this search index. In this way, the search index construction time was effectively spread over the retrieval processes and the storage space of the search index was reduced. The indexes adopted a set storage method based on a key-value structure, which supported simultaneous merging and splitting of indexes: when adding and deleting files, the corresponding file index and search index were generated by the client according to the file set to be added or deleted, and then the server merged or split the indexes, so that files could be added and deleted in batches quickly. Testing results show that the proposed scheme greatly improves the file updating efficiency and is suitable for batch data processing. Through the leakage function, it is proved that the proposed scheme meets the indistinguishability security standard against adaptive dynamic keyword selection attack.
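
    The key-value set storage that allows indexes to be merged and split can be pictured with the plain-dictionary sketch below (an illustration on unencrypted tokens; in the actual scheme the keywords are encrypted search tokens).

        def merge_index(search_index, delta_index):
            # Merge a client-built keyword -> file-id index into the server-side search index.
            for token, file_ids in delta_index.items():
                search_index.setdefault(token, set()).update(file_ids)
            return search_index

        def split_index(search_index, delete_index):
            # Remove the file ids listed in delete_index (batch deletion).
            for token, file_ids in delete_index.items():
                if token in search_index:
                    search_index[token] -= file_ids
                    if not search_index[token]:
                        del search_index[token]
            return search_index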

    Advanced computing
    Scheduling strategy of irregular tasks on graphics processing unit cluster
    Fan PING, Xiaochun TANG, Yanyu PAN, Zhanhuai LI
    2021, 41(11):  3295-3301.  DOI: 10.11772/j.issn.1001-9081.2020121984

    Since a large number of irregular task sets have low resource requirements and high parallelism, using Graphics Processing Units (GPUs) to accelerate their processing is the current mainstream. However, existing irregular task scheduling strategies either use a GPU exclusively or use traditional optimization methods to map tasks to GPU devices; the former leads to idle GPU resources, and the latter cannot make maximum use of GPU computing resources. Based on the analysis of the existing problems, an idea of multi-knapsack optimization was adopted to enable more irregular tasks to share GPU devices in the best way. Firstly, according to the characteristics of GPU clusters, a distributed GPU job scheduling framework consisting of schedulers and executors was given. Then, with GPU memory as the cost, an Extended-grained Greedy Scheduling (EGS) algorithm based on GPU computing resources was designed. In the algorithm, as many irregular tasks as possible were scheduled on multiple available GPUs to maximize the use of GPU computing resources and solve the problem of idle GPU resources. Finally, actual benchmark programs were used to randomly generate a target task set to verify the effectiveness of the proposed scheduling strategy. Experimental results show that, compared with the traditional greedy algorithm, the Minimum Completion Time (MCT) algorithm and the Min-min algorithm, when the number of tasks equals 1 000, the execution time of the EGS algorithm is reduced to 58%, 64% and 80% of the original ones on average respectively, and the proposed algorithm can effectively improve GPU resource utilization.
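
    The multi-knapsack idea behind EGS can be pictured with the simplified greedy sketch below (memory is the only cost considered and the placement rule is an assumption of this illustration, not the full EGS algorithm).

        def greedy_pack(tasks, gpu_memory):
            # tasks: list of (task_id, mem_required); gpu_memory: list of free memory per GPU.
            free = list(gpu_memory)
            placement = {}
            for task_id, mem in sorted(tasks, key=lambda t: t[1], reverse=True):
                g = max(range(len(free)), key=lambda i: free[i])    # GPU with the most free memory
                if free[g] >= mem:
                    placement[task_id] = g                          # several small tasks share one GPU
                    free[g] -= mem
                else:
                    placement[task_id] = None                       # deferred: no GPU currently fits it
            return placement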

    Task offloading method based on probabilistic performance awareness and evolutionary game strategy in “cloud + edge” hybrid environment
    Ying LEI, Wanbo ZHENG, Wei WEI, Yunni XIA, Xiaobo LI, Chengwu LIU, Hong XIE
    2021, 41(11):  3302-3308.  DOI: 10.11772/j.issn.1001-9081.2020121932

    Aiming at the problem of low multi-task offloading efficiency in the “cloud+edge” hybrid environment composed of “central cloud server and multiple edge servers”, a task offloading method based on probabilistic performance awareness and evolutionary game theory was proposed. Firstly, in a “cloud + edge” hybrid environment composed of “central cloud server and multiple edge servers”, assuming that all the edge servers distributed in it had time-varying volatility performance, the historical performance data of edge cloud servers was probabilistically analyzed by a task offloading method based on probabilistic performance awareness and evolutionary game theory for obtaining the evolutionary game model. Then, an Evolutionary Stability Strategy (ESS) of service offloading was generated to guarantee that each user could offload tasks on the premise of high satisfaction rate. Simulation experiments were carried out based on the cloud edge resource locations dataset and the cloud service performance test dataset, the test and comparison of different methods were carried out on 24 continuous time windows. Experimental results show that, the proposed method is better than traditional task offloading methods such as Greedy algorithm, Genetic Algorithm (GA), and Nash-based Game algorithm in many performance indexes. Compared with the three comparison methods, the proposed method has the average user satisfaction rate higher by 13.7%, 117.0%, 13.8% respectively, the average offloading time lower by 6.5%, 24.9%, 8.3% respectively, and the average monetary cost lower by 67.9%, 88.7%, 18.0% respectively.
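
    The evolutionary game at the core of the method converges to an Evolutionary Stability Strategy through replicator-style dynamics; a one-step sketch is shown below (a generic replicator update with assumed payoff inputs, not the paper's full model).

        import numpy as np

        def replicator_step(x, payoffs, dt=0.1):
            # x: (K,) shares of users choosing each of K offloading targets; payoffs: (K,) average payoffs.
            avg = float(x @ payoffs)
            x = x + dt * x * (payoffs - avg)   # strategies beating the population average grow
            x = np.clip(x, 0.0, None)
            return x / x.sum()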

    Network and communications
    Virtual software defined network mapping algorithm based on topology segmentation and clustering analysis
    Gang CHEN, Xiangru MENG, Qiaoyan KANG, Yong YANG
    2021, 41(11):  3309-3318.  DOI: 10.11772/j.issn.1001-9081.2021010015

    Aiming at the problem that most virtual Software Defined Network (vSDN) mapping algorithms do not fully consider the correlation between nodes and links, a vSDN mapping algorithm based on network topology segmentation and clustering analysis was proposed. Firstly, the complexity of the physical network was reduced by a topology segmentation method based on the shortest hop count. Then, the request acceptance rate of the mapping algorithm was improved by a clustering analysis method based on node topology and resource attributes. Finally, the nodes that did not meet the link requirements were remapped by dispersing the link constraints over the bandwidth resources and degrees of nodes, so that the mapping process between nodes and links was optimized. Experimental results show that the proposed algorithm can effectively improve the request acceptance rate of virtual network mapping algorithms based on the Software Defined Network (SDN) architecture in physical networks with low connectivity probability.

    Computation offloading and resource allocation strategy in NOMA-based 5G ultra-dense network
    Yongpeng SHI, Junjie ZHANG, Yujie XIA, Ya GAO, Shangwei ZHANG
    2021, 41(11):  3319-3324.  DOI: 10.11772/j.issn.1001-9081.2021020214

    A Non-Orthogonal Multiple Access (NOMA) based computation offloading and bandwidth allocation strategy was presented to address the issues of insufficient computing capacity of mobile devices and limited spectrum resource in 5G ultra-dense network. Firstly, the system model was analyzed, on this basis, the research problem was defined formally with the objective of minimizing the computation cost of devices. Then, this problem was decomposed into three sub-problems: device computation offloading, system bandwidth allocation, and device grouping and matching, which were solved by adopting simulated annealing, interior point method, and greedy algorithm. Finally, a joint optimization algorithm was used to alternately solve the above sub-problems, and the optimal computation offloading and bandwidth allocation strategy was obtained. Simulation results show that, the proposed joint optimization strategy is superior to the traditional Orthogonal Multiple Access (OMA), and can achieve lower device computation cost compared to NOMA technology with average bandwidth allocation.

    Sparse adaptive filtering algorithm based on generalized maximum Versoria criterion
    Yuefa OU, Mingkun YANG, Dejun MU, Jie KE, Wentao MA
    2021, 41(11):  3325-3331.  DOI: 10.11772/j.issn.1001-9081.2020121982
    Abstract ( )   HTML ( )   PDF (1089KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Traditional sparse adaptive filtering has poor steady-state performance and may even fail to converge in impulse noise interference environments. In order to solve these problems and improve the accuracy of sparse parameter identification without adding too much computational cost, a sparse adaptive filtering algorithm based on the Generalized Maximum Versoria Criterion (GMVC) was proposed, namely GMVC with a CIM constraint (CIMGMVC). Firstly, the generalized Versoria function, which contains the reciprocal of the p-order moment of the error, was employed as the learning criterion; because the GMVC approaches 0 when the error caused by an impulse interference is very large, impulse noise is suppressed. Then, a novel cost function was constructed by combining the GMVC with the Correntropy Induced Metric (CIM) used as the sparse penalty constraint, where the CIM is based on the Gaussian probability density function and can be made arbitrarily close to the l0-norm by selecting an appropriate kernel width. Finally, the CIMGMVC algorithm was derived by the gradient method, and the mean square convergence of the proposed algorithm was analyzed. Simulations were performed on the Matlab platform with impulse noise generated by the α-stable distribution model. Experimental results show that the proposed CIMGMVC algorithm can effectively suppress the interference of non-Gaussian impulse noise, is more robust than traditional sparse adaptive filters, and has a lower steady-state error than the GMVC algorithm.
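
    A schematic, assumption-laden version of such an update in Python (the paper's simulations were in Matlab): the Versoria-shaped gain vanishes for very large errors, which suppresses impulses, while a CIM-gradient term pushes small coefficients toward zero. The exact step form, parameter values and the Student-t stand-in for α-stable noise are illustrative, not the paper's derivation.

        import numpy as np

        def cimgmvc_update(w, x, d, mu=0.01, tau=1.0, p=2, rho=1e-3, sigma=0.05):
            """One schematic CIMGMVC-style update: maximize a generalized Versoria score of
            the error while penalizing non-sparsity with a Gaussian-kernel (CIM) term."""
            e = d - w @ x                                   # a-priori error
            versoria_grad = (p * tau * np.abs(e) ** (p - 1) * np.sign(e)) \
                            / (1.0 + tau * np.abs(e) ** p) ** 2
            cim_grad = (w / sigma ** 2) * np.exp(-w ** 2 / (2 * sigma ** 2)) / len(w)
            return w + mu * versoria_grad * x - rho * cim_grad

        # Toy system identification: sparse unknown filter, output corrupted by heavy-tailed noise.
        rng = np.random.default_rng(0)
        w_true = np.zeros(16); w_true[[2, 7, 11]] = [0.9, -0.5, 0.3]
        w = np.zeros(16)
        for _ in range(20000):
            x = rng.standard_normal(16)
            d = w_true @ x + 0.05 * rng.standard_t(df=1.5)   # Student-t as a stand-in for impulsive noise
            w = cimgmvc_update(w, x, d)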

    Multimedia computing and computer simulation
    Moving object detection and static map reconstruction with hybrid vision system
    Yusheng HU, Bingwei HE, Qingkang DENG
    2021, 41(11):  3332-3336.  DOI: 10.11772/j.issn.1001-9081.2021010021
    Asbtract ( )   HTML ( )   PDF (1596KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In environments with complex dynamic backgrounds, moving object detection and static map reconstruction are prone to incomplete detection of moving objects. In order to solve this problem, a new moving object detection method for a hybrid vision system assisted by point cloud segmentation was proposed. Firstly, a PassThrough + RANdom SAmple Consensus (RANSAC) method was proposed to overcome large-area wall interference and recognize the ground points of the point cloud. Secondly, the non-ground points were projected onto the image as feature points, and their optical flow motion vectors and artificial motion vectors were estimated to detect the dynamic points. Then, a dynamic threshold strategy was used to perform Euclidean clustering on the point cloud. Finally, the results of dynamic point detection and point cloud segmentation were integrated to extract the moving objects completely. In addition, the Octomap tool was used to convert the point cloud map into a 3D grid map to complete the map construction. Experimental results and data analysis show that the proposed method can effectively improve the integrity of moving object detection and reconstruct a low-loss, highly practical static grid map.
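
    The ground-recognition step can be sketched as a pass-through height filter followed by a RANSAC plane fit; the axis, height band and distance threshold below are illustrative assumptions, and the inlier plane would be discarded before projecting the remaining points into the image for the optical-flow check described above.

        import numpy as np

        def passthrough(points, axis=2, lo=-2.0, hi=0.5):
            """Keep only points whose coordinate on the chosen axis lies in [lo, hi],
            e.g. a rough height band where the ground can appear (limits are illustrative)."""
            m = (points[:, axis] >= lo) & (points[:, axis] <= hi)
            return points[m]

        def ransac_plane(points, n_iter=200, dist_thresh=0.05, rng=np.random.default_rng(0)):
            """Fit the dominant plane (the ground) with RANSAC and return its inlier mask."""
            best_mask = np.zeros(len(points), dtype=bool)
            for _ in range(n_iter):
                p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
                normal = np.cross(p1 - p0, p2 - p0)
                if np.linalg.norm(normal) < 1e-9:          # degenerate (collinear) sample
                    continue
                normal = normal / np.linalg.norm(normal)
                dist = np.abs((points - p0) @ normal)      # point-to-plane distance
                mask = dist < dist_thresh
                if mask.sum() > best_mask.sum():
                    best_mask = mask
            return best_mask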

    Visual simultaneous localization and mapping based on semantic and optical flow constraints in dynamic scenes
    Hao FU, Hegen XU, Zhiming ZHANG, Shaohua QI
    2021, 41(11):  3337-3344.  DOI: 10.11772/j.issn.1001-9081.2021010003
    Asbtract ( )   HTML ( )   PDF (2125KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    For the localization and static semantic mapping problems in dynamic scenes, a Simultaneous Localization And Mapping (SLAM) algorithm based on semantic and optical flow constraints was proposed to reduce the impact of moving objects on localization and mapping. Firstly, for each input frame, the masks of the objects in the frame were obtained by semantic segmentation, and the feature points that did not satisfy the epipolar constraint were filtered out by a geometric method. Secondly, the dynamic probability of each object was calculated by combining the object masks with the optical flow, the feature points were filtered by these dynamic probabilities to obtain the static feature points, and the static feature points were used for the subsequent camera pose estimation. Then, the static point cloud was created from the RGB-D images and the object dynamic probabilities, and the semantic octree map was built by combining it with the semantic segmentation. Finally, the sparse semantic map was created from the static point cloud and the semantic segmentation. Test results on the public TUM dataset show that, in highly dynamic scenes, the proposed algorithm improves both the absolute trajectory error and the relative pose error by more than 95% compared with ORB-SLAM2, and reduces the absolute trajectory error by 41% and 11% compared with DS-SLAM and DynaSLAM respectively, which verifies that the proposed algorithm has better localization accuracy and robustness in highly dynamic scenes. The mapping results show that the proposed algorithm creates a static semantic map, and the storage space required by the sparse semantic map is 99% smaller than that of the point cloud map.
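
    The geometric filtering step can be sketched as measuring each matched feature point's distance to its epipolar line under a RANSAC-estimated fundamental matrix; the function name and the pixel threshold below are illustrative assumptions.

        import numpy as np
        import cv2

        def epipolar_outliers(pts_prev, pts_cur, thresh=1.0):
            """Flag matches whose current point lies far from the epipolar line induced by
            the previous point; such points are treated as dynamic candidates."""
            F, _ = cv2.findFundamentalMat(pts_prev, pts_cur, cv2.FM_RANSAC, 1.0, 0.99)
            ones = np.ones((len(pts_prev), 1))
            p1 = np.hstack([pts_prev, ones])            # homogeneous previous points
            p2 = np.hstack([pts_cur, ones])             # homogeneous current points
            lines = p1 @ F.T                            # epipolar lines a*x + b*y + c = 0 in the current frame
            num = np.abs(np.sum(lines * p2, axis=1))
            den = np.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2) + 1e-12
            return (num / den) > thresh                 # True -> violates the epipolar constraint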

    Solar speckle image deblurring method with gradient guidance based on generative adversarial network
    Fuhai LI, Murong JIANG, Lei YANG, Junyi CHEN
    2021, 41(11):  3345-3352.  DOI: 10.11772/j.issn.1001-9081.2020121898
    Asbtract ( )   HTML ( )   PDF (1303KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    With existing deep learning algorithms, it is difficult to restore the highly blurred solar speckle images taken by Yunnan Observatories and to reconstruct their high-frequency information. In order to solve these problems, a deblurring method based on Generative Adversarial Network (GAN) and gradient information was proposed to restore solar speckle images and recover their high-frequency information. The proposed method consisted of one generator and two discriminators. Firstly, multi-scale image features were extracted by a generator built on the Feature Pyramid Network (FPN) framework, these features were fed hierarchically into a gradient branch to capture fine details in the form of a gradient map, and the solar speckle image with high-frequency information was reconstructed by combining the gradient branch results with the FPN results. Then, in addition to the conventional adversarial discriminator, another discriminator was added to make the gradient map generated by the gradient branch more realistic. Finally, a joint training loss including a pixel content loss, a perceptual loss and an adversarial loss was introduced to guide the model to perform high-resolution reconstruction of solar speckle images. Experimental results show that, compared with existing deep learning deblurring methods, the proposed method with image preprocessing recovers high-frequency information better and significantly improves the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) indicators, reaching 27.801 0 dB and 0.851 0 respectively. The proposed method can meet the needs of high-resolution reconstruction of solar observation images.
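
    A schematic PyTorch-style version of such a joint training loss is given below; the loss weights, the non-saturating adversarial form and the (hypothetical) external feature extractor for the perceptual term are assumptions, not the paper's exact formulation.

        import torch
        import torch.nn.functional as F

        def joint_loss(fake, real, fake_grad, real_grad, d_img_fake, d_grad_fake,
                       feat_fake, feat_real, w_pix=1.0, w_percep=0.1, w_adv=0.01):
            """Schematic joint generator loss: pixel content + perceptual + adversarial terms,
            with the gradient branch supervised by the gradient map of the sharp image."""
            pixel = F.l1_loss(fake, real) + F.l1_loss(fake_grad, real_grad)
            perceptual = F.mse_loss(feat_fake, feat_real)            # features from e.g. a VGG extractor
            adversarial = -(torch.log(d_img_fake + 1e-8).mean()
                            + torch.log(d_grad_fake + 1e-8).mean())  # both discriminators contribute
            return w_pix * pixel + w_percep * perceptual + w_adv * adversarial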

    Global-scale radar data restoration algorithm based on total variation and low-rank group sparsity
    Chenyu GE, Liang DONG, Yikun XU, Yi CHANG, Hongming ZHANG
    2021, 41(11):  3353-3361.  DOI: 10.11772/j.issn.1001-9081.2020122047
    Asbtract ( )   HTML ( )   PDF (3343KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    The mixed noise formed by a large number of spikes, speckles and multi-directional stripe errors in Shuttle Radar Topography Mission (SRTM) data causes serious interference to subsequent applications. In order to solve this problem, a Low-Rank Group Sparsity_Total Variation (LRGS_TV) algorithm was proposed. Firstly, the uniqueness of the data along the locally low-rank direction was used to regularize the global multi-directional stripe error structure, and a variational idea was used to impose unidirectional constraints. Secondly, the non-local self-similarity of the weighted nuclear norm was used to remove random noise, and Total Variation (TV) regularization was combined to constrain the data gradient and reduce local variation differences. Finally, the low-rank group sparse model was solved by Alternating Direction Method of Multipliers (ADMM) optimization to ensure the convergence of the model. Quantitative evaluation shows that, compared with four algorithms, namely TV, Unidirectional Total Variation (UTV), Low-Rank-based Single-Image Decomposition (LRSID) and the Low-Rank Group Sparsity (LRGS) model, the proposed LRGS_TV achieves a Peak Signal-to-Noise Ratio (PSNR) of 38.53 dB and a Structural SIMilarity (SSIM) of 0.97, both better than those of the comparison algorithms. At the same time, the slope and aspect results show that LRGS_TV processing significantly improves the data's suitability for subsequent applications. Experimental results show that the proposed LRGS_TV can repair the original data while keeping the terrain contour features essentially unchanged, and can provide important support for the reliability improvement and subsequent applications of SRTM.
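
    The unidirectional constraint can be illustrated by splitting the total-variation energy by direction, since stripe errors inflate the variation across the stripes far more than along them; the sketch below is a simplified stand-in for the regularizer used in the model, not the full ADMM solver.

        import numpy as np

        def unidirectional_tv(img, axis=0):
            """Anisotropic (unidirectional) total variation: sum of absolute differences
            along one axis only, which penalizes stripes running across that axis."""
            return np.abs(np.diff(img, axis=axis)).sum()

        def tv_terms(dem):
            """Split the TV energy of a DEM tile into vertical and horizontal parts; a large
            imbalance between the two hints at directional stripe errors."""
            return unidirectional_tv(dem, axis=0), unidirectional_tv(dem, axis=1)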

    Super-resolution and multi-view fusion based on magnetic resonance image inter-layer interpolation
    Meng LI, Pinle QIN, Jianchao ZENG, Junbo LI
    2021, 41(11):  3362-3367.  DOI: 10.11772/j.issn.1001-9081.2020122065
    Asbtract ( )   HTML ( )   PDF (650KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    The high resolution within Magnetic Resonance (MR) image slices and the low resolution between slices deprive MR images of medical diagnostic value in the coronal and sagittal planes. In order to solve this problem, a medical image processing algorithm based on inter-layer interpolation and a multi-view fusion network was proposed. Firstly, an inter-layer interpolation module was introduced to cut the three-dimensional MR volume data into two-dimensional images along the coronal and sagittal directions. Then, after feature extraction in the coronal and sagittal planes, weights were dynamically calculated by a spatial matrix filter and used to magnify the images by an upsampling factor of arbitrary size. Finally, the coronal and sagittal results obtained by the inter-layer interpolation module were aggregated into three-dimensional data and then cut into two-dimensional images along the axial direction, and the obtained two-dimensional images were fused in pairs and corrected with the axial data. Experimental results show that, compared with other super-resolution algorithms, the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR) by about 1 dB at the ×2, ×3 and ×4 scales, indicating that it can effectively improve the quality of image reconstruction.
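
    A much-simplified stand-in for the inter-layer interpolation and multi-view fusion flow is sketched below: each coronal and each sagittal slice is upsampled along the through-plane direction and the two reassembled volumes are averaged. Cubic interpolation and plain averaging replace the learned spatial matrix filter and fusion network here, so in the real method the two views would yield different estimates for the fusion step to reconcile.

        import numpy as np
        from scipy.ndimage import zoom

        def multi_view_interpolate(vol, factor=2):
            """vol axes: (slice, coronal row, sagittal column); axis 0 is the low-resolution one.
            Upsample slice-wise from the coronal view and from the sagittal view, then fuse."""
            coronal = np.stack([zoom(vol[:, i, :], (factor, 1), order=3)
                                for i in range(vol.shape[1])], axis=1)
            sagittal = np.stack([zoom(vol[:, :, j], (factor, 1), order=3)
                                 for j in range(vol.shape[2])], axis=2)
            return 0.5 * (coronal + sagittal)            # simple average as a placeholder fusion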

    Universal vector flow mapping method combined with deep learning
    Bo PENG, Yaru LUO, Shenghua XIE, Lixue YIN
    2021, 41(11):  3368-3375.  DOI: 10.11772/j.issn.1001-9081.2021010045
    Asbtract ( )   HTML ( )   PDF (1719KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Traditional ultrasound Vector Flow Mapping (VFM) technology has the limitation that proprietary software is required to obtain raw Doppler and speckle tracking data. In order to solve this problem, a universal VFM method combined with deep learning was proposed. Firstly, the velocity scale was used to convert the velocities along the acoustic beam direction provided by the color Doppler echocardiogram into radial velocity components. Then, the U-Net model was used to automatically identify the contour of the left ventricular wall, the left ventricular wall velocities calculated by a retrained PWC-Net (CNNs for optical flow using Pyramid, Warping, and Cost volume) model were used as the boundary condition of the continuity equation, and the velocity component of each blood particle perpendicular to the acoustic beam direction (the tangential velocity component) was obtained by solving the continuity equation. Finally, the velocity vector map of the cardiac flow field was synthesized, and the streamline chart of the cardiac flow field was visualized. Experimental results show that the velocity vector map and streamline chart obtained by the proposed method accurately reflect the corresponding time phases of the left ventricle, and the visualized results are consistent with the analysis results of the VFM workstation provided by Aloka and conform to the characteristics of left ventricular fluid dynamics. As a universal and fast VFM method, the proposed method does not need any vendor's technical support or proprietary software, and can further promote the application of VFM in the clinical workflow.
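
    The tangential-velocity recovery can be illustrated on a Cartesian grid: given the along-beam component and a wall boundary condition, the cross-beam component follows from integrating the incompressible continuity equation. The discretization below is a simplified sketch under these assumptions, not the clinical implementation.

        import numpy as np

        def tangential_from_beam(v_beam, wall_v0, dx=1.0, dy=1.0):
            """Recover the cross-beam velocity on a 2D grid from the along-beam component via
            the continuity equation dv_x/dx + dv_y/dy = 0, integrating from the left wall."""
            dv_beam_dy = np.gradient(v_beam, dy, axis=0)        # d(v_y)/dy on the grid
            v_cross = np.empty_like(v_beam)
            v_cross[:, 0] = wall_v0                              # wall velocity (e.g. from PWC-Net)
            for j in range(1, v_beam.shape[1]):
                # forward Euler in x: v_x(:, j) = v_x(:, j-1) - dx * d(v_y)/dy
                v_cross[:, j] = v_cross[:, j - 1] - dx * dv_beam_dy[:, j - 1]
            return v_cross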

    Frontier and comprehensive applications
    Time-space distribution identification method of taxi shift based on trajectory data
    Fumin ZOU, Sijie LUO, Zhihui CHEN, Lyuchao LIAO
    2021, 41(11):  3376-3384.  DOI: 10.11772/j.issn.1001-9081.2020122004
    Asbtract ( )   HTML ( )   PDF (1483KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Concerning the problem of inaccurate identification of taxi shift behaviors, an accurate identification method of taxi shift behaviors based on trajectory data mining was proposed. Firstly, after analyzing the characteristics of taxi parking state data, a method for detecting taxi parking points in the non-operating state was proposed. Secondly, by clustering the parking points, the potential taxi shift locations were obtained. Finally, based on the judgment indices of taxi shift events and the kernel density estimation of taxi shift times, the locations and times of taxi shifts were identified effectively. Taking the trajectory data of 4 416 taxis in Fuzhou as experimental samples, a total of 5 639 taxi shift locations were identified; these locations lie in citizens' main working areas, transportation hubs, business districts and scenic spots, and the identified shift times are mainly from 4:00 to 6:00 in the morning and from 16:00 to 18:00 in the evening, which is consistent with the travel patterns of Fuzhou citizens. Experimental results show that the proposed method can effectively detect the time-space distribution of taxi shifts and provide reasonable suggestions for the planning and management of urban traffic resources. The proposed method can also help people take taxis more conveniently, improve the operating efficiency of taxis, and provide references for the site selection optimization of urban gas stations, charging stations and other car-related facilities.
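
    The temporal part of the identification can be sketched with a kernel density estimate over detected shift-event hours; the sample times below are hypothetical and only illustrate how density peaks (e.g. the early-morning and late-afternoon windows) would be located.

        import numpy as np
        from scipy.stats import gaussian_kde

        # Hypothetical shift-event times (hour of day) extracted from detected parking points.
        shift_hours = np.array([4.2, 4.8, 5.1, 5.5, 16.3, 16.9, 17.2, 17.8, 5.0, 17.5])

        kde = gaussian_kde(shift_hours)               # kernel density estimate of shift times
        grid = np.linspace(0, 24, 241)
        density = kde(grid)
        peak_hours = grid[np.argsort(density)[-5:]]   # hours with the highest estimated shift density
        print(np.sort(peak_hours))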

    Ship stowage optimization centered on automated terminal
    Yi DING, Cong WANG
    2021, 41(11):  3385-3393.  DOI: 10.11772/j.issn.1001-9081.2020121897
    Asbtract ( )   HTML ( )   PDF (694KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the low efficiency of ship stowage in automated terminals, a new Fixed Set Search (FSS) algorithm based on ship stowage characteristics was proposed to improve the utilization of equipment resources. Firstly, on the basis of the general principles of ship stowage, a block operation balance factor was introduced, and a mixed integer programming model of ship stowage in automated terminals was established based on the quay crane working plan, with the objectives of minimizing the number of rehandles and the total loading time on board while keeping the block operations as balanced as possible. Then, the optimal solution was searched for by fixing the elements that appeared repeatedly in the better solutions. Experimental results show that, on instances of different scales, compared with Cplex, the proposed FSS algorithm reduces the number of rehandles and the number of unbalanced containers by 22.3% and 11.7% on average respectively, and improves the objective function value by 6.5% on average. Compared with the Particle Swarm Optimization (PSO) algorithm, Genetic Algorithm (GA) and Ant Colony Optimization (ACO) algorithm, the proposed FSS algorithm improves the objective function value by 2.1% on average, highlighting its higher stowage efficiency. In order to increase the diversity of instances, the distribution and proportion of block stacks were adjusted; under this setting, compared with the above three algorithms, the FSS algorithm reduces the number of unbalanced containers by 19.3% on average and achieves higher utilization of equipment resources.
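
    The core FSS idea, fixing elements that recur in good solutions and completing the rest with a randomized construction, can be sketched as follows; the solution encoding, elite-set size and fixed-set size are illustrative assumptions, and the problem-specific construction and evaluation routines are left as stubs.

        from collections import Counter

        def fixed_set(elite_solutions, size):
            """Elements (e.g. container-to-slot assignments) that recur most often across a
            sample of elite solutions form the fixed set kept in the next constructions."""
            counts = Counter(e for sol in elite_solutions for e in sol)
            return {e for e, _ in counts.most_common(size)}

        def build_solution(fixed, complete_randomized):
            """Start from the fixed elements and let a randomized greedy builder fill the rest."""
            return complete_randomized(set(fixed))

        # Sketch of the FSS loop (complete_randomized and evaluate are problem-specific stubs):
        # elite = sorted(population, key=evaluate)[:k]
        # for it in range(max_iter):
        #     fs = fixed_set([set(s) for s in elite], size=len(elite[0]) // 2)
        #     cand = build_solution(fs, complete_randomized)
        #     elite = sorted(elite + [cand], key=evaluate)[:k]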

    Review of computer-aided face diagnosis for obstructive sleep apnea in children
    Jin ZHAO, Wen’ai SONG, Jun TAI, Jijiang YANG, Qing WANG, Xiaodan LI, Yi LEI, Yue QIU
    2021, 41(11):  3394-3401.  DOI: 10.11772/j.issn.1001-9081.2020121963
    Asbtract ( )   HTML ( )   PDF (663KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Using face images in the diagnosis of Obstructive Sleep Apnea (OSA) in children can reduce the burden on doctors and improve the accuracy of diagnosis. Firstly, the current clinical diagnosis methods for OSA in children and their limitations were briefly described. Then, on the basis of studying the existing methods and combining them with computer-aided face diagnosis methods for other diseases, the computer-aided face diagnosis methods for OSA in children were divided into three types: traditional computer-aided face diagnosis methods, transfer learning based diagnosis methods, and 3D face data based diagnosis methods. The key steps of the three types of methods were summarized, and the methods used in these key steps were compared, providing different entry points for future research on computer-aided face diagnosis of OSA in children. Finally, the opportunities and challenges in future research on the diagnosis of OSA in children were analyzed.

    Pulse condition recognition method based on optimized reinforcement learning path feature classification
    Jiaqi ZHANG, Yueqin ZHANG, Jian CHEN
    2021, 41(11):  3402-3408.  DOI: 10.11772/j.issn.1001-9081.2021010008
    Asbtract ( )   HTML ( )   PDF (606KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Pulse condition recognition is one of the important methods of traditional Chinese medical diagnosis. For a long time, recognizing pulse conditions based on personal experience has restricted the promotion and development of traditional Chinese medicine, so research on using sensing devices to recognize pulse conditions is increasing. In order to solve the problems of large training datasets, “black box” processing and high time cost in neural network based pulse condition recognition, a new pulse condition diagram analysis method using Markov decision and Monte Carlo search within the reinforcement learning framework was proposed. Firstly, based on the theory of traditional Chinese medicine, the paths of specific pulse conditions were classified; then, the representative features of different paths were selected on this basis. Finally, pulse condition recognition was realized by comparing the representative features with their threshold values. Experimental results show that the proposed method reduces the training time and the required resources, retains the complete experience track, and solves the “black box” problem during data processing while improving the accuracy of pulse condition recognition.
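
    The final recognition step can be sketched as walking a classified path and comparing each representative feature with its threshold; the feature names, threshold values and pulse labels below are purely illustrative assumptions.

        def classify_pulse(features, path_rules):
            """Walk a decision path and compare each representative feature against its
            threshold; return the label of the first rule that fires."""
            for feature_name, threshold, label_if_below in path_rules:
                if features[feature_name] < threshold:
                    return label_if_below
            return "unclassified"

        # Hypothetical path: main-wave height h1, dicrotic-notch ratio h4/h1, pulse period T.
        rules = [("h1", 0.55, "weak pulse"),
                 ("h4_over_h1", 0.35, "slippery pulse"),
                 ("period_s", 0.65, "rapid pulse")]
        print(classify_pulse({"h1": 0.7, "h4_over_h1": 0.5, "period_s": 0.6}, rules))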
