Journal of Computer Applications

Network representation learning model based on node attribute bipartite graph

Le ZHOU, Tingting DAI, Chun LI, Jun XIE, Boce CHU, Feng LI, Junyi ZHANG, Qiao LIU

2022, 42(8): 2311-2318. DOI: 10.11772/j.issn.1001-9081.2021060972

Reasoning and computing over graph-structured data is an important task. The main challenge is how to represent graph-structured knowledge so that machines can easily understand and use it. A comparison of existing representation learning models shows that models based on random walk methods tend to ignore the special effect of attributes on the association between nodes. Therefore, a hybrid random walk method based on node adjacency and attribute association was proposed. Firstly, the attribute weights were calculated from the distribution of common attributes among adjacent nodes, and the sampling probability from a node to each attribute was obtained. Then, the network information was extracted from adjacent nodes and from non-adjacent nodes with common attributes respectively. Finally, the network representation learning model based on node attribute bipartite graph was constructed, and the node vector representations were learned from the sampled sequences. Experimental results on the Flickr, BlogCatalog and Cora public datasets show that the average Micro-F1 accuracy of node classification using the node vector representations obtained by the proposed model is 89.38%, which is 2.02 percentage points higher than that of GraphRNA (Graph Recurrent Networks with Attributed random walk) and 21.12 percentage points higher than that of the classical DeepWalk. In addition, a comparison of different random walk methods shows that increasing the sampling probabilities of attributes that promote node association can increase the information contained in the sampled sequences.
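
The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the add-one smoothing, the mixing probability `p_attr`, and the function names `attribute_weights` and `hybrid_walk` are assumptions made for illustration only. Attributes shared by adjacent nodes get larger weights, and each step of the walk either follows an edge or jumps through a weighted attribute to any node holding it.

```python
import random
from collections import defaultdict


def attribute_weights(adj, attrs):
    """Weight each attribute by how often adjacent node pairs share it (assumed scheme)."""
    counts = defaultdict(int)
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:  # count each undirected edge once
                for a in attrs[u] & attrs[v]:
                    counts[a] += 1
    # Add-one smoothing keeps attributes that no edge shares reachable.
    return {a: counts[a] + 1 for node_attrs in attrs.values() for a in node_attrs}


def hybrid_walk(start, adj, attrs, weights, length, p_attr=0.5, rng=random):
    """Walk mixing adjacency steps with node -> attribute -> node steps."""
    holders = defaultdict(list)  # attribute -> nodes that carry it
    for node in sorted(attrs):
        for a in attrs[node]:
            holders[a].append(node)
    walk, cur = [start], start
    while len(walk) < length:
        use_attr = bool(attrs[cur]) and (not adj[cur] or rng.random() < p_attr)
        if use_attr:
            options = sorted(attrs[cur])
            a = rng.choices(options, weights=[weights[x] for x in options])[0]
            cur = rng.choice(holders[a])  # may reach a non-adjacent node
        elif adj[cur]:
            cur = rng.choice(sorted(adj[cur]))
        else:
            break  # isolated node without attributes
        walk.append(cur)
    return walk
```

The resulting walks would then be fed to a skip-gram style learner, as in DeepWalk; that step is omitted here.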

Adversarial example generation method based on image flipping transform

Bo YANG, Hengwei ZHANG, Zheming LI, Kaiyong XU

2022, 42(8): 2319-2325. DOI: 10.11772/j.issn.1001-9081.2021060993

Deep neural networks are vulnerable to adversarial example attacks. Adversarial examples cause deep neural networks to misclassify by adding human-imperceptible perturbations to the original images, which poses a security threat to deep neural networks. Therefore, before deep neural networks are deployed, adversarial attack is an important method for evaluating model robustness. However, under the black-box setting, the attack success rates of adversarial examples need to be improved, that is, the transferability of adversarial examples needs to be increased. To address this issue, an adversarial example generation method based on image flipping transform, namely FT-MI-FGSM (Flipping Transformation Momentum Iterative Fast Gradient Sign Method), was proposed. Firstly, from the perspective of data augmentation, the original input image was flipped randomly in each iteration of the adversarial example generation process. Then, the gradient of the transformed images was calculated. Finally, the adversarial examples were generated based on this gradient, so as to alleviate overfitting in the generation process and to improve the transferability of adversarial examples. In addition, the method of attacking ensemble models was used to further enhance the transferability of adversarial examples. Extensive experiments on the ImageNet dataset demonstrated the effectiveness of the proposed algorithm. Compared with I-FGSM (Iterative Fast Gradient Sign Method) and MI-FGSM (Momentum I-FGSM), the average black-box attack success rate of FT-MI-FGSM on adversarially trained networks is improved by 26.0 and 8.4 percentage points respectively under the ensemble-model attack setting.
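
The iteration described above can be sketched as follows. This is a hedged illustration rather than the paper's code: `grad_fn` is a hypothetical callable standing in for the model's loss gradient, the flip is assumed to be horizontal and applied to the current iterate with probability `p_flip`, and the step size `eps / steps` and L1 gradient normalisation follow the usual MI-FGSM convention.

```python
import numpy as np


def ft_mi_fgsm(x, grad_fn, eps=16 / 255, steps=10, mu=1.0, p_flip=0.5, seed=0):
    """Momentum iterative FGSM with a random horizontal flip before each gradient.

    x: image of shape (H, W, C) with values in [0, 1].
    grad_fn: hypothetical callable returning dLoss/dInput for an image.
    """
    rng = np.random.default_rng(seed)
    alpha = eps / steps
    x_adv = x.copy()
    g = np.zeros_like(x)
    for _ in range(steps):
        flip = rng.random() < p_flip
        inp = x_adv[:, ::-1, :] if flip else x_adv
        grad = grad_fn(inp)
        if flip:
            grad = grad[:, ::-1, :]  # map the gradient back to unflipped coordinates
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)  # L1-normalised momentum
        x_adv = x_adv + alpha * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv
```

Attacking an ensemble would amount to making `grad_fn` return the gradient of a loss fused over several models.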

Time series prediction model based on multimodal information fusion

Minghui WU, Guangjie ZHANG, Canghong JIN

2022, 42(8): 2326-2332. DOI: 10.11772/j.issn.1001-9081.2021061053

Aiming at the problem that traditional single-factor methods cannot make full use of the relevant information of time series and have poor prediction accuracy and reliability, a time series prediction model based on multimodal information fusion, namely Skip-Fusion, was proposed to fuse the text data and numerical data in multimodal data. Firstly, different types of text data were encoded by the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and one-hot encoding. Then, a single vector representation of the fused multi-text features was obtained by using the pre-trained model based on global attention mechanism. After that, the obtained single vector representation was aligned with the numerical data in time order. Finally, the fusion of text and numerical features was realized through the Temporal Convolutional Network (TCN) model, and the shallow and deep features of multimodal data were fused again through skip connection. In experiments on a stock price series dataset, the Skip-Fusion model obtains 0.492 and 0.930 on Root Mean Square Error (RMSE) and daily Return (R) respectively, which are better than the results of the existing single-modal and multimodal fusion models. Experimental results also show that the Skip-Fusion model obtains a goodness of fit of 0.955 on R-squared, indicating that the Skip-Fusion model can effectively carry out multimodal information fusion and has high prediction accuracy and reliability.
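
For reference, the RMSE and R-squared figures quoted above follow the standard definitions, sketched here with NumPy. The function names are illustrative, and the data in the usage note is made up, not taken from the paper.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For example, with targets `[1, 2, 3, 4]` and predictions `[2, 2, 3, 3]`, the residual sum of squares is 2 and the total sum of squares is 5, so R-squared is 0.6.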

Container throughput prediction based on optimal variational mode decomposition and kernel extreme learning machine

Fengting ZHANG, Juhua YANG, Jinhui REN, Kun JIN

2022, 42(8): 2333-2342. DOI: 10.11772/j.issn.1001-9081.2021050816

Aiming at the complexity of port container throughput data, a short-term hybrid prediction model of container throughput based on Optimal Variational Mode Decomposition (OVMD) and Kernel Extreme Learning Machine (KELM) was proposed. Firstly, the outliers were removed from the original time series by Hampel Identifier (HI), and the preprocessed series was decomposed into several sub-modes with obvious characteristics by OVMD. Then, in order to improve the prediction efficiency, the decomposed sub-modes were divided into three categories according to the values of Sample Entropy (SE): high frequency and low amplitude, medium frequency and medium amplitude, and low frequency and high amplitude. At the same time, the wavelet, Gaussian and linear kernel functions in KELM were used to capture the trends of sub-modes with different characteristics. Finally, the final prediction result was obtained by linearly adding the prediction results of all sub-modes together. Taking the monthly container throughput data of Shenzhen Port as a sample for empirical research, the proposed model has a Mean Absolute Error (MAE) of 0.9149, a Mean Absolute Percentage Error (MAPE) of 0.199%, a Root Mean Square Error (RMSE) of 7.8860 and a coefficient of determination (R²) of 0.9944. Compared with four comparison models, the proposed model has advantages in prediction accuracy and efficiency. At the same time, it overcomes the mode mixing problem in traditional Complementary Ensemble Empirical Mode Decomposition (CEEMD) and Ensemble Empirical Mode Decomposition (EEMD) as well as the overfitting defect in Extreme Learning Machine (ELM), and has practical application potential.
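
The outlier-removal step can be illustrated with a common form of the Hampel identifier. This is a sketch under assumptions: the window half-width, the threshold of three scaled MADs, and replacing flagged points with the window median are conventional defaults, not settings reported in the abstract.

```python
import numpy as np


def hampel_filter(series, half_window=3, n_sigmas=3.0):
    """Flag points far from the local median and replace them with that median."""
    x = np.asarray(series, dtype=float)
    cleaned = x.copy()
    outliers = []
    k = 1.4826  # scales the MAD to the standard deviation for Gaussian data
    for i in range(len(x)):
        lo, hi = max(0, i - half_window), min(len(x), i + half_window + 1)
        window = x[lo:hi]
        med = np.median(window)
        mad = k * np.median(np.abs(window - med))
        if abs(x[i] - med) > n_sigmas * mad:
            cleaned[i] = med
            outliers.append(i)
    return cleaned, outliers
```

On the toy series `[1, 2, 3, 4, 100, 6, 7, 8, 9]`, only index 4 is flagged, and it is replaced by the local median 6.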

Time series classification by LSTM based on multi-scale convolution and attention mechanism

Yinglü XUAN, Yuan WAN, Jiahui CHEN

2022, 42(8): 2343-2352. DOI: 10.11772/j.issn.1001-9081.2021061062

The multi-scale features of time series contain abundant category information, which differs in importance for classification. However, the existing univariate time series classification models conventionally extract series features by convolutions with a fixed kernel size, so they cannot effectively acquire and focus on important multi-scale features. To solve this problem, a Multi-scale Convolution and Attention mechanism (MCA) based Long Short-Term Memory (LSTM) model (MCA-LSTM) was proposed, which was capable of concentrating on and fusing important multi-scale features to achieve more accurate classification. In this structure, by using LSTM, the transmission of series information was controlled through memory cells and gate mechanism, and the correlation information of time series was extracted fully; by using Multi-scale Convolution Module (MCM), the multi-scale features of the series were extracted through Convolutional Neural Networks (CNNs) with different kernel sizes; by using Attention Module (AM), the channel information was fused to obtain the importance of features and assign attention weights, which enabled the network to focus on important time series features. Experimental results on 65 univariate time series datasets of the UCR archive show that compared with the state-of-the-art time series classification methods Unsupervised Scalable Representation Learning-FordA (USRL-FordA), Unsupervised Scalable Representation Learning-Combined (1-Nearest Neighbor) (USRL-Combined (1-NN)), Omni-Scale Convolutional Neural Network (OS-CNN), Inception-Time and Robust Temporal Feature Network for time series classification (RTFN), MCA-LSTM has the Mean Error (ME) reduced by 7.48, 9.92, 2.43, 2.09 and 0.82 percentage points respectively, and achieves the best Arithmetic Mean Rank (AMR) and Geometric Mean Rank (GMR), which are 2.14 and 3.23 respectively. These results fully demonstrate the effectiveness of MCA-LSTM in the classification of univariate time series.

Lightweight attention mechanism module based on squeeze and excitation

Zhenhu LYU, Xinzheng XU, Fangyan ZHANG

2022, 42(8): 2353-2360. DOI: 10.11772/j.issn.1001-9081.2021061037

Focusing on the issue that embedding an attention mechanism module into a Convolutional Neural Network (CNN) to improve application accuracy increases the number of parameters and the computational cost, the lightweight Height Dimensional Squeeze and Excitation (HD-SE) module and Width Dimensional Squeeze and Excitation (WD-SE) module based on squeeze and excitation were proposed. To make full use of the potential information in the feature maps, the weight information of the feature maps in the height and width dimensions was extracted by HD-SE and WD-SE respectively through squeeze and excitation operations, and then the obtained weight information was applied to the corresponding tensors of the feature maps in the two dimensions to improve the application accuracy of the model. Experiments were carried out on the CIFAR10 and CIFAR100 datasets after embedding HD-SE and WD-SE into Visual Geometry Group 16 (VGG16), Residual Network 56 (ResNet56), MobileNetV1 and MobileNetV2 models respectively. Experimental results show that, compared with the state-of-the-art attention mechanism modules such as Squeeze and Excitation (SE) module, Coordinate Attention (CA) block, Convolutional Block Attention Module (CBAM) and Efficient Channel Attention (ECA) module, HD-SE and WD-SE add fewer parameters and less computational cost to the network models while the models achieve the same or even better accuracy.

Decision optimization of traffic scenario problem based on reinforcement learning

Fei LUO, Mengwei BAI

2022, 42(8): 2361-2368. DOI: 10.11772/j.issn.1001-9081.2021061012

Traditional reinforcement learning algorithms have limitations in convergence speed and solution accuracy when solving the taxi path planning problem and the traffic signal control problem in traffic scenarios. Therefore, an improved reinforcement learning algorithm was proposed to solve this kind of problem. Firstly, by applying the optimized Bellman equation and the Speedy Q-Learning (SQL) mechanism, and introducing experience pool technology and direct strategy, an improved reinforcement learning algorithm, namely Generalized Speedy Q-Learning with Direct Strategy and Experience Pool (GSQL-DSEP), was proposed. Then, the GSQL-DSEP algorithm was applied to optimize the path length in the taxi path planning decision problem and the total waiting time of vehicles in the traffic signal control problem. The error of the GSQL-DSEP algorithm was at least 18.7% lower than those of algorithms such as Q-learning, SQL, Generalized Speedy Q-Learning (GSQL) and Dyna-Q, the decision path length determined by the GSQL-DSEP algorithm was at least 17.4% shorter than those determined by the compared algorithms, and the total waiting time of vehicles determined by the GSQL-DSEP algorithm was at most 51.5% lower than those determined by the compared algorithms for the traffic signal control problem. Experimental results show that the GSQL-DSEP algorithm has advantages over the compared algorithms in solving traffic scenario problems.

TODIM group decision-making method under trust network

Yicong LIU, Junfeng CHU, Yanyan WANG, Yingming WANG

2022, 42(8): 2369-2377. DOI: 10.11772/j.issn.1001-9081.2021050872

To make use of the social relationship between experts and to consider the limited rationality of decision-making experts in group decision-making, a TODIM (TOmada de Decisão Interativa Multicritério) group decision-making method under trust network was proposed. Firstly, according to the number of discussions of the experts, in each discussion, each expert would refer to his/her trustee's decision matrix according to the degree of trust acceptance, and the decision matrices would be modified through information interaction and negotiation. Then, when the set number of expert discussions was met, the final group decision-making matrix was calculated. Finally, the TODIM group decision-making method under trust network and the TODIM group decision-making method were applied to calculate the ranking results of different schemes. The ranking results were compared and analyzed, and the sensitivity analysis was performed on the number of expert discussions and trust acceptance. The case analysis results show that the TODIM group decision-making method under trust network can fully integrate trust network, ensure the multi-stage information interaction and feedback process in the decision-making process, and is superior to the general TODIM group decision-making method in comparison analysis and sensitivity analysis.

Traffic sign detection algorithm based on improved attention mechanism

Xinyu ZHANG, Sheng DING, Zhipei YANG

2022, 42(8): 2378-2385. DOI: 10.11772/j.issn.1001-9081.2021061005

In some scenes, the low resolution, occlusion and other environmental factors of traffic signs lead to missed and false detections in object detection tasks. Therefore, a traffic sign detection algorithm based on improved attention mechanism was proposed. Firstly, in response to the problem of low image resolution due to damage, lighting and other environmental impacts on traffic signs, which led to limited extraction of image feature information by the network, an attention module was added to the backbone network to enhance the key features of the object area. Secondly, because the local features of adjacent channels in the feature map had a certain correlation due to the overlap of the receptive fields, a one-dimensional convolution of size k was used to replace the fully connected layer in the channel attention module to aggregate different channel information and reduce the number of additional parameters. Finally, the receptive field module was introduced in the medium- and small-scale feature layers of Path Aggregation Network (PANet) to increase the receptive field of the feature map, so as to fuse the context information of the object area and improve the network's ability to detect traffic signs. Experimental results on the CSUST Chinese Traffic Sign Detection Benchmark (CCTSDB) dataset show that the proposed improved You Only Look Once v4 (YOLOv4) algorithm introduces only a small number of parameters, and its detection speed is not much different from that of the original algorithm. The mean Average Precision (mAP) reached 96.88%, an increase of 1.48%; compared with the lightweight network YOLOv5s, the single-frame detection of the proposed algorithm is 10 ms slower, but its mAP is 3.40 percentage points higher than that of YOLOv5s, and its speed reached 40 frame/s, indicating that the algorithm fully meets the real-time requirements of object detection.
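
Replacing the fully connected layer of channel attention with a one-dimensional convolution of size k can be sketched as below. This is an illustration in the spirit of Efficient Channel Attention, not the paper's module: the global average pooling, zero padding and sigmoid gate are assumptions, and `kernel` stands for the k learned weights, which are simply passed in here.

```python
import numpy as np


def channel_attention_1d(feature_map, kernel):
    """Channel attention with a 1D convolution across neighbouring channels.

    feature_map: array of shape (C, H, W).
    kernel: 1D array of odd length k (learned in practice, supplied here).
    """
    c = feature_map.shape[0]
    k = len(kernel)
    pooled = feature_map.mean(axis=(1, 2))   # global average pooling -> (C,)
    padded = np.pad(pooled, k // 2)          # zero padding keeps length C
    mixed = np.array([padded[i:i + k] @ kernel for i in range(c)])
    weights = 1.0 / (1.0 + np.exp(-mixed))   # sigmoid gate per channel
    return feature_map * weights[:, None, None], weights
```

The layer needs only k parameters regardless of the channel count, whereas a fully connected bottleneck grows with the square of the number of channels.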

Semantic extraction of domain-dependent mathematical text

Xiaoyu CHEN, Wei WANG

2022, 42(8): 2386-2393. DOI: 10.11772/j.issn.1001-9081.2021060924

Aiming at the problem of insufficient acquisition of document semantic information in the field of science and technology, a set of rule-based methods for extracting semantics from domain-dependent mathematical text was proposed. Firstly, domain concepts were extracted from the text and the semantic mapping between mathematical entities and domain concepts was realized. Secondly, through context analysis of mathematical symbols, entity mentions or corresponding text descriptions of mathematical symbols were obtained and the semantics of the symbols were extracted. Finally, the semantic analysis of expressions was completed based on the extracted semantics of mathematical symbols. Taking linear algebra texts as research examples, a semantic tagging dataset was constructed for experiments. Experimental results show that the proposed methods achieve a precision higher than 93% and a recall higher than 91% on semantic extraction of identifiers, linear algebra entities and expressions.

Handwritten English text recognition based on convolutional neural network and Transformer

Xianjie ZHANG, Zhiming ZHANG

2022, 42(8): 2394-2400. DOI: 10.11772/j.issn.1001-9081.2021091564

Handwritten text recognition technology can transcribe handwritten documents into editable digital documents. However, due to the problems of different writing styles, ever-changing document structures and low accuracy of character segmentation recognition, handwritten English text recognition based on neural networks still faces many challenges. To solve the above problems, a handwritten English text recognition model based on Convolutional Neural Network (CNN) and Transformer was proposed. Firstly, CNN was used to extract features from the input image. Then, the features were input into the Transformer encoder to obtain the prediction of each frame of the feature sequence. Finally, the Connectionist Temporal Classification (CTC) decoder was used to obtain the final prediction result. A large number of experiments were conducted on the public Institut für Angewandte Mathematik (IAM) handwritten English word dataset. Experimental results show that this model obtains a Character Error Rate (CER) of 3.60% and a Word Error Rate (WER) of 12.70%, which verifies the feasibility of the proposed model.
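
The last step of this pipeline and the CER metric can be sketched in a few lines. The abstract does not say which CTC decoding strategy was used, so greedy (best-path) decoding is assumed here; blank index 0 is also an assumption. CER is computed as Levenshtein distance divided by the reference length.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse repeated per-frame labels, then drop blanks (best-path decoding)."""
    out, prev = [], None
    for i in frame_ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate relative to the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

Note that a blank between two identical labels keeps both, so the frame sequence `[0, 1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`.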

Multi-source and multi-label pedestrian attribute recognition based on domain adaptation

Nanjiang CHENG, Zhenxia YU, Lin CHEN, Hezhe QIAO

2022, 42(8): 2401-2406. DOI: 10.11772/j.issn.1001-9081.2021060950

The current public datasets for Pedestrian Attribute Recognition (PAR) have complicated attribute annotations and various collection scenarios, leading to large variations of the pedestrian attributes across datasets, so that it is hard to directly utilize the existing labeled information in the public datasets for PAR in practice. To address this issue, a multi-source and multi-label PAR method based on domain adaptation was proposed. Firstly, to transfer the styles of the different datasets into a unified one, the features of the samples were aligned by the domain adaptation method. Then, a multi-attribute one-hot coding and weighting algorithm was proposed to align the labels with the common attributes in multiple datasets. Finally, the multi-label semi-supervised loss function was combined to perform joint training across datasets to improve the attribute recognition accuracy. The proposed feature alignment and label alignment algorithms were able to effectively solve the heterogeneity problem of attributes in multiple PAR datasets. Experimental results after aligning three pedestrian attribute datasets PETA, RAPv1 and RAPv2 with the PA-100K dataset show that the proposed method improves the average accuracy by 1.22 percentage points, 1.62 percentage points and 1.53 percentage points respectively compared with the StrongBaseline method, demonstrating that this method has a strong advantage in cross-dataset PAR.

Lightweight human pose estimation based on attention mechanism

Kun LI, Qing HOU

2022, 42(8): 2407-2414. DOI: 10.11772/j.issn.1001-9081.2021061103

To solve the problems such as the large number of parameters and high computational complexity of high-resolution human pose estimation networks, a lightweight Sandglass Coordinate Attention Network (SCANet) based on High-Resolution Network (HRNet) was proposed for human pose estimation. The Sandglass module and the Coordinate Attention (CoordAttention) module were first introduced; then two lightweight modules, the Sandglass Coordinate Attention bottleneck (SCAneck) module and the Sandglass Coordinate Attention basicblock (SCAblock) module, were built on this basis to obtain the long-range dependence and accurate position information of the spatial direction of the feature map while reducing the number of model parameters and the computational complexity. Experimental results show that with the same image resolution and environmental configuration, the SCANet model reduces the number of parameters by 52.6% and the computational complexity by 60.6% compared with the HRNet model on the Common Objects in COntext (COCO) validation set; the number of parameters and computational complexity of the SCANet model are reduced by 52.6% and 61.1% respectively compared with those of the HRNet model on the Max Planck Institute for Informatics (MPII) validation set; compared with common human pose estimation networks such as Stacked Hourglass Network (Hourglass), Cascaded Pyramid Network (CPN) and SimpleBaseline, the SCANet model can still achieve high-precision prediction of key points of the human body with fewer parameters and lower computational complexity.

Video facial landmark tracking by multi-view constrained cascade regression

Shaosheng DAI, Kun XIONG, Yunduo WU, Jiawei XIAO

2022, 42(8): 2415-2422. DOI: 10.11772/j.issn.1001-9081.2021060996

In recent years, the algorithms for detecting facial landmarks in static images have been greatly improved. However, facial landmark detection and tracking are still challenging due to changes in factors such as head posture, occlusion and illumination in real videos. In order to solve this problem, a video facial landmark tracking algorithm based on multi-view constrained cascade regression was proposed. Firstly, the 3-dimensional and 2-dimensional sparse point sets were used to establish a transformation relationship and estimate the initial shape. Secondly, due to the large posture difference between face images, affine transformation was used to correct the pose of the face images. When the shape regression model was constructed, the multi-view constrained cascade regression model was used to reduce the shape variance, so that the learned regression model had stronger robustness to the shape variance. Finally, a reinitialization mechanism was adopted, and the Normalized Cross Correlation (NCC) template matching tracking algorithm was used to establish the shape relationship between consecutive frames when the feature points were correctly located. The experimental results on the public dataset used for testing show that the average error of the proposed algorithm is less than 10% of the interocular distance.
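
NCC template matching, used above to link shapes across consecutive frames, can be illustrated with a brute-force sketch. It uses the zero-mean variant of normalized cross-correlation and an exhaustive search over grayscale arrays; both choices are assumptions, since the abstract gives no implementation details.

```python
import numpy as np


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size arrays, in [-1, 1]."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0


def match_template(image, template):
    """Return the top-left position and score of the best-matching window."""
    th, tw = template.shape
    best, best_pos = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```

Because each window is mean-centred and normalized, the score is insensitive to uniform brightness and contrast changes between frames, which is why NCC suits tracking under varying illumination.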

Multi-scale object detection algorithm based on improved YOLOv3

Liying ZHANG, Chunjiang PANG, Xinying WANG, Guoliang LI

2022, 42(8): 2423-2431. DOI: 10.11772/j.issn.1001-9081.2021060984

In order to further improve the speed and precision of multi-scale object detection, and to solve problems such as missed detection, false detection and repeated detection caused by small object detection, an object detection algorithm based on improved You Only Look Once v3 (YOLOv3) was proposed to realize automatic detection of multi-scale objects. Firstly, the network structure was improved in the feature extraction network, and the attention mechanism was introduced into the spatial dimensions of the residual module to pay attention to small objects. Then, Dense Convolutional Network (DenseNet) was used to fully integrate shallow information of the network, and the depthwise separable convolution was used to replace the normal convolution of the backbone network, thereby reducing the number of model parameters and improving the detection speed. In the feature fusion network, the bidirectional fusion of the shallow and deep features was realized through the bidirectional feature pyramid structure, and the 3-scale prediction was changed to 4-scale prediction, which improved the learning ability of multi-scale features. In terms of loss function, Generalized Intersection over Union (GIoU) was selected as the loss function, so that the precision of identifying objects was increased, and the object miss rate was reduced. Experimental results show that on the Pascal VOC datasets, the mean Average Precision (mAP) of the improved YOLOv3 algorithm is as high as 83.26%, which is 5.89 percentage points higher than that of the original YOLOv3 algorithm, and the detection speed of the improved algorithm reaches 22.0 frame/s. Compared with the original YOLOv3 algorithm on the Common Objects in COntext (COCO) dataset, the improved algorithm has the mAP improved by 3.28 percentage points. At the same time, in multi-scale object detection, the mAP of the algorithm has been improved, which verifies the effectiveness of the object detection algorithm based on the improved YOLOv3.
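
The GIoU quantity chosen as the loss above has a standard definition, sketched here for axis-aligned boxes. The `(x1, y1, x2, y2)` box format and the loss form `1 - GIoU` are the usual conventions and are assumed rather than taken from the paper.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


def giou_loss(box_a, box_b):
    """Loss in [0, 2]; zero only when the boxes coincide."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, the value stays informative for disjoint boxes: two unit squares separated by a unit gap give a GIoU of -1/3, so the loss still provides a gradient that pulls the boxes together.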

Recommendation model incorporating multimodal DeepWalk and bias calibration factor

Ziteng WU, Chengyun SONG

2022, 42(8): 2432-2439. DOI: 10.11772/j.issn.1001-9081.2021061086

Exposure bias seriously affects the recommendation accuracy of collaborative filtering models, causing the prediction results to deviate from the real interests of users. However, the ability of the existing models to model exposure bias is limited, and these models even magnify the bias. Therefore, a recommendation model that integrates Multimodal DeepWalk and Bias Calibration factor (MmDW-BC) was proposed. Firstly, the multimodal attribute features of items were introduced as the connected edges in the item graph to alleviate the problem of interaction data sparsity of low-exposure items. On this basis, the graph embedding module, Multimodal DeepWalk (MmDW), was constructed to obtain rich node representations by integrating item multimodal information into the embedding vectors. Finally, a new bias calibration algorithm was designed based on the calibration strategy to predict user preferences. Experimental results on the Amazon and ML-1M datasets show that explicitly considering exposure bias in the MmDW-BC recommendation model is necessary and effective for improving the recommendation accuracy.

Rotary machine fault diagnosis based on improved residual convolutional auto-encoding network and class adaptation

Jian ZHANG, Peiyuan CHENG, Siyu SHAO

2022, 42(8): 2440-2449. DOI: 10.11772/j.issn.1001-9081.2021060905

Aiming at the problem of insufficient deep network model training caused by limited rotary machine sensor signal samples, a fault diagnosis model combining improved residual convolutional auto-encoding network and class adaptation method was proposed to deal with data with small sample size. Firstly, paired samples were created from a small number of labeled source domain data and target domain data, and an improved one-dimensional residual convolutional auto-encoding network was designed to extract features from two types of original vibration signals with different distributions. Secondly, the Maximum Mean Discrepancy (MMD) was used to reduce the distribution difference, and the data space of the same fault category from the two domains was mapped to a common feature space. Finally, accurate fault diagnosis was realized. Experimental results show that, compared with the fine-tuning and domain adaptation methods, the proposed model is able to effectively improve the fault diagnosis accuracy of the target domain vibration data with few labels under different working conditions.
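
The MMD term used above to shrink the gap between source and target features can be sketched as follows. The Gaussian kernel, its bandwidth `sigma`, and the biased estimator are common choices assumed here; the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel between the rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy."""
    k_ss = gaussian_kernel(source, source, sigma).mean()
    k_tt = gaussian_kernel(target, target, sigma).mean()
    k_st = gaussian_kernel(source, target, sigma).mean()
    return float(k_ss + k_tt - 2.0 * k_st)
```

In a class adaptation setting, this quantity would be evaluated per fault category on the encoder features of paired source and target samples and added to the training loss.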

Varied density clustering algorithm based on border point detection

Yanwei CHEN, Xingwang ZHAO

2022, 42(8): 2450-2460. DOI: 10.11772/j.issn.1001-9081.2021061083

The density clustering algorithm has been widely used because of its robustness to noise and its ability to find clusters of any shape. However, in practical applications, this type of algorithm suffers from poor clustering results due to the uneven distribution of the densities of different clusters in the dataset and the difficulty of distinguishing the borders between clusters. In order to solve the above problem, a Varied Density Clustering algorithm based on Border point Detection (VDCBD) was proposed. Firstly, the border points between varied density clusters were recognized based on the given relative density measurement method to enhance the separability of adjacent clusters. Secondly, the points in the non-border area were clustered to find the core class structures of the dataset. Thirdly, the detected border points were allocated to the corresponding core class structures according to the principle of high-density neighbor allocation. Finally, the noise points in the dataset were recognized based on the class structure information. The proposed algorithm was compared and analyzed with clustering algorithms such as K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, Density Peaks Clustering Algorithm (DPCA), CLUstering based on Backbone (CLUB) algorithm and Border Peeling clustering (BP) algorithm on artificial datasets and UCI datasets. Experimental results show that the proposed algorithm can effectively solve the problems of uneven distribution of density and indistinguishable borders, and is superior to the existing algorithms on the evaluation indicators of Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), F-Measure (FM) and Accuracy (ACC); in the analysis of operating efficiency, when the data size is relatively large, the operating efficiency of VDCBD is higher than those of DPCA, CLUB and BP algorithms.

Deep asymmetric discrete cross-modal hashing method

Xiaoyu WANG, Zhanqing WANG, Wei XIONG

2022, 42(8): 2461-2470. DOI: 10.11772/j.issn.1001-9081.2021061017

Most deep supervised cross-modal hashing methods adopt a symmetric strategy to learn hash codes, so the supervision information in large-scale datasets cannot be used effectively. In addition, the discrete constraints on hash codes are typically handled with a relaxation-based strategy, which causes a large quantization error and leads to sub-optimal hash codes. Aiming at the above problems, a Deep Asymmetric Discrete Cross-modal Hashing (DADCH) method was proposed. Firstly, an asymmetric learning framework combining deep neural networks and dictionary learning was proposed to learn the hash codes of query instances and database instances, thereby mining the supervision information of the data more effectively and reducing the training time of the model. Then, a discrete optimization algorithm was used to optimize the hash code matrix column by column to reduce the quantization error of hash code binarization. At the same time, in order to fully mine the semantic information of the data, a label layer was added to the neural network for label prediction, and semantic information embedding was used to embed the discrimination information of different categories into the hash codes through linear mapping, making the hash codes more discriminative. Experimental results show that on the IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets, the mean Average Precision (mAP) of the proposed method when retrieving text by image is about 11.6, 5.2 and 14.7 percentage points higher, respectively, than that of Self-Supervised Adversarial Hashing (SSAH), an advanced deep cross-modal retrieval method proposed in recent years.
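
The mAP figure quoted above is computed over ranked retrieval lists. A minimal sketch of that evaluation for binary hash codes follows, assuming Hamming-distance ranking and a simplified single-label notion of relevance (the datasets named above are multi-label, where relevance is usually "shares at least one label"). All function names are illustrative and not taken from the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(relevance):
    """AP of one ranked list; relevance holds 0/1 flags in rank order."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Rank the database by Hamming distance to each query, then average AP."""
    aps = []
    for q_code, q_label in zip(query_codes, query_labels):
        order = sorted(range(len(db_codes)),
                       key=lambda i: hamming(q_code, db_codes[i]))
        # Assumption: an item is relevant when its single label matches.
        relevance = [1 if db_labels[i] == q_label else 0 for i in order]
        aps.append(average_precision(relevance))
    return sum(aps) / len(aps)
```

In a cross-modal setting the query codes would come from one modality (for example images) and the database codes from the other (text); the ranking step itself is unchanged.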

Data management method for building internet of things based on Hashgraph

Xu WANG, Yumin SHEN, Xiaoyun XIONG, Peng LI, Jinlong WANG

2022, 42(8): 2471-2480. DOI: 10.11772/j.issn.1001-9081.2021060958

A Hashgraph-based data management method for building Internet of Things (IoT) was proposed to address the problems of severely insufficient throughput and high response delay when applying blockchain to building IoT scenarios. In this method, a Directed Acyclic Graph (DAG) was used for data storage, exploiting the high concurrency of the graph structure to increase the throughput of the blockchain; the Hashgraph algorithm was applied to reach consensus on the data stored in the DAG to reduce the time consumed by consensus; and smart contracts were designed to realize access control and prevent unauthorized users from operating on data. Caliper, a blockchain performance testing tool, was adopted for the performance test. The results show that in a medium-scale simulation environment with 32 nodes, the throughput of the proposed method is 1063.1 transactions per second, which is 6 times and 3 times that of the edge computing method and the cross-chain method respectively; the data storage delay and control delay of the proposed method are 4.57 seconds and 4.92 seconds respectively, indicating that the proposed method responds faster than the comparison methods; and the transaction success rate of this method reaches 87.4% in spike testing. At the same time, the prototype system based on this method can run stably for 120 hours in stability testing. These results illustrate that the proposed method can effectively improve the throughput and response speed of blockchain, and meets actual needs in building IoT scenarios.

Data storage scheme based on hybrid algorithm blockchain and node identity authentication

Hongliang TIAN, Jiayue WANG, Chenxi LI

2022, 42(8): 2481-2486. DOI: 10.11772/j.issn.1001-9081.2021061127

To enhance the integrity and security of cloud data storage, a data storage scheme based on hybrid algorithm blockchain and a decentralized framework integrating identity authentication and privacy protection were proposed for Wireless Sensor Networks (WSNs). Firstly, the collected information was transmitted to the base station by the cluster heads, and all the key parameters were recorded on the distributed blockchain and transmitted to the cloud storage by the base station. Then, in order to obtain a higher security level, the 160-bit key of Elliptic Curve Cryptography (ECC) and the 128-bit key of Advanced Encryption Standard (AES) were combined, and the key pairs were exchanged between the cloud storage layers. The proposed blockchain is based on a hybrid algorithm and combined with an identity verification scheme, which ensures the secure storage of cloud data. In addition, malicious nodes can be removed directly from the blockchain and their authentication can be revoked through the base stations, an operation that is convenient and fast. Simulation results show that compared with the decentralized Blockchain Information Management (BIM) scheme, the secure localization algorithm based on trust and Decentralized Blockchain Evaluation (DBE), and the Key Derivation Encryption and Data Analysis (KDE-DA) management scheme, the proposed scheme has some advantages in delay, throughput and computational overhead.

Review of mobile edge caching optimization technologies for 5G/Beyond 5G

Yanpei LIU, Ningning CHEN, Yunjing ZHU, Liping WANG

2022, 42(8): 2487-2500. DOI: 10.11772/j.issn.1001-9081.2021060952

With the widespread use of mobile devices and emerging mobile applications, the exponential growth of traffic in mobile networks has caused problems such as network congestion, large delay, and poor user experience, so the needs of mobile users cannot be satisfied. Edge caching technology can greatly relieve the transmission pressure of wireless networks by reusing hot contents in the network. At the same time, it has become one of the key technologies in 5G/Beyond 5G Mobile Edge Computing (MEC) for reducing the network delay of user requests and thus improving the network experience of users. Focusing on mobile edge caching technology, firstly, the application scenarios, main characteristics, execution process, and evaluation indicators of mobile edge caching were introduced. Secondly, the edge caching strategies with energy efficiency, delay, hit ratio, and revenue maximization as optimization goals were analyzed and compared, and their key research points were summarized. Thirdly, the deployment of the MEC servers supporting 5G was described; on this basis, the green mobility-aware caching strategy in 5G networks and the caching strategy in 5G heterogeneous cellular networks were analyzed. Finally, the research challenges and future development directions of edge caching strategies were discussed from the aspects of security, mobility-aware caching, edge caching based on reinforcement learning and federated learning, and edge caching for Beyond 5G/6G networks.

Hierarchical resource allocation mechanism of cooperative mobile edge computing

Jieqin WANG, Shihyang LIN, Shiming PENG, Shuo JIA, Miaohui YANG

2022, 42(8): 2501-2510. DOI: 10.11772/j.issn.1001-9081.2021060901

Concerning the heavy computing demand of vehicle task offloading and the limited computing capacity of local edge servers in the Internet of Vehicles (IoV), a Hierarchical Resource Allocation Mechanism of cooperative mobile edge computing (HRAM) was proposed. In this mechanism, the computing resources of Mobile Edge Computing (MEC) servers were reasonably allocated and effectively utilized with a multi-layer architecture, so that the multi-hop data forwarding delay between different MEC servers was reduced and the delay of task offloading requests was optimized. Firstly, the system model, communication model, decision model, and calculation model of IoV edge computing were built. Next, the Analytic Hierarchy Process (AHP) was used to comprehensively consider multiple factors and determine the target server to which the offloaded task is transferred. Finally, a task routing strategy with dynamic weights was proposed to make use of the communication capabilities of the overall network and shorten the request delay of task offloading. Simulation results show that compared with the Resource Allocation of Task Offloading in Single-hop (RATOS) algorithm and the Resource Allocation of Task Offloading in Multi-hop (RATOM) algorithm, the HRAM algorithm reduces the request delay of task offloading by 40.16% and 19.01% respectively, and can satisfy the computing needs of more offloaded tasks under the premise of meeting the maximum tolerable delay.
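
The AHP step can be illustrated with a small sketch. It derives criterion weights from a pairwise comparison matrix using the row geometric mean method (one common AHP approximation; the abstract does not say which variant the authors use) and then scores candidate servers by a weighted sum. The criteria, server names and scores are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Criterion weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric mean approximation of the principal eigenvector.
    pairwise[i][j] states how much more important criterion i is than j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def best_server(weights, server_scores):
    """Pick the server with the highest weighted score.

    server_scores maps a (hypothetical) server name to per-criterion
    scores already normalized to [0, 1], higher being better.
    """
    def weighted(scores):
        return sum(w * s for w, s in zip(weights, scores))
    return max(server_scores, key=lambda name: weighted(server_scores[name]))
```

For example, with three hypothetical criteria (free computing capacity, link quality, queue length) and the consistent matrix `[[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]`, the weights come out proportional to 4 : 2 : 1.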

Evolutionary algorithm based on approximation technique for solving bilevel programming problems

Yu SHEN, Hecheng LI, Lijuan CHEN

2022, 42(8): 2511-2518. DOI: 10.11772/j.issn.1001-9081.2021061079

Bilevel programming involves two optimization problems located at the upper level (leader) and the lower level (follower). The constraint domain of the leader is determined implicitly by the follower, the leader objective dominates in a bilevel optimization procedure, and the follower objective must be optimized with respect to the follower variables. The hierarchical structure of the bilevel optimization problem causes high computational complexity. In particular, the frequent computations of the follower problem accumulate a large computational cost. In order to solve this kind of problem effectively, an evolutionary algorithm based on approximation technique was developed. Firstly, a multi-population co-evolution approach was applied, and the crossover and mutation operators were used respectively to balance the exploitation and exploration capabilities of the algorithm. Secondly, based on sensitivity analysis theory, an approximate evaluation method for new individuals was designed to reduce how often the algorithm has to solve the follower problem. A demonstration of the approximation effect on a numerical example shows that the positions of the approximate offspring individuals and the exact offspring individuals mostly coincide. In addition, the results on 10 common examples show that the proposed algorithm can find better optimal solutions than the multi-valued mapping algorithm. CPU time comparison shows that the approximation technique effectively speeds up the search for the optimal solution, thereby reducing the running time. Therefore, the effectiveness of the approximation technique adopted by the algorithm is demonstrated.
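
For orientation, the leader/follower structure described above is usually written in the following generic form. The symbols $F$, $f$, $G$, $g$, $x$ and $y$ are standard textbook notation assumed here, not notation taken from the paper.

```latex
\begin{aligned}
\min_{x} \quad & F(x, y) \\
\text{s.t.} \quad & G(x, y) \le 0, \\
& y \in \arg\min_{y'} \bigl\{\, f(x, y') \;:\; g(x, y') \le 0 \,\bigr\}
\end{aligned}
```

Here $x$ are the leader variables and $y$ the follower variables. Every evaluation of a candidate $x$ requires solving the inner problem for $y$, which is the repeated follower cost that an approximation technique aims to reduce.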

Differential disturbed heap-based optimizer

Xinming ZHANG, Shaochen WEN, Shangwang LIU

2022, 42(8): 2519-2527. DOI: 10.11772/j.issn.1001-9081.2021061104

In order to solve problems such as the insufficient search ability and low search efficiency of Heap-Based Optimizer (HBO) on complex problems, a Differential disturbed HBO (DDHBO) was proposed. Firstly, a random differential disturbance strategy was proposed to update the best individual's position, addressing the low search efficiency caused by HBO not updating this individual. Secondly, a best-worst differential disturbance strategy was used to update the worst individual's position and strengthen its search ability. Thirdly, the ordinary individual's position was updated by a multi-level differential disturbance strategy to strengthen information communication among individuals at multiple levels and improve the search ability. Finally, a dimension-based differential disturbance strategy was proposed for other individuals to improve the probability of obtaining effective solutions in the initial stage of the original updating model. Experimental results on a large number of complex functions from CEC2017 show that compared with HBO, DDHBO has better optimization performance on 96.67% of the functions and a shorter average running time (3.4450 s), and that DDHBO also has significant advantages over other state-of-the-art algorithms such as Worst opposition learning and Random-scaled differential mutation Biogeography-Based Optimization (WRBBO), Differential Evolution and Biogeography-Based Optimization (DEBBO), and Hybrid Particle Swarm Optimization and Grey Wolf Optimizer (HGWOP).

Software quality evaluation method considering decision maker’s psychological behaviors

Yanhao SUN, Wei XU, Tao ZHANG, Ningxin LIU

2022, 42(8): 2528-2533. DOI: 10.11772/j.issn.1001-9081.2021060999

Aiming at the lack of consideration of the psychological behaviors of decision makers in software quality evaluation methods, a TOmada de Decisão Interativa e Multicritério (TODIM) software quality evaluation method based on interval 2-tuple linguistic information was proposed. Firstly, interval 2-tuple linguistic information was used to characterize the evaluation information of experts for software quality. Secondly, the subjective and objective weights of software quality attributes were calculated by the subjective weighting method and Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) respectively. On this basis, the comprehensive weights of software quality attributes were obtained by the combined weighting method. Thirdly, in order to better describe the psychological behaviors of experts in the process of software quality evaluation, TODIM was introduced into software quality evaluation. Finally, the method was used to evaluate the software quality of the assistant dispatcher terminal in a high-speed railway dispatching system. The result shows that the third assistant dispatcher terminal software provided by the railway software supplier has the highest dominance value and the best quality. Comparing this method with regret theory and Preference Ranking Organization METHod for Enrichment Evaluations (PROMETHEE-II) shows that the three methods agree on the best quality software, but their overall rankings are somewhat different, indicating that the constructed method is better at describing the interaction between multiple criteria and the psychological behaviors of decision makers.

Test suite selection method based on commit prioritization and prediction model

Meiying LIU, Qiuhui YANG, Xiao WANG, Chuang CAI

2022, 42(8): 2534-2539. DOI: 10.11772/j.issn.1001-9081.2021061016

In order to reduce the regression test set and improve the efficiency of regression testing in the Continuous Integration (CI) environment, a regression test suite selection method for the CI environment was proposed. First, the commits were prioritized based on the historical failure rate and execution rate of each test suite related to each commit. Then, a machine learning method was used to predict the failure rates of the test suites involved in each commit, and the test suites with higher failure rates were selected. In this method, the commit prioritization technology and the test suite selection technology were combined to increase the failure detection rate and reduce the test cost. Experimental results on Google's open-source dataset show that compared with the methods using the same commit prioritization method and test suite selection method, the proposed method improves the Average Percentage of Faults Detected per cost (APFDc) by 1% to 27%; at the same cost of test time, the TestRecall of this method increases by 33.33 to 38.16 percentage points, the ChangeRecall increases by 15.67 to 24.52 percentage points, and the test suite SelectionRate decreases by about 6 percentage points.
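
APFDc rewards orderings that expose costly faults early. A minimal sketch of the standard cost-cognizant definition follows, assuming each fault is described by a severity and the position of the first test that detects it. Parameter names are illustrative; the abstract does not say how costs and severities are assigned, and in practice test cost is often execution time with all severities set to 1.

```python
def apfdc(costs, severities, first_detecting):
    """Cost-cognizant Average Percentage of Faults Detected.

    costs:           cost t_j of each test, in execution order
    severities:      severity f_i of each fault
    first_detecting: 0-based index TF_i of the first test revealing fault i
    """
    total_cost = sum(costs)
    total_severity = sum(severities)
    numerator = 0.0
    for severity, tf in zip(severities, first_detecting):
        # Cost remaining from the detecting test onward, counting that
        # test itself at half weight.
        remaining = sum(costs[tf:]) - 0.5 * costs[tf]
        numerator += severity * remaining
    return numerator / (total_cost * total_severity)
```

With four unit-cost tests and a single fault, detection by the first test gives 0.875 and detection by the last test gives 0.125, which shows how the metric rewards early detection.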

Identifier obfuscation method based on low level virtual machine

Dajiang TIAN, Chengyang LI, Tianbo HUANG, Weiping WEN

2022, 42(8): 2540-2547. DOI: 10.11772/j.issn.1001-9081.2021071166

Most of the existing code obfuscation solutions are limited to a specific programming language or platform, and so are not widely applicable or general. Moreover, control flow obfuscation and data obfuscation introduce additional overhead. Aiming at the above problems, an identifier obfuscation method was proposed based on Low Level Virtual Machine (LLVM). Four identifier obfuscation algorithms were implemented in the method: the random identifier algorithm, the overload induction algorithm, the abnormal identifier algorithm, and the high-frequency word replacement algorithm. At the same time, a new hybrid obfuscation algorithm was designed by combining these algorithms. In the proposed method, firstly, the function names that met the obfuscation criteria were selected in the intermediate files compiled by the front-ends. Secondly, these function names were processed by specific obfuscation algorithms. Finally, the obfuscated files were transformed into binary files by specific compilation back-ends. The identifier obfuscation method based on LLVM is suitable for the languages supported by LLVM and does not affect the normal functions of the program. For different programming languages, the time overhead is within 20% and the space overhead hardly increases. At the same time, the average obfuscation ratio of the program is 77.5%, and theoretical analysis indicates that the proposed hybrid identifier algorithm provides stronger concealment than the single replacement algorithm and the overload algorithm. Experimental results show that the proposed method has low performance overhead, strong concealment, and wide versatility.
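
The idea behind the random identifier algorithm can be sketched in a few lines. The sketch below works on a plain list of function names rather than on LLVM intermediate files, and its selection criterion (skip names in a `keep` set such as `main`) is a hypothetical stand-in for the paper's obfuscation criteria.

```python
import random
import string


def random_identifier(rng, length=12):
    """Generate a random name that is a valid C-style identifier."""
    first = rng.choice(string.ascii_letters + "_")
    tail_chars = string.ascii_letters + string.digits + "_"
    return first + "".join(rng.choice(tail_chars) for _ in range(length - 1))


def build_rename_map(function_names, keep=("main",), seed=None):
    """Map each renamable function name to a fresh, unique random name."""
    rng = random.Random(seed)
    used = set(function_names)  # never collide with an existing name
    mapping = {}
    for name in function_names:
        if name in keep:  # assumption: entry points must stay intact
            continue
        new_name = random_identifier(rng)
        while new_name in used:
            new_name = random_identifier(rng)
        used.add(new_name)
        mapping[name] = new_name
    return mapping
```

Applying the mapping consistently to every definition and call site is what preserves program behavior; in an LLVM-based tool that rewriting would happen on the intermediate representation rather than on source text.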

Rethinking errors in human pose estimation heatmap

Feiyu YANG, Zhan SONG, Zhenzhong XIAO, Yaoyang MO, Yu CHEN, Zhe PAN, Min ZHANG, Yao ZHANG, Beibei QIAN, Chaowei TANG, Wu JIN

2022, 42(8): 2548-2555. DOI: 10.11772/j.issn.1001-9081.2021050805

Currently, the leading human pose estimation algorithms are heatmap-based. Heatmap decoding (i.e. transforming heatmaps into the coordinates of human joint points) is a basic step of these algorithms. The existing heatmap decoding algorithms neglect the effect of systematic errors. Therefore, an error compensation based heatmap decoding algorithm was proposed. Firstly, an error compensation factor of the system was estimated during training. Then, the error compensation factor was used in the inference stage to compensate for the prediction errors of human joint points, including both systematic error and random error. Extensive experiments were carried out on different network architectures, input resolutions, evaluation metrics and datasets. The results show that compared with the existing optimal algorithm, the proposed algorithm achieves a significant accuracy gain. Specifically, by using the proposed algorithm, the Average Precision (AP) of the HRNet-W48-256×192 model is improved by 2.86 percentage points on the Common Objects in COntext (COCO) dataset, and the Percentage of Correct Keypoints with respect to head (PCKh) of the ResNet-152-256×256 model is improved by 7.8 percentage points on the Max Planck Institute for Informatics (MPII) dataset. Besides, unlike the existing algorithms, the proposed algorithm does not need Gaussian smoothing preprocessing or derivative computation, so it is 2 times faster than the existing optimal algorithm. It can be seen that the proposed algorithm has practical value for fast and accurate human pose estimation.
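
To make the decoding step concrete, the sketch below takes the location of the heatmap maximum and then applies a correction. The abstract does not state the form of the error compensation factor, so the additive `(dx, dy)` offset used here is purely an assumption for illustration.

```python
def decode_heatmap(heatmap, compensation=(0.0, 0.0)):
    """Return (x, y) of the heatmap peak, shifted by a compensation offset.

    heatmap:      2-D list of scores, indexed as heatmap[y][x]
    compensation: hypothetical additive (dx, dy) correction, imagined as
                  estimated once during training and reused at inference
    """
    best_value = float("-inf")
    best_x = best_y = 0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_x, best_y = value, x, y
    return best_x + compensation[0], best_y + compensation[1]
```

Because the correction is a constant-time adjustment of the argmax result, a decoder of this shape needs no smoothing pass over the heatmap, which is consistent with the speed advantage the abstract reports.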

Decoupled visual servoing control method based on point and line features

Jinyan LU, Xiaoke QI

2022, 42(8): 2556-2563. DOI: 10.11772/j.issn.1001-9081.2021071178

Aiming at the problem of automatic alignment for robots, a decoupled visual servoing control method based on point and line features was proposed. In the method, points and lines were used as image features, and the interaction matrix of the image features was used to decouple attitude control and position control, so as to realize six degrees of freedom alignment. Firstly, the attitude control law was designed according to the lines and their interaction matrix to eliminate the rotational deviation. Then, the position control law was designed according to the points and their interaction matrix to eliminate the positional deviation. Finally, the automatic alignment between the robot end-effector and the target was realized. In the alignment control process, the depth was estimated online from the amount of camera motion and the variation of the features before and after the camera motion. In addition, a monitor was designed to adjust the motion speed of the camera, thereby ensuring that the features were always in the field of view of the camera. The six degrees of freedom alignment of the robot on the Eye-in-Hand platform was completed by the proposed method and by the traditional image based visual servoing method, respectively. The proposed method realizes the automatic alignment of the robot in 16 steps, with a maximum translation error of 3.26 mm and a maximum rotation error of 0.72° of the robot end-effector after alignment. Compared with the traditional method, the proposed method has a more efficient control process, faster convergence of control error and less alignment error. Experimental results show that the proposed method can realize fast and high-precision automatic alignment, improving the autonomy and intelligence level of robot operation, and is expected to be applied in fields such as target tracking, picking and positioning, automatic assembly, welding, and service robots.

High-accuracy video image stabilization algorithm incorporating temporal and spatial saliency

Lihua YIN, Liang KANG, Wenhua ZHU

2022, 42(8): 2564-2570. DOI: 10.11772/j.issn.1001-9081.2021061061

In order to eliminate the interference of complicated moving foreground on the accuracy of video stabilization, and to exploit the advantages of temporal and spatial saliency in moving target detection, a high-accuracy video stabilization algorithm incorporating temporal and spatial saliency was proposed. In the proposed algorithm, spatio-temporal saliency detection was used to identify and eliminate the moving targets. At the same time, multi-grid motion paths were adopted for motion compensation. The proposed algorithm specifically includes Speeded Up Robust Features (SURF) feature point extraction and matching, spatio-temporal saliency target detection, grid division and motion vector calculation, motion trajectory generation, multi-path smoothing, and motion compensation. Experimental results show that compared with traditional image stabilization algorithms, the proposed algorithm performs outstandingly on the stability index. For videos with interference from large moving foreground regions, the stability of the proposed algorithm is improved by about 9.6% compared with Robust Traffic Video Stabilization Method assisted by foreground feature trajectories (RTVSM). For videos with interference from multiple moving foreground objects, the stability of the proposed algorithm is improved by about 5.8% compared with the Bundled-paths algorithm, which verifies the stabilization advantage of the proposed algorithm.

Image denoising model based on approximate U-shaped network structure

Huazhong JIN, Xiuyang ZHANG, Zhiwei YE, Wenqi ZHANG, Xiaoyu XIA

2022, 42(8): 2571-2577. DOI: 10.11772/j.issn.1001-9081.2021061126

Aiming at the problems of poor denoising effect and long training period in image denoising, an image denoising model based on approximate U-shaped network structure was proposed. Firstly, the original linear network structure was modified to an approximate U-shaped network structure by using convolutional layers with different strides. Then, the image information of different receptive fields was superimposed to preserve the original information of the image as much as possible. Finally, a deconvolutional network layer was introduced for image restoration and further noise removal. Experimental results on the Set12 and BSD68 test sets show that compared with the Denoising Convolutional Neural Network (DnCNN) model, the proposed model has an average increase of 0.04 to 0.14 dB in Peak Signal-to-Noise Ratio (PSNR) and an average reduction of 41% in training time, verifying that the proposed model has better denoising effect and shorter training time.
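
The PSNR gains quoted above follow the standard definition, PSNR = 10·log10(MAX² / MSE). A minimal sketch for grayscale images stored as nested lists is given below; the 8-bit peak value of 255 is an assumption about the image format.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-sized images."""
    ref = [p for row in reference for p in row]
    out = [p for row in restored for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.1 dB corresponds to roughly a 2.3% reduction in mean squared error.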

De-raining algorithm based on joint attention mechanism for single image

Chengxia XU, Qing YAN, Teng LI, Kaichao MIAO

2022, 42(8): 2578-2585. DOI: 10.11772/j.issn.1001-9081.2021061072

It is challenging for the existing single image de-raining algorithms to fully explore the interaction of attention mechanisms in different dimensions. Therefore, an algorithm based on joint attention mechanism was proposed to realize single image de-raining. The algorithm contains a channel attention mechanism and a spatial attention mechanism. Specifically, in the channel attention mechanism, the distribution of rain streak features in each channel was detected and the importance of each feature channel was differentiated. In the spatial attention mechanism, aiming at the spatial relationship of rain streak distribution within channels, the context information was accumulated in a local-to-global manner to realize efficient and accurate de-raining. Additionally, a deep residual shrinkage network with a soft threshold nonlinear transformation sub-network embedded in the residual module was used to zero out redundant information via a soft threshold function, thereby improving the ability of the Convolutional Neural Network (CNN) to retain image details in noise. Experiments were carried out on open rainfall datasets and self-constructed rainfall datasets. Compared with spatial attention alone, the joint attention de-raining algorithm improves Peak Signal-to-Noise Ratio (PSNR) by 4.5% and Structural SIMilarity (SSIM) by 0.3%. Experimental results show that the proposed algorithm can effectively perform single image de-raining while preserving image details. At the same time, this algorithm outperforms the comparison algorithms in terms of visual effect and quantitative metrics.
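
The soft threshold function mentioned above has a simple closed form: values whose magnitude is at most the threshold become zero, and all others are shrunk toward zero by the threshold. A minimal scalar sketch follows; in a residual shrinkage network the threshold `tau` would be produced by a learned sub-network per channel, which is not modeled here.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; zero it out when |x| <= tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_features(features, tau):
    """Apply soft thresholding element-wise to a list of feature values."""
    return [soft_threshold(v, tau) for v in features]
```

Small activations, which are more likely to be noise or redundant information, are removed entirely, while large activations keep their sign and most of their magnitude.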

Super-resolution reconstruction algorithm of medical image based on lightweight dense neural network

Yining WANG, Qingshan ZHAO, Pinle QIN, Yulan HU, Chunmei ZONG

2022, 42(8): 2586-2592. DOI: 10.11772/j.issn.1001-9081.2021061093

The clarity of medical images directly affects clinical diagnosis. Due to the limitations of imaging equipment and environmental factors, it is often impossible to obtain high-resolution images directly, and the hardware of most smart terminals is not suitable for running large-scale deep neural network models. Therefore, a lightweight dense neural network model with fewer layers and parameters was proposed. First of all, dense blocks and a skip layer structure were used in the network for global and local image feature learning, and more feature information was introduced into the activation function, so that the shallow low-level image features in the network could be propagated to the higher layers more easily, thereby improving the super-resolution reconstruction quality of medical images. Then, a multi-stage method was adopted to train the network and a dual-task loss was used to strengthen the supervision and guidance in network learning, which addressed the increased training difficulty caused by super-resolution reconstruction at high magnification. Compared with Nearest Neighbor (NN) interpolation, bilinear interpolation, bicubic interpolation, the Convolutional Neural Network (CNN) based algorithm and the residual neural network based algorithm, the proposed model reconstructs the texture details of medical images better, achieves higher Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM), and performs well in both training speed and hardware consumption, giving it high practical value.

Few-shot diatom detection combining multi-scale multi-head self-attention and online hard example mining

Jiehang DENG, Wenquan GUO, Hanjie CHEN, Guosheng GU, Jingjian LIU, Yukun DU, Chao LIU, Xiaodong KANG, Jian ZHAO

2022, 42(8): 2593-2600. DOI: 10.11772/j.issn.1001-9081.2021061075

The detection precision is low when the diatom training sample size is small, so a few-shot diatom detection model based on Multi-scale Multi-head Self-attention (MMS) and Online Hard Example Mining (OHEM), namely MMSOFDD, was proposed on the basis of the few-shot object detection model Two-stage Fine-tuning Approach (TFA). Firstly, a Transformer-based feature extraction network, Bottleneck Transformer Network-101 (BoTNet-101), was constructed by combining ResNet-101 with a multi-head self-attention mechanism to make full use of the local and global information of diatom images. Then, multi-head self-attention was improved to MMS, which removed the original multi-head self-attention's limitation of processing a single object scale. Finally, OHEM was introduced into the model predictor, and the diatoms were identified and localized. Ablation and comparison experiments between the proposed model and other few-shot object detection models were conducted on a self-constructed diatom dataset. Experimental results show that the mean Average Precision (mAP) of MMSOFDD is 69.60%, which is 5.89 percentage points higher than the 63.71% of TFA; and compared with the 61.60% and 60.90% of the few-shot object detection models Meta R-CNN and Few-Shot In Wild (FSIW), the mAP of the proposed model is higher by 8.00 percentage points and 8.70 percentage points respectively. Moreover, MMSOFDD can effectively improve diatom detection precision when the training sample size is small.
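
The core of OHEM is to back-propagate only through the examples the model currently gets most wrong. A minimal sketch of that selection step follows, assuming per-example losses have already been computed; the function names and the choice of a fixed `keep` count are illustrative, not details from the paper.

```python
def select_hard_examples(losses, keep):
    """Return indices of the `keep` examples with the largest loss."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]


def ohem_loss(losses, keep):
    """Mean loss over only the hardest `keep` examples."""
    hard = select_hard_examples(losses, keep)
    return sum(losses[i] for i in hard) / len(hard)
```

In a detector, `losses` would be the per-region-proposal losses of one mini-batch, so easy background proposals stop dominating the gradient, which matters most when few labeled samples are available.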

Surface detection algorithm of multi-shape small defects for section steel based on deep learning

Yajiao LIU, Haitao YU, Jiang WANG, Lifeng YU, Chunhui ZHANG

2022, 42(8): 2601-2608. DOI: 10.11772/j.issn.1001-9081.2021060971

In order to solve the problems of low detection efficiency and poor detection precision caused by the variety of surface defects and the large number of small defects on section steel, a detection algorithm for surface defects of section steel, namely Steel-YOLOv3, was proposed on the basis of deformable convolution and a multi-scale dense feature pyramid. Firstly, deformable convolution was used to replace the convolutional layers of some of the residual units in the Darknet53 network, which strengthened the ability of the feature extraction network to learn features of multiple types of defects on the surface of section steel. Secondly, a multi-scale dense feature pyramid module was designed, in which a shallower prediction scale was added to the 3 prediction scales of the original YOLOv3 algorithm and the multi-scale feature maps were connected across layers, thereby enhancing the ability to characterize dense small defects. Finally, according to the defect size distribution characteristics of section steel, the K-means dimension clustering method was used to optimize the scales of anchor boxes, and the anchor boxes were evenly distributed to the 4 corresponding prediction scales. Experimental results show that the Steel-YOLOv3 algorithm has a detection mean Average Precision (mAP) of 89.24%, which is improved by 3.51%, 26.46%, 12.63% and 5.71% compared with those of the Faster Region-based Convolutional Neural Network (Faster R-CNN), Single Shot multibox Detector (SSD), YOLOv3 and YOLOv5 algorithms respectively. In addition, the detection rate of tiny spalling defects is significantly improved by the proposed algorithm. Moreover, the proposed algorithm can detect 25.62 images per second, which meets the requirement of real-time detection, so the algorithm can be applied to the online detection of surface defects of section steel.

Positive influence maximization based on reverse influence sampling

Shuxin YANG, Jingfeng XU

2022, 42(8): 2609-2616. DOI: 10.11772/j.issn.1001-9081.2021071185

Existing works on influence maximization mainly focus on unsigned networks and neglect hostile relationships between individuals in the network. To address the positive influence maximization problem in signed networks, a Reverse Influence Sampling in Signed network (RIS-S) algorithm was proposed on the basis of the Polarity-related Independent Cascade (IC-P) model. Firstly, to make the method applicable to signed networks, the polarity relationships between nodes were considered when generating reverse reachable sets. Secondly, to improve the effectiveness of the reverse reachable sets, the traversal depth of sampling was limited. Finally, the positive influence ranges and running times of RIS-S, Influence Maximization via Martingales (IMM), Positive Out-Degree (POD) and Effective Degree algorithms were compared on three real signed network datasets to verify the effectiveness of the proposed algorithm. Experimental results show that the RIS-S algorithm obtains a wider positive influence range by selecting more accurate seeds, and runs faster than IMM, an algorithm of the same type. It can therefore be concluded that the RIS-S algorithm can solve the positive influence maximization problem in signed networks.
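
The two modifications described above (polarity-aware sampling and a depth limit) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the RIS-S algorithm itself: the data layout `in_edges`, the function name `sample_rr_set`, and the rule that a node's polarity is the product of the edge signs along the first sampled path to the target are all hypothetical choices, since the abstract does not give these details.

```python
import random
from collections import deque


def sample_rr_set(in_edges, target, max_depth, rng):
    """Sample one depth-limited reverse reachable set on a signed graph.

    in_edges[v] is a list of (u, prob, sign) tuples for edges u -> v,
    where prob is the activation probability and sign is +1 or -1.
    Returns the nodes assumed to influence `target` positively.
    """
    polarity = {target: 1}          # path polarity relative to the target
    queue = deque([(target, 0)])
    while queue:
        v, depth = queue.popleft()
        if depth == max_depth:      # depth limit: do not expand further
            continue
        for u, prob, sign in in_edges.get(v, []):
            if u in polarity:       # keep the polarity of the first arrival
                continue
            if rng.random() < prob:  # edge is live in this sample
                polarity[u] = polarity[v] * sign
                queue.append((u, depth + 1))
    return {u for u, pol in polarity.items() if pol == 1}
```

In a full reverse-influence-sampling pipeline, many such sets would be drawn from random targets and seeds would then be picked greedily to cover as many sets as possible; that step is omitted here.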

Multi-objective hybrid evolutionary algorithm for solving open-shop scheduling problem with controllable processing time

Kuineng CHEN, Xiaofang YUAN

2022, 42(8): 2617-2627. DOI: 10.11772/j.issn.1001-9081.2021061071

The open-shop scheduling problem is a typical NP-hard problem. Most existing research assumes that the processing time of an operation is fixed. However, in real-world production scenarios, the processing time can be controlled by adjusting the processing power. At the same time, optimizing the two conflicting objectives of completion time and energy consumption is important for efficient and energy-saving open-shop production. Therefore, the Multi-objective Open-shop Scheduling Problem with Controllable Processing Time (MOOSPCPT) was studied, a mixed-integer programming model was constructed with the objectives of minimizing makespan and total extra energy consumption, and a Multi-objective Hybrid Evolutionary Algorithm (MOHEA) was proposed to solve MOOSPCPT. Several strategies were developed in MOHEA: 1) the migration strategy and mutation strategy of the biogeography-based optimization algorithm were improved for global search, which effectively promoted population diversity; 2) a self-adjusting variable neighborhood search strategy was designed based on the critical path, which enhanced the local search performance of the algorithm; 3) a processing time resetting operator was designed, which significantly improved the search efficiency of the algorithm. Simulation results show that the proposed strategies are effective in improving algorithm performance, and that MOHEA solves MOOSPCPT more effectively than Non-dominated Sorting Genetic Algorithm Ⅱ (NSGA-Ⅱ), Non-dominated Sorting Genetic Algorithm Ⅲ (NSGA-Ⅲ) and Strength Pareto Evolutionary Algorithm 2 (SPEA2).
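
Because makespan and extra energy consumption conflict, candidate schedules are compared by Pareto dominance rather than by a single score. The short sketch below shows that comparison for two minimization objectives; the function names and the sample values are made up for illustration and are not taken from the paper.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (makespan, extra energy) pairs for five schedules.
schedules = [(10, 5.0), (8, 7.0), (12, 4.0), (10, 6.0), (9, 7.5)]
front = pareto_front(schedules)
```

Here `(10, 6.0)` is dropped because `(10, 5.0)` has the same makespan with less energy, and `(9, 7.5)` is dropped because `(8, 7.0)` is better on both objectives.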

Estimation of distribution algorithm for hot rolling rescheduling with order disturbance

Yidi WANG, Zhiwei LI, Wenxin ZHANG, Tieke LI, Bailin WANG

2022, 42(8): 2628-2636. DOI: 10.11772/j.issn.1001-9081.2021061106

As the core of steel production, the hot rolling process demands strict production continuity and involves complex production technology. The random arrival of rush orders with urgent delivery requirements adversely affects production continuity and quality stability. To handle such dynamic rush order insertion events, a hot rolling rescheduling optimization method was proposed. Firstly, the influence of the order disturbance factor on the scheduling scheme was analyzed, and a mathematical model of hot rolling rescheduling was established with the optimization objective of minimizing the weighted sum of the tardiness of orders and the jump penalty of slabs. Then, an Estimation of Distribution Algorithm (EDA) for hot rolling rescheduling was designed. In this algorithm, to handle the insertion of rush orders, an integer encoding scheme based on the insertion position was proposed, a probability model based on the characteristics of the mathematical model was designed, and a fitness function based on the penalty value was defined by comprehensively considering the objectives and constraints. The feasibility and validity of the model and the algorithm were verified by simulation experiments on actual production data.
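
The general shape of an EDA over an insertion-position encoding can be sketched as below. This is a generic univariate EDA, swapped in for illustration only: the abstract does not describe the authors' probability model or fitness function, so the function name `eda_insertion`, its parameters, the smoothing constant, and the caller-supplied `fitness` are all assumptions.

```python
import random


def eda_insertion(n_orders, n_positions, fitness,
                  pop_size=40, elite=10, iters=30, seed=0):
    """Minimize fitness over integer vectors of insertion positions.

    Individual[i] is the position at which rush order i is inserted.
    """
    rng = random.Random(seed)
    positions = range(n_positions)
    # prob[i][p]: probability that rush order i is inserted at position p.
    prob = [[1.0 / n_positions] * n_positions for _ in range(n_orders)]
    best, best_fit = None, float("inf")
    for _ in range(iters):
        # Sample a population from the current probability model.
        pop = [[rng.choices(positions, weights=prob[i])[0]
                for i in range(n_orders)]
               for _ in range(pop_size)]
        pop.sort(key=fitness)
        if fitness(pop[0]) < best_fit:
            best, best_fit = pop[0], fitness(pop[0])
        # Re-estimate the model from the elite individuals.
        for i in range(n_orders):
            counts = [1e-3] * n_positions  # small smoothing keeps exploration
            for individual in pop[:elite]:
                counts[individual[i]] += 1
            total = sum(counts)
            prob[i] = [c / total for c in counts]
    return best, best_fit
```

In the rescheduling setting, `fitness` would return the penalty value of the schedule obtained after inserting the rush orders at the encoded positions, for example the weighted sum of tardiness and slab jump penalty plus constraint penalties.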

Control method of quadrotor UAV with manipulator based on expert PID

Bao CHEN, Zupeng ZHOU, Huan WEI, Yanzhao LYU, Zhicheng SUI

2022, 42(8): 2637-2642. DOI: 10.11772/j.issn.1001-9081.2021060975

Compared with an Unmanned Aerial Vehicle (UAV) without a manipulator, a UAV with a manipulator shows larger deviations in its flight trajectory and is more difficult to control stably. To solve the precise trajectory control problem of a UAV with a manipulator, a control method for a quadrotor UAV with a manipulator based on expert PID was proposed. Firstly, the manipulator was mounted on the UAV and the two were treated as a whole, and the kinematic and dynamic models of the UAV with manipulator were established through the Lagrange equation. Secondly, an expert PID controller was designed to control the stability of the system. Thirdly, the trajectory of the manipulator was planned using a quintic polynomial. Finally, the effectiveness of the expert PID control method for the stability control of the UAV with manipulator was verified by simulation. The experimental results show that, compared with conventional PID control, the proposed control method based on expert PID improves the response speed of the system and effectively suppresses external disturbances. The method can track the trajectory stably while the manipulator is in action, and has good disturbance immunity and robustness.
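
Quintic polynomial planning fits q(t) = c0 + c1 t + ... + c5 t^5 so that position, velocity and acceleration match prescribed values at both ends of the motion. The sketch below uses the standard closed-form coefficients for a single joint; the function names and the default rest-to-rest boundary conditions are assumptions for illustration, since the abstract does not state the boundary conditions used.

```python
def quintic_coefficients(q0, qf, T, v0=0.0, vf=0.0, a0=0.0, af=0.0):
    """Coefficients (c0..c5) of a quintic from q0 at t=0 to qf at t=T."""
    d = qf - q0
    c0 = q0
    c1 = v0
    c2 = a0 / 2.0
    c3 = (20 * d - (8 * vf + 12 * v0) * T - (3 * a0 - af) * T**2) / (2 * T**3)
    c4 = (-30 * d + (14 * vf + 16 * v0) * T + (3 * a0 - 2 * af) * T**2) / (2 * T**4)
    c5 = (12 * d - 6 * (vf + v0) * T - (a0 - af) * T**2) / (2 * T**5)
    return (c0, c1, c2, c3, c4, c5)


def evaluate(coeffs, t):
    """Return (position, velocity, acceleration) of the polynomial at time t."""
    pos = sum(c * t**i for i, c in enumerate(coeffs))
    vel = sum(i * c * t**(i - 1) for i, c in enumerate(coeffs) if i >= 1)
    acc = sum(i * (i - 1) * c * t**(i - 2) for i, c in enumerate(coeffs) if i >= 2)
    return pos, vel, acc


# Hypothetical joint motion: 0 rad to 1 rad in 2 s, starting and ending at rest.
coeffs = quintic_coefficients(0.0, 1.0, 2.0)
```

Matching acceleration as well as velocity at the endpoints avoids torque jumps at the start and end of the manipulator motion, which is the usual reason for preferring a quintic over a cubic.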
