Journal of Computer Applications

Survey of single target tracking algorithms based on Siamese network

Mengting WANG, Wenzhong YANG, Yongzhi WU

2023, 43(3): 661-673. DOI: 10.11772/j.issn.1001-9081.2022010150

Single object tracking is an important research direction in computer vision, with a wide range of applications in video surveillance, autonomous driving and other fields. Although many surveys of single object tracking algorithms exist, most of them focus on correlation filters or deep learning. In recent years, Siamese network-based tracking algorithms have received extensive attention from researchers for their balance between accuracy and speed, but there are relatively few surveys of this type of algorithm, and systematic analysis of the algorithms at the architectural level is lacking. To gain a deep understanding of single object tracking algorithms based on Siamese networks, a large body of related literature was organized and analyzed. Firstly, the structures and applications of the Siamese network were expounded, and each tracking algorithm was introduced according to a classification by the components of the Siamese tracking architecture. Then, the commonly used datasets and evaluation metrics in the field of single object tracking were listed, the overall and per-attribute performance of 25 mainstream tracking algorithms was compared and analyzed on the OTB 2015 (Object Tracking Benchmark) dataset, and the performance and inference speed of 23 Siamese network-based tracking algorithms on the LaSOT (Large-scale Single Object Tracking) and GOT-10K (Generic Object Tracking) test sets were listed. Finally, the research on Siamese network-based tracking algorithms was summarized, and possible future research directions for this type of algorithm were discussed.

Survey of label noise learning algorithms based on deep learning

Boyi FU, Yuncong PENG, Xin LAN, Xiaolin QIN

2023, 43(3): 674-684. DOI: 10.11772/j.issn.1001-9081.2022020198

In the field of deep learning, a large number of correctly labeled samples are essential for model training. However, in practical applications, labeling data is costly. At the same time, the quality of labeled samples is affected by subjective factors and by the tools and techniques of manual labeling, which inevitably introduces label noise during annotation. Therefore, the training data available for practical applications is subject to a certain amount of label noise. How to train effectively on data with label noise has become a research hotspot. Focusing on label noise learning algorithms based on deep learning, firstly, the source, classification and impact of label noise learning strategies were elaborated; secondly, four label noise learning strategies based on data, loss function, model and training method were analyzed according to different elements of machine learning; then, a basic framework for learning with label noise in various application scenarios was provided; finally, some optimization ideas were given, and challenges and future development directions of label noise learning algorithms were proposed.

Improved method of convolution neural network based on matrix decomposition

Zhenliang LI, Bo LI

2023, 43(3): 685-691. DOI: 10.11772/j.issn.1001-9081.2022010032

To address the difficulty of optimizing the traditional Convolutional Neural Network (CNN) during training, an improved CNN method based on matrix decomposition was proposed. Firstly, the convolution kernel parameter tensor of each convolution layer was converted during training into the product of multiple parameter matrices through matrix decomposition, forming an overparameterization. Secondly, these additional linear parameters were included in the back propagation of the network and updated synchronously with the other parameters of the model to improve the optimization process of gradient descent. After training, the matrix product was restored to the standard convolution kernel parameters, so that the computational complexity of forward propagation during inference was the same as before the improvement. With thin QR decomposition and reduced Singular Value Decomposition (SVD) applied, classification experiments were carried out on the CIFAR-10 (Canadian Institute For Advanced Research, 10 classes) dataset, and further generalization experiments were carried out using different image classification datasets and different initialization methods. Experimental results show that the classification accuracies of 7 models of different depths of Visual Geometry Group (VGG) and Residual Network (ResNet) based on matrix decomposition are higher than those of the original convolutional neural network models. It can be seen that the matrix decomposition method enables a CNN to achieve higher classification accuracy and eventually converge to a better local optimum.
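
The restore step described in this abstract can be illustrated with a minimal sketch. It assumes NumPy and uses hypothetical layer sizes and two random factors named A and B; it is not the authors' implementation and does not apply thin QR or reduced SVD. It only checks that applying the factors one after another gives the same output as the single collapsed kernel, which is why inference cost is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer shape: 8 output channels, 4 input channels, 3x3 kernels.
out_ch, in_ch, k = 8, 4, 3
flat = in_ch * k * k  # each kernel flattened into a row of length 36

# Training-time factors (assumed names): the kernel matrix is kept as A @ B.
A = rng.standard_normal((out_ch, out_ch))
B = rng.standard_normal((out_ch, flat))

# One flattened input patch, i.e. what a single convolution window sees.
patch = rng.standard_normal(flat)

# Overparameterized forward pass: apply the factors one after another.
y_factored = A @ (B @ patch)

# After training: collapse the product into one standard kernel tensor.
W = (A @ B).reshape(out_ch, in_ch, k, k)
y_collapsed = W.reshape(out_ch, flat) @ patch

# The two forward passes agree up to floating-point rounding.
same_output = np.allclose(y_factored, y_collapsed)
```

Because the product is linear, the extra factors change only how gradient descent moves through parameter space during training, not the function the layer can represent.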

Fusion imaging-based recurrent capsule classification network for time series

Rongjun CHEN, Xuanhui YAN, Chaocheng YANG

2023, 43(3): 692-699. DOI: 10.11772/j.issn.1001-9081.2022010089

To address the lack of temporal correlations and spatial location relationships in imaged time series, a Fusion-Imaging Recurrent Capsule Neural Network (FIR-Capsnet) for time series was proposed to fuse and extract spatial-temporal information from time series images. Firstly, the multi-level spatial-temporal features of time series images were captured by using Gramian Angular Field (GAF), Markov Transition Field (MTF) and Recurrence Plot (RP). Then, the spatial relationships of time series images were learnt through the rotation invariance of the capsule neural network and its iterative routing algorithm. Finally, the temporal correlations hidden in the time series data were learnt by the gate mechanism of a Long Short-Term Memory (LSTM) network. Experimental results show that FIR-Capsnet achieves 15 wins on 30 UCR public datasets and outperforms Fusion-CNN by 7.2 percentage points in classification accuracy on the Human Activity Recognition (HAR) dataset, illustrating the advantages of FIR-Capsnet in processing time series data.
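
As a rough illustration of how a time series becomes an image, the following sketch computes the summation variant of the Gramian Angular Field in plain Python. The function name and the toy series are assumptions made for illustration, and the MTF, RP and capsule network stages of FIR-Capsnet are not shown.

```python
import math

def gramian_angular_summation_field(series):
    """Encode a 1-D series as a matrix with entries cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale the values into [-1, 1] so that arccos is defined.
    scaled = [((v - hi) + (v - lo)) / (hi - lo) for v in series]
    # Clamp against rounding error, then map each value to an angle.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Toy series, for illustration only.
gasf = gramian_angular_summation_field([1.0, 2.0, 3.0])
```

The resulting matrix is symmetric, and its diagonal keeps the rescaled values in angular form, so temporal order is preserved along the diagonal from top left to bottom right.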

Bimodal emotion recognition method based on graph neural network and attention

Lubao LI, Tian CHEN, Fuji REN, Beibei LUO

2023, 43(3): 700-705. DOI: 10.11772/j.issn.1001-9081.2022020216

To address the issues in emotion recognition from physiological signals, a bimodal emotion recognition method based on Graph Neural Network (GNN) and attention was proposed. Firstly, the GNN was used to classify ElectroEncephaloGram (EEG) signals. Secondly, an attention-based Bi-directional Long Short-Term Memory (Bi-LSTM) network was used to classify ElectroCardioGram (ECG) signals. Finally, the results of EEG and ECG classification were fused by Dempster-Shafer evidence theory, thus improving the overall performance of the emotion recognition task. To verify the effectiveness of the proposed method, 20 subjects were invited to participate in an emotion elicitation experiment, and their EEG and ECG signals were collected. Experimental results show that the binary classification accuracies of the proposed method are 91.82% and 88.24% in the valence and arousal dimensions, respectively, which are 2.65% and 0.40% higher than those of the single-modal EEG method, and 19.79% and 24.90% higher than those of the single-modal ECG method. It can be seen that the proposed method can effectively improve the accuracy of emotion recognition and provide decision support for medical diagnosis and other fields.
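
The fusion step named in this abstract, Dempster-Shafer evidence theory, can be sketched with Dempster's rule of combination for two sources. The class labels, the mass values and the function name below are made up for illustration; how the paper derives masses from its GNN and Bi-LSTM outputs is not stated in the abstract.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize by the mass that did not fall on contradictions.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

HIGH, LOW = frozenset({"high"}), frozenset({"low"})
EITHER = HIGH | LOW  # mass left on the whole frame expresses uncertainty

# Hypothetical per-modality beliefs, not values from the paper.
eeg = {HIGH: 0.7, LOW: 0.2, EITHER: 0.1}
ecg = {HIGH: 0.5, LOW: 0.3, EITHER: 0.2}

fused = dempster_combine(eeg, ecg)
decision = max((HIGH, LOW), key=lambda h: fused.get(h, 0.0))
```

When both sources lean the same way, the fused belief in that class is stronger than either source alone, which is the intuition behind fusing the two classifiers.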

Sentiment boosting model for emotion recognition in conversation text

Yu WANG, Yubo YUAN, Yi GUO, Jiajie ZHANG

2023, 43(3): 706-712. DOI: 10.11772/j.issn.1001-9081.2022010044

To address the problem that many existing studies ignore the correlation between interlocutors' emotions and sentiments, a sentiment boosting model for emotion recognition in conversation text was proposed, namely Sentiment Boosting Graph Neural network (SBGN). Firstly, themes and dialogue intent were integrated into the text, and the reconstructed text features were extracted by fine-tuning a pre-trained language model. Secondly, a symmetric learning structure for emotion analysis was given, with the reconstructed features fed into a Graph Neural Network (GNN) emotion analysis model and a Bi-directional Long Short-Term Memory (Bi-LSTM) sentiment classification model. Finally, by fusing the emotion analysis and sentiment classification models, a new loss function was constructed with the sentiment classification loss as a penalty term, and the optimal penalty factor was obtained by learning. Experimental results on the public dataset DailyDialog show that, in micro-average F1, the SBGN model improves by 16.62 percentage points over the Dialogue Graph Convolutional Network (DialogueGCN) model and by 14.81 percentage points over the state-of-the-art model Directed Acyclic Graph-Emotion Recognition from Conversation (DAG-ERC). It can be seen that the SBGN model can effectively improve the performance of emotion analysis in dialogue systems.

Local and global context attentive fusion network for traffic scene parsing

Zeyu WANG, Shuhui BU, Wei HUANG, Yuanpan ZHENG, Qinggang WU, Xu ZHANG

2023, 43(3): 713-722. DOI: 10.11772/j.issn.1001-9081.2022020245

To solve the problem of adaptively aggregating local and global contextual information in traffic scene parsing, a Local and Global Context Attentive Fusion Network (LGCAFN) with a three-module architecture was proposed. The front-end feature extraction module consisted of an improved 101-layer Residual Network (ResNet-101) based on the Cascaded Atrous Spatial Pyramid Pooling (CASPP) unit, and was able to extract multi-scale local features of objects more effectively. The mid-end structural learning module was composed of eight Long Short-Term Memory (LSTM) branches, and was able to infer more accurately the spatial structural features of the scene regions adjacent to an object in eight different directions. In the back-end feature fusion module, a three-stage fusion method based on an attention mechanism was adopted to adaptively aggregate useful contextual information and suppress noisy contextual information, and the generated multi-modal fusion features were able to represent the semantic information of objects more comprehensively and accurately. Experimental results on the Cityscapes standard and extended datasets demonstrate that, compared to existing state-of-the-art methods such as Inverse Transformation Network (ITN) and Object Contextual Representation Network (OCRN), LGCAFN achieves the best mean Intersection over Union (mIoU), reaching 84.0% and 86.3% respectively, showing that LGCAFN can parse traffic scenes accurately and is helpful for realizing autonomous driving.

Hidden state initialization method for recurrent neural network-based human motion model

Nanfan LI, Wenwen SI, Siyuan DU, Zhiyong WANG, Chongyang ZHONG, Shihong XIA

2023, 43(3): 723-727. DOI: 10.11772/j.issn.1001-9081.2022020175

To address the jump that appears in the first frame of human motion synthesis methods based on Recurrent Neural Network (RNN), which degrades the quality of the generated motion, a human motion synthesis method with hidden state initialization was proposed. The initial hidden state was treated as the independent variable, the objective function of the neural network was used as the optimization goal, and gradient descent was used to solve for a suitable initial hidden state. Compared with the Encoder-Recurrent-Decoder (ERD) model and the Residual Gate Recurrent Unit (RGRU) model, the proposed method with initial hidden state estimation reduces the prediction error of the first frame by 63.51% and 6.90% respectively, and decreases the total error over 10 frames by 50.00% and 4.89% respectively. Experimental results show that the proposed method is better than the method without initial hidden state estimation in both motion synthesis quality and motion prediction accuracy. The proposed method accurately estimates the hidden state of the first frame of an RNN-based human motion model, which improves the quality of motion synthesis and provides reliable data support for action recognition models in real-time security monitoring.

Video-based person re-identification method based on graph convolution network and self-attention graph pooling

Yingmao YAO, Xiaoyan JIANG

2023, 43(3): 728-735. DOI: 10.11772/j.issn.1001-9081.2022010034

To address the poor performance of video person re-identification caused by factors such as occlusion, spatial misalignment and background clutter in cross-camera network videos, a video-based person re-identification method based on Graph Convolutional Network (GCN) and Self-Attention Graph Pooling (SAGP) was proposed. Firstly, the correlation information of different regions between frames in the video was mined through patch relation graph modeling. To alleviate problems such as occlusion and misalignment, the region features in the frame-by-frame images were optimized by using GCN. Then, the regions with low contribution to person features were removed by the SAGP mechanism to avoid interference from background clutter regions. Finally, a weighted loss function strategy was proposed, in which the center loss was used to optimize the classification learning results, and Online soft mining and Class-aware attention Loss (OCL) was used to solve the problem that the available samples were not fully used in the process of hard sample mining. Experimental results on the MARS dataset show that, compared with the sub-optimal Attribute-aware Identity-hard Triplet Loss (AITL), the proposed method increases the mean Average Precision (mAP) and Rank-1 by 1.3 and 2.0 percentage points respectively. The proposed method can better utilize the spatial-temporal information in the video to extract more discriminative person features and improve the performance of person re-identification tasks.

Pedestrian trajectory prediction based on multi-head soft attention graph convolutional network

Tao PENG, Yalong KANG, Feng YU, Zili ZHANG, Junping LIU, Xinrong HU, Ruhan HE, Li LI

2023, 43(3): 736-743. DOI: 10.11772/j.issn.1001-9081.2022020207

The complexity of pedestrian interaction is a challenge for pedestrian trajectory prediction: existing algorithms have difficulty capturing meaningful interaction information between pedestrians and cannot intuitively model the interaction between them. To address this problem, a multi-head soft attention graph convolutional network was proposed. Firstly, a Multi-head Soft ATTention (MS ATT) combined with an involution network was used to extract a sparse spatial adjacency matrix and a sparse temporal adjacency matrix from the spatial and temporal graph inputs respectively, generating a sparse spatial directed graph and a sparse temporal directed graph. Then, a Graph Convolutional Network (GCN) was used to learn interaction and motion trend features from the sparse spatial and sparse temporal directed graphs. Finally, the learned trajectory features were input into a Temporal Convolutional Network (TCN) to predict double Gaussian distribution parameters, thereby generating the predicted pedestrian trajectories. Experiments on the Eidgenossische Technische Hochschule (ETH) and University of CYprus (UCY) datasets show that, compared with the Space-time sOcial relationship pooling pedestrian trajectory Prediction Model (SOPM), the proposed algorithm reduces the Average Displacement Error (ADE) by 2.78%, and compared with the Sparse Graph Convolution Network (SGCN), the proposed algorithm reduces the Final Displacement Error (FDE) by 16.92%.
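
The two metrics reported in this abstract can be made concrete with a small sketch in plain Python. It assumes one predicted and one ground-truth trajectory given as lists of (x, y) points; the function name and the toy coordinates are illustrative, and benchmark protocols (for example, taking the best of several sampled trajectories) are not reproduced here.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean Euclidean error over all steps
    fde = dists[-1]                # Euclidean error at the final step
    return ade, fde

# Toy trajectories, for illustration only.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

ADE rewards staying close to the true path throughout, while FDE measures only where the prediction ends up, so a method can improve one without improving the other.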

Efficient person search algorithm and optimization with Sophon SC5+ chip architecture

Jie SUN, Shaoxin WU, Xuejun WANG, Jing HUA

2023, 43(3): 744-751. DOI: 10.11772/j.issn.1001-9081.2022020252

The computational costs of traditional deep neural network-based person search algorithms are very high, which makes these algorithms difficult to deploy on devices with limited hardware resources and budgets. To address these problems, a person detection and person re-identification algorithm based on the high-performance inference chip Sophon SC5+ was proposed to optimize the efficiency of deep learning from the algorithm end to the hardware end in a top-down approach. Firstly, by using the lightweight Ghost module to replace the backbone network of YOLOv5s, the parameters and computational cost of the model were greatly reduced. Secondly, the Convolutional Block Attention Module (CBAM) attention mechanism was integrated to enhance the feature learning capability and improve the detection precision of the algorithm. Thirdly, the central loss constraint and the Non-local attention mechanism were added to the person re-identification module, and the central constrained triplet loss and the additional interval cross-entropy loss were combined to optimize the model and improve the performance of the person re-identification algorithm. Finally, based on Sophon SC5+, the person detection model and the person re-identification model were quantized and the final inference model was generated. Experimental results on the Market-1501 and DukeMTMC-ReID datasets show that the mean Average Precisions (mAPs) of the person detection and person re-identification algorithms are improved by at least 43.8 and 25.7 percentage points compared with YOLOv4-tiny, Attribute-Complementary Re-ID Net (ACRN), Singular Vector Decomposition Net (SVDNet) and other mainstream algorithms. After int8 quantization on the Sophon SC5+ chip, the mAP of the proposed algorithm decreases by 1.7 percentage points, while the model size is reduced by 74.4%. It can be seen that the proposed algorithm can be used in large-scale, city-level person search systems.

Table structure recognition model integrating edge features and attention

Xueqiang LYU, Yunan ZHANG, Jing HAN, Yunpeng CUI, Huan LI

2023, 43(3): 752-758. DOI: 10.11772/j.issn.1001-9081.2022010053

To address the problems of existing table structure recognition methods, such as dependence on prior knowledge, insufficient robustness and insufficient expression ability, a new table structure recognition model integrating edge features and attention was proposed, namely Graph Edge-Attention Network based Table Structure Recognition model (GEAN-TSR). Firstly, Graph Edge-Attention Network (GEAN) was proposed as the backbone network, and based on the edge convolution structure, the graph attention mechanism was introduced and improved to aggregate graph node features, so as to solve the problem of information loss during feature extraction in the graph network and improve its expression ability. Then, an edge feature fusion module was introduced to fuse the shallow graph node information with the graph network output to enhance the local information extraction and expression abilities of the graph network. Finally, the graph node text features extracted by Gated Recurrent Unit (GRU) were integrated into the text feature fusion module for edge classification and prediction. Comparative experiments on the Scientific paper Table Structure Recognition-COMPlicated (SciTSR-COMP) dataset show that the recall and F1 score of GEAN-TSR are increased by 2.5 and 1.4 percentage points, respectively, in comparison with the existing optimal model Split, Embed and Merge (SEM). Ablation experiments show that all the indicators of GEAN-TSR achieve their optimal values after using the feature fusion module, proving the effectiveness of the module. Experimental results show that GEAN-TSR can effectively improve the network performance and better complete the task of table structure recognition.

Performance optimization strategy of distributed storage for industrial time series big data based on HBase

Li YANG, Jianting CHEN, Yang XIANG

2023, 43(3): 759-766. DOI: 10.11772/j.issn.1001-9081.2022020211

In automated industrial scenarios, the amount of time series log data generated by a large number of industrial devices has exploded, and the demand for access to time series data in business scenarios has further increased. Although HBase, a distributed column family database, can store industrial time series big data, the existing strategies cannot meet the specific access requirements of industrial time series data well because the correlation between data and access behavior characteristics in specific business scenarios is not considered. To address this problem, based on the distributed storage system HBase and using the correlation between data and access behavior characteristics in industrial scenarios, a distributed storage performance optimization strategy for massive industrial time series data was proposed. For the load tilt problem caused by the characteristics of industrial time series data, a load balancing optimization strategy based on hot and cold data partition and access behavior classification was proposed. The data were classified into cold and hot data by using a Logistic Regression (LR) model, and the hot data were distributed across different nodes. In addition, to further reduce the cross-node communication overhead in the storage cluster and improve the query efficiency of the high-dimensional index of industrial time series data, a strategy of putting the index and the main data into the same Region was proposed. By designing the index RowKey field and splicing rules, the index was stored with its corresponding main data in the same Region. Experimental results on real industrial time series data show that the data load distribution tilt degree is reduced by 28.5% and the query efficiency is improved by 27.7% after introducing the optimization strategy, demonstrating that the proposed strategy can mine access patterns for specific time series data effectively, distribute load reasonably, reduce data access overhead, and meet access requirements for specific time series big data.

Load balancing method based on local repair code in distributed storage

Yunbo LONG, Dan TANG

2023, 43(3): 767-775. DOI: 10.11772/j.issn.1001-9081.2022010074

To address the low performance of hot data access in distributed storage, a load balancing method based on Local Repair Code (LRC) was proposed, which uses coding to avoid concentrated access to individual nodes and improve the access efficiency of hot data. Firstly, a special kind of LRC suitable for small-scale storage systems was constructed by using Balanced Incomplete Block Design (BIBD), and it was able to provide multiple access methods for encoded data. Secondly, based on Reed Solomon (RS) code and random array code, LRC was extended to larger-scale situations, and it was able to meet certain fault tolerance requirements of the storage system. Finally, a hot data access algorithm was given to reduce the pressure of hot data access, and combined with a reasonable data layout scheme, load balancing of the storage system in high-frequency access scenarios was achieved. Theoretical analysis and experimental results show that the proposed method can achieve load balancing at very small cost, and its effect is significantly better than that of the traditional load balancing methods implemented by multiple copies and Maximum-Distance-Separable (MDS) code. In particular, the proposed method solves the load imbalance problem caused by uneven access to hot and cold data, which can effectively improve the access efficiency of hot data storage systems.

Multi-stage weighted concept drift detection method

Zhiqiang CHEN, Meng HAN, Hongxin WU, Muhang LI, Xilong ZHANG

2023, 43(3): 776-784. DOI: 10.11772/j.issn.1001-9081.2022020231

To address the difficulty that existing drift detection methods have in balancing detection delay, false positives, false negatives and spatiotemporal efficiency, a new stage transition threshold parameter was proposed, and a multi-stage weighting mechanism consisting of "stable stage-warning stage-drift stage" was introduced into concept drift detection to weight the instances by stage, and the mechanism was applied to a double sliding window. Then a Multi-Stage weighted Drift Detection Method (MSDDM) based on the Hoeffding inequality was proposed. On artificial datasets, MSDDM detected abrupt and gradual concept drift faster than Fast Hoeffding Drift Detection Method (FHDDM), Drift Detection Method based on Hoeffding's bound (HDDM) and other drift detection methods, while maintaining low false detection and false alarm rates. At the same time, MSDDM had the highest classification accuracy in most cases compared with other methods on real-world datasets. Experimental results show that MSDDM can detect concept drift in data streams with high drift detection performance and great spatiotemporal efficiency.
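
The role of the Hoeffding inequality in detectors of this family can be sketched as follows. This is a simplified single-window detector in the style of FHDDM, not MSDDM: the multi-stage weighting and the double sliding window are omitted, and the class name, window size and confidence level are assumptions chosen for illustration.

```python
import math
from collections import deque

class HoeffdingWindowDetector:
    """Signal drift when windowed accuracy falls too far below its best value."""

    def __init__(self, window=25, delta=1e-6):
        self.window = deque(maxlen=window)
        # Hoeffding bound for the mean of `window` values in [0, 1].
        self.epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * window))
        self.best = 0.0  # highest windowed accuracy seen since the last drift

    def update(self, correct):
        """Feed one prediction outcome; return True when drift is signalled."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False  # wait until the window is full
        accuracy = sum(self.window) / len(self.window)
        self.best = max(self.best, accuracy)
        if self.best - accuracy >= self.epsilon:
            self.window.clear()
            self.best = 0.0
            return True
        return False

# Synthetic stream: the classifier is right 100 times, then always wrong.
detector = HoeffdingWindowDetector()
stream = [True] * 100 + [False] * 50
alarms = [i for i, ok in enumerate(stream) if detector.update(ok)]
```

A smaller `delta` widens the bound and lowers false alarms at the cost of a longer detection delay, which is the trade-off the abstract says MSDDM aims to balance.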

Review on blockchain smart contract vulnerability detection and automatic repair

Juncheng TONG, Bo ZHAO

2023, 43(3): 785-793. DOI: 10.11772/j.issn.1001-9081.2022020179

Smart contract technology, as a milestone of blockchain 2.0, has received widespread attention from both academia and industry. Smart contracts run on an underlying infrastructure without a trusted computing environment and have characteristics that distinguish them from traditional programs, and they contain many security vulnerabilities with far-reaching impact, so research on security auditing of smart contracts has become a popular and urgent scientific problem in the field of blockchain security. Focusing on the detection and automatic repair of smart contract vulnerabilities, firstly, the main types and classifications of smart contract vulnerabilities were introduced. Secondly, the three most important methods of smart contract vulnerability detection in the past five years were reviewed, and representative and innovative research techniques of each method were introduced. Thirdly, smart contract upgrade schemes and cutting-edge automatic repair technologies were introduced in detail. Finally, the challenges and future work of smart contract vulnerability detection and automatic repair technologies for online, real-time, multi-platform, automatic and intelligent requirements were analyzed and discussed as a framework of technical solutions.

Research progress in public-key encryption with keyword search

Wenshuai SONG, Miaolei DENG, Mimi MA, Haochen LI

2023, 43(3): 794-803. DOI: 10.11772/j.issn.1001-9081.2022020234

With the continuous development of big data and cloud computing technology, cloud platforms have become the first choice for massive data storage, and user data privacy and security have become one of the most important issues in the cloud computing environment. To ensure data security, users usually encrypt sensitive data before storing it on cloud servers, so how to efficiently retrieve ciphertext data on the cloud becomes a challenge. Searchable encryption technology provides an effective solution for efficient ciphertext retrieval by allowing users to retrieve ciphertext data directly through keywords, which protects data privacy while reducing communication and computing overhead. In recent years, to cope with different platforms and application scenarios, Public-key Encryption with Keyword Search (PEKS) technology has produced a large number of extension schemes based on different hard problems, query methods and changing structures. In terms of security extensions and functional extensions, PEKS extension schemes were reviewed with respect to the permission sharing, key management, fine-grained search and access control capabilities required by current applications, and the performance of the described schemes was compared and analyzed in depth, pointing out their advantages and shortcomings. Finally, the development trends of PEKS technology were summarized and discussed.

Unsupervised time series anomaly detection model based on re-encoding

Chunyong YIN, Liwen ZHOU

2023, 43(3): 804-811. DOI: 10.11772/j.issn.1001-9081.2022010006

To deal with the low accuracy of anomaly detection caused by data imbalance and the highly complex temporal correlation of time series, a re-encoding based unsupervised time series anomaly detection model based on Generative Adversarial Network (GAN), named RTGAN (Re-encoding Time series based on GAN), was proposed. Firstly, multiple generators with cycle consistency were used to ensure the diversity of generated samples and thereby learn different anomaly patterns. Secondly, the stacked Long Short-Term Memory-dropout Recurrent Neural Network (LSTM-dropout RNN) was used to capture temporal correlation. Thirdly, the differences between the generated samples and the real samples were compared in the latent space by improved re-encoding. These differences, as re-encoding errors, served as part of the anomaly score to improve the accuracy of anomaly detection. Finally, the new anomaly score was used to detect anomalies on univariate and multivariate time series datasets. The proposed model was compared with seven baseline anomaly detection models on univariate and multivariate time series. Experimental results show that the proposed model obtains the highest average F1-score (0.815) on all datasets, and that its overall performance is 36.29% and 8.52% higher than those of the original AutoEncoder (AE) model Dense-AE (Dense-AutoEncoder) and the latest benchmark model USAD (UnSupervised Anomaly Detection on multivariate time series), respectively. The robustness of the model was tested under different Signal-to-Noise Ratios (SNRs). The results show that the proposed model consistently outperforms LSTM-VAE (Variational Autoencoder based on LSTM), USAD and OmniAnomaly; in particular, in the case of 30% SNR, the F1-score of RTGAN is 13.53% and 10.97% higher than those of USAD and OmniAnomaly, respectively. It can be seen that RTGAN can effectively improve the accuracy and robustness of anomaly detection.

Improved slime mould algorithm with multi-strategy fusion

Zhongrui QIU, Hong MIAO, Chengbi ZENG

2023, 43(3): 812-819. DOI: 10.11772/j.issn.1001-9081.2022020243

To address the tendency to fall into local optima， the slow convergence and the low solution accuracy of the standard Slime Mould Algorithm （SMA）， an Improved Slime Mould Algorithm with Multi-Strategy fusion （MSISMA） was proposed. Firstly， Brownian motion and Levy flight were introduced to enhance the search ability of the algorithm. Secondly， according to different stages of the algorithm， the location update formula of the slime mould was improved to increase the convergence speed and accuracy of the algorithm. Thirdly， the Interval Adaptative Opposition-Based Learning （IAOBL） strategy was adopted to generate the reverse population， which improved the diversity and quality of the population and， as a result， the convergence speed of the algorithm. Finally， a convergence stagnation monitoring strategy was introduced， which made the algorithm jump out of the local optimum by re-initializing the positions of some slime mould individuals. With 23 test functions selected， the proposed MSISMA was tested and compared with Equilibrium Slime Mould Algorithm （ESMA）， Slime Mould Algorithm combined to Adaptive Guided Differential Evolution Algorithm （SMA-AGDE）， SMA， Marine Predators Algorithm （MPA） and Equilibrium Optimizer （EO）. Moreover， the Wilcoxon rank-sum test was performed on the running results of all algorithms. Compared with the above algorithms， MSISMA achieves the best average value on 19 test functions and the best standard deviation on 12 test functions， and improves the optimization accuracy by 23.39% to 55.97% on average. Experimental results show that the convergence speed， solution accuracy and robustness of MSISMA are significantly better than those of the compared algorithms.
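
The reverse population mentioned above can be illustrated with plain opposition-based learning, in which each coordinate x is mirrored to lower + upper - x inside its search bounds. The sketch below is only the basic form and is not the paper's IAOBL strategy, whose interval adaptation is not described in the abstract; the function names, the sphere objective and the bounds are assumptions made for illustration.

```python
import random


def opposite_population(population, lower, upper):
    """Return the opposite point lower + upper - x for every individual."""
    return [
        [lo + hi - x for x, lo, hi in zip(individual, lower, upper)]
        for individual in population
    ]


def sphere(x):
    """Toy objective (assumption): sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


def obl_select(population, lower, upper, fitness):
    """Merge the population with its opposite population and keep the
    best half (minimisation), so the population size is unchanged."""
    merged = population + opposite_population(population, lower, upper)
    merged.sort(key=fitness)
    return merged[:len(population)]


if __name__ == "__main__":
    rng = random.Random(0)
    lower, upper = [-5.0, -5.0], [10.0, 10.0]
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(6)
    ]
    improved = obl_select(population, lower, upper, sphere)
    # The best individual can only stay the same or improve.
    print(min(map(sphere, improved)) <= min(map(sphere, population)))
```

Keeping the best half of the merged set is one common way to use the reverse population; whether MSISMA selects individuals this way is not stated in the abstract.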

Hybrid salp swarm and butterfly optimization algorithm combined with neighborhood centroid opposition-based learning

Junxing XIANG, Yonghong WU

2023, 43(3): 820-826. DOI: 10.11772/j.issn.1001-9081.2022010154

Aiming at the problems of slow convergence and premature convergence to local solutions of Butterfly Optimization Algorithm （BOA）， a Hybrid Salp Swarm and Butterfly Optimization Algorithm （HSSBOA） based on neighborhood centroid opposition-based learning was proposed. Firstly， Salp Swarm Algorithm （SSA） was introduced into BOA to make the algorithm quickly deal with the local search stage and update the population position. As a result， the optimization process was completed more effectively， and the algorithm was prevented from falling into the local optimum. Then， neighborhood centroid opposition-based learning was introduced to make the algorithm search accurately in a small range of the neighborhood， increasing the accuracy of the algorithm. Finally， dynamic switching probability was introduced to improve the proportion between global and local search， which accelerated the convergence of the algorithm. With ten benchmark functions selected for testing， HSSBOA was compared with several advanced algorithms in terms of convergence accuracy， high-dimensional data， convergence speed， Wilcoxon rank sum test and Mean Absolute Error （MAE）. Research results show that HSSBOA achieves better results than the other algorithms. In addition， an ablation experiment was used to further verify that the proposed improvements are positive. The performance on instance problems shows that HSSBOA searches for the optimal solution more effectively than other methods when solving constrained complex problems. It can be seen that HSSBOA has some advantages in optimization accuracy， stability and convergence efficiency， and that it is able to solve complex practical problems.

Workload automatic mapper for spiking neural network based on precise communication modeling

Xia HUA, Zhenghao ZHU, Cong XU, Xihuang ZHANG, Zhilei CHAI, Wenjie CHEN

2023, 43(3): 827-834. DOI: 10.11772/j.issn.1001-9081.2022010078

Running a large-scale Spiking Neural Network （SNN） on a distributed computing platform is one of the basic means to improve the level of brain-like computing intelligence. The difficulty lies in how to deploy the SNN to the corresponding number of computing nodes so that the overall system runs with the best energy efficiency. To solve this problem， on the basis of the NEural Simulation Tool-based （NEST-based） Workload Automatic Mapper for SNN （SWAM） previously proposed by other researchers， a workload automatic mapper for SNN based on precise communication modeling， named SWAM2， was proposed. In SWAM2， based on the NEST simulator， the communication part of the SNN workload was modeled more accurately， the quantization method of the parameters in the workload model was improved， and a maximum network scale prediction method was designed. Experimental results on typical cases of SNN show that the average prediction errors of SWAM2 are about 12.62 and 5.15 percentage points lower than those of SWAM in workload communication and computing time prediction， respectively. When predicting the optimal mapping of the workload， the average accuracy of SWAM2 reaches 97.55%， which is 13.13 percentage points higher than that of SWAM. SWAM2 avoids the process of manual trial and error by automatically predicting the optimal deployment/mapping of SNN workload on the computing platform.

Lighting control optimization based on improved sparrow search algorithm

Yujie ZHANG, Fan WANG

2023, 43(3): 835-841. DOI: 10.11772/j.issn.1001-9081.2022010031

Aiming at the serious waste of energy in the current lighting environment， a lighting control optimization method based on Progressive Sparrow Search Algorithm （P-SSA） was proposed. Firstly， to increase the diversity of the initial population， avoid premature convergence and enhance the optimization search ability， the Logistic chaotic initialization， the Cauchy mutation and the memory function of the historical optimal position were introduced into SSA. Then， the presence of people in the light environment， the distribution of natural light and the coupling between multiple lamps and lanterns were comprehensively considered to establish a fitness function. The professional lighting simulation software DIALux evo was used to obtain the artificial illuminance transfer matrix and natural illuminance distribution. Finally， the performance of P-SSA was verified， and several optimization algorithms were used to carry out experiments on optimizing the combination of dimming coefficients. Experimental results show that compared with optimization algorithms such as Particle Swarm Optimization algorithm （PSO） and Arithmetic Optimization Algorithm （AOA）， the lighting control optimization method based on P-SSA can find the optimal combination of dimming coefficients quickly and accurately， and meets the requirement of maximum energy saving under the premise of comfort.
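
The Logistic chaotic initialization mentioned above can be sketched with the logistic map x <- mu * x * (1 - x), whose iterates are rescaled to the search range. This is a minimal illustration under assumptions: the control parameter mu = 4, the seed value 0.37, the [0, 1] range for dimming coefficients and all function names are hypothetical and are not taken from the paper.

```python
def logistic_sequence(x0, n, mu=4.0):
    """Iterate the logistic map x <- mu * x * (1 - x) and return n values."""
    values = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(size, dim, lower, upper, x0=0.37):
    """Build `size` individuals of `dim` coordinates by mapping chaotic
    values in [0, 1] linearly onto [lower, upper]."""
    seq = logistic_sequence(x0, size * dim)
    return [
        [lower + (upper - lower) * seq[i * dim + j] for j in range(dim)]
        for i in range(size)
    ]


if __name__ == "__main__":
    # Hypothetical setting: 5 candidate solutions, 3 dimming coefficients each.
    for individual in chaotic_population(5, 3, 0.0, 1.0):
        print([round(v, 4) for v in individual])
```

The seed should avoid values such as 0.5, which the map with mu = 4 sends to 1 and then to the fixed point 0, so that the sequence stops being chaotic.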

Parameter identification model for time-delay chaotic systems based on temporal attention mechanism

Cong YIN, Hanping HU

2023, 43(3): 842-847. DOI: 10.11772/j.issn.1001-9081.2022010122

Concerning the problem of identifying parameters and time delay for chaotic systems with unknown delay， a parameter identification model for time-delay chaotic systems based on temporal attention mechanism was proposed， namely Parameter Identification Neural Network with Temporal Attention （PINN-TA）. Firstly， the time delay identification was implemented by applying the temporal attention mechanism to extract correlation features within system state sequences. Then， the algebraic equations of system parameters were formed by implicitly approximating the system differential equation with a recurrent neural network. Finally， the roots of these equations were taken as the results of parameter identification. With typical time-delay chaotic systems including the delay Logistic equation， the Ikeda differential equation and the Mackey-Glass chaotic system used as identification objects， PINN-TA model was compared with multiple intelligent search algorithms in experiments. Simulation results show that the identification error of parameters and time delay of PINN-TA model is 90.31% to 99.36% lower than those of existing intelligent search algorithms such as Artificial Raindrop Algorithm （ARA）， Hybrid Cuckoo Search （HCS）， Global Flower Pollination Algorithm （GFPA） and Cellular Whale Algorithm （CWA）， while the identification time of the proposed model is shortened to 18.59 to 19.43 ms. It can be seen that PINN-TA model can meet the accuracy and real-time requirements， and provides a feasible solution for identification of parameters and time delay for time-delay chaotic systems.

Multivariate communication system based on discrete bidirectional associative memory neural network

Weikang CHEN, Qiqing ZHAI, Youguo WANG

2023, 43(3): 848-852. DOI: 10.11772/j.issn.1001-9081.2022010151

Aiming at the problem that noise increases the error probability of the transmission signals of nonlinear digital communication systems， a multivariate communication system based on discrete Bidirectional Associative Memory （BAM） neural network was proposed. Firstly， the appropriate number of neurons and memory vectors were selected according to the signals to be transmitted， the weight matrix was calculated， and the BAM neural network was generated. Secondly， the multivariate signals were mapped to the initial input vectors with modulation amplitude and continuously input into the system. The input was iterated through the neural network， and Gaussian noise was added to each neuron. After that， the output was sampled according to the code element interval and then transmitted in the lossless channel， and the decision was decoded by the receiver according to the decision rule. Finally， in the field of image processing， the proposed system was used to transmit the compressed image data and decode the recovered image. Simulation results show that for weakly modulated signals with large code element interval， as the noise intensity increases， the error probability first decreases and then increases， and the stochastic resonance phenomenon is relatively obvious. At the same time， the error probability is positively correlated with the radix number of the signal， and negatively correlated with the signal amplitude， code element interval and the number of neurons. Under certain conditions， the error probability can reach 0. These results show that the BAM neural network can improve the reliability of digital communication systems through noise. In addition， the similarity of the image restored by decoding shows that moderate noise improves the image restoration effect， which extends the application of BAM neural network and stochastic resonance in image compression coding.

Fast link failure recovery method for software-defined internet of vehicles

Yuan GU, Zhen ZHANG, Tong DUAN

2023, 43(3): 853-859. DOI: 10.11772/j.issn.1001-9081.2022010058

Aiming at the single link failure problem in the vehicle-road real-time query communication scenario of Software-Defined Internet of Vehicles （SDIV）， a fast link failure recovery method for SDIV was proposed， which considered both the link recovery delay and the path transmission delay after link recovery. Firstly， the failure recovery delay was modeled， and the optimization goal of minimizing the delay was transformed into a 0-1 integer linear programming problem. Then， this problem was analyzed， and two algorithms were proposed for different situations， both of which tried to maximize the reuse of the existing calculation results. Specifically， Path Recovery Algorithm based on Topology Partition （PRA-TP） was proposed for the case where the flow table update delay cannot be ignored compared with the path transmission delay， and Path Recovery Algorithm based on Single Link Search （PRA-SLS） was proposed for the case where the flow table update delay is negligible because it is far less than the path transmission delay. Experimental results show that compared with Dijkstra algorithm， PRA-TP can reduce the algorithm calculation delay by 25% and the path recovery delay by 40%， and PRA-SLS can reduce the algorithm calculation delay by 60%， realizing fast single link failure recovery at the vehicle end.

Service function chain deployment optimization method based on node comprehensive importance ranking

Haiyan HU, Qiaoyan KANG, Shuo ZHAO, Jianfeng WANG, Youbin FU

2023, 43(3): 860-868. DOI: 10.11772/j.issn.1001-9081.2022020257

In order to meet the requirements of high reliability and low latency in the 5G network environment while reducing the consumption of network bandwidth resources， a Service Function Chain （SFC） deployment method based on node comprehensive importance ranking for traffic and reliability optimization was proposed. Firstly， Virtualized Network Functions （VNFs） were aggregated based on the rate of traffic change， which reduced the number of deployed physical nodes and improved link reliability. Secondly， node comprehensive importance was defined by the degree， reliability， comprehensive delay and link hop count of the node in order to sort the physical nodes. Then， the VNFs were mapped to the underlying physical nodes in turn. At the same time， by restricting the number of links， the “ping-pong effect” was reduced and the traffic was optimized. Finally， the virtual link was mapped through the k-shortest path algorithm to complete the deployment of the entire SFC. Compared with the original aggregation method， the proposed method has the SFC reliability improved by 2%， the end-to-end delay of SFC reduced by 22%， the bandwidth overhead reduced by 29%， and the average long-term revenue-to-cost ratio increased by 16%. Experimental results show that the proposed method can effectively improve the link reliability and reduce end-to-end delay and bandwidth resource consumption， achieving a good optimization effect.

Greedy synchronization topology algorithm based on formal concept analysis for traffic surveillance based sensor network

Qing YE, Xin SHI, Mengwei SUN, Jian ZHU

2023, 43(3): 869-875. DOI: 10.11772/j.issn.1001-9081.2022010141

Aiming at the energy efficiency and scene adaptability problems of synchronization topology， a Greedy Synchronization Topology algorithm based on Formal Concept Analysis for traffic surveillance based sensor network （GST-FCA） was proposed. Firstly， the scene adaptability requirements and energy efficiency model of the synchronization topology in traffic surveillance based sensor network were analyzed. Secondly， correlation analysis was performed on the adjacent features of sensor nodes in the same layer and adjacent layers by using Formal Concept Analysis （FCA）. Afterward， Broadcast Tuples （BT） were built and synchronization sets were divided according to the greedy strategy with the maximum number of neighbors. Thirdly， a backtracking broadcast was used to improve the broadcast strategy of layer detection in the Timing-synchronization Protocol of Sensor Network （TPSN） algorithm. Meanwhile， an upward hosting mechanism was designed to not only extend the information sharing range of synchronous nodes but also further alleviate the locally optimal solution problem caused by the greedy strategy. Finally， GST-FCA was verified and tested in terms of energy efficiency and scene adaptability. Simulation results show that compared with algorithms such as TPSN and Linear Estimation of Clock Frequency Offset （LECFO）， GST-FCA decreases the synchronization packet overhead by at least 11.54%， 24.59% and 39.16% in the three test scenarios of deployment location， deployment scale and road deployment， respectively. Therefore， GST-FCA can alleviate the locally optimal solution problem and reduce the synchronization packet overhead， and it is excellent in energy efficiency when the synchronization topology meets the scene adaptability requirements of the above three scenarios.

Quality of service and loss evaluation method for multi-variant system

Yumei CHEN, Hongchao HU, Yawen WANG

2023, 43(3): 876-884. DOI: 10.11772/j.issn.1001-9081.2022010119

A multi-variant system uses diversified technology and a dynamic redundancy strategy to achieve high security and high reliability at the architecture level. However， existing research rarely addresses the quantitative evaluation of Quality of Service （QoS） for multi-variant systems. To solve this problem， a QoS and loss evaluation method for multi-variant system was proposed. Firstly， on the basis of formal modeling of the multi-variant system architecture， the QoS evaluation model and process were proposed based on QoS attribute and weight matrices， and the importance of each QoS attribute was evaluated by the information entropy method. Secondly， some typical examples of multi-variant system were constructed， and QoS attributes that affect the performance and security of the system were designed and selected. Finally， the QoS attribute weight， QoS value， QoS difference and loss of the system were evaluated quantitatively， in which the average loss of performance attributes is 15.86 percentage points and the gain of the security attribute is 4.98 percentage points. Evaluation results show that the multi-variant mechanism incurs QoS loss， but its security attribute brings a certain degree of QoS gain， which provides a reference for constructing a multi-variant system with high QoS and low QoS loss.

Task scheduling algorithm for service-oriented architecture-based industrial software

Mingchao NING, Junbo ZHANG, Ge CHEN

2023, 43(3): 885-893. DOI: 10.11772/j.issn.1001-9081.2022010055

To address the task scheduling problem of industrial software using Service-Oriented Architecture （SOA）， a task scheduling algorithm for SOA-based industrial software was proposed， considering the multiple attributes of tasks， the randomness， time variation and coupling relationships of these attributes， and the requirements of real-time scheduling and parallel processing of tasks. Firstly， the task scheduling problem was modeled， and a utility function was designed to evaluate the importance of each task. Then， Importance Ranking-based Scheduling Algorithm （IRSA） was proposed to schedule tasks in descending order of importance. Finally， a resource reservation mechanism and a preemptive scheduling mechanism were designed in IRSA to improve the efficiency of task scheduling. Experimental results show that compared with the four online scheduling algorithms First Come First Serve （FCFS）， Earliest Deadline First （EDF）， Least Laxity First （LLF） and Fixed Priority Scheduling （FPS）， when the number of arrival tasks per second reaches 7.99， IRSA reduces the average response time of tasks by 55.83% to 61.27% and has significant advantages on all performance metrics. Therefore， IRSA can achieve efficient task scheduling for SOA-based industrial software.

Multi-depth-of-field 3D shape reconstruction with global spatio-temporal feature coupling

Jiangfeng ZHANG, Tao YAN, Bin CHEN, Yuhua QIAN, Yantao SONG

2023, 43(3): 894-902. DOI: 10.11772/j.issn.1001-9081.2022101589

In response to the inability of existing 3D shape reconstruction models to effectively fuse global spatio-temporal information， a Depth Focus Volume （DFV） module was proposed to retain the transition information of focus and defocus. On this basis， a Global Spatio-Temporal Feature Coupling （GSTFC） model was proposed to extract local and global spatio-temporal feature information of multi-depth-of-field image sequences. Firstly， the 3D-ConvNeXt module and 3D convolutional layer were interspersed in the shrinkage path to capture multi-scale local spatio-temporal features. Meanwhile， the 3D-SwinTransformer module was added to the bottleneck module to capture the global correlations of local spatio-temporal features of multi-depth-of-field image sequences. Then， the local spatio-temporal features and global correlations were fused into global spatio-temporal features through the adaptive parameter layer， and these features were input into the expansion path to guide the generation of the focus volume. Finally， the sequence weight information of the focus volume was extracted by DFV， and the transition information of focus and defocus was retained to obtain the final depth map. Experimental results show that GSTFC decreases the Root Mean Square Error （RMSE） by 12.5% on FoD500 dataset compared with the state-of-the-art All-in-Focus Depth Net （AiFDepthNet） model， and retains more depth-of-field transition relationships compared with the traditional Robust Focus Volume Regularization in Shape from Focus （RFVR-SFF） model.

Sparse representation-based reconstruction algorithm for filtered back-projection ultrasound tomography

Kai LUO, Liang CHEN, Wei LIANG, Yongqiang CHEN

2023, 43(3): 903-908. DOI: 10.11772/j.issn.1001-9081.2022010132

A Filtered Back-Projection （FBP） ultrasonic tomography reconstruction algorithm based on sparse representation was proposed to solve the difficulty of traditional ultrasonic Lamb wave in detecting and clearly describing the delamination defects in composite materials. Firstly， the Lamb wave time-of-flight signals in the composite plate with defect were used as the projection values， the one-dimensional Fourier transform of the projection was equivalent to the two-dimensional Fourier transform of the original image， and the FBP reconstructed image was obtained by convolution with the filter function and back-projection along different directions. Then， the sparse super-resolution model was constructed and jointly trained on a dictionary of low-resolution image blocks and high-resolution image blocks in order to strengthen the sparse similarity between low- and high-resolution blocks and real image blocks， and a complete dictionary was constructed using low- and high-resolution blocks. Finally， the images obtained by FBP were substituted into the constructed dictionary to obtain the complete high-resolution images. Experimental results show that the proposed algorithm improves the Peak Signal-to-Noise Ratio （PSNR）， Structural Similarity （SSIM） and Edge Structural Similarity （ESSIM） values of the reconstructed image by 9.22%， 2.90% and 80.77% compared with the linear interpolation algorithm， and by 4.75%， 1.52% and 16.5% compared with the bicubic spline interpolation algorithm. The proposed algorithm can detect delamination defects in composite materials， improve the resolution of the obtained images with delamination defects and enhance the edge details of the images.

Speech classification model based on improved Inception network

Qiuyu ZHANG, Yukun WANG

2023, 43(3): 909-915. DOI: 10.11772/j.issn.1001-9081.2022010047

Aiming at the complicated process of extracting audio features by traditional audio classification models， and problems of the existing neural network models such as overfitting， low classification accuracy and vanishing gradient， a speech classification model based on improved Inception network was proposed. Firstly， in order to avoid the vanishing gradient while increasing the depth of the network， the residual skip connection idea in Residual Network （ResNet） was added into the model to improve the traditional Inception V2 model. Secondly， the size of the convolution kernel in the Inception module was optimized， and the deep features of the Log-Mel spectrogram of the original speech were extracted by using convolutions of different sizes， so that the model was able to select the appropriate convolution to process the data through self-learning. At the same time， the model was improved in the depth and width dimensions in order to increase the classification accuracy. Finally， the trained network model was used to classify and predict the speech data， and the classification result was obtained through the Softmax function. Experimental results on Tsinghua University Chinese speech database THCHS-30 and ambient sound dataset UrbanSound8K show that the classification accuracy of the improved Inception network model on the two datasets is 92.76% and 93.34%， respectively. Compared with models such as Visual Geometry Group （VGG16）， InceptionV2 and GoogLeNet， the proposed model has the best classification accuracy， with a maximum increase of 27.30 percentage points. It can be seen that the proposed model has stronger feature fusion ability and more accurate classification results， and can solve problems such as overfitting and vanishing gradient.

Object detection algorithm for remote sensing images based on geometric adaptation and global perception

Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN

2023, 43(3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071

Aiming at problems such as small object size， arbitrary object direction and complex background in remote sensing images， an algorithm involving geometric adaptation and global perception was proposed on the basis of the YOLOv5 （You Only Look Once version 5） algorithm. Firstly， deformable convolutions and adaptive spatial attention modules were stacked alternately in series through dense connections. As a result， a Dense Context-Aware Module （DenseCAM） that can model local geometric features was constructed on the basis of taking full advantage of different levels of semantic and location information. Secondly， by introducing Transformer at the end of the backbone network， the global perception ability of the model was enhanced at a low cost， and the relationships between objects and scenario content were modeled. On UCAS-AOD and RSOD datasets， compared with YOLOv5s6 algorithm， the proposed algorithm has the mean Average Precision （mAP） increased by 1.8 percentage points and 1.5 percentage points， respectively. Experimental results show that the proposed algorithm can effectively improve the precision of object detection in remote sensing images.

Lightweight ship target detection algorithm based on improved YOLOv5

Jiadong LI, Danpu ZHANG, Yaqiong FAN, Jianfeng YANG

2023, 43(3): 923-929. DOI: 10.11772/j.issn.1001-9081.2022071096

Aiming at the problem of low accuracy of ship target detection at sea， a lightweight ship target detection algorithm YOLOShip was proposed on the basis of the improved YOLOv5. Firstly， dilated convolution and channel attention were introduced into the Spatial Pyramid Pooling-Fast （SPPF） module， which integrated spatial feature details of different scales， strengthened semantic information， and improved the model’s ability to distinguish foreground and background. Secondly， coordinate attention and lightweight mixed depthwise convolution were introduced into the Feature Pyramid Network （FPN） and Path Aggregation Network （PAN） structures to strengthen important features in the network， obtain features with more detailed information， and improve model detection ability and positioning precision. Thirdly， considering the uneven distribution and relatively small scale changes of targets in the dataset， the model performance was further improved while the model was simplified by modifying the anchors and decreasing the number of detection heads. Finally， a more flexible Polynomial Loss （PolyLoss） was introduced to optimize Binary Cross Entropy Loss （BCE Loss） to improve the model convergence speed and model precision. Experimental results show that on dataset SeaShips， in comparison with YOLOv5s， YOLOShip has the Precision， Recall， mAP@0.5 and mAP@0.5:0.95 increased by 4.2， 5.7， 4.6 and 8.5 percentage points， respectively. Thus， by using the proposed algorithm， better detection precision can be obtained while meeting the requirements of detection speed， effectively achieving high-speed and high-precision ship detection.

Automatic detection of targets under airport pavement based on channel and spatial attention

Haifeng LI, Fan ZHANG, Minnan PIAO, Huaichao WANG, Nansha LI, Zhongcheng GUI

2023, 43(3): 930-935. DOI: 10.11772/j.issn.1001-9081.2022020168

In the task of detecting targets under airport pavement， B-scan maps generated by Ground Penetrating Radar （GPR） have complex backgrounds and lots of noise， and in particular a single B-scan map cannot reflect the complete information of an underground target. To solve these problems， a Three-Dimensional Channel and Spatial Attention UNet （3D-CSA-UNet） model was established to automatically detect the underground targets. Firstly， a Three-Dimensional Channel and Spatial parallel attention Block （3D-CS-Block） was designed to make the model focus on the underground target information in radar C-scan and suppress the interference of backgrounds and noise. Secondly， in order to enhance the capability of 3D-CS-Block in feature extraction， a multi-scale 3D segmentation model was designed to extract feature maps of different sizes from the radar C-scan. Finally， the cross-entropy loss function was employed to calculate the loss value of the feature map under each scale to improve the detection accuracy of the model. On a real dataset of targets under airport pavement， compared with 3D-Fully Convolutional Network （3D-FCN）， 3D-UNet and other algorithms， 3D-CSA-UNet has the average F1 score of pixel-level segmentation for void， rebar and parallel rebar targets increased by at least 12.33， 9.05 and 11.05 percentage points， respectively. Experimental results show that 3D-CSA-UNet can meet the real engineering requirements well.

DeepLabV3+ image segmentation algorithm fusing cumulative distribution function and channel attention mechanism

Xuedong HE, Shibin XUAN, Kuan WANG, Mengnan CHEN

2023, 43(3): 936-942. DOI: 10.11772/j.issn.1001-9081.2022020210

In order to solve the problems that the low-level features of the backbone are not fully utilized and that effective features are lost due to large-factor upsampling in DeepLabV3+ semantic segmentation， a Cumulative Distribution Channel Attention DeepLabV3+ （CDCA-DLV3+） model was proposed. Firstly， a Cumulative Distribution Channel Attention （CDCA） was proposed based on the cumulative distribution function and channel attention. Then， CDCA was used to obtain the effective low-level features of the backbone part. Finally， the Feature Pyramid Network （FPN） was adopted for feature fusion and gradual upsampling to avoid the feature loss caused by large-factor upsampling. On validation set Pascal Visual Object Classes （VOC）2012 and dataset Cityscapes， the mean Intersection over Union （mIoU） of CDCA-DLV3+ model was 80.09% and 80.11% respectively， which was 1.24 percentage points and 1.02 percentage points higher than that of DeepLabV3+ model. Experimental results show that the proposed model has more accurate segmentation results.
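
The mIoU metric reported above is the per-class ratio of intersection to union between predicted and ground-truth pixels, averaged over classes. The sketch below computes it on flat lists of class indices; the function name and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions, and benchmark toolkits may handle absent classes or ignore labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for flat lists of class indices.

    Classes absent from both pred and target are skipped (assumption).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # A correctly labelled pixel counts once in both sets.
            intersection[p] += 1
            union[p] += 1
        else:
            # A mislabelled pixel enlarges the union of both classes.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    prediction = [0, 0, 1, 1]
    ground_truth = [0, 1, 1, 1]
    # Class 0 has IoU 1/2 and class 1 has IoU 2/3.
    print(mean_iou(prediction, ground_truth, num_classes=2))
```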

Improved U-Net for seal segmentation of Republican archives

You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI

2023, 43(3): 943-948. DOI: 10.11772/j.issn.1001-9081.2022020218

Precise seal segmentation benefits the intelligent application of the Republican archives. To address the problems of serious printing invasion and excessive noise, a network for seal segmentation was proposed, namely U-Net for Seal (UNet-S). Based on the encoder-decoder framework and skip connections of U-Net, the proposed network was improved in three aspects. Firstly, a multi-scale residual module was employed to replace the original convolution layer of U-Net. In this way, problems such as network degradation and gradient explosion were avoided, while multi-scale features were extracted effectively by UNet-S. Secondly, Depthwise Separable Convolution (DSConv) was used to replace the ordinary convolution in the multi-scale residual module, thereby greatly reducing the number of network parameters. Thirdly, Binary Cross Entropy Dice Loss (BCEDiceLoss) was used, with weight factors determined by experimental results, to solve the data imbalance problem of archives of the Republic of China. Experimental results show that compared with U-Net, DeepLab v2 and other networks, UNet-S achieved the best Dice Similarity Coefficient (DSC), mean Intersection over Union (mIoU) and Mean Pixel Accuracy (MPA), which increased by up to 17.38%, 32.68% and 0.6% respectively, while the number of parameters decreased by up to 76.64%. It can be seen that UNet-S has a good segmentation effect on the dataset of Republican archives.
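
A combined binary cross-entropy and Dice loss of the kind named above can be sketched as follows. This is a generic formulation, not the authors' implementation: the function name `bce_dice_loss`, the equal default weights `w_bce` and `w_dice`, and the smoothing constant `eps` are all assumptions, since the abstract states only that the weight factors were determined experimentally.

```python
import math


def bce_dice_loss(probs, targets, w_bce=0.5, w_dice=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and (1 - Dice coefficient).

    probs:   predicted foreground probabilities in [0, 1], one per pixel.
    targets: ground-truth labels (0 or 1), one per pixel.
    w_bce, w_dice: hypothetical weight factors (assumed, not from the paper).
    """
    n = len(probs)
    # Binary cross-entropy, with probabilities clamped to avoid log(0).
    bce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(probs, targets)
    ) / n
    # Soft Dice coefficient; eps keeps the ratio defined for empty masks.
    inter = sum(p * t for p, t in zip(probs, targets))
    dice = (2 * inter + eps) / (sum(probs) + sum(targets) + eps)
    return w_bce * bce + w_dice * (1 - dice)


if __name__ == "__main__":
    print(bce_dice_loss([0.9, 0.1, 0.8], [1, 0, 1]))
```

The Dice term depends on the overlap with the foreground rather than on the total pixel count, which is why it is often paired with cross-entropy when foreground pixels (here, seals) are rare.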

Reconfigurable test scheme for 3D stacked integrated circuits based on 3D linear feedback shift register

Tian CHEN, Jianyong LU, Jun LIU, Huaguo LIANG, Yingchun LU

2023, 43(3): 949-955. DOI: 10.11772/j.issn.1001-9081.2022020186

Due to the complex structure of Three-Dimensional Stacked Integrated Circuits (3D SICs), designing an efficient test structure that reduces test cost is more difficult for them than for Two-Dimensional Integrated Circuits (2D ICs). To decrease the cost of 3D SIC testing, a Three-Dimensional Linear Feedback Shift Register (3D-LFSR) test structure was proposed based on the Linear Feedback Shift Register (LFSR), which can effectively adapt to the different test phases of a 3D SIC. The structure was able to perform tests independently in the pre-stacking tests. After stacking, the pre-stacking test structure was reused and reconfigured into a test structure suitable for the current circuit under test, and the reconfigured test structure was able to further reduce test cost. Based on this structure, the corresponding test data processing method and test flow were designed, and a hybrid test mode was adopted to reduce the test time. Experimental results show that compared with the dual-LFSR structure, the 3D-LFSR structure reduced the average power consumption by 40.19%, decreased the average area overhead by 21.31%, and increased the test data compression rate by 5.22 percentage points. In addition, the hybrid test mode reduced the average test time by 20.49% compared with the serial test mode.
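
As background for the LFSR on which the proposed structure is built, a plain Fibonacci LFSR can be sketched as follows. This is not the 3D-LFSR structure itself; the function name `lfsr_states`, the 4-bit width and the tap positions (4, 3) are illustrative assumptions.

```python
def lfsr_states(seed, taps, nbits, count):
    """Return `count` successive states of a left-shifting Fibonacci LFSR.

    seed:  nonzero initial register value.
    taps:  1-indexed bit positions XORed together to form the feedback bit.
    nbits: register width in bits.
    """
    mask = (1 << nbits) - 1
    state = seed & mask
    states = []
    for _ in range(count):
        states.append(state)
        # Feedback bit is the XOR of the tapped register bits.
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        # Shift left by one and feed the new bit into the lowest position.
        state = ((state << 1) | fb) & mask
    return states


if __name__ == "__main__":
    # 4-bit register with taps at bits 4 and 3 (assumed example).
    for s in lfsr_states(seed=1, taps=(4, 3), nbits=4, count=16):
        print(format(s, "04b"))
```

With these taps the register visits all 15 nonzero 4-bit states before returning to the seed, which is the property that makes LFSRs useful as compact pseudo-random test pattern generators.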

Controllable grid multi-scroll chaotic system family and its hardware circuit implementation

Yingjie MA, Jing XIAO, Geng ZHAO, Ping ZENG, Yatao YANG

2023, 43(3): 956-961. DOI: 10.11772/j.issn.1001-9081.2022020193

To strengthen the anti-interference and anti-interception performance of chaotic systems in communication links and to increase the complexity of chaotic system behavior, a new family of grid multi-scroll chaotic systems with a controllable number of scrolls was constructed based on the typical Chua's circuit and step functions. First, two sets of step functions were used as nonlinear controllers of the system; they respectively controlled the number of odd and even columns and the number of rows of the grid multi-scroll chaotic attractors, and kept the scrolls and bonds in the chaotic attractors interleaved with each other. As a result, grid multi-scroll attractors with an arbitrary number of odd and even columns were realized. Then, the dynamic properties of the system, such as equilibrium points, Lyapunov exponents and attractors, were theoretically analyzed and numerically simulated. Finally, hardware experiment results for grid multi-scroll attractors of up to 4 rows and 12 columns were obtained on a Field Programmable Gate Array (FPGA). The hardware and software experimental results are in full agreement with the theoretical analysis, which further proves the physical realizability of the proposed system.

Robust joint modeling and optimization method for visual manipulators

Xianbojun FAN, Lijia CHEN, Shen LI, Chenlu WANG, Min WANG, Zan WANG, Mingguo LIU

2023, 43(3): 962-971. DOI: 10.11772/j.issn.1001-9081.2022010037

To address the problems of low accuracy, difficult deployment and high calibration cost of visual manipulators in complex system environments, a robust joint modeling and optimization method for visual manipulators was proposed. Firstly, the subsystem models of the visual manipulator were integrated, and sample data such as servo motor rotation angles and manipulator end-effector coordinates were collected randomly in the workspace of the manipulator. Then, an Adaptive Multiple-Elites-guided Composite Differential Evolution algorithm with shift mechanism and Layered Optimization mechanism (AMECoDEs-LO) was proposed. Simultaneous optimization of the joint system parameters was completed by parameter identification. Principal Component Analysis (PCA) was performed by AMECoDEs-LO on stage data in the population, and with the idea of parameter dimensionality reduction, an implicit guidance for convergence accuracy and speed was realized. Experimental results show that with AMECoDEs-LO and the joint system model working together, the visual manipulator does not require additional instruments during calibration, achieving fast deployment and a 60% improvement in average accuracy compared with the conventional method. In the cases of broken manipulator linkages, reduced servo motor accuracy and increased camera positioning noise, the system still maintains high accuracy, which verifies the robustness of the proposed method.

Dynamic gait recognition method based on human model constraints

Jinyue LIU, Huiyu LI, Xiaohui JIA, Jiarui LI

2023, 43(3): 972-977. DOI: 10.11772/j.issn.1001-9081.2022010131

To address the accurate recognition of human motion gait in exoskeleton robot human-computer interaction and medical rehabilitation, a dynamic gait recognition method based on human model constraints was proposed. Firstly, the AnyBody Modeling System (AMS) simulation software was used to establish different motion simulation models, the gait phases were divided according to the model constraints, and the correspondence between the real data and the simulation data was established through regression mapping. Then, the plantar pressure data collected by the flexible pressure sensor and the foot displacement data collected by the inertial measurement unit were fused into foot motion data, and the motion data were dynamically segmented according to their dynamic changes and the model constraints to determine the gait phase. Finally, a Convolutional Neural Network (CNN) was built to identify the walking gait phase. Experimental results show that the proposed method has an average gait recognition accuracy of 94.58% for walking, and of 93.21% and 94.64% for going upstairs and downstairs respectively, with the gait recognition accuracy of the three actions (walking, going upstairs and going downstairs) increased by 11.34, 12.19 and 16.03 percentage points, respectively. It can be seen that CNN recognition based on dynamically segmented foot motion data has high accuracy and is suitable for gait recognition of different actions.

Allocation model of urban emergency medical supplies based on random evolution from perspective of resilience

Zhinan LI, Qinming LIU, Haoyang LU

2023, 43(3): 978-985. DOI: 10.11772/j.issn.1001-9081.2022020236

To address the differences in health system resilience across urban areas and the random evolution of demand for emergency medical supplies, a multi-stage dynamic allocation model for emergency medical supplies based on resilience assessment was proposed. Firstly, the entropy method and the K-means algorithm were combined to establish a resilience assessment system and a classification method for the health systems of different areas. Secondly, the random evolution of the demand state was modeled as a Markov process, and triangular fuzzy numbers were used to deal with the fuzzy demand, thereby constructing a multi-stage dynamic allocation model of emergency medical supplies. Finally, the proposed model was solved by the binary Artificial Bee Colony (ABC) algorithm, and the effectiveness of the model was analyzed and verified with an actual example. Experimental results show that the proposed model can dynamically allocate supplies to stabilize demand changes and prioritize allocation to areas with weak resilience, reflecting the fairness and efficiency requirements of emergency management.

Estimation method of tunnel fire smoke velocity based on particle filtering

Qiong HUANG, Zhaoyun DING

2023, 43(3): 986-990. DOI: 10.11772/j.issn.1001-9081.2022010070

To address the current problems of high cost, low simulation accuracy and difficulty in ensuring real-time performance in tunnel fire smoke velocity measurement, a method for estimating the smoke velocity of tunnel fires based on particle filtering was proposed. Firstly, the relevant system state equation and observation equation were established. Then, the observation values were obtained from real-time sensor data. Finally, real-time estimation of smoke velocity was realized by using the particle filtering algorithm. Experimental results show that the response time of the proposed method can reach the millisecond level, which can basically meet the real-time requirements, and that the Mean Absolute Error (MAE) of smoke velocity can basically be controlled within 20% of the true value, indicating high simulation estimation accuracy. The proposed method can provide key information for fire rescue and evacuation, and a theoretical basis for smoke extraction systems and fire planning strategies.
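
The predict, weight and resample steps of a particle filter can be sketched for a scalar velocity as follows. The abstract does not give the actual state and observation equations, so this sketch assumes a random-walk state model and a direct Gaussian-noise observation of the velocity; the function name `particle_filter` and all parameter values are hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_std=0.05,
                    obs_std=0.2, seed=0):
    """Bootstrap particle filter for a scalar state (assumed random-walk model).

    observations: sequence of noisy sensor readings of the velocity.
    Returns one posterior-mean estimate per observation.
    """
    rng = random.Random(seed)
    # Initialise particles around the first observation (an assumption).
    particles = [rng.gauss(observations[0], obs_std) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the assumed state equation.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    readings = [2.0] * 30  # hypothetical constant sensor readings
    print(particle_filter(readings)[-1])
```

Each update costs time linear in the number of particles, so the particle count is the main parameter trading estimation accuracy against response time.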
