Table of Contents

    10 June 2023, Volume 43 Issue 6
    The 37th CCF National Conference of Computer Applications (CCF NCCA 2022)
    Survey of online learning resource recommendation
    Yongfeng DONG, Yacong WANG, Yao DONG, Yahan DENG
    2023, 43(6):  1655-1663.  DOI: 10.11772/j.issn.1001-9081.2022091335

    In recent years, more and more schools have adopted online education widely. However, learners find it hard to locate what they need among the massive learning resources on the Internet. It is therefore important to study online learning resource recommendation and provide personalized recommendations, so that learners can quickly obtain the high-quality learning resources they need. The research status of online learning resource recommendation was analyzed and summarized from the following five aspects. Firstly, the current work of domestic and international online education platforms on learning resource recommendation was summed up. Secondly, four types of algorithms were analyzed and discussed, taking knowledge point exercises, learning paths, learning videos and learning courses as the respective recommendation targets. Thirdly, from the perspectives of learners and learning resources, three kinds of learning resource recommendation algorithms, based on learners’ portraits, learners’ behaviors and learning resource ontologies respectively, were introduced in detail with specific algorithms as examples. Fourthly, the public online learning resource datasets were listed. Finally, the current challenges and future research directions were analyzed.

    Overview of classification methods for complex data streams with concept drift
    Dongliang MU, Meng HAN, Ang LI, Shujuan LIU, Zhihui GAO
    2023, 43(6):  1664-1675.  DOI: 10.11772/j.issn.1001-9081.2022060881

    Traditional classifiers struggle to cope with complex data streams with concept drift, and the classification results obtained are often unsatisfactory. Focusing on how concept drift is handled in different types of data streams, classification methods for complex data streams with concept drift were summarized from four aspects: imbalance, concept evolution, multi-label and noise. Firstly, the classification methods for the four aspects were introduced and analyzed: block-based and online-based learning approaches for classifying imbalanced concept drift data streams, clustering-based and model-based learning approaches for classifying concept evolution concept drift data streams, problem transformation-based and algorithm adaptation-based learning approaches for classifying multi-label concept drift data streams, and approaches for classifying noisy concept drift data streams. Then, the experimental results and performance metrics of these classification methods were compared and analyzed in detail. Finally, the shortcomings of the existing methods and future research directions were given.

    Survey of high utility itemset mining methods based on intelligent optimization algorithm
    Zhihui GAO, Meng HAN, Shujuan LIU, Ang LI, Dongliang MU
    2023, 43(6):  1676-1686.  DOI: 10.11772/j.issn.1001-9081.2022060865

    High Utility Itemset Mining (HUIM) can mine highly significant items from transaction databases, thus helping users make better decisions. Because intelligent optimization algorithms can significantly improve the efficiency of mining high utility itemsets in massive data, a survey of intelligent optimization algorithm-based HUIM methods was presented. Firstly, the intelligent optimization algorithm-based HUIM methods were analyzed and summarized in detail from three aspects: swarm intelligence optimization-based methods, evolution-based methods and methods based on other intelligent optimization algorithms. Meanwhile, the Particle Swarm Optimization (PSO)-based HUIM methods were sorted out in detail by particle update method, including traditional update strategy-based, sigmoid function-based, greedy-based, roulette-based and ensemble-based methods. Additionally, the swarm intelligence optimization algorithm-based HUIM methods were compared and analyzed in terms of population update methods, comparison algorithms, parameter settings, advantages and disadvantages, etc. Next, the evolution-based HUIM methods were summarized from both genetic and bionic aspects. Finally, future research directions were proposed to address the problems of the existing intelligent optimization algorithm-based HUIM methods.

    Survey of Parkinson’s disease auxiliary diagnosis methods based on gait analysis
    Jing QIN, Xueqian MA, Fujie GAO, Changqing JI, Zumin WANG
    2023, 43(6):  1687-1695.  DOI: 10.11772/j.issn.1001-9081.2022060926

    Focusing on the existing diagnosis methods for Parkinson's Disease (PD), the auxiliary diagnosis methods for PD based on gait analysis were reviewed. In clinical practice, gait assessment for PD is commonly based on scales, which is simple and convenient but highly subjective and requires experienced clinicians. With the development of computer technology, more gait analysis methods have become available. Firstly, PD and its abnormal gait manifestations were summarized. Then, the common gait analysis-based auxiliary diagnosis methods for PD were reviewed. These methods can be roughly divided into two types: those based on wearable devices and those based on non-wearable devices. Wearable devices are small and offer high diagnostic accuracy, and they allow the gait status of patients to be monitored over long periods. With non-wearable devices, human gait data are captured through video sensors such as Microsoft Kinect, without requiring patients to wear devices or restricting their movements. Finally, the deficiencies of the existing gait analysis methods were pointed out, and possible future development trends were discussed.

    Dynamic evolution method for microservice composition systems in cloud-edge environment
    Sheng YE, Jing WANG, Jianfeng XIN, Guiling WANG, Chenhong GUO
    2023, 43(6):  1696-1704.  DOI: 10.11772/j.issn.1001-9081.2022060882

    Because the uncertainty of user requirements in the cloud-edge environment requires the microservice composition logic to be adjusted dynamically as user needs change, a Dynamic Evolution method for Microservice Composition system (DE4MC) in the cloud-edge environment was proposed. Firstly, the user's operation was recognized automatically so that the corresponding algorithm strategy could be applied. Secondly, in the deployment stage, after the user submitted a business process, the system selected better nodes for deployment through the deployment algorithm of the proposed method. Finally, in the dynamic adjustment stage, after the user adjusted business process instances, the system performed dynamic evolution through the dynamic adjustment algorithm of the proposed method. Both algorithms comprehensively considered the migration cost of microservice instances, the data communication cost between microservices and users, and the data flow transmission cost between microservices when selecting better nodes for deployment, which shortened the running time and reduced the evolution cost. In the simulation experiments, in the deployment stage, the deployment algorithm of the proposed method has the average running time over all scales 9.7% lower and the total evolution cost 16.8% lower than those of the combination of Heuristic Algorithm (HA) and Non-dominated Sorting Genetic Algorithm-Ⅱ (NSGA-Ⅱ); in the dynamic adjustment stage, the dynamic adjustment algorithm of the proposed method has the average running time over all scales 6.3% lower and the total evolution cost 21.7% lower than those of the same combination. Experimental results show that the proposed method ensures timely evolution of the microservice composition system in the cloud-edge environment with low evolution cost and short business process time, and provides users with satisfactory quality of service.

    Outlier detection algorithm based on hologram stationary distribution factor
    Zhongping ZHANG, Xin GUO, Yuting ZHANG, Ruibo ZHANG
    2023, 43(6):  1705-1712.  DOI: 10.11772/j.issn.1001-9081.2022060930

    Traditional graph-based methods construct the transition probability matrix for outlier detection from the overall distribution of the data, so the local information of the data is easily ignored, resulting in low detection accuracy, whereas using the local information of the data may lead to the “suspended link” problem. To address these problems, an Outlier Detection algorithm based on Hologram Stationary Distribution Factor (HSDFOD) was proposed. Firstly, a local information graph was constructed by adaptively obtaining the set of neighbors of each data point through the similarity matrix. Then, a global information graph was constructed by the minimum spanning tree. Finally, the local information graph and the global information graph were integrated into a hologram to construct a transition probability matrix for Markov random walk, and the outliers were detected through the generated stationary distribution. On the synthetic datasets A1 to A4, HSDFOD has higher precision than SOD (Outlier Detection in axis-parallel Subspaces of high dimensional data), SUOD (accelerating large-Scale Unsupervised heterogeneous Outlier Detection), IForest (Isolation Forest) and HBOS (Histogram-Based Outlier Score), and its AUC (Area Under Curve) is also generally better than those of the four comparison algorithms. On the real datasets, the precision of HSDFOD is higher than 80%, and the AUC of HSDFOD is higher than those of SOD, SUOD, IForest and HBOS. It can be seen that the proposed algorithm has good application prospects in outlier detection.
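
    The final step described above, a Markov random walk whose stationary distribution is used for detection, can be sketched as follows. This is a minimal illustration only: the abstract does not specify how the hologram weights are built or how the stationary distribution is turned into an outlier score, so the weight matrix `W`, the helper names `row_normalize` and `stationary_distribution`, and the reading "lowest stationary probability is the most suspicious point" are all assumptions.

```python
def row_normalize(W):
    """Turn a non-negative weight matrix into a row-stochastic transition matrix."""
    P = []
    for row in W:
        total = sum(row)
        if total == 0:
            # A node with no outgoing weight has no valid transition row.
            raise ValueError("every node needs at least one outgoing edge")
        P.append([w / total for w in row])
    return P


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi * P until successive iterates agree."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical 3-node graph: nodes 0 and 1 are strongly tied, node 2 is weakly attached.
W = [
    [0.0, 2.0, 1.0],
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
pi = stationary_distribution(row_normalize(W))
suspect = pi.index(min(pi))  # assumed scoring rule: least-visited node
```

    For a connected, non-bipartite undirected graph such as this one, the stationary probability of each node is proportional to its weighted degree, which gives a quick sanity check on the iteration.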

    Multi-view ensemble clustering algorithm based on view-wise mutual information weighting
    Jinghuan LAO, Dong HUANG, Changdong WANG, Jianhuang LAI
    2023, 43(6):  1713-1718.  DOI: 10.11772/j.issn.1001-9081.2022060925

    Many existing multi-view clustering algorithms cannot estimate the reliability of different views and weight the views accordingly, while the multi-view clustering algorithms that do support view weighting generally rely on the iterative optimization of a specific objective function, so their real-world applications may be significantly affected by the practicality of the objective function and by how reasonably some sensitive hyperparameters are tuned. To address these problems, a Multi-view Ensemble Clustering algorithm based on View-wise Mutual Information Weighting (MEC-VMIW) was proposed, whose overall process consists of two phases: the view-wise mutual weighting phase and the multi-view ensemble clustering phase. In the view-wise mutual weighting phase, multiple random down-samplings were performed on the dataset to reduce the problem size in the evaluating and weighting process. After that, a set of down-sampled clusterings of multiple views was constructed. Then, based on multiple runs of mutual evaluation among the clustering results of different views, the view-wise reliability was estimated and used for view weighting. In the multi-view ensemble clustering phase, an ensemble of base clusterings was constructed for each view, and the multiple base clustering sets were weighted to model a bipartite graph structure. By performing efficient bipartite graph partitioning, the final multi-view clustering results were obtained. Experiments on several multi-view datasets confirm the robust clustering performance of the proposed multi-view ensemble clustering algorithm.

    Group buying recommendation method based on social relationship and time-series information
    Nannan SUN, Chunhui PIAO, Xinna MA
    2023, 43(6):  1719-1729.  DOI: 10.11772/j.issn.1001-9081.2022060860

    To address the problems that few studies on group buying recommendation combine single users with group users, and that context-related information such as time intervals and social relationships is not fully utilized, a group buying recommendation method based on social relationship and time-series information was proposed. When recommending for single users, the Gated Recurrent Unit (GRU) of Recurrent Neural Network (RNN) does not consider the influence of time-series information, and the irrelevant commodity data in the user-commodity interaction sequence generate noise. Therefore, a group buying Recommendation model integrating Time-series aware GRU and Self-Attention (RTSA) was proposed. Firstly, a Time-series aware GRU (TGRU) model was constructed by calculating the personalized time interval between any two commodities purchased by the user. Then, the influence of the commodity locations and the personalized time intervals was studied by using a self-attention network. Finally, experimental results show that on Amazon Beauty dataset, compared with Time interval aware Self-Attention for Sequential Recommendation (TiSASRec), the optimal baseline model of recommending for single users, RTSA has the hit rate for top-10 commodities increased by 11.73%. When recommending for group users, the pre-defined fusion strategy in group recommendation for group buying cannot dynamically obtain group user weights, and the group-item interaction data are sparse. Therefore, a Group buying Recommendation model integrating Social network and hierarchical Self-Attention (SSAGR) was proposed. Firstly, an RNN was employed to capture the complex potential interests of users in group buying as they change over time. Secondly, a hierarchical self-attention network was used to integrate social network information into user representations, and a group preference aggregation strategy was implemented under different weights. Thirdly, the group-item interactions were mined through Neural Collaborative Filtering (NCF) to complete group buying recommendations. Finally, experimental results show that on MaFengWo dataset, compared with AGREE (Attentive Group REcommEndation), the optimal baseline model of recommending for group users, SSAGR has the hit rate for top-5 commodities improved by 3.53%.

    Color image information hiding algorithm based on style transfer process
    Pan YANG, Minqing ZHANG, Yu GE, Fuqiang DI, Yingnan ZHANG
    2023, 43(6):  1730-1735.  DOI: 10.11772/j.issn.1001-9081.2022060953

    To address the inability of information hiding algorithms based on neural style transfer to embed color images, a color image information hiding algorithm based on style transfer process was proposed. Firstly, the feature extraction advantages of Convolutional Neural Network (CNN) were utilized to extract the semantic information of the carrier image, the style information of the style image and the feature information of the color image, respectively. Then, the semantic content of images and different styles were fused together. Finally, the embedding of the color image was completed while the style transfer of the carrier image was performed through the decoder. Experimental results show that the proposed algorithm can integrate the secret image into the generated stylized image effectively, making the secret information embedding behavior indistinguishable from the style change behavior. While maintaining the security of the algorithm, the proposed algorithm has the hiding capacity increased to 24 bpp, and the average values of Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) reach 25.29 dB and 0.85 respectively, thereby solving the color image embedding problem effectively.

    Monocular depth estimation method based on pyramid split attention network
    Wenju LI, Mengying LI, Liu CUI, Wanghui CHU, Yi ZHANG, Hui GAO
    2023, 43(6):  1736-1742.  DOI: 10.11772/j.issn.1001-9081.2022060852

    To address the inaccurate prediction of edges and the farthest region in monocular image depth estimation, a monocular depth estimation method based on Pyramid Split attention Network (PS-Net) was proposed. Firstly, based on Boundary-induced and Scene-aggregated Network (BS-Net), a Pyramid Split Attention (PSA) module was introduced in PS-Net to process the spatial information of multi-scale features and effectively establish the long-term dependence between multi-scale channel attentions, thereby extracting the boundaries with sharp changes in depth gradient and the farthest region. Then, the Mish function was used as the activation function in the decoder to further improve the performance of the network. Finally, training and evaluation were performed on NYUD v2 (New York University Depth dataset v2) and iBims-1 (independent Benchmark images and matched scans v1) datasets. Experimental results on iBims-1 dataset show that the proposed network reduces the Directed Depth Error (DDE) by 1.42 percentage points compared with BS-Net, and that its proportion of correctly predicted depth pixels reaches 81.69%. These results show that the proposed network has high accuracy in depth prediction.
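
    For reference, the Mish activation mentioned above is commonly defined as mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x). The scalar sketch below uses only the standard library; the function names are illustrative, and nothing here reflects how PS-Net's decoder actually applies the activation.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + e^x): avoids overflow for large positive x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: smooth, non-monotonic, and close to identity for large x."""
    return x * math.tanh(softplus(x))
```

    Unlike ReLU, Mish lets small negative inputs pass through as small negative outputs instead of clamping them to zero.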

    Image segmentation model based on improved particle swarm optimization algorithm and genetic mutation
    Jun LIANG, Zehong HONG, Songsen YU
    2023, 43(6):  1743-1749.  DOI: 10.11772/j.issn.1001-9081.2022060945

    Image segmentation is a key step from image processing to image analysis. To overcome the strong dependence of cluster partitioning on the initial cluster centers, an image segmentation model PSOM-K (Particle Swarm Optimization Mutations-K-means) based on improved Particle Swarm Optimization (PSO) algorithm and genetic mutation was proposed. Firstly, the PSO formula was improved by increasing the influence of random neighbor particle positions on a particle's own position and expanding the search space of the algorithm, so that the algorithm was able to find the global optimal solution quickly. Secondly, the mutation operation of genetic algorithm was incorporated to improve the generalization ability of the model. Thirdly, the positions of the k-means cluster centers were initialized with the improved PSO algorithm in each of the three channels: Red (R), Green (G) and Blue (B). Finally, k-means was used to perform the image segmentation in each of the R, G and B channels, and the images of the three channels were merged. Experimental results on Berkeley Segmentation Dataset (BSDS500) show that at k=4, the Feature Similarity Index Measure (FSIM) is improved by 7.7% to 12.69% compared to CEFO (Chaotic Electromagnetic Field Optimization) method and by 5.05% to 19.02% compared to WOA-DE (Whale Optimization Algorithm-Differential Evolution) method. Compared with the fine-grained segmentation algorithm HWOA (Hybrid Whale Optimization Algorithm), PSOM-K decreases by at most 0.45% in FSIM but improves by 7.59% to 13.58% in Peak Signal-to-Noise Ratio (PSNR) at k=40. Therefore, processing the three channels independently, increasing the position influence of random neighbor particles in the particle swarm, and genetic mutation are three effective strategies for finding better positions of k-means cluster centers, and they can improve the performance of image segmentation greatly.

    Few-shot recognition method of 3D models based on Transformer
    Hui WANG, Jianhong LI
    2023, 43(6):  1750-1758.  DOI: 10.11772/j.issn.1001-9081.2022060952

    To address the classification of Three-Dimensional (3D) models, a Transformer-based few-shot recognition method for 3D models was proposed. Firstly, the 3D point cloud models of the support and query samples were fed into the feature extraction module to obtain feature vectors. Then, the attention features of the support samples were calculated in the Transformer module. Finally, the cosine similarity network was used to calculate the relation scores between the query samples and the support samples. On ModelNet 40 dataset, compared with the Dual-Long Short-Term Memory (Dual-LSTM) method, the proposed method increases the 5-way 1-shot and 5-way 5-shot recognition accuracies by 34.54 and 21.00 percentage points, respectively. The proposed method also obtains high accuracy on ShapeNet Core dataset. Experimental results show that the proposed method can recognize new categories of 3D models more accurately.
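
    The last step above scores a query against the support samples by cosine similarity. The sketch below shows only plain cosine similarity on already-extracted feature vectors; the abstract's "cosine similarity network" may include learned components that are not described there, and the names `cosine_similarity`, `classify` and the toy feature vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def classify(query, support):
    """Return the class whose support feature scores highest against the query.

    `support` maps a class label to one feature vector per class (assumed here;
    with several shots per class, the features would first need to be combined).
    """
    return max(support, key=lambda label: cosine_similarity(query, support[label]))


# Toy 2-way example with made-up 2D features.
support_features = {"chair": [1.0, 0.0], "table": [0.0, 1.0]}
predicted = classify([0.9, 0.2], support_features)
```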

    Remora optimization algorithm based on chaotic host switching mechanism
    Heming JIA, Shanglong LI, Lizhen CHEN, Qingxin LIU, Di WU, Rong ZHENG
    2023, 43(6):  1759-1767.  DOI: 10.11772/j.issn.1001-9081.2022060901

    The optimization process of Remora Optimization Algorithm (ROA) includes three modes: attaching to host, empirical attack and host foraging, and both the exploration ability and the exploitation ability of this algorithm are relatively strong. However, because the original algorithm switches the host through empirical attack, it suffers from a poor balance between exploration and exploitation, slow convergence and a tendency to fall into local optima. To address these problems, a Modified ROA (MROA) based on chaotic host switching mechanism was proposed. Firstly, a new host switching mechanism was designed to better balance exploration and exploitation. Then, to diversify the initial hosts of remora, Tent chaotic mapping was introduced for population initialization to further optimize the performance of the algorithm. Finally, MROA was compared with six algorithms such as the original ROA and Reptile Search Algorithm (RSA) on the CEC2020 test functions. The experimental results show that the best fitness value, average fitness value and fitness value standard deviation obtained by MROA are better than those obtained by ROA, RSA, Whale Optimization Algorithm (WOA), Harris Hawks Optimization (HHO) algorithm, Sperm Swarm Optimization (SSO) algorithm, Sine Cosine Algorithm (SCA), and Sooty Tern Optimization Algorithm (STOA) on average by 28%, 33% and 12%, respectively. The test results based on CEC2020 show that MROA has good optimization ability, convergence ability and robustness. In addition, the effectiveness of MROA in engineering problems was further verified by solving the design problems of welded beam and multi-plate clutch brake.
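
    Tent chaotic mapping for population initialization, as mentioned above, can be sketched as follows. The abstract does not give the map's exact form or parameters, so this uses one common skewed variant (x/a below the breakpoint a, (1 - x)/(1 - a) above it); the breakpoint `a=0.7`, the seed `x0=0.37` and the function names are assumptions made for illustration.

```python
def tent_map(x, a=0.7):
    """One step of a skewed tent map on [0, 1] with breakpoint a (assumed value)."""
    return x / a if x < a else (1.0 - x) / (1.0 - a)


def tent_init(pop_size, dim, lower, upper, x0=0.37, a=0.7):
    """Build an initial population by scaling a tent-map sequence into the bounds.

    lower/upper are per-dimension bounds; x0 is the chaotic seed in (0, 1).
    """
    x = x0
    population = []
    for _ in range(pop_size):
        individual = []
        for d in range(dim):
            x = tent_map(x, a)  # next chaotic value in [0, 1]
            individual.append(lower[d] + x * (upper[d] - lower[d]))
        population.append(individual)
    return population


# Hypothetical usage: 5 remoras in a 3-dimensional search space [-5, 5]^3.
population = tent_init(pop_size=5, dim=3, lower=[-5.0] * 3, upper=[5.0] * 3)
```

    The sequence is deterministic for a given seed, so runs are reproducible; a skewed breakpoint is used here because the symmetric map with slope 2 tends to collapse to zero in binary floating point.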

    Algorithm path self-assembling model for business requirements
    Yao LIU, Xin TONG, Yifeng CHEN
    2023, 43(6):  1768-1778.  DOI: 10.11772/j.issn.1001-9081.2022060944

    Algorithm platforms, as a way of implementing automated machine learning, have attracted wide attention in recent years. However, the business processes of these platforms need to be built manually, model calling on these platforms is inflexible, and algorithms cannot be constructed automatically and customized for specific business requirements. To address these problems, an algorithm path self-assembling model for business requirements was proposed. Firstly, the sequence features and structural features of code were modeled simultaneously based on Graph Convolutional Network (GCN) and word2vec representation. Secondly, functions in the algorithm set were further discovered through a clustering model, and the obtained function subsets were used to prepare for discovering paths of algorithm components between subsets. Finally, based on the relationship discovery model and ranking model trained with prior knowledge, the self-assembled paths of candidate code components were mined, thus realizing the self-assembling of algorithm code. On the proposed evaluation indicators, the best result of the proposed algorithm path self-assembling model is 0.8, while that of the baseline model Okapi BM25+word2vec is 0.21. To a certain extent, the proposed model solves the problem of missing code structure and semantic information in traditional code representation methods, and lays a foundation for research on refining algorithm process self-assembling and constructing algorithm pipelines automatically.

    Self-adaptive Web crawler code generation method based on webpage source code structure comprehension
    Yao LIU, Ru LIU, Yu ZHAI
    2023, 43(6):  1779-1784.  DOI: 10.11772/j.issn.1001-9081.2022060929

    To address Web crawler code failure and the high manual maintenance cost caused by webpage source code changes resulting from frequent webpage redesigns, especially changes in the element structures or attribute identifiers of target entities such as article dates, main body of text or source organizations, a self-adaptive Web crawler code generation method based on webpage source code structure comprehension was proposed. Firstly, the corresponding Web crawler code was extracted by analyzing the change patterns of webpage structural characteristics. Secondly, the changes in the webpage source code and in the crawler code were represented by the Encoder-Decoder model. By fusing the semantic features of the webpage source code structure, the features of webpage source code changes and the features of webpage code changes, an adaptive code generation model was obtained. Finally, the perception, generation and activation mechanisms of the adaptive system were improved to form a Web crawler system with adaptive processing capability. Compared with TF-IDF+Seq2Seq and TriDNR+Seq2Seq models, the proposed adaptive code generation model was experimentally shown to be superior in representing webpage source code changes and effective in code generation, with a final accuracy of 78.5%. The proposed method can solve the Web crawler code operation problems caused by webpage source code changes, and provides a new idea for adaptive processing in Web crawler techniques for Web resource acquisition.

    Artificial intelligence
    Review of lifelong learning in computer vision
    Yichi CHEN, Bin CHEN
    2023, 43(6):  1785-1795.  DOI: 10.11772/j.issn.1001-9081.2022050766

    LifeLong Learning (LLL), as an emerging method, breaks the limitations of traditional machine learning and gives models the ability to accumulate, optimize and transfer knowledge during learning, as human beings do. In recent years, with the wide application of deep learning, more and more studies have attempted to solve the catastrophic forgetting problem in deep neural networks and escape the stability-plasticity dilemma, as well as to apply LLL methods to a wide variety of real-world scenarios to promote the development of artificial intelligence from weak to strong. Focusing on the field of computer vision, firstly, LLL methods for image classification tasks were classified into four types: data-driven methods, optimization process based methods, network structure based methods and knowledge combination based methods. Then, typical applications of LLL methods in other visual tasks and the related evaluation indicators were introduced. Finally, the deficiencies of LLL methods at the current stage were discussed, and future development directions of LLL methods were proposed.

    Aspect-based sentiment analysis model fused with multi-window local information
    Zhixiong ZHENG, Jianhua LIU, Shuihua SUN, Ge XU, Honghui LIN
    2023, 43(6):  1796-1802.  DOI: 10.11772/j.issn.1001-9081.2022060891

    To address the issue that current Aspect-Based Sentiment Analysis (ABSA) models rely too much on the syntactic dependency tree, whose relationships are relatively sparse, to learn feature representations, which leaves the models with insufficient ability to learn local information, an ABSA model fused with multi-window local information called MWGAT (combining Multi-Window local information and Graph ATtention network) was proposed. Firstly, the local contextual features were learned through the multi-window local feature learning mechanism, and the potential local information contained in the text was mined. Secondly, Graph ATtention network (GAT), which can better understand the syntactic dependency tree, was used to learn the syntactic structure information represented by the syntactic dependency tree, and syntax-aware contextual features were generated. Finally, these two types of features representing different semantic information were fused to form a feature representation containing both the syntactic information of the syntactic dependency tree and the local information, so that the classifier could discriminate the sentiment polarities of aspect words efficiently. Experiments were conducted on three public datasets: Restaurant, Laptop and Twitter. The results show that compared with the T-GCN (Type-aware Graph Convolutional Network) model combined with the syntactic dependency tree, the proposed model has the Macro-F1 score improved by 2.48%, 2.37% and 0.32% respectively. It can be seen that the proposed model can mine potential local information effectively and predict the sentiment polarities of aspect words more accurately.

    Cross-modal person re-identification relation network based on dual-stream structure
    Yubin GUO, Xiang WEN, Pan LIU, Ximing LI
    2023, 43(6):  1803-1810.  DOI: 10.11772/j.issn.1001-9081.2022050665

    In visible-infrared cross-modal person re-identification, the differences between modalities lead to low identification accuracy. Therefore, a dual-stream structure based cross-modal person re-identification relation network, named IVRNBDS (Infrared and Visible Relation Network Based on Dual-stream Structure), was proposed. Firstly, the dual-stream structure was used to extract the features of the visible light and infrared person images respectively. Then, the feature map of the person image was divided horizontally into six segments to extract the relationships between the local features of each segment and the features of the other segments, as well as the relationship between the core features and the average features of the person. Finally, in the design of the loss function, the Hetero-Center triplet Loss (HC Loss) function was introduced to relax the strict constraints of the ordinary triplet loss function, so that image features of different modalities could be better mapped into the same feature space. Experimental results on public datasets SYSU-MM01 (Sun Yat-sen University MultiModal re-identification) and RegDB (Dongguk Body-based person Recognition) show that the computational cost of IVRNBDS is slightly higher than those of the mainstream cross-modal person re-identification algorithms, but the proposed network improves the Rank-1 (similarity Rank 1) and mAP (mean Average Precision) compared to the mainstream algorithms, increasing the recognition accuracy of cross-modal person re-identification.

    Pedestrian fall detection algorithm in complex scenes
    Ke FANG, Rong LIU, Chiyu WEI, Xinyue ZHANG, Yang LIU
    2023, 43(6):  1811-1817.  DOI: 10.11772/j.issn.1001-9081.2022050754

    With the deepening of population aging, fall detection has become a key issue in the medical and health field. Concerning the low accuracy of fall detection algorithms in complex scenes, an improved fall detection model PDD-FCOS (PVT DRFPN DIoU-Fully Convolutional One-Stage object detection) was proposed. Pyramid Vision Transformer (PVT) was introduced into the backbone network of baseline FCOS algorithm to extract richer semantic information without increasing the amount of computation. In the feature information fusion stage, Double Refinement Feature Pyramid Networks (DRFPN) were inserted to learn the positions and other information of sampling points between feature maps more accurately, and more accurate semantic relationship between feature channels was captured by context information to improve the detection performance. In the training stage, the bounding box regression was carried out by the Distance Intersection Over Union (DIoU) loss. By optimizing the distance between the prediction box and the center point of the object box, the regression box was made to converge faster and more accurately, which improved the accuracy of the fall detection algorithm effectively. Experimental results show that on the open-source dataset Fall detection Database, the mean Average Precision (mAP) of the proposed model reaches 82.2%, which is improved by 6.4 percentage points compared with that of the baseline FCOS algorithm, and the proposed algorithm has accuracy improvement and better generalization ability compared with other state-of-the-art fall detection algorithms.
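
    The DIoU loss mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly cited definition, loss = 1 - IoU + d^2 / c^2, where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The function name and the (x1, y1, x2, y2) box format are assumptions made for illustration and are not taken from the PDD-FCOS paper.

```python
def diou_loss(pred, target):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; assumes both boxes have positive area.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Squared distance between the two box centers.
    d2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(px2, tx2) - min(px1, tx1)) ** 2 \
        + (max(py2, ty2) - min(py1, ty1)) ** 2

    return 1.0 - iou + d2 / c2
```

    Unlike a plain IoU loss, the center-distance term still gives a useful signal when the boxes do not overlap at all, which is consistent with the faster convergence described in the abstract.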

    Semantic segmentation for 3D point clouds based on feature enhancement
    Bin LU, Jielin LIU
    2023, 43(6):  1818-1825.  DOI: 10.11772/j.issn.1001-9081.2022050688

    In order to mine and sense the geometric features of point clouds and further improve the semantic segmentation of point clouds through feature enhancement, a point cloud semantic segmentation network based on feature enhancement was proposed. Firstly, the Geometric Feature Sensing Of Point cloud (GFSOP) module was designed to make the network capable of sensing the local geometric structure of point clouds: semantic representations were enhanced by capturing spatial features between points, and multi-scale features were obtained by hierarchical feature extraction. At the same time, spatial attention and channel attention were fused to predict semantic labels of point clouds, and the segmentation performance was improved by strengthening spatial correlation and channel dependence. Experimental results on the indoor dataset S3DIS (Stanford large-scale 3D Indoor Spaces) show that compared with PointNet++, the proposed network improves the mean Intersection over Union (mIoU) by 5.7 percentage points and the Overall Accuracy (OA) by 3.1 percentage points, and has stronger generalization performance and more robust segmentation on point clouds with noise, uneven point density and unclear boundaries.
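
    The two metrics reported in this abstract, mIoU and OA, can be computed from per-point labels as in the sketch below. The function name and the convention of skipping classes that appear in neither the prediction nor the ground truth are assumptions; benchmark toolkits may handle absent classes differently.

```python
def miou_and_oa(pred, truth, num_classes):
    """Return (mean IoU, overall accuracy) for per-point class labels."""
    # conf[t][p] counts points of true class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both pred and truth
            ious.append(tp / denom)

    oa = sum(conf[c][c] for c in range(num_classes)) / len(truth)
    return sum(ious) / len(ious), oa
```

    OA counts every point equally, while mIoU averages over classes, so mIoU is the more sensitive of the two when some classes have few points.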

    Ancient mural dynasty identification based on attention mechanism and transfer learning
    Huibin ZHANG, Liping FENG, Yaojun HAO, Yining WANG
    2023, 43(6):  1826-1832.  DOI: 10.11772/j.issn.1001-9081.2022071008

    Convolutional Neural Networks (CNNs) have been successfully used to classify dynasties of ancient murals from Dunhuang. Aiming at the problem that using some data enhancement methods to expand the training set would reduce the prediction accuracy due to the limited amount of data of Dunhuang murals, a Residual Network (ResNet) model based on attention mechanism and transfer learning was proposed. Firstly, the residual connection method of the residual network was improved. Then, the POlarized Self-Attention (POSA) module was used to help the network model to extract the edge local detail features and global contour features of the images, and the learning ability of the network model in a small sample environment was enhanced. Finally, the algorithm for classifier was improved, so that the classification performance of the network model was improved. Experimental results show that the proposed model achieves 98.05% accuracy of dynastic classification on DH1926 small sample dataset of Dunhuang murals, and the dynasty identification accuracy of the proposed model is improved by 5.21 percentage points compared with that of the standard ResNet20 network model.

    Data science and technology
    Survey on anomaly detection algorithms for unmanned aerial vehicle flight data
    Chaoshuai QI, Wensi HE, Yi JIAO, Yinghong MA, Wei CAI, Suping REN
    2023, 43(6):  1833-1841.  DOI: 10.11772/j.issn.1001-9081.2022060808

    Focused on the issue of anomaly detection for Unmanned Aerial Vehicle (UAV) flight data in the field of UAV airborne health monitoring, firstly, the characteristics of UAV flight data, the common flight data anomaly types and the corresponding demands on anomaly detection algorithms for UAV flight data were presented. Then, the existing research on UAV flight data anomaly detection algorithms was reviewed, and these algorithms were classified into three categories: prior-knowledge based algorithms for qualitative anomaly detection, model-based algorithms for quantitative anomaly detection, and data-driven anomaly detection algorithms. At the same time, the application scenarios, advantages and disadvantages of the above algorithms were analyzed. Finally, the current problems and challenges of UAV anomaly detection algorithms were summarized, and key development directions of the field of UAV anomaly detection were prospected, thereby providing reference ideas for future research.

    Feature selection for imbalanced data based on neighborhood tolerance mutual information and whale optimization algorithm
    Lin SUN, Jinxu HUANG, Jiucheng XU
    2023, 43(6):  1842-1854.  DOI: 10.11772/j.issn.1001-9081.2022050691

    Aiming at the problems that most feature selection algorithms do not fully consider the non-uniform class distribution of data, the correlation between features and the influence of different parameters on the feature selection results, a feature selection method for imbalanced data based on neighborhood tolerance mutual information and Whale Optimization Algorithm (WOA) was proposed. Firstly, for the binary and multi-class datasets in an incomplete neighborhood decision system, two kinds of feature importance for imbalanced data were defined on the basis of the upper and lower boundary regions. Then, to fully reflect the decision-making ability of features and the correlation between features, the neighborhood tolerance mutual information was developed. Finally, by integrating the feature importance of imbalanced data and the neighborhood tolerance mutual information, a Feature Selection for Imbalanced Data based on Neighborhood tolerance mutual information (FSIDN) algorithm was designed, where the optimal parameters of the feature selection algorithm were obtained by using WOA, and a nonlinear convergence factor and an adaptive inertia weight were introduced to improve WOA and prevent it from falling into local optima. Experimental results on 8 benchmark functions show that the improved WOA has good optimization performance, and the experimental results of feature selection on 13 binary and 4 multi-class imbalanced datasets show that the proposed algorithm can effectively select feature subsets with good classification performance compared with the other related algorithms.

    Noise robust dynamic time warping algorithm
    Lianpeng QIU, Chengyun SONG
    2023, 43(6):  1855-1860.  DOI: 10.11772/j.issn.1001-9081.2022060885

    The Dynamic Time Warping (DTW) algorithm measures the similarity between two time series by finding the best match between them. Aiming at the problem of excessive stretching and compression during time series matching caused by noise in the sequences, a Noise robust Dynamic Time Warping (NoiseDTW) algorithm was proposed. Firstly, extra noise was introduced into the original signal, which solved the problem of one point being aligned to multiple points in sequence alignment. Secondly, by finding an optimal matching path between two time series among multiple possible matching paths, the influence of the randomness of noise on the time series similarity measure was reduced. Finally, the matching paths were mapped back to the original sequences. Experimental results on eight time series datasets show that, when combined with K-Nearest Neighbors (KNN) and compared against Euclidean Distance (ED), DTW, Sakoe-Chiba window DTW (Sakoe-Chiba DTW) and Weighted DTW (WDTW), the proposed algorithm improves the classification accuracy by 1 to 15 percentage points over the second-best algorithm, indicating that the proposed algorithm has good classification performance and is robust to noise.
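
    For reference, the baseline that NoiseDTW builds on can be sketched as the classic dynamic-programming DTW below. This is the standard textbook algorithm, not the NoiseDTW method itself (the abstract does not give enough detail to reproduce it); the function name and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]
```

    The two non-diagonal moves are what allow one point to be aligned with several points; under noise this freedom produces the excessive stretching and compression that the abstract describes.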

    Cyber security
    Intrusion detection method for control logic injection attack against programmable logic controller
    Yiting SUN, Yue GUO, Changjin LI, Hongjun ZHANG, Kang LIU, Junjiao Liu, Limin SUN
    2023, 43(6):  1861-1869.  DOI: 10.11772/j.issn.1001-9081.2022050914

    Control logic injection attacks against Programmable Logic Controllers (PLCs) manipulate the physical process by tampering with the control program, thereby affecting the control process or destroying the physical facilities. Aiming at PLC control logic injection attacks, an intrusion detection method based on automatic generation of whitelist rules, called PLCShield (Programmable Logic Controller Shield), was proposed. Based on the fact that the PLC control program carries comprehensive and complete physical process control information, the proposed method includes two stages: firstly, by analyzing the PLC program’s configuration file, instruction functions, variable attributes, execution paths and other information, detection rules covering program attributes, addresses, value ranges and structure were extracted; secondly, actively requesting a “snapshot” of the running PLC was combined with passively monitoring network traffic to obtain real-time information such as the current running status of the PLC and the operations and states in the traffic, and attack behavior was identified by comparing the obtained information with the detection rules. Four PLCs of different manufacturers and models were used as research cases to verify the feasibility of PLCShield. Experimental results show that the attack detection accuracy of the proposed method can reach more than 97.71%, which indicates that the proposed method is effective.

    Software Guard Extensions-based secure data processing framework for traffic monitoring of internet of vehicles
    Ruiqi FENG, Leilei WANG, Xiang LIN, Jinbo XIONG
    2023, 43(6):  1870-1877.  DOI: 10.11772/j.issn.1001-9081.2022050734

    Internet of Vehicles (IoV) traffic monitoring requires the transmission, storage and analysis of private user data, making the security of private data particularly crucial. However, traditional security solutions often struggle to guarantee real-time computing and data security at the same time. To address this issue, security protocols, including two initialization protocols and a periodic reporting protocol, were designed, and a Software Guard Extensions (SGX)-based IoV traffic monitoring Secure Data Processing Framework (SDPF) was built. In SDPF, trusted hardware was used to enable plaintext computation on private data in the Road Side Unit (RSU), and efficient operation and privacy protection of the framework were ensured through the security protocols and a hybrid encryption scheme. Security analysis shows that SDPF is resistant to eavesdropping, tampering, replay, impersonation, rollback and other attacks. Experimental results show that all computational operations of SDPF are at the millisecond level; specifically, the total data processing overhead for a single vehicle is less than 1 millisecond. Compared with PFCF (Privacy-preserving Fog Computing Framework for vehicular crowdsensing networks) based on fog computing and PPVF (Privacy-preserving Protocol for Vehicle Feedback in cloud-assisted Vehicular Ad hoc NETwork (VANET)) based on homomorphic encryption, SDPF has a more comprehensive security design, with the message length of a single session reduced by more than 90% and the computational cost reduced by at least 16.38%.

    Adaptive interaction feedback based trust evaluation mechanism for power terminals
    Xingshen WEI, Peng GAO, Zhuo LYU, Yongjian CAO, Jian ZHOU, Zhihao QU
    2023, 43(6):  1878-1883.  DOI: 10.11772/j.issn.1001-9081.2022050717

    In power systems, the trust evaluation of terminals is a key technology for graded access and secure data collection, and is critical to the safe and stable operation of the power grid. Traditional trust evaluation models usually calculate the trust score directly from the identification, running states, interaction histories, etc. of the terminals, and perform poorly against indirect attacks and node collusion. To address these problems, an Adaptive Interaction Feedback based Trust evaluation (AIFTrust) mechanism was proposed. In the proposed mechanism, the device trust level was measured comprehensively by a direct trust evaluation module, a trust recommendation module and a trust aggregation module, and accurate trust evaluation for massive collaborative terminals in power information systems was achieved. First, the interaction cost was introduced by the direct trust evaluation module, and the direct trust score of the malicious target terminal was calculated on the basis of the trust decay policy. Then, the experience similarity was introduced by the trust recommendation module, and similar terminals were recommended through secondary clustering to improve the reliability of the recommendation trust score. Finally, the trust aggregation module was used to adaptively aggregate the direct trust score and the recommendation trust score based on the trust score accuracy. Simulation results on real datasets and synthetic datasets show that when the attack probability is 30% and the trust decay rate is 0.05, AIFTrust improves the recommendation accuracy by 13.30% and 14.81% compared to the similarity-based trust evaluation method SFM (Similarity FraMework) and the trust evaluation method based on objective information entropy CRT (Reputation Trusted based on Cooperation), respectively.

    Advanced computing
    Integrated scheduling optimization of multiple data centers based on deep reinforcement learning
    Heping FANG, Shuguang LIU, Yongyi RAN, Kunhua ZHONG
    2023, 43(6):  1884-1892.  DOI: 10.11772/j.issn.1001-9081.2022050722

    The purpose of the task scheduling strategy for multiple data centers is to allocate computing tasks to different servers in each data center to improve resource utilization and energy efficiency. Therefore, a deep reinforcement learning-based integrated scheduling strategy for multiple data centers was proposed, which is divided into two stages: data center selection and task allocation within the data center. In the data center selection stage, the computing power resources were integrated to improve the overall resource utilization. Firstly, a Deep Q Network (DQN) with Prioritized Experience Replay (PER-DQN) was used to obtain the communication paths to each data center in the network with data centers as nodes. Then, the resource usage cost and network communication cost were calculated, and the optimal data center was selected according to the principle that the sum of the two costs is minimum. In the task allocation stage, firstly, the computing tasks in the selected data center were divided and added to the scheduling queue according to the First-Come First-Served (FCFS) principle. Then, combining the computing device status and ambient temperature, the task allocation algorithm based on Double DQN was used to obtain the optimal allocation strategy, thereby selecting the server to perform the computing task, avoiding the generation of hot spots and reducing the energy consumption of refrigeration equipment. Experimental results show that the average total cost of the PER-DQN-based data center selection algorithm is reduced by 3.6% and 10.0% respectively compared to those of the Computing Resource First (CRF) and Shortest Path First (SPF) path selection methods. Compared to the Round Robin scheduling (RR) and Greedy scheduling (Greedy) algorithms, the Double DQN-based task deployment algorithm reduces the average Power Usage Effectiveness (PUE) by 2.5% and 1.7% respectively. It can be seen that the proposed strategy can reduce the total cost and data center energy consumption effectively, and realize the efficient operation of multiple data centers.
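
    The selection rule stated in this abstract, choosing the data center whose resource usage cost plus network communication cost is smallest, can be sketched as follows. The function name, the dictionary layout and the data center labels are hypothetical; how the two costs are actually computed (including the PER-DQN path search) is not given in the abstract and is not modeled here.

```python
def select_data_center(resource_cost, network_cost):
    """Return (name, total) of the data center with the smallest summed cost.

    Both arguments map a data center name to a cost; the names are
    hypothetical placeholders for illustration.
    """
    totals = {dc: resource_cost[dc] + network_cost[dc] for dc in resource_cost}
    best = min(totals, key=totals.get)
    return best, totals[best]
```

    With hypothetical costs `{"A": 5.0, "B": 3.0, "C": 4.0}` and `{"A": 1.0, "B": 4.0, "C": 1.5}`, data center C is selected even though B has the lowest resource cost and A the lowest network cost, which is the point of minimizing the sum rather than either term alone.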

    Task offloading algorithm for UAV-assisted mobile edge computing
    Xiaolin LI, Yusang JIANG
    2023, 43(6):  1893-1899.  DOI: 10.11772/j.issn.1001-9081.2022040548

    Unmanned Aerial Vehicle (UAV) is flexible and easy to deploy, and can assist Mobile Edge Computing (MEC) to help wireless systems improve coverage and communication quality. However, there are challenges such as computational latency requirements and resource management in the research of UAV-assisted MEC systems. Aiming at the delay problem of UAV providing auxiliary calculation services to multiple ground terminals, a Twin Delayed Deep Deterministic policy gradient (TD3) based Task Offloading Algorithm for Delay Minimization (TD3-TOADM) was proposed. Firstly, the optimization problem was modeled as the problem of minimizing the maximum computational delay under energy constraints. Secondly, TD3-TOADM was used to jointly optimize terminal equipment scheduling, UAV trajectory and task offloading ratio to minimize the maximum computational delay. Simulation analysis results show that compared with the task offloading algorithms based on Actor-Critic (AC), Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG), TD3-TOADM reduces the computational delay by more than 8.2%. It can be seen that TD3-TOADM algorithm has good convergence and robustness, and can obtain the optimal offloading strategy with low delay.

    Network and communications
    Wireless traffic prediction based on federated learning
    Shangjing LIN, Ji MA, Bei ZHUANG, Yueying LI, Ziyi LI, Tie LI, Jin TIAN
    2023, 43(6):  1900-1909.  DOI: 10.11772/j.issn.1001-9081.2022050721

    Wireless communication network traffic prediction is of great significance to operators in network construction, base station wireless resource management and user experience improvement. However, the existing centralized algorithm models face problems of complexity and timeliness, so that it is difficult for them to meet the traffic prediction requirements at the scale of a whole city. Therefore, a distributed wireless traffic prediction framework under cloud-edge collaboration was proposed to realize traffic prediction based on a single grid base station with low complexity and communication overhead. Based on the distributed architecture, a wireless traffic prediction model based on federated learning was proposed. Each grid traffic prediction model was trained synchronously, JS (Jensen-Shannon) divergence was used by the center cloud server to select grid traffic models with similar traffic distributions, and the Federated Averaging (FedAvg) algorithm was used to fuse the parameters of the selected models, so as to improve the model generalization and describe the regional traffic accurately at the same time. In addition, as the traffic in different areas within the city differs greatly in its features, a federated training method based on coalitional game was proposed on the basis of the algorithm. Combined with the super-additivity criterion, the grids were taken as participants in the coalitional game and screened, and the core of the coalitional game and the Shapley value were introduced for profit distribution to ensure the stability of the alliance, thereby improving the accuracy of model prediction. Experimental results show that, taking Short Message Service (SMS) traffic as an example, compared with grid-independent training, the proposed model has the prediction error decreased most significantly in the suburb, by 26.1% to 28.7%, while the decrease is 0.7% to 3.4% in the urban area and 0.8% to 4.7% in the downtown area. Compared with grid-centralized training, the proposed model has the prediction error in the three regions decreased by 49.8% to 79.1%.
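
    The two building blocks named in this abstract, JS divergence for finding grids with similar traffic distributions and FedAvg for fusing their model parameters, can be sketched as below. The function names, the flat parameter lists, the similarity threshold and the sample-count weighting are assumptions for illustration; the abstract does not state how the paper sets the threshold or weights the grids.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similar_grids(dists, ref, threshold):
    """Grids whose traffic distribution is within `threshold` of grid `ref`."""
    return [g for g, p in dists.items()
            if g != ref and js_divergence(dists[ref], p) <= threshold]


def fed_avg(params, sizes):
    """Average parameter vectors, weighting each by its local sample count."""
    total = sum(sizes)
    dim = len(params[0])
    return [sum(w[k] * n for w, n in zip(params, sizes)) / total
            for k in range(dim)]
```

    In this sketch the server would first call `similar_grids` on normalized traffic histograms and then pass only the selected grids' parameters to `fed_avg`, so that grids with very different traffic patterns do not dilute each other's models.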

    Multimedia computing and computer simulation
    Weakly supervised salient object detection algorithm based on bounding box annotation
    Qiang WANG, Xiaoming HUANG, Qiang TONG, Xiulei LIU
    2023, 43(6):  1910-1918.  DOI: 10.11772/j.issn.1001-9081.2022050706
    Asbtract ( )   HTML ( )   PDF (3663KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the inaccurate positioning of salient objects in previous weakly supervised salient object detection algorithms, a weakly supervised salient object detection algorithm based on bounding box annotation was proposed. In the proposed algorithm, the minimum bounding rectangles, which are the bounding boxes of all objects in the image, were adopted as supervision information. Firstly, the initial saliency map was generated based on the bounding box annotation and the GrabCut algorithm. Then, a correction module for missing objects was designed to obtain the optimized saliency map. Finally, by combining the advantages of the traditional methods and deep learning methods, the optimized saliency map was used as the pseudo ground-truth to learn a salient object detection model through a neural network. The proposed algorithm was compared with six unsupervised and four weakly supervised saliency detection algorithms on four public datasets. Experimental results show that the proposed algorithm significantly outperforms the comparison algorithms in both Max F-measure value (Max-F) and Mean Absolute Error (MAE) on the four datasets. Compared with SBB (Sales Bounding Boxes), which is also a weakly supervised method based on bounding box annotation, the annotation method of the proposed algorithm is simpler; on the four datasets ECSSD, DUTS-TE, HKU-IS and DUT-OMRON, the Max-F increased by 1.82%, 4.00%, 1.27% and 5.33% respectively, and the MAE decreased by 13.89%, 15.07%, 8.77% and 13.33% respectively. It can be seen that the proposed algorithm is a weakly supervised salient object detection algorithm with good detection performance.

    Multi-object tracking method based on dual-decoder Transformer
    Li WANG, Shibin XUAN, Xuyang QIN, Ziwei LI
    2023, 43(6):  1919-1929.  DOI: 10.11772/j.issn.1001-9081.2022050753

    The Multi-Object Tracking (MOT) task needs to track multiple objects at the same time and ensure the continuity of object identities. To solve the problems in the current MOT process, such as object occlusion, object ID Switch (IDSW) and object loss, the Transformer-based MOT model was improved, and a multi-object tracking method based on dual-decoder Transformer was proposed. Firstly, a set of trajectories was generated by model initialization in the first frame, and in each subsequent frame, attention was used to establish the association between frames. Secondly, the dual decoder was used to correct the tracked object information: one decoder was used to detect the objects, and the other was used to track them. Thirdly, histogram template matching was applied to find the lost objects after tracking was completed. Finally, the Kalman filter was utilized to track and predict the occluded objects, and the occluded results were associated with the newly detected objects to ensure the continuity of the tracking results. In addition, on the basis of TrackFormer, the modeling of apparent statistical characteristics and motion features was added to realize the fusion between different structures. Experimental results on the MOT17 dataset show that compared with TrackFormer, the proposed algorithm has the IDentity F1 Score (IDF1) increased by 0.87 percentage points, the Multiple Object Tracking Accuracy (MOTA) increased by 0.41 percentage points, and the IDSW number reduced by 16.3%. The proposed method also achieves good results on the MOT16 and MOT20 datasets. Consequently, the proposed method can effectively deal with the object occlusion problem, maintain object identity information, and reduce object identity loss.

    Object tracking based on instance segmentation and Pythagorean fuzzy decision-making
    Yuanlong ZHAO, Yugang SHAN, Jie YUAN, Kangdi ZHAO
    2023, 43(6):  1930-1937.  DOI: 10.11772/j.issn.1001-9081.2022050674

    In order to solve the problems of scale change, similarity interference and occlusion in object tracking, an object tracking algorithm based on instance segmentation and Pythagorean fuzzy decision-making was proposed. Based on the instance segmentation network YOLACT++ (improved You Only Look At CoefficienTs), three different matching methods were integrated to predict the tracking results for different scenes. At the same time, a template update mechanism based on Pythagorean fuzzy decision-making was proposed by which whether to update the object template and replace the matching method was determined according to the quality of the prediction results. Experimental results show that the proposed algorithm can track the video sequences with scale change, similarity interference, occlusion and other problems more accurately. Compared with SiamMask algorithm, the proposed algorithm has the regional similarity on DAVIS 2016 and DAVIS 2017 datasets increased by 12.3 and 15.3 percentage points, respectively, and the Expected Average Overlap rate (EAO) on VOT2016 and VOT2018 datasets increased by 4.2 and 4.1 percentage points, respectively. Meanwhile, the average tracking speed of the proposed algorithm is 32.00 frames per second, meeting real-time requirements.

    Infrared small target tracking method based on state information
    Xin TANG, Bo PENG, Fei TENG
    2023, 43(6):  1938-1942.  DOI: 10.11772/j.issn.1001-9081.2022050762

    Infrared small targets occupy few pixels and lack features such as color, texture and shape, so it is difficult to track them effectively. To solve this problem, an infrared small target tracking method based on state information was proposed. Firstly, the target, background and distractors in the local area of the small target to be detected were encoded to obtain dense local state information between consecutive frames. Secondly, the feature information of the current and previous frames was input into the classifier to obtain the classification score. Thirdly, the state information and the classification score were fused to obtain the final confidence and determine the center position of the small target to be detected. Finally, the state information was updated and propagated between consecutive frames, and the propagated state information was then used to track the infrared small target throughout the entire sequence. The proposed method was validated on the open dataset DIRST (Dataset for Infrared detection and tRacking of dim-Small aircrafT). Experimental results show that for infrared small target tracking, the recall of the proposed method reaches 96.2% and the precision reaches 97.3%, which are 3.7% and 3.7% higher than those of the current best tracking method KeepTrack. This shows that the proposed method can effectively track small infrared targets under complex background and interference.

    Small object detection algorithm of YOLOv5 for safety helmet
    Zongzhe LYU, Hui XU, Xiao YANG, Yong WANG, Weijian WANG
    2023, 43(6):  1943-1949.  DOI: 10.11772/j.issn.1001-9081.2022060855

    Safety helmet wearing is a powerful guarantee of workers’ personal safety. Aiming at the problem that collected pictures of safety helmet wearing are characterized by high object density and small pixel areas and are therefore difficult to detect, a small object detection algorithm of YOLOv5 (You Only Look Once version 5) for safety helmets was proposed. Firstly, based on the YOLOv5 algorithm, the bounding box regression loss function and the confidence prediction loss function were optimized to improve the learning of features of dense small objects during training. Secondly, slicing aided fine-tuning and Slicing Aided Hyper Inference (SAHI) were introduced to make small objects occupy a larger pixel area by slicing the pictures input into the network, and the effect of network inference and fine-tuning was improved. In the experiments, a dataset containing dense small safety helmet objects in industrial scenes was used for training. The experimental results show that compared with the original YOLOv5 algorithm, the improved algorithm increases the precision by 0.26 percentage points and the recall by 0.38 percentage points, and the mean Average Precision (mAP) of the proposed algorithm reaches 95.77%, which is improved by 0.46 to 13.27 percentage points compared to several algorithms such as the original YOLOv5 algorithm. The results verify that the introduction of slicing aided fine-tuning and SAHI improves the precision and confidence of small object detection and recognition in dense scenes, reduces false detections and missed detections, and can satisfy the requirements of safety helmet wearing detection effectively.

    Sinogram inpainting for sparse-view cone-beam computed tomography image reconstruction based on residual encoder-decoder generative adversarial network
    Xin JIN, Yangchuan LIU, Yechen ZHU, Zijian ZHANG, Xin GAO
    2023, 43(6):  1950-1957.  DOI: 10.11772/j.issn.1001-9081.2022050773

    Sparse-view projection can effectively reduce the scan dose and scan time of Cone-Beam Computed Tomography (CBCT), but it introduces many streak artifacts into the reconstructed images. Sinogram inpainting can generate projection data for the missing angles and improve the quality of reconstructed images. Based on the above, a Residual Encoder-Decoder Generative Adversarial Network (RED-GAN) was proposed for sinogram inpainting to reconstruct sparse-view CBCT images. In this network, the U-Net generator in Pix2pixGAN (Pix2pix Generative Adversarial Network) was replaced with the Residual Encoder-Decoder (RED) module. In addition, the conditional discriminator based on PatchGAN (Patch Generative Adversarial Network) was used to distinguish the repaired sinograms from the real sinograms, thereby further improving the network performance. After being trained on real CBCT projection data, the proposed network was tested under 1/2, 1/3 and 1/4 sparse-view sampling conditions, and compared with the linear interpolation method, Residual Encoder-Decoder Convolutional Neural Network (RED-CNN) and Pix2pixGAN. Experimental results indicate that the sinogram inpainting results of RED-GAN are better than those of the comparison methods under all three conditions, and the advantage is most obvious under the 1/4 sparse-view sampling condition. In the sinogram domain, the Root Mean Square Error (RMSE) of the proposed network decreases by 7.2%, the Peak Signal-to-Noise Ratio (PSNR) increases by 1.5% and the Structural Similarity (SSIM) increases by 1.4%; in the reconstructed image domain, the RMSE decreases by 5.4%, the PSNR increases by 1.6% and the SSIM increases by 1.0%. These results suggest that RED-GAN is suitable for high-quality CBCT reconstruction and has potential application value in the field of fast low-dose CBCT scanning.
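To make the experimental setup above more concrete, the following sketch shows sparse-view sampling, the linear interpolation baseline, and two of the reported metrics (RMSE and PSNR). It is an illustration under stated assumptions, not the authors' code: the sinogram is treated as a 2D array of shape (angles, detectors), whereas real CBCT projections have an additional detector dimension, and all function names are hypothetical. SSIM is omitted for brevity.

```python
import numpy as np


def sparse_view_mask(n_angles: int, keep_every: int) -> np.ndarray:
    """Boolean mask that keeps one projection angle out of every `keep_every`."""
    mask = np.zeros(n_angles, dtype=bool)
    mask[::keep_every] = True
    return mask


def linear_inpaint(sinogram: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Fill the missing angle rows by linear interpolation along the angle axis.

    Angles beyond the last measured one are held at the last measured value,
    which is how np.interp treats points outside the known range.
    """
    angles = np.arange(sinogram.shape[0])
    known = angles[mask]
    filled = np.empty(sinogram.shape, dtype=float)
    for d in range(sinogram.shape[1]):
        filled[:, d] = np.interp(angles, known, sinogram[known, d])
    return filled


def rmse(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Root mean square error between two arrays of the same shape."""
    return float(np.sqrt(np.mean((reference - estimate) ** 2)))


def psnr(reference: np.ndarray, estimate: np.ndarray, data_range: float) -> float:
    """Peak signal-to-noise ratio in dB for a given dynamic range."""
    err = rmse(reference, estimate)
    return float("inf") if err == 0 else float(20.0 * np.log10(data_range / err))
```

For example, `sparse_view_mask(n_angles, 4)` corresponds to the 1/4 sparse-view condition; a learned inpainting model would replace `linear_inpaint`, and the same metrics could then be computed against the fully sampled sinogram.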

    Frontier and comprehensive applications
    Review of application analysis and research progress of deep learning in weather forecasting
    Runting DONG, Li WU, Xiaoying WANG, Tengfei CAO, Jianqiang HUANG, Qin GUAN, Jiexia WU
    2023, 43(6):  1958-1968.  DOI: 10.11772/j.issn.1001-9081.2022050745

    With the advancement of technologies such as sensor networks and global positioning systems, the volume of meteorological data with both temporal and spatial characteristics has exploded, and research on deep learning models for Spatiotemporal Sequence Forecasting (STSF) has developed rapidly. However, the traditional machine learning methods that have long been applied to weather forecasting are unsatisfactory at extracting the temporal correlations and spatial dependences of data, whereas deep learning methods can extract features automatically through artificial neural networks to improve the accuracy of weather forecasting effectively, and are effective at encoding and modeling long-term spatial information. At the same time, deep learning models driven by observational data and Numerical Weather Prediction (NWP) models based on physical theories are combined to build hybrid models with higher prediction accuracy and longer prediction time. Based on these, the application analysis and research progress of deep learning in the field of weather forecasting were reviewed. Firstly, the deep learning problems in the field of weather forecasting and the classical deep learning problems were compared from three aspects: data format, problem model and evaluation metrics. Then, the development history and application status of deep learning in the field of weather forecasting were reviewed, and the latest progress in combining deep learning technologies with NWP was summarized and analyzed. Finally, the future development directions and research focuses were discussed to provide a reference for future deep learning research in the field of weather forecasting.

    Blockchain smart contract privacy authorization method based on TrustZone
    Luyu CHEN, Xiaofeng MA, Jing HE, Shengzhi GONG, Jian GAO
    2023, 43(6):  1969-1978.  DOI: 10.11772/j.issn.1001-9081.2022050719

    To meet the current needs of data sharing in the context of digitalization while protecting the security of private data, a blockchain smart contract private data authorization method based on TrustZone was proposed. The blockchain system is able to realize data sharing in different application scenarios and meet regulatory requirements, and a secure isolation environment was provided by TrustZone Trusted Execution Environment (TEE) technology for private computing. In the integrated system, the uploading of private data was completed by the regulatory agency, and the plaintext information of the private data was obtained by other business nodes only after the authorization of the user was obtained. In this way, the privacy and security of the user were protected. To address the limited memory space of the TrustZone architecture encountered during technology fusion, a private set intersection algorithm for small memory conditions was proposed. In the proposed algorithm, the intersection operation for large-scale datasets was completed on the basis of the grouping computing idea. The proposed algorithm was tested with datasets of different orders of magnitude. The results show that the time and space consumption of the proposed algorithm fluctuates in a very small range and is relatively stable, with variances of 1.0 s² and 0.01 MB² respectively. When the order of magnitude of the dataset increases, the time consumption is predictable. Furthermore, using a pre-sorted dataset can greatly improve the algorithm performance.
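The abstract above does not spell out its grouping scheme, so the following is only a sketch of one way a grouping idea can bound memory use during set intersection: both datasets are partitioned by a hash of each item, and only one pair of groups is intersected at a time. The hash-based partitioning, the function names and the in-memory dictionaries are all assumptions, and the sketch provides no cryptographic privacy protection or TEE isolation.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def group_of(item: str, n_groups: int) -> int:
    """Assign an item to a group using a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_groups


def partition(items: Iterable[str], n_groups: int) -> Dict[int, List[str]]:
    """Split items into groups; equal items always land in the same group."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for item in items:
        groups[group_of(item, n_groups)].append(item)
    return groups


def grouped_intersection(a: Iterable[str], b: Iterable[str],
                         n_groups: int) -> Set[str]:
    """Intersect two datasets one group at a time.

    In a memory-constrained setting the groups would live outside the small
    memory region and be loaded one pair at a time; plain dictionaries stand
    in for that storage here.
    """
    groups_a = partition(a, n_groups)
    groups_b = partition(b, n_groups)
    result: Set[str] = set()
    for g in range(n_groups):
        if g in groups_a and g in groups_b:
            # Only this pair of groups has to be held in working memory.
            result |= set(groups_a[g]) & set(groups_b[g])
    return result
```

Because equal items always hash to the same group, the union of the per-group intersections equals the intersection of the full datasets, while the peak working set is roughly the size of the largest group pair rather than of the whole input.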

    circRNA-disease association prediction by two-stage fusion on graph auto-encoder
    Yi ZHANG, Zhenmei WANG
    2023, 43(6):  1979-1986.  DOI: 10.11772/j.issn.1001-9081.2022050727

    Most existing computational models for predicting associations between circular RNA (circRNA) and diseases use biological knowledge such as circRNA and disease-related data, and mine potential association information by combining known circRNA-disease association pairs. However, these models suffer from inherent problems such as the sparsity of networks composed of known associations and too few negative samples, resulting in poor prediction performance. Therefore, inductive matrix completion and a self-attention mechanism were introduced for two-stage fusion based on a graph auto-encoder to achieve circRNA-disease association prediction, and the resulting model was named GIS-CDA (Graph auto-encoder combining Inductive matrix completion and Self-attention mechanism for predicting CircRNA-Disease Association). Firstly, the integrated circRNA similarity and the integrated disease similarity were calculated, and a graph auto-encoder was used to learn the potential features of circRNAs and diseases to obtain low-dimensional representations. Secondly, the learned features were input to inductive matrix completion to improve the similarity and dependence between nodes. Thirdly, the circRNA feature matrix and the disease feature matrix were integrated into a circRNA-disease feature matrix to enhance the stability and accuracy of prediction. Finally, a self-attention mechanism was introduced to extract important features in the feature matrix and reduce the dependence on other biological information. The results of five-fold and ten-fold cross-validation show that the Area Under Receiver Operating Characteristic curve (AUROC) values of GIS-CDA are 0.9303 and 0.9393 respectively, the former of which is 13.19, 35.73, 13.28 and 5.01 percentage points higher than those of the prediction models based on computational model of KATZ measures for Human CircRNA-Disease Association (KATZHCDA), Deep Matrix Factorization for CircRNA-Disease Association (DMFCDA), RWR (Random Walk with Restart) and Speedup Inductive Matrix Completion for CircRNA-Disease Associations (SIMCCDA), respectively; the Area Under Precision-Recall curve (AUPR) values of GIS-CDA are 0.2271 and 0.2340 respectively, the former of which is 21.72, 22.43, 21.96 and 13.86 percentage points higher than those of the above comparison models respectively. In addition, ablation experiments and case studies on the circRNADisease, circ2Disease and circR2Disease datasets further validate the good performance of GIS-CDA in predicting potential circRNA-disease associations.
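The AUROC and AUPR figures reported above are ranking metrics over predicted association scores. The sketch below shows how such values could be computed from labels and scores; it is not taken from the paper. AUROC is computed from its pairwise-ranking definition, and average precision is used as one common estimator of the area under the precision-recall curve, which may differ slightly from the estimator the authors used. The function names are hypothetical.

```python
import numpy as np


def auroc(labels, scores) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a correct ranking.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]  # all positive-negative score pairs
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


def average_precision(labels, scores) -> float:
    """Mean of the precision values at the rank of each positive sample."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if labels.sum() == 0:
        raise ValueError("need at least one positive sample")
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())
```

In a k-fold cross-validation setting, these functions would be applied to the held-out associations of each fold and the results averaged; the pairwise AUROC above uses memory proportional to the number of positive-negative pairs, so it is meant for small illustrative inputs.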

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
Website: www.joca.cn
E-mail: bjb@joca.cn