Most Read articles

    Embedded road crack detection algorithm based on improved YOLOv8
    Huantong GENG, Zhenyu LIU, Jun JIANG, Zichen FAN, Jiaxing LI
    Journal of Computer Applications    2024, 44 (5): 1613-1618.   DOI: 10.11772/j.issn.1001-9081.2023050635

    Deploying the YOLOv8L model on edge devices for road crack detection achieves high accuracy, but it is difficult to guarantee real-time detection. To solve this problem, a target detection algorithm based on an improved YOLOv8 model, deployable on the edge computing device Jetson AGX Xavier, was proposed. First, the Faster Block structure was designed using partial convolution to replace the Bottleneck structure in the YOLOv8 C2f module, and the improved C2f module was denoted C2f-Faster. Second, an SE (Squeeze-and-Excitation) channel attention layer was connected after each C2f-Faster module in the YOLOv8 backbone network to further improve detection accuracy. Experimental results on the open-source road damage dataset RDD20 (Road Damage Detection 20) show that the average F1 score of the proposed method is 0.573, the detection speed is 47 Frames Per Second (FPS), and the model size is 55.5 MB. Compared with the SOTA (State-Of-The-Art) model of GRDDC2020 (Global Road Damage Detection Challenge 2020), the F1 score is increased by 0.8 percentage points, the FPS is increased by 291.7%, and the model size is reduced by 41.8%, realizing real-time and accurate detection of road cracks on edge devices.
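As an illustration of the attention mechanism this abstract describes, here is a minimal NumPy sketch of an SE (Squeeze-and-Excitation) channel attention layer applied to a single feature map; the weight shapes, reduction ratio, and random inputs are illustrative, not taken from the paper.

```python
import numpy as np

def se_attention(x, w1, w2):
    """Squeeze-and-Excitation channel attention on a (C, H, W) feature map."""
    # Squeeze: global average pooling gives one descriptor per channel
    z = x.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: bottleneck MLP (ReLU) followed by a sigmoid gate
    s = np.maximum(w1 @ z, 0.0)                  # shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))          # shape (C,), values in (0, 1)
    # Scale: reweight each channel by its learned gate
    return x * s[:, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))               # toy feature map, C=8
w1 = rng.standard_normal((2, 8)) * 0.1           # reduction ratio r=4
w2 = rng.standard_normal((8, 2)) * 0.1
y = se_attention(x, w1, w2)
```

Because the gate lies in (0, 1), each channel is attenuated or kept, never amplified, which is the "reweighting" effect SE layers contribute.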

    Technology application prospects and risk challenges of large language models
    Yuemei XU, Ling HU, Jiayi ZHAO, Wanze DU, Wenqing WANG
    Journal of Computer Applications    2024, 44 (6): 1655-1662.   DOI: 10.11772/j.issn.1001-9081.2023060885

    In view of the rapid development of Large Language Model (LLM) technology, a comprehensive analysis of its technical application prospects and risk challenges was conducted, which has great reference value for the development and governance of Artificial General Intelligence (AGI). Firstly, with representative language models such as Multi-BERT (Multilingual Bidirectional Encoder Representations from Transformer), GPT (Generative Pre-trained Transformer) and ChatGPT (Chat Generative Pre-trained Transformer) as examples, the development process, key technologies and evaluation systems of LLMs were reviewed. Then, the technical limitations and security risks of LLMs were analyzed in detail. Finally, suggestions were put forward for technical improvement and policy follow-up of LLMs. The analysis indicates that current LLMs are still at a developing stage: they produce non-truthful and biased output, lack real-time autonomous learning ability, require huge computing power, rely highly on data quality and quantity, and tend towards a monotonous language style. They carry security risks related to data privacy, information security, ethics, and other aspects. Their future development can continue to improve technically, from “large-scale” to “lightweight”, from “single-modal” to “multi-modal”, and from “general-purpose” to “vertical”; for timely policy follow-up, their applications and development should be regulated by targeted regulatory measures.

    Review of YOLO algorithm and its applications to object detection in autonomous driving scenes
    Yaping DENG, Yingjiang LI
    Journal of Computer Applications    2024, 44 (6): 1949-1958.   DOI: 10.11772/j.issn.1001-9081.2023060889

    Object detection in autonomous driving scenes is one of the important research directions in computer vision. Research focuses on ensuring real-time and accurate detection of objects by autonomous vehicles. In recent years, deep learning technology has developed rapidly, and its wide application in the field of autonomous driving has prompted substantial progress in this field. The research status of object detection by YOLO (You Only Look Once) algorithms in the field of autonomous driving was analyzed from the following four aspects. Firstly, the ideas and improvement methods of the single-stage YOLO series of detection algorithms were summarized, and the advantages and disadvantages of the YOLO series of algorithms were analyzed. Secondly, YOLO algorithm-based object detection applications in autonomous driving scenes were introduced, and the research status and applications for the detection and recognition of traffic vehicles, pedestrians, and traffic signals were expounded and summarized respectively. Additionally, the commonly used evaluation indicators in object detection, as well as object detection datasets and autonomous driving scene datasets, were summarized. Lastly, the problems and future development directions of object detection were discussed.

    Survey of incomplete multi-view clustering
    Yao DONG, Yixue FU, Yongfeng DONG, Jin SHI, Chen CHEN
    Journal of Computer Applications    2024, 44 (6): 1673-1682.   DOI: 10.11772/j.issn.1001-9081.2023060813

    Multi-view clustering has recently been a hot topic in graph data mining. However, due to the limitations of data collection technology or human factors, multi-view data often suffers from missing views or samples. Reducing the impact of incomplete views on clustering performance is a major challenge currently faced by multi-view clustering. In order to better understand the development of Incomplete Multi-view Clustering (IMC) in recent years, a comprehensive review is of great theoretical significance and practical value. Firstly, the missing types of incomplete multi-view data were summarized and analyzed. Secondly, four types of IMC methods, based on Multiple Kernel Learning (MKL), Matrix Factorization (MF) learning, deep learning, and graph learning, were compared, and the technical characteristics and differences among the methods were analyzed. Thirdly, from the perspectives of dataset type, the numbers of views and categories, and application field, twenty-two public incomplete multi-view datasets were summarized. Then, the evaluation metrics were outlined, and the performance of existing incomplete multi-view clustering methods on homogeneous and heterogeneous datasets was evaluated. Finally, the existing problems, future research directions, and application fields of incomplete multi-view clustering were discussed.

    Survey on hypergraph application methods: issues, advances, and challenges
    Li ZENG, Jingru YANG, Gang HUANG, Xiang JING, Chaoran LUO
    Journal of Computer Applications    2024, 44 (11): 3315-3326.   DOI: 10.11772/j.issn.1001-9081.2023111629

    A hypergraph is a generalization of a graph, which has significant advantages in representing higher-order features of complex relationships compared with an ordinary graph. As a relatively new data structure, the hypergraph is increasingly playing a crucial role in various application fields. By appropriately using hypergraph models and algorithms, specific real-world problems can be modeled and solved with higher efficiency and quality. Existing surveys of hypergraphs mainly focus on the theory and techniques of the hypergraph itself, and lack a summary of modeling and solving methods in specific scenarios. To this end, after summarizing and introducing some fundamental concepts of hypergraphs, the application methods, techniques, common issues, and solutions of hypergraphs in various application scenarios were analyzed; by summarizing existing work, the problems and obstacles that still exist in applying hypergraphs to real-world problems were elaborated. Finally, future research directions of hypergraph applications were prospected.
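To make the higher-order representation concrete: unlike an ordinary graph edge, a hyperedge may join any number of vertices, and a hypergraph is commonly encoded as a vertex-by-edge incidence matrix. A minimal sketch (the example edges are invented for illustration):

```python
import numpy as np

def incidence_matrix(n_vertices, hyperedges):
    """Build the |V| x |E| incidence matrix H, with H[v, e] = 1 iff v is in edge e."""
    H = np.zeros((n_vertices, len(hyperedges)), dtype=int)
    for e, verts in enumerate(hyperedges):
        for v in verts:
            H[v, e] = 1
    return H

# Each hyperedge is a set of vertices of arbitrary size
edges = [{0, 1, 2}, {1, 3}, {0, 2, 3, 4}]
H = incidence_matrix(5, edges)
vertex_degree = H.sum(axis=1)   # number of hyperedges containing each vertex
edge_degree = H.sum(axis=0)     # number of vertices in each hyperedge
```

Row and column sums recover the vertex and hyperedge degrees that most hypergraph learning methods (e.g. hypergraph Laplacians) are built from.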

    Survey of visual object tracking methods based on Transformer
    Ziwen SUN, Lizhi QIAN, Chuandong YANG, Yibo GAO, Qingyang LU, Guanglin YUAN
    Journal of Computer Applications    2024, 44 (5): 1644-1654.   DOI: 10.11772/j.issn.1001-9081.2023060796

    Visual object tracking is one of the important tasks in computer vision. In order to achieve high-performance object tracking, a large number of object tracking methods have been proposed in recent years. Among them, Transformer-based object tracking methods have become a hot topic in the field of visual object tracking due to their ability to perform global modeling and capture contextual information. Firstly, existing Transformer-based visual object tracking methods were classified based on their network structures, an overview of the underlying principles and key techniques for model improvement was expounded, and the advantages and disadvantages of different network structures were summarized. Then, the experimental results of Transformer-based visual object tracking methods on public datasets were compared to analyze the impact of network structure on performance, among which MixViT-L (ConvMAE) achieved tracking success rates of 73.3% and 86.1% on LaSOT and TrackingNet respectively, proving that object tracking methods based on a pure Transformer two-stage architecture have better performance and broader development prospects. Finally, the limitations of these methods, such as complex network structure, large number of parameters, high training requirements, and difficulty of deployment on edge devices, were summarized, and the future research focus was outlined: by combining model compression, self-supervised learning, and Transformer interpretability analysis, more feasible solutions for Transformer-based visual object tracking could be presented.

    Survey and prospect of large language models
    Xiaolin QIN, Xu GU, Dicheng LI, Haiwen XU
    Journal of Computer Applications    2025, 45 (3): 685-696.   DOI: 10.11772/j.issn.1001-9081.2025010128

    Large Language Models (LLMs) are a class of language models composed of artificial neural networks with a vast number of parameters (typically billions of weights or more). They are trained on a large amount of unlabeled text using self-supervised or semi-supervised learning and are the core of current generative Artificial Intelligence (AI) technologies. Compared to traditional language models, LLMs demonstrate stronger language understanding and generation capabilities, supported by substantial computational power, extensive parameters, and large-scale data, and are widely applied with good performance in tasks such as machine translation, question answering systems, and dialogue generation. Most existing surveys focus on the theoretical construction and training techniques of LLMs, while systematic exploration of LLMs’ industry-level application practices and the evolution of the technological ecosystem remains insufficient. Therefore, on the basis of introducing the foundational architecture, training techniques, and development history of LLMs, the current general key technologies in LLMs and advanced technologies integrated with LLM bases were analyzed. Then, by summarizing existing research, the challenges faced by LLMs in practical applications were further elaborated, including problems such as data bias, model hallucination, and computational resource consumption, and an outlook was provided on the ongoing development trends of LLMs.

    Overview of research and application of knowledge graph in equipment fault diagnosis
    Jie WU, Ansi ZHANG, Maodong WU, Yizong ZHANG, Congbao WANG
    Journal of Computer Applications    2024, 44 (9): 2651-2659.   DOI: 10.11772/j.issn.1001-9081.2023091280

    Useful knowledge can be extracted from equipment fault diagnosis data to construct a knowledge graph, which can effectively manage complex equipment fault diagnosis information in the form of triples (entity, relationship, entity) and enables rapid diagnosis of equipment faults. Firstly, the concepts related to knowledge graphs for equipment fault diagnosis were introduced, and the framework of the knowledge graph for the equipment fault diagnosis domain was analyzed. Secondly, the research status at home and abroad of several key technologies, such as knowledge extraction, knowledge fusion and knowledge reasoning for equipment fault diagnosis knowledge graphs, was summarized. Finally, the applications of knowledge graphs in equipment fault diagnosis were summarized, some shortcomings and challenges in the construction of knowledge graphs in this field were pointed out, and some new ideas were provided for the field of equipment fault diagnosis in the future.
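The triple form (entity, relationship, entity) mentioned above can be sketched in a few lines of Python; the entities, relation names, and the symptom-to-repair query below are hypothetical examples, not drawn from the surveyed systems.

```python
# Minimal in-memory triple store for fault-diagnosis knowledge
# (entity and relation names are invented for illustration)
triples = [
    ("bearing", "has_fault", "outer-race wear"),
    ("outer-race wear", "has_symptom", "abnormal vibration"),
    ("outer-race wear", "repaired_by", "bearing replacement"),
]

def diagnose(triples, symptom):
    """Walk symptom -> fault -> repair, the basic diagnosis chain."""
    faults = [h for h, r, t in triples if r == "has_symptom" and t == symptom]
    repairs = [t for h, r, t in triples for f in faults
               if h == f and r == "repaired_by"]
    return faults, repairs

faults, repairs = diagnose(triples, "abnormal vibration")
```

Even this toy query shows why the triple form supports rapid diagnosis: symptoms, faults, and repairs become linked paths rather than free text.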

    Underwater target detection algorithm based on improved YOLOv8
    Dahai LI, Bingtao LI, Zhendong WANG
    Journal of Computer Applications    2024, 44 (11): 3610-3616.   DOI: 10.11772/j.issn.1001-9081.2023111550

    Due to the unique characteristics of underwater creatures, underwater images usually exhibit many small targets that are hard to detect and often overlap with each other. In addition, light absorption and scattering in the underwater environment cause color offset and blurring in underwater images. To overcome those challenges, an underwater target detection algorithm, namely WCA-YOLOv8, was proposed. Firstly, the Feature Fusion Module (FFM) was designed to improve the focus on the spatial dimension in order to improve the recognition ability for targets with color offset and blurring. Secondly, the FReLU Coordinate Attention (FCA) module was added to enhance the feature extraction ability for overlapped and occluded underwater targets. Thirdly, the Complete Intersection over Union (CIoU) loss function was replaced by the Wise-IoU version 3 (WIoU v3) loss function to strengthen the detection performance for small targets. Finally, the Downsampling Enhancement Module (DEM) was designed to preserve context information more completely during feature extraction. Experimental results show that WCA-YOLOv8 achieves mean Average Precisions (mAP0.5) of 75.8% and 88.6% and detection speeds of 60 frame/s and 57 frame/s on the RUOD and URPC datasets, respectively. Compared with other state-of-the-art underwater target detection algorithms, WCA-YOLOv8 achieves higher detection accuracy with faster detection speed.

    Logo detection algorithm based on improved YOLOv5
    Yeheng LI, Guangsheng LUO, Qianmin SU
    Journal of Computer Applications    2024, 44 (8): 2580-2587.   DOI: 10.11772/j.issn.1001-9081.2023081113

    To address the challenges posed by the complex background and varying sizes of logo images, an improved detection algorithm based on YOLOv5 was proposed. Firstly, in combination with the Convolutional Block Attention Module (CBAM), compression was applied in both the channel and spatial dimensions of the image to extract critical information and significant regions within the image. Subsequently, Switchable Atrous Convolution (SAC) was employed to allow the network to adaptively adjust the receptive field size in feature maps at different scales, improving the detection of objects across multiple scales. Finally, the Normalized Wasserstein Distance (NWD) was embedded into the loss function: the bounding boxes were modeled as 2D Gaussian distributions, and the similarity between the corresponding Gaussian distributions was calculated to better measure the similarity among objects, thereby enhancing the detection performance for small objects and improving model robustness and stability. Compared to the original YOLOv5 algorithm: on the small dataset FlickrLogos-32, the improved algorithm achieved a mean Average Precision (mAP@0.5) of 90.6%, an increase of 1 percentage point; on the large dataset QMULOpenLogo, the improved algorithm achieved an mAP@0.5 of 62.7%, an increase of 2.3 percentage points; on LogoDet3K, for three types of logos, the improved algorithm increased the mAP@0.5 by 1.2, 1.4, and 1.4 percentage points respectively. Experimental results demonstrate that the improved algorithm has better small object detection ability for logo images.
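The NWD computation described above has a simple closed form when each box (cx, cy, w, h) is modeled as the 2D Gaussian N([cx, cy], diag(w^2/4, h^2/4)): the 2-Wasserstein distance between two such Gaussians reduces to the Euclidean distance between their (cx, cy, w/2, h/2) vectors, which is then mapped into (0, 1] by an exponential. A sketch, where the normalizing constant C is dataset-dependent and the value below is illustrative:

```python
import math

def nwd(box_a, box_b, C=12.8):
    """Normalized Wasserstein Distance between boxes (cx, cy, w, h),
    each modeled as a 2D Gaussian N([cx, cy], diag(w^2/4, h^2/4))."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # 2-Wasserstein distance between the two Gaussians
    w2 = math.sqrt((ax - bx) ** 2 + (ay - by) ** 2
                   + (aw / 2 - bw / 2) ** 2 + (ah / 2 - bh / 2) ** 2)
    # Exponential normalization maps the distance into (0, 1]
    return math.exp(-w2 / C)

same = nwd((10, 10, 4, 4), (10, 10, 4, 4))   # identical boxes give 1.0
far = nwd((10, 10, 4, 4), (30, 30, 4, 4))    # distant boxes decay toward 0
```

Unlike IoU, this similarity stays smooth and non-zero even when small boxes do not overlap, which is why it helps small-object detection.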

    Development, technologies and applications of blockchain 3.0
    Peng FANG, Fan ZHAO, Baoquan WANG, Yi WANG, Tonghai JIANG
    Journal of Computer Applications    2024, 44 (12): 3647-3657.   DOI: 10.11772/j.issn.1001-9081.2023121826

    Blockchain 3.0 is the third stage of the development of blockchain technology and the core of building a value Internet. Its innovations in sharding, cross-chain and privacy protection have given it a wide range of application scenarios and research value, and it is highly valued by academia and industry. For the development, technologies and applications of blockchain 3.0, the relevant literature at home and abroad from the past five years was surveyed and reviewed. Firstly, the basic theory and technical characteristics of blockchain were introduced, laying the foundation for an in-depth understanding of the research progress of blockchain. Subsequently, based on the evolution of blockchain technology over time, the development process and key development time nodes of blockchain 3.0 were given, as well as the rationale for dividing blockchain development into stages using sharding and side-chain technologies as benchmarks. Then, the current research status of the key technologies of blockchain 3.0 was analyzed in detail, and typical applications of blockchain 3.0 in six major fields, such as the internet of things, medical care, and agriculture, were summarized. Finally, the key challenges and future development opportunities faced by blockchain 3.0 were summed up.

    Review of evolutionary multitasking from the perspective of optimization scenarios
    Jiawei ZHAO, Xuefeng CHEN, Liang FENG, Yaqing HOU, Zexuan ZHU, Yew‑Soon Ong
    Journal of Computer Applications    2024, 44 (5): 1325-1337.   DOI: 10.11772/j.issn.1001-9081.2024020208

    Due to the escalating complexity of optimization problems, traditional evolutionary algorithms increasingly struggle with high computational costs and limited adaptability. Evolutionary MultiTasking Optimization (EMTO) algorithms have emerged as a novel solution, leveraging knowledge transfer to tackle multiple optimization problems concurrently, thereby enhancing the efficiency of evolutionary algorithms in complex scenarios. The current progress of evolutionary multitasking optimization research was summarized, and different research perspectives were explored by reviewing the existing literature and highlighting the notable absence of optimization scenario analysis. By focusing on the application scenarios of optimization problems, the scenarios suitable for evolutionary multitasking optimization and their fundamental solution strategies were systematically outlined, which can aid researchers in selecting appropriate methods based on specific application needs. Moreover, an in-depth discussion of the current challenges and future directions of EMTO was also presented to provide guidance and insights for advancing research in this field.

    Time series classification method based on multi-scale cross-attention fusion in time-frequency domain
    Mei WANG, Xuesong SU, Jia LIU, Ruonan YIN, Shan HUANG
    Journal of Computer Applications    2024, 44 (6): 1842-1847.   DOI: 10.11772/j.issn.1001-9081.2023060731

    To address the problem of low classification accuracy caused by insufficient interaction of potential information between time series subsequences, a time series classification method based on multi-scale cross-attention fusion in the time-frequency domain, called TFFormer (Time-Frequency Transformer), was proposed. First, the time-domain and frequency-domain representations of the original time series were divided into subsequences of the same length respectively, and the point-value coupling problem was solved by adding positional embeddings after linear projection. Then, the long-term time series dependency problem was addressed by making the model focus on the more important time series features through an Improved Multi-Head self-Attention (IMHA) mechanism. Finally, a multi-scale Cross-Modality Attention (CMA) module was proposed to enhance the interaction between the time domain and frequency domain, so that the model could further mine the frequency information of the time series. Experimental results show that compared with the Fully Convolutional Network (FCN), the classification accuracy of the proposed method on the Trace, StarLightCurves and UWaveGestureLibraryAll datasets increased by 0.3, 0.9 and 1.4 percentage points respectively. It is proved that by enhancing the information interaction between the time domain and frequency domain of a time series, the model convergence speed and classification accuracy can be improved.
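A minimal sketch of the first step described above: splitting the time-domain series and its frequency spectrum into equal-length subsequences. The toy signal, lengths, and patch sizes are illustrative, and the linear projection and positional embedding are omitted.

```python
import numpy as np

def patchify(seq, patch_len):
    """Split a 1-D sequence into equal-length, non-overlapping subsequences."""
    n = len(seq) // patch_len * patch_len       # drop any ragged tail
    return np.asarray(seq[:n]).reshape(-1, patch_len)

t = np.linspace(0, 1, 128, endpoint=False)
x = np.sin(2 * np.pi * 8 * t)                   # toy series: an 8 Hz tone
spectrum = np.abs(np.fft.rfft(x))               # frequency-domain view, 65 bins
time_patches = patchify(x, 16)                  # 8 subsequences of length 16
freq_patches = patchify(spectrum, 13)           # 5 subsequences of length 13
```

Each patch row would then be linearly projected to a token and given a positional embedding before entering the attention layers.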

    Research review of multitasking optimization algorithms and applications
    Yue WU, Hangqi DING, Hao HE, Shunjie BI, Jun JIANG, Maoguo GONG, Qiguang MIAO, Wenping MA
    Journal of Computer Applications    2024, 44 (5): 1338-1347.   DOI: 10.11772/j.issn.1001-9081.2024020209

    Evolutionary MultiTasking Optimization (EMTO) is one of the new methods in evolutionary computing: it can simultaneously solve multiple related optimization tasks and enhance the optimization of each task through knowledge transfer between tasks. In recent years, more and more research on evolutionary multitasking optimization has been devoted to utilizing its powerful parallel search capability and its potential for reducing computational costs to optimize various problems, and EMTO has been used in a variety of real-world scenarios. The research on and applications of EMTO were discussed from four aspects: principle, core design, applications, and challenges. Firstly, the general classification of EMTO was introduced at two levels and in four aspects: single-population multitasking, multi-population multitasking, auxiliary task, and multiform task. Next, the core component design of EMTO was introduced, including task construction and knowledge transfer. Finally, its various application scenarios were introduced, and a summary and outlook for future research were provided.

    Small object detection algorithm from drone perspective based on improved YOLOv8n
    Tao LIU, Shihong JU, Yimeng GAO
    Journal of Computer Applications    2024, 44 (11): 3603-3609.   DOI: 10.11772/j.issn.1001-9081.2023111644

    In view of the low accuracy of object detection algorithms for small object detection from the drone perspective, a new small object detection algorithm named SFM-YOLOv8 was proposed by improving the backbone network and attention mechanism of YOLOv8. Firstly, the SPace-to-Depth Convolution (SPDConv), suitable for low-resolution images and small object detection, was integrated into the backbone network to retain discriminative feature information and improve the perception of small objects. Secondly, a multi-branch attention named MCA (Multiple Coordinate Attention) was introduced to enhance the spatial and channel information of the feature layer. Then, a convolution module FE-C2f, fusing FasterNet and Efficient Multi-scale Attention (EMA), was constructed to reduce the computational cost and make the model lightweight. Besides, a Minimum Point Distance based Intersection over Union (MPDIoU) loss function was introduced to improve the accuracy of the algorithm. Finally, a small object detection layer was added to the network structure of YOLOv8n to retain more location information and detailed features of small objects. Experimental results show that compared with YOLOv8n, SFM-YOLOv8 achieves a 4.37 percentage point increase in mAP50 (mean Average Precision at an IoU threshold of 0.5) with a 5.98% reduction in parameters on the VisDrone-DET2019 dataset. Compared to related mainstream models, SFM-YOLOv8 achieves higher accuracy and meets real-time detection requirements.
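The MPDIoU term mentioned above penalizes plain IoU by the squared distances between the two boxes' top-left and bottom-right corner pairs, normalized by the squared image diagonal. A pure-Python sketch of the metric itself (the training loss is typically 1 − MPDIoU); the box coordinates and image size below are illustrative:

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU for boxes (x1, y1, x2, y2): IoU minus the normalized squared
    distances between top-left and bottom-right corner pairs."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Corner-distance penalties, normalized by the squared image diagonal
    d1 = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2      # top-left corners
    d2 = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2      # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return iou - d1 / norm - d2 / norm

exact = mpdiou((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)    # perfect match
shifted = mpdiou((0, 0, 10, 10), (2, 2, 12, 12), 100, 100)  # slight offset
```

The corner penalties give a useful gradient even for small boxes whose IoU changes very little under a slight shift.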

    Correlation filtering based target tracking with nonlinear temporal consistency
    Wentao JIANG, Wanxuan LI, Shengchong ZHANG
    Journal of Computer Applications    2024, 44 (8): 2558-2570.   DOI: 10.11772/j.issn.1001-9081.2023081121

    Concerning the problem that existing target tracking algorithms mainly use the linear constraint mechanism of LADCF (Learning Adaptive Discriminative Correlation Filters), which easily causes model drift, a correlation filtering based target tracking algorithm with nonlinear temporal consistency was proposed. First, a nonlinear temporal consistency term was proposed based on Stevens’ Law, which aligns closely with the characteristics of human visual perception. The nonlinear temporal consistency term allowed the model to track the target relatively smoothly, ensuring tracking continuity and preventing model drift. Next, the Alternating Direction Method of Multipliers (ADMM) was employed to compute the optimal function value, ensuring real-time performance of the algorithm. Lastly, Stevens’ Law was used for nonlinear filter updating, enabling the filter update factor to enhance or suppress the filter according to changes of the target, thereby adapting to target changes and preventing filter degradation. Comparison experiments with mainstream correlation filtering and deep learning algorithms were performed on four standard datasets. Compared with the baseline algorithm LADCF, the tracking precision and success rate of the proposed algorithm were improved by 2.4 and 3.8 percentage points on the OTB100 dataset, and by 1.5 and 2.5 percentage points on the UAV123 dataset. The experimental results show that the proposed algorithm effectively avoids tracking model drift, reduces the likelihood of filter degradation, and has higher tracking precision and success rate as well as stronger robustness in complicated situations such as occlusion and illumination changes.
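Stevens’ power law states that perceived magnitude grows as S = k·I^a. As a loose illustration of how such a nonlinear response can drive a filter update factor, here is a sketch in which the exponent, gain, and blending rule are invented for illustration and are not the paper's actual formulation:

```python
def update_factor(change, k=1.0, a=0.5):
    """Stevens'-law-style response S = k * change**a; with a < 1, small
    changes still produce a noticeable update while large changes are
    compressed, keeping the filter update smooth."""
    return k * change ** a

def update_filter(filt, new_filt, change):
    """Blend old and new filter coefficients with a change-dependent weight."""
    eta = min(1.0, update_factor(change))   # clamp to a valid blend weight
    return [(1 - eta) * f + eta * n for f, n in zip(filt, new_filt)]

# A small appearance change (0.04) maps to a moderate blend weight of 0.2
updated = update_filter([1.0, 1.0], [0.0, 0.0], change=0.04)
```

The compressive exponent is what lets the update factor "enhance and suppress" adaptively: abrupt target changes do not overwhelm the learned filter.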

    Mobile robot 3D space path planning method based on deep reinforcement learning
    Tian MA, Runtao XI, Jiahao LYU, Yijie ZENG, Jiayi YANG, Jiehui ZHANG
    Journal of Computer Applications    2024, 44 (7): 2055-2064.   DOI: 10.11772/j.issn.1001-9081.2023060749

    Aiming at the high complexity and uncertainty of unknown 3D environments, a mobile robot 3D path planning method based on deep reinforcement learning, under a limited observation space optimization strategy, was proposed. First, depth map information was used as the agent’s input in the limited observation space, which could simulate complex 3D space environments with limited and unknown movement conditions. Second, a two-stage action selection policy in a discrete action space was designed, including directional actions and movement actions, which could reduce the search steps and time. Finally, based on the Proximal Policy Optimization (PPO) algorithm, a Gated Recurrent Unit (GRU) was added to incorporate historical state information, enhancing policy stability in unknown environments and thus improving the accuracy and smoothness of the planned path. The experimental results show that, compared with Advantage Actor-Critic (A2C), the average search time is reduced by 49.07% and the average planned path length is reduced by 1.04%. Meanwhile, the proposed method can complete multi-objective path planning tasks under linear sequential logic constraints.

    Class-imbalanced traffic abnormal detection based on 1D-CNN and BiGRU
    Hong CHEN, Bing QI, Haibo JIN, Cong WU, Li’ang ZHANG
    Journal of Computer Applications    2024, 44 (8): 2493-2499.   DOI: 10.11772/j.issn.1001-9081.2023081112

    Network traffic anomaly detection is a network security defense method that analyzes network traffic to identify potential attacks. To address the low detection accuracy and high false positive rate caused by imbalanced, high-dimensional network traffic data with diverse attack categories, a model for traffic anomaly detection was constructed by combining a One-Dimensional Convolutional Neural Network (1D-CNN) and a Bidirectional Gated Recurrent Unit (BiGRU). For class-imbalanced data, balanced processing was performed by using an improved Synthetic Minority Oversampling TEchnique (SMOTE), namely Borderline-SMOTE, and an undersampling clustering technique based on a Gaussian Mixture Model (GMM). Subsequently, the 1D-CNN was utilized to extract local features from the data, and the BiGRU was used to better extract the time series features in the data. Finally, the proposed model was evaluated on the UNSW-NB15 dataset, achieving an accuracy of 98.12% and a false positive rate of 1.28%. The experimental results demonstrate that the proposed model outperforms other classic machine learning and deep learning models: it improves the recognition rate for minority attacks and achieves higher detection accuracy.
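The balancing step builds on SMOTE-style interpolation: each synthetic minority sample lies on the segment between a real sample and one of its k nearest minority-class neighbors. A basic-SMOTE sketch in NumPy (Borderline-SMOTE additionally restricts the seed samples to those near the class boundary; the toy data is illustrative):

```python
import numpy as np

def smote_sample(minority, n_new, k=3, seed=0):
    """Basic SMOTE: interpolate between a minority sample and one of its
    k nearest minority-class neighbors to create a synthetic sample."""
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        dists = np.linalg.norm(X - X[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]   # skip X[i] itself at index 0
        j = rng.choice(neighbors)
        lam = rng.random()                       # position along the segment
        synthetic.append(X[i] + lam * (X[j] - X[i]))
    return np.array(synthetic)

X_min = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy minority class
synth = smote_sample(X_min, n_new=5)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the minority region rather than being arbitrary noise.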

    Generative label adversarial text classification model
    Xun YAO, Zhongzheng QIN, Jie YANG
    Journal of Computer Applications    2024, 44 (6): 1781-1785.   DOI: 10.11772/j.issn.1001-9081.2023050662

    Text classification is a fundamental task in Natural Language Processing (NLP), aiming to assign text data to predefined categories. The combination of the Graph Convolutional neural Network (GCN) and the large-scale pre-trained model BERT (Bidirectional Encoder Representations from Transformer) has achieved excellent results in text classification tasks. However, the undirected information transmission of GCN in large-scale heterogeneous graphs produces information noise, which affects the judgment of the model and reduces its classification ability. To solve this problem, a generative label adversarial model, the Class Adversarial Graph Convolutional Network (CAGCN), was proposed to reduce the interference of irrelevant information during classification and improve the classification performance of the model. Firstly, the graph construction method of TextGCN (Text Graph Convolutional Network) was used to build the adjacency matrix, which was combined with the GCN and BERT models as a Class Generator (CG). Secondly, a pseudo-label feature training method was used during model training to construct a cluster, and the cluster and the class generator were jointly trained. Finally, experiments were carried out on several widely used datasets. Experimental results show that the classification accuracy of the CAGCN model is 1.2, 0.1, 0.5, 1.7 and 0.5 percentage points higher than that of the RoBERTaGCN model on the widely used classification datasets 20NG, R8, R52, Ohsumed and MR, respectively.

    Table and Figures | Reference | Related Articles | Metrics
    Lightweight algorithm for impurity detection in raw cotton based on improved YOLOv7
    Yongjin ZHANG, Jian XU, Mingxing ZHANG
    Journal of Computer Applications    2024, 44 (7): 2271-2278.   DOI: 10.11772/j.issn.1001-9081.2023070969
    Abstract352)   HTML13)    PDF (8232KB)(341)       Save

    Addressing the challenges posed by the high throughput of raw cotton and the long impurity inspection time in cotton mills, an improved YOLOv7 model incorporating lightweight modifications was proposed for impurity detection in raw cotton. Initially, redundant convolutional layers in the YOLOv7 model were pruned, thereby increasing detection speed. Following this, FasterNet convolutional layers were integrated into the backbone network to mitigate the model's computational load and diminish redundancy in feature maps, thereby realizing real-time detection. Ultimately, CSP-RepFPN (Cross Stage Partial networks with Replicated Feature Pyramid Network) was used in the neck network to reconstruct the feature pyramid, augment the flow of feature information, minimize feature loss, and elevate detection precision. Experimental results show that the improved YOLOv7 model achieves a mean Average Precision of 96.0% with a 37.5% reduction in detection time on a self-made raw cotton impurity dataset, and achieves a detection accuracy of 82.5% with a detection time of only 29.8 ms on the public DWC (Drinking Waste Classification) dataset. This improved YOLOv7 model provides a lightweight approach for real-time detection, recognition and classification of impurities in raw cotton, yielding substantial time savings.

    Table and Figures | Reference | Related Articles | Metrics
    Personalized federated learning method based on dual stream neural network
    Zheyuan SHEN, Keke YANG, Jing LI
    Journal of Computer Applications    2024, 44 (8): 2319-2325.   DOI: 10.11772/j.issn.1001-9081.2023081207
    Abstract350)   HTML53)    PDF (2185KB)(233)       Save

    Classic Federated Learning (FL) algorithms struggle to achieve good results in scenarios where data is highly heterogeneous. Personalized FL (PFL) addresses the data heterogeneity problem in federated learning by “tailoring” a dedicated model for each client, which gives the models good performance; however, it simultaneously makes it difficult to extend federated learning to new clients. Focusing on the challenges of performance and scalability in PFL, FedDual, an FL model with a dual-stream neural network structure, was proposed. By adding an encoder that analyzes the personalized characteristics of each client, this model not only matches the performance of personalized models but can also be extended to new clients easily. Experimental results show that, compared with the classic Federated Averaging (FedAvg) algorithm, FedDual obviously improves the accuracy on datasets such as MNIST and FashionMNIST; on the CIFAR10 dataset, FedDual improves the accuracy by more than 10 percentage points. Moreover, FedDual achieves “plug and play” for new clients without a decrease in accuracy, solving the problem of difficult scalability for new clients.
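The server-side aggregation that FedDual builds upon can be sketched as the classic FedAvg weighted average; the personalized dual-stream encoder itself is not described in enough detail in the abstract to reproduce:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Classic FedAvg aggregation: the server averages client parameter
    vectors weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    W = np.stack(client_weights)                      # (n_clients, n_params)
    return (sizes[:, None] * W).sum(0) / sizes.sum()  # size-weighted mean
```

For example, averaging `[0, 0]` (1 sample) with `[2, 4]` (3 samples) gives `[1.5, 3.0]`, since the larger client contributes three quarters of the weight.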

    Table and Figures | Reference | Related Articles | Metrics
    Proximal policy optimization algorithm based on clipping optimization and policy guidance
    Yi ZHOU, Hua GAO, Yongshen TIAN
    Journal of Computer Applications    2024, 44 (8): 2334-2341.   DOI: 10.11772/j.issn.1001-9081.2023081079
    Abstract347)   HTML14)    PDF (3877KB)(419)       Save

    Addressing two issues in the Proximal Policy Optimization (PPO) algorithm, the difficulty of strictly constraining the difference between old and new policies and the relatively low efficiency of exploration and exploitation, a PPO algorithm based on Clipping Optimization And Policy Guidance (COAPG-PPO) was proposed. Firstly, by analyzing the clipping mechanism of PPO, a trust-region clipping approach based on the Wasserstein distance was devised, strengthening the constraint on the difference between old and new policies. Secondly, ideas from simulated annealing and greedy algorithms were incorporated into the policy updating process, improving the exploration efficiency and learning speed of the algorithm. To validate the effectiveness of COAPG-PPO, comparative experiments were conducted on the MuJoCo benchmarks against PPO based on Clipping Optimization (CO-PPO), PPO with Covariance Matrix Adaptation (PPO-CMA), Trust Region-based PPO with RollBack (TR-PPO-RB), and the original PPO algorithm. The experimental results indicate that the COAPG-PPO algorithm provides stricter constraint capabilities, higher exploration and exploitation efficiency, and higher reward values in most environments.
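For reference, the standard PPO clipped surrogate that COAPG-PPO modifies looks like this in NumPy; the Wasserstein trust-region rule itself is not specified in the abstract, so only the baseline clipping is shown:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized):
    L = E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)], where r is the
    probability ratio pi_new(a|s) / pi_old(a|s) and A is the advantage.
    COAPG-PPO replaces the fixed eps interval with a trust region
    derived from the Wasserstein distance between old and new policies."""
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    return np.minimum(ratio * advantage, clipped * advantage).mean()
```

The `min` makes the objective pessimistic: a ratio that drifts outside `[1 - eps, 1 + eps]` in the profitable direction gains nothing, which is what (loosely) bounds the policy update.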

    Table and Figures | Reference | Related Articles | Metrics
    Image super-resolution network based on global dependency Transformer
    Zihan LIU, Dengwen ZHOU, Yukai LIU
    Journal of Computer Applications    2024, 44 (5): 1588-1596.   DOI: 10.11772/j.issn.1001-9081.2023050636
    Abstract346)   HTML6)    PDF (2858KB)(205)       Save

    At present, image super-resolution networks based on deep learning are mainly implemented with convolutions. Compared with the traditional Convolutional Neural Network (CNN), the main advantage of Transformer in the image super-resolution task is its ability to model long-distance dependencies. However, most Transformer-based image super-resolution models cannot establish global dependencies with few parameters and few network layers, which limits model performance. In order to establish global dependencies in the super-resolution network, an image Super-Resolution network based on Global Dependency Transformer (GDTSR) was proposed. Its main component is the Residual Square Axial Window Block (RSAWB): in the Transformer residual layer, axial-window self-attention is used to make each pixel globally dependent on the entire feature map. In addition, since the reconstruction modules of most current image super-resolution models are composed of convolutions, Transformer and convolution were combined to jointly reconstruct super-resolution images and dynamically integrate the extracted feature information. Experimental results show that the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) of GDTSR on five standard test sets, including Set5, Set14, B100, Urban100 and Manga109, are optimal for all three scale factors (×2, ×3 and ×4), and on the large-scale datasets Urban100 and Manga109, the performance improvement is especially obvious.
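The axial-window idea can be illustrated with a minimal NumPy sketch: attention is applied along rows and then columns, so every pixel can influence the whole map at a cost of O(HW(H+W)) instead of the O((HW)²) of full attention. Q=K=V with no learned projections here, purely for brevity:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(X):
    """Self-attention along one image axis at a time on an (H, W, C)
    feature map: a row pass followed by a column pass lets every pixel
    attend (indirectly) to the entire map."""
    H, W, C = X.shape
    # row (width-axis) attention: each row of W pixels attends within itself
    A = softmax(np.einsum('hwc,hvc->hwv', X, X) / np.sqrt(C))
    X = np.einsum('hwv,hvc->hwc', A, X)
    # column (height-axis) attention: each column of H pixels attends within itself
    A = softmax(np.einsum('hwc,vwc->hwv', X, X) / np.sqrt(C))
    return np.einsum('hwv,vwc->hwc', A, X)
```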

    Table and Figures | Reference | Related Articles | Metrics
    Road damage detection algorithm based on enhanced feature extraction
    Wudan LONG, Bo PENG, Jie HU, Ying SHEN, Danni DING
    Journal of Computer Applications    2024, 44 (7): 2264-2270.   DOI: 10.11772/j.issn.1001-9081.2023070956
    Abstract341)   HTML8)    PDF (2806KB)(487)       Save

    In response to the difficulty of detecting small road damage areas and the uneven distribution of damage categories, a road damage detection algorithm termed RDD-YOLO was introduced based on the YOLOv7-tiny architecture. Firstly, the K-means++ algorithm was employed to determine anchor boxes that better conform to object dimensions. Subsequently, a Quantization Aware RepVGG (QARepVGG) module was utilized in the auxiliary detection branch to enhance the extraction of shallow features. Concurrently, an Addition and Multiplication Convolutional Block Attention Module (AM-CBAM) was embedded into the three inputs of the neck, effectively suppressing disturbances arising from intricate backgrounds. Furthermore, the feature fusion module Res-RFB (Resblock with Receptive Field Block) was devised to emulate the expansion of the receptive field in human visual perception, fusing information across multiple scales and thereby amplifying representational capability. Additionally, a lightweight Small Decoupled Head (S-DeHead) was introduced to elevate the precision of small object detection. Ultimately, small object localization was optimized using the Normalized Wasserstein Distance (NWD) metric, which in turn mitigated the challenge of imbalanced samples. Experimental results show that the RDD-YOLO algorithm improves mAP50 by 6.19 percentage points and F1-Score by 5.31 percentage points while reaching a detection speed of 135.26 frame/s, at a cost of only 0.71×10⁶ extra parameters and 1.7 GFLOPs, which meets the requirements for both accuracy and speed in road maintenance.
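The NWD metric for small boxes has a simple closed form. The sketch below follows the normalized Gaussian Wasserstein distance of Wang et al., with boxes given as (center x, center y, width, height) and C a dataset-dependent constant; the value of C here is illustrative:

```python
import numpy as np

def nwd(box_a, box_b, C=12.8):
    """Normalized Wasserstein Distance between two boxes (cx, cy, w, h):
    each box is modelled as a 2-D Gaussian; the 2nd-order Wasserstein
    distance between the Gaussians has the closed form below, and the
    exponential maps it into (0, 1], like an IoU-style similarity."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2 = (ax - bx)**2 + (ay - by)**2 + ((aw - bw) / 2)**2 + ((ah - bh) / 2)**2
    return np.exp(-np.sqrt(w2) / C)
```

Unlike IoU, this similarity stays non-zero (and smoothly decreasing) even when tiny boxes no longer overlap, which is why it suits small-object localization.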

    Table and Figures | Reference | Related Articles | Metrics
    Multi-robot path following and formation based on deep reinforcement learning
    Haodong HE, Hao FU, Qiang WANG, Shuai ZHOU, Wei LIU
    Journal of Computer Applications    2024, 44 (8): 2626-2633.   DOI: 10.11772/j.issn.1001-9081.2023081120
    Abstract333)   HTML8)    PDF (3411KB)(206)       Save

    Aiming at the obstacle avoidance and trajectory smoothness problem of multi-robot path following and formation in crowd environment, a multi-robot path following and formation algorithm based on deep reinforcement learning was proposed. Firstly, a pedestrian danger priority mechanism was established, which was combined with reinforcement learning to design a danger awareness network to enhance the safety of multi-robot formation. Subsequently, a virtual robot was introduced as the reference target for multiple robots, thus transforming path following into tracking control of the virtual robot by the multiple robots, with the purpose of enhancing the smoothness of the robot trajectories. Finally, quantitative and qualitative analysis was conducted through simulation experiments to compare the proposed algorithm with existing ones. The experimental results show that compared with the existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments, which ensures the smoothness of multi-robot motion trajectories.

    Table and Figures | Reference | Related Articles | Metrics
    Real-time object detection algorithm for complex construction environments
    Xiaogang SONG, Dongdong ZHANG, Pengfei ZHANG, Li LIANG, Xinhong HEI
    Journal of Computer Applications    2024, 44 (5): 1605-1612.   DOI: 10.11772/j.issn.1001-9081.2023050687
    Abstract319)   HTML16)    PDF (3015KB)(178)       Save

    A real-time object detection algorithm, YOLO-C, was proposed for complex construction environments to address the problems common in such environments: cluttered scenes, occluded objects, large variation in object scale, imbalanced positive and negative samples, and the insufficient real-time performance of existing detection algorithms. The extracted low-level features were fused with high-level features to enhance the global perception capability of the network, and a small object detection layer was designed to improve the detection accuracy for objects of different scales. A Channel-Spatial Attention (CSA) module was designed to enhance object features and suppress background features. For the loss function, VariFocal Loss was used to calculate the classification loss and address the positive/negative sample imbalance. GhostConv was used as the basic convolutional block to construct the GCSP (Ghost Cross Stage Partial) structure, reducing the number of parameters and the amount of computation. For complex construction environments, a concrete construction site object detection dataset was constructed, and comparison experiments among various algorithms were conducted on it. Experimental results demonstrate that YOLO-C has higher detection accuracy and fewer parameters, making it more suitable for object detection tasks in complex construction environments.
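The VariFocal Loss used for the classification branch can be sketched per prediction as follows; the α and γ defaults are those of the original VarifocalNet paper, and this is an illustrative re-implementation, not the authors' code:

```python
import numpy as np

def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """VariFocal Loss for one prediction: p is the predicted IoU-aware
    classification score, q is the target (the IoU with the ground truth
    for positives, 0 for negatives). Positives keep the full binary
    cross-entropy weighted by q; negatives are down-weighted by
    alpha * p**gamma, taming the positive/negative imbalance."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    bce = -(q * np.log(p) + (1 - q) * np.log(1 - p))
    weight = np.where(q > 0, q, alpha * p ** gamma)
    return weight * bce
```

The asymmetry is the point: easy negatives (low p) contribute almost nothing, while positives are emphasized in proportion to their target quality q.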

    Table and Figures | Reference | Related Articles | Metrics
    Review on security threats and defense measures in federated learning
    Xuebin CHEN, Zhiqiang REN, Hongyang ZHANG
    Journal of Computer Applications    2024, 44 (6): 1663-1672.   DOI: 10.11772/j.issn.1001-9081.2023060832
    Abstract318)   HTML20)    PDF (1072KB)(622)       Save

    Federated learning is a distributed learning approach for solving the data sharing and privacy protection problems in machine learning, in which multiple parties jointly train a machine learning model while protecting data privacy. However, there are security threats inherent in federated learning, which poses great challenges in practical applications. Therefore, analyzing the attacks faced by federated learning and the corresponding defense measures is crucial for its development and application. First, the definition, process and classification of federated learning were introduced, along with the attacker model in federated learning. Then, the possible attacks on both the robustness and the privacy of federated learning systems were described together with the corresponding defense measures, and the shortcomings of these defense schemes were pointed out. Finally, a secure federated learning system was envisioned.

    Table and Figures | Reference | Related Articles | Metrics
    Distributed UAV cluster pursuit decision-making based on trajectory prediction and MADDPG
    Yu WANG, Zhihui GUAN, Yuanpeng LI
    Journal of Computer Applications    2024, 44 (11): 3623-3628.   DOI: 10.11772/j.issn.1001-9081.2023101538
    Abstract316)   HTML2)    PDF (918KB)(113)       Save

    A Trajectory Prediction based Distributed Multi-Agent Deep Deterministic Policy Gradient (TP-DMADDPG) algorithm was proposed to address the insufficient flexibility and poor generalization ability of Unmanned Aerial Vehicle (UAV) cluster pursuit decision-making algorithms in complex mission environments. Firstly, to enhance the realism of the pursuit mission, an intelligent escape strategy was designed for the target. Secondly, considering conditions such as missing target information caused by communication interruption, a Long Short-Term Memory (LSTM) network was used to predict the target's position in real time, and the state space of the decision-making model was constructed on the basis of the predicted information. Finally, TP-DMADDPG was designed on top of a distributed framework and the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, enhancing the flexibility and generalization ability of pursuit decision-making in complex air combat. Simulation results show that compared with the Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic policy gradient (TD3) and MADDPG algorithms, TP-DMADDPG increases the success rate of collaborative decision-making by more than 15 percentage points, and can solve the problem of pursuing an intelligent escaping target with incomplete information.

    Table and Figures | Reference | Related Articles | Metrics
    Small target detection model in overlooking scenes on tower cranes based on improved real-time detection Transformer
    Yudong PANG, Zhixing LI, Weijie LIU, Tianhao LI, Ningning WANG
    Journal of Computer Applications    2024, 44 (12): 3922-3929.   DOI: 10.11772/j.issn.1001-9081.2023121796
    Abstract314)   HTML1)    PDF (3128KB)(254)       Save

    To address construction-site safety problems such as casualties caused by falling objects and tower crane collapses caused by collisions between tower hooks, a small target detection model for overlooking scenes on tower cranes based on improved Real-Time DEtection TRansformer (RT-DETR) was proposed. Firstly, a structure with multiple branches during training and a single branch during inference, designed following the idea of model re-parameterization, was added to the original model to improve detection speed. Secondly, the convolution module in FasterNet Block was redesigned to replace BasicBlock in the original backbone, improving the performance of the detection model. Thirdly, the new loss function Inner-SIoU (Inner-Structured Intersection over Union) was utilized to further improve the precision and convergence speed of the model. Finally, ablation and comparison experiments were conducted to verify model performance. The results show that, in detecting small targets in overlooking scenes on tower cranes, the proposed model achieves a precision of 94.7%, which is 6.1 percentage points higher than that of the original RT-DETR model. At the same time, the Frames Per Second (FPS) of the proposed model reaches 59.7, a 21% improvement in detection speed over the original model. The Average Precision (AP) of the proposed model on the public dataset COCO 2017 is 2.4, 1.5, and 1.3 percentage points higher than those of YOLOv5, YOLOv7, and YOLOv8, respectively. It can be seen that the proposed model meets the precision and speed requirements for small target detection in overlooking scenes on tower cranes.
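The multi-branch-training/single-branch-inference idea borrowed from model re-parameterization can be illustrated by the RepVGG-style kernel fusion below; batch-norm folding is omitted and the function name is hypothetical:

```python
import numpy as np

def reparameterize(k3, k1, add_identity=False):
    """RepVGG-style structural re-parameterization: a 3x3 branch, a 1x1
    branch and (optionally) an identity branch that exist only during
    training are fused into one 3x3 kernel for inference, so the deployed
    model runs a single convolution where training ran several.
    k3: (C_out, C_in, 3, 3), k1: (C_out, C_in, 1, 1)."""
    fused = k3.copy()
    fused[:, :, 1:2, 1:2] += k1            # 1x1 kernel pads to the centre of 3x3
    if add_identity:                       # identity = centred Dirac kernel
        c = min(fused.shape[0], fused.shape[1])
        fused[np.arange(c), np.arange(c), 1, 1] += 1.0
    return fused
```

Because convolution is linear, summing the branch outputs equals convolving once with the summed kernel, so the fusion is exact (up to floating point).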

    Table and Figures | Reference | Related Articles | Metrics
    Deep network compression method based on low-rank decomposition and vector quantization
    Dongwei WANG, Baichen LIU, Zhi HAN, Yanmei WANG, Yandong TANG
    Journal of Computer Applications    2024, 44 (7): 1987-1994.   DOI: 10.11772/j.issn.1001-9081.2023071027
    Abstract314)   HTML120)    PDF (1506KB)(439)       Save

    With the development of artificial intelligence, deep neural networks have become an essential tool in various pattern recognition tasks. Deploying deep Convolutional Neural Networks (CNN) on edge computing equipment is challenging due to storage space and computing resource constraints; therefore, deep network compression has become an important research topic in recent years. Low-rank decomposition and vector quantization are the most popular network compression techniques, and both try to find a compact representation of the original network, thereby reducing the redundancy of network parameters. By establishing a joint compression framework, a deep network compression method based on low-rank decomposition and vector quantization, named Quantized Tensor Decomposition (QTD), was proposed to obtain a higher compression ratio by performing further quantization on the low-rank structure of the network. Experimental results of classical ResNet with the proposed method on the CIFAR-10 dataset show that QTD can compress the volume to 1% with a slight accuracy drop of 1.71 percentage points. Moreover, the proposed method was compared with the quantization-based method PQF (Permute, Quantize, and Fine-tune), the low-rank decomposition-based method TDNR (Tucker Decomposition with Nonlinear Response), and the pruning-based method CLIP-Q (Compression Learning by In-parallel Pruning-Quantization) on the large dataset ImageNet. Experimental results show that QTD maintains better classification accuracy at the same compression ratio.
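The two-step pipeline behind QTD can be caricatured in 2-D: truncated SVD for the low-rank step, then a tiny k-means codebook to quantize the factor entries. The actual method operates on full convolution tensors with joint optimization, so this is only a shape-level sketch and the function names are illustrative:

```python
import numpy as np

def kmeans1d(x, k, iters=20, rng=0):
    """Tiny k-means on scalars, serving as the quantizer codebook."""
    rng = np.random.default_rng(rng)
    cb = rng.choice(x, k, replace=False)
    for _ in range(iters):
        assign = np.abs(x[:, None] - cb[None, :]).argmin(1)
        for j in range(k):
            if (assign == j).any():
                cb[j] = x[assign == j].mean()
    return cb, assign

def qtd_compress(W, rank, k):
    """Low-rank step (truncated SVD) followed by quantization of the
    factors against a small learned codebook; returns the reconstructed
    (compressed) weight matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = U[:, :rank] * s[:rank]          # left factor absorbs singular values
    R = Vt[:rank]
    out = []
    for F in (L, R):
        cb, assign = kmeans1d(F.ravel(), k)
        out.append(cb[assign].reshape(F.shape))   # de-quantized factor
    return out[0] @ out[1]
```

Storing the two small factors plus the codebook indices is what yields the compression; the reconstruction above is only for checking fidelity.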

    Table and Figures | Reference | Related Articles | Metrics
    Graph data generation approach for graph neural network model extraction attacks
    Ying YANG, Xiaoyan HAO, Dan YU, Yao MA, Yongle CHEN
    Journal of Computer Applications    2024, 44 (8): 2483-2492.   DOI: 10.11772/j.issn.1001-9081.2023081110
    Abstract311)   HTML2)    PDF (3213KB)(305)       Save

    Data-free model extraction attacks are a class of machine learning security problems in which the attacker has no knowledge of the training data needed to carry out the attack. Aiming at the research gap of data-free model extraction attacks in the field of Graph Neural Networks (GNN), a GNN model extraction attack method was proposed. The graph node feature information and edge information were optimized with the GNN interpretability method GNNExplainer and the graph data augmentation method GAUG-M, respectively, so as to generate the required graph data and achieve the final GNN model extraction. Firstly, the GNNExplainer method was used to obtain the important graph node feature information from an interpretable analysis of the target model's responses. Secondly, the overall optimization of the graph node features was achieved by up-weighting important node features and down-weighting unimportant ones. Then, a graph autoencoder was used as the edge information prediction module, which obtained the connection probability between nodes from the optimized node features. Finally, the edge information was optimized by adding or deleting the corresponding edges according to these probabilities. Three GNN model architectures trained on five graph datasets were used as target models for extraction attacks; the obtained substitute models achieve 73% to 87% accuracy in node classification tasks and 76% to 89% fidelity to the target model performance, which verifies the effectiveness of the proposed method.
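The edge-prediction module can be sketched as the standard inner-product decoder of a graph autoencoder, where the probability of an edge (i, j) is the sigmoid of the dot product of the two node embeddings:

```python
import numpy as np

def predict_edges(Z):
    """Inner-product decoder of a graph autoencoder: given node
    embeddings Z of shape (n_nodes, d), returns an (n_nodes, n_nodes)
    matrix of edge probabilities sigmoid(Z @ Z.T). Edges can then be
    added or deleted by thresholding these probabilities."""
    logits = Z @ Z.T
    return 1.0 / (1.0 + np.exp(-logits))
```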

    Table and Figures | Reference | Related Articles | Metrics
    Semi-supervised object detection framework guided by curriculum learning
    Yingjun ZHANG, Niuniu LI, Binhong XIE, Rui ZHANG, Wangdong LU
    Journal of Computer Applications    2024, 44 (8): 2326-2333.   DOI: 10.11772/j.issn.1001-9081.2023081062
    Abstract310)   HTML24)    PDF (2042KB)(260)       Save

    In order to enhance the quality of pseudo labels, address the issue of confirmation bias in Semi-Supervised Object Detection (SSOD), and tackle existing algorithms' neglect of the complexity of unlabeled data, which leads to erroneous pseudo labels, an SSOD framework guided by Curriculum Learning (CL) was proposed. The framework consists of two modules: the ICSD (IoU-Confidence-Standard-Deviation) difficulty measurer and the BP (Batch-Package) training scheduler. The ICSD difficulty measurer comprehensively considers information such as the IoU (Intersection over Union) between pseudo-bounding boxes, confidence and class labels, and the C_IOU (Checkpoint_IOU) method was introduced to evaluate the reliability of unlabeled data. The BP training scheduler provides two efficient scheduling strategies, designed from the Batch and Package perspectives respectively, giving priority to unlabeled data with high reliability so as to fully utilize the entire unlabeled dataset in the manner of curriculum learning. Extensive comparative experiments on the Pascal VOC and MS-COCO datasets demonstrate that the proposed framework applies to existing SSOD algorithms and brings significant improvements in detection accuracy and stability.

    Table and Figures | Reference | Related Articles | Metrics
    Enhanced deep subspace clustering method with unified framework
    Qing WANG, Jieyu ZHAO, Xulun YE, Nongxiao WANG
    Journal of Computer Applications    2024, 44 (7): 1995-2003.   DOI: 10.11772/j.issn.1001-9081.2023101395
    Abstract306)   HTML83)    PDF (3432KB)(332)       Save

    Deep subspace clustering is a method that performs well on high-dimensional data clustering tasks. However, when dealing with challenging data, current deep subspace clustering methods with a fixed self-expressive matrix usually produce suboptimal clustering results, because self-expressive learning and indicator learning are conventionally treated as two separate, independent processes, while the quality of the self-expressive matrix has a crucial impact on clustering accuracy. To solve these problems, an enhanced deep subspace clustering method with a unified framework was proposed. Firstly, feature learning, self-expressive learning and indicator learning were integrated to optimize all parameters jointly, so that the self-expressive matrix was learned dynamically from the characteristics of the data, ensuring accurate capture of data features. Secondly, to improve the effect of self-expressive learning, class prototype pseudo-label learning was proposed to provide self-supervised information for feature learning and indicator learning, thereby promoting self-expressive learning. Finally, to enhance the discriminative ability of the embedded representations, orthogonality constraints were introduced to help realize the self-expressive property. The experimental results show that, compared with AASSC (Adaptive Attribute and Structure Subspace Clustering network), the proposed method improves clustering accuracy by 1.84, 0.49 and 0.34 percentage points on the MNIST, UMIST and COIL20 datasets, respectively. It can be seen that the proposed method improves the accuracy of self-expressive matrix learning, thereby achieving better clustering effects.

    Table and Figures | Reference | Related Articles | Metrics
    Hybrid internet of vehicles intrusion detection system for zero-day attacks
    Jiepo FANG, Chongben TAO
    Journal of Computer Applications    2024, 44 (9): 2763-2769.   DOI: 10.11772/j.issn.1001-9081.2023091328
    Abstract301)   HTML12)    PDF (2618KB)(744)       Save

    Existing machine learning methods over-rely on sample data and are insensitive to anomalous data when confronted with zero-day attack detection, making it difficult for an Intrusion Detection System (IDS) to defend effectively against zero-day attacks. Therefore, a hybrid Internet of Vehicles intrusion detection system based on Transformer and ANFIS (Adaptive-Network-based Fuzzy Inference System) was proposed. Firstly, a data enhancement algorithm was designed to solve the problem of unbalanced data samples by denoising first and then generating. Secondly, a feature engineering module was designed that introduces non-linear feature interactions into complex feature combinations. Finally, the self-attention mechanism of Transformer and the adaptive learning method of ANFIS were combined, which enhanced the feature representation ability and reduced the dependence on sample data. The proposed system was compared with SOTA (State-Of-The-Art) algorithms such as Dual-IDS on the CICIDS-2017 and UNSW-NB15 intrusion datasets. Experimental results show that, for zero-day attacks, the proposed system achieves 98.64% detection accuracy and 98.31% F1 value on the CICIDS-2017 dataset, and 93.07% detection accuracy and 92.43% F1 value on the UNSW-NB15 dataset, validating the high accuracy and strong generalization ability of the proposed system for zero-day attack detection.

    Table and Figures | Reference | Related Articles | Metrics
    Incomplete multi-view clustering algorithm based on self-attention fusion
    Shunyong LI, Shiyi LI, Rui XU, Xingwang ZHAO
    Journal of Computer Applications    2024, 44 (9): 2696-2703.   DOI: 10.11772/j.issn.1001-9081.2023091253
    Abstract295)   HTML8)    PDF (2806KB)(503)       Save

    Multi-view clustering on incomplete data has become one of the research hotspots in unsupervised learning. However, most multi-view clustering algorithms based on “shallow” models often find it difficult to extract and characterize the potential feature structures within views when dealing with large-scale high-dimensional data. At the same time, stacking or averaging methods of multi-view information fusion ignore the differences between views and do not fully consider the different contributions of each view to building a common consensus representation. To address these issues, an Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion (IMVCSAF) was proposed. Firstly, the potential features of each view were extracted with a deep autoencoder, and the consistency information among views was maximized using contrastive learning. Secondly, a self-attention mechanism was adopted to re-encode and fuse the potential representations of the views, comprehensively considering and mining the inherent causality and feature complementarity between different views. Thirdly, based on the common consensus representation, the potential representations of missing instances were predicted and recovered, thereby fully implementing the multi-view clustering process. Experimental results on the Scene-15, LandUse-21, Caltech101-20 and Noisy-MNIST datasets show that the accuracy of IMVCSAF is higher than those of the other comparison algorithms while meeting the convergence requirements. On the Noisy-MNIST dataset with a 50% missing rate, the accuracy of IMVCSAF is 6.58 percentage points higher than that of the second best algorithm, COMPLETER (inCOMPlete muLti-view clustEring via conTrastivE pRediction).
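The fusion step can be sketched as plain scaled dot-product self-attention across each sample's view embeddings; there are no learned projections here (Q=K=V), so this only illustrates how the views re-weight each other before being merged into a consensus vector:

```python
import numpy as np

def fuse_views(Z_views):
    """Self-attention fusion of per-view latent representations: for each
    sample, its V view embeddings (each of dimension d) attend to one
    another and the attended views are averaged into a single consensus
    representation of shape (n_samples, d)."""
    Z = np.stack(Z_views, axis=1)               # (n, V, d)
    d = Z.shape[-1]
    scores = np.einsum('nvd,nwd->nvw', Z, Z) / np.sqrt(d)
    e = np.exp(scores - scores.max(-1, keepdims=True))
    A = e / e.sum(-1, keepdims=True)            # softmax over views
    fused = np.einsum('nvw,nwd->nvd', A, Z)     # attended view embeddings
    return fused.mean(axis=1)                   # consensus representation
```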

    Table and Figures | Reference | Related Articles | Metrics
    Two-stage differential grouping method for large-scale overlapping problems
    Maojiang TIAN, Mingke CHEN, Wei DU, Wenli DU
    Journal of Computer Applications    2024, 44 (5): 1348-1354.   DOI: 10.11772/j.issn.1001-9081.2024020255
    Abstract294)   HTML66)    PDF (738KB)(486)       Save

    Large-scale overlapping problems are prevalent in practical engineering applications, and the optimization challenge is significantly amplified due to the existence of shared variables. Decomposition-based Cooperative Co-evolution (CC) algorithms have demonstrated promising performance in addressing large-scale overlapping problems. However, certain novel CC frameworks designed for overlapping problems rely on grouping methods for the identification of overlapping problem structures and the current grouping methods for large-scale overlapping problems fail to consider both accuracy and efficiency simultaneously. To address the above problems, a Two-Stage Differential Grouping (TSDG) method for large-scale overlapping problems was proposed, which achieves accurate grouping while significantly reducing computational resource consumption. In the first stage, a grouping method based on the finite difference principle was employed to efficiently identify all subcomponents and shared variables. To enhance both stability and accuracy in grouping, a grouping refinement method was proposed in the second stage to examine the information of the subcomponents and shared variables obtained in the previous stage and correct inaccurate grouping results. Based on the synergy of the two stages, TSDG achieves efficient and accurate decomposition of large-scale overlapping problems. Extensive experimental results demonstrate that TSDG is capable of accurately grouping large-scale overlapping problems while consuming fewer computational resources. In the optimization experiment, TSDG exhibits superior performance compared to state-of-the-art algorithms for large-scale overlapping problems.
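The finite-difference principle used in the first stage is the classic differential-grouping interaction test: variables i and j interact if and only if perturbing them jointly changes f by a different amount than perturbing them separately. A minimal sketch (the threshold handling in TSDG is more refined than the fixed eps used here):

```python
import numpy as np

def interacts(f, i, j, dim, x0=None, delta=1.0, eps=1e-8):
    """Finite-difference interaction test: i and j of f interact iff
    |(f(x + di + dj) - f(x + dj)) - (f(x + di) - f(x))| > eps,
    i.e. the effect of perturbing i depends on whether j was perturbed."""
    x = np.zeros(dim) if x0 is None else np.asarray(x0, float)
    di = np.zeros(dim); di[i] = delta
    dj = np.zeros(dim); dj[j] = delta
    d1 = f(x + di) - f(x)             # effect of i alone
    d2 = f(x + di + dj) - f(x + dj)   # effect of i with j perturbed
    return abs(d1 - d2) > eps
```

Variables flagged as interacting with members of two or more subcomponents are exactly the shared variables that make overlapping problems hard to decompose.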

    Table and Figures | Reference | Related Articles | Metrics
    Public traffic demand prediction based on multi-scale spatial-temporal graph convolutional network
    Huanhuan LI, Tianqiang HUANG, Xuemei DING, Haifeng LUO, Liqing HUANG
    Journal of Computer Applications    2024, 44 (7): 2065-2072.   DOI: 10.11772/j.issn.1001-9081.2023071045
    Abstract292)   HTML15)    PDF (1969KB)(320)       Save

    High-quality public traffic demand prediction has become one of the major challenges for Intelligent Transportation Systems (ITS). For public traffic demand prediction, most existing models adopt graphs with fixed structure to describe the spatial correlation of traffic demand, ignoring that traffic demand has different spatial dependence at different scales. Thus, a Multi-scale Spatial-Temporal Graph Convolutional Network (MSTGCN) model was proposed for public traffic demand prediction. Firstly, a global demand similarity graph and a local demand similarity graph were constructed at the global and local scales, respectively; the two graphs were used to capture the long-term stable and short-term dynamic features of public traffic demand. A Graph Convolutional Network (GCN) was introduced to extract the global and local spatial information in the two graphs, and an attention mechanism was adopted to combine the two kinds of spatial information adaptively. Moreover, a Gated Recurrent Unit (GRU) was used to capture the time-varying features of public traffic demand. The experimental results show that the MSTGCN model achieves a Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (PCC) of 2.788 6, 1.737 1, and 0.799 2 on the New York City (NYC) Bike dataset, and of 9.573 4, 5.861 2, and 0.963 1 on the NYC Taxi dataset, proving that the MSTGCN model can effectively mine multi-scale spatial-temporal features to accurately predict future public traffic demand.
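The spatial half of the model can be sketched as one standard GCN propagation step over a demand-similarity graph; the GRU temporal part and the attention fusion of the two graphs are omitted:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step, ReLU(D^{-1/2} (A + I) D^{-1/2} X W),
    mixing each node's features with its neighbours' over the
    similarity graph A; X is (n_nodes, f_in), W is (f_in, f_out)."""
    A_hat = A + np.eye(len(A))                 # add self-loops
    d = A_hat.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # symmetric normalization
    H = D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W
    return np.maximum(H, 0)                    # ReLU
```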

    Improved U-Net algorithm based on attention mechanism and multi-scale fusion
    Song WU, Xin LAN, Jingyang SHAN, Haiwen XU
    Journal of Computer Applications    0, (): 24-28.   DOI: 10.11772/j.issn.1001-9081.2022121844

    Aiming at the problems of computational redundancy and difficulty in segmenting fine structures with the original U-Net in medical image segmentation tasks, an improved U-Net algorithm based on attention mechanism and multi-scale fusion was proposed. Firstly, a channel attention mechanism was integrated into the skip connections so that the network focused on the channels containing more important information, thereby reducing computational resource cost and improving computational efficiency. Secondly, a feature fusion strategy was added to enrich the contextual information of the feature maps passed to the decoder, which realized complementarity and reuse among features. Finally, joint optimization was performed using Dice loss and binary cross-entropy loss, so as to handle the dramatic oscillation of the loss function that may occur in fine structure segmentation. Experimental validation results on the Kvasir_seg and DRIVE datasets show that compared with the original U-Net algorithm, the proposed improved algorithm has the Dice coefficient increased by 1.82 and 0.82 percentage points, the SEnsitivity (SE) improved by 1.94 and 3.53 percentage points, and the Accuracy (Acc) increased by 1.62 and 0.04 percentage points, respectively. It can be seen that the proposed improved algorithm can enhance the performance of the original U-Net in fine structure segmentation.
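    A hedged sketch of two ingredients named above, channel (Squeeze-and-Excitation-style) attention such as could sit on a skip connection, and the Dice plus binary cross-entropy joint loss; the array shapes and weight sizes are illustrative, not taken from the paper:

```python
import numpy as np

def se_attention(feat, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map:
    global-average-pool each channel, pass through a bottleneck
    MLP, and rescale channels by the resulting sigmoid gates."""
    z = feat.mean(axis=(1, 2))                 # squeeze: (C,)
    s = np.maximum(z @ w1, 0.0)                # excitation, hidden layer
    gate = 1.0 / (1.0 + np.exp(-(s @ w2)))     # sigmoid gates: (C,)
    return feat * gate[:, None, None]          # reweight channels

def dice_bce_loss(pred, target, eps=1e-7):
    """Joint objective: Dice loss (overlap-based, suited to thin
    structures) plus pixel-wise binary cross-entropy."""
    dice = 1.0 - (2.0 * (pred * target).sum() + eps) / (pred.sum() + target.sum() + eps)
    bce = -(target * np.log(pred + eps)
            + (1 - target) * np.log(1 - pred + eps)).mean()
    return dice + bce

rng = np.random.default_rng(0)
feat = rng.random((8, 16, 16))                 # 8 channels, 16x16 map
out = se_attention(feat, rng.standard_normal((8, 4)),
                   rng.standard_normal((4, 8)))
loss = dice_bce_loss(rng.random((16, 16)),
                     (rng.random((16, 16)) > 0.5).astype(float))
```

The gate values lie in (0, 1), so attention can only attenuate, never amplify, a channel; the relative reweighting is what steers the decoder toward informative channels.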

    Industrial defect detection method with improved masked autoencoder
    Kaili DENG, Weibo WEI, Zhenkuan PAN
    Journal of Computer Applications    2024, 44 (8): 2595-2603.   DOI: 10.11772/j.issn.1001-9081.2023081122

    Considering the problem of missed detection or over-detection in existing defect detection methods that require only normal samples, a method combining an improved masked autoencoder with an improved Unet was constructed to achieve pixel-level defect detection. Firstly, a defect fitting module was used to generate the defect mask image and the defect image corresponding to a normal image. Secondly, the defect image was masked randomly to remove most of the defect information from it, which encouraged the autoencoder with Transformer structure to learn representations from the unmasked normal regions and to repair the defect image based on context. To improve the model's ability to repair image details, a new loss function was designed. Finally, to achieve pixel-level defect detection, the defect image and the repaired image were concatenated and input into the Unet with channel cross-fusion Transformer structure. Experimental results on the MVTec AD dataset show that the average image-based and pixel-based Area Under the Receiver Operating Characteristic curve (ROC AUC) values of the proposed method reached 0.984 and 0.982 respectively, which were 2.9 and 3.2 percentage points higher than those of DRAEM (Discriminatively trained Reconstruction Anomaly Embedding Model), and 3.1 and 0.8 percentage points higher than those of CFLOW-AD (Anomaly Detection via Conditional normalizing FLOWs). This verifies that the proposed method has a high recognition rate and detection accuracy.
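    Illustrative only (the actual repair model is the Transformer autoencoder described above): random patch masking of the kind used to hide defect information, and the pixel-level defect evidence obtainable by comparing the input with its repaired version. Patch size and mask ratio are arbitrary choices:

```python
import numpy as np

def random_patch_mask(h, w, patch, ratio, rng):
    """Randomly select a fraction of non-overlapping patches to mask,
    returning a binary (h, w) mask (1 = masked / to be repaired)."""
    gh, gw = h // patch, w // patch
    idx = rng.permutation(gh * gw)[: int(gh * gw * ratio)]
    mask = np.zeros((h, w))
    for k in idx:
        r, c = divmod(k, gw)
        mask[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = 1.0
    return mask

def anomaly_map(defect_img, repaired_img):
    """Pixel-level defect evidence: where the context-based repair
    disagrees with the input, a defect is likely."""
    return np.abs(defect_img - repaired_img)

rng = np.random.default_rng(1)
m = random_patch_mask(32, 32, 8, 0.75, rng)
print(m.mean())  # 0.75 -> three quarters of the pixels are masked
```

In the paper this difference signal is not thresholded directly; the defect image and repaired image are concatenated and passed to the Unet head for the final pixel-level decision.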

    Review of online education learner knowledge tracing
    Yajuan ZHAO, Fanjun MENG, Xingjian XU
    Journal of Computer Applications    2024, 44 (6): 1683-1698.   DOI: 10.11772/j.issn.1001-9081.2023060852

    Knowledge Tracing (KT) is a fundamental and challenging task in online education: it builds a model of a learner's knowledge state from the learning history, by which learners can better understand their own knowledge states, while teachers can better understand the learning situation of learners. The KT research for learners in online education was summarized. Firstly, the main tasks and historical progress of KT were introduced. Subsequently, traditional KT models and deep learning KT models were explained. Furthermore, relevant datasets and evaluation metrics were summarized, alongside a compilation of the applications of KT. In conclusion, the current status of KT was summarized, and the limitations and future prospects of KT were discussed.
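    Among the traditional KT models such surveys cover, the best known is Bayesian Knowledge Tracing. A minimal sketch of its per-answer update (the parameter values here are arbitrary examples, not from any particular study):

```python
def bkt_update(p_know, correct, slip=0.1, guess=0.2, learn=0.15):
    """One Bayesian Knowledge Tracing step: posterior probability of
    mastery given the observed answer (accounting for slip and guess),
    followed by a chance of learning on this practice opportunity."""
    if correct:
        num = p_know * (1 - slip)
        post = num / (num + (1 - p_know) * guess)
    else:
        num = p_know * slip
        post = num / (num + (1 - p_know) * (1 - guess))
    return post + (1 - post) * learn

# Mastery estimate rises with correct answers, dips after a wrong one.
p = 0.3
for obs in [True, True, False, True]:
    p = bkt_update(p, obs)
```

Deep learning KT models replace this hand-set two-state update with learned representations of the interaction history, but the task they solve is the same: track the evolving knowledge state from observed responses.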

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
Address:
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
  028-85222239-803
Website: www.joca.cn
E-mail: bjb@joca.cn