Most Read articles

    Embedded road crack detection algorithm based on improved YOLOv8
    Huantong GENG, Zhenyu LIU, Jun JIANG, Zichen FAN, Jiaxing LI
    Journal of Computer Applications    2024, 44 (5): 1613-1618.   DOI: 10.11772/j.issn.1001-9081.2023050635

    Deploying the YOLOv8L model on edge devices for road crack detection can achieve high accuracy, but it is difficult to guarantee real-time detection. To solve this problem, a target detection algorithm based on the improved YOLOv8 model that can be deployed on the edge computing device Jetson AGX Xavier was proposed. Firstly, the Faster Block structure was designed using partial convolution to replace the Bottleneck structure in the YOLOv8 C2f module, and the improved C2f module was denoted C2f-Faster; secondly, an SE (Squeeze-and-Excitation) channel attention layer was connected after each C2f-Faster module in the YOLOv8 backbone network to further improve the detection accuracy. Experimental results on the open source road damage dataset RDD20 (Road Damage Detection 20) show that the average F1 score of the proposed method is 0.573, the detection speed is 47 Frames Per Second (FPS), and the model size is 55.5 MB. Compared with the SOTA (State-Of-The-Art) model of GRDDC2020 (Global Road Damage Detection Challenge 2020), the F1 score is increased by 0.8 percentage points, the FPS is increased by 291.7%, and the model size is reduced by 41.8%, realizing real-time and accurate detection of road cracks on edge devices.
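    The two modules named above are standard building blocks; the sketch below is a minimal PyTorch illustration (not the authors' code) of a FasterNet-style partial-convolution block followed by an SE channel attention layer, with the channel split ratio and hidden sizes as assumptions.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Apply a 3x3 conv to only a fraction of the channels; pass the rest through untouched."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = max(1, int(channels * ratio))
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1, bias=False)

    def forward(self, x):
        x1, x2 = x[:, :self.conv_ch], x[:, self.conv_ch:]
        return torch.cat((self.conv(x1), x2), dim=1)

class FasterBlock(nn.Module):
    """Partial conv followed by a 1x1 expand/project pair with a residual connection."""
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PartialConv(channels)
        self.pw = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
        )

    def forward(self, x):
        return x + self.pw(self.pconv(x))

class SEAttention(nn.Module):
    """Squeeze-and-Excitation: global average pool -> bottleneck MLP -> channel rescaling."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))           # (N, C) channel descriptors
        return x * w.unsqueeze(-1).unsqueeze(-1)  # rescale each channel map

x = torch.randn(1, 64, 80, 80)
y = SEAttention(64)(FasterBlock(64)(x))  # Faster Block followed by SE attention
print(y.shape)  # torch.Size([1, 64, 80, 80])
```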

    Technology application prospects and risk challenges of large language models
    Yuemei XU, Ling HU, Jiayi ZHAO, Wanze DU, Wenqing WANG
    Journal of Computer Applications    2024, 44 (6): 1655-1662.   DOI: 10.11772/j.issn.1001-9081.2023060885

    In view of the rapid development of Large Language Model (LLM) technology, a comprehensive analysis was conducted on its technical application prospects and risk challenges, which is of great reference value for the development and governance of Artificial General Intelligence (AGI). Firstly, with representative language models such as Multi-BERT (Multilingual Bidirectional Encoder Representations from Transformer), GPT (Generative Pre-trained Transformer) and ChatGPT (Chat Generative Pre-trained Transformer) as examples, the development process, key technologies and evaluation systems of LLMs were reviewed. Then, a detailed analysis of the technical limitations and security risks of LLMs was conducted. Finally, suggestions were put forward for the technical improvement and policy follow-up of LLMs. The analysis indicates that, as a still-developing technology, current LLMs produce non-truthful and biased output, lack real-time autonomous learning ability, require huge computing power, rely heavily on data quality and quantity, and tend towards a monotonous language style. They carry security risks related to data privacy, information security, ethics, and other aspects. Their future development can continue to improve technically, from "large-scale" to "lightweight", from "single-modal" to "multi-modal", and from "general-purpose" to "vertical"; in terms of timely policy follow-up, their application and development should be regulated by targeted regulatory measures.

    Review of YOLO algorithm and its applications to object detection in autonomous driving scenes
    Yaping DENG, Yingjiang LI
    Journal of Computer Applications    2024, 44 (6): 1949-1958.   DOI: 10.11772/j.issn.1001-9081.2023060889

    Object detection in autonomous driving scenes is one of the important research directions in computer vision. Research in this area focuses on ensuring real-time and accurate object detection by autonomous vehicles. In recent years, deep learning technology has developed rapidly, and its wide application in the field of autonomous driving has prompted substantial progress in this field. An analysis was conducted on the research status of object detection by YOLO (You Only Look Once) algorithms in the field of autonomous driving from the following four aspects. Firstly, the ideas and improvement methods of the single-stage YOLO series of detection algorithms were summarized, and the advantages and disadvantages of the YOLO series of algorithms were analyzed. Secondly, the YOLO algorithm-based object detection applications in autonomous driving scenes were introduced, and the research status and applications for the detection and recognition of traffic vehicles, pedestrians, and traffic signals were expounded and summarized respectively. Additionally, the commonly used evaluation indicators in object detection, as well as the object detection datasets and autonomous driving scene datasets, were summarized. Lastly, the problems and future development directions of object detection were discussed.

    Survey of incomplete multi-view clustering
    Yao DONG, Yixue FU, Yongfeng DONG, Jin SHI, Chen CHEN
    Journal of Computer Applications    2024, 44 (6): 1673-1682.   DOI: 10.11772/j.issn.1001-9081.2023060813

    Multi-view clustering has recently been a hot topic in graph data mining. However, due to the limitations of data collection technology or human factors, multi-view data often has the problem of missing views or samples. Reducing the impact of incomplete views on clustering performance is a major challenge currently faced by multi-view clustering. In order to better understand the development of Incomplete Multi-view Clustering (IMC) in recent years, a comprehensive review is of great theoretical significance and practical value. Firstly, the missing types of incomplete multi-view data were summarized and analyzed. Secondly, four types of IMC methods, based on Multiple Kernel Learning (MKL), Matrix Factorization (MF) learning, deep learning, and graph learning were compared, and the technical characteristics and differences among the methods were analyzed. Thirdly, from the perspectives of dataset types, the numbers of views and categories, and application fields, twenty-two public incomplete multi-view datasets were summarized. Then, the evaluation metrics were outlined, and the performance of existing incomplete multi-view clustering methods on homogeneous and heterogeneous datasets were evaluated. Finally, the existing problems, future research directions, and existing application fields of incomplete multi-view clustering were discussed.

    Survey on hypergraph application methods: issues, advances, and challenges
    Li ZENG, Jingru YANG, Gang HUANG, Xiang JING, Chaoran LUO
    Journal of Computer Applications    2024, 44 (11): 3315-3326.   DOI: 10.11772/j.issn.1001-9081.2023111629

    A hypergraph is a generalization of a graph, which has significant advantages over an ordinary graph in representing higher-order features of complex relationships. As a relatively new data structure, the hypergraph is playing an increasingly crucial role in various application fields. By appropriately using hypergraph models and algorithms, specific problems in the real world can be modeled and solved with higher efficiency and quality. Existing surveys of hypergraphs mainly focus on the theory and techniques of hypergraphs themselves, and lack a summary of modeling and solving methods in specific scenarios. To this end, after summarizing and introducing some fundamental concepts of hypergraphs, the application methods, techniques, common issues, and solutions of hypergraphs in various application scenarios were analyzed; by summarizing the existing work, some problems and obstacles that still exist in the applications of hypergraphs to real-world problems were elaborated. Finally, the future research directions of hypergraph applications were prospected.
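    As a concrete illustration of the fundamental concepts mentioned above, the short numpy sketch below (an illustration chosen here, not taken from the survey) stores a hypergraph as a vertex-hyperedge incidence matrix and derives vertex degrees, hyperedge degrees, and a clique-expansion adjacency from it.

```python
import numpy as np

# 5 vertices, 3 hyperedges: e0 = {0, 1, 2}, e1 = {1, 3}, e2 = {2, 3, 4}
H = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
])

vertex_degree = H.sum(axis=1)  # number of hyperedges containing each vertex
edge_degree = H.sum(axis=0)    # number of vertices in each hyperedge

# Clique expansion: two vertices are adjacent if they share at least one hyperedge.
A = (H @ H.T > 0).astype(int)
np.fill_diagonal(A, 0)

print(vertex_degree)  # [1 2 2 2 1]
print(edge_degree)    # [3 2 3]
print(A)
```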

    Survey of visual object tracking methods based on Transformer
    Ziwen SUN, Lizhi QIAN, Chuandong YANG, Yibo GAO, Qingyang LU, Guanglin YUAN
    Journal of Computer Applications    2024, 44 (5): 1644-1654.   DOI: 10.11772/j.issn.1001-9081.2023060796

    Visual object tracking is one of the important tasks in computer vision. In order to achieve high-performance object tracking, a large number of object tracking methods have been proposed in recent years. Among them, Transformer-based object tracking methods have become a hot topic in the field of visual object tracking due to their ability to perform global modeling and capture contextual information. Firstly, existing Transformer-based visual object tracking methods were classified based on their network structures, an overview of the underlying principles and key techniques for model improvement was given, and the advantages and disadvantages of different network structures were summarized. Then, the experimental results of the Transformer-based visual object tracking methods on public datasets were compared to analyze the impact of network structure on performance, among which MixViT-L (ConvMAE) achieved tracking success rates of 73.3% and 86.1% on LaSOT and TrackingNet respectively, showing that object tracking methods based on a pure Transformer two-stage architecture have better performance and broader development prospects. Finally, the limitations of these methods, such as complex network structure, large number of parameters, high training requirements, and difficulty in deploying on edge devices, were summarized, and the future research focus was discussed: by combining model compression, self-supervised learning, and Transformer interpretability analysis, more feasible solutions for Transformer-based visual object tracking could be developed.

    Overview of research and application of knowledge graph in equipment fault diagnosis
    Jie WU, Ansi ZHANG, Maodong WU, Yizong ZHANG, Congbao WANG
    Journal of Computer Applications    2024, 44 (9): 2651-2659.   DOI: 10.11772/j.issn.1001-9081.2023091280

    Useful knowledge can be extracted from equipment fault diagnosis data to construct a knowledge graph, which can effectively manage complex equipment fault diagnosis information in the form of triples (entity, relationship, entity). This enables rapid diagnosis of equipment faults. Firstly, the related concepts of knowledge graph for equipment fault diagnosis were introduced, and the framework of the knowledge graph for the equipment fault diagnosis domain was analyzed. Secondly, the domestic and international research status of several key technologies, such as knowledge extraction, knowledge fusion and knowledge reasoning for equipment fault diagnosis knowledge graphs, was summarized. Finally, the applications of knowledge graph in equipment fault diagnosis were summarized, some shortcomings and challenges in the construction of knowledge graphs in this field were pointed out, and some new ideas were provided for the field of equipment fault diagnosis in the future.

    Underwater target detection algorithm based on improved YOLOv8
    Dahai LI, Bingtao LI, Zhendong WANG
    Journal of Computer Applications    2024, 44 (11): 3610-3616.   DOI: 10.11772/j.issn.1001-9081.2023111550

    Due to the unique characteristics of underwater creatures, underwater images usually contain many small targets that are hard to detect and often overlap with each other. In addition, light absorption and scattering in the underwater environment cause color cast and blurring in underwater images. To overcome these challenges, an underwater target detection algorithm, namely WCA-YOLOv8, was proposed. Firstly, the Feature Fusion Module (FFM) was designed to improve the focus on the spatial dimension, and thereby the recognition ability for targets with color cast and blurring. Secondly, the FReLU Coordinate Attention (FCA) module was added to enhance the feature extraction ability for overlapped and occluded underwater targets. Thirdly, the Complete Intersection over Union (CIoU) loss function was replaced by the Wise-IoU version 3 (WIoU v3) loss function to strengthen the detection performance for small-size targets. Finally, the Downsampling Enhancement Module (DEM) was designed to preserve context information more completely during feature extraction. Experimental results show that WCA-YOLOv8 achieves mean Average Precision (mAP0.5) of 75.8% and 88.6% and detection speeds of 60 frame/s and 57 frame/s on the RUOD and URPC datasets, respectively. Compared with other state-of-the-art underwater target detection algorithms, WCA-YOLOv8 achieves higher detection accuracy with faster detection speed.
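    The FCA module builds on coordinate attention; the sketch below is a minimal PyTorch implementation of standard coordinate attention only (the paper additionally involves FReLU, and the exact FCA design is not reproduced here), with the reduction ratio as an assumption.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        hidden = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, hidden, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(hidden, channels, 1)
        self.conv_w = nn.Conv2d(hidden, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Pool along each spatial direction separately to keep positional information.
        x_h = x.mean(dim=3, keepdim=True)                       # (N, C, H, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (N, C, W, 1)
        y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))  # (N, hidden, H+W, 1)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * a_h * a_w

x = torch.randn(2, 64, 32, 32)
print(CoordinateAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```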

    Logo detection algorithm based on improved YOLOv5
    Yeheng LI, Guangsheng LUO, Qianmin SU
    Journal of Computer Applications    2024, 44 (8): 2580-2587.   DOI: 10.11772/j.issn.1001-9081.2023081113

    To address the challenges posed by complex backgrounds and the varying sizes of logo images, an improved detection algorithm based on YOLOv5 was proposed. Firstly, in combination with the Convolutional Block Attention Module (CBAM), compression was applied in both the channel and spatial dimensions of the image to extract critical information and significant regions within the image. Subsequently, the Switchable Atrous Convolution (SAC) was employed to allow the network to adaptively adjust the receptive field size in feature maps at different scales, improving the detection of objects across multiple scales. Finally, the Normalized Wasserstein Distance (NWD) was embedded into the loss function: the bounding boxes were modeled as 2D Gaussian distributions, and the similarity between the corresponding Gaussian distributions was calculated to better measure the similarity among objects, thereby enhancing the detection performance for small objects and improving model robustness and stability. Compared with the original YOLOv5 algorithm: on the small dataset FlickrLogos-32, the improved algorithm achieved a mean Average Precision (mAP@0.5) of 90.6%, an increase of 1 percentage point; on the large dataset QMULOpenLogo, the improved algorithm achieved an mAP@0.5 of 62.7%, an increase of 2.3 percentage points; on three types of logos in LogoDet3K, the improved algorithm increased the mAP@0.5 by 1.2, 1.4, and 1.4 percentage points respectively. Experimental results demonstrate that the improved algorithm has better small object detection ability for logo images.
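    The NWD idea referenced above can be written compactly: each box (cx, cy, w, h) is treated as a 2D Gaussian N((cx, cy), diag(w^2/4, h^2/4)), and the Wasserstein-2 distance between the two Gaussians is mapped to a (0, 1] similarity. The sketch below follows the commonly used formulation; the normalizing constant C is a dataset-dependent assumption.

```python
import math

def nwd(box1, box2, C: float = 12.8):
    """Normalized Wasserstein Distance similarity between two (cx, cy, w, h) boxes."""
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    # Squared 2-Wasserstein distance between the two Gaussians modeling the boxes.
    w2_dist_sq = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2 \
                 + ((w1 - w2) ** 2 + (h1 - h2) ** 2) / 4.0
    return math.exp(-math.sqrt(w2_dist_sq) / C)

print(nwd((10, 10, 8, 6), (12, 11, 8, 6)))  # nearby boxes -> similarity near 1
print(nwd((10, 10, 8, 6), (80, 90, 8, 6)))  # distant boxes -> similarity near 0
```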

    Development, technologies and applications of blockchain 3.0
    Peng FANG, Fan ZHAO, Baoquan WANG, Yi WANG, Tonghai JIANG
    Journal of Computer Applications    2024, 44 (12): 3647-3657.   DOI: 10.11772/j.issn.1001-9081.2023121826

    Blockchain 3.0 is the third stage of the development of blockchain technology and the core of building a value Internet. Its innovations in sharding, cross-chain and privacy protection have given it a wide range of application scenarios and research value, and it is highly valued in academia and industry. Concerning the development, technologies and applications of blockchain 3.0, the relevant domestic and international literature on blockchain 3.0 in the past five years was surveyed and reviewed. Firstly, the basic theory and technical characteristics of blockchain were introduced, laying the foundation for an in-depth understanding of the research progress of blockchain. Subsequently, based on the evolution of blockchain technology over time, the development process and key milestones of blockchain 3.0 were described, and the reasons for dividing blockchain development into different stages, with sharding and side-chain technologies as benchmarks, were explained. Then, the current research status of the key technologies of blockchain 3.0 was analyzed in detail, and typical applications of blockchain 3.0 in six major fields such as internet of things, medical care, and agriculture were summarized. Finally, the key challenges and future development opportunities faced by blockchain 3.0 in its development process were summed up.

    Review of evolutionary multitasking from the perspective of optimization scenarios
    Jiawei ZHAO, Xuefeng CHEN, Liang FENG, Yaqing HOU, Zexuan ZHU, Yew‑Soon Ong
    Journal of Computer Applications    2024, 44 (5): 1325-1337.   DOI: 10.11772/j.issn.1001-9081.2024020208

    Due to the escalating complexity of optimization problems, traditional evolutionary algorithms increasingly struggle with high computational costs and limited adaptability. Evolutionary MultiTasking Optimization (EMTO) algorithms have emerged as a novel solution, leveraging knowledge transfer to tackle multiple optimization issues concurrently, thereby enhancing evolutionary algorithms’ efficiency in complex scenarios. The current progression of evolutionary multitasking optimization research was summarized, and different research perspectives were explored by reviewing existing literature and highlighting the notable absence of optimization scenario analysis. By focusing on the application scenarios of optimization problems, the scenarios suitable for evolutionary multitasking optimization and their fundamental solution strategies were systematically outlined. This study thus could aid researchers in selecting the appropriate methods based on specific application needs. Moreover, an in-depth discussion on the current challenges and future directions of EMTO were also presented to provide guidance and insights for advancing research in this field.

    Time series classification method based on multi-scale cross-attention fusion in time-frequency domain
    Mei WANG, Xuesong SU, Jia LIU, Ruonan YIN, Shan HUANG
    Journal of Computer Applications    2024, 44 (6): 1842-1847.   DOI: 10.11772/j.issn.1001-9081.2023060731

    To address the problem of low classification accuracy caused by insufficient potential information interaction between time series subsequences, a time series classification method based on multi-scale cross-attention fusion in time-frequency domain called TFFormer (Time-Frequency Transformer) was proposed. First, time and frequency spectrums of the original time series were divided into subsequences with the same length respectively, and the point-value coupling problem was solved by adding positional embedding after linear projection. Then, the long-term time series dependency problem was solved because the model was made to focus on more important time series features by Improved Multi-Head self-Attention (IMHA) mechanism. Finally, a multi-scale Cross-Modality Attention (CMA) module was proposed to enhance the interaction between the time domain and frequency domain, so that the model could further mine the frequency information of the time series. The experimental results show that compared with Fully Convolutional Network (FCN), the classification accuracy of the proposed method on Trace, StarLightCurves and UWaveGestureLibraryAll datasets increased by 0.3, 0.9 and 1.4 percentage points. It is proved that by enhancing the information interaction between time domain and frequency domain of the time series, the model convergence speed and classification accuracy can be improved.
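    A simplified sketch of the core idea above: tokens from the raw series and from its FFT magnitude spectrum interact through multi-head cross-attention. This is an illustrative assumption of how such a module can be wired, not the TFFormer implementation; the patch length and model width are arbitrary.

```python
import torch
import torch.nn as nn

class TimeFreqCrossAttention(nn.Module):
    def __init__(self, patch_len: int = 16, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.patch_len = patch_len
        self.time_proj = nn.Linear(patch_len, d_model)
        self.freq_proj = nn.Linear(patch_len, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def _patchify(self, x):
        # Split a (N, L) sequence into non-overlapping patches of length patch_len.
        n, length = x.shape
        x = x[:, : length - length % self.patch_len]
        return x.reshape(n, -1, self.patch_len)

    def forward(self, series):                               # series: (N, L)
        time_tokens = self.time_proj(self._patchify(series))
        spectrum = torch.fft.rfft(series, dim=1).abs()       # frequency-domain magnitude
        freq_tokens = self.freq_proj(self._patchify(spectrum))
        # Time tokens query the frequency tokens (cross-modality attention).
        fused, _ = self.cross_attn(time_tokens, freq_tokens, freq_tokens)
        return fused                                          # (N, num_patches, d_model)

x = torch.randn(8, 128)
print(TimeFreqCrossAttention()(x).shape)  # torch.Size([8, 8, 64])
```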

    Research review of multitasking optimization algorithms and applications
    Yue WU, Hangqi DING, Hao HE, Shunjie BI, Jun JIANG, Maoguo GONG, Qiguang MIAO, Wenping MA
    Journal of Computer Applications    2024, 44 (5): 1338-1347.   DOI: 10.11772/j.issn.1001-9081.2024020209

    Evolutionary MultiTasking Optimization (EMTO) is one of the new methods in evolutionary computing, which can simultaneously solve multiple related optimization tasks and enhance the optimization of each task through knowledge transfer between tasks. In recent years, more and more research on evolutionary multitasking optimization has been devoted to utilizing its powerful parallel search capability and potential for reducing computational costs to optimize various problems, and EMTO has been used in a variety of real-world scenarios. The researches and applications of EMTO were discussed from four aspects: principle, core design, applications, and challenges. Firstly, the general classification of EMTO was introduced from two levels and four aspects, including single-population multitasking, multi-population multitasking, auxiliary task, and multiform task. Next, the core component design of EMTO was introduced, including task construction and knowledge transfer. Finally, its various application scenarios were introduced and a summary and outlook for future research was provided.

    Small object detection algorithm from drone perspective based on improved YOLOv8n
    Tao LIU, Shihong JU, Yimeng GAO
    Journal of Computer Applications    2024, 44 (11): 3603-3609.   DOI: 10.11772/j.issn.1001-9081.2023111644

    In view of the low accuracy of object detection algorithms in small object detection from the drone perspective, a new small object detection algorithm named SFM-YOLOv8 was proposed by improving the backbone network and attention mechanism of YOLOv8. Firstly, the SPace-to-Depth Convolution (SPDConv), suited to low-resolution images and small object detection, was integrated into the backbone network to retain discriminative feature information and improve the perception of small objects. Secondly, a multi-branch attention named MCA (Multiple Coordinate Attention) was introduced to enhance the spatial and channel information on the feature layer. Then, a convolution module FE-C2f, fusing FasterNet and Efficient Multi-scale Attention (EMA), was constructed to reduce the computational cost and make the model lightweight. Besides, a Minimum Point Distance based Intersection over Union (MPDIoU) loss function was introduced to improve the accuracy of the algorithm. Finally, a small object detection layer was added to the network structure of YOLOv8n to retain more location information and detailed features of small objects. Experimental results show that, compared with YOLOv8n, SFM-YOLOv8 achieves a 4.37 percentage point increase in mAP50 (mean Average Precision) with a 5.98% reduction in parameters on the VisDrone-DET2019 dataset. Compared with related mainstream models, SFM-YOLOv8 achieves higher accuracy and meets real-time detection requirements.
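    The space-to-depth convolution mentioned above replaces strided downsampling with a lossless rearrangement of pixels into channels. The sketch below is a minimal PyTorch version in that spirit (not the exact SFM-YOLOv8 module); the 2x2 block size and activation are assumptions.

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Space-to-depth followed by a non-strided conv, so downsampling discards no pixels."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.space_to_depth = nn.PixelUnshuffle(2)  # (N, C, H, W) -> (N, 4C, H/2, W/2)
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch * 4, out_ch, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.SiLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(self.space_to_depth(x))

x = torch.randn(1, 32, 64, 64)
print(SPDConv(32, 64)(x).shape)  # torch.Size([1, 64, 32, 32])
```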

    Survey and prospect of large language models
    Xiaolin QIN, Xu GU, Dicheng LI, Haiwen XU
    Journal of Computer Applications    2025, 45 (3): 685-696.   DOI: 10.11772/j.issn.1001-9081.2025010128

    Large Language Models (LLMs) are a class of language models composed of artificial neural networks with a vast number of parameters (typically billions of weights or more). They are trained on a large amount of unlabeled text using self-supervised or semi-supervised learning and are the core of current generative Artificial Intelligence (AI) technologies. Compared to traditional language models, LLMs demonstrate stronger language understanding and generation capabilities, supported by substantial computational power, extensive parameters, and large-scale data. They are widely applied in tasks such as machine translation, question answering systems, and dialogue generation with good performance. Most of the existing surveys focus on the theoretical construction and training techniques of LLMs, while systematic exploration of LLMs’ industry-level application practices and evolution of the technological ecosystem remains insufficient. Therefore, based on introducing the foundational architecture, training techniques, and development history of LLMs, the current general key technologies in LLMs and advanced integration technologies with LLMs bases were analyzed. Then, by summarizing the existing research, challenges faced by LLMs in practical applications were further elaborated, including problems such as data bias, model hallucination, and computational resource consumption, and an outlook was provided on the ongoing development trends of LLMs.

    Correlation filtering based target tracking with nonlinear temporal consistency
    Wentao JIANG, Wanxuan LI, Shengchong ZHANG
    Journal of Computer Applications    2024, 44 (8): 2558-2570.   DOI: 10.11772/j.issn.1001-9081.2023081121

    Concerning the problem that existing target tracking algorithms mainly use the linear constraint mechanism LADCF (Learning Adaptive Discriminative Correlation Filters), which easily causes model drift, a correlation filtering based target tracking algorithm with nonlinear temporal consistency was proposed. First, a nonlinear temporal consistency term was proposed based on Stevens’ Law, which aligned closely with the characteristics of human visual perception. The nonlinear temporal consistency term allowed the model to track the target relatively smoothly, thus ensuring tracking continuity and preventing model drift. Next, the Alternating Direction Method of Multipliers (ADMM) was employed to compute the optimal function value, ensuring real-time tracking of the algorithm. Lastly, Stevens’ Law was used for nonlinear filter updating, enabling the filter update factor to enhance and suppress the filter according to the change of the target, thereby adapting to target changes and preventing filter degradation. Comparison experiments with mainstream correlation filtering and deep learning algorithms were performed on four standard datasets. Compared with the baseline algorithm LADCF, the tracking precision and success rate of the proposed algorithm were improved by 2.4 and 3.8 percentage points on OTB100 dataset, and 1.5 and 2.5 percentage points on UAV123 dataset. The experimental results show that the proposed algorithm effectively avoids tracking model drift, reduces the likelihood of filter degradation, has higher tracking precision and success rate, and stronger robustness in complicated situations such as occlusion and illumination changes.

    Mobile robot 3D space path planning method based on deep reinforcement learning
    Tian MA, Runtao XI, Jiahao LYU, Yijie ZENG, Jiayi YANG, Jiehui ZHANG
    Journal of Computer Applications    2024, 44 (7): 2055-2064.   DOI: 10.11772/j.issn.1001-9081.2023060749

    Aiming at the problems of high complexity and uncertainty in 3D unknown environment, a mobile robot 3D path planning method based on deep reinforcement learning was proposed, under a limited observation space optimization strategy. First, the depth map information was used as the agent’s input in the limited observation space, which could simulate complex 3D space environments with limited and unknown movement conditions. Second, a two-stage action selection policy in discrete action space was designed, including directional actions and movement actions, which could reduce the searching steps and time. Finally, based on the Proximal Policy Optimization (PPO) algorithm, the Gated Recurrent Unit (GRU) was added to combine the historical state information, to enhance the policy stability in unknown environments, so that the accuracy and smoothness of the planned path could be improved. The experimental results show that, compared with Advantage Actor-Critic (A2C), the average search time is reduced by 49.07% and the average planned path length is reduced by 1.04%. Meanwhile, the proposed method can complete the multi-objective path planning tasks under linear sequential logic constraints.
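    As a sketch of the recurrent policy described above (PPO with a GRU over historical state), the PyTorch snippet below wires a GRU between an observation encoder and discrete actor/critic heads; the layer sizes and discrete action space are assumptions, and the PPO update itself is omitted.

```python
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.gru = nn.GRU(hidden, hidden, batch_first=True)  # carries historical state
        self.actor = nn.Linear(hidden, n_actions)            # discrete action logits
        self.critic = nn.Linear(hidden, 1)                   # state-value estimate

    def forward(self, obs_seq, h0=None):
        # obs_seq: (N, T, obs_dim); h0: (1, N, hidden) or None
        z, h = self.gru(self.encoder(obs_seq), h0)
        return self.actor(z), self.critic(z), h

net = RecurrentActorCritic(obs_dim=36, n_actions=6)
logits, value, h = net(torch.randn(4, 10, 36))
print(logits.shape, value.shape)  # torch.Size([4, 10, 6]) torch.Size([4, 10, 1])
```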

    Class-imbalanced traffic abnormal detection based on 1D-CNN and BiGRU
    Hong CHEN, Bing QI, Haibo JIN, Cong WU, Li’ang ZHANG
    Journal of Computer Applications    2024, 44 (8): 2493-2499.   DOI: 10.11772/j.issn.1001-9081.2023081112

    Network traffic anomaly detection is a network security defense method that analyzes and judges network traffic to identify potential attacks. A new approach was proposed to address the low detection accuracy and high false positive rate caused by imbalanced, high-dimensional network traffic data and diverse attack categories. A One-Dimensional Convolutional Neural Network (1D-CNN) and a Bidirectional Gated Recurrent Unit (BiGRU) were combined to construct a model for traffic anomaly detection. For class-imbalanced data, balanced processing was performed by using an improved Synthetic Minority Oversampling TEchnique (SMOTE), namely Borderline-SMOTE, and an undersampling clustering technique based on the Gaussian Mixture Model (GMM). Subsequently, the one-dimensional CNN was utilized to extract local features in the data, and the BiGRU was used to better extract the time series features in the data. Finally, the proposed model was evaluated on the UNSW-NB15 dataset, achieving an accuracy of 98.12% and a false positive rate of 1.28%. The experimental results demonstrate that the proposed model outperforms other classic machine learning and deep learning models: it improves the recognition rate for minority-class attacks and achieves higher overall detection accuracy.
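    A compact PyTorch sketch of such a 1D-CNN + BiGRU classifier is given below; the layer sizes are assumptions, and class balancing (e.g. Borderline-SMOTE from the imblearn package) would be applied to the training features beforehand.

```python
import torch
import torch.nn as nn

class CNNBiGRU(nn.Module):
    def __init__(self, n_features: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(           # local patterns over the feature sequence
            nn.Conv1d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.bigru = nn.GRU(32, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                    # x: (N, n_features)
        z = self.cnn(x.unsqueeze(1))         # (N, 32, n_features // 2)
        z, _ = self.bigru(z.transpose(1, 2)) # (N, L, 2 * hidden)
        return self.fc(z[:, -1])             # classify from the last step

model = CNNBiGRU(n_features=42, n_classes=10)
print(model(torch.randn(8, 42)).shape)  # torch.Size([8, 10])
```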

    Survey of code similarity detection technology
    Xiangjie SUN, Qiang WEI, Yisen WANG, Jiang DU
    Journal of Computer Applications    2024, 44 (4): 1248-1258.   DOI: 10.11772/j.issn.1001-9081.2023040551

    Code reuse not only brings convenience to software development, but also introduces security risks, such as accelerating vulnerability propagation and malicious code plagiarism. Code similarity detection technology calculates code similarity by analyzing lexical, syntactic, semantic and other information between code fragments. It is one of the most effective technologies for judging code reuse, and it is also a program security analysis technology that has developed rapidly in recent years. First, the latest technical progress of code similarity detection was systematically reviewed, and current code similarity detection technology was classified: according to whether the target code is open source, it was divided into source code similarity detection and binary code similarity detection, and a further subdivision was made according to programming languages and instruction sets. Then, the ideas and research results of each technology were summarized, successful cases of machine learning technology in the field of code similarity detection were analyzed, and the advantages and disadvantages of existing technologies were discussed. Finally, the development trend of code similarity detection technology was given to provide a reference for relevant researchers.

    Re-weighted adversarial variational autoencoder and its application in industrial causal effect estimation
    Zongyu LI, Siwei QIANG, Xiaobo GUO, Zhenfeng ZHU
    Journal of Computer Applications    2024, 44 (4): 1099-1106.   DOI: 10.11772/j.issn.1001-9081.2023050557

    Counterfactual prediction and selection bias are major challenges in causal effect estimation. To effectively represent the complex mixed distribution of latent covariates and enhance the generalization ability of counterfactual prediction, a Re-weighted adversarial Variational AutoEncoder Network (RVAENet) model was proposed for industrial causal effect estimation. To address the bias problem in the mixed distribution, the idea of domain adaptation was adopted, and an adversarial learning mechanism was used to balance the representation learning distribution of the latent variables obtained by the Variational AutoEncoder (VAE). Furthermore, sample propensity weights were learned to re-weight the samples, reducing the distribution difference between the treatment group and the control group. The experimental results show that, in two scenarios on real-world industrial datasets, the Areas Under Uplift Curve (AUUC) of the proposed model are improved by 15.02% and 16.02% compared to TEDVAE (Treatment Effect with Disentangled VAE). On the public datasets, the proposed model generally achieves optimal results for Average Treatment Effect (ATE) and Precision in Estimation of Heterogeneous Effect (PEHE).

    Generative label adversarial text classification model
    Xun YAO, Zhongzheng QIN, Jie YANG
    Journal of Computer Applications    2024, 44 (6): 1781-1785.   DOI: 10.11772/j.issn.1001-9081.2023050662

    Text classification is a fundamental task in Natural Language Processing (NLP), aiming to assign text data to predefined categories. The combination of Graph Convolutional neural Network (GCN) and the large-scale pre-trained model BERT (Bidirectional Encoder Representations from Transformer) has achieved excellent results in text classification tasks. However, the undirected information transmission of GCN in large-scale heterogeneous graphs produces information noise, which affects the judgment of the model and reduces its classification ability. To solve this problem, a generative label adversarial model, the Class Adversarial Graph Convolutional Network (CAGCN) model, was proposed to reduce the interference of irrelevant information during classification and improve the classification performance of the model. Firstly, the composition method in TextGCN (Text Graph Convolutional Network) was used to construct the adjacency matrix, which was combined with the GCN and BERT models as a Class Generator (CG). Secondly, the pseudo-label feature training method was used in model training to construct a cluster, and the cluster and the class generator were jointly trained. Finally, experiments were carried out on several widely used datasets. Experimental results show that the classification accuracy of the CAGCN model is 1.2, 0.1, 0.5, 1.7 and 0.5 percentage points higher than that of the RoBERTaGCN model on the widely used classification datasets 20NG, R8, R52, Ohsumed and MR, respectively.
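    For reference, the graph convolution that both TextGCN and the model above build on is a symmetrically normalized adjacency propagation; a minimal dense-tensor sketch (an illustration, not the CAGCN code) follows.

```python
import torch
import torch.nn as nn

def normalize_adjacency(adj: torch.Tensor) -> torch.Tensor:
    """Compute D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    a_hat = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)

class GCNLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, norm_adj):
        # Propagate transformed node features over the normalized graph.
        return torch.relu(norm_adj @ self.linear(x))

adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
x = torch.randn(3, 8)  # node features (e.g. document embeddings)
out = GCNLayer(8, 4)(x, normalize_adjacency(adj))
print(out.shape)  # torch.Size([3, 4])
```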

    Lightweight algorithm for impurity detection in raw cotton based on improved YOLOv7
    Yongjin ZHANG, Jian XU, Mingxing ZHANG
    Journal of Computer Applications    2024, 44 (7): 2271-2278.   DOI: 10.11772/j.issn.1001-9081.2023070969

    To address the challenges posed by the high throughput of raw cotton and the long impurity inspection time in cotton mills, an improved YOLOv7 model incorporating lightweight modifications was proposed for impurity detection in raw cotton. Initially, redundant convolutional layers within the YOLOv7 model were pruned, thereby increasing detection speed. Following this, the FasterNet convolutional layer was integrated into the primary network to mitigate the model computational load, diminish redundancy in feature maps, and consequently realize real-time detection. Ultimately, CSP-RepFPN (Cross Stage Partial networks with Replicated Feature Pyramid Network) was used within the neck network to reconstruct the feature pyramid, augment the flow of feature information, minimize feature loss, and elevate detection precision. Experimental results show that the improved YOLOv7 model achieves a detection mean Average Precision of 96.0% with a 37.5% reduction in detection time on a self-made raw cotton impurity dataset, and achieves a detection accuracy of 82.5% with a detection time of only 29.8 ms on the public DWC (Drinking Waste Classification) dataset. The improved YOLOv7 model provides a lightweight approach for the real-time detection, recognition and classification of impurities in raw cotton, yielding substantial time savings.

    Proximal policy optimization algorithm based on clipping optimization and policy guidance
    Yi ZHOU, Hua GAO, Yongshen TIAN
    Journal of Computer Applications    2024, 44 (8): 2334-2341.   DOI: 10.11772/j.issn.1001-9081.2023081079

    To address two issues in the Proximal Policy Optimization (PPO) algorithm, namely the difficulty of strictly constraining the difference between old and new policies and the relatively low efficiency of exploration and exploitation, a PPO algorithm based on Clipping Optimization And Policy Guidance (COAPG-PPO) was proposed. Firstly, by analyzing the clipping mechanism of PPO, a trust-region clipping approach based on the Wasserstein distance was devised, strengthening the constraint on the difference between old and new policies. Secondly, within the policy updating process, ideas from simulated annealing and greedy algorithms were incorporated, improving the exploration efficiency and learning speed of the algorithm. To validate the effectiveness of the COAPG-PPO algorithm, comparative experiments were conducted on the MuJoCo benchmarks against PPO based on Clipping Optimization (CO-PPO), PPO with Covariance Matrix Adaptation (PPO-CMA), Trust Region-based PPO with RollBack (TR-PPO-RB), and the original PPO algorithm. The experimental results indicate that the COAPG-PPO algorithm demonstrates stricter constraint capability, higher exploration and exploitation efficiency, and higher reward values in most environments.
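    For context, the baseline that COAPG-PPO modifies is the standard PPO clipped surrogate objective; a minimal PyTorch sketch is shown below (the Wasserstein-based trust-region clipping itself is not reproduced).

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, eps: float = 0.2):
    """Standard PPO clipped surrogate loss (to be minimized)."""
    ratio = torch.exp(logp_new - logp_old)       # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean() # negate to maximize the surrogate

logp_old = torch.randn(64)
logp_new = logp_old + 0.05 * torch.randn(64)
adv = torch.randn(64)
print(ppo_clip_loss(logp_new, logp_old, adv))
```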

    Image super-resolution network based on global dependency Transformer
    Zihan LIU, Dengwen ZHOU, Yukai LIU
    Journal of Computer Applications    2024, 44 (5): 1588-1596.   DOI: 10.11772/j.issn.1001-9081.2023050636

    At present, the image super-resolution networks based on deep learning are mainly implemented by convolution. Compared with the traditional Convolutional Neural Network (CNN), the main advantage of Transformer in the image super-resolution task is its long-distance dependency modeling ability. However, most Transformer-based image super-resolution models cannot establish global dependencies with small parameters and few network layers, which limits the performance of the model. In order to establish global dependencies in super-resolution network, an image Super-Resolution network based on Global Dependency Transformer (GDTSR) was proposed. Its main component was the Residual Square Axial Window Block (RSAWB), and in Transformer residual layer, axial window and self-attention were used to make each pixel globally dependent on the entire feature map. In addition, the super-resolution image reconstruction modules of most current image super-resolution models are composed of convolutions. In order to dynamically integrate the extracted feature information, Transformer and convolution were combined to jointly reconstruct super-resolution images. Experimental results show that the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) of GDTSR on five standard test sets, including Set5, Set14, B100, Urban100 and Manga109, are optimal for three multiples (×2, ×3, ×4), and on large-scale datasets Urban100 and Manga109, the performance improvement is especially obvious.
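    A short numpy sketch of the PSNR metric used in the comparison above is given below (SSIM requires a windowed computation and is omitted).

```python
import numpy as np

def psnr(img1: np.ndarray, img2: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two images of the same shape."""
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

gt = np.random.randint(0, 256, (64, 64, 3))
noisy = np.clip(gt + np.random.normal(0, 5, gt.shape), 0, 255)
print(psnr(gt, noisy))  # roughly 34 dB for Gaussian noise with std 5
```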

    Personalized federated learning method based on dual stream neural network
    Zheyuan SHEN, Keke YANG, Jing LI
    Journal of Computer Applications    2024, 44 (8): 2319-2325.   DOI: 10.11772/j.issn.1001-9081.2023081207

    It is difficult for classic Federated Learning (FL) algorithms to achieve good results in scenarios where data is highly heterogeneous. Personalized FL (PFL) offers a new solution to the problem of data heterogeneity in federated learning, which is to "tailor" a dedicated model for each client. In this way, the models achieve good performance, but at the same time it becomes difficult to extend federated learning to new clients. Focusing on the challenges of performance and scalability in PFL, FedDual, an FL model with a dual-stream neural network structure, was proposed. By adding an encoder for analyzing the personalized characteristics of clients, this model not only has the performance of personalized models, but also can be extended to new clients easily. Experimental results show that, compared with the classic Federated Averaging (FedAvg) algorithm, FedDual clearly improves the accuracy on datasets such as MNIST and FashionMNIST, and improves the accuracy by more than 10 percentage points on the CIFAR10 dataset. Moreover, FedDual achieves "plug and play" for new clients without loss of accuracy, solving the problem of poor scalability to new clients.
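    For contrast with the personalized approach above, the baseline FedAvg aggregation step simply averages client parameters weighted by local sample counts; a minimal PyTorch sketch follows.

```python
import copy
import torch
import torch.nn as nn

def fedavg(client_states, client_sizes):
    """Weighted average of client state_dicts, weights proportional to sample counts."""
    total = float(sum(client_sizes))
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(state[key] * (n / total)
                       for state, n in zip(client_states, client_sizes))
    return avg

clients = [nn.Linear(10, 2) for _ in range(3)]          # stand-ins for local models
global_state = fedavg([c.state_dict() for c in clients], client_sizes=[100, 50, 150])
global_model = nn.Linear(10, 2)
global_model.load_state_dict(global_state)              # aggregated global model
```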

    Road damage detection algorithm based on enhanced feature extraction
    Wudan LONG, Bo PENG, Jie HU, Ying SHEN, Danni DING
    Journal of Computer Applications    2024, 44 (7): 2264-2270.   DOI: 10.11772/j.issn.1001-9081.2023070956

    In response to the challenge posed by the difficulty in detecting small road damage areas and the uneven distribution of damage categories, a road damage detection algorithm termed RDD-YOLO was introduced based on the YOLOv7-tiny architecture. Firstly, the K-means++ algorithm was employed to determine anchor boxes better conforming to object dimensions. Subsequently, a Quantization Aware RepVGG (QARepVGG) module was utilized within the auxiliary detection branch, thereby enhancing the extraction of shallow features. Concurrently, an Addition and Multiplication Convolutional Block Attention Module (AM-CBAM) was embedded into the three inputs of the neck, effectively suppressing disturbances arising from intricate background. Furthermore, the feature fusion module Res-RFB (Resblock with Receptive Field Block) was devised to emulate the expansion of receptive field in human visual perception, consequently fusing information across multiple scales and thereby amplifying representational aptitude. Additionally, a lightweight Small Decoupled Head (S-DeHead) was introduced to elevate the precision of detecting small objects. Ultimately, the process of localizing small objects was optimized through the application of the Normalized Wasserstein Distance (NWD) metric, which in turn mitigated the challenge of imbalanced samples. Experimental results show that RDD-YOLO algorithm achieves a notable 6.19 percentage points enhancement in mAP50, a 5.31 percentage points elevation in F1-Score and the detection velocity of 135.26 frame/s by only increasing 0.71×10⁶ parameters and 1.7 GFLOPs, which can meet the requirements for both accuracy and speed in road maintenance.

    Survey of extractive text summarization based on unsupervised learning and supervised learning
    Xiawuji, Heming HUANG, Gengzangcuomao, Yutao FAN
    Journal of Computer Applications    2024, 44 (4): 1035-1048.   DOI: 10.11772/j.issn.1001-9081.2023040537

    Different from generative summarization methods, extractive summarization methods are more feasible to implement, more readable, and more widely used. At present, the literatures on extractive summarization methods mostly analyze and review some specific methods or fields, and there is no multi-faceted and multi-lingual systematic review. Therefore, the meanings of text summarization generation were discussed, related literatures were systematically reviewed, and the methods of extractive text summarization based on unsupervised learning and supervised learning were analyzed multi-dimensionally and comprehensively. First, the development of text summarization techniques was reviewed, and different methods of extractive text summarization were analyzed, including the methods based on rules, Term Frequency-Inverse Document Frequency (TF-IDF), centrality, potential semantic, deep learning, graph sorting, feature engineering, and pre-training learning, etc. Also, comparisons of advantages and disadvantages among different algorithms were made. Secondly, datasets in different languages for text summarization and popular evaluation metrics were introduced in detail. Finally, problems and challenges for research of extractive text summarization were discussed, and solutions and research trends were presented.
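    As a concrete instance of the TF-IDF-based family of methods surveyed above, the short scikit-learn sketch below scores sentences by their average TF-IDF weight and keeps the top-scoring ones in their original order; the scoring rule and example text are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_extract(sentences, k: int = 2):
    """Return the k sentences with the highest average TF-IDF weight, in document order."""
    tfidf = TfidfVectorizer().fit_transform(sentences)  # (n_sentences, n_terms)
    scores = tfidf.mean(axis=1).A.ravel()               # average weight per sentence
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]

doc = [
    "Extractive summarization selects salient sentences from the source text.",
    "The weather was pleasant that afternoon.",
    "TF-IDF scores highlight terms frequent in a sentence but rare elsewhere.",
    "Sentences with high scores are kept as the summary.",
]
print(tfidf_extract(doc, k=2))
```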

    Multi-robot path following and formation based on deep reinforcement learning
    Haodong HE, Hao FU, Qiang WANG, Shuai ZHOU, Wei LIU
    Journal of Computer Applications    2024, 44 (8): 2626-2633.   DOI: 10.11772/j.issn.1001-9081.2023081120

    Aiming at the obstacle avoidance and trajectory smoothness problem of multi-robot path following and formation in crowd environment, a multi-robot path following and formation algorithm based on deep reinforcement learning was proposed. Firstly, a pedestrian danger priority mechanism was established, which was combined with reinforcement learning to design a danger awareness network to enhance the safety of multi-robot formation. Subsequently, a virtual robot was introduced as the reference target for multiple robots, thus transforming path following into tracking control of the virtual robot by the multiple robots, with the purpose of enhancing the smoothness of the robot trajectories. Finally, quantitative and qualitative analysis was conducted through simulation experiments to compare the proposed algorithm with existing ones. The experimental results show that compared with the existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments, which ensures the smoothness of multi-robot motion trajectories.

    Graph data generation approach for graph neural network model extraction attacks
    Ying YANG, Xiaoyan HAO, Dan YU, Yao MA, Yongle CHEN
    Journal of Computer Applications    2024, 44 (8): 2483-2492.   DOI: 10.11772/j.issn.1001-9081.2023081110

    Data-free model extraction attacks are a class of machine learning security problems in which the attacker has no knowledge of the training data required to carry out the attack. Aiming at the research gap concerning data-free model extraction attacks in the field of Graph Neural Networks (GNN), a GNN model extraction attack method was proposed. The graph node feature information and edge information were optimized with the graph neural network interpretability method GNNExplainer and the graph data augmentation method GAUG-M, respectively, so as to generate the required graph data and achieve the final GNN model extraction. Firstly, the GNNExplainer method was used to obtain the important graph node feature information from the interpretable analysis of the response results of the target model. Secondly, the overall optimization of the graph node feature information was achieved by up-weighting the important graph node features and down-weighting the unimportant ones. Then, a graph autoencoder was used as the edge information prediction module, which obtained the connection probability information between nodes according to the optimized graph node features. Finally, the edge information was optimized by adding or deleting the corresponding edges according to these probabilities. Three GNN model architectures trained on five graph datasets were used as the target models of extraction attacks in experiments, and the obtained substitute models achieve 73% to 87% accuracy in the node classification task and 76% to 89% fidelity with respect to the target model performance, which verifies the effectiveness of the proposed method.

    Real-time object detection algorithm for complex construction environments
    Xiaogang SONG, Dongdong ZHANG, Pengfei ZHANG, Li LIANG, Xinhong HEI
    Journal of Computer Applications    2024, 44 (5): 1605-1612.   DOI: 10.11772/j.issn.1001-9081.2023050687

    A real-time object detection algorithm YOLO-C for complex construction environments was proposed to address the problems of cluttered environments, occluded objects, large object scale ranges, imbalanced positive and negative samples, and the insufficient real-time performance of existing detection algorithms, which are common in construction environments. The extracted low-level features were fused with the high-level features to enhance the global sensing capability of the network, and a small object detection layer was designed to improve the detection accuracy of the algorithm for objects of different scales. A Channel-Spatial Attention (CSA) module was designed to enhance the object features and suppress the background features. In the loss function part, VariFocal Loss was used to calculate the classification loss to solve the problem of positive and negative sample imbalance. GhostConv was used as the basic convolutional block to construct the GCSP (Ghost Cross Stage Partial) structure to reduce the number of parameters and the amount of computation. For complex construction environments, a concrete construction site object detection dataset was constructed, and comparison experiments for various algorithms were conducted on this dataset. Experimental results demonstrate that YOLO-C has higher detection accuracy and fewer parameters, making it more suitable for object detection tasks in complex construction environments.

    Review on security threats and defense measures in federated learning
    Xuebin CHEN, Zhiqiang REN, Hongyang ZHANG
    Journal of Computer Applications    2024, 44 (6): 1663-1672.   DOI: 10.11772/j.issn.1001-9081.2023060832

    Federated learning is a distributed learning approach for solving the data sharing and privacy protection problems in machine learning, in which multiple parties jointly train a machine learning model while protecting data privacy. However, there are security threats inherent in federated learning, which make federated learning face great challenges in practical applications. Therefore, analyzing the attacks faced by federated learning and the corresponding defensive measures is crucial for the development and application of federated learning. First, the definition, process and classification of federated learning were introduced, together with the attacker model in federated learning. Then, the possible attacks on both the robustness and the privacy of federated learning systems were described, and the corresponding defense measures were presented. Furthermore, the shortcomings of these defense schemes were pointed out. Finally, a secure federated learning system was envisioned.

    Semi-supervised object detection framework guided by curriculum learning
    Yingjun ZHANG, Niuniu LI, Binhong XIE, Rui ZHANG, Wangdong LU
    Journal of Computer Applications    2024, 44 (8): 2326-2333.   DOI: 10.11772/j.issn.1001-9081.2023081062

    In order to enhance the quality of pseudo labels, address the issue of confirmation bias in Semi-Supervised Object Detection (SSOD), and tackle the problem that existing algorithms ignore the complexity of unlabeled data and thus produce erroneous pseudo labels, an SSOD framework guided by Curriculum Learning (CL) was proposed. The framework consisted of two modules: the ICSD (IoU-Confidence-Standard-Deviation) difficulty measurer and the BP (Batch-Package) training scheduler. The ICSD difficulty measurer comprehensively considered information such as the IoU (Intersection over Union) between pseudo-bounding boxes, confidence, and class labels, and the C_IOU (Checkpoint_IOU) method was introduced to evaluate the reliability of unlabeled data. The BP training scheduler provided two efficient scheduling strategies, designed from the Batch and Package perspectives respectively, giving priority to unlabeled data with high reliability indicators so that the entire unlabeled dataset is fully utilized in the form of curriculum learning. Extensive comparative experimental results on the Pascal VOC and MS-COCO datasets demonstrate that the proposed framework applies to existing SSOD algorithms and exhibits significant improvements in detection accuracy and stability.

    Device-to-device content sharing mechanism based on knowledge graph
    Xiaoyan ZHAO, Yan KUANG, Menghan WANG, Peiyan YUAN
    Journal of Computer Applications    2024, 44 (4): 995-1001.   DOI: 10.11772/j.issn.1001-9081.2023040500

    Device-to-Device(D2D) communication leverages the local computing and caching capabilities of the edge network to meet the demand for low-latency, energy-efficient content sharing among future mobile network users. The performance improvement of content sharing efficiency in edge networks not only depends on user social relationships, but also heavily relies on the characteristics of end devices, such as computation, storage, and residual energy resources. Therefore, a D2D content sharing mechanism was proposed to maximize energy efficiency with multidimensional association features of user-device-content, which took into account device heterogeneity, user sociality, and interest difference. Firstly, the multi-objective constraint problem about the user cost-benefit maximization was transformed into the optimal node selection and power control problem. And the multi-dimensional knowledge association features and the graph model for user-device-content were constructed by processing structurally multi-dimensional features related to devices, such as computing resources and storage resources. Then, the willingness measurement methods of users on device attributes and social attributes were studied, and a sharing willingness measurement method was proposed based on user socialization and device graphs. Finally, according to user sharing willingness, a D2D collaboration cluster oriented to content sharing was constructed, and a power control algorithm based on shared willingness for energy efficiency was designed to maximize the performance of network sharing. The experimental results on a real user device dataset and infocom06 dataset show that, compared to nearest selection algorithm and a selection algorithm without considering device willingness, the proposed power control algorithm based on shared willingness improves the system sum rate by about 97.2% and 11.1%, increases the user satisfaction by about 72.7% and 4.3%, and improves the energy efficiency by about 57.8% and 9.7%, respectively. This verifies the effectiveness of the proposed algorithm in terms of transmission rate, energy efficiency and user satisfaction.

    Table and Figures | Reference | Related Articles | Metrics
    Small target detection model in overlooking scenes on tower cranes based on improved real-time detection Transformer
    Yudong PANG, Zhixing LI, Weijie LIU, Tianhao LI, Ningning WANG
    Journal of Computer Applications    2024, 44 (12): 3922-3929.   DOI: 10.11772/j.issn.1001-9081.2023121796
    Abstract305)   HTML1)    PDF (3128KB)(252)       Save

    In view of safety problems of construction site personnel, such as casualties caused by falling objects and tower crane collapse caused by collisions between tower hooks, a small target detection model in overlooking scenes on tower cranes based on improved Real-Time DEtection TRansformer (RT-DETR) was proposed. Firstly, a structure trained with multiple branches but inferred with a single branch, designed with the idea of model reparameterization, was added to the original model to improve detection speed. Secondly, the convolution module in FasterNet Block was redesigned to replace BasicBlock in the original backbone to improve the performance of the detection model. Thirdly, the new loss function Inner-SIoU (Inner-Structured Intersection over Union) was used to further improve the precision and convergence speed of the model. Finally, ablation and comparison experiments were conducted to verify the model performance. The results show that, in detecting small targets in overlooking scenes on tower cranes, the proposed model achieves a precision of 94.7%, which is 6.1 percentage points higher than that of the original RT-DETR model. At the same time, the Frames Per Second (FPS) of the proposed model reaches 59.7, a 21% improvement in detection speed over the original model. The Average Precision (AP) of the proposed model on the public dataset COCO 2017 is 2.4, 1.5 and 1.3 percentage points higher than those of YOLOv5, YOLOv7 and YOLOv8, respectively. It can be seen that the proposed model meets the precision and speed requirements for small target detection in overlooking scenes on tower cranes.
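
    A minimal sketch of the Inner-IoU idea behind losses of the Inner-SIoU family: the IoU is computed on auxiliary boxes shrunk (or enlarged) around the box centers by a ratio, which can sharpen gradients for small targets. The plain-IoU variant, the `ratio` value and the usage line are simplifying assumptions; the full Inner-SIoU also includes angle, distance and shape terms not sketched here.
```python
# Simplified Inner-IoU computation (plain IoU on ratio-scaled auxiliary boxes).
import torch

def inner_iou(pred, target, ratio=0.7, eps=1e-7):
    """pred, target: (N, 4) boxes in (x1, y1, x2, y2); returns (N,) inner IoU."""
    def scale(box):
        cx, cy = (box[:, 0] + box[:, 2]) / 2, (box[:, 1] + box[:, 3]) / 2
        w, h = (box[:, 2] - box[:, 0]) * ratio, (box[:, 3] - box[:, 1]) * ratio
        return torch.stack((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2), dim=1)

    p, t = scale(pred), scale(target)
    lt = torch.maximum(p[:, :2], t[:, :2])          # top-left of intersection
    rb = torch.minimum(p[:, 2:], t[:, 2:])          # bottom-right of intersection
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (p[:, 2] - p[:, 0]) * (p[:, 3] - p[:, 1])
    area_t = (t[:, 2] - t[:, 0]) * (t[:, 3] - t[:, 1])
    return inter / (area_p + area_t - inter + eps)

# Illustrative usage: loss = (1 - inner_iou(pred_boxes, gt_boxes)).mean()
```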

    Table and Figures | Reference | Related Articles | Metrics
    Deep network compression method based on low-rank decomposition and vector quantization
    Dongwei WANG, Baichen LIU, Zhi HAN, Yanmei WANG, Yandong TANG
    Journal of Computer Applications    2024, 44 (7): 1987-1994.   DOI: 10.11772/j.issn.1001-9081.2023071027
    Abstract305)   HTML120)    PDF (1506KB)(406)       Save

    With the development of artificial intelligence, deep neural networks have become an essential tool in various pattern recognition tasks. Deploying deep Convolutional Neural Networks (CNN) on edge computing equipment is challenging due to storage space and computing resource constraints, so deep network compression has become an important research topic in recent years. Low-rank decomposition and vector quantization are among the most popular network compression techniques, and both try to find a compact representation of the original network, thereby reducing the redundancy of network parameters. By establishing a joint compression framework, a deep network compression method based on low-rank decomposition and vector quantization, named Quantized Tensor Decomposition (QTD), was proposed to obtain a higher compression ratio by performing further quantization on the low-rank structure of the network. Experimental results of the proposed method with classical ResNet on the CIFAR-10 dataset show that QTD can compress the model volume to 1% of the original with a slight accuracy drop of 1.71 percentage points. Moreover, the proposed method was compared with the quantization-based method PQF (Permute, Quantize, and Fine-tune), the low-rank decomposition-based method TDNR (Tucker Decomposition with Nonlinear Response), and the pruning-based method CLIP-Q (Compression Learning by In-parallel Pruning-Quantization) on the large-scale dataset ImageNet. Experimental results show that QTD maintains better classification accuracy at the same compression ratio.
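
    A minimal sketch of the joint idea of combining low-rank decomposition with vector quantization on a single weight matrix: factorize first, then quantize the factor matrices with a small codebook. The use of truncated SVD, k-means, and the rank, block and codebook sizes are illustrative assumptions, not the exact QTD procedure.
```python
# Illustrative low-rank + vector-quantization compression of one weight matrix.
import numpy as np
from sklearn.cluster import KMeans

def compress(weight, rank=16, codebook_size=64, block=4):
    """Truncated SVD, then k-means vector quantization of the factor matrices."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    u_r = u[:, :rank] * s[:rank]          # fold singular values into U
    v_r = vt[:rank, :]

    def vq(mat):
        vecs = mat.reshape(-1, block)     # split the factor into small sub-vectors
        km = KMeans(n_clusters=codebook_size, n_init=4, random_state=0).fit(vecs)
        return km.cluster_centers_[km.labels_].reshape(mat.shape)

    return vq(u_r) @ vq(v_r)              # reconstructed (compressed) weight

w = np.random.randn(128, 256).astype(np.float32)
w_hat = compress(w)
print("relative reconstruction error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```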

    Table and Figures | Reference | Related Articles | Metrics
    Distributed UAV cluster pursuit decision-making based on trajectory prediction and MADDPG
    Yu WANG, Zhihui GUAN, Yuanpeng LI
    Journal of Computer Applications    2024, 44 (11): 3623-3628.   DOI: 10.11772/j.issn.1001-9081.2023101538
    Abstract300)   HTML1)    PDF (918KB)(107)       Save

    A Trajectory Prediction based Distributed Multi-Agent Deep Deterministic Policy Gradient (TP-DMADDPG) algorithm was proposed to address the insufficient flexibility and poor generalization ability of Unmanned Aerial Vehicle (UAV) cluster pursuit decision-making algorithms in complex mission environments. Firstly, to enhance the realism of the pursuit mission, an intelligent escape strategy was designed for the target. Secondly, considering conditions such as target information being missing due to communication interruptions and other causes, a Long Short-Term Memory (LSTM) network was used to predict the position of the target in real time, and the state space of the decision-making model was constructed on the basis of the predicted information. Finally, TP-DMADDPG was designed based on a distributed framework and the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which enhanced the flexibility and generalization ability of pursuit decision-making in complex air combat. Simulation results show that, compared with the Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic policy gradient (TD3) and MADDPG algorithms, TP-DMADDPG increases the success rate of collaborative decision-making by more than 15 percentage points, and can solve the problem of pursuing an intelligent escaping target under incomplete information.
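
    A minimal sketch of the kind of LSTM target-position predictor the decision state could be built on: a short history of observed 2D positions is fed to an LSTM, and the last hidden state regresses the next position. The hidden size, the one-step horizon and the 2D state are assumptions for illustration.
```python
# Illustrative LSTM that predicts the target's next 2D position from a short
# history of observed positions (hidden size and horizon are assumptions).
import torch
from torch import nn

class TrajectoryPredictor(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, history):            # history: (batch, T, 2)
        out, _ = self.lstm(history)
        return self.head(out[:, -1])       # predicted next (x, y)

model = TrajectoryPredictor()
history = torch.randn(8, 10, 2)            # 8 pursuers, 10 past target positions each
next_pos = model(history)                  # (8, 2), used to build the decision state
print(next_pos.shape)
```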

    Table and Figures | Reference | Related Articles | Metrics
    Enhanced deep subspace clustering method with unified framework
    Qing WANG, Jieyu ZHAO, Xulun YE, Nongxiao WANG
    Journal of Computer Applications    2024, 44 (7): 1995-2003.   DOI: 10.11772/j.issn.1001-9081.2023101395
    Abstract298)   HTML83)    PDF (3432KB)(327)       Save

    Deep subspace clustering is a method that performs well on high-dimensional data clustering tasks. However, because the quality of the self-expressive matrix has a crucial impact on clustering accuracy, current deep subspace clustering methods with a fixed self-expressive matrix, which treat self-expressive learning and indicator learning as two separate and independent processes, usually exhibit suboptimal clustering results on challenging data. To solve these problems, an enhanced deep subspace clustering method with a unified framework was proposed. Firstly, feature learning, self-expressive learning and indicator learning were integrated to optimize all parameters jointly, so that the self-expressive matrix was dynamically learned from the characteristics of the data, ensuring accurate capture of data features. Secondly, to improve the effect of self-expressive learning, class prototype pseudo-label learning was proposed to provide self-supervised information for feature learning and indicator learning, thereby promoting self-expressive learning. Finally, to enhance the discriminative ability of the embedded representations, orthogonality constraints were introduced to help achieve the self-expressive property. The experimental results show that, compared with AASSC (Adaptive Attribute and Structure Subspace Clustering network), the proposed method improves clustering accuracy by 1.84, 0.49 and 0.34 percentage points on the MNIST, UMIST and COIL20 datasets respectively. It can be seen that the proposed method improves the accuracy of self-expressive matrix learning, thereby achieving better clustering effects.
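
    A minimal sketch of a self-expressive layer of the kind used in deep subspace clustering: a learnable coefficient matrix C reconstructs each embedding from the others, with the diagonal masked out and a sparsity penalty. The layer size, loss weights and the plain reconstruction objective are assumptions, not the paper's unified framework.
```python
# Illustrative self-expressive layer: Z ≈ C Z with a learnable coefficient
# matrix C whose diagonal is zeroed; weights below are assumptions.
import torch
from torch import nn

class SelfExpressive(nn.Module):
    def __init__(self, n_samples):
        super().__init__()
        self.coef = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, z):                                    # z: (n_samples, d)
        c = self.coef - torch.diag(torch.diag(self.coef))    # zero the diagonal
        return c @ z, c

z = torch.randn(200, 32, requires_grad=True)                 # encoder embeddings
layer = SelfExpressive(200)
z_rec, c = layer(z)
loss = ((z_rec - z) ** 2).mean() + 1e-3 * c.abs().mean()     # reconstruction + sparsity
loss.backward()
```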

    Table and Figures | Reference | Related Articles | Metrics
    Hybrid internet of vehicles intrusion detection system for zero-day attacks
    Jiepo FANG, Chongben TAO
    Journal of Computer Applications    2024, 44 (9): 2763-2769.   DOI: 10.11772/j.issn.1001-9081.2023091328
    Abstract295)   HTML12)    PDF (2618KB)(686)       Save

    Existing machine learning methods suffer from over-reliance on sample data and insensitivity to anomalous data when confronted with zero-day attacks, making it difficult for an Intrusion Detection System (IDS) to defend against them effectively. Therefore, a hybrid internet of vehicles intrusion detection system based on Transformer and ANFIS (Adaptive-Network-based Fuzzy Inference System) was proposed. Firstly, a data enhancement algorithm was designed to solve the problem of unbalanced data samples by first denoising and then generating samples. Secondly, a feature engineering module was designed by introducing non-linear feature interactions into complex feature combinations. Finally, the self-attention mechanism of Transformer and the adaptive learning method of ANFIS were combined, which enhanced the ability of feature representation and reduced the dependence on sample data. The proposed system was compared with SOTA (State-Of-The-Art) algorithms such as Dual-IDS on the CICIDS-2017 and UNSW-NB15 intrusion datasets. Experimental results show that, for zero-day attacks, the proposed system achieves 98.64% detection accuracy and 98.31% F1 score on the CICIDS-2017 dataset, and 93.07% detection accuracy and 92.43% F1 score on the UNSW-NB15 dataset, which validates the high accuracy and strong generalization ability of the proposed algorithm for zero-day attack detection.
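
    A minimal sketch of one way a Transformer encoder could be combined with an ANFIS-style layer of Gaussian fuzzy memberships for flow classification. The per-feature tokenization, the layer sizes, the number of fuzzy rules and the normalized firing strengths are all assumptions for illustration; a real ANFIS rule base and the paper's architecture are more elaborate.
```python
# Illustrative Transformer + fuzzy-membership hybrid for flow classification.
import torch
from torch import nn

class FuzzyLayer(nn.Module):
    """Gaussian memberships per feature, combined into normalized rule firings."""
    def __init__(self, in_dim, n_rules=8):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_rules, in_dim))
        self.log_sigma = nn.Parameter(torch.zeros(n_rules, in_dim))

    def forward(self, x):                                    # x: (B, in_dim)
        d = (x[:, None, :] - self.centers) / self.log_sigma.exp()
        firing = torch.exp(-0.5 * (d ** 2).sum(-1))          # (B, n_rules)
        return firing / (firing.sum(-1, keepdim=True) + 1e-9)

class HybridIDS(nn.Module):
    def __init__(self, n_features, n_classes=2, d_model=32):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                   # one token per raw feature
        enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.fuzzy = FuzzyLayer(d_model, n_rules=8)
        self.classifier = nn.Linear(8, n_classes)

    def forward(self, x):                                    # x: (B, n_features)
        tokens = self.embed(x.unsqueeze(-1))
        pooled = self.encoder(tokens).mean(dim=1)            # (B, d_model)
        return self.classifier(self.fuzzy(pooled))

model = HybridIDS(n_features=20)
logits = model(torch.randn(4, 20))
print(logits.shape)                                          # torch.Size([4, 2])
```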

    Table and Figures | Reference | Related Articles | Metrics
    Incomplete multi-view clustering algorithm based on self-attention fusion
    Shunyong LI, Shiyi LI, Rui XU, Xingwang ZHAO
    Journal of Computer Applications    2024, 44 (9): 2696-2703.   DOI: 10.11772/j.issn.1001-9081.2023091253
    Abstract288)   HTML8)    PDF (2806KB)(492)       Save

    Multi-view clustering on incomplete data has become one of the research hotspots in the field of unsupervised learning. However, most multi-view clustering algorithms based on "shallow" models often find it difficult to extract and characterize the potential feature structures within views when dealing with large-scale high-dimensional data. At the same time, stacking or averaging methods for multi-view information fusion ignore the differences between views and do not fully consider the different contributions of each view to building a common consensus representation. To address these issues, an Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion (IMVCSAF) was proposed. Firstly, the potential features of each view were extracted with a deep autoencoder, and the consistency information among views was maximized by using contrastive learning. Secondly, a self-attention mechanism was adopted to recode and fuse the potential representations of the views, comprehensively considering and mining the inherent causality and feature complementarity between different views. Thirdly, based on the common consensus representation, the potential representations of missing instances were predicted and recovered, thereby fully implementing the multi-view clustering process. Experimental results on the Scene-15, LandUse-21, Caltech101-20 and Noisy-MNIST datasets show that the accuracy of IMVCSAF is higher than those of the other compared algorithms while meeting the convergence requirements; on the Noisy-MNIST dataset with a 50% missing rate, the accuracy of IMVCSAF is 6.58 percentage points higher than that of the second-best algorithm, COMPETER (inCOMPlete muLti-view clustEring via conTrastivE pRediction).
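
    A minimal sketch of self-attention fusion across view-specific latent codes using `nn.MultiheadAttention`; the embedding size, number of heads and the mean pooling into a consensus representation are assumptions for illustration, not the IMVCSAF architecture.
```python
# Illustrative self-attention fusion of per-view latent codes into a consensus
# representation (sizes and pooling are assumptions).
import torch
from torch import nn

class ViewFusion(nn.Module):
    def __init__(self, d=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=n_heads, batch_first=True)

    def forward(self, view_codes):            # (batch, n_views, d) from the autoencoders
        fused, weights = self.attn(view_codes, view_codes, view_codes)
        return fused.mean(dim=1), weights     # consensus (batch, d) and attention map

codes = torch.randn(16, 3, 64)                # 16 samples, 3 views, 64-dim latent codes
consensus, attn = ViewFusion()(codes)
print(consensus.shape, attn.shape)            # torch.Size([16, 64]) torch.Size([16, 3, 3])
```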

    Table and Figures | Reference | Related Articles | Metrics
    Point cloud semantic segmentation based on attention mechanism and global feature optimization
    Pengfei ZHANG, Litao HAN, Hengjian FENG, Hongmei LI
    Journal of Computer Applications    2024, 44 (4): 1086-1092.   DOI: 10.11772/j.issn.1001-9081.2023050588
    Abstract288)   HTML16)    PDF (1971KB)(347)       Save

    In 3D point cloud semantic segmentation algorithms based on deep learning, to enhance the fine-grained ability to extract local features and to learn the long-range dependencies between different local neighborhoods, a neural network based on an attention mechanism and global feature optimization was proposed. First, a Single-Channel Attention (SCA) module and a Point Attention (PA) module were designed in the form of additive attention. The former strengthened the discriminability of local features by adaptively adjusting the features of each point within a single channel, and the latter adjusted the importance of each single-point feature vector to suppress useless features and reduce feature redundancy. Second, a Global Feature Aggregation (GFA) module was added to aggregate local neighborhood features and capture global context information, thereby improving semantic segmentation accuracy. The experimental results show that the proposed network improves the mean Intersection over Union (mIoU) by 1.8 percentage points compared with RandLA-Net (Random sampling and an effective Local feature Aggregator Network) on the point cloud dataset S3DIS, and has good segmentation performance and adaptability.
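
    A minimal sketch of a per-point attention gate of the kind the PA module describes: a small MLP scores each point's feature vector and the score gates the features, suppressing uninformative points. The layer sizes and the sigmoid gating are assumptions for illustration, not the paper's exact module.
```python
# Illustrative per-point attention gate: an MLP scores each point's feature
# vector and re-weights it (sizes are assumptions).
import torch
from torch import nn

class PointAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(channels, channels // 2),
            nn.ReLU(),
            nn.Linear(channels // 2, 1),
        )

    def forward(self, feats):                 # feats: (batch, n_points, channels)
        w = torch.sigmoid(self.score(feats))  # (batch, n_points, 1) importance
        return feats * w                      # suppress useless point features

feats = torch.randn(2, 4096, 32)              # 2 clouds, 4096 points, 32 channels
out = PointAttention(32)(feats)
print(out.shape)                              # torch.Size([2, 4096, 32])
```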

    Table and Figures | Reference | Related Articles | Metrics