Most Read articles

    Published in last 1 year
    Technology application prospects and risk challenges of large language models
    Yuemei XU, Ling HU, Jiayi ZHAO, Wanze DU, Wenqing WANG
    Journal of Computer Applications    2024, 44 (6): 1655-1662.   DOI: 10.11772/j.issn.1001-9081.2023060885
    Abstract (1280) | HTML (102) | PDF 1142KB (2305)

    In view of the rapid development of Large Language Model (LLM) technology, a comprehensive analysis was conducted on its technical application prospects and risk challenges, which is of great reference value for the development and governance of Artificial General Intelligence (AGI). Firstly, taking representative language models such as Multi-BERT (Multilingual Bidirectional Encoder Representations from Transformer), GPT (Generative Pre-trained Transformer) and ChatGPT (Chat Generative Pre-trained Transformer) as examples, the development process, key technologies and evaluation systems of LLMs were reviewed. Then, the technical limitations and security risks of LLMs were analyzed in detail. Finally, suggestions were put forward for the technical improvement and policy follow-up of LLMs. The analysis indicates that current LLMs are still at a developing stage: they produce non-truthful and biased output, lack real-time autonomous learning ability, require huge computing power, rely heavily on data quality and quantity, and tend towards a monotonous language style. They also carry security risks related to data privacy, information security, ethics, and other aspects. Future development can continue to improve LLMs technically, from “large-scale” to “lightweight”, from “single-modal” to “multi-modal”, and from “general-purpose” to “vertical”; in terms of policy, their applications and development should be regulated in a timely way by targeted regulatory measures.

    Table and Figures | Reference | Related Articles | Metrics
    Review of YOLO algorithm and its applications to object detection in autonomous driving scenes
    Yaping DENG, Yingjiang LI
    Journal of Computer Applications    2024, 44 (6): 1949-1958.   DOI: 10.11772/j.issn.1001-9081.2023060889
    Abstract (993) | HTML (42) | PDF 1175KB (881)

    Object detection in autonomous driving scenes is one of the important research directions in computer vision, focusing on ensuring real-time and accurate detection of objects by autonomous vehicles. In recent years, the rapid development of deep learning technology and its wide application in the field of autonomous driving have prompted substantial progress in this field. The research status of object detection with YOLO (You Only Look Once) algorithms in autonomous driving was analyzed from four aspects. Firstly, the ideas and improvement methods of the single-stage YOLO series of detection algorithms were summarized, and the advantages and disadvantages of the YOLO series were analyzed. Secondly, YOLO-based object detection applications in autonomous driving scenes were introduced, and the research status and applications for the detection and recognition of traffic vehicles, pedestrians, and traffic signals were expounded and summarized respectively. Additionally, the commonly used evaluation indicators in object detection, as well as object detection datasets and autonomous driving scene datasets, were summarized. Lastly, the problems and future development directions of object detection were discussed.

    Survey of incomplete multi-view clustering
    Yao DONG, Yixue FU, Yongfeng DONG, Jin SHI, Chen CHEN
    Journal of Computer Applications    2024, 44 (6): 1673-1682.   DOI: 10.11772/j.issn.1001-9081.2023060813
    Abstract (636) | HTML (19) | PDF 2050KB (590)

    Multi-view clustering has recently been a hot topic in graph data mining. However, due to the limitations of data collection technology or human factors, multi-view data often suffers from missing views or samples. Reducing the impact of incomplete views on clustering performance is a major challenge currently faced by multi-view clustering. To better understand the development of Incomplete Multi-view Clustering (IMC) in recent years, a comprehensive review is of great theoretical significance and practical value. Firstly, the missing types of incomplete multi-view data were summarized and analyzed. Secondly, four types of IMC methods, based on Multiple Kernel Learning (MKL), Matrix Factorization (MF) learning, deep learning, and graph learning, were compared, and the technical characteristics and differences among these methods were analyzed. Thirdly, from the perspectives of dataset type, the numbers of views and categories, and application field, twenty-two public incomplete multi-view datasets were summarized. Then, the evaluation metrics were outlined, and the performance of existing IMC methods on homogeneous and heterogeneous datasets was evaluated. Finally, the existing problems, future research directions, and current application fields of IMC were discussed.

    Survey on hypergraph application methods: issues, advances, and challenges
    Li ZENG, Jingru YANG, Gang HUANG, Xiang JING, Chaoran LUO
    Journal of Computer Applications    2024, 44 (11): 3315-3326.   DOI: 10.11772/j.issn.1001-9081.2023111629
    Abstract (605) | HTML (24) | PDF 795KB (364)

    Hypergraph is a generalization of graph which, compared with ordinary graph, has significant advantages in representing higher-order features of complex relationships. As a relatively new data structure, hypergraph plays an increasingly crucial role in various application fields. By appropriately using hypergraph models and algorithms, specific real-world problems can be modeled and solved with higher efficiency and quality. Existing surveys of hypergraph mainly focus on the theory and techniques of hypergraph itself, and lack a summary of modeling and solving methods in specific scenarios. To this end, after summarizing and introducing some fundamental concepts of hypergraph, the application methods, techniques, common issues, and solutions of hypergraph in various application scenarios were analyzed; by summarizing the existing work, some problems and obstacles that still exist in applying hypergraph to real-world problems were elaborated. Finally, future research directions of hypergraph applications were prospected.

    Survey and prospect of large language models
    Xiaolin QIN, Xu GU, Dicheng LI, Haiwen XU
    Journal of Computer Applications    2025, 45 (3): 685-696.   DOI: 10.11772/j.issn.1001-9081.2025010128
    Abstract (583) | HTML (44) | PDF 2035KB (478)

    Large Language Models (LLMs) are a class of language models built from artificial neural networks with a vast number of parameters (typically billions of weights or more). They are trained on large amounts of unlabeled text using self-supervised or semi-supervised learning and are the core of current generative Artificial Intelligence (AI) technologies. Compared to traditional language models, LLMs demonstrate stronger language understanding and generation capabilities, supported by substantial computational power, extensive parameters, and large-scale data, and they perform well in tasks such as machine translation, question answering systems, and dialogue generation. Most existing surveys focus on the theoretical construction and training techniques of LLMs, while systematic exploration of industry-level application practices and the evolution of the technological ecosystem remains insufficient. Therefore, on the basis of introducing the foundational architecture, training techniques, and development history of LLMs, the current general key technologies of LLMs and the advanced technologies integrated with LLM bases were analyzed. Then, by summarizing the existing research, challenges faced by LLMs in practical applications were elaborated, including data bias, model hallucination, and computational resource consumption, and an outlook was provided on the ongoing development trends of LLMs.

    Overview of research and application of knowledge graph in equipment fault diagnosis
    Jie WU, Ansi ZHANG, Maodong WU, Yizong ZHANG, Congbao WANG
    Journal of Computer Applications    2024, 44 (9): 2651-2659.   DOI: 10.11772/j.issn.1001-9081.2023091280
    Abstract (526) | HTML (52) | PDF 2858KB (814)

    Useful knowledge can be extracted from equipment fault diagnosis data to construct a knowledge graph, which can effectively manage complex equipment fault diagnosis information in the form of triples (entity, relationship, entity) and enables rapid diagnosis of equipment faults. Firstly, the related concepts of knowledge graphs for equipment fault diagnosis were introduced, and the framework of knowledge graphs for the equipment fault diagnosis domain was analyzed. Secondly, the research status at home and abroad of several key technologies, such as knowledge extraction, knowledge fusion and knowledge reasoning for equipment fault diagnosis knowledge graphs, was summarized. Finally, the applications of knowledge graphs in equipment fault diagnosis were summarized, shortcomings and challenges in the construction of knowledge graphs in this field were pointed out, and new ideas were provided for the future of equipment fault diagnosis.

    Underwater target detection algorithm based on improved YOLOv8
    Dahai LI, Bingtao LI, Zhendong WANG
    Journal of Computer Applications    2024, 44 (11): 3610-3616.   DOI: 10.11772/j.issn.1001-9081.2023111550
    Abstract (507) | HTML (16) | PDF 1637KB (142)

    Due to the unique characteristics of underwater creatures, underwater images usually contain many small targets that are hard to detect and often overlap with each other. In addition, light absorption and scattering in the underwater environment cause color shift and blurring in underwater images. To overcome these challenges, an underwater target detection algorithm named WCA-YOLOv8 was proposed. Firstly, the Feature Fusion Module (FFM) was designed to improve the focus on the spatial dimension, so as to improve the recognition of targets with color shift and blurring. Secondly, the FReLU Coordinate Attention (FCA) module was added to enhance feature extraction for overlapped and occluded underwater targets. Thirdly, the Complete Intersection over Union (CIoU) loss function was replaced by the Wise-IoU version 3 (WIoU v3) loss function to strengthen detection performance on small targets. Finally, the Downsampling Enhancement Module (DEM) was designed to preserve context information more completely during feature extraction. Experimental results show that WCA-YOLOv8 achieves mean Average Precision (mAP0.5) of 75.8% and 88.6%, with detection speeds of 60 frame/s and 57 frame/s, on the RUOD and URPC datasets, respectively. Compared with other state-of-the-art underwater target detection algorithms, WCA-YOLOv8 achieves higher detection accuracy at faster detection speed.

    Development, technologies and applications of blockchain 3.0
    Peng FANG, Fan ZHAO, Baoquan WANG, Yi WANG, Tonghai JIANG
    Journal of Computer Applications    2024, 44 (12): 3647-3657.   DOI: 10.11772/j.issn.1001-9081.2023121826
    Abstract (491) | HTML (39) | PDF 2294KB (334)

    Blockchain 3.0 is the third stage of the development of blockchain technology and the core of building a value Internet. Its innovations in sharding, cross-chain and privacy protection have given it a wide range of application scenarios and research value, and it has attracted wide attention from academia and industry. Regarding the development, technologies and applications of blockchain 3.0, the relevant literature at home and abroad from the past five years was surveyed and reviewed. Firstly, the basic theory and technical characteristics of blockchain were introduced, laying the foundation for an in-depth understanding of its research progress. Subsequently, based on the evolution of blockchain technology over time, the development process and key milestones of blockchain 3.0 were presented, together with the rationale for dividing blockchain development into stages using sharding and side-chain technologies as benchmarks. Then, the current research status of the key technologies of blockchain 3.0 was analyzed in detail, and typical applications of blockchain 3.0 in six major fields, such as internet of things, medical care, and agriculture, were summarized. Finally, the key challenges and future development opportunities of blockchain 3.0 were summed up.

    Logo detection algorithm based on improved YOLOv5
    Yeheng LI, Guangsheng LUO, Qianmin SU
    Journal of Computer Applications    2024, 44 (8): 2580-2587.   DOI: 10.11772/j.issn.1001-9081.2023081113
    Abstract (482) | HTML (9) | PDF 4682KB (436)

    To address the challenges posed by complex backgrounds and varying sizes of logo images, an improved detection algorithm based on YOLOv5 was proposed. Firstly, in combination with the Convolutional Block Attention Module (CBAM), compression was applied in both the channel and spatial dimensions to extract critical information and significant regions within the image. Subsequently, Switchable Atrous Convolution (SAC) was employed to allow the network to adaptively adjust the receptive field size in feature maps at different scales, improving detection across multiple object scales. Finally, the Normalized Wasserstein Distance (NWD) was embedded into the loss function: the bounding boxes were modeled as 2D Gaussian distributions, and the similarity between the corresponding Gaussian distributions was calculated to better measure the similarity among objects, thereby enhancing detection performance on small objects and improving model robustness and stability. Compared to the original YOLOv5 algorithm: on the small dataset FlickrLogos-32, the improved algorithm achieved a mean Average Precision (mAP@0.5) of 90.6%, an increase of 1 percentage point; on the large dataset QMULOpenLogo, it achieved an mAP@0.5 of 62.7%, an increase of 2.3 percentage points; on three types of logos in LogoDet3K, it increased the mAP@0.5 by 1.2, 1.4, and 1.4 percentage points respectively. Experimental results demonstrate that the improved algorithm has better small-object detection ability on logo images.
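    The NWD similarity described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: each axis-aligned box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), and the constant C is an assumed normalizing hyperparameter.

```python
import math

def box_to_gaussian(box):
    # box = (cx, cy, w, h) -> parameters of N([cx, cy], diag(w^2/4, h^2/4))
    cx, cy, w, h = box
    return (cx, cy, w / 2.0, h / 2.0)

def nwd(box_a, box_b, C=12.8):
    # Squared 2-Wasserstein distance between two axis-aligned Gaussians
    # reduces to the squared Euclidean distance between their parameter vectors.
    ga, gb = box_to_gaussian(box_a), box_to_gaussian(box_b)
    w2_sq = sum((x - y) ** 2 for x, y in zip(ga, gb))
    # Normalized Wasserstein Distance: map the distance into (0, 1]
    return math.exp(-math.sqrt(w2_sq) / C)
```

    Unlike IoU, this similarity stays non-zero for small boxes that do not overlap, which is why it helps small-object matching.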

    Time series classification method based on multi-scale cross-attention fusion in time-frequency domain
    Mei WANG, Xuesong SU, Jia LIU, Ruonan YIN, Shan HUANG
    Journal of Computer Applications    2024, 44 (6): 1842-1847.   DOI: 10.11772/j.issn.1001-9081.2023060731
    Abstract (453) | HTML (10) | PDF 2511KB (782)

    To address the problem of low classification accuracy caused by insufficient interaction of potential information between time series subsequences, a time series classification method based on multi-scale cross-attention fusion in the time-frequency domain, called TFFormer (Time-Frequency Transformer), was proposed. First, the time domain sequence and frequency spectrum of the original time series were each divided into subsequences of the same length, and positional embedding was added after linear projection to solve the point-value coupling problem. Then, the Improved Multi-Head self-Attention (IMHA) mechanism made the model focus on more important time series features, solving the long-term dependency problem. Finally, a multi-scale Cross-Modality Attention (CMA) module was proposed to enhance the interaction between the time domain and frequency domain, so that the model could further mine frequency information of the time series. The experimental results show that compared with Fully Convolutional Network (FCN), the classification accuracy of the proposed method on the Trace, StarLightCurves and UWaveGestureLibraryAll datasets is increased by 0.3, 0.9 and 1.4 percentage points, respectively. This demonstrates that enhancing the information interaction between the time domain and frequency domain of a time series can improve model convergence speed and classification accuracy.
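    The time-frequency cross-attention idea can be sketched in a few lines of NumPy. This is a minimal single-head illustration under assumed shapes, not the TFFormer implementation: queries come from time-domain subsequences, while keys and values come from spectral subsequences, so each time token attends over frequency features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modality_attention(time_tokens, freq_tokens, Wq, Wk, Wv):
    # Queries from the time domain; keys/values from the frequency domain.
    Q, K, V = time_tokens @ Wq, freq_tokens @ Wk, freq_tokens @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])      # scaled dot-product
    return softmax(scores, axis=-1) @ V          # fused time-frequency features

rng = np.random.default_rng(0)
series = rng.standard_normal(64)
# Split the series and its magnitude spectrum into equal-length subsequences
time_tokens = series.reshape(8, 8)
freq_tokens = np.abs(np.fft.rfft(series))[:32].reshape(8, 4)
d = 16  # assumed model dimension
Wq = rng.standard_normal((8, d))
Wk = rng.standard_normal((4, d))
Wv = rng.standard_normal((4, d))
fused = cross_modality_attention(time_tokens, freq_tokens, Wq, Wk, Wv)
```

    A multi-scale version would repeat this with several subsequence lengths and concatenate the outputs.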

    Small object detection algorithm from drone perspective based on improved YOLOv8n
    Tao LIU, Shihong JU, Yimeng GAO
    Journal of Computer Applications    2024, 44 (11): 3603-3609.   DOI: 10.11772/j.issn.1001-9081.2023111644
    Abstract (433) | HTML (13) | PDF 1561KB (255)

    In view of the low accuracy of object detection algorithms on small objects from the drone perspective, a new small object detection algorithm named SFM-YOLOv8 was proposed by improving the backbone network and attention mechanism of YOLOv8. Firstly, the SPace-to-Depth Convolution (SPDConv), suitable for low-resolution images and small object detection, was integrated into the backbone network to retain discriminative feature information and improve the perception of small objects. Secondly, a multi-branch attention module named MCA (Multiple Coordinate Attention) was introduced to enhance the spatial and channel information of the feature layers. Then, a convolution module FE-C2f fusing FasterNet and Efficient Multi-scale Attention (EMA) was constructed to reduce computational cost and make the model lightweight. Besides, the Minimum Point Distance based Intersection over Union (MPDIoU) loss function was introduced to improve accuracy. Finally, a small object detection layer was added to the network structure of YOLOv8n to retain more location information and detailed features of small objects. Experimental results show that compared with YOLOv8n, SFM-YOLOv8 achieves a 4.37 percentage point increase in mean Average Precision (mAP50) with a 5.98% reduction in parameters on the VisDrone-DET2019 dataset. Compared with related mainstream models, SFM-YOLOv8 achieves higher accuracy and meets real-time detection requirements.
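    The space-to-depth rearrangement at the heart of SPDConv can be shown concretely. This is a hedged sketch of the rearrangement step only (the subsequent convolution is omitted): spatial detail is moved into channels instead of being discarded by strided downsampling, which is why it suits low-resolution inputs and small objects.

```python
import numpy as np

def space_to_depth(x, block=2):
    # Rearrange an (H, W, C) feature map into (H/block, W/block, C*block^2):
    # each block x block spatial patch becomes a stack of channels.
    H, W, C = x.shape
    assert H % block == 0 and W % block == 0
    x = x.reshape(H // block, block, W // block, block, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H // block, W // block,
                                              C * block * block)

# 4x4 single-channel map numbered 0..15, row-major
demo = space_to_depth(np.arange(16).reshape(4, 4, 1))
```

    Every input value survives the transform; the top-left output cell holds exactly the top-left 2×2 patch.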

    Correlation filtering based target tracking with nonlinear temporal consistency
    Wentao JIANG, Wanxuan LI, Shengchong ZHANG
    Journal of Computer Applications    2024, 44 (8): 2558-2570.   DOI: 10.11772/j.issn.1001-9081.2023081121
    Abstract (424) | HTML (3) | PDF 7942KB (82)

    Concerning the problem that existing target tracking algorithms mainly use the linear constraint mechanism of LADCF (Learning Adaptive Discriminative Correlation Filters), which easily causes model drift, a correlation filtering based target tracking algorithm with nonlinear temporal consistency was proposed. First, a nonlinear temporal consistency term was proposed based on Stevens’ Law, aligning closely with the characteristics of human visual perception; this term allowed the model to track the target smoothly, ensuring tracking continuity and preventing model drift. Next, the Alternating Direction Method of Multipliers (ADMM) was employed to compute the optimal function value, ensuring real-time performance. Lastly, Stevens’ Law was used for nonlinear filter updating, enabling the update factor to enhance or suppress the filter according to changes of the target, thereby adapting to target changes and preventing filter degradation. Comparison experiments with mainstream correlation filtering and deep learning algorithms were performed on four standard datasets. Compared with the baseline algorithm LADCF, the tracking precision and success rate of the proposed algorithm are improved by 2.4 and 3.8 percentage points on the OTB100 dataset, and by 1.5 and 2.5 percentage points on the UAV123 dataset. The experimental results show that the proposed algorithm effectively avoids tracking model drift, reduces the likelihood of filter degradation, and achieves higher tracking precision, success rate, and robustness in complicated situations such as occlusion and illumination changes.
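    The role of Stevens' power law (perceived magnitude S = k·Iᵃ) in a nonlinear filter update can be sketched as follows. This is an assumed toy formulation, not the paper's update rule: the learning rate of a linear-interpolation filter update is modulated by the power-law response to the measured frame-to-frame target change, so the update reacts sub-linearly to large abrupt changes.

```python
def stevens_response(change, k=1.0, exponent=0.5):
    # Stevens' power law S = k * I^a; with a < 1 the response is sub-linear,
    # damping the influence of large, possibly spurious, target changes.
    return k * change ** exponent

def update_filter(f_old, f_new, change, base_rate=0.2):
    # Learning rate modulated by the perceptual response to the target change
    # (hypothetical combination for illustration).
    rate = min(1.0, base_rate * stevens_response(change))
    return [(1 - rate) * o + rate * n for o, n in zip(f_old, f_new)]
```

    With zero measured change the filter is left untouched, which is the "suppress" side of the enhance/suppress behaviour described above.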

    Mobile robot 3D space path planning method based on deep reinforcement learning
    Tian MA, Runtao XI, Jiahao LYU, Yijie ZENG, Jiayi YANG, Jiehui ZHANG
    Journal of Computer Applications    2024, 44 (7): 2055-2064.   DOI: 10.11772/j.issn.1001-9081.2023060749
    Abstract (406) | HTML (29) | PDF 5732KB (939)

    Aiming at the high complexity and uncertainty of unknown 3D environments, a mobile robot 3D path planning method based on deep reinforcement learning with a limited observation space optimization strategy was proposed. First, depth map information was used as the agent’s input within the limited observation space, which can simulate complex 3D environments under limited and unknown movement conditions. Second, a two-stage action selection policy in discrete action space was designed, consisting of directional actions and movement actions, which reduces the search steps and time. Finally, based on the Proximal Policy Optimization (PPO) algorithm, a Gated Recurrent Unit (GRU) was added to incorporate historical state information, enhancing policy stability in unknown environments and thereby improving the accuracy and smoothness of the planned path. The experimental results show that, compared with Advantage Actor-Critic (A2C), the average search time is reduced by 49.07% and the average planned path length is reduced by 1.04%. Meanwhile, the proposed method can complete multi-objective path planning tasks under linear sequential logic constraints.

    Class-imbalanced traffic abnormal detection based on 1D-CNN and BiGRU
    Hong CHEN, Bing QI, Haibo JIN, Cong WU, Li’ang ZHANG
    Journal of Computer Applications    2024, 44 (8): 2493-2499.   DOI: 10.11772/j.issn.1001-9081.2023081112
    Abstract (377) | HTML (2) | PDF 1194KB (818)

    Network traffic anomaly detection is a network security defense method that analyzes network traffic to identify potential attacks. To address the low detection accuracy and high false positive rate caused by imbalanced, high-dimensional network traffic data with diverse attack categories, a new approach was proposed: a One-Dimensional Convolutional Neural Network (1D-CNN) and a Bidirectional Gated Recurrent Unit (BiGRU) were combined to construct a traffic anomaly detection model. For class-imbalanced data, balancing was performed using an improved Synthetic Minority Oversampling TEchnique (SMOTE), namely Borderline-SMOTE, together with an undersampling clustering technique based on Gaussian Mixture Model (GMM). Subsequently, the 1D-CNN was used to extract local features in the data, and BiGRU was used to better extract the time series features. Finally, the proposed model was evaluated on the UNSW-NB15 dataset, achieving an accuracy of 98.12% and a false positive rate of 1.28%. The experimental results demonstrate that the proposed model outperforms other classic machine learning and deep learning models: it improves the recognition rate for minority-class attacks and achieves higher detection accuracy.

    Generative label adversarial text classification model
    Xun YAO, Zhongzheng QIN, Jie YANG
    Journal of Computer Applications    2024, 44 (6): 1781-1785.   DOI: 10.11772/j.issn.1001-9081.2023050662
    Abstract (370) | HTML (15) | PDF 1142KB (446)

    Text classification is a fundamental task in Natural Language Processing (NLP), aiming to assign text data to predefined categories. The combination of Graph Convolutional neural Network (GCN) and the large-scale pre-trained model BERT (Bidirectional Encoder Representations from Transformer) has achieved excellent results in text classification tasks. However, undirected information transmission of GCN in large-scale heterogeneous graphs produces information noise, which affects the judgment of the model and reduces its classification ability. To solve this problem, a generative label adversarial model, the Class Adversarial Graph Convolutional Network (CAGCN), was proposed to reduce the interference of irrelevant information during classification and improve classification performance. Firstly, the graph construction method of TextGCN (Text Graph Convolutional Network) was used to build the adjacency matrix, which was combined with the GCN and BERT models as a Class Generator (CG). Secondly, a pseudo-label feature training method was used during model training to construct a cluster, and the cluster and the class generator were jointly trained. Finally, experiments were carried out on several widely used datasets. Experimental results show that the classification accuracy of the CAGCN model is 1.2, 0.1, 0.5, 1.7 and 0.5 percentage points higher than that of the RoBERTaGCN model on the widely used classification datasets 20NG, R8, R52, Ohsumed and MR, respectively.

    Personalized federated learning method based on dual stream neural network
    Zheyuan SHEN, Keke YANG, Jing LI
    Journal of Computer Applications    2024, 44 (8): 2319-2325.   DOI: 10.11772/j.issn.1001-9081.2023081207
    Abstract (367) | HTML (54) | PDF 2185KB (239)

    Classic Federated Learning (FL) algorithms struggle to achieve good results in scenarios where data is highly heterogeneous. Personalized FL (PFL) addresses the data heterogeneity problem by “tailoring” a dedicated model for each client, which yields well-performing models but makes it difficult to extend federated learning to new clients. Focusing on the performance and scalability challenges in PFL, FedDual, an FL model with a dual-stream neural network structure, was proposed. By adding an encoder that analyzes the personalized characteristics of each client, this model not only matches the performance of personalized models but can also be extended to new clients easily. Experimental results show that compared with the classic Federated Averaging (FedAvg) algorithm, FedDual noticeably improves the accuracy on datasets such as MNIST and FashionMNIST; on the CIFAR10 dataset, FedDual improves the accuracy by more than 10 percentage points. Moreover, FedDual achieves “plug and play” for new clients without accuracy degradation, solving the problem of difficult scalability to new clients.

    Lightweight algorithm for impurity detection in raw cotton based on improved YOLOv7
    Yongjin ZHANG, Jian XU, Mingxing ZHANG
    Journal of Computer Applications    2024, 44 (7): 2271-2278.   DOI: 10.11772/j.issn.1001-9081.2023070969
    Abstract (364) | HTML (13) | PDF 8232KB (346)

    Addressing the challenges posed by the high throughput of raw cotton and long impurity inspection duration in cotton mills, an improved lightweight YOLOv7 model was proposed for impurity detection in raw cotton. Initially, redundant convolutional layers within the YOLOv7 model were pruned, increasing detection speed. Following this, FasterNet convolutional layers were integrated into the primary network to mitigate the computational load and diminish redundancy in feature maps, thereby realizing real-time detection. Ultimately, CSP-RepFPN (Cross Stage Partial networks with Replicated Feature Pyramid Network) was used within the neck network to reconstruct the feature pyramid, augment the flow of feature information, minimize feature loss, and elevate detection precision. Experimental results show that the improved YOLOv7 model achieves a detection mean Average Precision of 96.0%, coupled with a 37.5% reduction in detection time, on a self-made raw cotton impurity dataset; and achieves a detection accuracy of 82.5% with a detection time of only 29.8 ms on the public DWC (Drinking Waste Classification) dataset. The improved model provides a lightweight approach for real-time detection, recognition and classification of impurities in raw cotton, yielding substantial time savings.

    Proximal policy optimization algorithm based on clipping optimization and policy guidance
    Yi ZHOU, Hua GAO, Yongshen TIAN
    Journal of Computer Applications    2024, 44 (8): 2334-2341.   DOI: 10.11772/j.issn.1001-9081.2023081079
    Abstract (357) | HTML (15) | PDF 3877KB (464)

    Addressing two issues in the Proximal Policy Optimization (PPO) algorithm, the difficulty of strictly constraining the difference between old and new policies and the relatively low efficiency of exploration and exploitation, a PPO algorithm based on Clipping Optimization And Policy Guidance (COAPG-PPO) was proposed. Firstly, by analyzing the clipping mechanism of PPO, a trust-region clipping approach based on the Wasserstein distance was devised, strengthening the constraint on the difference between old and new policies. Secondly, within the policy updating process, ideas from simulated annealing and greedy algorithms were incorporated, improving the exploration efficiency and learning speed of the algorithm. To validate the effectiveness of COAPG-PPO, comparative experiments were conducted on the MuJoCo testing benchmarks against CO-PPO (PPO based on Clipping Optimization), PPO-CMA (PPO with Covariance Matrix Adaptation), TR-PPO-RB (Trust Region-based PPO with RollBack), and the original PPO algorithm. The experimental results indicate that COAPG-PPO demonstrates stricter constraint capability, higher exploration and exploitation efficiency, and higher reward values in most environments.
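    For reference, the standard PPO-Clip surrogate that COAPG-PPO modifies can be written down directly, together with the 1-D Gaussian 2-Wasserstein distance its trust-region clipping constrains instead of a fixed ratio band. This is a sketch of the baseline mechanism and the distance only, not the paper's combined objective.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    # Standard PPO-Clip surrogate:
    # L = min(r * A, clip(r, 1 - eps, 1 + eps) * A)
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    return np.minimum(ratio * advantage, clipped * advantage)

def gaussian_w2(mu1, sigma1, mu2, sigma2):
    # 2-Wasserstein distance between two 1-D Gaussian policies:
    # W2^2 = (mu1 - mu2)^2 + (sigma1 - sigma2)^2.
    # A Wasserstein trust region would bound this quantity during updates.
    return np.sqrt((mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2)
```

    The clip removes the incentive to move the ratio outside [1 − ε, 1 + ε], but does not strictly bound the policy change, which motivates the distance-based trust region above.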

    Road damage detection algorithm based on enhanced feature extraction
    Wudan LONG, Bo PENG, Jie HU, Ying SHEN, Danni DING
    Journal of Computer Applications    2024, 44 (7): 2264-2270.   DOI: 10.11772/j.issn.1001-9081.2023070956
    Abstract (353) | HTML (9) | PDF 2806KB (583)

    In response to the difficulty of detecting small road damage areas and the uneven distribution of damage categories, a road damage detection algorithm termed RDD-YOLO was introduced based on the YOLOv7-tiny architecture. Firstly, the K-means++ algorithm was employed to determine anchor boxes that better conform to object dimensions. Subsequently, a Quantization Aware RepVGG (QARepVGG) module was utilized within the auxiliary detection branch, enhancing the extraction of shallow features. Concurrently, an Addition and Multiplication Convolutional Block Attention Module (AM-CBAM) was embedded into the three inputs of the neck, effectively suppressing interference from intricate backgrounds. Furthermore, the feature fusion module Res-RFB (Resblock with Receptive Field Block) was devised to emulate the expansion of the receptive field in human visual perception, fusing information across multiple scales and amplifying representational capacity. Additionally, a lightweight Small Decoupled Head (S-DeHead) was introduced to raise small object detection precision. Finally, small object localization was optimized using the Normalized Wasserstein Distance (NWD) metric, mitigating the sample imbalance problem. Experimental results show that RDD-YOLO achieves an improvement of 6.19 percentage points in mAP50 and 5.31 percentage points in F1-Score, with a detection speed of 135.26 frame/s, while adding only 0.71×10⁶ parameters and 1.7 GFLOPs, meeting the accuracy and speed requirements of road maintenance.
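    Anchor selection with K-means++ can be sketched as follows. This shows only the k-means++ seeding step over box width/height pairs using the 1 − IoU distance commonly used for YOLO anchors; it is an assumed illustration, not the authors' code, and omits the subsequent Lloyd refinement iterations.

```python
import numpy as np

def iou_wh(wh, anchors):
    # IoU between boxes that share a common centre, so only (w, h) matters.
    inter = np.minimum(wh[0], anchors[:, 0]) * np.minimum(wh[1], anchors[:, 1])
    union = wh[0] * wh[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeanspp_anchors(whs, k, rng=None):
    # k-means++ seeding with distance d = 1 - IoU: each next anchor is drawn
    # with probability proportional to its distance from the closest anchor
    # chosen so far, spreading anchors across the box-size distribution.
    rng = rng or np.random.default_rng(0)
    anchors = [whs[rng.integers(len(whs))]]
    while len(anchors) < k:
        d = np.array([1 - iou_wh(wh, np.array(anchors)).max() for wh in whs])
        anchors.append(whs[rng.choice(len(whs), p=d / d.sum())])
    return np.array(anchors)

# Toy box sizes: one cluster of small boxes, one of large boxes
whs = np.array([[10., 10.], [12., 12.], [100., 100.], [110., 100.]])
anchors = kmeanspp_anchors(whs, 2)
```

    Using 1 − IoU instead of Euclidean distance keeps the seeding scale-aware: a 10×10 and a 100×100 box are far apart even though both are "square".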

    Distributed UAV cluster pursuit decision-making based on trajectory prediction and MADDPG
    Yu WANG, Zhihui GUAN, Yuanpeng LI
    Journal of Computer Applications    2024, 44 (11): 3623-3628.   DOI: 10.11772/j.issn.1001-9081.2023101538
    Abstract: 348 | HTML: 4 | PDF (918KB): 120

    A Trajectory Prediction based Distributed Multi-Agent Deep Deterministic Policy Gradient (TP-DMADDPG) algorithm was proposed to address the insufficient flexibility and poor generalization ability of Unmanned Aerial Vehicle (UAV) cluster pursuit decision-making algorithms in complex mission environments. Firstly, to enhance the realism of the pursuit mission, an intelligent escape strategy was designed for the target. Secondly, considering conditions such as target information missing due to communication interruption, a Long Short-Term Memory (LSTM) network was used to predict the target's position in real time, and the state space of the decision-making model was constructed on the basis of the predictions. Finally, TP-DMADDPG was designed based on a distributed framework and the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which enhanced the flexibility and generalization ability of pursuit decision-making in complex air combat. Simulation results show that, compared with the Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic policy gradient (TD3) and MADDPG algorithms, TP-DMADDPG increases the success rate of collaborative decision-making by more than 15 percentage points, and can solve the problem of pursuing an intelligently escaping target under incomplete information.
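
    The role the LSTM predictor plays when target observations are interrupted can be illustrated with a framework-free stand-in; this constant-velocity extrapolator is only a sketch of the interface such a predictor fills, not the paper's network:

```python
def predict_position(history, steps=1):
    """Extrapolate the target's next (x, y) position from its last two
    observed points, assuming constant velocity.

    Stands in for the LSTM predictor: given a history of observed
    positions, return an estimate `steps` time steps ahead, so the
    pursuers' state space can be filled in during observation gaps.
    """
    if len(history) < 2:          # not enough data: hold last position
        return history[-1]
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0     # per-step velocity estimate
    return (x1 + steps * vx, y1 + steps * vy)
```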

    Review on security threats and defense measures in federated learning
    Xuebin CHEN, Zhiqiang REN, Hongyang ZHANG
    Journal of Computer Applications    2024, 44 (6): 1663-1672.   DOI: 10.11772/j.issn.1001-9081.2023060832
    Abstract: 347 | HTML: 22 | PDF (1072KB): 670

    Federated learning is a distributed learning approach for solving the data sharing and privacy protection problems in machine learning, in which multiple parties jointly train a machine learning model while the privacy of their data is protected. However, federated learning has inherent security threats, which pose great challenges to its practical application. Therefore, analyzing the attacks faced by federated learning and the corresponding defense measures is crucial for the development and application of federated learning. First, the definition, process and classification of federated learning were introduced, along with the attacker model in federated learning. Then, the possible attacks on both the robustness and the privacy of federated learning systems were introduced, together with the corresponding defense measures, and the shortcomings of these defense schemes were pointed out. Finally, a secure federated learning system was envisioned.
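
    Federated training as described above typically aggregates client updates by weighted averaging (FedAvg); a minimal sketch with flat lists standing in for model parameters:

```python
def fedavg(client_weights, client_sizes):
    """Federated averaging: aggregate client models into a global model.

    client_weights: list of per-client parameter vectors (flat lists).
    client_sizes:   number of local samples per client; each client's
                    contribution is weighted by its data share.
    Many of the attacks surveyed (e.g. model poisoning) target exactly
    this aggregation step.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]
```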

    Multi-robot path following and formation based on deep reinforcement learning
    Haodong HE, Hao FU, Qiang WANG, Shuai ZHOU, Wei LIU
    Journal of Computer Applications    2024, 44 (8): 2626-2633.   DOI: 10.11772/j.issn.1001-9081.2023081120
    Abstract: 342 | HTML: 9 | PDF (3411KB): 209

    Aiming at the obstacle avoidance and trajectory smoothness problems of multi-robot path following and formation in crowd environments, a multi-robot path following and formation algorithm based on deep reinforcement learning was proposed. Firstly, a pedestrian danger priority mechanism was established and combined with reinforcement learning to design a danger awareness network, enhancing the safety of the multi-robot formation. Subsequently, a virtual robot was introduced as the reference target for the robots, transforming path following into tracking control of the virtual robot and thereby enhancing the smoothness of the robot trajectories. Finally, quantitative and qualitative analyses were conducted through simulation experiments to compare the proposed algorithm with existing ones. The experimental results show that, compared with existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments and ensures the smoothness of multi-robot motion trajectories.
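
    The virtual-robot idea reduces path following to tracking leader-relative targets; a minimal sketch (waypoint-indexed leader motion and fixed formation offsets are simplifying assumptions, not the paper's controller):

```python
def formation_targets(path, progress, offsets):
    """Positions each robot should track, given a virtual leader.

    path:     list of (x, y) waypoints the virtual robot moves along.
    progress: index of the waypoint the virtual robot currently occupies.
    offsets:  per-robot (dx, dy) formation offsets from the leader.
    Path following then becomes tracking these leader-relative targets.
    """
    lx, ly = path[min(progress, len(path) - 1)]
    return [(lx + dx, ly + dy) for dx, dy in offsets]
```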

    Deep network compression method based on low-rank decomposition and vector quantization
    Dongwei WANG, Baichen LIU, Zhi HAN, Yanmei WANG, Yandong TANG
    Journal of Computer Applications    2024, 44 (7): 1987-1994.   DOI: 10.11772/j.issn.1001-9081.2023071027
    Abstract: 333 | HTML: 121 | PDF (1506KB): 467

    With the development of artificial intelligence, deep neural networks have become an essential tool in various pattern recognition tasks. Deploying deep Convolutional Neural Networks (CNN) on edge computing equipment is challenging due to storage space and computing resource constraints, so deep network compression has become an important research topic in recent years. Low-rank decomposition and vector quantization are the most popular network compression techniques; both try to find a compact representation of the original network, thereby reducing the redundancy of network parameters. By establishing a joint compression framework, a deep network compression method based on low-rank decomposition and vector quantization, called Quantized Tensor Decomposition (QTD), was proposed to obtain a higher compression ratio by performing further quantization on the low-rank structure of the network. Experimental results of the classical ResNet and the proposed method on CIFAR-10 dataset show that QTD can compress the network volume to 1% with a slight accuracy drop of 1.71 percentage points. Moreover, the proposed method was compared with the quantization-based method PQF (Permute, Quantize, and Fine-tune), the low-rank decomposition-based method TDNR (Tucker Decomposition with Nonlinear Response), and the pruning-based method CLIP-Q (Compression Learning by In-parallel Pruning-Quantization) on the large dataset ImageNet. Experimental results show that QTD maintains better classification accuracy at the same compression ratio.
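
    The joint idea of quantizing on top of a low-rank structure can be illustrated on already-computed factors; this sketch uses uniform scalar quantization for brevity, whereas the paper applies vector quantization:

```python
def quantize(mat, levels=8):
    """Uniformly quantize matrix entries to `levels` values.

    Returns (codes, lo, step) so each entry is reconstructed as
    lo + code * step.  Applied to the factor matrices of a low-rank
    decomposition, this compresses the already-compact representation
    further; scalar quantization stands in for vector quantization here.
    """
    flat = [x for row in mat for x in row]
    lo, hi = min(flat), max(flat)
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [[round((x - lo) / step) for x in row] for row in mat]
    return codes, lo, step

def dequantize(codes, lo, step):
    """Reconstruct an approximate factor matrix from its codes."""
    return [[lo + c * step for c in row] for row in codes]
```

    Each reconstructed entry is within half a quantization step of the original, so the extra error on top of the low-rank approximation is bounded.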

    Semi-supervised object detection framework guided by curriculum learning
    Yingjun ZHANG, Niuniu LI, Binhong XIE, Rui ZHANG, Wangdong LU
    Journal of Computer Applications    2024, 44 (8): 2326-2333.   DOI: 10.11772/j.issn.1001-9081.2023081062
    Abstract: 331 | HTML: 25 | PDF (2042KB): 267

    To enhance the quality of pseudo labels, address the issue of confirmation bias in Semi-Supervised Object Detection (SSOD), and tackle the problem that existing algorithms ignore the complexity of unlabeled data and thus produce erroneous pseudo labels, an SSOD framework guided by Curriculum Learning (CL) was proposed. The framework consisted of two modules: the ICSD (IoU-Confidence-Standard-Deviation) difficulty measurer and the BP (Batch-Package) training scheduler. The ICSD difficulty measurer comprehensively considered information such as the IoU (Intersection over Union) between pseudo bounding boxes, confidence, and class labels, and the C_IOU (Checkpoint_IOU) method was introduced to evaluate the reliability of unlabeled data. The BP training scheduler designed two efficient scheduling strategies from the Batch and Package perspectives respectively, giving priority to unlabeled data with high reliability so that the entire unlabeled dataset is fully utilized in a curriculum learning manner. Extensive comparative experimental results on the Pascal VOC and MS-COCO datasets demonstrate that the proposed framework applies to existing SSOD algorithms and brings significant improvements in detection accuracy and stability.
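
    The batch-level scheduling idea, feeding the most reliable pseudo-labeled data first, can be sketched independently of the detector (the reliability scores are assumed to come from an ICSD-style difficulty measurer):

```python
def curriculum_batches(samples, scores, batch_size):
    """Batches of samples ordered from most to least reliable.

    samples: unlabeled data items; scores: their reliability values
    (assumed to come from an ICSD-style measurer).  High-score
    ("easy") batches come first, so early training sees the most
    trustworthy pseudo labels while the whole set is still consumed.
    """
    order = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    ranked = [samples[i] for i in order]
    return [ranked[i:i + batch_size] for i in range(0, len(ranked), batch_size)]
```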

    Small target detection model in overlooking scenes on tower cranes based on improved real-time detection Transformer
    Yudong PANG, Zhixing LI, Weijie LIU, Tianhao LI, Ningning WANG
    Journal of Computer Applications    2024, 44 (12): 3922-3929.   DOI: 10.11772/j.issn.1001-9081.2023121796
    Abstract: 327 | HTML: 8 | PDF (3128KB): 256

    In view of construction-site personnel safety problems such as casualties caused by falling objects and tower crane collapse caused by collisions between tower hooks, a small target detection model in overlooking scenes on tower cranes based on improved Real-Time DEtection TRansformer (RT-DETR) was proposed. Firstly, a multi-branch training, single-branch inference structure designed with the idea of model reparameterization was added to the original model to improve the detection speed. Secondly, the convolution module in FasterNet Block was redesigned to replace BasicBlock in the original backbone to improve the performance of the detection model. Thirdly, a new loss function, Inner-SIoU (Inner-Structured Intersection over Union), was utilized to further improve the precision and convergence speed of the model. Finally, ablation and comparison experiments were conducted to verify the model performance. The results show that, in detecting small targets in overlooking scenes on tower cranes, the proposed model achieves a precision of 94.7%, which is 6.1 percentage points higher than that of the original RT-DETR model. At the same time, the proposed model reaches 59.7 Frames Per Second (FPS), a 21% improvement in detection speed over the original model. The Average Precision (AP) of the proposed model on the public dataset COCO 2017 is 2.4, 1.5, and 1.3 percentage points higher than those of YOLOv5, YOLOv7, and YOLOv8, respectively. The proposed model thus meets the precision and speed requirements for small target detection in overlooking scenes on tower cranes.
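
    The Inner-IoU family of losses evaluates IoU on scaled auxiliary boxes; this sketch shows that core idea only, omitting the SIoU angle and shape terms of Inner-SIoU, and the scale ratio is an assumed value:

```python
def inner_iou(box_a, box_b, ratio=0.7):
    """IoU computed on scaled-down 'inner' auxiliary boxes.

    Boxes are (cx, cy, w, h).  Shrinking both boxes by `ratio` before
    taking the IoU changes gradient behavior for high-overlap samples,
    the core idea behind Inner-IoU-style losses (the SIoU terms of the
    paper's Inner-SIoU are omitted here).
    """
    def corners(box):
        cx, cy, w, h = box
        return (cx - w * ratio / 2, cy - h * ratio / 2,
                cx + w * ratio / 2, cy + h * ratio / 2)
    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```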

    Graph data generation approach for graph neural network model extraction attacks
    Ying YANG, Xiaoyan HAO, Dan YU, Yao MA, Yongle CHEN
    Journal of Computer Applications    2024, 44 (8): 2483-2492.   DOI: 10.11772/j.issn.1001-9081.2023081110
    Abstract: 321 | HTML: 3 | PDF (3213KB): 327

    Data-free model extraction attacks are a class of machine learning security problems in which the attacker has no knowledge of the training data needed to carry out the attack. Aiming at the research gap of data-free model extraction attacks in the field of Graph Neural Network (GNN), a GNN model extraction attack method was proposed. The graph node feature information and edge information were optimized with the GNN interpretability method GNNExplainer and the graph data augmentation method GAUG-M, respectively, so as to generate the required graph data and achieve the final GNN model extraction. Firstly, the GNNExplainer method was used to obtain the important graph node features from interpretable analysis of the target model's responses. Secondly, the overall optimization of the graph node features was achieved by up-weighting the important node features and down-weighting the non-important ones. Then, a graph autoencoder was used as the edge information prediction module, which obtained the connection probability between nodes from the optimized node features. Finally, the edge information was optimized by adding or deleting the corresponding edges according to these probabilities. Three GNN architectures trained on five graph datasets were attacked as target models in the experiments; the obtained substitute models achieve 73% to 87% accuracy on node classification tasks and 76% to 89% fidelity to the target model performance, which verifies the effectiveness of the proposed method.
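
    The edge-prediction module follows the usual graph-autoencoder recipe, scoring a candidate edge by a sigmoid of the embedding inner product; a minimal sketch with node embeddings assumed given:

```python
import math

def edge_probability(z_i, z_j):
    """Probability of an edge between two nodes from their embeddings.

    Standard graph-autoencoder decoder: p(i~j) = sigmoid(z_i . z_j).
    High-probability non-edges are candidates for addition and
    low-probability edges candidates for deletion when optimizing the
    structure of the generated graph.
    """
    dot = sum(a * b for a, b in zip(z_i, z_j))
    return 1.0 / (1.0 + math.exp(-dot))
```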

    Hybrid internet of vehicles intrusion detection system for zero-day attacks
    Jiepo FANG, Chongben TAO
    Journal of Computer Applications    2024, 44 (9): 2763-2769.   DOI: 10.11772/j.issn.1001-9081.2023091328
    Abstract: 321 | HTML: 13 | PDF (2618KB): 888

    Existing machine learning methods suffer from over-reliance on sample data and insensitivity to anomalous data when confronted with zero-day attack detection, making it difficult for an Intrusion Detection System (IDS) to defend effectively against zero-day attacks. Therefore, a hybrid internet of vehicles intrusion detection system based on Transformer and ANFIS (Adaptive-Network-based Fuzzy Inference System) was proposed. Firstly, a data enhancement algorithm was designed, solving the problem of unbalanced data samples by denoising first and then generating. Secondly, a feature engineering module was designed by introducing non-linear feature interactions into complex feature combinations. Finally, the self-attention mechanism of Transformer and the adaptive learning method of ANFIS were combined, which enhanced the ability of feature representation and reduced the dependence on sample data. The proposed system was compared with SOTA (State-Of-The-Art) algorithms such as Dual-IDS on the CICIDS-2017 and UNSW-NB15 intrusion datasets. Experimental results show that, for zero-day attacks, the proposed system achieves 98.64% detection accuracy and 98.31% F1 value on CICIDS-2017, and 93.07% detection accuracy and 92.43% F1 value on UNSW-NB15, which validates the high accuracy and strong generalization ability of the proposed system for zero-day attack detection.
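
    The ANFIS half of the hybrid performs Sugeno-style fuzzy inference; a tiny forward pass illustrates the mechanism (the rule parameters below are illustrative, whereas ANFIS would learn them from data):

```python
import math

def gauss(x, mean, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def anfis_forward(x, rules):
    """Forward pass of a tiny first-order Sugeno fuzzy system.

    rules: list of (mean, sigma, a, b); each rule fires with strength
    gauss(x, mean, sigma) and outputs the linear function a*x + b.
    The final output is the firing-strength-weighted average, which is
    the inference scheme ANFIS tunes parameters for.
    """
    strengths = [gauss(x, m, s) for m, s, _, _ in rules]
    total = sum(strengths)
    return sum(w * (a * x + b) for w, (_, _, a, b) in zip(strengths, rules)) / total
```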

    Enhanced deep subspace clustering method with unified framework
    Qing WANG, Jieyu ZHAO, Xulun YE, Nongxiao WANG
    Journal of Computer Applications    2024, 44 (7): 1995-2003.   DOI: 10.11772/j.issn.1001-9081.2023101395
    Abstract: 319 | HTML: 84 | PDF (3432KB): 383

    Deep subspace clustering is a method that performs well on high-dimensional data clustering tasks. However, when dealing with challenging data, current deep subspace clustering methods with a fixed self-expressive matrix usually exhibit suboptimal clustering results, because self-expressive learning and indicator learning are conventionally treated as two separate, independent processes while the quality of the self-expressive matrix has a crucial impact on clustering accuracy. To solve these problems, an enhanced deep subspace clustering method with a unified framework was proposed. Firstly, feature learning, self-expressive learning, and indicator learning were integrated to optimize all parameters jointly, so that the self-expressive matrix was dynamically learned from the characteristics of the data, ensuring accurate capture of data features. Secondly, to improve the effect of self-expressive learning, class prototype pseudo-label learning was proposed to provide self-supervised information for feature learning and indicator learning, thereby promoting self-expressive learning. Finally, to enhance the discriminative ability of the embedded representations, orthogonality constraints were introduced to help achieve the self-expressive property. Experimental results show that, compared with AASSC (Adaptive Attribute and Structure Subspace Clustering network), the proposed method improves clustering accuracy by 1.84, 0.49 and 0.34 percentage points on the MNIST, UMIST and COIL20 datasets, respectively. The proposed method thus improves the accuracy of self-expressive matrix learning and achieves better clustering effects.
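
    The self-expressive property, each sample reconstructed from the others, can be sketched with a toy gradient-descent loop; this stands in for the self-expressive layer the method trains jointly with feature and indicator learning:

```python
def self_expressive(Z, lam=0.1, lr=0.05, iters=200):
    """Learn coefficients C (diagonal fixed to 0) so that Z ~ C Z.

    Minimizes ||Z - C Z||^2 + lam * ||C||^2 by batch gradient descent
    on a tiny list-based dataset.  Large C[i][j] indicates samples i
    and j likely share a subspace, which is what clustering exploits.
    """
    n, d = len(Z), len(Z[0])
    C = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        # residual R = Z - C Z
        R = [[Z[i][k] - sum(C[i][j] * Z[j][k] for j in range(n))
              for k in range(d)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue       # forbid trivial self-reconstruction
                grad = (-2 * sum(R[i][k] * Z[j][k] for k in range(d))
                        + 2 * lam * C[i][j])
                C[i][j] -= lr * grad
    return C
```

    On two well-separated toy clusters, the learned coefficients connect samples within the same cluster and stay near zero across clusters.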

    Review of online education learner knowledge tracing
    Yajuan ZHAO, Fanjun MENG, Xingjian XU
    Journal of Computer Applications    2024, 44 (6): 1683-1698.   DOI: 10.11772/j.issn.1001-9081.2023060852
    Abstract: 312 | HTML: 21 | PDF (2932KB): 3839

    Knowledge Tracing (KT) is a fundamental and challenging task in online education: it involves building a model of a learner's knowledge state from the learning history, with which learners can better understand their own knowledge states and teachers can better understand the learning situation of learners. The KT research for learners in online education was summarized. Firstly, the main tasks and historical progress of KT were introduced. Subsequently, traditional KT models and deep learning KT models were explained. Furthermore, relevant datasets and evaluation metrics were summarized, alongside a compilation of KT applications. In conclusion, the current status of KT was summarized, and its limitations and future prospects were discussed.
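
    Among the traditional KT models mentioned, Bayesian Knowledge Tracing has a compact update rule; a sketch with illustrative (unfitted) parameter values:

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One Bayesian Knowledge Tracing step, a classic traditional KT model.

    p_know: prior probability the learner has mastered the skill.
    The posterior is computed from the observed answer via Bayes'
    rule (accounting for slips and guesses), then mastery may grow
    through learning.  Parameter values here are illustrative
    defaults, not fitted ones.
    """
    if correct:
        obs = (p_know * (1 - p_slip)
               / (p_know * (1 - p_slip) + (1 - p_know) * p_guess))
    else:
        obs = (p_know * p_slip
               / (p_know * p_slip + (1 - p_know) * (1 - p_guess)))
    return obs + (1 - obs) * p_learn
```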

    Public traffic demand prediction based on multi-scale spatial-temporal graph convolutional network
    Huanhuan LI, Tianqiang HUANG, Xuemei DING, Haifeng LUO, Liqing HUANG
    Journal of Computer Applications    2024, 44 (7): 2065-2072.   DOI: 10.11772/j.issn.1001-9081.2023071045
    Abstract: 309 | HTML: 15 | PDF (1969KB): 325

    High-quality public traffic demand prediction has become one of the major challenges for Intelligent Transportation Systems (ITS). For public traffic demand prediction, most existing models adopt graphs with fixed structure to describe the spatial correlation of traffic demand, ignoring that traffic demand has different spatial dependence at different scales. Thus, a Multi-scale Spatial-Temporal Graph Convolutional Network (MSTGCN) model was proposed for public traffic demand prediction. Firstly, a global demand similarity graph and a local demand similarity graph were constructed at global and local scales to capture the long-term stable and short-term dynamic features of public traffic demand. Graph Convolutional Network (GCN) was introduced to extract global and local spatial information from the two graphs, and an attention mechanism was adopted to combine the two kinds of spatial information adaptively. Moreover, Gated Recurrent Unit (GRU) was used to capture the time-varying features of public traffic demand. The experimental results show that the MSTGCN model achieves Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (PCC) of 2.788 6, 1.737 1, and 0.799 2 on the New York City (NYC) Bike dataset, and 9.573 4, 5.861 2, and 0.963 1 on the NYC Taxi dataset, which proves that the MSTGCN model can effectively mine multi-scale spatial-temporal features to accurately predict future public traffic demand.
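
    The GCN propagation applied to each demand-similarity graph follows the standard rule H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W); a dependency-free sketch on list-based matrices:

```python
import math

def matmul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D^-1/2 (A+I) D^-1/2 . H . W).

    A: adjacency matrix (n x n), H: node features (n x f),
    W: weights (f x f').  Self-loops are added and the adjacency is
    symmetrically normalized before features are propagated.
    """
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    norm = [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, H), W)
    return [[max(0.0, x) for x in row] for row in out]
```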

    Incomplete multi-view clustering algorithm based on self-attention fusion
    Shunyong LI, Shiyi LI, Rui XU, Xingwang ZHAO
    Journal of Computer Applications    2024, 44 (9): 2696-2703.   DOI: 10.11772/j.issn.1001-9081.2023091253
    Abstract: 304 | HTML: 9 | PDF (2806KB): 601

    The multi-view clustering task on incomplete data has become one of the research hotspots in unsupervised learning. However, most multi-view clustering algorithms based on “shallow” models often find it difficult to extract and characterize the potential feature structures within views when dealing with large-scale high-dimensional data. At the same time, stacking or averaging methods of multi-view information fusion ignore the differences between views and do not fully consider the different contributions of each view to building a common consensus representation. To address these issues, an Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion (IMVCSAF) was proposed. Firstly, the potential features of each view were extracted with a deep autoencoder, and the consistency information among views was maximized by contrastive learning. Secondly, a self-attention mechanism was adopted to recode and fuse the potential representations of the views, comprehensively mining the inherent causality and feature complementarity between different views. Thirdly, based on the common consensus representation, the potential representations of missing instances were predicted and recovered, thereby fully implementing the multi-view clustering process. Experimental results on the Scene-15, LandUse-21, Caltech101-20 and Noisy-MNIST datasets show that the accuracy of IMVCSAF is higher than those of the other compared algorithms while meeting convergence requirements; on the Noisy-MNIST dataset with a 50% miss rate, the accuracy of IMVCSAF is 6.58 percentage points higher than that of the second best algorithm, COMPLETER (inCOMPlete muLti-view clustEring via conTrastivE pRediction).
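
    The fusion step, weighting each view's latent representation before building the consensus, can be sketched as softmax attention over view embeddings; this is a simplification of the paper's self-attention recoding, with plain dot-product scores against a query:

```python
import math

def fuse_views(views, query):
    """Fuse per-view embeddings into one consensus vector.

    views: list of equal-length embedding vectors, one per view.
    query: vector used to score each view; the softmax of the scores
    gives fusion weights, so views contribute unequally instead of
    being stacked or averaged.
    """
    scores = [sum(q * v for q, v in zip(query, view)) for view in views]
    m = max(scores)                       # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [sum(w * view[k] for w, view in zip(weights, views))
             for k in range(len(views[0]))]
    return fused, weights
```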

    Industrial defect detection method with improved masked autoencoder
    Kaili DENG, Weibo WEI, Zhenkuan PAN
    Journal of Computer Applications    2024, 44 (8): 2595-2603.   DOI: 10.11772/j.issn.1001-9081.2023081122
    Abstract: 302 | HTML: 8 | PDF (4261KB): 24

    Considering the problem of missed detection or over-detection in existing defect detection methods that require only normal samples, a method combining an improved masked autoencoder with an improved Unet was constructed to achieve pixel-level defect detection. Firstly, a defect fitting module was used to generate a defect mask image and a defect image corresponding to each normal image. Secondly, the defect image was randomly masked so that most of the defect information was removed, driving the autoencoder with Transformer structure to learn representations from the unmasked normal regions and to repair the defect image from context. A new loss function was designed to improve the model's ability to repair image details. Finally, to achieve pixel-level defect detection, the defect image and the repaired image were concatenated and input into the Unet with a channel cross-fusion Transformer structure. Experimental results on the MVTec AD dataset show that the average image-based and pixel-based Area Under the Receiver Operating Characteristic Curve (ROC AUC) of the proposed method reach 0.984 and 0.982 respectively, which are 2.9 and 3.2 percentage points higher than those of DRAEM (Discriminatively trained Reconstruction Anomaly Embedding Model), and 3.1 and 0.8 percentage points higher than those of CFLOW-AD (Anomaly Detection via Conditional normalizing FLOWs). This verifies that the proposed method has a high recognition rate and detection accuracy.
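
    The random masking step that forces the autoencoder to inpaint from context can be shown in isolation; the 75% ratio below is the common masked-autoencoder default and an assumption here, not the paper's stated value:

```python
import random

def mask_patches(num_patches, mask_ratio=0.75, seed=None):
    """Randomly split patch indices into (visible, masked) sets.

    With a high mask_ratio most of a defect image is hidden, so the
    autoencoder must reconstruct it from unmasked (mostly normal)
    context, which is how the method "repairs" defective regions.
    """
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    n_masked = int(num_patches * mask_ratio)
    return sorted(idx[n_masked:]), sorted(idx[:n_masked])
```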

    Multivariate time series prediction model based on decoupled attention mechanism
    Liting LI, Bei HUA, Ruozhou HE, Kuang XU
    Journal of Computer Applications    2024, 44 (9): 2732-2738.   DOI: 10.11772/j.issn.1001-9081.2023091301
    Abstract: 300 | HTML: 11 | PDF (1545KB): 786

    Aiming at the difficulty of fully utilizing sequence contextual semantic information and the implicit correlation information among variables in multivariate time series prediction, a model based on a decoupled attention mechanism, Decformer, was proposed for multivariate time series prediction. Firstly, a novel decoupled attention mechanism was proposed to fully utilize the embedded semantic information, thereby improving the accuracy of attention weight allocation. Secondly, a pattern correlation mining method that does not rely on explicit variable relationships was proposed to mine and utilize the implicit pattern correlation information among variables. On three real datasets of different types (TTV, ECL and PeMS-Bay), covering call traffic volume, electricity consumption and road traffic, Decformer achieves the highest prediction accuracy over all prediction horizons compared with strong open-source multivariate time series prediction models such as Long- and Short-term Time-series Network (LSTNet), Transformer and FEDformer. Compared with LSTNet, Decformer has the Mean Absolute Error (MAE) reduced by 17.73%-27.32%, 10.89%-17.01%, and 13.03%-19.64% on the TTV, ECL and PeMS-Bay datasets, respectively, and the Mean Squared Error (MSE) reduced by 23.53%-58.96%, 16.36%-23.56% and 15.91%-26.30%, respectively. Experimental results indicate that Decformer can significantly enhance the accuracy of multivariate time series prediction.
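
    A fixed statistic such as Pearson correlation gives a feel for what pattern correlation between two variables' histories means, though the paper's mining method is learned rather than a fixed formula:

```python
import math

def pattern_correlation(a, b):
    """Pearson correlation of two variables' historical series.

    A fixed-statistic proxy for implicit pattern correlation: values
    near +1 or -1 indicate a strong shared pattern between the two
    variables, values near 0 indicate little linear relationship.
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - ma) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sd_a * sd_b)
```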

    Improved U-Net algorithm based on attention mechanism and multi-scale fusion
    Song WU, Xin LAN, Jingyang SHAN, Haiwen XU
    Journal of Computer Applications    0, (): 24-28.   DOI: 10.11772/j.issn.1001-9081.2022121844
    Abstract: 300 | HTML: 6 | PDF (2163KB): 130

    Aiming at the problems of computational redundancy and difficulty in segmenting fine structures with the original U-Net in medical image segmentation tasks, an improved U-Net algorithm based on attention mechanism and multi-scale fusion was proposed. Firstly, by integrating a channel attention mechanism into the skip connections, the network focused on the channels containing more important information, reducing computational resource cost and improving computational efficiency. Secondly, a feature fusion strategy was added to enrich the contextual information of the feature maps passed to the decoder, realizing complementarity and reuse among the features. Finally, joint optimization with Dice loss and binary cross-entropy loss was performed to handle the dramatic oscillations of the loss function that may occur in fine structure segmentation. Experimental validation on the Kvasir_seg and DRIVE datasets shows that, compared with the original U-Net algorithm, the proposed improved algorithm has the Dice coefficient increased by 1.82 and 0.82 percentage points, the SEnsitivity (SE) improved by 1.94 and 3.53 percentage points, and the Accuracy (Acc) increased by 1.62 and 0.04 percentage points, respectively. The proposed improved algorithm thus enhances the performance of the original U-Net in fine structure segmentation.
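
    The joint objective combines Dice loss (region overlap) with binary cross-entropy (per-pixel); a minimal sketch over flat probability and label lists, with an assumed equal weighting rather than the paper's tuned one:

```python
import math

def dice_bce_loss(pred, target, eps=1e-7, w_dice=0.5):
    """Joint segmentation loss: w * Dice + (1 - w) * BCE.

    pred:   predicted foreground probabilities in (0, 1).
    target: ground-truth labels in {0, 1}.
    Dice handles the class imbalance typical of fine structures,
    while BCE keeps per-pixel gradients smooth; combining them damps
    the loss oscillations either term shows alone.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    dice = 1 - (2 * inter + eps) / (sum(pred) + sum(target) + eps)
    bce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
               for p, t in zip(pred, target)) / len(pred)
    return w_dice * dice + (1 - w_dice) * bce
```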

    Trajectory planning for autonomous vehicles based on model predictive control
    Chao GE, Jiabin ZHANG, Lei WANG, Zhixin LUN
    Journal of Computer Applications    2024, 44 (6): 1959-1964.   DOI: 10.11772/j.issn.1001-9081.2023050725
    Abstract: 299 | HTML: 8 | PDF (2720KB): 554

    To help an autonomous vehicle plan a safe, comfortable and efficient driving trajectory, a trajectory planning approach based on model predictive control was proposed. First, to simplify the planning environment, a safe and feasible “three-circle” expansion of the safety zone was introduced, which also eliminated the collision issues caused by the idealized vehicle model. Then, trajectory planning was decoupled into lateral and longitudinal spaces: a model predictive method was applied for lateral planning to generate a series of candidate trajectories meeting the driving requirements, and a dynamic programming approach was utilized for longitudinal planning, improving the efficiency of the planning process. Finally, the factors affecting the selection of optimal trajectories were considered comprehensively, and an optimal trajectory evaluation function was proposed to make path planning and speed planning more compatible with the driving requirements. The effectiveness of the proposed algorithm was verified by joint simulation with Matlab/Simulink, Prescan and Carsim. Experimental results indicate that the vehicle achieves the expected effects in terms of comfort metrics, steering wheel angle variation and localization accuracy, and the planned curve closely matches the tracked curve, which validates the advantage of the proposed algorithm.
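
    The “three-circle” expansion approximates the vehicle footprint with three circles along its longitudinal axis, reducing collision checking to circle-distance tests; the circle spacing and radius below are illustrative assumptions, not the paper's parameters:

```python
import math

def vehicle_circles(x, y, heading, length=4.5, radius=1.3):
    """Centers of three covering circles along the vehicle's axis.

    (x, y) is the vehicle center and heading its yaw in radians;
    circles sit at the rear, center and front thirds of the body.
    """
    offsets = (-length / 3, 0.0, length / 3)
    return [(x + d * math.cos(heading), y + d * math.sin(heading))
            for d in offsets]

def collides(x, y, heading, obstacle, obstacle_radius, radius=1.3):
    """True if any covering circle intersects a circular obstacle."""
    ox, oy = obstacle
    return any(math.hypot(cx - ox, cy - oy) < radius + obstacle_radius
               for cx, cy in vehicle_circles(x, y, heading))
```

    Candidate trajectories whose poses pass this test everywhere are collision-free under the circle approximation, which is conservative but cheap to evaluate inside the planner loop.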

    Review of marine ship communication cybersecurity
    Zhongdai WU, Dezhi HAN, Haibao JIANG, Cheng FENG, Bing HAN, Chongqing CHEN
    Journal of Computer Applications    2024, 44 (7): 2123-2136.   DOI: 10.11772/j.issn.1001-9081.2023070975
    Abstract: 299 | HTML: 6 | PDF (3942KB): 1094

    Maritime transportation is one of the most important modes of human transportation, and maritime cybersecurity is crucial to avoiding financial loss and ensuring shipping safety. Due to obvious weaknesses in maritime cybersecurity, maritime cyberattacks are frequent. There is a large body of research literature on maritime cybersecurity at home and abroad, but most of it has not yet been systematically reviewed. The structures, risks and countermeasures of maritime networks were systematically organized and comprehensively introduced. On this basis, some suggestions were put forward for dealing with maritime cyberthreats.

    Dual-branch low-light image enhancement network combining spatial and frequency domain information
    Dahai LI, Zhonghua WANG, Zhendong WANG
    Journal of Computer Applications    2024, 44 (7): 2175-2182.   DOI: 10.11772/j.issn.1001-9081.2023070933
    Abstract: 292 | HTML: 11 | PDF (3079KB): 471

    To address the problems of blurred texture details and color distortion in low-light image enhancement, an end-to-end lightweight dual-branch network combining spatial and frequency information, named SAFNet, was proposed. A Transformer-based spatial block and a frequency block were adopted by SAFNet to process the spatial information and the Fourier-transformed frequency information of the input image in the spatial and frequency branches, respectively. An attention mechanism was applied to adaptively fuse the features captured from the two branches to obtain the final enhanced image. Furthermore, a frequency-domain loss term was added into the joint loss function, so that SAFNet was constrained in both the spatial and frequency domains. Experiments on the public datasets LOL and LSRW were conducted to evaluate the performance of SAFNet. Experimental results show that SAFNet achieves 0.823 Structural SIMilarity (SSIM) and 0.114 Learned Perceptual Image Patch Similarity (LPIPS) on LOL, and 17.234 dB Peak Signal-to-Noise Ratio (PSNR) and 0.550 SSIM on LSRW, outperforming mainstream methods such as LLFormer (Low-Light Transformer), IAT (Illumination Adaptive Transformer), and KinD (Kindling the Darkness)++ with only 0.07×10⁶ parameters. On the DarkFace dataset, the average precision of face detection is increased from 52.6% to 72.5% with SAFNet as a preprocessing step. These results illustrate that SAFNet can effectively enhance low-light image quality and significantly improve the performance of downstream low-light face detection.
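
    The frequency-domain loss term compares Fourier spectra of the enhanced and reference images; a 1-D stdlib sketch with an explicit DFT (images would use a 2-D FFT, and the L1 norm on magnitudes is an assumption, not the paper's stated choice):

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a real 1-D signal."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)) for k in range(n)]

def frequency_loss(pred, target):
    """Mean L1 distance between Fourier magnitudes of two signals.

    A 1-D stand-in for the frequency-domain loss: penalizing spectral
    differences constrains global structure (illumination, texture
    periodicity) that purely pixel-space losses can miss.
    """
    fp, ft = dft(pred), dft(target)
    return sum(abs(abs(a) - abs(b)) for a, b in zip(fp, ft)) / len(pred)
```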

    Multi-domain fake news detection model enhanced by APK-CNN and Transformer
    Jinjin LI, Guoming SANG, Yijia ZHANG
    Journal of Computer Applications    2024, 44 (9): 2674-2682.   DOI: 10.11772/j.issn.1001-9081.2023091359
    Abstract: 292 | HTML: 17 | PDF (1378KB): 987

    In order to solve the problems of domain shifting and incomplete domain labeling in social media news, as well as to explore more efficient multi-domain news feature extraction and fusion networks, a multi-domain fake news detection model enhanced by APK-CNN (Adaptive Pooling Kernel Convolutional Neural Network) and Transformer, namely Transm3, was proposed. Firstly, a three-channel network was designed for the extraction and representation of the semantic, emotional, and stylistic features of the text, and these features were combined through a multi-granularity cross-domain interactor. Secondly, the news domain labels were refined by an optimized soft-shared memory network and domain adapters. Then, the Transformer was combined with the multi-granularity cross-domain interactor to dynamically aggregate the weighted interaction features of different domains. Finally, the fused features were fed into the classifier for true/false news discrimination. Experimental results show that, compared with M3FEND (Memory-guided Multi-view Multi-domain FakE News Detection) and EANN (Event Adversarial Neural Networks for multi-modal fake news detection), Transm3 improves the comprehensive F1 value by 3.68% and 6.46% on a Chinese dataset, and by 6.75% and 11.93% on an English dataset, respectively; the F1 values on sub-domains are also significantly improved. These results fully validate the effectiveness of Transm3 for multi-domain fake news detection.
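The dynamic weighted aggregation of per-domain interaction features can be illustrated with a plain dot-product attention sketch. The function name and the scoring scheme are assumptions for illustration; the abstract does not fully specify Transm3's aggregator.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_domain_features(features, query):
    """Attention-weighted fusion of per-domain feature vectors.

    features: (n_domains, d) matrix of cross-domain interaction features.
    query:    (d,) representation of the current news item.
    Scores are dot products passed through softmax, so the output is a
    convex combination of the domain features, weighted per item.
    """
    scores = softmax(features @ query)   # (n_domains,) weights summing to 1
    return scores @ features             # (d,) fused feature vector
```

Because the weights depend on the query, each news item draws on the domains most relevant to it rather than on a fixed average.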

    Shorter long-sequence time series forecasting model
    Zexin XU, Lei YANG, Kangshun LI
    Journal of Computer Applications    2024, 44 (6): 1824-1831.   DOI: 10.11772/j.issn.1001-9081.2023060799

    Aiming at the problem that most existing research studies short-sequence and long-sequence time series forecasting separately, which leads to poor forecasting accuracy on shorter long-sequence time series, a Shorter Long-sequence Time Series Forecasting Model (SLTSFM) was proposed. Firstly, a Sequence-to-Sequence (Seq2Seq) structure was constructed using a Convolutional Neural Network (CNN) and the PBUSM (Probsparse Based on Uniform Selection Mechanism) self-attention mechanism to extract the features of the long-sequence input. Secondly, a “far light, near heavy” strategy was designed to reallocate the features of each time period extracted by multiple Long Short-Term Memory (LSTM) modules, which are more capable of short-sequence feature extraction. Finally, the reallocated features were used to enhance the extracted long-sequence input features, improving the forecasting accuracy and realizing the time series forecasting. Four publicly available time series datasets were used to verify the effectiveness of the proposed model. Experimental results demonstrate that, compared with the suboptimal comprehensively performing model Gated Recurrent Unit (GRU), the Mean Absolute Error (MAE) of SLTSFM on the four datasets was reduced by 61.54%, 13.48%, 0.92% and 19.58% for univariate time series forecasting, and by 17.01%, 18.13%, 3.24% and 6.73% for multivariate time series forecasting. This verifies that SLTSFM is effective in improving the accuracy of shorter long-sequence time series forecasting.
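The “far light, near heavy” reallocation can be sketched with exponentially decaying weights over the per-period features: recent periods dominate, distant ones contribute lightly. The decay factor `gamma` and the exponential form are assumptions for illustration; the paper's exact weighting scheme may differ.

```python
import numpy as np

def far_light_near_heavy(segment_feats, gamma=0.5):
    """Reweight per-period features so recent periods dominate.

    segment_feats: (n_segments, d) array, ordered oldest -> newest,
    e.g. the outputs of the per-period LSTM modules.
    """
    n = len(segment_feats)
    w = gamma ** np.arange(n - 1, -1, -1)   # oldest gets the smallest weight
    w = w / w.sum()                          # normalize to sum to 1
    return (w[:, None] * segment_feats).sum(axis=0)
```

This reflects the recency assumption behind the strategy: the most recent periods carry the most predictive information, so their features are amplified before being fused with the long-sequence features.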

    Time series causal inference method based on adaptive threshold learning
    Qinzhuang ZHAO, Hongye TAN
    Journal of Computer Applications    2024, 44 (9): 2660-2666.   DOI: 10.11772/j.issn.1001-9081.2023091278

    Time-series data exhibit a recency characteristic, i.e., variable values generally depend on recent historical information. Existing time-series causal inference methods do not fully consider this characteristic: they use a uniform threshold when inferring causal relationships with different delays through hypothesis testing, making it difficult to effectively infer weaker causal relationships. To address this issue, a time-series causal inference method based on adaptive threshold learning was proposed. Firstly, data characteristics were extracted. Then, based on the data characteristics at different delays, a combination of thresholds used in the hypothesis testing process was learned automatically. Finally, this threshold combination was applied to the hypothesis testing processes of the PC (Peter-Clark) algorithm, the PCMCI (Peter-Clark and Momentary Conditional Independence) algorithm, and the VAR-LINGAM (Vector AutoRegressive LINear non-Gaussian Acyclic Model) algorithm to obtain more accurate causal structures. Experimental results on simulated datasets show that the F1 values of the adaptive PC, adaptive PCMCI, and adaptive VAR-LINGAM algorithms using the proposed method are all improved.
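The core idea of replacing one uniform cutoff with a lag-specific threshold combination can be sketched as follows. The data structures and names here are hypothetical; the paper's learning procedure for the threshold combination is not reproduced.

```python
def infer_edges(p_values, thresholds):
    """Keep a candidate causal edge when its test p-value falls below
    the threshold learned for that edge's delay, instead of applying
    one uniform cutoff to all delays.

    p_values:   dict mapping (cause, effect, lag) -> p-value from a
                conditional independence test.
    thresholds: dict mapping lag -> learned threshold.
    """
    return {edge for edge, p in p_values.items()
            if p < thresholds[edge[2]]}
```

With a uniform threshold of 0.05, a weak long-delay link with p = 0.08 would be discarded; a learned per-lag threshold of 0.10 at that delay retains it, which is how the adaptive variants recover weaker causal relationships.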
