Most Downloaded Articles


    Most Downloaded in Recent Year
    Construction of digital twin water conservancy knowledge graph integrating large language model and prompt learning
    Yan YANG, Feng YE, Dong XU, Xuejie ZHANG, Jin XU
    Journal of Computer Applications    2025, 45 (3): 785-793.   DOI: 10.11772/j.issn.1001-9081.2024050570
    Abstract (296)   HTML (9)   PDF (2950KB) (1687)

    Constructing a knowledge graph for digital twin water conservancy construction, and mining the potential relationships between water conservancy construction objects, can help relevant personnel optimize design schemes and decision-making processes. Digital twin water conservancy construction is interdisciplinary and has a complex knowledge structure, and general knowledge extraction models learn the water conservancy domain insufficiently and extract knowledge with low accuracy. To improve the accuracy of knowledge extraction, a Digital Twin water conservancy construction Knowledge Extraction method based on Large Language Model (DTKE-LLM) was proposed. In this method, a local Large Language Model (LLM) was deployed through LangChain and integrated with digital twin water conservancy domain knowledge, and prompt learning was used to fine-tune the LLM. The semantic understanding and generation capabilities of the LLM were utilized to extract knowledge. At the same time, a heterogeneous entity alignment strategy was designed to optimize the entity extraction results. Comparison experiments and ablation experiments were carried out on a water conservancy domain corpus to verify the effectiveness of DTKE-LLM. Results of the comparison experiments demonstrate that DTKE-LLM outperforms the deep learning-based BiLSTM-CRF (Bidirectional Long Short-Term Memory Conditional Random Field) named entity recognition model and the general information extraction model UIE (Universal Information Extraction) in precision. Results of the ablation experiments show that, compared with ChatGLM2-6B (Chat Generative Language Model 2, 6 Billion), DTKE-LLM improves the F1 scores of entity extraction and relation extraction by 5.5 and 3.2 percentage points, respectively. It can be seen that the proposed method realizes the construction of the digital twin water conservancy construction knowledge graph while ensuring the quality of knowledge graph construction.
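As a reading aid for the F1 comparison reported in this abstract, the following is a minimal sketch of how an F1 score and a gain in percentage points are typically computed. The function names and the counts in the usage note are hypothetical and are not taken from the paper; the paper's own evaluation protocol may differ.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Compute F1 from raw counts; returns 0.0 when precision or recall is undefined or zero."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def percentage_point_gain(new_score: float, baseline_score: float) -> float:
    """Absolute difference between two scores in [0, 1], expressed in percentage points."""
    return (new_score - baseline_score) * 100.0
```

For example, with hypothetical counts, `f1_score(8, 2, 2)` gives precision and recall of 0.8 each, so the F1 score is also 0.8. A gain stated in percentage points is an absolute difference between two such scores, not a relative change.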

    Bias challenges of large language models: identification, evaluation, and mitigation
    Yuemei XU, Yuqi YE, Xueyi HE
    Journal of Computer Applications    2025, 45 (3): 697-708.   DOI: 10.11772/j.issn.1001-9081.2024091350
    Abstract (393)   HTML (16)   PDF (2112KB) (919)

    Biases in the output of Large Language Models (LLMs) cause safety and controllability problems. To address these problems, the research status, techniques, and limitations related to biases in existing LLMs were organized and analyzed in depth from three aspects: bias identification, evaluation, and mitigation. Firstly, three key techniques of LLMs were summarized to study the fundamental reasons for LLMs’ inevitable intrinsic biases. Secondly, biases in LLMs were categorized into three types: linguistic bias, demographic bias, and evaluation bias, and the characteristics and causes of these biases were explored. Thirdly, a systematic review of the existing LLM bias evaluation benchmarks was carried out, and the strengths and weaknesses of these general-purpose, language-specific, and task-specific benchmarks were discussed. Finally, current LLM bias mitigation techniques were analyzed in depth from both model bias mitigation and data bias mitigation perspectives, and directions for their future refinement were pointed out. In addition, future research directions for biases in LLMs were identified: multi-cultural attribute evaluation of bias, lightweight bias mitigation techniques, and enhancement of the interpretability of biases.

    ScholatGPT: a large language model for academic social networks and its intelligent applications
    Chengzhe YUAN, Guohua CHEN, Dingding LI, Yuan ZHU, Ronghua LIN, Hao ZHONG, Yong TANG
    Journal of Computer Applications    2025, 45 (3): 755-764.   DOI: 10.11772/j.issn.1001-9081.2024101477
    Abstract (520)   HTML (29)   PDF (2602KB) (1719)

    To address the limitations of the existing Large Language Models (LLMs) in processing cross-domain knowledge, updating real-time academic information, and ensuring output quality, ScholatGPT, a scholar LLM based on Academic Social Networks (ASNs), was proposed. In ScholatGPT, the abilities of precise semantic retrieval and dynamic knowledge update were enhanced by integrating Knowledge-Graph Augmented Generation (KGAG) and Retrieval-Augmented Generation (RAG), and optimization and fine-tuning were used to improve the generation quality of academic text. Firstly, a scholar knowledge graph was constructed based on relational data from SCHOLAT, with LLMs employed to enrich the graph semantically. Then, a KGAG-based retrieval model was introduced, combined with RAG to realize multi-path hybrid retrieval, thereby enhancing the model’s precision in search. Finally, fine-tuning techniques were applied to optimize the model’s generation quality in academic fields. Experimental results demonstrate that ScholatGPT achieves the precision of 83.2% in academic question answering tasks, outperforming GPT-4o and AMiner AI by 69.4 and 11.5 percentage points, and performs well in all the tasks such as scholar profiling, representative work identification, and research field classification. Furthermore, ScholatGPT obtains stable and competitive results in answer relevance, coherence, and readability, achieving a good balance between specialization and readability. Additionally, ScholatGPT-based intelligent applications such as scholar think tank and academic information recommendation system improve academic resource acquisition efficiency effectively.

    Academic journal contribution recommendation algorithm based on author preferences
    Yongfeng DONG, Xiangqian QU, Linhao LI, Yao DONG
    Journal of Computer Applications    2022, 42 (1): 50-56.   DOI: 10.11772/j.issn.1001-9081.2021010185
    Abstract (650)   HTML (39)   PDF (605KB) (1884)

    In order to solve the problem that the algorithms of publication venue recommendation always consider the text topics or the author’s history of publications separately, which leads to the low accuracy of publication venue recommendation results, a contribution recommendation algorithm of academic journal based on author preferences was proposed. In this algorithm, not only the text topics and the author’s history of publications were used together, but also the potential relationship between the academic focuses of publication venues and time were explored. Firstly, the Latent Dirichlet Allocation (LDA) topic model was used to extract the topic information of the paper title. Then, the topic-journal and time-journal model diagrams were established, and the Large-scale Information Network Embedding (LINE) model was used to learn the embedding of graph nodes. Finally, the author’s subject preferences and history of publication records were fused to calculate the journal composite scores, and the publication venue recommendation for author to contribute was realized. Experimental results on two public datasets, DBLP and PubMed, show that the proposed algorithm has better recall under different list lengths of recommended publication venues compared to six algorithms such as Singular Value Decomposition (SVD), DeepWalk and Non-negative Matrix Factorization (NMF). The proposed algorithm maintains high accuracy while requiring less information from papers and knowledge bases, and can effectively improve the robustness of publication venue recommendation algorithm.

    Probability-driven dynamic multiobjective evolutionary optimization for multi-agent cooperative scheduling
    Xiaofang LIU, Jun ZHANG
    Journal of Computer Applications    2024, 44 (5): 1372-1377.   DOI: 10.11772/j.issn.1001-9081.2023121865
    Abstract (494)   HTML (19)   PDF (1353KB) (1885)

    In multi-agent systems, there are multiple cooperative tasks that change with time and multiple conflicting optimization objective functions. Therefore, the dynamic multiobjective multi-agent cooperative scheduling problem is one of the critical problems in building a multi-agent system. To solve this problem, a probability-driven dynamic prediction strategy was proposed, which utilizes the probability distributions in historical environments to predict those in new environments, thus generating new solutions and responding quickly to environmental changes. In detail, an element-based representation for probability distributions was designed to represent the adaptability of elements in dynamic environments, and the probability distributions were gradually updated towards the real distributions according to the best solutions found by optimization algorithms in each iteration. Taking into account the continuity and relevance of environmental changes, a fusion-based prediction mechanism was built to predict the probability distributions and to provide a priori knowledge of new environments by fusing historical probability distributions when the environment changes. A new heuristic-based sampling mechanism was also proposed, which combines probability distributions and heuristic information to generate new solutions for updating out-of-date populations. The proposed probability-driven dynamic prediction strategy can be inserted into any multiobjective evolutionary algorithm, resulting in probability-driven dynamic multiobjective evolutionary algorithms. Experimental results on 10 dynamic multiobjective multi-agent cooperative scheduling problem instances show that the proposed algorithms outperform the competing algorithms in terms of solution optimality and diversity, and that the proposed probability-driven dynamic prediction strategy can improve the performance of multiobjective evolutionary algorithms in dynamic environments.

    Summary of network intrusion detection systems based on deep learning
    Miaolei DENG, Yupei KAN, Chuanchuan SUN, Haihang XU, Shaojun FAN, Xin ZHOU
    Journal of Computer Applications    2025, 45 (2): 453-466.   DOI: 10.11772/j.issn.1001-9081.2024020229
    Abstract (567)   HTML (45)   PDF (1427KB) (2721)

    Security mechanisms such as Intrusion Detection System (IDS) have been used to protect network infrastructure and communication from network attacks. With the continuous progress of deep learning technology, IDSs based on deep learning have become a research hotspot in the field of network security gradually. Through extensive literature research, a detailed introduction to the latest research progress in network intrusion detection using deep learning technology was given. Firstly, a brief overview of several IDSs was performed. Secondly, the commonly used datasets and evaluation metrics in deep learning-based IDSs were introduced. Thirdly, the commonly used deep learning models in network IDSs and their application scenarios were summarized. Finally, the problems faced in the current related research were discussed, and the future development directions were proposed.

    Deep symbolic regression method based on Transformer
    Pengcheng XU, Lei HE, Chuan LI, Weiqi QIAN, Tun ZHAO
    Journal of Computer Applications    2025, 45 (5): 1455-1463.   DOI: 10.11772/j.issn.1001-9081.2024050609
    Abstract (362)   HTML (6)   PDF (3565KB) (888)

    To address the challenges of reduced population diversity and sensitivity to hyperparameters when solving Symbolic Regression (SR) problems with genetic evolutionary algorithms, a Deep Symbolic Regression Technique (DSRT) method based on Transformer was proposed. This method employed the autoregressive capability of Transformer to generate expression symbol sequences. Subsequently, a transformation of the fitness value between the data and the expression symbol sequence served as the reward value, and the model parameters were updated through deep reinforcement learning, so that the model output expression sequences that fitted the data better; as the model converged, the optimal expression was identified. The effectiveness of the DSRT method was validated on the SR benchmark dataset Nguyen, and it was compared with the DSR (Deep Symbolic Regression) and GP (Genetic Programming) algorithms within 200 iterations. Experimental results confirm the validity of the DSRT method. Additionally, the influence of various parameters on the DSRT method was discussed, and an experiment to predict the formula for the surface pressure coefficient of an aircraft airfoil using the NACA4421 dataset was performed. Compared with the Kármán-Tsien formula, the obtained formula has a lower Root Mean Square Error (RMSE).

    Multivariate time series prediction model based on decoupled attention mechanism
    Liting LI, Bei HUA, Ruozhou HE, Kuang XU
    Journal of Computer Applications    2024, 44 (9): 2732-2738.   DOI: 10.11772/j.issn.1001-9081.2023091301
    Abstract (933)   HTML (11)   PDF (1545KB) (1951)

    Aiming at the problem that it is difficult to fully utilize the sequence contextual semantic information and the implicit correlation information among variables in multivariate time-series prediction, a model based on decoupled attention mechanism — Decformer was proposed for multivariate time-series prediction. Firstly, a novel decoupled attention mechanism was proposed to fully utilize the embedded semantic information, thereby improving the accuracy of attention weight allocation. Secondly, a pattern correlation mining method without relying on explicit variable relationships was proposed to mine and utilize implicit pattern correlation information among variables. On three different types of real datasets (TTV, ECL and PeMS-Bay), including traffic volume of call, electricity consumption and traffic, Decformer achieves the highest prediction accuracy over all prediction time lengths compared with excellent open-source multivariate time-series prediction models such as Long- and Short-term Time-series Network (LSTNet), Transformer and FEDformer. Compared with LSTNet, Decformer has the Mean Absolute Error (MAE) reduced by 17.73%-27.32%, 10.89%-17.01%, and 13.03%-19.64% on TTV, ECL and PeMS-Bay datasets, respectively, and the Mean Squared Error (MSE) reduced by 23.53%-58.96%, 16.36%-23.56% and 15.91%-26.30% on TTV, ECL and PeMS-Bay datasets, respectively. Experimental results indicate that Decformer can enhance the accuracy of multivariate time series prediction significantly.
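The error reductions quoted in this abstract can be read through the standard definitions of MAE and MSE. The sketch below is a minimal illustration, assuming the reported percentages are relative reductions against the baseline's error; the function names are hypothetical and no values from the paper are reproduced.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction_percent(baseline_error: float, model_error: float) -> float:
    """Percentage by which model_error is lower than baseline_error (assumed reading)."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error * 100.0
```

Under this reading, a baseline error of 2.0 and a model error of 1.5 would correspond to a 25% reduction. Because MSE squares each residual, it penalizes large errors more heavily than MAE, which is one reason the two metrics can show different reduction ranges on the same dataset.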

    Enterprise ESG indicator prediction model based on richness coordination technology
    Yan LI, Guanhua YE, Yawen LI, Meiyu LIANG
    Journal of Computer Applications    2025, 45 (2): 670-676.   DOI: 10.11772/j.issn.1001-9081.2024030262
    Abstract (308)   HTML (7)   PDF (1400KB) (2914)

    The Environmental, Social, and Governance (ESG) indicator is critical for assessing the sustainability of enterprises. The existing ESG assessment systems face challenges such as narrow coverage, strong subjectivity, and poor timeliness. Thus, there is an urgent need for research on prediction models that can forecast the ESG indicator accurately using enterprise data. To address the issue of inconsistent information richness among ESG-related features in enterprise data, a prediction model RCT (Richness Coordination Transformer) was proposed for enterprise ESG indicator prediction based on richness coordination technology. In this model, an auto-encoder was used in the upstream richness coordination module to coordinate features with heterogeneous information richness, thereby enhancing the ESG indicator prediction performance of the downstream module. Experimental results on real datasets demonstrate that, on various prediction indicators, the RCT model outperforms multiple models including Temporal Convolutional Network (TCN), Long Short-Term Memory (LSTM) network, Self-Attention Model (Transformer), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). These results verify the effectiveness and superiority of the RCT model in ESG indicator prediction.

    Reasoning question answering model of complex temporal knowledge graph with graph attention
    Wenjuan JIANG, Yi GUO, Jiaojiao FU
    Journal of Computer Applications    2024, 44 (10): 3047-3057.   DOI: 10.11772/j.issn.1001-9081.2023101391
    Abstract (363)   HTML (9)   PDF (2228KB) (649)

    In the task of Temporal Knowledge Graph Question Answering (TKGQA), it is a challenge for models to capture and utilize the implicit temporal information in questions to enhance their complex reasoning ability. To address this problem, a Graph Attention mechanism-integrated Complex Temporal knowledge graph Reasoning question answering (GACTR) model was proposed. The proposed model was pretrained on a temporal Knowledge Base (KB) in the form of quadruples, and a Graph Attention neTwork (GAT) was introduced to effectively capture implicit temporal information in the question. The relationship representation trained by Robustly optimized Bidirectional Encoder Representations from Transformers pretraining approach (RoBERTa) was integrated to enhance the temporal relationship representation of the question. This representation was combined with the pretrained Temporal Knowledge Graph (TKG) embedding, and the final prediction result was the entity or timestamp with the highest score. On the largest benchmark dataset CRONQUESTIONS, compared to the baseline model Knowledge Graph Question Answering on CRONQUESTIONS (CRONKGQA), the GACTR model achieved improvements of 34.6 and 13.2 percentage points in handling complex question and time answer types, respectively; compared to the Temporal Question Reasoning (TempoQR) model, the improvements were 8.3 and 2.8 percentage points, respectively.

    Survey of visual object tracking methods based on Transformer
    Ziwen SUN, Lizhi QIAN, Chuandong YANG, Yibo GAO, Qingyang LU, Guanglin YUAN
    Journal of Computer Applications    2024, 44 (5): 1644-1654.   DOI: 10.11772/j.issn.1001-9081.2023060796
    Abstract (842)   HTML (22)   PDF (1615KB) (2132)

    Visual object tracking is one of the important tasks in computer vision. To achieve high-performance object tracking, a large number of object tracking methods have been proposed in recent years. Among them, Transformer-based object tracking methods have become a hot topic in the field of visual object tracking due to their ability to perform global modeling and capture contextual information. Firstly, existing Transformer-based visual object tracking methods were classified based on their network structures, the underlying principles and key techniques for model improvement were expounded, and the advantages and disadvantages of different network structures were summarized. Then, the experimental results of the Transformer-based visual object tracking methods on public datasets were compared to analyze the impact of network structure on performance. Among these methods, MixViT-L (ConvMAE) achieved tracking success rates of 73.3% and 86.1% on LaSOT and TrackingNet, respectively, indicating that object tracking methods based on the pure Transformer two-stage architecture have better performance and broader development prospects. Finally, the limitations of these methods, such as complex network structure, large number of parameters, high training requirements, and difficulty in deploying on edge devices, were summarized, and future research focuses were discussed: by combining model compression, self-supervised learning, and Transformer interpretability analysis, more feasible solutions for Transformer-based visual object tracking could be presented.

    Multimodal knowledge graph representation learning: a review
    Chunlei WANG, Xiao WANG, Kai LIU
    Journal of Computer Applications    2024, 44 (1): 1-15.   DOI: 10.11772/j.issn.1001-9081.2023050583
    Abstract (1547)   HTML (119)   PDF (3449KB) (6916)

    By comprehensively comparing the models of traditional knowledge graph representation learning, including the advantages and disadvantages and the applicable tasks, the analysis shows that the traditional single-modal knowledge graph cannot represent knowledge well. Therefore, how to use multimodal data such as text, image, video, and audio for knowledge graph representation learning has become an important research direction. At the same time, the commonly used multimodal knowledge graph datasets were analyzed in detail to provide data support for relevant researchers. On this basis, the knowledge graph representation learning models under multimodal fusion of text, image, video, and audio were further discussed, and various models were summarized and compared. Finally, the effect of multimodal knowledge graph representation on enhancing classical applications, including knowledge graph completion, question answering system, multimodal generation and recommendation system in practical applications was summarized, and the future research work was prospected.

    Cross-view matching model based on attention mechanism and multi-granularity feature fusion
    Meiyu CAI, Runzhe ZHU, Fei WU, Kaiyu ZHANG, Jiale LI
    Journal of Computer Applications    2024, 44 (3): 901-908.   DOI: 10.11772/j.issn.1001-9081.2023040412
    Abstract (379)   HTML (14)   PDF (3816KB) (1648)

    Cross-view scene matching refers to the discovery of images of the same geographical target from different platforms (such as drones and satellites). However, different image platforms lead to low accuracy of UAV (Unmanned Aerial Vehicle) positioning and navigation tasks, and the existing methods usually focus only on a single dimension of the image and ignore the multi-dimensional features of the image. To solve the above problems, GAMF (Global Attention and Multi-granularity feature Fusion) deep neural network was proposed to improve feature representation and feature distinguishability. Firstly, the images from the UAV perspective and the satellite perspective were combined, and the three branches were extended under the unified network architecture, the spatial location, channel and local features of the images from three dimensions were extracted. Then, by establishing the SGAM (Spatial Global relationship Attention Module) and CGAM (Channel Global Attention Module), the spatial global relationship mechanism and channel attention mechanism were introduced to capture global information, so as to better carry out attention learning. Secondly, in order to fuse local perception features, a local division strategy was introduced to better improve the model’s ability to extract fine-grained features. Finally, the features of the three dimensions were combined as the final features to train the model. The test results on the public dataset University-1652 show that the AP (Average Precision) of the GAMF model on UAV visual positioning tasks reaches 87.41%, and the Recall (R@1) in UAV visual navigation tasks reaches 90.30%, which verifies that the GAMF model can effectively aggregate the multi-dimensional features of the image and improve the accuracy of UAV positioning and navigation tasks.
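The two retrieval metrics named in this abstract, Recall@1 and Average Precision, can be illustrated with their textbook definitions. The sketch below is an assumption-laden illustration: the function names and data layout are hypothetical, and the University-1652 evaluation code may use a different AP approximation.

```python
from typing import Hashable, Sequence, Set, Tuple

# One query: (gallery ids ranked by descending similarity, set of true-match ids)
Query = Tuple[Sequence[Hashable], Set[Hashable]]


def recall_at_k(queries: Sequence[Query], k: int = 1) -> float:
    """Fraction of queries with at least one true match among the top-k results."""
    if not queries:
        return 0.0
    hits = sum(1 for ranked, relevant in queries
               if any(item in relevant for item in ranked[:k]))
    return hits / len(queries)


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of precision values taken at each rank where a true match appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)
```

Recall@1 only checks whether the top-ranked result is correct, whereas AP rewards placing all true matches early in the ranking, so the two metrics can move independently.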

    UAV cluster cooperative combat decision-making method based on deep reinforcement learning
    Lin ZHAO, Ke LYU, Jing GUO, Chen HONG, Xiancai XIANG, Jian XUE, Yong WANG
    Journal of Computer Applications    2023, 43 (11): 3641-3646.   DOI: 10.11772/j.issn.1001-9081.2022101511
    Abstract (927)   HTML (22)   PDF (2944KB) (2484)

    When an Unmanned Aerial Vehicle (UAV) cluster attacks ground targets, it is divided into two formations: a strike UAV cluster that attacks the targets and an auxiliary UAV cluster that pins down the enemy. When auxiliary UAVs choose between the action strategies of aggressive attack and saving strength, the mission scenario is similar to a public goods game in which the benefits to the cooperator are less than those to the betrayer. Based on this, a decision-making method for cooperative combat of UAV clusters based on deep reinforcement learning was proposed. First, by building a UAV cluster combat model based on the public goods game, the conflict of interest between individual and group in the cooperation of intelligent UAV clusters was simulated. Then, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm was used to solve for the most reasonable combat decision of the auxiliary UAV cluster, so as to achieve cluster victory with minimum loss cost. Training and experiments were performed with different numbers of UAVs. The results show that, compared with two algorithms, IDQN (Independent Deep Q-Network) and ID3QN (Imitative Dueling Double Deep Q-Network), the proposed algorithm has the best convergence, its winning rate can reach 100% with four auxiliary UAVs, and it also significantly outperforms the comparison algorithms with other numbers of UAVs.

    Survey and prospect of large language models
    Xiaolin QIN, Xu GU, Dicheng LI, Haiwen XU
    Journal of Computer Applications    2025, 45 (3): 685-696.   DOI: 10.11772/j.issn.1001-9081.2025010128
    Abstract (1234)   HTML (93)   PDF (2035KB) (2373)

    Large Language Models (LLMs) are a class of language models composed of artificial neural networks with a vast number of parameters (typically billions of weights or more). They are trained on a large amount of unlabeled text using self-supervised or semi-supervised learning and are the core of current generative Artificial Intelligence (AI) technologies. Compared to traditional language models, LLMs demonstrate stronger language understanding and generation capabilities, supported by substantial computational power, extensive parameters, and large-scale data. They are widely applied in tasks such as machine translation, question answering systems, and dialogue generation with good performance. Most of the existing surveys focus on the theoretical construction and training techniques of LLMs, while systematic exploration of LLMs’ industry-level application practices and evolution of the technological ecosystem remains insufficient. Therefore, based on introducing the foundational architecture, training techniques, and development history of LLMs, the current general key technologies in LLMs and advanced integration technologies with LLMs bases were analyzed. Then, by summarizing the existing research, challenges faced by LLMs in practical applications were further elaborated, including problems such as data bias, model hallucination, and computational resource consumption, and an outlook was provided on the ongoing development trends of LLMs.

    Review of interpretable deep knowledge tracing methods
    Jinxian SUO, Liping ZHANG, Sheng YAN, Dongqi WANG, Yawen ZHANG
    Journal of Computer Applications    2025, 45 (7): 2043-2055.   DOI: 10.11772/j.issn.1001-9081.2024070970
    Abstract (421)   HTML (32)   PDF (2726KB) (1537)

    Knowledge Tracing (KT) is a cognitive diagnostic method that models a learner's mastery of learned knowledge by analyzing the learner's historical question answering records, and ultimately predicts the learner's future question answering performance. Knowledge tracing techniques based on deep neural network models have become a hot research topic in the knowledge tracing field due to their strong feature extraction capabilities and superior prediction performance. However, deep learning-based knowledge tracing models often lack good interpretability. Clear interpretability enables learners and teachers to fully understand the reasoning process and prediction results of knowledge tracing models, which facilitates the formulation of learning plans tailored to the current knowledge state and enhances the trust of learners and teachers in knowledge tracing models. Therefore, interpretable Deep Knowledge Tracing (DKT) methods were reviewed. Firstly, the development of knowledge tracing and the definition as well as necessity of interpretability were introduced. Secondly, improvement methods proposed for solving the lack of interpretability in DKT models were summarized and listed from the perspectives of feature extraction and internal model enhancement. Thirdly, the related publicly available datasets were introduced, the influences of dataset features on interpretability were analyzed, the evaluation of knowledge tracing models from both performance and interpretability perspectives was discussed, and the performance of DKT models on different datasets was sorted out. Finally, some possible future research directions to address current issues in DKT models were proposed.

    Review on security threats and defense measures in federated learning
    Xuebin CHEN, Zhiqiang REN, Hongyang ZHANG
    Journal of Computer Applications    2024, 44 (6): 1663-1672.   DOI: 10.11772/j.issn.1001-9081.2023060832
    Abstract (919)   HTML (23)   PDF (1072KB) (1731)

    Federated learning is a distributed learning approach for solving the data sharing and privacy protection problems in machine learning, in which multiple parties jointly train a machine learning model while protecting the privacy of their data. However, federated learning has inherent security threats, which pose great challenges to its practical application. Therefore, analyzing the attacks faced by federated learning and the corresponding defense measures is crucial for the development and application of federated learning. First, the definition, process, and classification of federated learning were introduced, together with the attacker model in federated learning. Then, the possible attacks on both the robustness and the privacy of federated learning systems were introduced, along with the corresponding defense measures. Furthermore, the shortcomings of the defense schemes were pointed out. Finally, a secure federated learning system was envisioned.

    Underwater image enhancement algorithm based on multi-scale perception and multi-dimensional space fusion
    Wei GUO, Manting WANG, Haicheng QU
    Journal of Computer Applications    2026, 46 (1): 224-232.   DOI: 10.11772/j.issn.1001-9081.2025010139
    Abstract (145)   HTML (3)   PDF (3529KB) (793)

    To address problems caused by deep-sea imaging, such as color distortion, low contrast, and blurred structures in underwater images, an underwater image enhancement algorithm based on multi-scale perception and multi-dimensional space fusion was proposed. By combining spatial, channel, and three-dimensional features, image information was transmitted in parallel by the algorithm to a multi-dimensional feature extraction network and an encoder. Firstly, a multiscale feature refinement module was introduced into the feature extraction network to further process the extracted feature information, allowing the network to learn information at different scales more accurately. Secondly, a multidimensional color enhancement module was incorporated into the encoder to enhance image details and colors. Finally, an adaptive enhancement network was designed to further process the feature information and fuse multi-level features, then the decoder was used to generate the final enhanced image. Experimental results on public datasets demonstrate the outstanding performance of the proposed algorithm. Specifically, it achieves a Peak Signal-to-Noise Ratio (PSNR) of up to 24.865 1 dB and a Structural Similarity (SSIM) of 0.895 4, representing improvements of 1.580 6 dB and 0.039 8 over Hybrid Fusion Method (HFM), respectively, and it has the Underwater Color Image Quality Evaluation (UCIQE) and Underwater Image Quality Measure (UIQM) up to 0.593 1 and 3.102 8, respectively, surpassing HFM by 0.038 4 and 0.151 4, respectively. It can be seen that the proposed algorithm improves underwater visual quality effectively.
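For readers unfamiliar with the PSNR values quoted in this abstract, the following minimal sketch shows the standard definition, PSNR = 10 · log10(MAX² / MSE). It assumes 8-bit images flattened to sequences of pixel values; the function name and the `max_value` parameter are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], enhanced: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic, a gain stated in dB is a difference between two PSNR values; each additional 10 dB corresponds to a tenfold reduction in mean squared error.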

    Incentive mechanism design for hierarchical federated learning based on multi-leader Stackelberg game
    Fangxing GENG, Zhuo LI, Xin CHEN
    Journal of Computer Applications    2023, 43 (11): 3551-3558.   DOI: 10.11772/j.issn.1001-9081.2022111727
    Abstract (591)   HTML (17)    PDF (2438KB) (2423)

    Privacy security and resource consumption issues in hierarchical federated learning reduce the enthusiasm of participants. To encourage a sufficient number of participants to take an active part in learning tasks and to address the decision-making problem between multiple mobile devices and multiple edge servers, an incentive mechanism based on a multi-leader Stackelberg game was proposed. Firstly, by quantifying the cost and utility of mobile devices and the payments of edge servers, a utility function was constructed and an optimization problem was defined. Then, the interaction among mobile devices was modeled as an evolutionary game, and the interaction among edge servers was modeled as a non-cooperative game. To solve for the optimal edge server selection and pricing strategies, a Multi-round Iterative Edge Server selection algorithm (MIES) and a Gradient Iterative Pricing Algorithm (GIPA) were proposed. The former was used to find the equilibrium of the evolutionary game among mobile devices, and the latter was used to solve the pricing competition among edge servers. Experimental results show that, compared with Optimal Pricing Prediction Strategy (OPPS), Historical Optimal Pricing Strategy (HOPS), and Random Pricing Strategy (RPS), GIPA increases the average utility of edge servers by 4.06%, 10.08%, and 31.39%, respectively.
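The evolutionary game among mobile devices can be illustrated with generic discrete-time replicator dynamics. This is only a sketch under stated assumptions, not the paper's MIES algorithm: the payoff model (each server's reward split evenly across the share of devices that chose it) and the names `replicator_equilibrium`, `rewards`, and `shares` are hypothetical.

```python
def replicator_equilibrium(rewards, shares, step=0.05, rounds=500):
    """Iterate replicator dynamics over the population shares per edge server.

    rewards[i]: total reward offered by server i (hypothetical payoff model).
    shares[i]:  fraction of devices currently choosing server i (must be > 0).
    """
    shares = list(shares)
    for _ in range(rounds):
        # Per-device payoff: the server's reward divided among its share.
        payoffs = [r / x for r, x in zip(rewards, shares)]
        # Population-average payoff.
        average = sum(x * u for x, u in zip(shares, payoffs))
        # Strategies with above-average payoff grow; the others shrink.
        shares = [x + step * x * (u - average) for x, u in zip(shares, payoffs)]
    return shares
```

Under this toy payoff, the shares settle where every server yields the same per-device payoff, which is proportional to the rewards: starting from equal shares with rewards 3 and 1, the population converges to roughly 75% and 25%.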

    Unsupervised cross-domain transfer network for 3D/2D registration in surgical navigation
    Xiyuan WANG, Zhancheng ZHANG, Shaokang XU, Baocheng ZHANG, Xiaoqing LUO, Fuyuan HU
    Journal of Computer Applications    2024, 44 (9): 2911-2918.   DOI: 10.11772/j.issn.1001-9081.2023091332
    Abstract (717)   HTML (3)    PDF (2025KB) (1667)

    3D/2D registration is a key technique for intraoperative guidance. In existing deep learning based registration methods, image features are extracted by a network to regress the corresponding pose transformation parameters. Such methods rely on real samples and their corresponding 3D labels for training, but expert-annotated medical data of this kind are scarce. In the alternative solution, the network is trained with Digitally Reconstructed Radiograph (DRR) images, but it struggles to keep its original accuracy on X-ray images because image features differ across domains. To address these problems, an Unsupervised Cross-Domain Transfer Network (UCDTN) based on self-attention was designed. Without relying on X-ray images and their 3D spatial labels as training samples, the correspondence between image features and spatial transformations learned in the source domain was transferred to the target domain. Common features were used to reduce the feature disparity between domains and so minimize the negative impact of the domain shift. Experimental results show that the mean Target Registration Error (mTRE) of the result predicted by UCDTN is 2.66 mm, a 70.61% reduction compared with the model without cross-domain transfer training, indicating the effectiveness of UCDTN in cross-domain registration tasks.
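For readers unfamiliar with the metric, mTRE is commonly computed as the mean Euclidean distance between target points mapped by the predicted pose and the same points mapped by the reference pose. The sketch below assumes the points have already been transformed; the function name `mtre` and its arguments are illustrative, not taken from the paper.

```python
import math


def mtre(predicted_points, reference_points):
    """Mean Euclidean distance (same unit as the inputs, e.g. mm) between
    target points under the predicted pose and under the reference pose."""
    if not predicted_points or len(predicted_points) != len(reference_points):
        raise ValueError("point lists must be non-empty and the same length")
    distances = [math.dist(p, r) for p, r in zip(predicted_points, reference_points)]
    return sum(distances) / len(distances)
```

For example, two target points that are off by 5 mm and 0 mm respectively give an mTRE of 2.5 mm.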

    Survey of statistical heterogeneity in federated learning
    Hao YU, Jing FAN, Yihang SUN, Hua DONG, Enkang XI
    Journal of Computer Applications    2025, 45 (9): 2737-2746.   DOI: 10.11772/j.issn.1001-9081.2024091316
    Abstract (313)   HTML (20)    PDF (2650KB) (1147)

    Federated learning is a distributed machine learning framework that emphasizes privacy protection. However, it faces significant challenges in handling statistical heterogeneity. Statistical heterogeneity arises from differences in data distribution across participating nodes, and it may lead to problems such as biased model updates, degraded performance of the global model, and unstable convergence. To address these problems, firstly, the main issues caused by statistical heterogeneity were analyzed in detail, including inconsistent feature distributions, imbalanced label distributions, asymmetrical data sizes, and varying data quality. Secondly, existing solutions to statistical heterogeneity in federated learning were reviewed systematically, including local correction, clustering methods, client selection optimization, aggregation strategy adjustment, data sharing, knowledge distillation, and decoupling optimization, and their advantages, disadvantages, and applicable scenarios were evaluated. Finally, future research directions were discussed, such as awareness of device computing capacity, adaptation to model heterogeneity, optimization of privacy security mechanisms, and enhancement of cross-task transferability, thereby providing a reference for addressing statistical heterogeneity in practical applications.

    Personalized federated learning method based on model pre-assignment and self-distillation
    Kejia ZHANG, Zhijun FANG, Nanrun ZHOU, Zhicai SHI
    Journal of Computer Applications    2026, 46 (1): 10-20.   DOI: 10.11772/j.issn.1001-9081.2025010115
    Abstract (136)   HTML (14)    PDF (1406KB) (681)

    Federated Learning (FL) is a distributed machine learning method that uses distributed data for model training while preserving data privacy. However, it performs poorly when data distributions are highly heterogeneous. Personalized Federated Learning (PFL) addresses this challenge by providing a personalized model for each client. However, previous PFL algorithms focus primarily on optimizing the clients' local models and ignore optimization of the server's global model, so server computational resources are not fully utilized. To overcome these limitations, FedPASD, a PFL method based on model pre-assignment and self-distillation, was proposed. FedPASD operated on both the server side and the client side. On the server side, client models for the next round were pre-assigned in a targeted manner, which both enhanced model personalization and used server computational resources effectively. On the client side, models were trained hierarchically and fine-tuned using self-distillation to adapt better to the characteristics of local data distributions. FedPASD was compared on three datasets, CIFAR-10, Fashion-MNIST, and CIFAR-100, with classic baseline algorithms such as FedCP (Federated Conditional Policy), FedPAC (Personalization with feature Alignment and classifier Collaboration), and FedALA (Federated learning with Adaptive Local Aggregation). Experimental results demonstrate that FedPASD achieves higher test accuracy than the baseline algorithms under various data heterogeneity settings. Specifically, on the CIFAR-100 dataset, FedPASD improves test accuracy by 29.05 to 29.22 percentage points over traditional FL algorithms and outperforms the PFL algorithms by 1.11 to 20.99 percentage points; on the CIFAR-10 dataset, FedPASD achieves a maximum accuracy of 88.54%.
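Self-distillation typically trains a model to match softened predictions produced by another version of itself. As a minimal sketch of the generic temperature-scaled distillation loss (the abstract does not give FedPASD's exact loss, so this is an assumption), the names `softmax`, `distillation_loss`, and the default temperature below are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by temperature squared as in standard knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

The loss is zero when student and teacher produce the same logits and grows as their softened predictions diverge; in a self-distillation setting the "teacher" logits would come from the model's own earlier or deeper outputs.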

    Review of event causality extraction based on deep learning
    WANG Zhujun, WANG Shi, LI Xueqing, ZHU Junwu
    Journal of Computer Applications    2021, 41 (5): 1247-1255.   DOI: 10.11772/j.issn.1001-9081.2020071080
    Abstract (3247)      PDF (1460KB) (5196)
    Causality extraction is a relation extraction task in Natural Language Processing (NLP) that mines event pairs with causal relations from text by constructing an event graph, and it plays an important role in finance, security, biology, and other fields. Firstly, concepts such as event extraction and causality were introduced, and the evolution of mainstream methods and the common datasets for causality extraction were described. Then, the current mainstream causality extraction models were listed. Based on a detailed analysis of pipeline-based models and joint extraction models, the advantages and disadvantages of the various methods and models were compared. Furthermore, the experimental performance and related experimental data of the models were summarized and analyzed. Finally, the research difficulties and key future research directions of causality extraction were given.
    Review of radar automatic target recognition based on ensemble learning
    Zirong HONG, Guangqing BAO
    Journal of Computer Applications    2025, 45 (2): 371-382.   DOI: 10.11772/j.issn.1001-9081.2024020179
    Abstract (346)   HTML (11)    PDF (1391KB) (2026)

    Radar Automatic Target Recognition (RATR) is widely applied in both military and civilian domains. Because ensemble learning improves classification performance and robustness by integrating existing machine learning models, it has been applied increasingly in radar target detection and recognition. The research progress of ensemble learning in RATR was discussed in detail by systematically sorting and distilling the relevant literature. Firstly, the concept, framework, and development of ensemble learning were introduced, ensemble learning was compared with traditional machine learning and deep learning methods, and the advantages, limitations, and main research focuses of ensemble learning theory and common ensemble learning methods were summarized. Secondly, the concept of RATR was described briefly. Thirdly, the applications of ensemble learning to different radar image classification features were examined, with a detailed discussion of target detection and recognition methods based on Synthetic Aperture Radar (SAR) and High-Resolution Range Profile (HRRP), and the research progress and application effects of these methods were summarized. Finally, the challenges faced by RATR and ensemble learning were discussed, and prospects for ensemble learning in radar target recognition were outlined.

    Multi-dynamic aware network for unaligned multimodal language sequence sentiment analysis
    Junhao LUO, Yan ZHU
    Journal of Computer Applications    2024, 44 (1): 79-85.   DOI: 10.11772/j.issn.1001-9081.2023060815
    Abstract (351)   HTML (13)    PDF (1299KB) (1606)

    Because the word alignment methods commonly used in existing approaches to aligned multimodal language sequence sentiment analysis lack interpretability, a Multi-Dynamic Aware Network (MultiDAN) for unaligned multimodal language sequence sentiment analysis was proposed. The core of MultiDAN was multi-layer and multi-angle extraction of dynamics. Firstly, a Recurrent Neural Network (RNN) and an attention mechanism were used to capture the dynamics within each modality; secondly, intra- and inter-modal, long- and short-term dynamics were extracted at once using a Graph Attention Network (GAT); finally, the intra- and inter-modal dynamics of the nodes in the graph were extracted again using a special graph readout method to obtain a unique representation of the multimodal language sequence, and the sentiment score of the sequence was obtained by applying a MultiLayer Perceptron (MLP) classifier. Experimental results on two commonly used public datasets, CMU-MOSI and CMU-MOSEI, show that MultiDAN can fully extract the dynamics: its F1 values on the two unaligned datasets improve by 0.49 and 0.72 percentage points, respectively, over the best comparison method, Modal-Temporal Attention Graph (MTAG), with high stability. MultiDAN can improve the performance of sentiment analysis for multimodal language sequences, and the Graph Neural Network (GNN) can effectively extract intra- and inter-modal dynamics.

    Construction and benchmark detection of multimodal partial forgery dataset
    Shengyou ZHENG, Yanxiang CHEN, Zuxing ZHAO, Haiyang LIU
    Journal of Computer Applications    2024, 44 (10): 3134-3140.   DOI: 10.11772/j.issn.1001-9081.2023101506
    Abstract (289)   HTML (5)    PDF (1323KB) (935)

    To address the lack of multimodal forgery scenarios and partial forgery scenarios in existing video forgery datasets, a multimodal partial forgery dataset with adjustable forgery ratios, PartialFAVCeleb, was constructed using a wide variety of audio and video forgery methods. The proposed dataset was based on the FakeAVCeleb multimodal forgery dataset and was built by splicing real and forged data, in which the forged data were generated by four methods: FaceSwap, FSGAN (Face Swapping Generative Adversarial Network), Wav2Lip (Wave to Lip), and SV2TTS (Speaker Verification to Text-To-Speech). In the splicing process, probabilistic methods were used to generate the locations of the forged segments in the time domain and across modalities, and the boundaries were then randomized to fit realistic forgery scenarios. In addition, background hopping was avoided through material screening. The resulting dataset contains forged videos at different ratios, with 3,970 videos for each ratio. In the benchmark detection, several audio and video feature extractors were used, and the data were tested under strongly supervised and weakly supervised conditions, with the Hierarchical Multi-Instance Learning (HMIL) method used to realize the latter. The test results indicate that, for each test model, performance on data with a low forgery ratio is significantly worse than on data with a high forgery ratio, and performance under the weakly supervised condition is significantly worse than under the strongly supervised condition, which verifies the difficulty of weakly supervised detection on the proposed partial forgery dataset. Experimental results show that the multimodal partial forgery scenario represented by the proposed dataset has sufficient research value.

    Generative adversarial network underwater image enhancement model based on Swin Transformer
    Hui LI, Bingzhi JIA, Chenxi WANG, Ziyu DONG, Jilong LI, Zhaoman ZHONG, Yanyan CHEN
    Journal of Computer Applications    2025, 45 (5): 1439-1446.   DOI: 10.11772/j.issn.1001-9081.2024050730
    Abstract (261)   HTML (5)    PDF (3642KB) (724)

    To address the low contrast, heavy noise, and color deviation of underwater images, a new underwater image enhancement model with a Generative Adversarial Network (GAN) as its core framework was proposed, namely SwinGAN (GAN based on Swin Transformer). Firstly, the generative network was designed with an encoder-bottleneck-decoder structure, in which the input feature maps were divided into multiple non-overlapping local windows at the bottleneck layer. Secondly, a Dual-path Window Multi-head Self-Attention (DWMSA) mechanism was introduced to enhance local attention while capturing global information and long-range dependencies. Finally, the decoder recombined the windows into feature maps of the original size, and a Markov discriminator was used as the discriminator network. Compared with the URSCT-SESR model, the SwinGAN model improves Peak Signal-to-Noise Ratio (PSNR) by 0.8372 dB and Structural SIMilarity index (SSIM) by 0.0036 on the UFO-120 dataset. On the EUVP-515 dataset, the SwinGAN model achieves a more significant improvement, with gains of 0.8439 dB in PSNR, 0.0051 in SSIM, and 0.1124 in Underwater Image Quality Measure (UIQM), and a slight increase of 0.0010 in Underwater Color Image Quality Evaluation (UCIQE). Experimental results demonstrate that the SwinGAN model performs well in both subjective and objective evaluations and achieves notable improvements in correcting color deviation in underwater images.

    Knowledge tracking model based on concept association memory network with graph attention
    Fan HE, Li LI, Zhongxu YUAN, Xiu YANG, Dongxuan HAN
    Journal of Computer Applications    2026, 46 (1): 43-51.   DOI: 10.11772/j.issn.1001-9081.2025010065
    Abstract (146)   HTML (8)    PDF (1369KB) (744)

    Tracking students' historical interactions to predict their future performance is a critical research focus in the field of Knowledge Tracing (KT). Recent KT methods explore students' learning patterns and evolving knowledge states to provide personalized learning guidance, but they ignore the richness of the exercises themselves. Additionally, with the emergence of new disciplines and interdisciplinary fields, Graph Neural Network (GNN)-based KT methods face challenges such as broadening the scope of concept associations and modeling students' learning behaviors. To address these challenges, a novel KT model was proposed, termed the Knowledge Tracking model based on concept association Memory network with Graph Attention (GAMKT). GAMKT is capable of modeling students' exercise interaction sequences, tracking their knowledge states, and capturing global features of related concepts from the exercise-concept graph. Moreover, a forgetting gate and higher-order information extraction were incorporated into the model to simulate students' exercise-solving processes realistically. Experimental results on the Junyi, ASSIST09, and Static2011 datasets demonstrate that, compared with seven baseline models including Graph-based Knowledge Tracing (GKT), GAMKT achieves average improvements of approximately 2.1% in Area Under the Curve (AUC) and 2.4% in accuracy, indicating that GAMKT outperforms the baseline methods on datasets with well-structured knowledge.
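The AUC reported above measures how often a model ranks a correctly answered interaction above an incorrectly answered one. The following is a minimal sketch of the standard pairwise definition, not the authors' evaluation code; the function name `auc` and the toy labels and scores in the usage note are illustrative.

```python
def auc(labels, scores):
    """Area Under the ROC Curve via pairwise comparison.

    labels: 1 for a correct response, 0 for an incorrect one.
    scores: predicted probability of a correct response.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

With labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.4, 0.1]`, three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75. The pairwise form is quadratic in the number of samples; it is meant for clarity rather than large-scale evaluation.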

    Multi-objective exam paper generation guided by reinforcement learning and matrix completion
    Changzheng XING, Junfeng LIANG, Haibo JIN, Jiayu XU, Hairong WU
    Journal of Computer Applications    2025, 45 (1): 48-58.   DOI: 10.11772/j.issn.1001-9081.2024010010
    Abstract (259)   HTML (5)    PDF (3169KB) (926)

    Existing exam paper generation technologies pay too much attention to the difficulty of the generated papers while ignoring other related objectives, such as quality, score distribution, and skill coverage. To address this problem, a multi-objective exam paper generation method guided by reinforcement learning and matrix completion was proposed to optimize the objectives specific to exam paper generation. Firstly, a deep knowledge tracing method was used to model the interaction information among students and response logs in order to obtain the skill proficiency of the student group. Secondly, matrix factorization and matrix completion methods were used to predict students' scores on exercises they had not attempted. Finally, based on the multi-objective exam paper generation strategy, and in order to improve the update efficiency of the Q network, an Exam Q-Network function approximator was designed to select an appropriate question set automatically when updating the composition of the exam paper. Experimental results show that, compared with models such as DEGA (Diseased-Enhanced Genetic Algorithm) and SSA-GA (Sparrow Search Algorithm - Genetic Algorithm), the proposed model is significantly effective in resolving the multiple dilemmas of exam paper generation scenarios in terms of three indicators: difficulty, rationality, and accuracy.

    Edge federation dynamic analysis for hierarchical federated learning based on evolutionary game
    Yufei XIANG, Zhengwei NI
    Journal of Computer Applications    2025, 45 (4): 1077-1085.   DOI: 10.11772/j.issn.1001-9081.2024040428
    Abstract (285)   HTML (3)    PDF (2452KB) (436)

    To address the issue that the limited edge resources of existing Edge Server Providers (ESPs) reduce the Quality of Service (QoS) of hierarchical federated learning edge nodes, a dynamic Edge Federated Framework (EFF) was proposed by considering the potential for edge federation among edge servers. In the proposed framework, different ESPs cooperated to provide additional edge resources for hierarchical federated learning, whose model training efficiency was reduced by client heterogeneity or Non-Independent and Identically Distributed (Non-IID) data. Firstly, offloading decisions were made by quantifying the communication model, and offloaded tasks were assigned to the edge servers of other ESPs within the framework, so as to meet the elastic demand for edge resources. Secondly, the Multi-round Iterative EFF Participation Strategy (MIEPS) algorithm was used to solve for the evolutionary game equilibrium among ESPs, thereby finding an appropriate resource allocation strategy. Finally, the existence, uniqueness, and stability of the equilibrium point were validated through theoretical analysis and simulation experiments. Experimental results show that, compared with the non-federation and pairwise federation strategies, the tripartite EFF constructed using the MIEPS algorithm improves the prediction accuracy of the global model trained on Independent and Identically Distributed (IID) datasets by 1.5 and 1.0 percentage points, respectively, and the prediction accuracy on Non-IID datasets by 2.1 and 0.7 percentage points, respectively. Additionally, by changing the resource allocation method of the ESPs, it is validated that EFF can distribute rewards among ESPs fairly, encouraging more ESPs to participate and forming a positive cooperation environment.

2026 Vol.46 No.2
Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao, XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
Address:
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
  028-85222239-803
Website: www.joca.cn
E-mail: bjb@joca.cn