Constructing a digital twin water conservancy construction knowledge graph to mine the potential relationships among water conservancy construction objects can help relevant personnel optimize water conservancy construction design schemes and decision-making processes. Aiming at the interdisciplinary and complex knowledge structure of digital twin water conservancy construction, as well as the insufficient learning and low extraction accuracy of general knowledge extraction models in the water conservancy domain, a Digital Twin water conservancy construction Knowledge Extraction method based on Large Language Model (DTKE-LLM) was proposed to improve the accuracy of knowledge extraction. In this method, a local Large Language Model (LLM) was deployed through LangChain and integrated with digital twin water conservancy domain knowledge, the LLM was fine-tuned through prompt learning, and the semantic understanding and generation capabilities of the LLM were utilized to extract knowledge. At the same time, a heterogeneous entity alignment strategy was designed to optimize the entity extraction results. Comparison and ablation experiments were carried out on a water conservancy domain corpus to verify the effectiveness of DTKE-LLM. The comparison experiments demonstrate that DTKE-LLM outperforms the deep learning-based named entity recognition model BiLSTM-CRF (Bidirectional Long Short-Term Memory Conditional Random Field) and the general information extraction model UIE (Universal Information Extraction) in precision. The ablation experiments show that, compared with ChatGLM2-6B, DTKE-LLM improves the F1 scores of entity extraction and relation extraction by 5.5 and 3.2 percentage points respectively. It can be seen that the proposed method realizes the construction of a digital twin water conservancy construction knowledge graph while ensuring the quality of knowledge graph construction.
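For illustration, the following minimal Python sketch shows the prompt-based extraction flow described above; the prompt wording, the entity schema, and the call_llm wrapper for the locally deployed model are assumptions for this sketch, not the authors' implementation.

```python
import json

# Minimal sketch of prompt-based knowledge extraction, assuming a locally deployed
# LLM (e.g. ChatGLM2-6B served via LangChain) hidden behind call_llm(); the prompt,
# schema and call_llm are illustrative assumptions, not the paper's code.

ENTITY_TYPES = ["project", "structure", "material", "parameter"]   # hypothetical schema
PROMPT_TEMPLATE = (
    "You are an expert in digital twin water conservancy construction.\n"
    "Extract entities of types {types} and (head, relation, tail) triples "
    "from the following text. Answer in JSON with keys 'entities' and 'relations'.\n"
    "Text: {text}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for invoking the locally deployed LLM."""
    raise NotImplementedError

def extract_knowledge(text: str) -> dict:
    prompt = PROMPT_TEMPLATE.format(types=", ".join(ENTITY_TYPES), text=text)
    raw = call_llm(prompt)
    try:
        return json.loads(raw)          # expected: {"entities": [...], "relations": [...]}
    except json.JSONDecodeError:
        return {"entities": [], "relations": []}   # fall back on unparsable output
```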
To obtain more accurate molecular toxicity prediction results, a molecular toxicity prediction model based on meta Graph Isomorphism Network (GIN), namely Meta-MTP, was proposed. Firstly, molecules were treated as graph structures with atoms as nodes and bonds as edges, and a graph isomorphism neural network was used to obtain molecular representations; a pre-trained model was used to initialize the GIN to obtain better parameters. Then, a feedforward Transformer incorporating layer-wise attention and local enhancement was introduced, and atom type prediction and bond prediction were used as auxiliary tasks to extract more internal molecular information. Finally, the model was trained on the Tox21 and SIDER datasets through a meta-learning dual-level optimization strategy. Experimental results on the Tox21 and SIDER datasets show that Meta-MTP has good molecular toxicity prediction ability. When the number of samples is 10, compared to the FSGNNTR (Few-Shot Graph Neural Network-TRansformer) model over all tasks, the Area Under the ROC Curve (AUC) of Meta-MTP is improved by 1.4% and 5.4% on the two datasets respectively. Compared to three traditional graph neural network models, Graph Isomorphism Network (GIN), Graph Convolutional Network (GCN), and Graph Sample and AGgrEgate (GraphSAGE), the AUC of Meta-MTP is improved by 18.3%-23.7% and 7.3%-22.2% respectively.
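The core GIN update treats a molecule as a graph over atoms and bonds. The following minimal PyTorch sketch shows one such layer on a toy molecule; hidden sizes and the dense adjacency representation are illustrative assumptions, not the Meta-MTP implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch of a single GIN update, h' = MLP((1 + eps) * h + sum of neighbor h),
# using a dense adjacency matrix for clarity.

class GINLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))          # learnable epsilon
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: [num_atoms, in_dim], adj: [num_atoms, num_atoms] with 1 where a bond exists
        aggregated = adj @ h                              # sum over bonded neighbors
        return self.mlp((1 + self.eps) * h + aggregated)

# Example: a toy 3-atom molecule with bonds 0-1 and 1-2
h = torch.randn(3, 16)
adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
out = GINLayer(16, 32)(h, adj)                            # -> shape [3, 32]
```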
Air quality data, as typical spatio-temporal data, exhibit complex multi-scale intrinsic characteristics and are subject to abrupt changes. Concerning the problem that existing air quality prediction methods perform poorly on prediction tasks containing a large number of abrupt changes, a Multi-Granularity abrupt Change Fitting Network (MACFN) for air quality prediction was proposed. Firstly, multi-granularity feature extraction was performed on the input data according to the temporal periodicity of air quality data. Then, a graph convolutional network and a temporal convolutional network were used to extract the spatial correlation and the temporal dependence of the air quality data, respectively. Finally, to reduce the prediction error, an abrupt change fitting network was designed to adaptively learn the abrupt change component of the data. The proposed network was evaluated experimentally on three real air quality datasets, and its Root Mean Square Error (RMSE) decreased by about 11.6%, 6.3%, and 2.2% respectively compared with the Multi-Scale Spatial Temporal Network (MSSTN). The experimental results show that MACFN can efficiently capture complex spatio-temporal relationships and performs better on air quality prediction tasks that are prone to abrupt changes with large magnitudes of variability.
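As a simple illustration of the multi-granularity step, the sketch below aggregates an hourly series at several coarser granularities so that periodic structure at different scales can be fed to the model; the specific window sizes are assumptions, not those used by MACFN.

```python
import numpy as np

# Minimal sketch of multi-granularity feature extraction for an hourly air quality series:
# the same signal is averaged over several temporal granularities (e.g. 1 h, 6 h, 24 h)
# and the coarse series are used as additional model inputs.

def multi_granularity(series: np.ndarray, windows=(1, 6, 24)) -> dict:
    """series: 1-D array of hourly observations; returns one averaged series per window."""
    out = {}
    for w in windows:
        n = len(series) // w
        out[f"{w}h"] = series[:n * w].reshape(n, w).mean(axis=1)   # non-overlapping means
    return out

hourly = np.random.rand(7 * 24)          # one week of toy hourly PM2.5 readings
features = multi_granularity(hourly)     # {'1h': 168 values, '6h': 28, '24h': 7}
```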
Person search is one of the important research directions in the field of computer vision. Its goal is to detect and identify specific persons in uncropped image galleries. To provide a thorough understanding of person search algorithms, a large amount of related literature was summarized and analyzed. First, according to the network structure, person search algorithms were divided into two categories: two-step methods and end-to-end one-step methods. The key technologies of the one-step methods, feature learning and metric learning, were analyzed and introduced. The datasets and evaluation metrics in the field of person search were then discussed, and a performance comparison and analysis of the mainstream algorithms was given. The experimental results show that, although the two-step methods achieve good performance, most of them have high computational costs and long running times; the one-step methods can solve the two sub-tasks, pedestrian detection and person re-identification, in a more efficient learning framework and achieve better results. Finally, the person search algorithms were summarized and prospects for their future development were given.
To address the difficulty of identifying user plagiarism in social networks, protect the rights of original authors, and hold users accountable for plagiarism, a blockchain-based plagiarism identification scheme for social network users was proposed. Aiming at the lack of a universal tracing model in existing blockchains, a blockchain-based traceability information management model was designed to record user operation information and provide a basis for text similarity detection. Based on the Merkle tree and Bloom filter structures, a new index structure, BHMerkle, was designed, which reduced the computational overhead of block construction and query and realized rapid positioning of transactions. At the same time, a multi-feature weighted Simhash algorithm was proposed to improve the precision of word weight calculation and the efficiency of the signature matching stage. In this way, malicious users committing plagiarism could be identified, and the occurrence of malicious behavior can be curbed through a reward and punishment mechanism. The average precision and recall of the plagiarism detection scheme on news datasets with different topics were 94.8% and 88.3%, respectively. Compared with the multi-dimensional Simhash algorithm and the Simhash algorithm based on information Entropy weighting (E-Simhash), the average precision was increased by 6.19 and 4.01 percentage points respectively, and the average recall was increased by 3.12 and 2.92 percentage points respectively. Experimental results show that the proposed scheme improves the query and detection efficiency of plagiarized text and has high accuracy in plagiarism identification.
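To make the Simhash-based matching concrete, the sketch below computes a weighted 64-bit fingerprint and compares two texts by Hamming distance; the term-frequency weighting is a simplification of the paper's multi-feature weights.

```python
import hashlib
from collections import Counter

# Minimal sketch of a weighted Simhash fingerprint: each token is hashed to 64 bits,
# its weight is added or subtracted per bit position, and the sign of each position
# gives the final fingerprint bit.

def simhash(tokens: list[str], bits: int = 64) -> int:
    vector = [0.0] * bits
    weights = Counter(tokens)                                   # assumed weighting scheme
    for token, weight in weights.items():
        h = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16)
        for i in range(bits):
            vector[i] += weight if (h >> i) & 1 else -weight
    return sum(1 << i for i in range(bits) if vector[i] > 0)

def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")          # small distance => likely plagiarism

d = hamming_distance(simhash("the quick brown fox".split()),
                     simhash("the quick brown dog".split()))
```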
Aiming at the forgetting and underutilization of image text information in image captioning methods, a Scene Graph-aware Cross-modal Network (SGC-Net) was proposed. Firstly, the scene graph was utilized as the image's visual feature, and a Graph Convolutional Network (GCN) was used for feature fusion, so that the visual and textual features lay in the same feature space. Then, the text sequence generated by the model was stored, and the corresponding position information was added as the textual feature of the image, so as to solve the problem of text feature loss caused by the single-layer Long Short-Term Memory (LSTM) network. Finally, to address the over-dependence on image information and the underuse of text information, a self-attention mechanism was utilized to extract the salient image and text information and fuse them. Experimental results on the Flickr30K and MS-COCO (MicroSoft Common Objects in COntext) datasets demonstrate that SGC-Net outperforms Sub-GC on BLEU1 (BiLingual Evaluation Understudy with 1-gram), BLEU4 (BiLingual Evaluation Understudy with 4-grams), METEOR (Metric for Evaluation of Translation with Explicit ORdering), ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and SPICE (Semantic Propositional Image Caption Evaluation), with improvements of 1.1, 0.9, 0.3, 0.7, 0.4 and 0.3, 0.1, 0.3, 0.5, 0.6 on the two datasets respectively. It can be seen that SGC-Net can effectively improve image captioning performance and the fluency of the generated descriptions.
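The sketch below illustrates the general idea of fusing the two modalities with self-attention: image and text features are concatenated along the sequence dimension and attend to each other. Dimensions and feature counts are illustrative assumptions, not those of SGC-Net.

```python
import torch
import torch.nn as nn

# Minimal sketch of cross-modal fusion via self-attention over concatenated
# image and text features.

dim = 256
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)

image_feats = torch.randn(1, 36, dim)    # e.g. 36 scene-graph region features (toy values)
text_feats = torch.randn(1, 20, dim)     # e.g. 20 previously generated token features

tokens = torch.cat([image_feats, text_feats], dim=1)    # [1, 56, dim]
fused, _ = attn(tokens, tokens, tokens)                 # each position attends to both modalities
```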
Current deep multi-view clustering methods have the following shortcomings: 1) when features are extracted for a single view, only the attribute information or the structural information of the samples is considered, and the two types of information are not integrated, so the extracted features cannot fully represent the latent structure of the original data; 2) feature extraction and clustering are treated as two separate processes without establishing a relationship between them, so the feature extraction process cannot be optimized by the clustering process. To solve these problems, a Deep Fusion based Multi-view Clustering Network (DFMCN) was proposed. Firstly, the embedding space of each view was obtained by combining an autoencoder and a graph convolutional autoencoder to fuse the attribute information and structural information of the samples. Then, the embedding space of the fused view was obtained through weighted fusion, and clustering was carried out in this space; during clustering, the feature extraction process was optimized by a two-layer self-supervision mechanism. Experimental results on the FM (Fashion-MNIST), HW (HandWritten numerals), and YTF (YouTube Face) datasets show that the accuracy of DFMCN is higher than that of all comparison methods; on the FM dataset, the accuracy of DFMCN is 1.80 percentage points higher than that of the suboptimal CMSC-DCCA (Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis) method, and the Normalized Mutual Information (NMI) of DFMCN is 1.26 to 14.84 percentage points higher than those of all methods except CMSC-DCCA and DMSC (Deep Multimodal Subspace Clustering networks). These results verify the effectiveness of the proposed method.
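The fusion-then-cluster step can be illustrated as below: per-view embeddings (assumed to come from the autoencoder and graph convolutional autoencoder) are combined by a weighted sum and clustered. The fixed weights and k-means stand in for DFMCN's learned fusion and self-supervised clustering.

```python
import numpy as np
from sklearn.cluster import KMeans

# Minimal sketch of weighted fusion of view embeddings followed by clustering.

n_samples, dim, n_clusters = 1000, 64, 10
view_embeddings = [np.random.randn(n_samples, dim) for _ in range(3)]   # toy stand-ins
weights = np.array([0.5, 0.3, 0.2])                                     # assumed view weights

fused = sum(w * z for w, z in zip(weights, view_embeddings))            # [n_samples, dim]
labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(fused)
```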
Aiming at the problem that most granular computing algorithms ignore the class imbalance of data, a feature selection algorithm integrating a pseudo-label strategy was proposed to deal with class-imbalanced data. Firstly, to investigate feature selection from class-imbalanced data conveniently, sample consistency and dataset consistency were re-defined, and the corresponding greedy forward search algorithm for feature selection was designed. Then, the pseudo-label strategy was introduced to balance the class distribution of the data: by integrating the learned pseudo-label of each sample into the consistency measure, the pseudo-label consistency was defined to evaluate the features of the class-imbalanced dataset. Finally, an algorithm for Pseudo-Label Consistency based Feature Selection (PLCFS) for class-imbalanced data was developed based on preserving the pseudo-label consistency measure of the class-imbalanced dataset. Experimental results indicate that the performance of the proposed PLCFS is only lower than that of the max-Relevance and Min-Redundancy (mRMR) algorithm, while it outperforms the Relief algorithm and the Consistency-based Feature Selection (CFS) algorithm.
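A plain-consistency version of the greedy forward search is sketched below: a sample counts as consistent if the majority class of all samples sharing its values on the selected features matches its own class. The measure and stopping rule are simplifications; PLCFS uses the pseudo-label consistency instead.

```python
import numpy as np

# Minimal sketch of consistency-driven greedy forward feature selection.

def consistency(X: np.ndarray, y: np.ndarray, feats: list[int]) -> float:
    groups = {}
    for xi, yi in zip(X[:, feats], y):
        groups.setdefault(tuple(xi), []).append(yi)
    # count samples whose class equals the majority class of their group
    hits = sum(labels.count(max(set(labels), key=labels.count)) for labels in groups.values())
    return hits / len(y)

def forward_select(X: np.ndarray, y: np.ndarray) -> list[int]:
    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining:
        scores = {f: consistency(X, y, selected + [f]) for f in remaining}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best:        # stop when no feature improves consistency
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = scores[f_best]
    return selected
```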
The traditional Graph Convolutional Network (GCN) and many of its variants achieve their best performance with shallow architectures and therefore fail to make full use of the higher-order neighbor information of nodes in the graph. Subsequent deep graph convolution models can solve this problem but inevitably suffer from over-smoothing, which makes it impossible for the models to effectively distinguish different types of nodes in the graph. To address this problem, an adaptive deep graph convolution model using initial residual and decoupling operations, named ID-AGCN (model using Initial residual and Decoupled Adaptive Graph Convolutional Network), was proposed. Firstly, the representation transformation and feature propagation of nodes were decoupled. Then, the initial residual was added to the feature propagation process of nodes. Finally, the node representations obtained from different propagation layers were combined adaptively, so that appropriate local and global information was selected for each node to obtain representations containing rich information, and a small number of labeled nodes were used for supervised training to generate the final node representations. Experimental results on the Cora, CiteSeer and PubMed datasets indicate that the classification accuracy of ID-AGCN is improved by about 3.4, 2.3 and 1.9 percentage points respectively compared with GCN. The proposed model shows superiority in alleviating over-smoothing.
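The decoupled propagation with an initial residual can be written as H^(l+1) = (1 - alpha) * A_hat H^(l) + alpha * H^(0). The sketch below implements this update; the averaging over layers is only a placeholder for ID-AGCN's adaptive combination, and alpha and the depth K are assumptions.

```python
import numpy as np

# Minimal sketch of decoupled feature propagation with an initial residual.

def propagate(a_hat: np.ndarray, h0: np.ndarray, alpha: float = 0.1, K: int = 8) -> np.ndarray:
    h, layers = h0, []
    for _ in range(K):
        h = (1 - alpha) * a_hat @ h + alpha * h0     # initial residual keeps H^0 in the mix
        layers.append(h)
    return np.mean(layers, axis=0)                   # placeholder for adaptive combination

n, d = 5, 16
adj = np.random.rand(n, n) > 0.5
a = adj | adj.T | np.eye(n, dtype=bool)              # symmetric adjacency with self-loops
deg = a.sum(axis=1)
a_hat = a / np.sqrt(np.outer(deg, deg))              # symmetric normalization D^-1/2 A D^-1/2
out = propagate(a_hat, np.random.randn(n, d))
```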
Deep learning methods are widely used in bearing fault diagnosis, but in practical engineering applications, fault data generated during actual bearing service are difficult to collect and often lack labels, so models are hard to train adequately. Focusing on this difficulty of bearing service fault diagnosis, a transfer learning model based on Graph Convolutional Network (GCN) for bearing service fault diagnosis was proposed. In this model, fault knowledge was learned from abundant artificially simulated damage fault data and transferred to real service faults, so as to improve the diagnostic accuracy for service faults. Specifically, the original vibration signals of the artificially simulated damage fault data and the service fault data were converted through wavelet transform into time-frequency maps containing both time and frequency information, and these maps were input into graph convolutional layers for learning, so as to effectively extract the fault feature representations of the source and target domains. Then the Wasserstein distance between the data distributions of the source domain and the target domain was calculated to measure the difference between the two distributions, and a fault diagnosis model capable of diagnosing bearing service faults was constructed by minimizing this distribution difference. A variety of tasks were designed for experiments with different bearing fault datasets and different operating conditions. Experimental results show that the proposed model can diagnose bearing service faults and can also be transferred from one working condition to another, performing fault diagnosis across different component types and different working conditions.
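As a simple illustration of the distribution-alignment idea, the sketch below measures the 1-D Wasserstein distance between source-domain and target-domain features dimension by dimension and averages it into a discrepancy term that would be minimized alongside the classification loss; the per-dimension averaging is an assumption, and the toy arrays stand in for GCN features.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Minimal sketch of a Wasserstein-based domain discrepancy between feature sets.

def domain_discrepancy(source_feats: np.ndarray, target_feats: np.ndarray) -> float:
    dists = [wasserstein_distance(source_feats[:, j], target_feats[:, j])
             for j in range(source_feats.shape[1])]
    return float(np.mean(dists))

source = np.random.randn(200, 32)          # features of simulated-damage (source) samples
target = np.random.randn(150, 32) + 0.5    # features of service-fault (target) samples
loss_align = domain_discrepancy(source, target)    # add to training loss with some weight
```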
As a hot issue in natural language processing, summary generation has important research significance. Abstractive methods based on the Seq2Seq (Sequence-to-Sequence) model have achieved good results; however, extractive methods have the potential to mine effective features and extract important sentences from articles, so improving abstractive methods by using extractive methods is a promising research direction. In view of this, a model fusing the abstractive and extractive methods was proposed. Firstly, incorporating topic similarity, the TextRank algorithm was used to extract significant sentences from the article. Then, an abstractive framework based on the Seq2Seq model and integrating the semantics of the extracted information was designed to perform the summarization task; at the same time, a pointer-generator network was introduced to solve the Out-Of-Vocabulary (OOV) problem. Based on the above steps, the final summary was obtained and verified on the CNN/Daily Mail dataset. The results show that the proposed model outperforms the traditional TextRank algorithm on all three metrics, ROUGE-1, ROUGE-2 and ROUGE-L; meanwhile, the effectiveness of fusing the extractive and abstractive methods for summarization is also verified.
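The extractive stage can be illustrated with a plain TextRank pass: build a sentence similarity graph, run power iteration for PageRank-style scores, and keep the top-ranked sentences. The word-overlap similarity below is a simplification; the paper additionally weights edges by topic similarity.

```python
import numpy as np

# Minimal sketch of TextRank sentence extraction with word-overlap similarity.

def textrank(sentences: list[str], top_k: int = 3, d: float = 0.85, iters: int = 50) -> list[str]:
    n = len(sentences)
    words = [set(s.lower().split()) for s in sentences]
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and words[i] and words[j]:
                sim[i, j] = len(words[i] & words[j]) / (len(words[i]) + len(words[j]))
    row_sums = sim.sum(axis=1, keepdims=True)
    trans = np.divide(sim, row_sums, out=np.zeros_like(sim), where=row_sums > 0)
    scores = np.ones(n) / n
    for _ in range(iters):
        scores = (1 - d) / n + d * trans.T @ scores      # power iteration
    top = np.argsort(-scores)[:top_k]
    return [sentences[i] for i in sorted(top)]           # keep original sentence order
```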