Headline generation model with position embedding for knowledge reasoning
Yalun WANG, Yangsen ZHANG, Siwen ZHU
Journal of Computer Applications    2025, 45 (2): 345-353.   DOI: 10.11772/j.issn.1001-9081.2024030281

As the smallest semantic units, sememes are crucial to the headline generation task. Although the Sememe-Driven Language Model (SDLM) is one of the mainstream models, it has limited encoding capability for long text sequences, does not fully consider positional relationships, and is prone to introducing noisy knowledge that degrades the quality of the generated headlines. To address these problems, a Transformer-based headline generation model, Tran-A-SDLM (Transformer Adaption based Sememe-Driven Language Model with positional embedding and knowledge reasoning), was proposed, combining the advantages of adaptive positional embedding and a knowledge reasoning mechanism. Firstly, a Transformer model was introduced to enhance the model's encoding capability for text sequences. Secondly, an adaptive positional embedding mechanism was used to strengthen the model's positional awareness, thereby improving its learning of contextual sememe knowledge. In addition, a knowledge reasoning module was introduced to represent sememe knowledge and guide the model toward generating accurate headlines. Finally, to demonstrate the superiority of Tran-A-SDLM, experiments were conducted on the Large scale Chinese Short Text Summarization (LCSTS) dataset. Experimental results show that Tran-A-SDLM improves ROUGE-1, ROUGE-2 and ROUGE-L scores by 0.2, 0.7 and 0.5 percentage points respectively, compared to RNN-context-SDLM. Results of the ablation study further validate the effectiveness of the proposed model.
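To make the adaptive positional embedding idea concrete, below is a minimal PyTorch sketch of one plausible reading: a learned positional table with a learnable per-dimension scale added to the token embeddings before the Transformer encoder. This is an illustration under stated assumptions, not the authors' Tran-A-SDLM code; the names (AdaptivePositionalEmbedding, max_len, d_model) are hypothetical.

```python
# Hedged sketch of an adaptively scaled, learned positional embedding.
import torch
import torch.nn as nn

class AdaptivePositionalEmbedding(nn.Module):
    """Learned position table plus a per-dimension learnable scale,
    letting the model adjust how strongly position shapes the encoding
    (hypothetical sketch, not the paper's implementation)."""
    def __init__(self, max_len: int, d_model: int):
        super().__init__()
        self.pos_table = nn.Embedding(max_len, d_model)  # learned position vectors
        self.scale = nn.Parameter(torch.ones(d_model))   # adaptive per-dimension weight

    def forward(self, token_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (batch, seq_len, d_model); output has the same shape
        positions = torch.arange(token_emb.size(1), device=token_emb.device)
        return token_emb + self.scale * self.pos_table(positions)
```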

Chinese named entity recognition model incorporating multi-granularity linguistic knowledge and hierarchical information
Youren YU, Yangsen ZHANG, Yuru JIANG, Gaijuan HUANG
Journal of Computer Applications    2024, 44 (6): 1706-1712.   DOI: 10.11772/j.issn.1001-9081.2023060833

Aiming at the problem that most current Named Entity Recognition (NER) models use only character-level information for encoding and lack extraction of hierarchical text information, a Chinese NER (CNER) model incorporating Multi-granularity linguistic knowledge and Hierarchical information (CMH) was proposed. First, the text was encoded by a model pre-trained with multi-granularity linguistic knowledge, so that both fine-grained and coarse-grained linguistic information could be captured and the corpus better characterized. Second, hierarchical information was extracted with the ON-LSTM (Ordered Neurons Long Short-Term Memory network) model, in order to exploit the hierarchical structure of the text itself and strengthen the temporal relationships between encodings. Finally, at the decoding end, the word segmentation information of the text was incorporated and the entity recognition problem was transformed into a table-filling problem (a sketch follows this abstract), so as to better resolve entity overlapping and obtain more accurate recognition results. Meanwhile, to address the poor transferability of current models across domains, the concept of universal entity recognition was proposed, and a universal NER dataset, MDNER (Multi-Domain NER dataset), was constructed by filtering the universal entity types of multiple domains, enhancing the model's generalization ability across domains. To validate the effectiveness of the proposed model, experiments were conducted on the Resume, Weibo and MSRA datasets, where the F1 scores were improved by 0.94, 4.95 and 1.58 percentage points respectively compared to the MECT (Multi-metadata Embedding based Cross-Transformer) model. To verify the model's entity recognition performance across domains, experiments were conducted on MDNER, where the F1 score reached 95.29%. The experimental results show that pre-training with multi-granularity linguistic knowledge, extraction of the hierarchical structure of the text, and the efficient pointer decoder are crucial to the model's performance improvement.
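As a concrete illustration of casting entity recognition as table filling: score every (start, end) cell of an L x L span table against each entity type, so that overlapping entities can coexist as separate cells. The PyTorch sketch below is a simplified example under assumptions, not the paper's decoder; all names (SpanTableDecoder, num_types) are hypothetical.

```python
# Hedged sketch: NER as table filling over all (start, end) span cells.
import torch
import torch.nn as nn

class SpanTableDecoder(nn.Module):
    """Classify each span (i, j) into an entity type or "no entity".
    Overlapping entities simply occupy different cells of the table
    (illustrative sketch, not the paper's implementation)."""
    def __init__(self, hidden: int, num_types: int):
        super().__init__()
        self.start_proj = nn.Linear(hidden, hidden)
        self.end_proj = nn.Linear(hidden, hidden)
        self.classify = nn.Linear(2 * hidden, num_types + 1)  # +1 = "no entity"

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, L, hidden) contextual encodings, e.g. from an ON-LSTM
        L = h.size(1)
        s = self.start_proj(h).unsqueeze(2)  # (batch, L, 1, hidden)
        e = self.end_proj(h).unsqueeze(1)    # (batch, 1, L, hidden)
        table = torch.cat([s.expand(-1, -1, L, -1),
                           e.expand(-1, L, -1, -1)], dim=-1)
        return self.classify(table)          # (batch, L, L, num_types + 1)
```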

Large language model-driven stance-aware fact-checking
Yushan JIANG, Yangsen ZHANG
Journal of Computer Applications    2024, 44 (10): 3067-3073.   DOI: 10.11772/j.issn.1001-9081.2023101407

To address the issues of evidence stance imbalance and the neglect of stance information in Fact-Checking (FC), a Large Language Model-driven Stance-Aware fact-checking (LLM-SA) method was proposed. Firstly, a series of dialectical claims differing from the original claim were generated with a large language model, so as to capture different perspectives for fact-checking. Secondly, through semantic similarity calculation, the relevance of each evidence sentence to the original claim and to the dialectical claims was assessed separately, and the top k sentences most similar to each were selected as evidence supporting or opposing the original claim; evidence representing different stances was thus obtained, helping the fact-checking model integrate information from multiple perspectives and evaluate the veracity of the claim more accurately. Finally, the BERT-StuSE (BERT-based Stance-infused Semantic Encoding network) model was introduced, which fully fuses the semantic and stance information of the evidence through the multi-head attention mechanism and makes a more comprehensive and objective judgment on the relationship between the claim and the evidence. Experimental results on the CHEF dataset show that, compared to the BERT method, the proposed method improves Micro F1 and Macro F1 on the test set by 3.52 and 3.90 percentage points respectively, reaching a good level of performance. These results demonstrate the effectiveness of the proposed method, and the value of considering evidence from different stances and leveraging their stance information to improve fact-checking performance.
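The top-k evidence selection step can be illustrated with a short sketch: given precomputed sentence embeddings, rank the evidence sentences by cosine similarity to a claim and keep the k most similar, running the procedure once with the original claim and once with each LLM-generated dialectical claim to gather evidence of different stances. This is an assumed implementation, not the paper's code; the function and parameter names are hypothetical.

```python
# Hedged sketch: top-k evidence selection by cosine similarity of embeddings.
import numpy as np

def top_k_evidence(claim_vec: np.ndarray,
                   evidence_vecs: np.ndarray,
                   sentences: list[str],
                   k: int = 5) -> list[str]:
    """Return the k evidence sentences most semantically similar to the claim.
    claim_vec: (d,) embedding of one claim (original or dialectical);
    evidence_vecs: (n, d) embeddings of the n candidate evidence sentences."""
    claim_norm = claim_vec / np.linalg.norm(claim_vec)
    ev_norm = evidence_vecs / np.linalg.norm(evidence_vecs, axis=1, keepdims=True)
    sims = ev_norm @ claim_norm          # cosine similarity to the claim
    top = np.argsort(-sims)[:k]          # indices of the k highest similarities
    return [sentences[i] for i in top]
```

Calling this once per claim yields the stance-separated evidence sets that the downstream verification model consumes.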
