Joint entity-relation extraction method for ancient Chinese books based on prompt learning and global pointer network

Bin LI, Min LIN, Siriguleng, Yingjie GAO, Yurong WANG, Shujun ZHANG

Journal of Computer Applications 2025, 45 (1): 75-81. DOI: 10.11772/j.issn.1001-9081.2023121843

Joint entity-relation extraction methods based on the “pre-training + fine-tuning” paradigm rely on large-scale annotated data. Ancient Chinese books are a small-sample scenario in which data annotation is difficult and costly, so fine-tuning is inefficient and extraction performance is poor. Entity nesting and relation overlapping are also common in ancient Chinese books and limit the effect of joint entity-relation extraction, while pipeline extraction methods suffer from error propagation. To address these problems, a joint entity-relation extraction method for ancient Chinese books based on prompt learning and a global pointer network was proposed. Firstly, a prompt learning method framed as span-extraction reading comprehension was used to inject domain knowledge into the Pre-trained Language Model (PLM), unifying the optimization goals of pre-training and fine-tuning, and the input sentences were encoded. Then, global pointer networks were used to predict and jointly decode the boundaries of subjects and objects, together with the subject and object boundaries under each relation, aligning them into entity-relation triples and completing the PTBG (Prompt Tuned BERT with Global pointer) model. As a result, the entity nesting and relation overlapping problems were solved, and the error propagation of pipeline decoding was avoided. Finally, the influence of different prompt templates on extraction performance was analyzed. Experimental results on the Records of the Grand Historian dataset show that, compared with the OneRel model before and after injecting domain knowledge, the PTBG model increases the F1-value by 1.64 and 1.97 percentage points, respectively. The PTBG model can therefore extract entities and relations jointly from ancient Chinese books more effectively, and it provides new research ideas and approaches for low-resource, small-sample deep learning scenarios.
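
The joint decoding step described in the abstract can be illustrated with a minimal sketch. It assumes one common layout for global-pointer decoding: an entity matrix that marks (start, end) spans, plus, for each relation, one matrix linking subject and object start positions and one linking their end positions. The function names, the zero threshold, and the toy scores are all hypothetical; the abstract does not specify PTBG's actual matrices or decoding rules.

```python
from itertools import product


def positive_cells(score_matrix, threshold=0.0):
    """Return the set of (row, col) cells whose score exceeds the threshold."""
    return {
        (i, j)
        for i, row in enumerate(score_matrix)
        for j, score in enumerate(row)
        if score > threshold
    }


def decode_triples(entity_scores, head_scores, tail_scores):
    """Align entity spans and per-relation boundary links into triples.

    Assumed layout (not taken from the paper):
    entity_scores[i][j] > 0   : tokens i..j form an entity span.
    head_scores[r][i][j] > 0  : under relation r, a subject starting at i
                                is linked to an object starting at j.
    tail_scores[r][i][j] > 0  : the same link for the two end positions.
    """
    spans = positive_cells(entity_scores)
    triples = set()
    for relation, matrix in head_scores.items():
        heads = positive_cells(matrix)
        tails = positive_cells(tail_scores[relation])
        for (s_start, s_end), (o_start, o_end) in product(spans, spans):
            if (s_start, o_start) in heads and (s_end, o_end) in tails:
                triples.add(((s_start, s_end), relation, (o_start, o_end)))
    return triples


def empty(n):
    """Build an n x n matrix of negative (inactive) scores."""
    return [[-1.0] * n for _ in range(n)]


# Toy 5-token sentence with a nested entity: span (0, 1) contains span (1, 1).
n = 5
entity_scores = empty(n)
for start, end in [(0, 1), (1, 1), (3, 4)]:
    entity_scores[start][end] = 1.0

head_scores = {"rel_a": empty(n), "rel_b": empty(n)}
tail_scores = {"rel_a": empty(n), "rel_b": empty(n)}
# rel_a links the outer span (0, 1) to the object span (3, 4).
head_scores["rel_a"][0][3] = 1.0
tail_scores["rel_a"][1][4] = 1.0
# rel_b links the nested span (1, 1) to the same object (overlapping relations).
head_scores["rel_b"][1][3] = 1.0
tail_scores["rel_b"][1][4] = 1.0

triples = decode_triples(entity_scores, head_scores, tail_scores)
```

Because spans are scored as whole (start, end) cells rather than as per-token tags, a nested span and its enclosing span can both be active, and the same object can take part in several relations. All matrices are decoded together in one pass, so no earlier pipeline stage can pass its errors on to a later one.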

Prompt learning method for ancient text sentence segmentation and punctuation based on span-extracted prototypical network

Yingjie GAO, Min LIN, Siriguleng, Bin LI, Shujun ZHANG

Journal of Computer Applications 2024, 44 (12): 3815-3822. DOI: 10.11772/j.issn.1001-9081.2023121719

Automatic sentence segmentation and punctuation in ancient book information processing relies on large-scale annotated corpora, yet high-quality, large-scale training samples are expensive and difficult to obtain. To address this, a prompt learning method for ancient text sentence segmentation and punctuation based on a span-extracted prototypical network was proposed. Firstly, structured prompt information was incorporated into the support set to form an effective prompt template, so as to improve the model's learning efficiency. Then, a punctuation position extractor was combined with a prototypical network classifier to reduce the misjudgment impact and the interference from non-punctuation labels found in traditional sequence labeling methods. Experimental results show that on the Records of the Grand Historian dataset, the F1 score of the proposed method is 2.47 percentage points higher than that of the Siku-BERT-BiGRU-CRF (Siku - Bidirectional Encoder Representation from Transformer - Bidirectional Gated Recurrent Unit - Conditional Random Field) method. In addition, on the public multi-domain ancient text dataset CCLUE, the precision and F1 score of the method reach 91.60% and 93.12%, respectively, indicating that the method can perform sentence segmentation and punctuation in multi-domain ancient text effectively and automatically with a small number of training samples. The proposed method therefore offers a new approach for further research on automatic sentence segmentation and punctuation, and for improving model learning efficiency, in multi-domain ancient text.
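
The classifier half of this method can be sketched with the standard prototypical-network rule: average the support embeddings of each label into a prototype, then assign a query to the nearest prototype. The sketch covers only that rule. The punctuation position extractor, the prompt template, and the encoder are omitted, and the label names, 2-D embeddings, and Euclidean distance are illustrative assumptions rather than details from the paper.

```python
import math


def build_prototypes(support_set):
    """Average the support embeddings of each label into one prototype vector."""
    prototypes = {}
    for label, vectors in support_set.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical 2-D embeddings of extracted punctuation positions.
support_set = {
    "comma": [[1.0, 0.0], [0.8, 0.2]],
    "period": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support_set)
predicted = classify([0.7, 0.3], prototypes)  # nearest prototype is "comma"
```

Since only extracted positions are classified, the classifier never has to compete with a dominant "no punctuation" label. That is one plausible reading of how the method reduces interference from non-punctuation labels.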

Classification and recognition method of copper alloy metallograph based on feature aggregation

Xueyu HUANG, Huaiyu HE, Huimin LIN, Jinshui CHEN

Journal of Computer Applications 2023, 43 (8): 2593-2601. DOI: 10.11772/j.issn.1001-9081.2022060893

To address the long delay in detecting copper alloy composition, a classification and recognition method of copper alloy metallograph based on feature aggregation was proposed. Firstly, in the feature extraction stage, the Gray-Level Co-occurrence Matrix (GLCM) and a Residual Network (ResNet) model based on the convolutional block attention module were constructed to extract the global and local features of the image, respectively. Secondly, in the feature aggregation stage, the extracted features were normalized and then simply concatenated. Finally, in the classification and recognition stage, a Support Vector Machine (SVM) was used for classification. Experimental results show that the proposed method achieves an accuracy of 98.963% and a macro-F1 of 98.996%, both better than those of machine learning methods based on a single feature. After aggregation, the features extracted by the different methods describe the texture and edge information of copper alloy metallographs more comprehensively, and the proposed method can identify different copper alloys from their metallographs with improved accuracy and good robustness.
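
The GLCM and aggregation stages can be illustrated with a small pure-Python sketch. It computes a normalized co-occurrence matrix for one pixel offset, derives three common texture statistics from it, and joins those with a second feature vector after normalizing each. The choice of statistics, the L2 normalization, and the stand-in `deep_features` vector are assumptions; the abstract does not say which GLCM statistics or normalization the authors used, and the ResNet and SVM stages are left out.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of gray levels at pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    return [contrast, energy, homogeneity]


def l2_normalize(vector):
    """Scale a vector to unit length; return a copy if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def aggregate(texture_features, deep_features):
    """Normalize each feature vector, then concatenate them."""
    return l2_normalize(texture_features) + l2_normalize(deep_features)


# Toy 3 x 3 image with two gray levels (0 and 1).
image = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
p = glcm(image, levels=2)
texture = glcm_features(p)
deep_features = [3.0, 4.0]  # hypothetical stand-in for CNN features
fused = aggregate(texture, deep_features)
```

Normalizing before concatenation keeps one feature family from dominating the other purely because of scale. The fused vector would then be passed to a classifier such as an SVM.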

Method of creating file based on big directory of NTFS

WU Weimin, LIN Shuibin, JIANG Daqiang, LI Haiming, SU Qing

Journal of Computer Applications 2014, 34 (2): 417-420.

In the available literature, methods for creating new files on the New Technology File System (NTFS) without calling the Windows Application Programming Interface (API) work only in small directories. Therefore, a new technique for creating files in big directories was proposed in this paper. Firstly, the index buffer was located by traversing the B+tree. Secondly, depending on whether the index buffer had an index node, the newly created index entry was placed at the corresponding location in the index buffer. Next, the index buffer containing the inserted entry was written to disk. Finally, the new file was created in the big directory. The experiments show that files can be created correctly in a large directory using the proposed technique.
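
The locate-then-insert steps can be sketched with a simplified in-memory tree. This is not the NTFS on-disk format: `IndexBuffer`, its fields, and the plain string comparison are hypothetical stand-ins, and the sketch omits node splitting and the write to disk. It shows only the descent to the buffer that should hold a new file name and the insertion of that name in sorted order.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class IndexBuffer:
    """Simplified index node: sorted keys, plus sub-buffers if not a leaf."""
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # empty list means leaf


def insert_entry(root, name):
    """Descend to the leaf buffer for `name` and insert it in sorted order.

    Returns the modified leaf, which a real implementation would then
    write back to disk. Raises FileExistsError if the name is present.
    """
    node = root
    while True:
        pos = bisect.bisect_left(node.keys, name)
        if pos < len(node.keys) and node.keys[pos] == name:
            raise FileExistsError(name)
        if not node.children:  # leaf reached: insert here
            node.keys.insert(pos, name)
            return node
        node = node.children[pos]  # follow the sub-buffer for this key range


# Toy directory index: one root entry separating two leaf buffers.
directory = IndexBuffer(
    keys=["m.txt"],
    children=[
        IndexBuffer(keys=["a.txt", "c.txt"]),
        IndexBuffer(keys=["p.txt", "z.txt"]),
    ],
)
modified = insert_entry(directory, "b.txt")
```

A small directory needs no descent at all, since its single root node is also the leaf. The traversal loop is what a big directory adds.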