Joint entity-relation extraction method for ancient Chinese books based on prompt learning and global pointer network
Bin LI, Min LIN, Siriguleng, Yingjie GAO, Yurong WANG, Shujun ZHANG
Journal of Computer Applications    2025, 45 (1): 75-81.   DOI: 10.11772/j.issn.1001-9081.2023121843

Joint entity-relation extraction methods based on the “pre-training + fine-tuning” paradigm rely on large-scale annotated data. In the small-sample scenarios of ancient Chinese books, where data annotation is difficult and costly, fine-tuning efficiency is low and extraction performance is poor. Entity nesting and relation overlapping are also common in ancient Chinese books and limit the effectiveness of joint entity-relation extraction, while pipeline extraction methods suffer from error propagation. To address these problems, a joint entity-relation extraction method for ancient Chinese books based on prompt learning and a global pointer network was proposed. Firstly, a prompt learning method based on span-extraction reading comprehension was used to inject domain knowledge into the Pre-trained Language Model (PLM), unifying the optimization goals of pre-training and fine-tuning, and the input sentences were encoded. Then, global pointer networks were used to predict and jointly decode the boundaries of subjects and objects and the subject-object boundaries under different relations, so that they could be aligned into entity-relation triples, completing the construction of the PTBG (Prompt Tuned BERT with Global pointer) model. As a result, the problems of entity nesting and relation overlapping were solved, and the error propagation of pipeline decoding was avoided. Finally, the influence of different prompt templates on extraction performance was analyzed. Experimental results on the Records of the Grand Historian dataset show that, compared with the OneRel model before and after injecting domain knowledge, the PTBG model increases the F1 value by 1.64 and 1.97 percentage points, respectively. The PTBG model can therefore better perform joint entity-relation extraction on ancient Chinese books, and it provides new research ideas and approaches for low-resource, small-sample deep learning scenarios.
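
To illustrate why span-level scoring can handle nested entities, the following is a minimal sketch of decoding spans from a score matrix in the style of a global pointer head. It is not the PTBG implementation: the function name `decode_spans`, the threshold of 0, and the toy score matrix are all assumptions made for illustration.

```python
def decode_spans(scores, threshold=0.0):
    """Return every (start, end) span whose score exceeds the threshold.

    scores[i][j] is a hypothetical score for the span from token i to
    token j (inclusive). Only the upper triangle (i <= j) is read, and
    each span is judged independently, so nested spans can coexist.
    """
    spans = []
    n = len(scores)
    for i in range(n):
        for j in range(i, n):
            if scores[i][j] > threshold:
                spans.append((i, j))
    return spans


# Toy 4-token example (made-up numbers, not model output).
scores = [
    [-1.0, 2.0, -1.0, 3.0],
    [-1.0, -1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.5, -1.0],
    [5.0, -1.0, -1.0, -1.0],  # lower-triangle entry, never read
]
print(decode_spans(scores))  # [(0, 1), (0, 3), (2, 2)]
```

Here the spans (0, 1) and (2, 2) both lie inside (0, 3), which a one-label-per-token tagging scheme could not express directly.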

Prompt learning method for ancient text sentence segmentation and punctuation based on span-extracted prototypical network
Yingjie GAO, Min LIN, Siriguleng, Bin LI, Shujun ZHANG
Journal of Computer Applications    2024, 44 (12): 3815-3822.   DOI: 10.11772/j.issn.1001-9081.2023121719

Automatic sentence segmentation and punctuation in ancient book information processing relies on large-scale annotated corpora, yet high-quality, large-scale training samples are expensive and difficult to obtain. To address this, a prompt learning method for ancient text sentence segmentation and punctuation based on a span-extracted prototypical network was proposed. Firstly, structured prompt information was incorporated into the support set to form an effective prompt template, improving the model's learning efficiency. Then, a punctuation position extractor was combined with a prototypical network classifier to effectively reduce the impact of misjudgments and the interference from non-punctuation labels found in traditional sequence labeling methods. Experimental results show that on the Records of the Grand Historian dataset, the F1 score of the proposed method is 2.47 percentage points higher than that of the Siku-BERT-BiGRU-CRF (Siku - Bidirectional Encoder Representation from Transformer - Bidirectional Gated Recurrent Unit - Conditional Random Field) method. In addition, on the public multi-domain ancient text dataset CCLUE, the precision and F1 score of the method reach 91.60% and 93.12% respectively, indicating that it can perform sentence segmentation and punctuation on multi-domain ancient text effectively and automatically using a small number of training samples. The proposed method therefore offers a new approach for further research on automatic sentence segmentation and punctuation, and for improving model learning efficiency, on multi-domain ancient text.
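
As a rough illustration of the prototypical network classifier mentioned above, the sketch below averages support-set embeddings into one prototype per label and assigns a query to the nearest prototype. The function names, the two-dimensional embeddings, and the label names are hypothetical; the paper's encoder, prompt template, and position extractor are not reproduced here.

```python
import math


def build_prototypes(support):
    """Average the embeddings of each label's support examples."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# Toy support set: two made-up embeddings per punctuation label.
support = {
    "comma": [[0.0, 1.0], [0.2, 0.8]],
    "period": [[1.0, 0.0], [0.8, 0.2]],
}
protos = build_prototypes(support)
print(classify([0.7, 0.3], protos))  # period
```

Because the prototypes are computed from only a few labelled examples per class, this style of classifier suits the small-sample setting the abstract describes.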

Classification and recognition method of copper alloy metallograph based on feature aggregation
Xueyu HUANG, Huaiyu HE, Huimin LIN, Jinshui CHEN
Journal of Computer Applications    2023, 43 (8): 2593-2601.   DOI: 10.11772/j.issn.1001-9081.2022060893

To address the long delay in detecting copper alloy composition, a classification and recognition method for copper alloy metallographs based on feature aggregation was proposed. Firstly, in the feature extraction stage, the Gray-Level Co-occurrence Matrix (GLCM) and a Residual Network (ResNet) model based on the convolutional block attention module were constructed to extract the global and local features of the image, respectively. Secondly, in the feature aggregation stage, the extracted features were normalized and then simply concatenated. Finally, in the classification and recognition stage, a Support Vector Machine (SVM) was used for classification. Experimental results show that the proposed method achieves an accuracy of 98.963% and a macro-F1 of 98.996%, both better than those of machine learning methods based on a single feature. After aggregation, the features extracted by the different methods describe the texture and edge information of copper alloy metallographs more comprehensively, and the proposed method can identify different copper alloys from their metallographs with improved accuracy and good robustness.
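
The aggregation stage can be pictured with a minimal sketch such as the one below, which normalizes two feature vectors separately and then concatenates them. The abstract does not say which normalization was used, so L2 normalization is an assumption here, as are the function names and the toy vectors; the resulting vector would then be passed to a classifier such as an SVM, which is omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def aggregate(global_feats, local_feats):
    """Normalize each feature group on its own, then concatenate them."""
    return l2_normalize(global_feats) + l2_normalize(local_feats)


# Toy stand-ins for GLCM (global) and CNN (local) features.
fused = aggregate([3.0, 4.0], [0.0, 2.0, 0.0])
print(len(fused))  # 5
```

Normalizing each group before concatenation keeps one feature source from dominating the other purely because of its numeric scale.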

Method of creating file based on big directory of NTFS
WU Weimin, LIN Shuibin, JIANG Daqiang, LI Haiming, SU Qing
Journal of Computer Applications    2014, 34 (2): 417-420.  
In the existing literature, methods for creating new files on the New Technology File System (NTFS) without calling the Windows Application Program Interface (API) only work in small directories. Therefore, a new technique for creating files in big directories was proposed in this paper. Firstly, the target index buffer was located by traversing the B+ tree. Secondly, depending on whether the index buffer had an index node, the newly created index entry was placed at the appropriate location in the index buffer. Next, the index buffer containing the inserted entry was written to disk. Finally, the new file was created in the big directory. The experiments show that files can be created correctly in a large directory using the proposed technique.
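
The insertion step can be sketched in a highly simplified form as follows. This is only an in-memory illustration and not the paper's method: it assumes, for illustration, that an index buffer is a list of file names kept in order of their upper-cased form, and it ignores the on-disk record layout, node splitting, and the disk write entirely. The function name `insert_entry` is hypothetical.

```python
import bisect


def insert_entry(index_buffer, name):
    """Insert a file name into a sorted in-memory index buffer.

    Assumption: entries are ordered by upper-cased name. Returns the
    position at which the new entry was placed.
    """
    keys = [entry.upper() for entry in index_buffer]
    pos = bisect.bisect_left(keys, name.upper())
    index_buffer.insert(pos, name)
    return pos


buf = ["alpha.txt", "Delta.txt", "gamma.txt"]
pos = insert_entry(buf, "beta.txt")
print(pos, buf)  # 1 ['alpha.txt', 'beta.txt', 'Delta.txt', 'gamma.txt']
```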
Using Gini-Index for feature selection in text categorization
Yong-Min LIN, Wei-Dong ZHU
Journal of Computer Applications   
This paper used an improved Gini-Index for text feature selection and constructed a measure function based on it, then compared it with four other feature selection measures using two kinds of classifiers on two different document corpora. The experimental results show that its classification performance is comparable to that of the other text feature selection approaches, while it has an advantage in the time complexity of the algorithm.
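
For orientation, the sketch below scores a term with the basic Gini purity form, the sum over classes of P(class | term) squared, computed from per-class document counts. The abstract does not give the paper's improved measure function, so this is only the textbook starting point; the function name and the counts are made up.

```python
def gini_purity(class_counts):
    """Sum of squared class proportions among documents containing a term.

    Higher values mean the term is concentrated in fewer classes, which
    makes it a more discriminative feature under this basic measure.
    """
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return sum((count / total) ** 2 for count in class_counts)


# Toy counts of documents containing a term, per class.
concentrated = gini_purity([8, 2])  # mostly in one class
spread = gini_purity([5, 5])        # evenly spread
print(concentrated > spread)  # True
```

A single pass over per-class counts is enough to score each term in this form.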