Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (4): 1035-1048. DOI: 10.11772/j.issn.1001-9081.2023040537

Special Issue: Artificial intelligence; Survey

• Artificial intelligence •

Survey of extractive text summarization based on unsupervised learning and supervised learning

Xiawuji1,2, Heming HUANG1,2, Gengzangcuomao1,2, Yutao FAN1,2

  1. College of Computer, Qinghai Normal University, Xining Qinghai 810008, China
  2. State Key Laboratory of Tibetan Intelligent Information Processing and Application (Qinghai Normal University), Xining Qinghai 810008, China
  • Received:2023-05-06 Revised:2023-07-19 Accepted:2023-07-25 Online:2023-12-04 Published:2024-04-10
  • Contact: Heming HUANG
  • About author:Xiawuji, born in 1982, Ph. D. candidate, associate professor. Her research interests include pattern recognition and intelligent systems, Tibetan intelligent information processing.
    HUANG Heming, born in 1969, Ph. D., professor. His research interests include pattern recognition and artificial intelligence.
    Gengzangcuomao, born in 1993, Ph. D. candidate. Her research interests include pattern recognition and intelligent systems.
    FAN Yutao, born in 1977, Ph. D. candidate, associate professor. Her research interests include pattern recognition and intelligent systems.
  • Supported by:
    National Natural Science Foundation of China(62066039);Qinghai Provincial Natural Science Foundation(2022-ZJ-925);Independent Project of State Key Laboratory of Tibetan Intelligent Information Processing and Application(2022-SKL-007)


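Author names in Chinese: 夏吾吉 (Xiawuji)1,2, 黄鹤鸣 (Heming HUANG)1,2, 更藏措毛 (Gengzangcuomao)1,2, 范玉涛 (Yutao FAN)1,2

  • About author (from the Chinese record): Xiawuji, born in 1982, female, Tibetan, from Jianzha, Qinghai; associate professor, Ph. D. candidate, CCF member. Main research interests: pattern recognition and intelligent systems, Tibetan intelligent information processing.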


Compared with generative (abstractive) summarization methods, extractive summarization methods are easier to implement, produce more readable output, and are more widely used. At present, the literature on extractive summarization mostly analyzes and reviews specific methods or fields, and there is no multi-faceted, multilingual systematic review. Therefore, the significance of text summarization was discussed, the related literature was systematically reviewed, and extractive text summarization methods based on unsupervised learning and supervised learning were analyzed comprehensively from multiple dimensions. Firstly, the development of text summarization techniques was reviewed, and different extractive methods were analyzed, including methods based on rules, Term Frequency-Inverse Document Frequency (TF-IDF), centrality, latent semantics, deep learning, graph ranking, feature engineering, and pre-training, and the advantages and disadvantages of the different algorithms were compared. Secondly, text summarization datasets in different languages and popular evaluation metrics were introduced in detail. Finally, problems and challenges in extractive text summarization research were discussed, and solutions and research trends were presented.

Key words: extractive summarization, unsupervised learning, supervised learning, dataset, evaluation metric
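
For readers unfamiliar with the term, the following is a minimal sketch of what an unsupervised, TF-IDF-based extractive summarizer might look like. It is an illustration only, not a method taken from the surveyed paper. The function names (`split_sentences`, `tokenize`, `tfidf_summarize`), the regex-based sentence splitter, the treatment of each sentence as one "document", and the smoothed IDF formula are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on '.', '!' or '?' followed by whitespace (assumed, not a real tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens; adequate for English-like text only."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences in which each word occurs.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Length-normalized term frequency times a smoothed IDF (an assumed variant).
        score = sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        )
        scores.append(score)

    # Pick the top-k sentence indices, then restore document order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]


if __name__ == "__main__":
    sample = (
        "Extractive summarization selects sentences from the source. "
        "Abstractive summarization generates new sentences. "
        "TF-IDF is one simple way to score sentences."
    )
    for sentence in tfidf_summarize(sample, k=2):
        print(sentence)
```

The output consists only of sentences copied verbatim from the input, which is the defining property of extractive summarization. Supervised and graph-based methods differ mainly in how the per-sentence score is computed.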



