Journal of Computer Applications, 2005, Vol. 25, Issue (09): 2022-2024. DOI: 10.3724/SP.J.1087.2005.02022
• Artificial intelligence •
FU Jian-lian,CHEN Qun-xiu
傅间莲,陈群秀
Abstract: Topic partition is an important problem in the text-structure analysis stage of an automatic abstracting system. A vector space model (VSM) was built for the whole article at the paragraph level, and an algorithm for partitioning multi-topic text based on the similarity of sequential paragraphs was proposed. The algorithm addresses discourse-structure analysis for multi-topic articles and makes abstracts of such articles more comprehensive in content and more balanced in structure. In closed tests, the precision of topic partition reaches 92.4% for multi-topic texts and 99.1% for single-topic texts.
Key words: automatic abstraction, VSM, paragraphic similarity, topic partition
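Abstract (translated from the Chinese): Topic partition is an important problem to be solved in the text-structure analysis stage of an automatic abstracting system. This paper proposes an algorithm that builds a paragraph vector space model and partitions a text into topics according to the similarity of consecutive paragraphs. It solves the problem of analyzing an article's discourse structure, so that abstracts of multi-topic articles are more comprehensive in content and more balanced in structure. Experimental results show that the algorithm's topic-partition accuracy is 92.4% for multi-topic articles and 99.1% for single-topic articles.
Key words (translated from the Chinese): automatic abstracting, vector space model, paragraph similarity, topic partition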
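The abstract names the ingredients of the method (a paragraph-level vector space model and the similarity of sequential paragraphs) but not its details. The following is only a minimal sketch of that general idea, not the authors' algorithm: the raw term-frequency weighting, the English-only tokenizer, the cosine measure, the fixed `threshold` value, and the function names `paragraph_vector`, `cosine`, and `partition_topics` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def paragraph_vector(paragraph):
    """Bag-of-words term-frequency vector for one paragraph.

    Assumption: raw term counts over lowercase ASCII words; the paper's
    actual term weighting is not given in the abstract.
    """
    return Counter(re.findall(r"[a-z]+", paragraph.lower()))


def cosine(u, v):
    """Cosine similarity of two sparse term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def partition_topics(paragraphs, threshold=0.1):
    """Group consecutive paragraph indices into topic segments.

    Assumption: a new topic starts whenever the similarity between a
    paragraph and the one before it falls below a fixed threshold.
    """
    if not paragraphs:
        return []
    vectors = [paragraph_vector(p) for p in paragraphs]
    segments = [[0]]
    for i in range(1, len(paragraphs)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            segments.append([i])  # similarity dropped: open a new topic
        else:
            segments[-1].append(i)  # still on the same topic
    return segments
```

Calling `partition_topics` on a list of paragraph strings returns lists of paragraph indices, one list per detected topic. A single-topic text would ideally come back as one segment, which is the case the abstract's single-topic precision figure refers to.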
CLC Number: TP391.1
FU Jian-lian,CHEN Qun-xiu. Study on topic partition based on sequential paragraphic similarity[J]. Journal of Computer Applications, 2005, 25(09): 2022-2024.
傅间莲,陈群秀. 基于连续段落相似度的主题划分算法[J]. 计算机应用, 2005, 25(09): 2022-2024.
URL: http://www.joca.cn/EN/10.3724/SP.J.1087.2005.02022
http://www.joca.cn/EN/Y2005/V25/I09/2022