计算机应用 (Journal of Computer Applications), 2013, Vol. 33, Issue (04): 988-990. DOI: 10.3724/SP.J.1087.2013.00988

• Artificial Intelligence •

基于高维聚类的探索性文本挖掘算法

张爱科,符保龙   

  1. 柳州职业技术学院 电子信息工程系,广西 柳州 545006
  • 收稿日期:2012-11-05 修回日期:2012-11-29 出版日期:2013-04-01 发布日期:2013-04-23
  • 通讯作者: 张爱科
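  • About the authors: ZHANG Aike (born 1973), female (Zhuang ethnicity), from Guigang, Guangxi, associate professor; main research interests: data mining, evolutionary computation. FU Baolong (born 1978), male (Zhuang ethnicity), from Longzhou, Guangxi, associate professor; main research interests: data mining, evolutionary computation.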
  • Supported by:

    Scientific Research Project Fund of the Education Department of Guangxi (201106LX745, 201204LX593)

Exploratory text mining algorithm based on high-dimensional clustering

ZHANG Aike, FU Baolong

  1. Electronic Information Engineering Department, Liuzhou Vocational Technological College, Liuzhou, Guangxi 545006, China
  • Received: 2012-11-05 Revised: 2012-11-29 Online: 2013-04-01 Published: 2013-04-23
  • Contact: ZHANG Aike

摘要: 建立了一种基于高维聚类的探索性文本挖掘算法,利用文本挖掘的引导作用实现数据类文本中的数据挖掘。算法只需要少量迭代,就能够从非常大的文本集中产生良好的集群;映射到其他数据与将文本记录到用户组,能进一步提高算法的结果。通过对相关数据的测试以及实验结果的分析,证实了该方法的可行性与有效性。

关键词: 自由文本, 高维聚类, 数据覆盖, 文本挖掘, 数据挖掘

Abstract: Because free text is unstructured, text mining has become an important branch of data mining, and many kinds of text mining algorithms have emerged in recent years. This paper proposes an exploratory text mining algorithm based on high-dimensional clustering. The algorithm requires only a small number of iterations to produce good clusters from very large text collections. Mapping to other recorded data and recording the text to user groups can further improve the algorithm's results. Tests on relevant data and analysis of the experimental results verify the feasibility and validity of the proposed method.

Key words: free text, high-dimensional clustering, data coverage, text mining, data mining
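
The abstract does not give the clustering procedure itself, so the following is only a rough illustration of the general idea of clustering free text in a high-dimensional term space with a small, capped number of iterations. It is a stand-in (k-means with cosine similarity over bag-of-words vectors), not the authors' algorithm. The function names, the whitespace tokenizer, the seeding with the first `k` documents, and the `max_iter` parameter are all assumptions made for this sketch.

```python
import math
from collections import Counter


def vectorize(text):
    """Map free text to a sparse unit-length term-count vector (one dimension per term)."""
    counts = Counter(text.lower().split())  # assumption: naive whitespace tokenizer
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def mean_vector(vectors):
    """Re-normalised sum of the member vectors, used as the cluster centroid."""
    total = Counter()
    for v in vectors:
        for term, w in v.items():
            total[term] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    if norm == 0:
        return {}
    return {term: w / norm for term, w in total.items()}


def cluster_texts(texts, k, max_iter=5):
    """Assign each text a cluster label in range(k), using at most max_iter passes."""
    vectors = [vectorize(t) for t in texts]
    centroids = vectors[:k]  # assumption: seed with the first k documents
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: stop early
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


if __name__ == "__main__":
    records = [
        "disk error on server",
        "payment refund request",
        "server disk failure",
        "refund payment delayed",
    ]
    print(cluster_texts(records, k=2))
```

In this sketch each distinct term is one dimension, so the vector space grows with the vocabulary; the sparse dictionary representation keeps the similarity computation proportional to the number of terms actually present in a record.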