Journal of Computer Applications ›› 2005, Vol. 25 ›› Issue (05): 1004-1005. DOI: 10.3724/SP.J.1087.2005.1004

• Data mining •

Describing method of the distribution of data-sample based on Gain

SUN Wei-wei, LIU Cai-xing, TIAN Xu-hong   

  1. College of Informatics, South China Agricultural University
  • Online: 2005-05-01 Published: 2005-05-01

  • Supported by:

    National Natural Science Foundation of China (60375005)

Abstract: To describe the sample distribution of high-dimensional, discrete classification data, a Gain-based score-ratio method is presented. A score ratio is computed for every sample according to the importance of its attributes and attribute values, and the distribution of samples within each class is described in terms of each sample's membership degree to that class. The probability density curves and histograms of the score ratio clearly show how typical samples and noise samples are distributed in each class.

Key words: Gain, membership degree, sample distribution
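
The abstract does not give the score-ratio formula, so the following Python sketch is only one plausible reading of it, not the authors' method. It assumes that attribute importance is the standard information gain of each discrete attribute, and that attribute-value importance is the fraction of samples sharing that value that belong to a given class. The function names `information_gain` and `score_ratios` and the normalisation over classes are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Information gain of the discrete attribute at column index `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def score_ratios(rows, labels):
    """ASSUMED score ratio (not taken from the paper): for each sample and
    class, sum Gain(attribute) * P(class | attribute value) over attributes,
    then normalise over classes so the ratios act as membership degrees."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    gains = [information_gain(rows, labels, a) for a in range(n_attrs)]

    # value_class[a][v][c] = number of samples with value v on attribute a in class c
    value_class = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(rows, labels):
        for a, v in enumerate(row):
            value_class[a][v][label] += 1

    result = []
    for row in rows:
        scores = {
            c: sum(
                gains[a] * value_class[a][v][c] / sum(value_class[a][v].values())
                for a, v in enumerate(row)
            )
            for c in classes
        }
        total = sum(scores.values())
        # Fall back to a uniform membership if no attribute carries any gain.
        result.append(
            {c: (s / total if total else 1 / len(classes)) for c, s in scores.items()}
        )
    return result
```

Under these assumptions, a sample whose score ratio for its own class is close to 1 would count as typical, while a low ratio would flag a possible noise sample. A histogram of the ratios within each class then gives the kind of distribution picture the abstract describes.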
