Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (05): 1280-1283.

• Data Mining and Artificial Intelligence •

Weighted non-negative matrix factorization for incomplete dataset

YANG Zhijun1, YE Dongyi2

  1. College of Mathematics and Computer Science, Fuzhou University (computer science graduate student, 2007 intake)
  2. College of Mathematics and Computer Science, Fuzhou University, Fujian Province
  • Received: 2009-10-09 Revised: 2009-10-10 Online: 2010-05-04 Published: 2010-05-01
  • Corresponding author: YANG Zhijun
  • Contact: flistorm
  • Funding:
    National Natural Science Foundation of China

Abstract: Non-negative matrix factorization (NMF) is a relatively new method for feature extraction and dimensionality reduction. Compared with some traditional algorithms, it is simple to implement, and both the form and the results of the factorization are interpretable. However, NMF cannot directly factorize a sample matrix that is incomplete. This paper proposes a weighted non-negative matrix factorization algorithm for incomplete datasets (NMFI). When processing an incomplete sample matrix, NMFI first applies random repair to reduce the error and then uses weighting to control the weight of each sample, weakening as far as possible the interference of missing data with the factorization result. In addition, NMFI uses regional weights to further reduce the effect of missing data in critical regions. Experimental results show that NMFI effectively extracts information from the remaining data in the samples and reduces the influence of missing data on the factorization result.

Key words: non-negative matrix factorization (NMF), incomplete dataset, random repair, weighting, regional weight
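
The abstract does not state NMFI's update rules, so the following is only a minimal Python sketch of the general idea it describes (repair the missing entries randomly, then down-weight them during factorization), not the authors' algorithm. It uses the standard multiplicative updates for weighted NMF. The function names `random_repair`, `weighted_nmf`, and `weighted_error`, the uniform repair scheme, and the weight value 0.1 for repaired entries are all illustrative assumptions; regional weights are not modelled.

```python
import numpy as np


def random_repair(V, observed, seed=0):
    """Fill missing entries with uniform draws from the observed value range.

    Assumed repair scheme for illustration; the paper's scheme may differ.
    """
    rng = np.random.default_rng(seed)
    lo, hi = V[observed].min(), V[observed].max()
    filled = V.copy()
    filled[~observed] = rng.uniform(lo, hi, size=int((~observed).sum()))
    return filled


def weighted_nmf(V, weights, rank, n_iter=200, eps=1e-9, seed=0):
    """Weighted NMF by multiplicative updates.

    Approximately minimises sum(weights * (V - W @ H) ** 2) with W, H >= 0.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    WV = weights * V  # element-wise weighted data, reused every iteration
    for _ in range(n_iter):
        H *= (W.T @ WV) / (W.T @ (weights * (W @ H)) + eps)
        W *= (WV @ H.T) / ((weights * (W @ H)) @ H.T + eps)
    return W, H


def weighted_error(V, W, H, weights):
    """Weighted squared reconstruction error."""
    return float(np.sum(weights * (V - W @ H) ** 2))


# Hypothetical usage on synthetic data with roughly 20% of entries missing.
rng = np.random.default_rng(1)
V_true = rng.random((20, 4)) @ rng.random((4, 30))
observed = rng.random(V_true.shape) > 0.2
V = random_repair(np.where(observed, V_true, np.nan), observed)
weights = np.where(observed, 1.0, 0.1)  # assumed: repaired entries count less
W, H = weighted_nmf(V, weights, rank=4)
```

Setting the weight of repaired entries to 0 would ignore them entirely; a small positive weight, as assumed here, lets the repaired values act as a weak prior while the observed entries dominate the fit.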