Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1375-1382. DOI: 10.11772/j.issn.1001-9081.2021050706

• Artificial Intelligence •

Multi-label classification algorithm based on non-negative matrix factorization and sparse representation

Yongchun BAO1, Jianchen ZHANG2, Shouxin DU1, Junjun ZHANG1

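  1. School of Computer Science, Xi’an Polytechnic University, Xi’an 710048
    2. School of Computer and Information, Dezhou University, Dezhou, Shandong 253023
  • Received: 2021-05-06 Revised: 2021-09-07 Accepted: 2021-09-16 Online: 2022-03-08 Published: 2022-05-10
  • Corresponding author: Yongchun BAO
  • About the authors: BAO Yongchun (born in 1996), male, from Heze, Shandong, M. S. candidate. Main research interests: machine learning, data mining, artificial intelligence. E-mail: baoyongchun2014@163.com
    ZHANG Jianchen (born in 1974), male, from Weishan, Shandong, associate professor, M. S. Main research interests: artificial intelligence, big data analysis.
    DU Shouxin (born in 1995), male, from Heze, Shandong, M. S. Main research interests: intelligent manufacturing.
    ZHANG Junjun (born in 1994), male, from Xianyang, Shaanxi, M. S. Main research interests: artificial intelligence, machine learning, deep learning.
  • Funding:
    Xi’an Science and Technology Program (2020KJRC0027)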

Multi-label classification algorithm based on non-negative matrix factorization and sparse representation

Yongchun BAO1, Jianchen ZHANG2, Shouxin DU1, Junjun ZHANG1

  1. School of Computer Science, Xi’an Polytechnic University, Xi’an, Shaanxi 710048, China
    2. School of Computer and Information, Dezhou University, Dezhou, Shandong 253023, China
  • Received: 2021-05-06 Revised: 2021-09-07 Accepted: 2021-09-16 Online: 2022-03-08 Published: 2022-05-10
  • Contact: Yongchun BAO
  • About author: BAO Yongchun, born in 1996, M. S. candidate. His research interests include machine learning, data mining and artificial intelligence.
    ZHANG Jianchen, born in 1974, M. S., associate professor. His research interests include artificial intelligence and big data analysis.
    DU Shouxin, born in 1995, M. S. His research interests include intelligent manufacturing.
    ZHANG Junjun, born in 1994, M. S. His research interests include artificial intelligence, machine learning and deep learning.
  • Supported by:
    Xi’an Science and Technology Program (2020KJRC0027)

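Abstract:

Traditional multi-label classification algorithms are built on binary label prediction, but a binary label only indicates whether a sample belongs to a given category, so it carries little semantic information and cannot fully express label semantics. To fully mine the semantic information of the label space, a multi-label classification algorithm based on non-negative matrix factorization and sparse representation (MLNS) is proposed. The algorithm combines non-negative matrix factorization with sparse representation to convert the binary labels of the data into real-valued labels, thereby enriching the label semantic information and improving classification performance. First, non-negative matrix factorization is applied to the label space to obtain a latent label semantic space, which is combined with the original feature space to form a new feature space. Then, sparse coding is performed on this feature space to obtain the global similarity relations between samples. Finally, these similarity relations are used to reconstruct the binary label vectors, which converts the binary labels into real-valued labels. The proposed algorithm is compared with MLBGM, ML2, LIFT, MLRWKNN and other algorithms on 5 standard multi-label datasets and 5 evaluation metrics. The experimental results show that MLNS outperforms the compared multi-label classification algorithms: it ranks first in 50% of the cases, in the top two in 76% of the cases, and in the top three in all cases.

Key words: multi-label classification, non-negative matrix factorization, sparse representation, multi-output regression, machine learning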

Abstract:

Traditional multi-label classification algorithms are based on binary label prediction. However, a binary label only indicates whether a sample belongs to a category, so it carries little semantic information and cannot fully represent label semantics. To fully mine the semantic information of the label space, a Multi-Label classification algorithm based on Non-negative matrix factorization and Sparse representation (MLNS) was proposed. In the proposed algorithm, non-negative matrix factorization and sparse representation were combined to transform the binary labels of the data into real-valued labels, thereby enriching the label semantic information and improving the classification performance. Firstly, the label latent semantic space was obtained by non-negative matrix factorization of the label space, and it was combined with the original feature space to form a new feature space. Then, the global similarity relations between samples were obtained by sparse coding of the new feature space. Finally, the binary label vectors were reconstructed by using these similarity relations, which transformed the binary labels into real-valued labels. The proposed algorithm was compared with algorithms such as Multi-Label classification Based on Gravitational Model (MLBGM), Multi-Label Manifold Learning (ML2), multi-Label learning with label-specific FeaTures (LIFT) and Multi-Label classification based on the Random Walk graph and the K-Nearest Neighbor algorithm (MLRWKNN) on 5 standard multi-label datasets and 5 evaluation metrics. Experimental results show that the proposed MLNS outperforms the compared multi-label classification algorithms: it ranks first in 50% of the cases, in the top two in 76% of the cases, and in the top three in all cases.
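
The three steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the objective functions, so the multiplicative-update NMF, the ISTA solver for the sparse codes, the zero-diagonal constraint, and every name and parameter below (`nmf`, `sparse_codes`, `real_valued_labels`, `k`, `lam`) are assumptions chosen for illustration.

```python
import numpy as np

def nmf(Y, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative Y (n x q) by W (n x k) @ H (k x q)
    with multiplicative updates (assumed solver, not from the paper)."""
    rng = np.random.default_rng(seed)
    n, q = Y.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, q)) + eps
    for _ in range(n_iter):
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W, H

def sparse_codes(Z, lam=0.1, n_iter=300):
    """Find sparse S (n x n, zero diagonal) with Z ~ S @ Z by ISTA on
    0.5 * ||Z - S Z||_F^2 + lam * ||S||_1 (assumed formulation)."""
    n = Z.shape[0]
    G = Z @ Z.T                                   # Gram matrix of the samples
    step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)   # 1 / Lipschitz constant
    S = np.zeros((n, n))
    for _ in range(n_iter):
        S = S - step * (S @ G - G)                # gradient step
        S = np.sign(S) * np.maximum(np.abs(S) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(S, 0.0)                  # a sample may not code itself
    return S

def real_valued_labels(X, Y, k=3, lam=0.1):
    """Binary labels Y (n x q) -> real-valued labels U (n x q)."""
    W, H = nmf(Y, k)              # step 1: latent label semantic space W
    Z = np.hstack([X, W])         # ... concatenated with the original features
    S = sparse_codes(Z, lam)      # step 2: global similarity between samples
    U = S @ Y                     # step 3: reconstruct the label vectors
    return U, S, W, H
```

Under these assumptions, each row of `U` is a similarity-weighted combination of the other samples' binary label vectors, so a label shared by many similar samples receives a larger value than an isolated one. A multi-output regressor could then be fitted from `X` to `U`, which matches the "multiple output regression" keyword, although the abstract does not say which regressor the authors use.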

Key words: multi-label classification, non-negative matrix factorization, sparse representation, multiple output regression, machine learning

CLC Number: