Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (6): 1844-1854.DOI: 10.11772/j.issn.1001-9081.2025060685

• Data science and technology

Multi-view consistency-driven robust feature selection method

Xue XU1, Hu FAN1, Yandan WANG2, Xue DING1, Xuefeng GAO1, Bo ZHANG1, Bo LIU1, Beihong JIN2

  1. Information Center, China Tobacco Zhejiang Industrial Company Limited, Hangzhou, Zhejiang 310008, China
  2. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2025-06-23 Revised: 2025-09-23 Accepted: 2025-09-29 Online: 2025-10-15 Published: 2026-06-10
  • Contact: Beihong JIN
  • About author: XU Xue, born in 1994, M. S., engineer. Her research interests include data mining and machine learning.
    FAN Hu, born in 1978, M. S., engineer. His research interests include process data mining and analysis.
    WANG Yandan, born in 1998, Ph. D. candidate. Her research interests include deep learning and intelligent data processing.
    DING Xue, born in 1983, M. S., senior engineer. Her research interests include process data analysis.
    GAO Xuefeng, born in 1986, engineer. Her research interests include process data analysis.
    ZHANG Bo, born in 1978, M. S., senior engineer. His research interests include process data mining.
    LIU Bo, born in 1981, engineer. His research interests include process data analysis.
    First author contact: JIN Beihong, born in 1967, Ph. D., professor. Her research interests include deep learning and distributed computing.
  • Supported by:
    Science and Technology Program of China Tobacco Zhejiang Industrial Company Limited (ZJZY2024E003)

多视图一致性驱动的鲁棒特征选择方法

许雪1, 樊虎1, 王彦丹2, 丁雪1, 高雪峰1, 张博1, 刘博1, 金蓓弘2

  1. 浙江中烟工业有限责任公司 信息中心,杭州 310008
  2. 中国科学院 软件研究所,北京 100190
  • 通讯作者: 金蓓弘
  • 作者简介:许雪(1994—),女,山东菏泽人,工程师,硕士,主要研究方向:数据挖掘、机器学习
    樊虎(1978—),男,湖北枝江人,工程师,硕士,主要研究方向:工艺数据挖掘与分析
    王彦丹(1998—),女,山东青岛人,博士研究生,主要研究方向:深度学习、智能数据处理
    丁雪(1983—),女,山东潍坊人,高级工程师,硕士,主要研究方向:工艺数据分析
    高雪峰(1986—),女,河南社旗人,工程师,主要研究方向:工艺数据分析
    张博(1978—),男,吉林洮南人,高级工程师,硕士,主要研究方向:工艺数据挖掘
    刘博(1981—),男,湖北大悟人,工程师,主要研究方向:工艺数据分析
    第一联系人:金蓓弘(1967—),女,浙江杭州人,教授,博士,主要研究方向:深度学习、分布式计算。
  • 基金资助:
    浙江中烟工业有限责任公司科技项目(ZJZY2024E003)

Abstract:

Identifying important features in high-dimensional, complex industrial data is crucial for anomaly monitoring in production processes. To address the difficulty that existing feature selection algorithms have in modeling the complex intrinsic structure of data under noise disturbance, a Multi-view Consistency-driven Robust feature selection method (MCR) was proposed. Firstly, a structure-preserving, consistency-guided denoising mechanism was designed, in which multi-view collaborative modeling and inconsistent-region detection were used to eliminate local noise disturbance while improving the structural fidelity and integrity of the raw data. Then, a joint discriminative and consistency-driven feature fusion module was constructed, in which high-quality multi-view embedding representations and a feature weight matrix were learned simultaneously, thereby enhancing the ability to perceive key feature dimensions. Finally, a feature selection strategy based on cooperative sparse regularization was introduced to select the most discriminative and structurally consistent feature subset from the fused embedding space. Without relying on label information, the method perceives and selects key feature dimensions through multi-view collaborative modeling and consistency-driven optimization. Extensive experimental results on several public benchmark datasets and a real-world cigarette production dataset show that MCR outperforms existing mainstream feature selection methods such as Binary Horse herd Optimization Algorithm (BinHOA) and Improved Binary DJaya Algorithm (IBJA), improving classification accuracy by 0.23 to 12.15 percentage points on the public datasets and by 2.22 to 5.00 percentage points on the real industrial dataset, which validates its robustness and effectiveness in complex scenarios.

Key words: feature selection, multi-view learning, unsupervised learning, noise robustness, multi-view consistency
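
The abstract does not give the form of the cooperative sparse regularizer or the rule used to pick features from the learned feature weight matrix. As a minimal sketch only, the Python below shows a convention that is common in sparse-regularized feature selection in general: penalize the weight matrix with the l2,1 norm, then rank each feature by the l2 norm of its row and keep the top-ranked ones. This is an assumption for illustration, not the MCR formulation; the function names `l21_norm` and `select_features_by_row_norm`, the matrix `W`, and the parameter `num_features` are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1 norm of W: the sum of the l2 norms of its rows.

    Assumption: rows of W correspond to features, columns to embedding
    dimensions. Penalizing this norm pushes whole rows toward zero.
    """
    W = np.asarray(W, dtype=float)
    return float(np.linalg.norm(W, axis=1).sum())


def select_features_by_row_norm(W, num_features):
    """Rank features by the l2 norm of their row in W and keep the largest.

    Returns (selected, scores): the sorted indices of the kept features and
    the per-feature score array. Hypothetical helper, not the paper's rule.
    """
    W = np.asarray(W, dtype=float)
    if W.ndim != 2:
        raise ValueError("W must be a 2-D array (features x dimensions)")
    if not 0 < num_features <= W.shape[0]:
        raise ValueError("num_features must be between 1 and W.shape[0]")

    scores = np.linalg.norm(W, axis=1)          # one score per feature
    order = np.argsort(-scores, kind="stable")  # descending by score
    selected = np.sort(order[:num_features])    # report in index order
    return selected, scores


if __name__ == "__main__":
    # Toy weight matrix: 4 features, 2 embedding dimensions.
    W = np.array([[0.0, 0.0],
                  [3.0, 4.0],
                  [0.1, 0.0],
                  [1.0, 1.0]])
    selected, scores = select_features_by_row_norm(W, num_features=2)
    print("selected feature indices:", selected)
    print("row-norm scores:", scores)
    print("l2,1 norm:", l21_norm(W))
```

In this toy matrix the second and fourth rows have the largest norms, so those two features are kept, while the all-zero first row contributes nothing and is discarded.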

摘要:

从高维复杂的工业数据中精准识别关键特征对于生产过程异常监测具有重要意义。针对现有特征选择算法面对噪声扰动难以建模数据复杂内在结构的问题,提出一种多视图一致性驱动的鲁棒特征选择方法(MCR)。首先,提出一种结构保持的一致性引导去噪机制,以通过多视图协同建模与不一致性区域检测,有效剔除局部噪声干扰,并提升原始数据的结构保真性与数据完整性;其次,构建联合判别与一致性驱动的特征融合模块,学习高质量的多视图嵌入表示与特征权重矩阵,从而提升对关键特征维度的感知能力;最后,引入一种基于协同稀疏正则化的特征选择机制,从融合后的嵌入空间中筛选出一个最具判别力和结构一致性的特征子集。该方法无需依赖标签信息,通过多视图协同建模与一致性驱动优化,实现对关键特征维度的感知与选择。在多个公开基准数据集以及一个真实的卷烟生产过程数据集上的大量实验结果表明,MCR在多个分类任务中相较于高效的二值马群优化算法(BinHOA)和二值化Jaya算法(IBJA)等现有的主流方法的分类准确率提升达0.23~12.15个百分点,在实际工业数据集上的分类准确率提升达2.22~5.00个百分点,验证了该方法在复杂场景下的鲁棒性与有效性。

关键词: 特征选择, 多视图学习, 无监督学习, 噪声鲁棒性, 多视图一致性

CLC Number: