Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1342-1348.DOI: 10.11772/j.issn.1001-9081.2022030429

• China Conference on Data Mining 2022 (CCDM 2022) •

Iteratively modified robust extreme learning machine

Xinwei LYU1,2, Shuxia LU1,2()   

  1. Hebei Key Laboratory of Machine Learning and Computational Intelligence (Hebei University), Baoding, Hebei 071002, China
    2. College of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002, China
  • Received:2022-03-17 Revised:2023-02-03 Accepted:2023-02-06 Online:2023-05-08 Published:2023-05-10
  • Contact: Shuxia LU
  • About author: LYU Xinwei, born in 1997, M. S. candidate. His research interests include machine learning.
    LU Shuxia, born in 1966, Ph. D., professor, CCF member. Her research interests include machine learning and deep learning. E-mail: cmclusx@126.com
  • Supported by:
    Natural Science Foundation of Hebei Province(F2021201020)

Abstract:

Many variants of the Extreme Learning Machine (ELM) aim to improve robustness to outliers, yet the traditional Robust Extreme Learning Machine (RELM) remains highly sensitive to them, and handling data containing too many extreme outliers is the hardest problem in constructing RELM models. For outliers with large residuals, a bounded loss function was used to eliminate their contamination of the model; to address the problem of excessive outliers, an iterative modification technique was used to modify the data and reduce the influence of those outliers. Combining these two approaches, an Iteratively Modified RELM (IMRELM) was proposed and solved by iteration. In each iteration, the samples were reweighted to reduce the influence of outliers, and under-fitting was avoided during the continual modification process. IMRELM was compared with ELM, Weighted ELM (WELM), Iteratively Re-Weighted ELM (IRWELM) and Iteratively Reweighted Regularized ELM (IRRELM) on synthetic and real datasets with different outlier levels. On the synthetic dataset with 80% outliers, the Mean-Square Error (MSE) of IRRELM is 2.450 44, while that of IMRELM is 0.000 79. Experimental results show that IMRELM achieves good prediction accuracy and robustness on data with excessive extreme outliers.

Key words: Robust Extreme Learning Machine (RELM), reweighting, iterative modification, outlier, regression

CLC Number: