Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3104-3112. DOI: 10.11772/j.issn.1001-9081.2021010062

• Artificial intelligence •

Structure-fuzzy multi-class support vector machine algorithm based on pinball loss

Kai LI, Jie LI

  1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002, China
  • Received: 2021-01-13 Revised: 2021-03-20 Accepted: 2021-04-14 Online: 2021-11-20 Published: 2021-11-10
  • Contact: Jie LI
  • About author: LI Kai, born in 1963, Ph. D., professor. His research interests include machine learning and data mining.
    LI Jie, born in 1996, M. S. candidate. Her research interests include machine learning and data mining.
  • Supported by:
    the Natural Science Foundation of Hebei Province (F2018201060)

基于pinball损失的结构模糊多分类支持向量机算法

李凯, 李洁

  1. 河北大学 网络空间安全与计算机学院,河北 保定 071002
  • 通讯作者: 李洁
  • 作者简介:李凯(1963-),男,河北保定人,教授,博士,主要研究方向:机器学习,数据挖掘。
    李洁(1996-),女,河北保定人,硕士研究生,主要研究方向:机器学习,数据挖掘。
  • 基金资助:
    河北省自然科学基金资助项目(F2018201060)

Abstract:

The Multi-Class Support Vector Machine (MSVM) suffers from strong sensitivity to noise, instability under data resampling, and low generalization performance. To address these problems, the pinball loss function, sample fuzzy membership degrees, and sample structural information were introduced into the Simplified Multi-Class Support Vector Machine (SimMSVM) algorithm, yielding a structure-fuzzy multi-class support vector machine algorithm based on pinball loss, named Pin-SFSimMSVM. Experimental results on synthetic datasets, UCI datasets, and UCI datasets with different proportions of added noise show that the accuracy of the proposed Pin-SFSimMSVM algorithm is 0 to 5.25 percentage points higher than that of the SimMSVM algorithm. The results also show that the proposed algorithm not only avoids inseparable regions in multi-class data and computes quickly, but is also insensitive to noise and stable under data resampling. In addition, the proposed algorithm accounts for the fact that different data samples play different roles in classification and for the important prior knowledge contained in the data, which makes classifier training more accurate.

Key words: multi-class, Support Vector Machine (SVM), pinball loss, structural information, fuzzy membership degree
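
For readers unfamiliar with the two loss-related ingredients named in the abstract, the following is a minimal sketch, not the authors' formulation. It assumes the pinball loss form commonly used in pinball-loss SVMs, applied to the margin residual u = 1 - y·f(x): the loss is u when u ≥ 0 and -τu otherwise, with τ in [0, 1]. It also assumes that fuzzy membership degrees act as per-sample weights on the loss, as in standard fuzzy SVMs. The function names and the parameter `tau` are illustrative, and the structural-information term of Pin-SFSimMSVM is not modelled here.

```python
def pinball_loss(u, tau):
    """Pinball loss on a margin residual u = 1 - y * f(x).

    Assumed form: u if u >= 0, otherwise -tau * u, with tau in [0, 1].
    Unlike the hinge loss, samples far on the correct side of the
    margin (u < 0) still incur a small penalty.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return u if u >= 0 else -tau * u


def hinge_loss(u):
    """Hinge loss on the same residual, for comparison."""
    return max(0.0, u)


def fuzzy_weighted_loss(residuals, memberships, tau):
    """Sum of pinball losses, each scaled by a membership degree in [0, 1].

    Assumption: a low membership down-weights a sample that is
    suspected to be noise, as in standard fuzzy SVMs.
    """
    return sum(s * pinball_loss(u, tau) for u, s in zip(residuals, memberships))


if __name__ == "__main__":
    residuals = [0.5, -2.0, 1.5]      # hypothetical margin residuals
    memberships = [1.0, 1.0, 0.5]     # hypothetical fuzzy membership degrees
    print(fuzzy_weighted_loss(residuals, memberships, tau=0.5))
```

With `tau = 0` the pinball loss reduces to the hinge loss. A positive `tau` penalizes both sides of the margin, which is the usual explanation for the improved noise insensitivity and resampling stability of pinball-loss classifiers.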

摘要:

针对多分类支持向量机(MSVM)对噪声较强的敏感性、对重采样数据的不稳定性以及泛化性能低等缺陷,将pinball损失函数、样本模糊隶属度以及样本结构信息引入到简化的多分类支持向量机(SimMSVM)算法中,构建了基于pinball损失的结构模糊多分类支持向量机算法Pin-SFSimMSVM。在人工数据集、UCI数据集以及添加不同比例噪声的UCI数据集上的实验结果显示:所提出的Pin-SFSimMSVM算法与SimMSVM算法相比,准确率均提升了0~5.25个百分点;所提出的算法不仅具有避免多类数据存在不可分区域和计算速度快的优点,而且具有对噪声较好的不敏感性以及对重采样数据的稳定性,同时考虑了不同数据样本在分类时扮演不同角色的事实以及数据中包含的重要先验知识,从而使分类器训练更准确。

关键词: 多分类, 支持向量机, pinball损失, 结构信息, 模糊隶属度

CLC Number: