Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (8): 2212-2218.

Special Issue: Artificial intelligence

• Artificial intelligence •

### Distribution entropy penalized support vector data description

1. School of Information Engineering, Huzhou University, Huzhou Zhejiang 313000, China;
2. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi Jiangsu 214122, China
• Received: 2020-10-08 Revised: 2021-01-30 Online: 2021-08-10 Published: 2021-02-24
• Supported by:
This work is partially supported by the National Natural Science Foundation of China (61772198) and the Basic Public Welfare Research Program of Zhejiang Province (LGN18F020002).

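• Corresponding author: HU Tianjie
• About the authors: HU Tianjie (born 1997), male, from Xuancheng, Anhui, M.S. candidate; research interests: pattern recognition, fault diagnosis. HU Wenjun (born 1977), male, from Jixi, Anhui, professor, Ph.D., CCF member; research interests: machine learning, pattern recognition, intelligent systems. WANG Shitong (born 1964), male, from Yangzhou, Jiangsu, professor, M.S.; research interests: pattern recognition, data mining, fuzzy systems.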

Abstract: To address the sensitivity of traditional Support Vector Data Description (SVDD) to its penalty parameters, a new anomaly detection method, Distribution Entropy Penalized SVDD (DEP-SVDD), was proposed. First, the normal samples were taken as the global distribution of the data, and a distance measure between each sample point and the center of the normal sample distribution was defined in the Gaussian kernel space. Then, a probability was defined for every data point to estimate how likely the point is to be a normal or an abnormal sample. Finally, this probability was used to construct a penalty degree based on distribution entropy, which was applied to the corresponding sample. On 9 real-world datasets, the proposed method was compared with SVDD, Density Weighted SVDD (DW-SVDD), Position regularized SVDD (P-SVDD), K-Nearest Neighbor (KNN) and isolation Forest (iForest). The results show that DEP-SVDD achieves the highest classification precision on 6 of the 9 datasets, indicating that it outperforms the compared methods on most of the datasets tested.
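
The three steps in the abstract can be sketched roughly as follows. The abstract does not give the formulas, so only the kernel-space distance to the sample mean is a standard identity here; the probability mapping (`normal_probabilities`) and the entropy-based weight (`entropy_penalties`) are illustrative assumptions with hypothetical names, not the definitions used in the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_center_sq_distances(samples, sigma=1.0):
    """Squared distance of each sample to the sample mean in kernel space.

    Uses ||phi(x_i) - c||^2 = k(x_i, x_i) - (2/n) * sum_j k(x_i, x_j)
                              + (1/n^2) * sum_{j,l} k(x_j, x_l).
    """
    n = len(samples)
    gram = [[gaussian_kernel(a, b, sigma) for b in samples] for a in samples]
    row_means = [sum(row) / n for row in gram]
    grand_mean = sum(row_means) / n
    return [gram[i][i] - 2.0 * row_means[i] + grand_mean for i in range(n)]


def normal_probabilities(sq_dists):
    """ASSUMPTION: map distance to a 'normal' score p_i = exp(-d_i^2 / mean(d^2))."""
    scale = sum(sq_dists) / len(sq_dists) or 1.0  # avoid dividing by zero
    return [math.exp(-d / scale) for d in sq_dists]


def entropy_penalties(probs, base_penalty=1.0):
    """ASSUMPTION: per-sample penalty from the entropy term -q_i * log(q_i).

    q is the normalized score distribution. For q_i below 1/e the term grows
    with q_i, so samples near the center receive a larger penalty weight and
    are more costly to leave outside the description.
    """
    total = sum(probs)
    q = [p / total for p in probs]
    terms = [-v * math.log(v) if v > 0.0 else 0.0 for v in q]
    top = max(terms) or 1.0
    return [base_penalty * t / top for t in terms]


# Toy data: five clustered points and one distant point (index 5).
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (3.0, 3.0)]
sq_dists = kernel_center_sq_distances(samples, sigma=1.0)
probs = normal_probabilities(sq_dists)
penalties = entropy_penalties(probs, base_penalty=1.0)
```

In a weighted SVDD formulation, values such as `penalties` would take the place of the single global penalty parameter, so that each sample's slack is penalized individually; how DEP-SVDD does this exactly is specified in the paper, not here.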
