Journal of Computer Applications, 2024, Vol. 44, Issue (10): 3097-3104. DOI: 10.11772/j.issn.1001-9081.2023101419
• Data science and technology •
Yuhao TANG, Dezhong PENG, Zhong YUAN
Received: 2023-10-20
Revised: 2024-01-20
Accepted: 2024-01-26
Online: 2024-10-15
Published: 2024-10-10
Contact: Zhong YUAN
About author: TANG Yuhao, born in 1999, male, native of Chongqing, M. S. candidate. His research interests include anomaly detection.
Supported by:
CLC Number:
Yuhao TANG, Dezhong PENG, Zhong YUAN. Fuzzy multi-granularity anomaly detection for incomplete mixed data[J]. Journal of Computer Applications, 2024, 44(10): 3097-3104.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023101419
c | 4 | 0.7 |
a | * | 0.4 |
c | 1 | 0.6 |
* | 2 | 0.3 |
a | 8 | 0.5 |
b | 10 | * |
Tab. 1 Incomplete information system
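One way to hold an incomplete mixed table such as Tab. 1 in code is sketched below. It assumes that `*` marks a missing value, as is conventional for incomplete information systems, and that the three columns are one nominal and two numerical attributes. The column names `a1`, `a2`, `a3` and the helper names are placeholders, since the table header is not shown here.

```python
# Rows of Tab. 1 as raw strings; "*" is assumed to mark a missing value.
RAW_ROWS = [
    ("c", "4", "0.7"),
    ("a", "*", "0.4"),
    ("c", "1", "0.6"),
    ("*", "2", "0.3"),
    ("a", "8", "0.5"),
    ("b", "10", "*"),
]

# Hypothetical attribute names and types (the original header is not available).
COLUMNS = [("a1", str), ("a2", int), ("a3", float)]


def parse_rows(rows):
    """Convert raw strings to typed values, mapping "*" to None."""
    parsed = []
    for row in rows:
        record = {}
        for (name, cast), cell in zip(COLUMNS, row):
            record[name] = None if cell == "*" else cast(cell)
        parsed.append(record)
    return parsed


def missing_counts(records):
    """Count the missing values in each attribute."""
    return {name: sum(r[name] is None for r in records) for name, _ in COLUMNS}


records = parse_rows(RAW_ROWS)
counts = missing_counts(records)
print(counts)
```

Keeping missing entries as `None` rather than imputing them leaves the choice of how to compare incomplete values to whatever similarity relation is applied later.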
No. | Dataset name | Abbreviation | Number of attributes | Number of samples | Number of outliers | Type |
---|---|---|---|---|---|---|
1 | Vertebral | Vert | 6 | 240 | 30 | Numerical |
2 | Wine | Wine | 13 | 129 | 10 | Numerical |
3 | Pima_TRUE_55_variant1 | Pima | 9 | 555 | 55 | Numerical |
4 | Hepatitis_2_9_variant1 | Hep | 19 | 94 | 9 | Mixed |
5 | German_1_14_variant1 | Germ | 20 | 714 | 14 | Mixed |
6 | Heart270_2_16_variant1 | Heart | 13 | 166 | 16 | Mixed |
7 | Wdbc_M_39_variant1 | Wdbc | 31 | 483 | 39 | Numerical |
8 | Sonar_M_10_variant1 | Sonar | 60 | 107 | 10 | Numerical |
9 | tic_tac_toe_negative_12_variant1 | Tic | 9 | 638 | 12 | Nominal |
10 | bands_band_6_variant1 | Bands | 38 | 317 | 6 | Mixed |
11 | creditA_plus_42_variant1 | CreditA | 15 | 425 | 42 | Mixed |
12 | horse_1_12_variant1 | Horse | 28 | 256 | 12 | Mixed |
Tab. 2 Datasets used in experiments
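The outlier counts in Tab. 2 are easier to compare as proportions of each dataset. The following sketch computes that proportion from the sample and outlier counts listed in the table; the dictionary layout and the helper name `outlier_ratios` are illustrative choices, not part of the original work.

```python
# (number of samples, number of outliers) per dataset abbreviation, copied from Tab. 2.
DATASETS = {
    "Vert": (240, 30),
    "Wine": (129, 10),
    "Pima": (555, 55),
    "Hep": (94, 9),
    "Germ": (714, 14),
    "Heart": (166, 16),
    "Wdbc": (483, 39),
    "Sonar": (107, 10),
    "Tic": (638, 12),
    "Bands": (317, 6),
    "CreditA": (425, 42),
    "Horse": (256, 12),
}


def outlier_ratios(datasets):
    """Return the fraction of samples labelled as outliers for each dataset."""
    return {
        name: outliers / samples
        for name, (samples, outliers) in datasets.items()
    }


ratios = outlier_ratios(DATASETS)

# List datasets from the lowest to the highest outlier proportion.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:8s} {ratio:6.1%}")
```

In every dataset listed, outliers make up a small minority of the samples, which is the usual setting for evaluating detectors with AUC.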
Dataset | COF | ROD | CD | LOCI | LMDD | HBOS | LODA | LOF | ILGNI | MFGAD | ADFIIS |
---|---|---|---|---|---|---|---|---|---|---|---|
Average | 0.758 | 0.867 | 0.807 | 0.642 | 0.707 | 0.883 | 0.716 | 0.715 | 0.846 | 0.929 | 0.920 |
Vert | 0.406 | 0.384 | 0.417 | 0.665 | 0.449 | 0.319 | 0.349 | 0.355 | 0.384 | 0.663 | 0.670 |
Wine | 0.853 | 0.865 | 0.573 | 0.526 | 0.825 | 0.902 | 0.849 | 0.889 | 0.712 | 0.947 | 0.968 |
Pima | 0.871 | 0.948 | 0.897 | 0.623 | 0.888 | 0.948 | 0.944 | 0.838 | 0.871 | 0.967 | 0.945 |
Hep | 0.808 | 0.986 | 0.874 | 0.925 | 0.846 | 0.981 | 0.341 | 0.711 | 0.999 | 0.983 | 0.986 |
Germ | 0.868 | 0.945 | 0.913 | 0.627 | 0.451 | 0.912 | 0.682 | 0.903 | 0.943 | 0.970 | 0.942 |
Heart | 0.950 | 0.980 | 0.873 | 0.663 | 0.868 | 0.978 | 0.790 | 0.783 | 0.816 | 0.979 | 0.983 |
Wdbc | 0.758 | 0.995 | 0.913 | 0.642 | 0.992 | 0.996 | 0.984 | 0.459 | 0.975 | 0.993 | 0.997 |
Sonar | 0.951 | 0.989 | 0.891 | 0.784 | 0.899 | 0.991 | 0.957 | 0.982 | 0.966 | 0.993 | 0.994 |
Tic | 0.516 | 0.621 | 0.809 | 0.640 | 0.440 | 0.724 | 0.679 | 0.746 | 0.930 | 0.812 | 0.815 |
Bands | 0.487 | 0.812 | 0.884 | 0.442 | 0.467 | 0.905 | 0.775 | 0.342 | 0.874 | 0.917 | 0.786 |
CreditA | 0.788 | 0.984 | 0.870 | 0.475 | 0.469 | 0.984 | 0.451 | 0.750 | 0.854 | 0.980 | 0.981 |
Horse | 0.834 | 0.896 | 0.768 | 0.691 | 0.852 | 0.953 | 0.791 | 0.820 | 0.825 | 0.943 | 0.967 |
Tab. 3 Experimental comparison results on AUC values
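As a worked check of how the average row relates to the per-dataset rows, the sketch below recomputes the mean AUC for the two rightmost columns of Tab. 3 (MFGAD and ADFIIS) and counts on how many datasets each has the higher value. The values are copied from the table; the variable and function names are illustrative only.

```python
# Per-dataset AUC values for two columns of Tab. 3, in table row order.
DATASET_NAMES = [
    "Vert", "Wine", "Pima", "Hep", "Germ", "Heart",
    "Wdbc", "Sonar", "Tic", "Bands", "CreditA", "Horse",
]
AUC = {
    "MFGAD": [0.663, 0.947, 0.967, 0.983, 0.970, 0.979,
              0.993, 0.993, 0.812, 0.917, 0.980, 0.943],
    "ADFIIS": [0.670, 0.968, 0.945, 0.986, 0.942, 0.983,
               0.997, 0.994, 0.815, 0.786, 0.981, 0.967],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


means = {method: mean(scores) for method, scores in AUC.items()}

# Datasets on which ADFIIS has a strictly higher AUC than MFGAD.
adfiis_wins = [
    name
    for name, m, a in zip(DATASET_NAMES, AUC["MFGAD"], AUC["ADFIIS"])
    if a > m
]

for method, value in means.items():
    print(f"{method}: mean AUC = {value:.4f}")
print(f"ADFIIS higher on {len(adfiis_wins)} of {len(DATASET_NAMES)} datasets")
```

The mean and the per-dataset count answer different questions: a single large gap on one dataset (here Bands) can move the average more than several small gaps in the other direction.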