[1] YANG Q, WU X. 10 challenging problems in data mining research[J]. International Journal of Information Technology & Decision Making, 2006, 5(4): 597-604.
[2] YANG Z, TANG W H, SHINTEMIROV A, et al. Association rule mining-based dissolved gas analysis for fault diagnosis of power transformers[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2009, 39(6): 597-610.
[3] KHREICH W, GRANGER E, MIRI A, et al. Iterative Boolean combination of classifiers in the ROC space: an application to anomaly detection with HMMs[J]. Pattern Recognition, 2010, 43(8): 2732-2752.
[4] MAZUROWSKI M A, HABAS P A, ZURADA J M, et al. 2008 special issue: training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance[J]. Neural Networks, 2008, 21(2/3): 427-436.
[5] LIU Y-H, CHEN Y-T. Total margin based adaptive fuzzy support vector machines for multiview face recognition[C]//Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics. Piscataway, NJ: IEEE, 2005: 1704-1711.
[6] QUINLAN J R. Improved estimates for the accuracy of small disjuncts[J]. Machine Learning, 1991, 6(1): 93-98.
[7] ZADROZNY B, ELKAN C. Learning and making decisions when costs and probabilities are both unknown[C]//SIGKDD 2001: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2001: 204-213.
[8] WU G, CHANG E Y. KBA: kernel boundary alignment considering imbalanced data distribution[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(6): 786-795.
[9] BATISTA G E A P A, PRATI R C, MONARD M C. A study of the behavior of several methods for balancing machine learning training data[J]. ACM SIGKDD Explorations Newsletter-Special Issue on Learning from Imbalanced Datasets, 2004, 6(1): 20-29.
[10] CHAWLA N V, JAPKOWICZ N, KOTCZ A. Editorial: special issue on learning from imbalanced data sets[J]. ACM SIGKDD Explorations Newsletter-Special Issue on Learning from Imbalanced Datasets, 2004, 6(1): 1-6.
[11] GENG G-G, WANG C-H, LI Q-D, et al. Boosting the performance of Web spam detection with ensemble under-sampling classification[C]//FSKD'07: Proceedings of the IEEE Fourth International Conference on Fuzzy Systems and Knowledge Discovery. Piscataway, NJ: IEEE, 2007, 4: 583-587.
[12] CHAWLA N V, BOWYER K W, HALL L O, et al. SMOTE: synthetic minority over-sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16(1): 321-357.
[13] CHAWLA N V, CIESLAK D A, HALL L O, et al. Automatically countering imbalance and its empirical relationship to cost[J]. Data Mining and Knowledge Discovery, 2008, 17(2): 225-252.
[14] FREITAS A, COSTA-PEREIRA A, BRAZDIL P. Cost-sensitive decision trees applied to medical data[C]//DaWaK 2007: Proceedings of the 9th International Conference on Data Warehousing and Knowledge Discovery, LNCS 4654. Berlin Heidelberg: Springer, 2007: 303-312.
[15] SPIRIN N, HAN J. Survey on Web spam detection: principles and algorithms[J]. ACM SIGKDD Explorations Newsletter, 2012, 13(2): 50-64.
[16] CASTILLO C, DONATO D, BECCHETTI L, et al. A reference collection for Web spam[J]. ACM SIGIR Forum, 2006, 40(2): 11-24.
[17] FAWCETT T. An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27(8): 861-874.
[18] DAVIS J, GOADRICH M. The relationship between precision-recall and ROC curves[C]//Proceedings of the 23rd International Conference on Machine Learning. New York: ACM, 2006: 233-240.
[19] LU X Y, CHEN M S. Web spam detection based on random forests and under-sampling ensemble[J]. Journal of Computer Applications, 2016, 36(1): 156-159. (in Chinese)
[20] LU X Y, CHEN M S, WU J L, et al. Web spam detection based on immune clonal feature selection and under-sampling ensemble[J]. Journal of Computer Applications, 2016, 36(7): 1899-1903. (in Chinese)