Abstract: Traditional methods for setting confidence thresholds for positive and negative association rules struggle to limit the number of low-reliability rules and tend to miss some interesting rules. To address this, a new two-level confidence threshold setting method that incorporates the correlation of a rule's itemsets, called PNMC-TWO, is proposed. First, with the consistency, validity and interestingness of rules taken into account, and within the correlation-support-confidence framework, the way rule confidence changes with the support of the rule's itemsets is analyzed systematically, based on the computational relationship between rule confidence and itemset support. Then, in light of users' demand for high-confidence, interesting rules in practical mining, a new confidence threshold setting model is proposed to avoid the blindness and arbitrariness of traditional threshold selection. Finally, the proposed method is compared with the original two-threshold method in terms of the quantity and quality of the extracted rules. The experimental results show that the new two-level threshold method not only makes the extracted association rules more valid and interesting, but also significantly reduces the number of low-reliability rules.
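The relationship between rule confidence and itemset support that the abstract refers to can be illustrated with the standard textbook definitions for the four rule forms. The sketch below is only an illustration under stated assumptions: the names `Supports`, `correlation`, `confidences` and `candidate_forms` are hypothetical, `~` denotes negation, correlation is taken to be the usual ratio supp(A ∪ B) / (supp(A) · supp(B)), and the PNMC-TWO threshold model itself is not reproduced because the abstract does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supports:
    """Relative supports (fractions of transactions) for a rule's itemsets."""
    a: float   # supp(A)
    b: float   # supp(B)
    ab: float  # supp(A ∪ B), i.e. transactions containing both A and B


def correlation(s: Supports) -> float:
    """Assumed correlation measure: supp(A ∪ B) / (supp(A) * supp(B))."""
    return s.ab / (s.a * s.b)


def confidences(s: Supports) -> dict[str, float]:
    """Confidence of all four rule forms, expressed through supports only."""
    return {
        "A=>B": s.ab / s.a,
        "A=>~B": (s.a - s.ab) / s.a,
        "~A=>B": (s.b - s.ab) / (1 - s.a),
        "~A=>~B": (1 - s.a - s.b + s.ab) / (1 - s.a),
    }


def candidate_forms(s: Supports) -> list[str]:
    """Pick rule forms consistent with the sign of the correlation.

    Positively correlated itemsets suggest A=>B and ~A=>~B; negatively
    correlated ones suggest A=>~B and ~A=>B; independent itemsets give none.
    """
    c = correlation(s)
    if c > 1:
        return ["A=>B", "~A=>~B"]
    if c < 1:
        return ["A=>~B", "~A=>B"]
    return []


if __name__ == "__main__":
    s = Supports(a=0.4, b=0.5, ab=0.3)
    conf = confidences(s)
    for form in candidate_forms(s):
        print(form, round(conf[form], 4))
```

Because every confidence value is a function of supp(A), supp(B) and supp(A ∪ B) alone, a single fixed confidence threshold treats rules with very different itemset supports identically, which is the behavior a support-aware threshold setting is meant to address.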