Application of biclustering algorithm in high-value telecommunication customer segmentation
LIN Qin¹, XUE Yun²
1. School of Information Engineering, Guangdong Medical College, Dongguan Guangdong 523808, China;
2. School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou Guangdong 510006, China
To improve the accuracy of traditional customer segmentation methods, the Large Average Submatrix (LAS) biclustering algorithm was applied. It clusters customer samples and consumer attributes simultaneously in order to identify upscale, high-value customers. Using a new value yardstick and a novel index named PA, the LAS biclustering algorithm was compared with the K-means clustering algorithm in a simulation experiment on the consumption data of a telecom corporation. The experimental results show that the LAS biclustering algorithm finds more groups of high-value customers and produces more accurate clusters. It is therefore better suited to the recognition and segmentation of high-value customers.
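The idea of clustering customers and attributes at the same time can be illustrated with a minimal sketch. It is not the authors' implementation, and it simplifies LAS considerably: the block size (`k` customers by `l` attributes) is fixed in advance, whereas the published LAS procedure also scores candidate blocks and searches over block sizes. The function name `las_fixed_size`, its parameters, and the toy data are assumptions introduced only for illustration.

```python
import numpy as np


def las_fixed_size(X, k, l, init_cols=None, max_iter=100, seed=0):
    """Alternating search for a k-by-l submatrix of X with a large average.

    Simplified sketch: k and l are fixed (an assumption made here).
    Returns (row indices, column indices, average of the selected block).
    """
    rng = np.random.default_rng(seed)
    n_cols = X.shape[1]
    if init_cols is None:
        # Random starting attributes when none are supplied.
        cols = np.sort(rng.choice(n_cols, size=l, replace=False))
    else:
        cols = np.sort(np.asarray(init_cols))
    rows = None
    for _ in range(max_iter):
        # Keep the k customers with the largest sums over the current attributes.
        new_rows = np.sort(np.argsort(X[:, cols].sum(axis=1))[-k:])
        # Keep the l attributes with the largest sums over those customers.
        new_cols = np.sort(np.argsort(X[new_rows].sum(axis=0))[-l:])
        if rows is not None and np.array_equal(new_rows, rows) \
                and np.array_equal(new_cols, cols):
            break  # neither set changed: a local optimum has been reached
        rows, cols = new_rows, new_cols
    return rows, cols, float(X[np.ix_(rows, cols)].mean())


# Toy data (hypothetical): 30 customers x 8 attributes with one planted
# block of unusually high consumption values.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
X[5:10, 2:5] += 20.0
rows, cols, avg = las_fixed_size(X, k=5, l=3, init_cols=[0, 3, 6])
```

Unlike K-means, which groups customers using all attributes at once, this search returns a subset of customers together with the subset of attributes on which they stand out. Because the search is local, a practical run would typically repeat it from several random starting columns and keep the block with the largest average.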