Journal of Computer Applications, 2016, Vol. 36, Issue (7): 2046-2050. DOI: 10.11772/j.issn.1001-9081.2016.07.2046

• Industry and Field Applications •

Urban functional area identification based on call detail record data

JIANG Guilin, HU Fangyu, SHI Lixing

  1. School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2015-12-25 Revised: 2016-03-26 Online: 2016-07-10 Published: 2016-07-14
  • Corresponding author: JIANG Guilin
  • About the authors: JIANG Guilin (born 1990), male, from Anqing, Anhui, is a master's student whose research interests include mobile communication data analysis and data mining; HU Fangyu (born 1955), male, from Yongkang, Zhejiang, is a professor holding a master's degree whose research interests include communication network theory, intelligent information systems, and communication signal processing; SHI Lixing (born 1987), male, from Longyan, Fujian, is a master's student whose research interest is mobile communication data analysis.
  • Supported by:
    This work is partially supported by the Anhui Province Science and Technology Plan (1201b0403021).

Abstract: Urban functional areas differ both in their external physical characteristics and in their inherent social functions, and they evolve continuously over time and with human activity. Traditional monitoring methods such as satellite remote sensing have long operating cycles and high costs, and they cannot characterize differences in inherent function. To address these problems, an urban functional area identification method based on Call Detail Record (CDR) data, a form of user daily-life data provided by telecom operators, was proposed. First, base station cells were manually labeled by functional area, yielding training samples in five categories: residential, office, commercial, college, and scenic-spot areas. Second, calling-behavior and mobility-behavior features of the user population in each functional area, namely call duration distribution features and move-frequency features, were extracted and compared. Finally, a multi-feature weighted decision algorithm based on the Gaussian Mixture Model (GMM) was designed and simulated on the training set. The experimental results show that CDR data can characterize the inherent differences between urban functional areas, and that the nature of a functional area corresponds to the calling and mobility behavior of its mobile phone users. With a decision weight of 0.6, the algorithm reached its maximum functional-area recall on the current dataset, 51.08%. Together with the error analysis, this indicates that CDR data is feasible for urban functional area identification.

Key words: Call Detail Record (CDR), functional area, machine learning, urban sensing, Gaussian Mixture Model (GMM)
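The abstract names a GMM-based weighted decision over two feature families but does not give the decision rule. The following is a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes that each base station cell is summarized by two scalar features (a call-duration feature and a move-frequency feature), that one 1-D GMM is fitted per functional area and per feature, and that the two log-likelihoods are combined as a weighted sum. Whether the weight 0.6 applies to the calling or the mobility feature is also an assumption here, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50, seed=0):
    """Fit a k-component 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    means = rng.sample(xs, k)  # initialize means from the data
    mean_all = sum(xs) / len(xs)
    var_all = sum((x - mean_all) ** 2 for x in xs) / len(xs) + 1e-6
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pi / s for pi in p])
        # M-step: re-estimate weights, means, and variances (with a small floor)
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
    return weights, means, variances


def log_likelihood(x, gmm):
    """Log density of x under a fitted mixture (guarded against log(0))."""
    weights, means, variances = gmm
    density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return math.log(density + 1e-300)


def train(cells_by_label, k=2):
    """cells_by_label maps a label to a list of (call_feature, move_feature) pairs."""
    models = {}
    for label, cells in cells_by_label.items():
        call_gmm = fit_gmm_1d([c[0] for c in cells], k)
        move_gmm = fit_gmm_1d([c[1] for c in cells], k)
        models[label] = (call_gmm, move_gmm)
    return models


def classify(cell, models, weight=0.6):
    """Pick the label with the highest weighted sum of per-feature log-likelihoods.

    Assumption: `weight` multiplies the call feature and (1 - weight) the move feature.
    """
    def score(label):
        call_gmm, move_gmm = models[label]
        return (weight * log_likelihood(cell[0], call_gmm)
                + (1.0 - weight) * log_likelihood(cell[1], move_gmm))
    return max(models, key=score)


def recall(true_labels, predicted_labels, label):
    """Fraction of cells whose true class is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(true_labels, predicted_labels) if t == label]
    return sum(p == label for p in relevant) / len(relevant)
```

In this sketch, sweeping `weight` over a grid and recomputing `recall` for each value would mirror the kind of search that leads to a reported best decision weight, though the paper's actual procedure may differ.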

CLC number: