Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (1): 44-51. DOI: 10.11772/j.issn.1001-9081.2016.01.0044

• The 32nd National Database Conference of China (NDBC 2015)

Population flow analysis based on cellphone trajectory data

KONG Yangxin, JIN Cheqing, WANG Xiaoling

  1. Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2015-09-15 Revised: 2015-10-12 Online: 2016-01-10 Published: 2016-01-09
  • Corresponding author: JIN Cheqing (born 1977), male, from Wencheng, Zhejiang; professor, doctoral supervisor, Ph.D., CCF member. Main research interests: data stream management, location-based services, uncertain data management.
  • About the authors: KONG Yangxin (born 1991), male, from Xinji, Hebei; master's student. Main research interests: location-based service technology and applications, data mining. WANG Xiaoling (born 1975), female, from Yantai, Shandong; professor, doctoral supervisor, Ph.D., CCF member. Main research interests: data management for data-intensive computing, location-based service technology and applications.
  • Supported by:
    the National Basic Research Program (973 Program) of China (2012CB316203) and the National Natural Science Foundation of China (61170085, 61472141, 61370101).

Abstract: With the development of communication technology and the spread of smartphones, the large-scale cellphone trajectory data collected by operators' base stations has become valuable in fields such as urban planning and population migration. To address urban population flow, a Movement Features-based Judging Urban Population Flow (MF-JUPF) algorithm that uses cellphone trajectory data was proposed. First, the cellphone trajectory data were preprocessed to extract users' activity trajectories. Second, important features were extracted according to the behavior patterns of entering and leaving a city, and several classification models were trained on a real labeled data set. Finally, the trained models were used to judge whether a user trajectory represented entering or leaving the city. To improve performance and scalability, the system used the MapReduce framework for data analysis. Experiments on real data sets showed that the precision and recall of the proposed method for judging entering and leaving behaviors both exceeded 80%. Compared with the Signal Disappears-based Judging Urban Population Flow (SD-JUPF) algorithm, precision and recall increased by 19.0% and 13.9% for judging entry into the city, and by 17.3% and 6.1% for judging departure from the city. Compared with the non-filtering algorithm, the data filtering algorithm designed around the characteristics of cellphone trajectory data reduced processing time by more than 36.1%. Theoretical analysis and experimental results show that MF-JUPF has high accuracy and good scalability, making it valuable for urban planning and related fields.

Key words: location-based service, cellphone trajectory data, population flow, urban planning, MapReduce
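
As a rough illustration of the kind of processing the abstract describes, the Python sketch below groups raw records by user in a map/reduce style, orders each user's records by time, and flags long signal gaps, which is one simple reading of a signal-disappearance rule such as the SD-JUPF baseline. The record layout (user id, timestamp, base station id), the function names, and the 6-hour threshold are illustrative assumptions and are not taken from the paper. MF-JUPF itself is described as training classification models on extracted movement features, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: (user_id, unix_timestamp_seconds, base_station_id).
Record = Tuple[str, int, str]
Point = Tuple[int, str]
Gap = Tuple[int, int, str, str]

# Assumed threshold: a silence at least this long counts as "signal disappeared".
GAP_THRESHOLD_S = 6 * 3600


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, Point]]:
    """Map step: key every record by its user id."""
    for user_id, ts, station in records:
        yield user_id, (ts, station)


def shuffle(pairs: Iterable[Tuple[str, Point]]) -> Dict[str, List[Point]]:
    """Group mapped values by key, as a MapReduce framework would between phases."""
    groups: Dict[str, List[Point]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(points: List[Point]) -> List[Gap]:
    """Reduce step: sort one user's points by time and collect long signal gaps."""
    trajectory = sorted(points)
    gaps: List[Gap] = []
    for (t0, s0), (t1, s1) in zip(trajectory, trajectory[1:]):
        if t1 - t0 >= GAP_THRESHOLD_S:
            # (last seen time, next seen time, last station, next station)
            gaps.append((t0, t1, s0, s1))
    return gaps


def find_signal_gaps(records: Iterable[Record]) -> Dict[str, List[Gap]]:
    """Run the map, shuffle, and reduce steps in-process on a small record list."""
    groups = shuffle(map_phase(records))
    return {user: reduce_phase(points) for user, points in groups.items()}


if __name__ == "__main__":
    sample = [
        ("u1", 0, "A"),
        ("u1", 1800 + 8 * 3600, "C"),  # reappears 8 hours after the previous record
        ("u1", 1800, "B"),
        ("u2", 100, "A"),
        ("u2", 200, "A"),
    ]
    print(find_signal_gaps(sample))
```

Each flagged gap records where the user was last seen and where the signal reappeared. A feature-based approach could use such gaps, together with other movement features, as classifier inputs instead of applying a fixed threshold directly.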

CLC Number: