Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (11): 3293-3297. DOI: 10.11772/j.issn.1001-9081.2019040738

• Data Science and Technology •

Parameter independent clustering of air traffic trajectory based on silhouette coefficient

SUN Shilei, WANG Chao, ZHAO Yuandi

  1. Research Base of Air Traffic Management, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2019-04-29 Revised: 2019-06-30 Online: 2019-08-21 Published: 2019-11-10
  • Corresponding author: SUN Shilei
  • About the authors: SUN Shilei (1982-), male, native of Tangshan, Hebei, lecturer, M.S.; research interests: machine learning and data mining. WANG Chao (1971-), male, native of Tianjin, professor, Ph.D.; research interests: air traffic system simulation and analysis, transportation planning and management. ZHAO Yuandi (1983-), male, native of Tianjin, assistant research fellow, Ph.D.; research interest: air traffic control information processing.
  • Supported by:
    This work is partially supported by the Foundation of Research Base of Air Traffic Management of Civil Aviation University of China (KGJD201702).

Abstract: To eliminate the subjectivity of expert experience, remove the dependence on trajectory characteristics, and reduce the burden of experimental parameter tuning, a Parameter Independent Clustering BAsed on SIlhouette Coefficient (PICBASIC) algorithm is proposed. First, existing track pairing methods based on Euclidean distance were compared, and a trajectory similarity model based on Dynamic Time Warping (DTW) distance and a Gaussian kernel function was established. Second, the air traffic trajectories were partitioned by spectral clustering. Finally, a method based on the silhouette coefficient was proposed for finding the optimal number of clusters; the same coefficient also provides a quantitative evaluation of the clustering result. The algorithm was validated on real arrival trajectories. PICBASIC determined that clustering quality was best when the 365 trajectories of runway 28L were grouped into 5 clusters and the 530 trajectories of runway 28R into 6 clusters, with average silhouette coefficients of 0.8099 and 0.8056, respectively. On the same experimental data, the difference rates of the average silhouette coefficient between PICBASIC and MeanShift clustering were -1.23% and 0.19%, respectively. The experimental results show that PICBASIC tolerates differences in trajectory speed and length, needs no manual guidance or experimental parameter tuning at any stage, and can filter out the adverse impact of abnormal trajectories on clustering quality.

Key words: air traffic trajectory, clustering analysis, silhouette coefficient, spectral clustering, Dynamic Time Warping (DTW), Gaussian kernel function, parameter independent
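
The pipeline named in the abstract (DTW distance, Gaussian kernel similarity, spectral clustering, and a silhouette-based search for the number of clusters) can be illustrated with a minimal Python sketch. This is not the authors' PICBASIC implementation, which the abstract does not specify in detail. The kernel width heuristic (median nearest-neighbour DTW distance), the small k-means step, the function names, and the synthetic trajectories are all assumptions made for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Dynamic time warping distance between point sequences of shape (n, d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


def pairwise_dtw(trajs):
    """Symmetric matrix of DTW distances; trajectories may differ in length."""
    n = len(trajs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(trajs[i], trajs[j])
    return dist


def spectral_labels(affinity, k, n_iter=50):
    """Normalized spectral clustering: top-k eigenvectors, then a small k-means."""
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    emb = vecs[:, -k:]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    centers = [emb[0]]  # deterministic farthest-point initialisation (assumption)
    for _ in range(1, k):
        gap = np.min([np.linalg.norm(emb - c, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(gap)])
    centers = np.array(centers)
    for _ in range(n_iter):
        to_centers = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(to_centers, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = emb[labels == c].mean(axis=0)
    return labels


def mean_silhouette(dist, labels):
    """Average silhouette coefficient computed from a precomputed distance matrix."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = dist[i, same].sum() / (same.sum() - 1)
        b = min(dist[i, labels == c].mean() for c in ids if c != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return float(np.mean(scores))


def choose_k(trajs, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    dist = pairwise_dtw(trajs)
    # Assumed heuristic: kernel width = median nearest-neighbour DTW distance.
    sigma = np.median(np.sort(dist, axis=1)[:, 1])
    affinity = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    scores = {k: mean_silhouette(dist, spectral_labels(affinity, k)) for k in k_values}
    return max(scores, key=scores.get), scores


def demo_trajectories(per_group=6, seed=0):
    """Synthetic 2-D tracks of unequal length converging on the origin from 3 bearings."""
    rng = np.random.default_rng(seed)
    trajs = []
    for angle in (0.0, 2 * np.pi / 3, 4 * np.pi / 3):
        direction = np.array([np.cos(angle), np.sin(angle)])
        for _ in range(per_group):
            n = int(rng.integers(20, 31))
            radii = np.linspace(10.0, 0.0, n)[:, None]
            trajs.append(radii * direction + rng.normal(0.0, 0.05, (n, 2)))
    return trajs


best_k, scores = choose_k(demo_trajectories(), range(2, 6))
print(best_k, {k: round(v, 3) for k, v in scores.items()})
```

Because DTW aligns sequences of different lengths, the synthetic tracks need no resampling before comparison, which mirrors the tolerance to speed and length differences claimed in the abstract. The three synthetic flows are well separated, so the silhouette peak is expected at k = 3; real arrival data would need the authors' actual parameter choices.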
