Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (4): 1041-1045. DOI: 10.11772/j.issn.1001-9081.2018081837

Time series similarity measure based on Siamese neural network

JIANG Yifan, YE Qing   

  1. School of Electrical & Information Engineering, Changsha University of Science & Technology, Changsha Hunan 411000, China
  • Received:2018-09-05 Revised:2018-10-19 Online:2019-04-10 Published:2019-04-10

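  • Corresponding author: YE Qing
  • About the authors: JIANG Yifan (born 1993), male, from Loudi, Hunan, master's student; main research interests: machine learning and data mining. YE Qing (born 1963), female, from Changsha, Hunan, professor, master's degree; main research interests: pattern recognition and intelligent detection of road traffic.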

Abstract: In data mining tasks such as time series classification, how well a similarity measure reflects class membership varies markedly from one dataset to another, so a reasonable and effective similarity measure is crucial. Traditional measures such as Euclidean Distance (ED), cosine distance and Dynamic Time Warping (DTW) apply a fixed formula to the data themselves and ignore the influence that the label information (knowledge annotation) contained in each dataset has on similarity. To address this problem, a method for learning a time series similarity measure based on a Siamese Neural Network (SNN) was proposed. The method learns the neighborhood relationships among the data from the supervision information in the sample labels and establishes an efficient distance measure between time series. Similarity measurement and confirmatory classification experiments were performed on time series datasets provided by UCR. Experimental results show that, compared with ED- and DTW-based 1-Nearest Neighbor (1NN) classification, SNN improves overall classification quality significantly. Although DTW-based 1NN classification outperforms SNN-based 1NN classification on some datasets, SNN outperforms DTW in the computational complexity and speed of the similarity calculation during classification. The results show that the proposed method can significantly improve the efficiency of similarity measurement on classification datasets and performs well on high-dimensional and complex time series data.
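
To make the comparison in the abstract concrete, the following is a minimal sketch of the two baseline measures (ED and DTW), a 1NN classifier that accepts any distance function, and a pairwise contrastive loss of the kind commonly used to train Siamese networks. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the network architecture or the loss, so `contrastive_loss` and its `margin` parameter are assumptions, and all function names are hypothetical.

```python
import math
from typing import Callable, Hashable, List, Sequence, Tuple

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Euclidean Distance (ED); only defined for equal-length series."""
    if len(a) != len(b):
        raise ValueError("ED needs series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a: Series, b: Series) -> float:
    """Unconstrained DTW with absolute-difference local cost.

    Costs O(len(a) * len(b)) per pair, which is why DTW-based 1NN
    becomes slow on long series.
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative-cost table
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def one_nn(
    query: Series,
    train: List[Tuple[Series, Hashable]],
    dist: Callable[[Series, Series], float],
) -> Hashable:
    """Return the label of the training series closest to `query`."""
    return min(train, key=lambda item: dist(query, item[0]))[1]


def contrastive_loss(d: float, same_class: bool, margin: float = 1.0) -> float:
    """ASSUMPTION: a standard contrastive loss, not confirmed by the abstract.

    Pulls same-class pairs together and pushes different-class pairs
    at least `margin` apart; `d` is the distance between the two
    embeddings produced by the shared-weight branches.
    """
    if same_class:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

With a learned measure, `dist` in `one_nn` would be the distance between the fixed-length embeddings produced by the trained network, so each comparison costs time linear in the embedding size instead of quadratic in the series length.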

Key words: time series, similarity measure, neural network, Siamese Neural Network (SNN)

