Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1832-1841. DOI: 10.11772/j.issn.1001-9081.2023060761

Special Issue: Data science and technology

• Data science and technology •

Early classification model of multivariate time series based on orthogonal locality preserving projection and cost optimization

Zixuan YUAN, Xiaoqing WENG, Ningzhen GE

  1. School of Information Technology, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China
  • Received: 2023-06-16 Revised: 2023-09-20 Accepted: 2023-09-21 Online: 2023-10-09 Published: 2024-06-10
  • Contact: Xiaoqing WENG
  • About the authors: YUAN Zixuan, born in 1997, M. S. candidate. Her research interests include data mining and time series analysis.
    GE Ningzhen, born in 1992, M. S. His research interests include data mining and time series anomaly recognition.
  • Supported by:
    Cultivation Project of Hebei University of Economics and Business (2021PY058)

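Chinese title: 基于正交局部保持映射和成本优化的多变量时间序列早期分类模型 (Early classification model of multivariate time series based on orthogonal locality preserving projection and cost optimization)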

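YUAN Zixuan (袁子璇), WENG Xiaoqing (翁小清), GE Ningzhen (戈宁振)

  1. School of Information Technology, Hebei University of Economics and Business, Shijiazhuang 050061
  • Corresponding author: WENG Xiaoqing (翁小清)
  • About the authors: YUAN Zixuan (袁子璇), born in 1997, female, from Shijiazhuang, Hebei, M. S. candidate. Main research interests: data mining and time series analysis.
    GE Ningzhen (戈宁振), born in 1992, male, from Zhangjiakou, Hebei, M. S. Main research interests: data mining and time series anomaly recognition.
  • Funding:
    Cultivation Project of Hebei University of Economics and Business (2021PY058)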

Abstract:

Early Time Series Classification (ETSC) has two conflicting goals: earliness and accuracy. Classifying earlier always comes at the expense of accuracy. Existing optimization-based early classification methods for Multivariate Time Series (MTS) account for the costs of misclassification and of delayed decision-making in the cost function, but ignore the influence that the local structure among samples in an MTS dataset has on classification performance. To address this problem, an early classification model for MTS based on Orthogonal Locality Preserving Projection (OLPP) and cost Optimization for Accuracy and Earliness (OLPPMOAE) was proposed. First, MTS sample prefixes were mapped to a low-dimensional space using OLPP, preserving the local structure of the original dataset. Then, a set of Gaussian Process (GP) classifiers was trained in the low-dimensional space to generate the class probabilities of the training set at each time point. Finally, the Particle Swarm Optimization (PSO) algorithm was used to learn the optimal parameters of the stopping rule from these class probabilities. Experimental results on six MTS datasets show that, with essentially the same earliness, the accuracy of OLPPMOAE is significantly higher than that of the cost-based model R1_Clr (stopping Rule and Cost function with regularization term l1 and l2): the average accuracy is improved by 11.33% to 15.35%, and the Harmonic Mean (HM) by 4.71% to 9.01%. Therefore, the proposed model can classify MTS as early as possible with high accuracy.
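The abstract does not state the stopping rule or the cost function, so the following is only a minimal sketch of how a stopping rule can turn per-time-step class probabilities into an early decision, and how a cost trading misclassification against delay can score one parameter setting. The linear rule, the weight `alpha`, and the names `should_stop`, `classify_early`, and `cost` are all assumptions made for illustration. The parameter search itself (PSO in the paper) is omitted.

```python
from typing import List, Sequence, Tuple


def should_stop(probs: Sequence[float], t: int, length: int,
                gamma: Tuple[float, float, float]) -> bool:
    """Hypothetical linear stopping rule (an assumption, not the paper's rule)."""
    ranked = sorted(probs, reverse=True)
    top = ranked[0]                    # largest class probability
    margin = ranked[0] - ranked[1]     # gap between the two most likely classes
    g1, g2, g3 = gamma
    return g1 * top + g2 * margin + g3 * (t / length) > 0


def classify_early(prob_seq: List[Sequence[float]],
                   gamma: Tuple[float, float, float]) -> Tuple[int, int]:
    """Return (predicted class index, stopping time) for one series.

    prob_seq[t - 1] holds the class probabilities after observing t points.
    A decision is forced at the last time point if the rule never fires.
    """
    if not prob_seq:
        raise ValueError("prob_seq must contain at least one time step")
    length = len(prob_seq)
    for t, probs in enumerate(prob_seq, start=1):
        if t == length or should_stop(probs, t, length, gamma):
            pred = max(range(len(probs)), key=probs.__getitem__)
            return pred, t
    raise AssertionError("unreachable: a decision is forced at t == length")


def cost(prob_seqs: List[List[Sequence[float]]], labels: Sequence[int],
         gamma: Tuple[float, float, float], alpha: float = 0.5) -> float:
    """Assumed cost: alpha * error rate + (1 - alpha) * mean earliness."""
    errors = 0
    earliness = 0.0
    for seq, label in zip(prob_seqs, labels):
        pred, t = classify_early(seq, gamma)
        errors += int(pred != label)
        earliness += t / len(seq)      # fraction of the series observed
    n = len(labels)
    return alpha * errors / n + (1 - alpha) * earliness / n
```

An optimizer such as PSO would then search over `gamma` for the setting that minimizes `cost` on the training set's class probabilities.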

Key words: Multivariate Time Series (MTS), early classification, Orthogonal Locality Preserving Projection (OLPP), cost optimization, Gaussian Process (GP) classifier

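Abstract (Chinese version, translated):

Early Time Series Classification (ETSC) has two conflicting goals: earliness and accuracy. Achieving earliness always comes at the cost of accuracy. Existing optimization-based early classification methods for Multivariate Time Series (MTS) take the cost of misclassification and the cost of delayed decisions into account in the cost function, but overlook the effect of the local structure among the samples of an MTS dataset on classification performance. To address this problem, an MTS early classification model based on Orthogonal Locality Preserving Projection (OLPP) and cost optimization (OLPPMOAE) is proposed. First, OLPP is used to map MTS sample prefixes to a low-dimensional space while preserving the local structure of the original dataset; next, a set of Gaussian Process (GP) classifiers is trained in the low-dimensional space to generate the class probabilities of the training set at each time point; finally, the Particle Swarm Optimization (PSO) algorithm is used to learn the optimal parameters of the stopping rule from these class probabilities. Experimental results on six MTS datasets show that, at essentially equal earliness, the accuracy of OLPPMOAE is significantly higher than that of the cost-based model R1_Clr (stopping Rule and Cost function with regularization term l1 and l2): average accuracy improves by 11.33% to 15.35% and the Harmonic Mean (HM) by 4.71% to 9.01%. The proposed model can therefore classify MTS as early as possible with relatively high accuracy.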
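The abstract reports a Harmonic Mean (HM) without defining it. A common convention in early classification work, assumed here, combines accuracy with one minus earliness, where earliness is the average fraction of each series observed before a decision. The function name `harmonic_mean` is illustrative only.

```python
def harmonic_mean(accuracy: float, earliness: float) -> float:
    """HM of accuracy and (1 - earliness); both inputs lie in [0, 1].

    Assumed definition: the abstract does not spell out the formula.
    """
    timeliness = 1.0 - earliness       # higher means an earlier decision
    if accuracy + timeliness == 0:
        return 0.0                     # avoid division by zero
    return 2 * accuracy * timeliness / (accuracy + timeliness)
```

Under this definition, HM is high only when a model is both accurate and early, which is why it is a natural single score for the trade-off between the two goals.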

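Key words (Chinese version, translated): multivariate time series, early classification, orthogonal locality preserving projection, cost optimization, Gaussian process classifier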

CLC Number: