Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (11): 3473-3478. DOI: 10.11772/j.issn.1001-9081.2021091692

• CCF 2021 China Digital Services Conference •

Data field classification algorithm for edge intelligent computing

Zhiyu SUN1, Qi WANG2, Bin GAO2, Zhongjun LIANG3, Xiaobin XU2, Shangguang WANG4

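  1. Xinjiang Meteorological Information Center, Urumqi 830002
  2. Faculty of Information Technology, Beijing University of Technology, Beijing 100124
  3. Information Service Department, National Meteorological Information Center, Beijing 100081
  4. State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications), Beijing 100876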
  • Received: 2021-09-29 Revised: 2021-10-29 Accepted: 2021-11-08 Online: 2022-03-02 Published: 2022-11-10
  • Corresponding author: Zhongjun LIANG
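  • About the authors: SUN Zhiyu, born in 1973 in Xinyi, Jiangsu, male, senior engineer. Research interests: cloud computing and meteorological big data.
    WANG Qi, born in 1998 in Beijing, female, M.S. candidate. Research interests: space-air-ground integrated information networks, Internet of Things, and mobile edge computing.
    GAO Bin, born in 1996 in Taiyuan, Shanxi, male, M.S. candidate. Research interests: network big data.
    LIANG Zhongjun, born in 1983 in Urumqi, Xinjiang, male, Ph.D., senior engineer. Research interests: cloud computing and meteorological big data. Email: liangzj@cma.gov.cn
    XU Xiaobin, born in 1986 in Hebi, Henan, male, Ph.D., lecturer, CCF member. Research interests: space-air-ground integrated information networks, Internet of Things, and mobile edge computing.
    WANG Shangguang, born in 1982 in Zhoukou, Henan, male, Ph.D., professor, CCF member. Research interests: service computing, 6G, and mobile edge computing.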

Data field classification algorithm for edge intelligent computing

Zhiyu SUN1, Qi WANG2, Bin GAO2, Zhongjun LIANG3, Xiaobin XU2, Shangguang WANG4

  1. Xinjiang Meteorological Information Center, Urumqi, Xinjiang 830002, China
  2. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  3. Information Service Department, National Meteorological Information Center, Beijing 100081, China
  4. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-09-29 Revised: 2021-10-29 Accepted: 2021-11-08 Online: 2022-03-02 Published: 2022-11-10
  • Contact: Zhongjun LIANG
  • About the authors: SUN Zhiyu, born in 1973, senior engineer. His research interests include cloud computing and meteorological big data.
    WANG Qi, born in 1998, M.S. candidate. Her research interests include space-air-ground integrated information networks, the Internet of Things, and mobile edge computing.
    GAO Bin, born in 1996, M.S. candidate. His research interests include network big data.
    LIANG Zhongjun, born in 1983, Ph.D., senior engineer. His research interests include cloud computing and meteorological big data.
    XU Xiaobin, born in 1986, Ph.D., lecturer. His research interests include space-air-ground integrated information networks, the Internet of Things, and mobile edge computing.
    WANG Shangguang, born in 1982, Ph.D., professor. His research interests include service computing, 6G, and mobile edge computing.

Abstract:

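To address two common problems in clustering research, namely the failure to make full use of historical information and the slow parameter optimization process, a distributed adaptive classification algorithm based on the data field is proposed in combination with edge intelligent computing. The algorithm is deployed on Edge Computing (EC) nodes and provides a local intelligent classification service. By introducing supervision information, it restructures the traditional data field clustering model so that the model can be applied to classification problems, which extends the range of applications of data field theory. Following the data field idea, the algorithm maps the value domain space of the data to a data potential field space and partitions the data into several unlabeled clusters according to their potential values in that space. Each cluster is then compared with the historical supervision information by cloud similarity and assigned to the most similar class. In addition, a parameter search strategy based on a sliding step size is proposed to speed up the optimization of the algorithm's parameters. On top of the algorithm, a distributed data processing scheme is also proposed: through cooperation between the cloud center and edge devices, the classification task is split and assigned to nodes at different levels, achieving modularity and low coupling. Simulation results show that the precision and recall of the proposed algorithm both stay above 96%, and that the Hamming loss is always below 0.022. Experimental results show that the proposed algorithm classifies accurately and speeds up parameter optimization, and that its overall performance is better than that of the Logistic Regression (LR) and Random Forest (RF) algorithms.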

Key words: edge intelligent computing, distributed data processing, parameter optimization, data field, adaptive classification
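
The step of mapping data values into a potential field, as described in the abstract, can be illustrated with a minimal sketch. It assumes the Gaussian-like potential function commonly used in the data field literature, φ(x) = Σ m_i · exp(−(‖x − x_i‖ / σ)²); the abstract does not state the exact formula the authors use. The names `potential`, `sigma`, and `masses`, and the sample points, are hypothetical and chosen only for illustration.

```python
import math


def potential(point, samples, sigma=1.0, masses=None):
    """Potential at `point` induced by all `samples`.

    Assumed form (not confirmed by the abstract):
    phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma) ** 2)
    """
    if masses is None:
        # Assumption: every sample carries unit mass.
        masses = [1.0] * len(samples)
    total = 0.0
    for mass, sample in zip(masses, samples):
        dist = math.dist(point, sample)  # Euclidean distance (Python 3.8+)
        total += mass * math.exp(-((dist / sigma) ** 2))
    return total


# Hypothetical 2-D data: three nearby points and one isolated point.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
potentials = [potential(s, samples, sigma=0.5) for s in samples]

for s, p in zip(samples, potentials):
    print(s, round(p, 4))
```

Points inside the dense group receive a higher potential than the isolated point, which is the property a potential-based grouping step would rely on. How the clusters are then cut from the field and matched to historical classes by cloud similarity is not specified in the abstract and is not sketched here.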

Abstract:

To address the common problems in clustering research of not fully utilizing historical information and of slow parameter optimization, an adaptive classification algorithm based on the data field was proposed in combination with edge intelligent computing; it can be deployed on Edge Computing (EC) nodes to provide a local intelligent classification service. By introducing supervision information to modify the structure of the traditional data field clustering model, the proposed algorithm enabled the traditional data field to be applied to classification problems, extending the range of applications of data field theory. Based on the idea of the data field, the proposed algorithm transformed the domain value space of the data into the data potential field space and divided the data into several unlabeled clusters according to the spatial potential value. The clusters were then compared with the historical supervision information by cloud similarity, and each cluster was assigned to the most similar category. In addition, a parameter search strategy based on a sliding step length was proposed to speed up the parameter optimization of the proposed algorithm. Based on this algorithm, a distributed data processing scheme was also proposed. Through the cooperation of the cloud center and edge devices, classification tasks were split and distributed to nodes at different levels to achieve modularity and low coupling. Simulation results show that the precision and recall of the proposed algorithm remained above 96%, and that the Hamming loss was below 0.022. Experimental results show that the proposed algorithm can classify accurately and accelerate parameter optimization, and that it outperforms the Logistic Regression (LR) and Random Forest (RF) algorithms in overall performance.

Key words: edge intelligent computing, distributed data processing, parameter optimization, data field, adaptive classification
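
For readers unfamiliar with the reported metrics, the following minimal sketch shows how precision, recall, and Hamming loss can be computed for a single-label classification result. It assumes that Hamming loss is the fraction of misclassified samples and that precision and recall are computed per class; the abstract does not say which averaging the authors use. The labels and the function names `hamming_loss` and `precision_recall` are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    mismatches = sum(t != p for t, p in zip(y_true, y_pred))
    return mismatches / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels, for illustration only (not data from the paper).
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]

loss = hamming_loss(y_true, y_pred)
prec_b, rec_b = precision_recall(y_true, y_pred, "b")
print(loss, prec_b, rec_b)
```

In this toy example one of five samples is misclassified, so the loss is 0.2; class "b" has perfect recall but a precision of 2/3 because one "a" sample was predicted as "b".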

CLC number: