Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (6): 1665-1669. DOI: 10.11772/j.issn.1001-9081.2017102790

• Advanced computing •

Task performance collection and classification method in cloud platforms

LIU Chunyi, ZHANG Xiao, QIN Yuansong, LU Shangqi

  1. School of Computer Science, Northwestern Polytechnical University, Xi'an Shaanxi 710029, China
  • Received: 2017-11-09 Revised: 2017-11-25 Online: 2018-06-10 Published: 2018-06-13
  • Corresponding author: LIU Chunyi
  • About the authors: LIU Chunyi (born 1993), male, from Chuzhou, Anhui, M.S. candidate; research interest: cloud computing. ZHANG Xiao (born 1978), male, from Xinxiang, Henan, associate professor, Ph.D., CCF member; research interests: cloud computing and cloud storage. QIN Yuansong (born 1997), female, from Liuzhou, Guangxi; research interest: cloud storage. LU Shangqi (born 1996), male, from Kaifeng, Henan; research interest: cloud computing.
  • Supported by:
    This work is partially supported by the General Program of National Natural Science Foundation of China (61472323).

Abstract: Users of a cloud platform often find it difficult to choose a suitable cloud host type, which leads to low utilization of cloud platform resources. Typical remedies optimize the placement algorithm from the cloud provider's perspective, but the achievable gain in utilization remains limited by the user's choice of host type. Other methods collect task performance data on the cloud platform over a short period and predict from it, which reduces the accuracy of task classification. To improve cloud platform resource utilization and simplify user operations, a multi-attribute task performance collection tool named Lbenchmark is first proposed. It collects the performance characteristics of a task comprehensively, and its load is more than 50% lower than that of Ganglia. Then, using these performance data, an application performance classification algorithm based on multiple K-Dimension (KD) tree K-Nearest Neighbor (KNN) classifiers with configurable weights is proposed: suitable parameters are selected to build multiple KD-tree-based KNN classifiers, cross validation is used to adjust the weight of each attribute in each classifier, and the classifiers then vote on the final class. The experimental results show that, compared with the traditional KNN algorithm, the computation speed of the proposed algorithm improves by about a factor of 10, and its accuracy improves by about 10% on average. The proposed algorithm can use data feature mapping to provide resource recommendations to users and cloud providers, thereby improving the overall utilization of cloud platforms.

Key words: performance collection, multiple K-Dimension (KD) tree, virtual machine configuration, application classification, cloud platform

CLC Number: