Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (7): 2044-2049. DOI: 10.11772/j.issn.1001-9081.2017.07.2044

Challenges and recent progress in big data visualization

CUI Di1,2, GUO Xiaoyan3, CHEN Wei2   

  1. College of Electronic and Information Engineering, Ningbo University of Technology, Ningbo Zhejiang 315211, China;
  2. State Key Laboratory of Computer Aided Design and Computer Graphics (Zhejiang University), Hangzhou Zhejiang 310058, China;
  3. College of Information Science and Technology, Gansu Agricultural University, Lanzhou Gansu 730070, China
  • Received: 2017-01-13 Revised: 2017-03-10 Online: 2017-07-10 Published: 2017-07-18
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61422211).

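  • Corresponding author: GUO Xiaoyan (郭小燕)
  • About the authors: CUI Di (崔迪, born 1985), female, from Ningbo, Zhejiang, lecturer, Ph.D. candidate, CCF member; main research interests: big data analysis, intelligent information processing. GUO Xiaoyan (郭小燕, born 1976), female, from Tianshui, Gansu, associate professor, Ph.D., CCF member; main research interest: intelligent optimization algorithms. CHEN Wei (陈为, born 1976), male, from Hangzhou, Zhejiang, associate professor, Ph.D., CCF member; main research interest: visualization.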

Abstract: The arrival of the big data era has heightened the importance of visualization. As an important data analysis method, visual analytics exploits human cognitive abilities, combines the strengths of human and computer, and uses human-computer interaction to gain insight into big data. In view of the characteristics of big data (large volume, high dimensionality, multiple sources and multiple forms), visualization methods for large-scale data, streaming data, and unstructured and heterogeneous data are reviewed. First, visualization methods for large-scale data are discussed: 1) the divide-and-conquer principle is used to split a large problem into a number of smaller tasks, which are processed in parallel to improve speed; 2) aggregation, sampling and multi-resolution representation are used to reduce the data; 3) multiple views are used to present high-dimensional data from different perspectives. Then, the visualization process for streaming data is discussed for its two types, monitoring and superposition. Finally, the visualization of unstructured data and heterogeneous data is described. In summary, visualization can compensate for the shortcomings of automatic computer analysis, integrate the analytical power of computers with human perception of information, and effectively reveal the information and wisdom behind big data. However, theoretical research results in this area remain very limited, and the field faces challenges of large scale, dynamic change, high dimensionality and multi-source heterogeneity, which are becoming the focus and direction of future research on big data visualization.

Key words: big data, visualization, challenge, visual analysis, progress
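
As a rough illustration of two large-scale strategies named in the abstract (divide and conquer with parallel processing, and data reduction by aggregation with a multi-resolution level), the following is a minimal Python sketch. It is not taken from the paper: the function names (`bin_chunk`, `split`, `aggregate`, `coarsen`), the grid-binning scheme and the parameters `cell_size`, `n_chunks` and `factor` are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bin_chunk(points, cell_size):
    """Aggregate one chunk of (x, y) points into grid-cell counts."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def split(points, n_chunks):
    """Divide the point list into at most n_chunks contiguous chunks."""
    size = max(1, -(-len(points) // n_chunks))  # ceiling division
    return [points[i:i + size] for i in range(0, len(points), size)]


def aggregate(points, cell_size=1.0, n_chunks=4):
    """Divide the data, bin each chunk concurrently, merge partial counts."""
    chunks = split(points, n_chunks)
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for partial in pool.map(lambda c: bin_chunk(c, cell_size), chunks):
            total.update(partial)  # Counter.update adds counts together
    return total


def coarsen(counts, factor=2):
    """Derive a lower-resolution level by merging factor x factor cells."""
    coarse = Counter()
    for (cx, cy), n in counts.items():
        coarse[(cx // factor, cy // factor)] += n
    return coarse
```

A renderer would then draw one mark per grid cell instead of one per point, switching between the fine and coarse levels depending on zoom. The thread pool here only shows the structure of the approach; for CPU-bound binning in CPython, a process pool or a distributed framework would likely be needed to obtain an actual speedup.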
