Survey of sub-topic detection technology based on internet social media

LI Shanshan1, YANG Wenzhong2,3, WANG Ting1, WANG Lihua1   

    1. College of Software, Xinjiang University, Urumqi Xinjiang 830046, China
    2. College of Information Science and Engineering, Xinjiang University, Urumqi Xinjiang 830046, China
    3. National Engineering Laboratory for Public Safety Risk Perception and Control by Big Data (China Academy of Electronics and Information Technology), Urumqi Xinjiang 830000, China
  • Received: 2019-11-01 Revised: 2019-12-12 Online: 2020-06-18 Published: 2020-06-10
  • Contact: YANG Wenzhong, born in 1971, Ph.D., associate professor. His research interests include Internet public opinion, intelligence analysis, information security, and wireless sensor networks.
  • About author: LI Shanshan, born in 1996, M.S. candidate. Her research interests include natural language processing, text data mining, and information security. YANG Wenzhong, born in 1971, Ph.D., associate professor. His research interests include Internet public opinion, intelligence analysis, information security, and wireless sensor networks. WANG Ting, born in 1996, M.S. candidate. Her research interests include natural language processing, text sentiment analysis, and information security. WANG Lihua, born in 1995, M.S. candidate. Her research interests include natural language processing and text intention detection.
  • Supported by:

    National Key Research and Development Program of China (2017YFC0820702-3), the National Natural Science Foundation of China (U1603115, U1435215), the Laboratory Director Foundation of National Engineering Laboratory for Public Safety Risk Perception and Control by Big Data.


With the rise of various online platforms, data on internet social media spreads faster, attracts higher user participation and covers a broader range of content than data in traditional media. People follow and comment on a wide variety of topics, and the information related to a single topic may contain deeper, more fine-grained sub-topics. A survey of sub-topic detection based on internet social media, a newly emerging and developing research field, was presented. Obtaining topic and sub-topic information through social media and joining the discussion is changing people’s lives in an all-round way. However, the technologies in this field are not yet mature, and research in China is still at an early stage. Firstly, the development background and basic concepts of sub-topic detection in internet social media were described. Secondly, the sub-topic detection technologies were divided into seven categories, each of which was introduced, compared and summarized. Thirdly, the methods of sub-topic detection were divided into online and offline methods, the two kinds of methods were compared, and the general and frequently used technologies of each were listed. Finally, the current shortcomings and future development trends of this field were summarized.

Key words: sub-topic, Topic Detection and Tracking (TDT), internet social media, topic hierarchy, sub-event



