Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (11): 3139-3144. DOI: 10.11772/j.issn.1001-9081.2017.11.3139

• The 16th China Conference on Machine Learning (CCML 2017) •

Video keyframe extraction based on the semantics of users' interests

YU Huangyue, WANG Han, GUO Mengting

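  1. College of Information Science and Technology, Beijing Forestry University, Beijing 100083
  • Received: 2017-05-16 Revised: 2017-06-26 Online: 2017-11-10 Published: 2017-11-11
  • Corresponding author: WANG Han
  • About the authors: YU Huangyue (born 1996), female, from Nanchang, Jiangxi; main research interests: digital image processing and video retrieval. WANG Han (born 1986), female, from Changsha, Hunan; lecturer, Ph.D.; main research interests: video and image retrieval, and machine learning. GUO Mengting (born 1996), female, from Beijing; main research interests: image processing and image retrieval.
  • Supported by:
    Fundamental Research Funds for the Central Universities (2015ZCQ-XX).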

Video keyframe extraction based on users' interests

YU Huangyue, WANG Han, GUO Mengting   

  1. College of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  • Received: 2017-05-16 Revised: 2017-06-26 Online: 2017-11-10 Published: 2017-11-11
  • Supported by:
    This work is partially supported by the Fundamental Research Funds for the Central Universities (2015ZCQ-XX).

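Abstract (from the Chinese): Current techniques for extracting key information from video mainly select keyframes according to low-level video features and ignore semantic information related to users' interests. Semantic modeling of video requires collecting a large number of labeled video training samples, which is time-consuming and laborious. To alleviate this problem, a large amount of Internet image data, which is rich in content and covers a large amount of event information, is used to build a semantic model based on users' interests. However, the image knowledge obtained from the Internet is diverse and often accompanied by image noise, so brute-force transfer would substantially degrade the final extraction results. A synonym joint weight model is therefore proposed to measure groups of Internet images that differ from one another but are semantically similar, and these image groups are used to build the semantic model. Semantic weights are obtained through joint weight learning, and the role each image group plays in knowledge transfer is determined by its weight. The proposed method is validated on multiple videos from different video websites. The experimental results show that joint-weight semantic modeling of the content users are interested in captures information more comprehensively and accurately, and thus effectively guides video keyframe extraction.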

Key words (from the Chinese): video retrieval, keyframe extraction, video analysis, knowledge transfer

Abstract: At present, video key information extraction mainly focuses on extracting keyframes according to low-level video features, and ignores the semantic information related to users' interests. Semantic modeling of video requires a large number of labeled video training samples, which is time-consuming and laborious to collect. To alleviate this problem, a large amount of Internet image data, which is rich in content and covers a large amount of event information, was used to construct a semantic model based on users' interests. However, the images obtained from the Internet are diverse and often accompanied by noise, so brute-force transfer would greatly degrade the final video extraction results. A synonym joint weight model was therefore proposed to measure groups of Internet images that differ from one another but are semantically similar, and these image groups were used to construct the semantic model. Semantic weights were obtained through joint weight learning, and the contribution of each image group to knowledge transfer was determined by its weight. The experimental results on several challenging video datasets demonstrate that joint-weight semantic modeling of users' interests captures information more comprehensively and accurately, and thus effectively guides video keyframe extraction.

Key words: video retrieval, keyframe extraction, video analysis, knowledge transfer

CLC number: