CCML2017+367+Video Keyframe Extraction Based on Users' Interests

Abstract

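Abstract: Video keyframe extraction is an active research problem in video summarization and retrieval. Existing techniques for extracting key video information mostly select keyframes from low-level video features and ignore semantic information related to user interests. Building semantic models for video requires collecting a large number of labeled training videos, which is time-consuming and laborious. To alleviate this problem, this paper builds a user-interest semantic model from large amounts of Internet image data, which are rich in content and cover a wide range of events. However, the image knowledge obtained from the Internet is diverse and often noisy, so brute-force transfer would severely degrade the final extraction results. This paper proposes a synonym-based joint weighting model to measure groups of Internet images that differ from one another but are semantically similar, and uses these image groups to build the semantic model. Under this framework, semantic weights are obtained through joint weight learning, and the role each image group plays in knowledge transfer is determined by its weight. The method is validated on multiple videos from different video websites. The experimental results show that joint-weight semantic modeling of the content users are interested in captures information more comprehensively and accurately, and thus effectively guides video keyframe extraction.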

Keywords: video retrieval, keyframe extraction, video analysis, knowledge transfer

Abstract: Keyframe extraction is of great interest in video summarization, organization, browsing and indexing. Current research mainly focuses on extracting keyframes by optimizing the diversity or representativeness of low-level features of the video frames, ignoring the interests of users. Moreover, collecting the large number of labelled videos required to model different user-interest concepts for different videos is time-consuming and labor-intensive. To alleviate the labelling burden, we propose to learn models of user-interest concepts for different videos by leveraging abundant Web images, which cover many roughly annotated concepts and are often captured in a maximally informative way. However, knowledge from the Web is noisy and diverse, and brute-force knowledge transfer may hurt keyframe extraction performance. To address this problem, we propose a novel joint group weighting learning framework that leverages different but related groups of knowledge learnt from Web images and transfers them to videos. Under this framework, the weights of the different groups are learnt in a joint optimization problem, and each weight represents how much the corresponding image group contributes to the knowledge transferred to the video. Experimental results on several challenging video datasets demonstrate that using grouped knowledge gained from Web images is effective for video keyframe extraction and yields more comprehensive results.

Key words: Video retrieval, Keyframe extraction, Video analysis, Knowledge transfer
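
The abstract states that each image group's contribution to the transferred knowledge is controlled by a learnt weight. The Python sketch below only illustrates that idea and is not the paper's method: the joint optimization that learns the weights is not described here, so the weights are hand-set, and the function names (`normalize_weights`, `fuse_group_scores`, `select_keyframes`), the per-group frame scores, and the top-k selection rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def normalize_weights(raw: Sequence[float]) -> List[float]:
    """Clip negative weights to zero and rescale so the weights sum to 1."""
    clipped = [max(0.0, w) for w in raw]
    total = sum(clipped)
    if total == 0.0:
        # Fall back to uniform weights when every group is suppressed.
        return [1.0 / len(clipped)] * len(clipped)
    return [w / total for w in clipped]


def fuse_group_scores(group_scores: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted sum over groups: fused[t] = sum_g weights[g] * group_scores[g][t]."""
    n_frames = len(group_scores[0])
    return [sum(w * scores[t] for w, scores in zip(weights, group_scores))
            for t in range(n_frames)]


def select_keyframes(fused: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, in temporal order."""
    ranked = sorted(range(len(fused)), key=lambda t: fused[t], reverse=True)
    return sorted(ranked[:k])


# Toy data (made up): three image-group models scoring five video frames.
group_scores = [
    [0.9, 0.1, 0.2, 0.8, 0.1],
    [0.7, 0.2, 0.1, 0.9, 0.2],
    [0.1, 0.9, 0.8, 0.1, 0.9],  # a noisy group that disagrees with the others
]
# Hand-set stand-ins for learnt weights; the noisy group is suppressed.
weights = normalize_weights([0.6, 0.5, -0.1])
fused = fuse_group_scores(group_scores, weights)
keyframes = select_keyframes(fused, k=2)
print(keyframes)
```

With these toy values the noisy third group receives zero weight, so the two frames favoured by the first two groups are selected. In a real system the weights would come from the joint learning step rather than being set by hand.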

CLC number: TP391.4

俞璜悦, 王晗, 郭梦婷. CCML2017+367+Video Keyframe Extraction Based on Users' Interests[J]. Journal of Computer Applications.

俞璜悦, 王晗, 郭梦婷. Video Keyframe Extraction Based on Users' Interests[J]. Journal of Computer Applications, 2017, 37(11): 3139-3144.

CCML2017+367+Video Keyframe Extraction Based on Users' Interests
