Video colorization method based on hybrid neural network model of long short term memory and convolutional neural network

doi:10.11772/j.issn.1001-9081.2019020264

Journal of Computer Applications, 2019, Vol. 39, Issue (9): 2726-2730.

• Virtual reality and multimedia computing •

ZHANG Zheng, HE Shan, HE Jingqi

School of Computer Science, Southwest Petroleum University, Chengdu Sichuan 610500, China

Received: 2019-02-19 Revised: 2019-05-06 Online: 2019-05-14 Published: 2019-09-10
Corresponding author: ZHANG Zheng
About the authors: ZHANG Zheng, born in 1994 in Chengdu, Sichuan, male, M. S. candidate; his research interests include deep learning and image processing. HE Shan, born in 1972 in Chengdu, Sichuan, male, M. S., associate professor; his research interests include data mining and machine learning. HE Jingqi, born in 1993 in Chengdu, Sichuan, male, M. S. candidate; his research interests include embedded systems.

Abstract:

A video can be viewed as a sequence of consecutive frames, so video colorization is in essence image colorization. Because of the long-term sequential nature of video, however, directly applying existing image colorization methods to video readily produces jitter or flicker. To address this problem, a hybrid neural network model combining Long Short-Term Memory (LSTM) cells and a Convolutional Neural Network (CNN) was proposed for colorizing grayscale video. In this method, the CNN extracts the semantic features of each video frame while the LSTM cells learn the temporal information of the grayscale video to maintain spatiotemporal consistency; the local semantic features and the temporal features are then fused to generate the final colorized frame sequence. Quantitative analysis and a user study of the experimental results show that the method achieves good results in video colorization.

Key words: video colorization, Long Short Term Memory (LSTM), Convolutional Neural Network (CNN), time-space consistency
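
The data flow described in the abstract (per-frame features, a recurrent state carried across frames, then fusion of the two) can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the authors' network: `frame_features` is a hypothetical stand-in for the CNN (it merely averages the brightness of each quadrant of a tiny grayscale frame), `lstm_step` uses fixed scalar toy weights instead of learned matrices, and fusion is assumed to be plain concatenation. The abstract specifies none of these details.

```python
import math
from typing import List, Tuple

Frame = List[List[float]]  # grayscale frame, pixel values in [0, 1]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def frame_features(frame: Frame) -> List[float]:
    """Hypothetical stand-in for CNN features: mean brightness per quadrant."""
    h, w = len(frame), len(frame[0])
    feats = []
    for r0, r1 in ((0, h // 2), (h // 2, h)):
        for c0, c1 in ((0, w // 2), (w // 2, w)):
            block = [frame[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(block) / len(block))
    return feats


def lstm_step(
    x: List[float],
    h_prev: List[float],
    c_prev: List[float],
    w_x: float = 0.5,          # toy input weight (assumed, not learned)
    w_h: float = 0.5,          # toy recurrent weight (assumed, not learned)
    forget_bias: float = 1.0,  # toy bias that favors keeping the old cell state
) -> Tuple[List[float], List[float]]:
    """One element-wise LSTM step: gates decide what to keep, add, and emit."""
    h_new, c_new = [], []
    for x_i, h_i, c_i in zip(x, h_prev, c_prev):
        z = w_x * x_i + w_h * h_i
        i_gate = sigmoid(z)                 # input gate
        f_gate = sigmoid(z + forget_bias)   # forget gate
        o_gate = sigmoid(z)                 # output gate
        g = math.tanh(z)                    # candidate cell value
        c = f_gate * c_i + i_gate * g       # updated cell state
        c_new.append(c)
        h_new.append(o_gate * math.tanh(c)) # updated hidden state
    return h_new, c_new


def fuse_sequence(frames: List[Frame]) -> List[List[float]]:
    """For each frame, concatenate its local features with the LSTM hidden state."""
    h = [0.0] * 4
    c = [0.0] * 4
    fused = []
    for frame in frames:
        x = frame_features(frame)
        h, c = lstm_step(x, h, c)  # state is carried over to the next frame
        fused.append(x + h)        # fusion assumed to be concatenation
    return fused


# Three tiny 2x2 grayscale frames that brighten slowly over time.
frames = [
    [[0.0, 0.2], [0.4, 0.6]],
    [[0.1, 0.3], [0.5, 0.7]],
    [[0.2, 0.4], [0.6, 0.8]],
]
fused = fuse_sequence(frames)
print(len(fused), len(fused[0]))  # 3 8
```

Each fused vector holds four per-frame values followed by four values that depend on all earlier frames. In a real model, a decoder would map such fused features to the color channels of each frame; that stage is omitted here.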

CLC number: TP391.4

Cite this article:

ZHANG Zheng, HE Shan, HE Jingqi. Video colorization method based on hybrid neural network model of long short term memory and convolutional neural network[J]. Journal of Computer Applications, 2019, 39(9): 2726-2730.

