Learning method of indoor scene semantic annotation based on texture information

doi:10.11772/j.issn.1001-9081.2018040892

Journal of Computer Applications, 2018, Vol. 38, Issue 12: 3409-3413.



ZHANG Yuanyuan¹,², HUANG Yijun¹, WANG Yuefei¹

1. School of Mechanical and Marine Engineering, Qinzhou University, Qinzhou Guangxi 535011, China;
2. Qinzhou Key Laboratory for Advanced Technology to Internet of Things, Qinzhou Guangxi 535011, China

Received: 2018-05-02  Revised: 2018-07-19  Online: 2018-12-15  Published: 2018-12-10
Contact: ZHANG Yuanyuan
Supported by:
This work is partially supported by the Project of Improving the Basic Ability of Young and Middle-Aged Teachers in Universities of Guangxi (17KY0793), the Key Laboratory for Design, Manufacture and Control of Marine Machinery and Equipment in Guangxi Universities (GXLH2014ZD-05, GXLH2016YB-07), the Key Laboratory of Advanced Technology for the Internet of Things of Qinzhou (IOT2018C02).

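About the authors: ZHANG Yuanyuan (born 1987), female, from Minquan, Henan, lecturer, M.S.; main research interests: intelligent detection and control, pattern recognition. HUANG Yijun (born 1965), male, from Yizhou, Guangxi, professor, Ph.D.; main research interests: pattern recognition, intelligent control. WANG Yuefei (born 1986), male, from Pianguan, Shanxi, teaching assistant, M.S.; main research interest: integrated automation of complex industrial control.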

Abstract

Abstract: Detection, tracking, and information editing of key objects in indoor scene video are still done mainly by hand, which is inefficient and imprecise. To address these problems, a learning method of indoor scene semantic annotation based on texture information is proposed. First, the optical flow method is used to obtain the motion information between video frames, and the key-frame annotation together with the inter-frame motion information is used to initialize the annotation of non-key frames. Then, an energy equation is constructed from the image texture constraint of each non-key frame and its initialized annotation. Finally, the energy equation is minimized with the graph-cuts method, and its solution is the semantic annotation of the non-key frame. Experimental results on annotation accuracy and visual quality show that the proposed method outperforms both the motion estimation method and the model-based learning method. The method can serve as a reference for low-latency decision-making systems such as service robots, smart homes, and emergency response.
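The three steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the energy equation, so the sketch assumes two labels only, an integer per-pixel displacement field in place of a real optical flow estimate, a Potts-style energy whose smoothness weight decays with the grey-level difference (a crude stand-in for the texture constraint), and an Edmonds-Karp max-flow as the graph-cut solver. All function and parameter names (warp_labels, graph_cut_labels, lam_d, lam_s, sigma) are hypothetical.

```python
import math
from collections import deque

def warp_labels(key_labels, flow):
    """Initialize a non-key frame by copying key-frame labels along the motion.

    flow[y][x] = (dy, dx) is an assumed integer displacement from the key frame
    to this frame, so pixel (y, x) reads the label at (y - dy, x - dx)."""
    h, w = len(key_labels), len(key_labels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y - dy, 0), h - 1)  # clamp to the image border
            sx = min(max(x - dx, 0), w - 1)
            out[y][x] = key_labels[sy][sx]
    return out

def pair_weight(a, b, lam_s, sigma):
    """Smoothness weight: large for similar intensities, near zero across edges."""
    return lam_s * math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def neighbors(y, x, h, w):
    """Right and down neighbours, so each 4-connected pair is visited once."""
    return [(qy, qx) for qy, qx in ((y, x + 1), (y + 1, x)) if qy < h and qx < w]

def energy(labels, init, image, lam_d=1.0, lam_s=2.0, sigma=10.0):
    """Assumed energy: data term (disagree with init) + image-weighted Potts term."""
    h, w = len(labels), len(labels[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += lam_d * (labels[y][x] != init[y][x])
            for qy, qx in neighbors(y, x, h, w):
                if labels[y][x] != labels[qy][qx]:
                    total += pair_weight(image[y][x], image[qy][qx], lam_s, sigma)
    return total

def graph_cut_labels(init, image, lam_d=1.0, lam_s=2.0, sigma=10.0):
    """Minimize the binary energy above with an s-t minimum cut (Edmonds-Karp)."""
    h, w = len(init), len(init[0])
    S, T = "s", "t"
    cap = {}  # residual capacities, cap[u][v]

    def add_edge(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # residual (reverse) edge

    for y in range(h):
        for x in range(w):
            p = (y, x)
            add_edge(S, p, lam_d if init[y][x] == 0 else 0.0)  # cut if p takes label 1
            add_edge(p, T, lam_d if init[y][x] == 1 else 0.0)  # cut if p takes label 0
            for q in neighbors(y, x, h, w):
                wt = pair_weight(image[y][x], image[q[0]][q[1]], lam_s, sigma)
                add_edge(p, q, wt)
                add_edge(q, p, wt)

    def reachable():
        """Breadth-first search over edges with remaining capacity."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = reachable()
    while T in parent:  # augment along shortest paths until none is left
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        f = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= f
            cap[v][u] += f
        parent = reachable()
    # Pixels still reachable from the source take label 0, the rest label 1.
    return [[0 if (y, x) in parent else 1 for x in range(w)] for y in range(h)]
```

For example, with key = [[0, 0, 1, 1, 1]], image = [[10, 10, 10, 200, 200]] and a displacement field that is wrong at one pixel, flow = [[(0, 1), (0, -2), (0, 1), (0, 1), (0, 1)]], warp_labels(key, flow) gives the noisy initialization [[0, 1, 0, 1, 1]], and graph_cut_labels(init, image) returns [[0, 0, 0, 1, 1]]: the isolated label is smoothed away inside the uniform region, while the label boundary stays on the strong intensity edge. A multi-label version would need an extension such as alpha-expansion, which this sketch does not cover.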

Key words: indoor scene, semantic annotation learning, motion estimation, image texture, graph-cuts


ZHANG Yuanyuan, HUANG Yijun, WANG Yuefei. Learning method of indoor scene semantic annotation based on texture information[J]. Journal of Computer Applications, 2018, 38(12): 3409-3413.




Related Articles 15


[1] Kegui GUO, Rui CAO, Neng WAN, Xiao WANG, Yue YIN, Xuming TANG, Junlin XIONG. Image matching algorithm based on transmission tower area extraction[J]. Journal of Computer Applications, 2022, 42(5): 1591-1597.
[2] XU Jianglang, LI Linyan, WAN Xinjun, HU Fuyuan. Indoor scene recognition method combined with object detection[J]. Journal of Computer Applications, 2021, 41(9): 2720-2725.
[3] YU Yingdong, YANG Yi, LIN Lan. Image style transfer network based on texture feature analysis[J]. Journal of Computer Applications, 2020, 40(3): 638-644.
[4] XI Zhihong, HAN Shuangquan, WANG Hongxu. Simultaneous localization and semantic mapping of indoor dynamic scene based on semantic segmentation[J]. Journal of Computer Applications, 2019, 39(10): 2847-2851.
[5] YU Yinghuai, XIE Shiyi, MEI Qixiang. Accurate motion estimation algorithm based on upsampled phase correlation with kernel regression refining[J]. Journal of Computer Applications, 2016, 36(8): 2316-2321.
[6] YIN Jun, DONG Lida, CHI Tianyang. Mobile robot motion estimation based on classified feature points[J]. Journal of Computer Applications, 2015, 35(2): 590-594.
[7] WAN Jinliang, YE Long. Multi-region image reconstruction algorithm[J]. Journal of Computer Applications, 2013, 33(12): 3544-3547.
[8] ZHUANG Yanbin, GUI Yuan, XIAO Xianjian. Compressed video sensing method based on motion estimation and backtracking-based adaptive orthogonal matching pursuit[J]. Journal of Computer Applications, 2013, 33(09): 2577-2579.
[9] MENG Bo, HAN Guang-liang. Electronic image stabilization algorithm using improved scale invariant feature transform[J]. Journal of Computer Applications, 2012, 32(10): 2817-2820.
[10] LI Shi-ping, ZHENG Wen-bin, SHI Xin. Fibonacci optimized UMHexagonS algorithm for H.264 motion estimation[J]. Journal of Computer Applications, 2012, 32(09): 2580-2584.
[11] DANG Xin-peng, LIU Wen-ping. Face recognition algorithm based on multi-level texture spectrum features and PCA[J]. Journal of Computer Applications, 2012, 32(08): 2316-2319.
[12] WANG Feng-qin, CHEN Xiao-lei, CHEN Yan. Side information interpolation algorithm based on spatio-temporal correlations at decoder[J]. Journal of Computer Applications, 2012, 32(08): 2324-2327.
[13] LUO Gui-e, XU Yun-bin. Depth map acquisition technique based on Quaternion-Gabor wavelet motion estimation[J]. Journal of Computer Applications, 2012, 32(01): 238-240.
[14] WANG Qiang, LI Yue-e. Fast block-matching motion estimation algorithm based on directional adaptive sampling search[J]. Journal of Computer Applications, 2011, 31(10): 2721-2723.
[15] DING Zhihong, WANG Gang, LIU Lizhu. Adaptive error concealment algorithm based on residual distribution for whole frame losses in H.264[J]. Journal of Computer Applications, 2011, 31(06): 1569-1571.