Application of improved DeepLabV3+ model in mural segmentation

Journal of Computer Applications, 2021, Vol. 41, Issue 5: 1471-1476. DOI: 10.11772/j.issn.1001-9081.2020071101

Special Issue: Multimedia computing and computer simulation

• Virtual reality and multimedia computing •

CAO Jianfang¹,², TIAN Xiaodong¹, JIA Yiming¹, YAN Minmin¹

1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030024, China;
2. Department of Computer, Xinzhou Teachers University, Xinzhou, Shanxi 034000, China

Received: 2020-07-11; Revised: 2020-10-06; Online: 2021-05-10; Published: 2021-05-19
Supported by:
This work is partially supported by the Program of the Humanities and Social Sciences Key Research Base of Higher Education Institutions of Shanxi (20190130).

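Corresponding author: CAO Jianfang
About the authors: CAO Jianfang (born 1976), female, from Xinzhou, Shanxi, Ph.D., professor, CCF senior member; research interests: digital image understanding and big data. TIAN Xiaodong (born 1996), male, from Shuozhou, Shanxi, M.S. candidate; research interest: digital image processing. JIA Yiming (born 1996), male, from Taiyuan, Shanxi, M.S. candidate; research interests: deep learning and image processing. YAN Minmin (born 1997), female, from Changzhi, Shanxi, M.S. candidate; research interest: intelligent information processing.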

Abstract

To address the blurred target boundaries and low segmentation efficiency encountered when segmenting images of ancient murals, a multi-class image segmentation model built on a lightweight convolutional neural network, named MC-DM (Multi-Class DeepLabV3+ MobileNetV2), was proposed. The model combined the DeepLabV3+ architecture with the MobileNetV2 network and used the spatial pyramid structure of DeepLabV3+ to fuse the convolutional features of the mural at multiple scales, which reduced the loss of image detail during segmentation. First, MobileNetV2 extracted features from the input image, which kept the extraction of image information accurate while reducing time consumption. Second, the image features were processed by dilated convolution, which enlarged the receptive field and captured more semantic information without changing the number of parameters. Finally, bilinear interpolation was used to up-sample the output feature map into a pixel-level predicted segmentation map, which preserved segmentation accuracy as far as possible. The model was tested in the JetBrains PyCharm Community Edition 2019 environment on a dataset built from 1,000 scanned mural images. Experimental results showed that the training accuracy of the MC-DM model was 1 percentage point higher than that of the traditional SegNet (Segment Network)-based image segmentation model, that its accuracy was 2 percentage points higher than that of the image segmentation model based on PSPNet (Pyramid Scene Parsing Network), and that its Peak Signal-to-Noise Ratio (PSNR) was on average 3 to 8 dB higher than those of the comparison models, which verified the effectiveness of the model for mural segmentation. The proposed model offers a new approach to the segmentation of ancient mural images.
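Two quantities in the abstract can be illustrated with a small sketch: the way dilated convolution widens the receptive field while the number of weights stays fixed, and the PSNR metric used for comparison. The code below is not the authors' implementation. It is a one-dimensional, pure-Python illustration, and the function names (`effective_kernel_size`, `dilated_conv1d`, `psnr`) and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Span of input samples covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, weights, dilation=1):
    """Valid-mode 1-D dilated correlation.

    The number of weights is the same for every dilation rate; only the
    spacing between the sampled inputs changes.
    """
    span = effective_kernel_size(len(weights), dilation)
    out_len = len(signal) - span + 1
    if out_len <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(weights))
        for i in range(out_len)
    ]


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    max_value=255 assumes 8-bit images (an assumption of this sketch).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

With a 3-tap kernel, a dilation rate of 2 covers 5 input samples and a rate of 4 covers 9, yet `weights` still holds three values in both cases. This is the sense in which the receptive field grows without changing the number of parameters.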

Key words: mural segmentation, multi-scale information fusion, depthwise separable convolution, inverted residual, spatial pyramid pooling

CLC Number: TP391.41

CAO Jianfang, TIAN Xiaodong, JIA Yiming, YAN Minmin. Application of improved DeepLabV3+ model in mural segmentation[J]. Journal of Computer Applications, 2021, 41(5): 1471-1476.
