Journal of Computer Applications, 2023, Vol. 43, Issue (12): 3927-3932. DOI: 10.11772/j.issn.1001-9081.2022121887
• Multimedia computing and computer simulation •
Xiaofei JI, Kexin ZHANG, Lirong TANG
Received: 2022-12-22
Revised: 2023-03-21
Accepted: 2023-03-22
Online: 2023-04-03
Published: 2023-12-10
Contact: Xiaofei JI
About author: ZHANG Kexin, born in 1996, a native of Jinzhou, Liaoning, M. S. candidate. Her research interests include image processing and video analysis and processing.
Xiaofei JI, Kexin ZHANG, Lirong TANG. Book spine segmentation algorithm based on improved DeepLabv3+ network[J]. Journal of Computer Applications, 2023, 43(12): 3927-3932.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022121887
Number of network layers | MIoU/% | Number of network layers | MIoU/% |
---|---|---|---|
3 | 79.4 | 5 | 91.2 |
4 | 88.5 | 6 | 89.9 |
Tab.1 Influence of the number of network layers in the DenseASPP module on segmentation performance
Backbone network | Self-attention module introduced | MIoU/% |
---|---|---|
Xception | Yes | 92.7 |
Xception | No | 92.2 |
MobileNetV2 | Yes | 93.8 |
MobileNetV2 | No | 93.1 |
Tab.2 Comparison of experimental results before and after introducing the self-attention module
Database | Algorithm | Batch size | Backbone network | MIoU/% |
---|---|---|---|---|
Near-vertical book spine database | Mask R-CNN algorithm* | 2 | ResNet50 | 87.5 |
Near-vertical book spine database | Improved Mask R-CNN algorithm* | 2 | ResNet50 | 85.3 |
Near-vertical book spine database | DeepLabv3+ algorithm* | 4 | MobileNetV2 | 92.3 |
Near-vertical book spine database | Proposed algorithm | 4 | MobileNetV2 | 94.1 |
Tilted book spine database | Mask R-CNN algorithm* | 2 | ResNet50 | 81.3 |
Tilted book spine database | Improved Mask R-CNN algorithm* | 2 | ResNet50 | 93.5 |
Tilted book spine database | DeepLabv3+ algorithm* | 4 | MobileNetV2 | 89.2 |
Tilted book spine database | Proposed algorithm | 4 | MobileNetV2 | 93.3 |
Tab.3 Test results of different network segmentation algorithms on the open-source databases