Journal of Computer Applications, 2023, Vol. 43, Issue (3): 943-948. DOI: 10.11772/j.issn.1001-9081.2022020218
Topic: Multimedia computing and computer simulation
You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI
Received: 2022-02-28
Revised: 2022-05-20
Accepted: 2022-05-20
Online: 2022-08-16
Published: 2023-03-10
Contact: Ruhui ZHANG
About author: YANG You, born in 1965 in Chongqing, Ph. D., professor. His research interests include digital image processing and computer vision.
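Abstract:
Accurately segmenting the seals in images of Republican archives supports the intelligent application of such archives. To address the severe seal intrusion and excessive noise in Republican archives, a seal segmentation network, UNet-S, was proposed. While retaining the encoder-decoder structure and skip connections of U-Net, the network is improved in three respects. First, multi-scale residual modules replace the original convolutional layers of U-Net, so that UNet-S extracts multi-scale features effectively while avoiding problems such as network degradation and gradient explosion. Second, the ordinary convolutions in the multi-scale residual module are replaced with Depthwise Separable Convolution (DSConv), which greatly reduces the number of network parameters. Third, BCEDiceLoss is used, with its weighting factor selected according to simulation results, to address the data imbalance in Republican archives. Experimental results show that, compared with networks such as U-Net and DeepLab v2, UNet-S achieves the best Dice Similarity Coefficient (DSC), mean Intersection over Union (mIoU) and Mean Pixel Accuracy (MPA), with improvements of up to 17.38%, 32.68% and 0.6% respectively, and a reduction in parameters of up to 76.64%. UNet-S therefore segments seals more effectively on the Republican archive dataset.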
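The three metrics named in the abstract have widely used definitions for a two-class problem (seal versus background). The sketch below computes them from flat binary masks in plain Python. The function name `segmentation_metrics`, the convention for empty classes, and the toy masks are illustrative assumptions; the paper's own evaluation code and exact metric definitions are not shown in this section.

```python
def segmentation_metrics(pred, target):
    """Return (DSC, mIoU, MPA) for two equal-length binary masks.

    Masks are flat sequences where 0 = background and 1 = seal.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")

    # Confusion-matrix counts for the seal (positive) class.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    def safe_div(num, den):
        # Assumption: a class absent from both masks counts as a perfect score.
        return num / den if den else 1.0

    # Dice similarity coefficient of the seal class.
    dsc = safe_div(2 * tp, 2 * tp + fp + fn)

    # Mean IoU: average of the seal IoU and the background IoU.
    iou_seal = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    miou = (iou_seal + iou_background) / 2

    # Mean pixel accuracy: average of the per-class pixel accuracies.
    acc_seal = safe_div(tp, tp + fn)
    acc_background = safe_div(tn, tn + fp)
    mpa = (acc_seal + acc_background) / 2

    return dsc, miou, mpa


# Toy example with six pixels (values are made up for illustration).
dsc, miou, mpa = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
print(f"DSC={dsc:.3f} mIoU={miou:.3f} MPA={mpa:.3f}")
```

Because seals cover only a small share of each page, a pixel-accuracy style metric can stay high even when the seal class is segmented poorly, which is why overlap metrics such as DSC and mIoU are the more informative ones for this task.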
You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI. Improved U-Net for seal segmentation of Republican archives[J]. Journal of Computer Applications, 2023, 43(3): 943-948.
Network | MPA | mIoU | DSC | Parameters/10^6 |
---|---|---|---|---|
U-Net | 0.995 | 0.784 | 0.879 | 28.90 |
DeepLab v2 | 0.990 | 0.667 | 0.800 | 23.60 |
UNet++ | 0.995 | 0.834 | 0.910 | 9.20 |
DFANet | 0.995 | 0.803 | 0.891 | 2.20 |
DDRNet | 0.995 | 0.795 | 0.886 | 5.70 |
UNet-S | 0.996 | 0.885 | 0.939 | 6.75 |
Tab. 1 Comparison of experimental results of different networks
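The "up to" percentages quoted in the abstract can be reproduced from Table 1 as relative changes of UNet-S against the weakest baseline for each metric. The short script below is a sketch of that arithmetic; the names `TABLE_1`, `relative_change` and `max_gains` are illustrative, and the numbers are copied from the table.

```python
# Values copied from Table 1: (MPA, mIoU, DSC, parameters in millions).
TABLE_1 = {
    "U-Net": (0.995, 0.784, 0.879, 28.90),
    "DeepLab v2": (0.990, 0.667, 0.800, 23.60),
    "UNet++": (0.995, 0.834, 0.910, 9.20),
    "DFANet": (0.995, 0.803, 0.891, 2.20),
    "DDRNet": (0.995, 0.795, 0.886, 5.70),
    "UNet-S": (0.996, 0.885, 0.939, 6.75),
}


def relative_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


def max_gains(table, ours="UNet-S"):
    """Largest relative gain of `ours` over any other network, per column."""
    mpa, miou, dsc, params = table[ours]
    others = [row for name, row in table.items() if name != ours]
    return {
        "MPA": max(relative_change(mpa, o[0]) for o in others),
        "mIoU": max(relative_change(miou, o[1]) for o in others),
        "DSC": max(relative_change(dsc, o[2]) for o in others),
        # For parameters, a reduction is a negative relative change.
        "params_reduction": max(-relative_change(params, o[3]) for o in others),
    }


gains = max_gains(TABLE_1)
for name, value in gains.items():
    print(f"{name}: {value:.2f}%")
```

The accuracy maxima come from the comparison with DeepLab v2 and the parameter maximum from the comparison with U-Net. Table 1 also shows that UNet-S is not the smallest network: DFANet and DDRNet have fewer parameters but lower mIoU and DSC.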
Network | Time/ms | MAC/10^9 | Parameters/10^6 |
---|---|---|---|
U-Net | 52.280 | 196.320 | 28.90 |
DeepLab v2 | 76.730 | 183.370 | 23.60 |
UNet++ | 53.960 | 138.450 | 9.20 |
DFANet | 39.630 | 1.780 | 2.20 |
UNet-S | 36.260 | 15.300 | 6.75 |
Tab. 2 Complexity comparison of different networks
Number of layers | MPA | mIoU | DSC | Parameters/10^6 |
---|---|---|---|---|
1 | 0.989 | 0.388 | 0.559 | 0.03 |
2 | 0.993 | 0.704 | 0.826 | 0.11 |
3 | 0.994 | 0.811 | 0.896 | 0.44 |
4 | 0.996 | 0.872 | 0.932 | 1.71 |
5 | 0.996 | 0.885 | 0.939 | 6.75 |
6 | 0.996 | 0.868 | 0.929 | 26.78 |
Tab. 3 Comparison of experimental results with different layers
Weighting factor | MPA | mIoU | DSC |
---|---|---|---|
0.1 | 0.996 | 0.864 | 0.927 |
0.2 | 0.996 | 0.871 | 0.931 |
0.3 | 0.996 | 0.868 | 0.929 |
0.4 | 0.996 | 0.876 | 0.934 |
0.5 | 0.996 | 0.872 | 0.932 |
0.6 | 0.996 | 0.880 | 0.936 |
0.7 | 0.996 | 0.878 | 0.935 |
0.8 | 0.996 | 0.882 | 0.937 |
0.9 | 0.996 | 0.885 | 0.939 |
1.0 | 0.996 | 0.879 | 0.936 |
Tab. 4 Comparison of results with different weighting factors
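Table 4 sweeps a single weighting factor inside BCEDiceLoss. This section does not give the loss formula, so the sketch below is only one plausible form: binary cross-entropy plus a weighted soft Dice term. Which term the factor actually scales in the paper is an assumption here, as are the function name `bce_dice_loss` and the smoothing constant `eps`.

```python
import math


def bce_dice_loss(logits, targets, weight=0.9, eps=1e-7):
    """Combined loss over flat lists of logits and binary targets.

    ASSUMED form (not confirmed by the source): BCE + weight * (1 - soft Dice).
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")

    def sigmoid(z):
        # Numerically stable sigmoid for large positive or negative inputs.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    # Stable binary cross-entropy computed directly from logits:
    # max(z, 0) - z * t + log(1 + exp(-|z|)), averaged over pixels.
    bce = sum(
        max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        for z, t in zip(logits, targets)
    ) / len(logits)

    # Soft Dice coefficient on predicted probabilities.
    probs = [sigmoid(z) for z in logits]
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)

    return bce + weight * (1.0 - dice)


# A confident correct prediction gives a small loss, a confident wrong one a large loss.
print(bce_dice_loss([10.0, -10.0, 10.0], [1, 0, 1]))
print(bce_dice_loss([-10.0, 10.0, -10.0], [1, 0, 1]))
```

The Dice term is driven by overlap with the seal pixels rather than by the pixel count, which is the usual reason for pairing it with cross-entropy when the foreground class is rare.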
DSConv | Multi-scale residual block | Number of layers | BCEWithLogitsLoss | BCEDiceLoss | Channels | MPA | mIoU | DSC | Parameters/10^6 |
---|---|---|---|---|---|---|---|---|---|
— | — | 4 | √ | — | 64 | 0.995 | 0.784 | 0.879 | 28.900 |
— | — | 4 | — | √ | 64 | 0.995 | 0.832 | 0.908 | 28.900 |
— | — | 5 | √ | — | 64 | 0.995 | 0.817 | 0.899 | 116.042 |
— | — | 5 | — | √ | 64 | 0.996 | 0.858 | 0.924 | 116.042 |
— | — | 5 | √ | — | 32 | 0.995 | 0.795 | 0.886 | 29.020 |
— | — | 5 | — | √ | 32 | 0.995 | 0.836 | 0.911 | 29.020 |
√ | — | 5 | √ | — | 32 | 0.995 | 0.762 | 0.865 | 6.750 |
√ | — | 5 | — | √ | 32 | 0.995 | 0.819 | 0.901 | 6.750 |
√ | √ | 5 | √ | — | 32 | 0.995 | 0.789 | 0.882 | 6.750 |
√ | √ | 5 | — | √ | 32 | 0.996 | 0.885 | 0.939 | 6.750 |
Tab. 5 Comparison of experimental results of different modules
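In Table 5, enabling DSConv at 5 layers and 32 channels lowers the parameter count from 29.020 to 6.750 million. The sketch below shows the standard per-layer arithmetic behind that kind of saving: a k x k convolution versus a k x k depthwise convolution followed by a 1 x 1 pointwise convolution. The channel sizes are illustrative assumptions, biases are ignored, and the per-layer ratio need not match the whole-network figure because the full UNet-S layout is not given in this section.

```python
def conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsconv_params(c_in, c_out, k=3):
    """Weights in a depthwise separable convolution (bias ignored).

    One k x k filter per input channel, then a 1 x 1 pointwise convolution.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical encoder widths; the real UNet-S widths may differ.
for c_in, c_out in [(32, 64), (64, 128), (128, 256)]:
    standard = conv_params(c_in, c_out)
    separable = dsconv_params(c_in, c_out)
    print(f"{c_in:>3} -> {c_out:>3}: standard={standard}, "
          f"separable={separable}, ratio={separable / standard:.3f}")
```

For a 3 x 3 kernel the ratio is 1/c_out + 1/9, so it approaches one ninth as the layers get wider.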