Improved U-Net for seal segmentation of Republican archives

doi:10.11772/j.issn.1001-9081.2022020218

Journal of Computer Applications, 2023, Vol. 43, Issue (3): 943-948.

Special topic: Multimedia computing and computer simulation

You YANG¹,², Ruhui ZHANG², Pengcheng XU², Kang KANG², Hao ZHAI²

1. National Center for Applied Mathematics in Chongqing (Chongqing Normal University), Chongqing 401331, China
2. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China

Received: 2022-02-28 Revised: 2022-05-20 Accepted: 2022-05-20 Online: 2022-08-16 Published: 2023-03-10
Contact: Ruhui ZHANG
About the authors:
YANG You, born in 1965, from Chongqing, Ph. D., professor. His research interests include digital image processing and computer vision.
ZHANG Ruhui, born in 1998, from Nanjing, Jiangsu, M. S. candidate. Her research interests include digital image processing.
XU Pengcheng, born in 1997, from Nanjing, Jiangsu, M. S. candidate. His research interests include digital image processing.
KANG Kang, born in 1998, from Shaoxing, Zhejiang, M. S. candidate. His research interests include computer vision.
ZHAI Hao, born in 1987, from Datong, Shanxi, Ph. D. His research interests include digital image processing and computer vision.
Supported by:
Chongqing Postgraduate Joint Training Base Project (2019-45); Doctoral Start-Up/Talent Introduction Funded Project of Chongqing Normal University (21XLB032)

Abstract:

Precise segmentation of the seals in images of Republican archives supports intelligent applications of such archives. To address the severe seal intrusion and heavy noise found in Republican archives, a network for seal segmentation, U-Net for Seal (UNet-S), was proposed. While retaining the encoder-decoder structure and skip connections of U-Net, the network was improved in three respects. First, a multi-scale residual module replaced the original convolution layers of U-Net, so that UNet-S extracts multi-scale features effectively while avoiding problems such as network degradation and gradient explosion. Second, the ordinary convolutions in the multi-scale residual module were replaced with Depthwise Separable Convolution (DSConv), which greatly reduces the number of network parameters. Third, Binary Cross Entropy Dice Loss (BCEDiceLoss) was used, with its weight factor selected according to simulation experiments, to address the data imbalance of Republican archives. Experimental results show that, compared with U-Net, DeepLab v2 and other networks, UNet-S achieved the best Dice Similarity Coefficient (DSC), mean Intersection over Union (mIoU) and Mean Pixel Accuracy (MPA), with improvements of up to 17.38%, 32.68% and 0.6% respectively, while its number of parameters was up to 76.64% lower. UNet-S therefore segments seals more effectively on the Republican archives dataset.

Key words: Depthwise Separable Convolution (DSConv), U-Net, multi-scale feature extraction, archives of the Republic of China, seal segmentation
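
The parameter saving attributed to DSConv in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it counts the weights of a standard convolution and of a depthwise separable one for a single layer, ignoring biases, and the channel sizes (32 in, 64 out) and the 3 x 3 kernel are hypothetical values chosen only for illustration.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsconv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise separable convolution (bias omitted):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out = 32, 64
standard = conv_params(c_in, c_out)
separable = dsconv_params(c_in, c_out)
reduction = 1 - separable / standard
print(f"standard={standard}, separable={separable}, reduction={reduction:.1%}")
```

For a single layer the saving depends on the kernel size and channel widths, so this per-layer figure is not expected to match the network-level reduction reported in the abstract.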

CLC number: TP391.41

You YANG, Ruhui ZHANG, Pengcheng XU, Kang KANG, Hao ZHAI. Improved U-Net for seal segmentation of Republican archives[J]. Journal of Computer Applications, 2023, 43(3): 943-948.

Figures/Tables (10)

Fig. 1 UNet-S structure

Fig. 2 Multi-scale residual block structure

Fig. 3 Seal and mask diagrams of Republican archives

Tab. 1 Comparison of experimental results of different networks

Network	MPA	mIoU	DSC	Parameters/10⁶
U-Net	0.995	0.784	0.879	28.90
DeepLab v2	0.990	0.667	0.800	23.60
UNet++	0.995	0.834	0.910	9.20
DFANet	0.995	0.803	0.891	2.20
DDRNet	0.995	0.795	0.886	5.70
UNet-S	0.996	0.885	0.939	6.75
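
The "at most" percentages quoted in the abstract appear to be relative changes of UNet-S against the weakest baseline for each column of Tab. 1. The following sketch recomputes them from the table values; the dictionary layout and function names are illustrative choices, not part of the paper, and the printed figures should match the abstract up to rounding in the last digit.

```python
# Values copied from Tab. 1 (parameters in units of 10^6).
TABLE_1 = {
    "U-Net":      {"MPA": 0.995, "mIoU": 0.784, "DSC": 0.879, "params": 28.90},
    "DeepLab v2": {"MPA": 0.990, "mIoU": 0.667, "DSC": 0.800, "params": 23.60},
    "UNet++":     {"MPA": 0.995, "mIoU": 0.834, "DSC": 0.910, "params": 9.20},
    "DFANet":     {"MPA": 0.995, "mIoU": 0.803, "DSC": 0.891, "params": 2.20},
    "DDRNet":     {"MPA": 0.995, "mIoU": 0.795, "DSC": 0.886, "params": 5.70},
    "UNet-S":     {"MPA": 0.996, "mIoU": 0.885, "DSC": 0.939, "params": 6.75},
}


def relative_change(new: float, old: float) -> float:
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


def largest_changes(table: dict, ours: str = "UNet-S") -> dict:
    """Largest relative gain per metric and largest parameter reduction
    of `ours` over all other networks in the table."""
    target = table[ours]
    baselines = [row for name, row in table.items() if name != ours]
    result = {
        metric: max(relative_change(target[metric], b[metric]) for b in baselines)
        for metric in ("DSC", "mIoU", "MPA")
    }
    result["params_drop"] = max(
        -relative_change(target["params"], b["params"]) for b in baselines
    )
    return result


changes = largest_changes(TABLE_1)
for name, value in changes.items():
    print(f"{name}: {value:.2f}%")
```

Note that the parameter reduction holds against U-Net; DFANet and DDRNet in Tab. 1 have fewer parameters than UNet-S.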

Tab. 2 Complexity comparison of different networks

Network	Time/ms	MAC/10⁹	Parameters/10⁶
U-Net	52.280	196.320	28.90
DeepLab v2	76.730	183.370	23.60
UNet++	53.960	138.450	9.20
DFANet	39.630	1.780	2.20
UNet-S	36.260	15.300	6.75

Fig. 4 Example of test set segmentation results

Tab. 3 Comparison of experimental results with different layers

Network layers	MPA	mIoU	DSC	Parameters/10⁶
1	0.989	0.388	0.559	0.03
2	0.993	0.704	0.826	0.11
3	0.994	0.811	0.896	0.44
4	0.996	0.872	0.932	1.71
5	0.996	0.885	0.939	6.75
6	0.996	0.868	0.929	26.78
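
As a reading aid for Tab. 3, the sketch below computes how much the parameter count grows with each additional layer, using only the values in the table. The ratios come out close to 4, which is what one would expect if every extra level doubles the channel width (a convolution's weight count scales with the product of input and output channels). That explanation is an assumption; this page does not state the channel widths per level.

```python
# Parameter counts from Tab. 3, in units of 10^6, keyed by number of layers.
PARAMS_BY_DEPTH = {1: 0.03, 2: 0.11, 3: 0.44, 4: 1.71, 5: 6.75, 6: 26.78}

# mIoU from Tab. 3, keyed by number of layers.
MIOU_BY_DEPTH = {1: 0.388, 2: 0.704, 3: 0.811, 4: 0.872, 5: 0.885, 6: 0.868}

depths = sorted(PARAMS_BY_DEPTH)

# Growth factor in parameters when going from depth d-1 to depth d.
growth = {d: PARAMS_BY_DEPTH[d] / PARAMS_BY_DEPTH[d - 1] for d in depths[1:]}

# Depth with the highest mIoU in the table.
best_depth = max(MIOU_BY_DEPTH, key=MIOU_BY_DEPTH.get)

for d, factor in growth.items():
    print(f"{d - 1} -> {d} layers: x{factor:.2f} parameters")
print(f"best mIoU at {best_depth} layers")
```

Going from 5 to 6 layers roughly quadruples the parameters while mIoU and DSC drop slightly, which is consistent with the 5-layer configuration being the one reported for UNet-S in Tab. 1.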

Tab. 4 Comparison of results with different weighting factors

$λ$	MPA	mIoU	DSC
0.1	0.996	0.864	0.927
0.2	0.996	0.871	0.931
0.3	0.996	0.868	0.929
0.4	0.996	0.876	0.934
0.5	0.996	0.872	0.932
0.6	0.996	0.880	0.936
0.7	0.996	0.878	0.935
0.8	0.996	0.882	0.937
0.9	0.996	0.885	0.939
1.0	0.996	0.879	0.936
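
To make the role of the weight factor $λ$ in Tab. 4 concrete, here is a minimal pure-Python sketch of a combined binary cross entropy and Dice loss. This page does not give the formula, so the form used here (BCE plus $λ$ times the Dice loss), the smoothing constant, and all function names are assumptions made for illustration; the paper's BCEDiceLoss may weight the terms differently.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross entropy over per-pixel foreground probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2*|P.T| + smooth) / (|P| + |T| + smooth)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1 - (2 * intersection + smooth) / (denominator + smooth)


def bce_dice_loss(probs, targets, lam=0.9):
    """ASSUMED combination: BCE + lam * Dice. The source does not state
    where the weight factor enters the loss."""
    return bce_loss(probs, targets) + lam * dice_loss(probs, targets)


# Toy, heavily imbalanced example: 1 foreground pixel out of 10.
targets = [1] + [0] * 9
perfect = [1.0] + [0.0] * 9
uniform = [0.1] * 10  # predicts 10% foreground probability everywhere

print(bce_dice_loss(perfect, targets))
print(bce_dice_loss(uniform, targets))
```

The Dice term is computed from the overlap with the foreground only, so it is less dominated by the many background pixels than BCE is, which is the usual motivation for combining the two on imbalanced masks.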

Tab. 5 Comparison of experimental results of different modules

DSConv	Multi-scale residual block	Network layers	BCEWithLogitsLoss	BCEDiceLoss	Channels	MPA	mIoU	DSC	Parameters/10⁶
—	—	4	√	—	64	0.995	0.784	0.879	28.900
—	—	4	—	√	64	0.995	0.832	0.908	28.900
—	—	5	√	—	64	0.995	0.817	0.899	116.042
—	—	5	—	√	64	0.996	0.858	0.924	116.042
—	—	5	√	—	32	0.995	0.795	0.886	29.020
—	—	5	—	√	32	0.995	0.836	0.911	29.020
√	—	5	√	—	32	0.995	0.762	0.865	6.750
√	—	5	—	√	32	0.995	0.819	0.901	6.750
√	√	5	√	—	32	0.995	0.789	0.882	6.750
√	√	5	—	√	32	0.996	0.885	0.939	6.750
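
For readers unfamiliar with the metrics in the tables, the sketch below computes Dice, IoU and pixel accuracy for one binary mask using the standard definitions, and checks the identity DSC = 2·IoU / (1 + IoU) against the mIoU/DSC pairs in Tab. 5. The helper names and the toy masks are illustrative, and how the paper averages its "mean" metrics over classes or images is not stated on this page.

```python
def confusion(pred, target):
    """Count true/false positives and negatives for flat binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


def iou(tp, fp, fn):
    return tp / (tp + fp + fn)


def pixel_accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def dsc_from_iou(value):
    """Identity between Dice and IoU for a single binary mask."""
    return 2 * value / (1 + value)


# Toy masks (1 = seal pixel, 0 = background), chosen only for illustration.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, fn, tn = confusion(pred, target)
print(dice(tp, fp, fn), iou(tp, fp, fn), pixel_accuracy(tp, fp, fn, tn))

# (mIoU, DSC) pairs copied from Tab. 5, in row order.
TABLE_5_PAIRS = [
    (0.784, 0.879), (0.832, 0.908), (0.817, 0.899), (0.858, 0.924),
    (0.795, 0.886), (0.836, 0.911), (0.762, 0.865), (0.819, 0.901),
    (0.789, 0.882), (0.885, 0.939),
]
max_gap = max(abs(dsc_from_iou(m) - d) for m, d in TABLE_5_PAIRS)
print(f"largest gap between reported DSC and 2*IoU/(1+IoU): {max_gap:.4f}")
```

The reported pairs agree with the identity to within rounding of the three-decimal table values. Pixel accuracy stays near 0.995 in every row because background pixels dominate, which is why mIoU and DSC are the more informative columns here.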

Fig. 5 Results at different epochs

