Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 2227-2238. DOI: 10.11772/j.issn.1001-9081.2021050882
Shangwang LIU, Xinming ZHANG, Fei ZHANG
Received:
2021-05-27
Revised:
2021-11-24
Accepted:
2021-12-21
Online:
2022-03-08
Published:
2022-07-10
Contact:
Shangwang LIU
About author:
ZHANG Xinming, born in 1963 in Xiaogan, Hubei, M. S., professor, CCF member. His research interests include intelligent optimization algorithms and image segmentation.
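Abstract:
In today's internationalized society, English characters, as the international common language, and Pinyin characters in Chinese-language settings appear in many public places. When these characters appear in images, especially images with complex styles, they are difficult to edit directly. To address this problem, an image character editing method based on an improved character generation network (FANnet) was proposed. Firstly, a saliency detection algorithm based on Histogram Contrast (HC) was used to improve the adaptive character detection (CAD) model so that the image characters selected by the user were extracted accurately. Secondly, FANnet was used to generate a binary image of the target character whose font was almost identical to that of the source character. Then, the proposed local color distribution (CDL) transfer model was used to transfer the color of the source character to the target character. Finally, editable target characters that were highly consistent with the source characters in both font structure and color variation were generated, which achieved the purpose of character editing. Experimental results show that on the MSRA-TD500, COCO-Text and ICDAR datasets, the average Structural SIMilarity (SSIM), Peak Signal-to-Noise Ratio (PSNR) and Normalized Root Mean Square Error (NRMSE) of the proposed method are 0.7765, 18.3211 dB and 0.4358 respectively. Compared with the Scene Text Editor using Font Adaptive Neural Network (STEFANN) algorithm, SSIM and PSNR are increased by 18.59% and 14.02% and NRMSE is decreased by 2.97%. Compared with the few-shot font transfer model MC-GAN (with one input character), SSIM and PSNR are increased by 30.24% and 23.92% and NRMSE is decreased by 4.68%. Moreover, the proposed method also edits well the characters in real-scene images whose font structure and color gradient distribution are relatively complex. The method can be applied to image reuse, automatic computer correction of image characters, and re-storage of image text information.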
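Two of the three evaluation indices named in the abstract, PSNR and NRMSE, can be illustrated with a small sketch. This is not the authors' evaluation code: the function names, the 8-bit peak value of 255, the toy pixel values, and in particular the NRMSE normalization (RMSE divided by the root mean square of the reference) are assumptions made here for illustration, since several NRMSE conventions exist and this page does not say which one the paper uses. SSIM is omitted because it needs windowed local statistics.

```python
import math


def mse(reference, edited):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(edited) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, edited)) / len(reference)


def psnr(reference, edited, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, edited)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def nrmse(reference, edited):
    """RMSE divided by the RMS of the reference (assumed convention)."""
    rms_reference = math.sqrt(sum(r * r for r in reference) / len(reference))
    if rms_reference == 0:
        raise ValueError("reference is all zeros; NRMSE is undefined")
    return math.sqrt(mse(reference, edited)) / rms_reference


# Toy 2x2 grayscale patches, flattened row by row (illustrative values only).
reference_patch = [100, 100, 100, 100]
edited_patch = [110, 90, 100, 100]

if __name__ == "__main__":
    print(f"PSNR  = {psnr(reference_patch, edited_patch):.2f} dB")
    print(f"NRMSE = {nrmse(reference_patch, edited_patch):.4f}")
```

For the toy patches the mean squared error is 50, which gives a PSNR of about 31.14 dB and an NRMSE of about 0.0707. Higher PSNR and lower NRMSE both indicate an edited image closer to the reference, which is why the abstract reports PSNR as increased and NRMSE as decreased.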
Shangwang LIU, Xinming ZHANG, Fei ZHANG. Image character editing method based on improved font adaptive neural network[J]. Journal of Computer Applications, 2022, 42(7): 2227-2238.
Method | NRMSE | PSNR/dB | ASSIM_RGB |
---|---|---|---|
MC-GAN (1 input character) | 0.4572 | 14.7854 | 0.5962 |
MC-GAN (3 or more input characters) | 0.3704 | 18.5874 | 0.7902 |
STEFANN | 0.4491 | 16.0692 | 0.6548 |
Proposed method | 0.4358 | 18.3211 | 0.7765 |
Tab. 1 Results of different methods on quantitative evaluation indices
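The percentage changes quoted in the abstract can be checked against Table 1 with a short calculation. The sketch below is only an illustration: the function and dictionary names are hypothetical, and the values are the rounded figures printed in the table. Recomputing from these rounded figures reproduces the abstract's percentages to within about 0.01 percentage points; the small gaps presumably arise because the authors worked from unrounded values.

```python
def relative_change_percent(baseline, proposed):
    """Signed change of `proposed` relative to `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Values copied from Table 1: (NRMSE, PSNR in dB, ASSIM_RGB).
TABLE_1 = {
    "MC-GAN (1 input character)": (0.4572, 14.7854, 0.5962),
    "STEFANN": (0.4491, 16.0692, 0.6548),
    "Proposed method": (0.4358, 18.3211, 0.7765),
}
METRIC_NAMES = ("NRMSE", "PSNR", "ASSIM_RGB")


def compare(baseline_name, proposed_name="Proposed method"):
    """Return the per-metric percentage change of the proposed method."""
    baseline = TABLE_1[baseline_name]
    proposed = TABLE_1[proposed_name]
    return {
        name: relative_change_percent(b, p)
        for name, b, p in zip(METRIC_NAMES, baseline, proposed)
    }


if __name__ == "__main__":
    for baseline_name in ("STEFANN", "MC-GAN (1 input character)"):
        changes = compare(baseline_name)
        summary = ", ".join(f"{k}: {v:+.2f}%" for k, v in changes.items())
        print(f"vs {baseline_name}: {summary}")
```

A negative NRMSE change is an improvement, since lower error is better. The abstract's MC-GAN comparison covers only the one-input-character setting; Table 1 shows that MC-GAN with three or more input characters scores better than the proposed method on all three indices.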