Image character editing method based on improved font adaptive neural network

doi:10.11772/j.issn.1001-9081.2021050882

Journal of Computer Applications, 2022, Vol. 42, Issue (7): 2227-2238. DOI: 10.11772/j.issn.1001-9081.2021050882

• Multimedia computing and computer simulation •

Shangwang LIU¹,², Xinming ZHANG¹,², Fei ZHANG¹,²

1. College of Computer and Information Engineering, Henan Normal University, Xinxiang Henan 453007, China
2. Engineering Lab of Intelligence Business and Internet of Things of Henan Province (Henan Normal University), Xinxiang Henan 453007, China

Received: 2021-05-27  Revised: 2021-11-24  Accepted: 2021-12-21  Online: 2022-03-08  Published: 2022-07-10
Contact: Shangwang LIU
About author: ZHANG Xinming, born in 1963 in Xiaogan, Hubei, M. S., professor, CCF member. His research interests include intelligent optimization algorithms and image segmentation.
ZHANG Fei, born in 1987 in Nanyang, Henan, Ph. D., lecturer. Her research interests include machine learning and adversarial learning.
Supported by:
Key Program of Henan Province Science and Technology Project (192102210290); Basic Research Program of Key Scientific Research Project of Higher Educations of Henan Province (21A520022)

Abstract

English characters, as an international language, and pinyin characters in Chinese-language settings appear in many public places. When these characters appear in images, especially images with complex styles, they are difficult to edit directly. To address this problem, an image character editing method based on an improved character generation network, the Font Adaptive Neural network (FANnet), was proposed. Firstly, a saliency detection algorithm based on Histogram Contrast (HC) was used to improve the Character Adaptive Detection (CAD) model so that the image characters selected by the user were extracted accurately. Secondly, FANnet was used to generate a binary image of the target character whose font was almost identical to that of the source character. Then, the color of the source characters was transferred to the target characters by the proposed Colors Distribute-based Local (CDL) transfer model, which is based on color complexity discrimination. Finally, editable target characters highly consistent with the source characters in font structure and color variation were generated, achieving the purpose of character editing. Experimental results show that, on the MSRA-TD500, COCO-Text and ICDAR datasets, the average Structural SIMilarity (SSIM), Peak Signal-to-Noise Ratio (PSNR) and Normalized Root Mean Square Error (NRMSE) of the proposed method are 0.7765, 18.3211 dB and 0.4358 respectively. Compared with the Scene Text Editor using Font Adaptive Neural Network (STEFANN) algorithm, SSIM and PSNR are increased by 18.59% and 14.02% and NRMSE is decreased by 2.97%; compared with the multi-modal few-shot font style transfer model Multi-Content GAN (MC-GAN) with 1 input character, SSIM and PSNR are increased by 30.24% and 23.92% and NRMSE is decreased by 4.68%. For image characters with complex font structure and color gradient distribution in real scenes, the proposed method also edits well. The proposed method can be applied to image reuse, automatic computer correction of image characters, and re-storage of image text information.

Key words: Font Adaptive Neural network (FANnet), image character editing, Histogram Contrast (HC), saliency detection, color transfer, font structure
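The abstract names histogram-contrast (HC) saliency detection as the step that improves character extraction, but gives no formula. The sketch below illustrates only the general HC idea as it is commonly described: a color is salient when it is far from the colors that dominate the image, weighted by how often those colors occur. It is not the authors' implementation. It is simplified to integer gray levels, and the function names `hc_saliency` and `salient_mask` and the 0.5 threshold are assumptions introduced for illustration.

```python
from collections import Counter


def hc_saliency(gray):
    """Histogram-contrast saliency for a 2-D list of integer gray levels.

    Simplified sketch: the saliency of a level is the frequency-weighted
    absolute distance to every level present in the image, scaled to [0, 1].
    """
    pixels = [v for row in gray for v in row]
    total = len(pixels)
    hist = Counter(pixels)
    level_saliency = {
        level: sum(count / total * abs(level - other)
                   for other, count in hist.items())
        for level in hist
    }
    # A uniform image has zero contrast everywhere; avoid dividing by zero.
    peak = max(level_saliency.values()) or 1.0
    return [[level_saliency[v] / peak for v in row] for row in gray]


def salient_mask(saliency, threshold=0.5):
    """Binary mask of pixels above a threshold (the value is hypothetical)."""
    return [[s > threshold for s in row] for row in saliency]


# A dark background (level 10) with two bright "character" pixels (level 200).
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
saliency = hc_saliency(image)
mask = salient_mask(saliency)
```

In this toy image the rare bright pixels receive the highest saliency because most of the histogram mass sits far away from them, which is the property a character extractor can exploit when the selected character contrasts with its surroundings.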

CLC Number: TP391.41

Cite this article:

Shangwang LIU, Xinming ZHANG, Fei ZHANG. Image character editing method based on improved font adaptive neural network[J]. Journal of Computer Applications, 2022, 42(7): 2227-2238.

Figures/Tables (18)

Fig. 1 Workflow of proposed image character editing method

Fig. 2 Flow chart of character adaptive detection algorithm

Fig. 3 Structure of FANnet

Fig. 4 Flow chart of colors distribute-based local transfer model

Fig. 5 Typical results of source character adaptive extraction model

Fig. 6 Gray average SSIM of character images generated by FANnet and CAD FANnet

Fig. 7 Comparison of character results generated by FANnet and CAD FANnet

Fig. 8 Some results of colors distribute-based local transfer model

Fig. 9 Accuracy of image character color judgment under different λ

Fig. 10 Color transfer effect of character images with gradient color and complex texture

Fig. 11 Result comparison of generated characters

Fig. 12 Comparison of color transfer results

Fig. 13 CASSIM based on RGB

Fig. 14 Comparison of character generation results with color feature

Fig. 15 Comparison of target image characters in real scene

Tab. 1 Results of different methods on quantitative evaluation indices

Method	NRMSE	PSNR/dB	ASSIM_RGB
MC-GAN (1 input character)	0.4572	14.7854	0.5962
MC-GAN (3 or more input characters)	0.3704	18.5874	0.7902
STEFANN	0.4491	16.0692	0.6548
Proposed method	0.4358	18.3211	0.7765
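The percentage gains quoted in the abstract can be checked against Table 1 with a few lines of arithmetic. The sketch below assumes that each percentage is a relative change with respect to the baseline method, and that the ASSIM_RGB column is the quantity the abstract reports as SSIM (the proposed method's value, 0.7765, is the same in both places). The names `TABLE_1`, `relative_change` and `compare` are introduced here for illustration.

```python
# Table 1 values as (NRMSE, PSNR in dB, ASSIM_RGB).
TABLE_1 = {
    "MC-GAN (1 input character)": (0.4572, 14.7854, 0.5962),
    "STEFANN": (0.4491, 16.0692, 0.6548),
    "Proposed method": (0.4358, 18.3211, 0.7765),
}


def relative_change(new, old):
    """Signed relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


def compare(baseline, proposed="Proposed method", table=TABLE_1):
    """Relative change of each metric, proposed method versus a baseline."""
    names = ("NRMSE", "PSNR", "SSIM")
    return {
        name: relative_change(p, b)
        for name, p, b in zip(names, table[proposed], table[baseline])
    }


for baseline in ("STEFANN", "MC-GAN (1 input character)"):
    changes = compare(baseline)
    print(baseline, {name: round(value, 2) for name, value in changes.items()})
```

Under these assumptions the computed changes agree with the abstract's figures to within about 0.01 percentage point; the small residual is presumably due to rounding of the table entries. A negative NRMSE change is an improvement, since lower error is better. The table also shows that MC-GAN with 3 or more input characters scores slightly better than the proposed method on all three indices, a comparison the abstract does not quote.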

Fig. 16 Comparison of editing results of image characters in real scene

Fig. 17 Application results of proposed image character editing method on different real scene images

