Image character editing method based on improved font adaptive neural network

doi:10.11772/j.issn.1001-9081.2021050882

Journal of Computer Applications, 2022, Vol. 42, Issue (7): 2227-2238. DOI: 10.11772/j.issn.1001-9081.2021050882

• Multimedia computing and computer simulation •

Shangwang LIU¹,², Xinming ZHANG¹,², Fei ZHANG¹,²

1. College of Computer and Information Engineering, Henan Normal University, Xinxiang Henan 453007, China
2. Engineering Lab of Intelligence Business and Internet of Things of Henan Province (Henan Normal University), Xinxiang Henan 453007, China

Received: 2021-05-27 Revised: 2021-11-24 Accepted: 2021-12-21 Online: 2022-03-08 Published: 2022-07-10
Contact: Shangwang LIU
About the authors: ZHANG Xinming, born in 1963, M. S., professor. His research interests include intelligent optimization algorithms and image segmentation.
ZHANG Fei, born in 1987, Ph. D., lecturer. Her research interests include machine learning and adversarial learning.
Supported by:
Key Program of Henan Province Science and Technology Project (192102210290); Basic Research Program of Key Scientific Research Project of Higher Educations of Henan Province (21A520022)

ZHANG Xinming is a native of Xiaogan, Hubei, and a CCF member. ZHANG Fei is a native of Nanyang, Henan.

Abstract

In today's international society, English characters, as the script of the international language, appear on many public occasions, as do Chinese pinyin characters in Chinese-language environments. When these characters appear in an image, especially one with a complex style, they are difficult to edit and modify directly. To solve this problem, an image character editing method based on an improved character generation network, the Font Adaptive Neural network (FANnet), was proposed. Firstly, a salience detection algorithm based on Histogram Contrast (HC) was used to improve the Character Adaptive Detection (CAD) model so that the image characters selected by the user could be extracted accurately. Secondly, FANnet was used to generate a binary image of the target character whose font was almost consistent with that of the source character. Then, the color of the source characters was transferred to the target characters by the proposed Colors Distribute-based Local (CDL) transfer model, which is based on color complexity discrimination. Finally, editable target characters highly consistent with the source characters in font structure and color variation were generated, achieving the purpose of character editing. Experimental results show that, on the MSRA-TD500, COCO-Text and ICDAR datasets, the average Structural SIMilarity (SSIM), Peak Signal-to-Noise Ratio (PSNR) and Normalized Root Mean Square Error (NRMSE) of the proposed method are 0.7765, 18.3211 dB and 0.4358 respectively. Compared with the Scene Text Editor using Font Adaptive Neural Network (STEFANN) algorithm, SSIM and PSNR are increased by 18.59% and 14.02% and NRMSE is decreased by 2.97%; compared with the multi-modal few-shot font style transfer model Multi-Content GAN (MC-GAN) with 1 input character, SSIM and PSNR are increased by 30.24% and 23.92% and NRMSE is decreased by 4.68%. The proposed method also performs well on real-scene image characters with complex font structure and color gradient distribution. It can be applied to image reuse, automatic computer error correction of image characters, and re-storage of image text information.
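Two of the three indices reported in the abstract, PSNR and NRMSE, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' evaluation code: images are taken as flat lists of 8-bit gray values, and NRMSE is normalized by the root mean square of the reference image, which is one of several conventions in use; this record does not say which one the paper adopts. The function names and sample pixel values are hypothetical.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def nrmse(reference, test):
    """RMSE divided by the RMS of the reference (an assumed normalization)."""
    ref_rms = math.sqrt(sum(r * r for r in reference) / len(reference))
    if ref_rms == 0:
        raise ValueError("reference image has zero energy")
    return math.sqrt(mse(reference, test)) / ref_rms


# Hypothetical 2x2 grayscale patches, flattened row by row.
target = [100, 120, 140, 160]
generated = [102, 118, 143, 157]
print(f"PSNR = {psnr(target, generated):.2f} dB, NRMSE = {nrmse(target, generated):.4f}")
```

Higher PSNR and lower NRMSE both indicate that the generated character is closer to the reference, which is why the abstract reports an increase for the former and a decrease for the latter.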

Key words: Font Adaptive Neural network (FANnet), image character editing, Histogram Contrast (HC), salience detection, color transfer, font structure
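The idea behind Histogram Contrast (HC) salience detection can be illustrated with a deliberately simplified grayscale sketch. The published HC algorithm works on quantized colors in a perceptual color space with additional smoothing; the version below keeps only the core idea, that a value's salience is its frequency-weighted contrast to every other value in the image. How the paper integrates HC into the CAD model is not described in this record, so the function name, the grayscale simplification and the sample data are all assumptions.

```python
from collections import Counter


def hc_saliency_gray(pixels):
    """Per-pixel histogram-contrast salience for a flat list of gray levels.

    The score of a gray level is the frequency-weighted sum of its absolute
    differences to every gray level present in the image, so all pixels that
    share a level share a score. Scores are rescaled to the range [0, 1].
    """
    if not pixels:
        return []
    counts = Counter(pixels)
    total = len(pixels)
    level_score = {
        g: sum(n / total * abs(g - h) for h, n in counts.items())
        for g in counts
    }
    lo, hi = min(level_score.values()), max(level_score.values())
    span = (hi - lo) or 1.0  # flat image: avoid dividing by zero
    return [(level_score[p] - lo) / span for p in pixels]


# Hypothetical patch: 12 dark background pixels and 4 bright character pixels.
patch = [20] * 12 + [230] * 4
saliency = hc_saliency_gray(patch)
print(saliency[0], saliency[-1])
```

In this toy patch the rare bright pixels receive the highest score because they contrast with the dominant background, which is the property that makes a histogram-based score usable for separating character strokes from their surroundings.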

CLC Number: TP391.41

Shangwang LIU, Xinming ZHANG, Fei ZHANG. Image character editing method based on improved font adaptive neural network[J]. Journal of Computer Applications, 2022, 42(7): 2227-2238.

Figures/Tables 18

Fig. 1 Workflow of proposed image character editing method

Fig. 2 Flow chart of character adaptive detection algorithm

Fig. 3 Structure of FANnet

Fig. 4 Flow chart of colors distribute-based local transfer model

Fig. 5 Typical results of source character adaptive extraction model

Fig. 6 Gray average SSIM of character images generated by FANnet and CAD FANnet

Fig. 7 Comparison of character results generated by FANnet and CAD FANnet

Fig. 8 Some results of colors distribute-based local transfer model

Fig. 9 Accuracy of image character color judgment under different λ

Fig. 10 Color transfer effect of character images with gradient color and complex texture

Fig. 11 Result comparison of generated characters

Fig. 12 Comparison of color transfer results

Fig. 13 CASSIM based on RGB

Fig. 14 Comparison of character generation results with color feature

Fig. 15 Comparison of target image characters in real scene

Tab. 1 Results of different methods on quantitative evaluation indices

Method	NRMSE	PSNR/dB	ASSIM_RGB
MC-GAN (1 input character)	0.4572	14.7854	0.5962
MC-GAN (3 or more input characters)	0.3704	18.5874	0.7902
STEFANN	0.4491	16.0692	0.6548
Proposed method	0.4358	18.3211	0.7765
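The percentages quoted in the abstract can be checked against Tab. 1 with a short relative-change calculation. The sketch below assumes that the ASSIM_RGB column corresponds to the average SSIM named in the abstract; the dictionary layout and function name are illustrative only. Using the four-decimal table values, the results agree with the abstract to within roughly 0.01 percentage points, and the small differences are presumably due to rounding in the published figures.

```python
# Values transcribed from Tab. 1; ASSIM_RGB is read here as the SSIM average.
PROPOSED = {"ASSIM_RGB": 0.7765, "PSNR": 18.3211, "NRMSE": 0.4358}
BASELINES = {
    "STEFANN": {"ASSIM_RGB": 0.6548, "PSNR": 16.0692, "NRMSE": 0.4491},
    "MC-GAN (1 input character)": {
        "ASSIM_RGB": 0.5962, "PSNR": 14.7854, "NRMSE": 0.4572,
    },
}


def relative_change_percent(value, baseline):
    """Signed change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Positive numbers are increases, negative numbers are decreases.
changes = {
    name: {
        metric: relative_change_percent(PROPOSED[metric], row[metric])
        for metric in PROPOSED
    }
    for name, row in BASELINES.items()
}

for name, row in changes.items():
    summary = ", ".join(f"{metric} {delta:+.2f}%" for metric, delta in row.items())
    print(f"vs {name}: {summary}")
```

The table also shows that MC-GAN given 3 or more input characters scores slightly better than the proposed method on all three indices, so the abstract's comparison applies to the single-input-character setting.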

Fig. 16 Comparison of editing results of image characters in real scene

Fig. 17 Application results of proposed image character editing method on different real scene images

