Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (6): 1675-1679. DOI: 10.11772/j.issn.1001-9081.2018112361

• Artificial intelligence •

YOLO network character recognition method with variable candidate box density for the International Phonetic Alphabet

ZHENG Yi1, QI Donglian2, WANG Zhenyu2   

  1. School of Humanities, Zhejiang University, Hangzhou Zhejiang 310028, China;
  2. College of Electrical Engineering, Zhejiang University, Hangzhou Zhejiang 310027, China
  • Received: 2018-11-28; Revised: 2019-01-09; Online: 2019-06-10; Published: 2019-06-17
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61571394), the Science and Technology Project of Zhejiang Province (2019C01001), and the Interdisciplinary Preresearch Project of Zhejiang University (2018FZA122).


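  • Corresponding author: QI Donglian
  • About the authors: ZHENG Yi (born 1989), male, from Pingdingshan, Henan, Ph.D. candidate; research interests: history of Chinese phonology, computational linguistics. QI Donglian (born 1973), female, from Nanyang, Henan, professor, Ph.D.; research interests: image recognition, machine learning. WANG Zhenyu (born 1992), male, from Weifang, Shandong, Ph.D. candidate; research interests: image recognition.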

Abstract: To address the low recognition accuracy and poor practicability of traditional character feature extraction methods when applied to the International Phonetic Alphabet (IPA), a You Only Look Once (YOLO) network character recognition method with variable candidate box density was proposed. Firstly, starting from the YOLO network, the distribution density of candidate boxes was changed to suit three characteristics of IPA character images: the characters are closely arranged along the X-axis, and they are diverse in both type and form. Then, the YOLO-IPA network was constructed by increasing the distribution density of candidate boxes along the X-axis while reducing it along the Y-axis. The proposed method was tested on an IPA dataset of 1360 images in 72 categories collected from Chinese Dialect Vocabulary. The experimental results show that the proposed method achieves a recognition rate of 93.72% for large characters and 89.31% for small characters, a substantial improvement in recognition accuracy over traditional character recognition algorithms. Meanwhile, the detection time was less than 1 s in the experimental environment, so the proposed method can meet the needs of real-time application.

Key words: International Phonetic Alphabet (IPA), character detection and recognition, You Only Look Once (YOLO) network, deep learning
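
The change in candidate box density described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `grid_centers`, the 448 x 448 image size, the 7 x 7 baseline grid, and the 14 x 4 grid are all assumptions chosen for illustration, since the abstract does not state the actual grid dimensions. The sketch only shows how a non-square grid places candidate box centers more densely along the X-axis and more sparsely along the Y-axis.

```python
def grid_centers(img_w, img_h, cells_x, cells_y):
    """Return the (x, y) pixel centers of a cells_x-by-cells_y grid of cells."""
    step_x = img_w / cells_x  # horizontal spacing between candidate centers
    step_y = img_h / cells_y  # vertical spacing between candidate centers
    return [((i + 0.5) * step_x, (j + 0.5) * step_y)
            for j in range(cells_y)
            for i in range(cells_x)]


# Hypothetical sizes for illustration only; not taken from the paper.
IMG_W, IMG_H = 448, 448

# Uniform density: the same spacing on both axes.
baseline = grid_centers(IMG_W, IMG_H, cells_x=7, cells_y=7)

# Variable density: more cells along X, fewer along Y.
ipa_like = grid_centers(IMG_W, IMG_H, cells_x=14, cells_y=4)

print(len(baseline), IMG_W / 7, IMG_H / 7)    # 49 64.0 64.0
print(len(ipa_like), IMG_W / 14, IMG_H / 4)   # 56 32.0 112.0
```

With these assumed sizes, the total number of candidate cells stays similar (49 versus 56), while the horizontal spacing is halved and the vertical spacing grows, which matches the intuition of tightly packed characters on a single text line.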

