Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (4): 1027-1034. DOI: 10.11772/j.issn.1001-9081.2020081274

Special Issue: The 35th CCF National Conference of Computer Applications (CCF NCCA 2020)

• The 35th CCF National Conference of Computer Applications (CCF NCCA 2020) •

Beijing Opera character recognition based on attention mechanism with HyperColumn

QIN Jun1,2, LUO Yifan1,2, TIE Jun1,2, ZHENG Lu1,2, LYU Weilong3   

  1. College of Computer Science, South-Central University for Nationalities, Wuhan Hubei 430074, China;
  2. Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises (South-Central University for Nationalities), Wuhan Hubei 430074, China;
  3. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing Jiangsu 210094, China
  • Received: 2020-08-20 Revised: 2020-10-27 Online: 2021-04-10 Published: 2020-11-25
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61902437) and the Major Program of Technology Innovation Project of Hubei Province (2019ABA101).


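  • Corresponding author: TIE Jun
  • About the authors: QIN Jun, born in 1968 in Changde, Hunan, female, Ph.D., professor, CCF member. Her research interests include computer vision and complex networks. LUO Yifan, born in 1996 in Huaihua, Hunan, male, M.S. candidate. His research interests include computer vision. TIE Jun, born in 1976 in Sheqi, Henan, male, Ph.D., professor, CCF member. His research interests include machine perception and pattern recognition. ZHENG Lu, born in 1989 in Ulanqab, Inner Mongolia, male, M.S., lecturer. His research interests include image processing and pattern recognition. LYU Weilong, born in 1996 in Nanjing, Jiangsu, male, M.S. candidate. His research interests include deep learning.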

Abstract: To overcome the difficulty of extracting the visual features of Beijing Opera characters and to meet the demand for real-time recognition, a Convolutional Neural Network based on HyperColumn Attention (HCA-CNN) was proposed to extract and recognize the fine-grained features of Beijing Opera characters. The idea of HyperColumn features, originally used for image segmentation and fine-grained localization, was applied to the attention mechanism that locates key regions in the network. Multi-layer stacked features were formed by concatenating the layers of the backbone classification network pixel by pixel through the HyperColumn set, so that both the early shallow spatial features and the late deep category-level semantic features were taken into account, and the accuracy of both the localization task and the backbone classification task was improved. At the same time, the lightweight MobileNetV2 was adopted as the backbone network, which better met the real-time requirements of video application scenarios. In addition, the BeiJing Opera Role (BJOR) dataset was created, and ablation experiments were carried out on it. Experimental results show that, compared with the traditional fine-grained Recurrent Attention Convolutional Neural Network (RA-CNN), HCA-CNN not only improves accuracy by 0.63 percentage points, but also reduces Memory Usage and Params by 162.84 MB and 131.5 MB respectively, and reduces the number of multiply-add operations (Mult-Adds) and floating-point operations (FLOPs) by 39 885×10^6 and 51 886×10^6 respectively. These results verify that the proposed HCA-CNN can effectively improve the accuracy and efficiency of Beijing Opera character recognition and can meet the requirements of practical applications.
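
The pixel-wise concatenation of shallow and deep features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`upsample_nearest`, `hypercolumns`, `column_at`), the nearest-neighbour upsampling, and the toy feature maps are all assumptions made for illustration, and plain Python lists stand in for the tensors a real MobileNetV2 backbone would produce.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Resize every channel to (out_h, out_w) by nearest-neighbour lookup."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [
            [channel[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)
        ]
        for channel in fmap
    ]


def hypercolumns(layers: List[FeatureMap], out_h: int, out_w: int) -> FeatureMap:
    """Bring all layers to one resolution and stack them along the channel axis."""
    stacked: FeatureMap = []
    for fmap in layers:
        stacked.extend(upsample_nearest(fmap, out_h, out_w))
    return stacked


def column_at(hc: FeatureMap, row: int, col: int) -> List[float]:
    """Return the hypercolumn at one pixel: one value per stacked channel."""
    return [channel[row][col] for channel in hc]


# Hypothetical activations: a shallow 4x4 map with 1 channel (fine spatial detail)
# and a deep 2x2 map with 2 channels (coarse, more semantic).
shallow = [[[float(4 * r + c) for c in range(4)] for r in range(4)]]
deep = [
    [[10.0, 20.0], [30.0, 40.0]],
    [[-1.0, -2.0], [-3.0, -4.0]],
]

hc = hypercolumns([shallow, deep], out_h=4, out_w=4)
print(len(hc), column_at(hc, 3, 1))  # 3 [13.0, 30.0, -3.0]
```

Each pixel's hypercolumn thus carries one shallow, spatially precise value alongside the deep values of the coarse cell that covers it, which is the property the abstract relies on for locating key regions.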

Key words: HyperColumn, attention mechanism, recurrent network, fine-grained, Beijing Opera character recognition

