Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (10): 3006-3011. DOI: 10.11772/j.issn.1001-9081.2017.10.3006

Face annotation in news images based on multi-modal information fusion

ZHENG Cha, JI Lixin, LI Shaomei, GAO Chao   

  1. National Digital Switching System Engineering & Technological Research Center, Zhengzhou Henan 450000, China
  • Received:2017-04-26 Revised:2017-06-16 Online:2017-10-10 Published:2017-10-16
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61601513).

基于多模态信息融合的新闻图像人脸标注

征察, 吉立新, 李邵梅, 高超   

  1. 国家数字交换系统工程技术研究中心, 郑州 450000
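  • Corresponding author: ZHENG Cha (born 1994), male, from Suzhou, Anhui; M.S. candidate; main research interests: computer vision, cross-media information processing. E-mail: zcpi31415926@163.com
  • About the authors: ZHENG Cha (born 1994), male, from Suzhou, Anhui; M.S. candidate; main research interests: computer vision, cross-media information processing. JI Lixin (born 1969), male, from Zhengzhou, Henan; research fellow, Ph.D.; main research interests: communication and information systems. LI Shaomei (born 1982), female, from Zhongxiang, Hubei; associate research fellow, Ph.D.; main research interests: digital image processing, pattern recognition. GAO Chao (born 1982), male, from Xinzheng, Henan; lecturer, Ph.D.; main research interests: computer vision, machine learning.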
  • 基金资助:
    国家自然科学基金资助项目(61601513)。

Abstract: Traditional face annotation methods for news images rely mainly on face similarity, so they are weak both at distinguishing non-noise faces from noise faces and at annotating non-noise faces. To address this issue, a face annotation method based on multi-modal information fusion was proposed. Firstly, according to the co-occurrence relations between faces and names, face-name match degrees based on face similarity were obtained with a modified K-Nearest Neighbor (KNN) algorithm. Then, face importance degrees were characterized by the size and position of faces extracted from the images, and name importance degrees were characterized by the positions of names extracted from the text. Finally, a Back Propagation (BP) neural network was applied to fuse this information and infer the face labels, and a label correction strategy was proposed to further improve the annotation results. Experimental results on the Label Yahoo! News dataset show that the accuracy, precision and recall of the proposed method reach 77.11%, 73.58% and 78.75% respectively; compared with methods based only on face similarity, the proposed method is better at distinguishing non-noise faces from noise faces and at annotating non-noise faces.

Key words: news image, face annotation, K-Nearest Neighbor (KNN) algorithm, multi-modal information, Back Propagation (BP) neural network
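
The abstract names the cues that are fused (a KNN-based face-name match degree, face size and position, and name position in the text) but gives no formulas. The sketch below is only one possible way to compute such cues and stack them into a feature vector that a small neural network could take as input. Everything in it is an assumption made for illustration: the function names, the normalizations, and the plain majority-style KNN vote, which stands in for the paper's modified KNN and is not the authors' algorithm.

```python
import math


def face_importance(box, image_size):
    """Return (relative_area, centrality) for a face box (x, y, w, h).

    Assumed definition: larger and more central faces count as more important.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    relative_area = (w * h) / (img_w * img_h)
    # Distance from the face center to the image center,
    # scaled by the half-diagonal so that centrality lies in [0, 1].
    dx = (x + w / 2) - img_w / 2
    dy = (y + h / 2) - img_h / 2
    half_diag = math.hypot(img_w / 2, img_h / 2)
    centrality = 1.0 - math.hypot(dx, dy) / half_diag
    return relative_area, centrality


def name_importance(caption, name):
    """Assumed definition: names mentioned earlier in the text score higher."""
    pos = caption.find(name)
    if pos < 0 or not caption:
        return 0.0  # name not mentioned at all
    return 1.0 - pos / len(caption)


def knn_match_degree(query, gallery, name, k=3):
    """Fraction of the k nearest gallery faces that co-occur with `name`.

    gallery: list of (feature_vector, set_of_co_occurring_names).
    This is an unmodified KNN vote, used here as a stand-in.
    """
    ranked = sorted(gallery, key=lambda item: math.dist(query, item[0]))
    neighbours = ranked[:k]
    if not neighbours:
        return 0.0
    votes = sum(1 for _, names in neighbours if name in names)
    return votes / len(neighbours)


def fusion_features(query, gallery, box, image_size, caption, name, k=3):
    """Stack the cues into one vector for a (face, name) candidate pair."""
    area, centrality = face_importance(box, image_size)
    return [
        knn_match_degree(query, gallery, name, k),
        area,
        centrality,
        name_importance(caption, name),
    ]
```

A vector like this would be computed for every candidate (face, name) pair in an image-caption item; the fusion network and the label correction step described in the abstract are not reproduced here.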

摘要: 针对传统新闻图像中人脸标注方法主要依赖人脸相似度信息,分辨噪声和非噪声人脸能力以及非噪声人脸标注能力较差的问题,提出一种基于多模态信息融合的新闻图像人脸标注方法。首先根据人脸和姓名的共现关系,利用改进的K近邻算法,获得基于人脸相似度信息的人脸姓名匹配度;然后,分别从图像中提取人脸大小和位置的信息对人脸重要程度进行表征,从文本中提取姓名位置信息对姓名重要程度进行表征;最后,使用反向传播神经网络来融合上述信息完成人脸标签的推理,并提出一个标签修正策略来进一步改善标注结果。在Label Yahoo! News数据集上的测试效果表明,所提方法的标注准确率、精度和召回率分别达到了77.11%、73.58%和78.75%,与仅基于人脸相似度的算法相比,具有较好的分辨噪声和非噪声人脸能力以及非噪声人脸标注能力。

关键词: 新闻图像, 人脸标注, K近邻算法, 多模态信息, 反向传播神经网络
