Journal of Computer Applications, 2023, Vol. 43, Issue (3): 752-758. DOI: 10.11772/j.issn.1001-9081.2022010053

• Artificial intelligence

Table structure recognition model integrating edge features and attention

Xueqiang LYU1, Yunan ZHANG1, Jing HAN1, Yunpeng CUI2, Huan LI2

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (Beijing Information Science and Technology University), Beijing 100101, China
  2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture (Agricultural Information Institute of Chinese Academy of Agricultural Science), Beijing 100081, China
  • Received: 2022-01-17; Revised: 2022-04-06; Accepted: 2022-04-11; Online: 2022-04-26; Published: 2023-03-10
  • Contact: Jing HAN
  • About the authors: LYU Xueqiang, born in 1970, Ph.D., professor. His research interests include multimedia information processing.
    ZHANG Yunan, born in 1996, M.S. candidate. His research interests include computer vision.
    HAN Jing, born in 1990, Ph.D., assistant research fellow. Her research interests include image processing.
    CUI Yunpeng, born in 1972, Ph.D., research fellow. His research interests include agricultural information technology.
    LI Huan, born in 1992, M.S., research fellow. Her research interests include data mining.
  • Supported by:
    National Natural Science Foundation of China (62171043)

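Table structure recognition model integrating edge features and attention

LYU Xueqiang (吕学强)1, ZHANG Yunan (张煜楠)1, HAN Jing (韩晶)1, CUI Yunpeng (崔运鹏)2, LI Huan (李欢)2

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (Beijing Information Science and Technology University), Beijing 100101
  2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs (Agricultural Information Institute of Chinese Academy of Agricultural Sciences), Beijing 100081
  • Corresponding author: HAN Jing
  • About the authors: LYU Xueqiang (born 1970), male, from Fushun, Liaoning, professor, Ph.D. Main research interest: multimedia information processing.
    ZHANG Yunan (born 1996), male, from Beijing, M.S. candidate. Main research interest: computer vision.
    HAN Jing (born 1990), female, from Handan, Hebei, assistant research fellow, Ph.D. Main research interest: image processing.
    CUI Yunpeng (born 1972), male, from Helong, Jilin, research fellow, Ph.D. Main research interest: agricultural information technology.
    LI Huan (born 1992), female, from Hong'an, Hubei, research fellow, M.S. Main research interest: data mining.
  • Funding:
    National Natural Science Foundation of China (62171043)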

Abstract:

To address the dependence on prior knowledge, insufficient robustness, and limited expressive ability of existing table structure recognition methods, a new table structure recognition model integrating edge features and attention was proposed, namely the Graph Edge-Attention Network based Table Structure Recognition model (GEAN-TSR). Firstly, the Graph Edge-Attention Network (GEAN) was proposed as the backbone network: on the basis of the edge convolution structure, the graph attention mechanism was introduced and improved to aggregate graph node features, so as to address the information loss that occurs during feature extraction in the graph network and to improve its expressive ability. Then, an edge feature fusion module was introduced to fuse shallow graph node information with the graph network output, enhancing the local information extraction and expressive abilities of the graph network. Finally, the graph node text features extracted by a Gated Recurrent Unit (GRU) were integrated through a text feature fusion module to classify and predict edges. Comparative experiments on the Scientific paper Table Structure Recognition-COMPlicated (SciTSR-COMP) dataset show that, compared with the existing optimal model Split, Embed and Merge (SEM), GEAN-TSR increases recall and F1 score by 2.5 and 1.4 percentage points, respectively. Ablation experiments show that GEAN-TSR achieves the optimal values on all indicators after the feature fusion module is used, which verifies the effectiveness of the module. Experimental results show that GEAN-TSR effectively improves network performance and better completes the table structure recognition task.
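
The attention-based aggregation over edge-convolution features described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration only: the EdgeConv-style edge feature [x_i, x_j - x_i], the LeakyReLU-scored softmax weighting in the style of graph attention networks, and the hypothetical names `edge_feature`, `attention_aggregate`, and `attn_vec`. Learned linear layers are omitted, and this is not the GEAN implementation.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation (slope value is an assumption)."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def edge_feature(x_i, x_j):
    """Assumed EdgeConv-style edge feature: [x_i, x_j - x_i]."""
    return list(x_i) + [b - a for a, b in zip(x_i, x_j)]


def attention_aggregate(x_i, neighbors, attn_vec):
    """Aggregate edge features of node i with attention weights.

    A max-pooling aggregator would keep only the largest value per
    dimension; a weighted sum lets every neighbor contribute.
    `attn_vec` is a hypothetical scoring vector (learned in practice).
    """
    edges = [edge_feature(x_i, x_j) for x_j in neighbors]
    scores = [
        leaky_relu(sum(a * e for a, e in zip(attn_vec, edge)))
        for edge in edges
    ]
    weights = softmax(scores)
    dim = len(edges[0])
    aggregated = [
        sum(w * edge[k] for w, edge in zip(weights, edges))
        for k in range(dim)
    ]
    return aggregated, weights


if __name__ == "__main__":
    node = [0.0, 0.0]
    neighbors = [[1.0, 0.0], [0.0, 1.0]]
    aggregated, weights = attention_aggregate(node, neighbors, [0.0] * 4)
    print(weights, aggregated)
```

With an all-zero scoring vector every neighbor receives the same weight, so the result is the mean of the edge features; a non-zero `attn_vec` shifts the weight toward the neighbors it scores highest.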

Key words: graph neural network, graph attention network, feature fusion, table structure recognition, table parsing

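Abstract:

To address the problems of existing methods in table structure recognition, such as dependence on prior knowledge, insufficient robustness, and insufficient expressive ability, a new table structure recognition model integrating edge features and attention, GEAN-TSR, is proposed. First, the Graph Edge-Attention Network (GEAN) is proposed and used as the backbone network of the model; on the basis of the edge convolution structure, the graph attention mechanism is introduced and improved to aggregate graph node features, which addresses the information loss of the graph network during feature extraction and improves its expressive ability. Then, an edge feature fusion module is introduced to fuse shallow graph node information with the graph network output, strengthening the local information extraction and expressive abilities of the graph network. Finally, the graph node text features extracted by a Gated Recurrent Unit (GRU) are integrated through a text feature fusion module to classify and predict edges. In comparative experiments on the SciTSR-COMP dataset, the recall and F1 score of GEAN-TSR are 2.5 and 1.4 percentage points higher, respectively, than those of SEM, the current best model. In ablation experiments, all indicators of GEAN-TSR reach their optimal values after the feature fusion module is adopted, which verifies the effectiveness of the module. Experimental results show that GEAN-TSR effectively improves network performance and better completes the table structure recognition task.

Key words: graph neural network, graph attention network, feature fusion, table structure recognition, table parsing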
