Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1461-1466. DOI: 10.11772/j.issn.1001-9081.2022040641

• Artificial Intelligence •

Agricultural news text classification based on ERNIE+DPCNN+BiGRU

YANG Senqi1,2, DUAN Xuliang1,2, XIAO Zhan1,2, LANG Songsong1,2, LI Zhiyong1,2

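  1. College of Information Engineering, Sichuan Agricultural University, Ya'an, Sichuan 625014
  2. Agricultural Information Engineering Laboratory, Sichuan Agricultural University, Ya'an, Sichuan 625014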
  • Received: 2022-05-07 Revised: 2022-07-15 Accepted: 2022-07-22 Online: 2022-08-12 Published: 2023-05-10
  • Corresponding author: DUAN Xuliang
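  • About the authors: YANG Senqi (born 1997), male, from Langfang, Hebei, M.S. candidate. Research interests: natural language processing.
    DUAN Xuliang (born 1982), male, from Tangshan, Hebei, associate professor, M.S. Research interests: smart agriculture, data mining, data cleaning. duanxuliang@sicau.edu.cn
    XIAO Zhan (born 2000), male, from Bazhong, Sichuan, M.S. candidate. Research interests: natural language processing.
    LANG Songsong (born 1997), male, from Dazhou, Sichuan, M.S. candidate. Research interests: computer vision, object detection.
    LI Zhiyong (born 1985), male, from Meishan, Sichuan, associate professor, Ph.D. Research interests: agricultural information processing, intelligent decision-making.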
  • Supported by:
    Natural Science Foundation of Sichuan Province (2022NSFSC0172)

Text classification of agricultural news based on ERNIE+DPCNN+BiGRU

Senqi YANG1,2, Xuliang DUAN1,2, Zhan XIAO1,2, Songsong LANG1,2, Zhiyong LI1,2

  1. College of Information Engineering, Sichuan Agricultural University, Ya'an, Sichuan 625014, China
  2. Agricultural Information Engineering Laboratory, Sichuan Agricultural University, Ya'an, Sichuan 625014, China
  • Received: 2022-05-07 Revised: 2022-07-15 Accepted: 2022-07-22 Online: 2022-08-12 Published: 2023-05-10
  • Contact: Xuliang DUAN
  • About the authors: YANG Senqi, born in 1997, M. S. candidate. His research interests include natural language processing.
    DUAN Xuliang, born in 1982, M. S., associate professor. His research interests include smart agriculture, data mining, data cleaning.
    XIAO Zhan, born in 2000, M. S. candidate. His research interests include natural language processing.
    LANG Songsong, born in 1997, M. S. candidate. His research interests include computer vision, object detection.
    LI Zhiyong, born in 1985, Ph. D., associate professor. His research interests include agricultural information processing, intelligent decision-making.
  • Supported by:
    Natural Science Foundation of Sichuan Province (2022NSFSC0172)

Abstract:

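To address the poor targeting, unclear categorization and lack of datasets that agricultural news currently faces, an agricultural news classification model named EGC is proposed, built on ERNIE (Enhanced Representation through kNowledge IntEgration), the Deep Pyramid Convolutional Neural Network (DPCNN) and the Bidirectional Gated Recurrent Unit (BiGRU). ERNIE first encodes the dataset; the improved DPCNN and the BiGRU then extract features from the news text in parallel; the two sets of features are concatenated and passed through Softmax to produce the final result. To adapt EGC to agricultural news classification, DPCNN is modified by reducing its number of convolutional layers so that more features are retained. Experimental results show that, compared with ERNIE, EGC improves precision, recall and F1 score by 1.47, 1.29 and 1.42 percentage points respectively, and outperforms traditional classification models.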

Keywords: news text classification, agricultural engineering, ERNIE, Deep Pyramid Convolutional Neural Network, Bidirectional Gated Recurrent Unit
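The fusion step described in the abstract (concatenating the DPCNN and BiGRU features and applying Softmax) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the branch outputs are placeholder vectors rather than real activations, the single linear layer before Softmax is an assumption, and the names `fuse_and_classify`, `W` and `b` are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(
    dpcnn_features: Sequence[float],
    bigru_features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Concatenate the two branch outputs, apply one linear layer, then softmax.

    The linear layer is an assumption; the abstract only states that the
    concatenated features pass through Softmax.
    """
    fused = list(dpcnn_features) + list(bigru_features)
    logits = []
    for row, offset in zip(weights, bias):
        if len(row) != len(fused):
            raise ValueError("each weight row must match the fused feature length")
        logits.append(sum(w * x for w, x in zip(row, fused)) + offset)
    return softmax(logits)


# Placeholder branch outputs for one news item (not real model activations).
dpcnn_out = [0.2, -0.1, 0.7]
bigru_out = [0.5, 0.3]
# Hypothetical 2-class linear layer over the 5-dimensional fused vector.
W = [[0.4, 0.1, -0.3, 0.8, 0.2], [-0.2, 0.5, 0.9, -0.1, 0.3]]
b = [0.0, 0.1]
probs = fuse_and_classify(dpcnn_out, bigru_out, W, b)
```

In a real model, both branches would consume the same ERNIE token embeddings, and the fused vector would typically be much wider than the five dimensions used here.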

Abstract:

To address the poor targeting, unclear classification and lack of datasets in agricultural news, an agricultural news classification model called EGC was proposed, based on Enhanced Representation through kNowledge IntEgration (ERNIE), Deep Pyramid Convolutional Neural Network (DPCNN) and Bidirectional Gated Recurrent Unit (BiGRU). The dataset was first encoded with ERNIE; the features of the news text were then extracted simultaneously by the improved DPCNN and the BiGRU; finally, the two sets of extracted features were concatenated and passed through Softmax to obtain the classification result. To make the EGC model better suited to agricultural news classification, the DPCNN was improved by reducing its number of convolution layers so as to preserve more features. Experimental results show that, compared with ERNIE, the proposed EGC model improves precision, recall and F1 score by 1.47, 1.29 and 1.42 percentage points, respectively, verifying that EGC outperforms traditional classification models.

Key words: text classification of news, agricultural engineering, Enhanced Representation through kNowledge IntEgration (ERNIE), Deep Pyramid Convolutional Neural Network (DPCNN), Bidirectional Gated Recurrent Unit (BiGRU)
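The reported gains are stated in percentage points, that is, as absolute differences between two rates rather than relative changes. The sketch below shows how precision, recall and F1 could be computed and compared on that basis. It assumes macro averaging over classes, which the abstract does not specify, and the labels and predictions are toy data, not results from the paper.

```python
from typing import Dict, Sequence


def macro_prf(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Macro-averaged precision, recall and F1 (averaging scheme is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def percentage_point_gain(baseline: float, candidate: float) -> float:
    """Absolute difference between two rates, expressed in percentage points."""
    return (candidate - baseline) * 100


# Toy labels and predictions (hypothetical categories, not the paper's dataset).
truth = ["crops", "crops", "livestock", "policy"]
baseline_pred = ["crops", "livestock", "livestock", "policy"]
candidate_pred = ["crops", "crops", "livestock", "policy"]

baseline = macro_prf(truth, baseline_pred)
candidate = macro_prf(truth, candidate_pred)
f1_gain = percentage_point_gain(baseline["f1"], candidate["f1"])
```

A gain of 1.42 percentage points in F1 therefore means the candidate's F1, expressed as a percentage, is 1.42 higher than the baseline's, regardless of the baseline's absolute level.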

CLC Number: