Journal of Computer Applications, 2025, Vol. 45, Issue (4): 1356-1362. DOI: 10.11772/j.issn.1001-9081.2024040533

• Frontier and comprehensive applications •

Power work order classification in substation area based on MiniRBT-LSTM-GAT and label smoothing

Jiaxin LI, Site MO

  1. College of Electrical Engineering, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2024-04-28 Revised: 2024-08-14 Accepted: 2024-08-16 Online: 2025-04-08 Published: 2025-04-10
  • Contact: Site MO
  • About author: LI Jiaxin, born in 1999, M. S. candidate. Her research interests include power system data analysis.
  • Supported by:
    Science and Technology Project of State Grid Corporation of China (52199922000M)

基于MiniRBT-LSTM-GAT与标签平滑的台区电力工单分类

李嘉欣, 莫思特

  1. 四川大学 电气工程学院,成都 610065
  • 通讯作者: 莫思特
  • 作者简介:李嘉欣(1999—),女,山西长治人,硕士研究生,主要研究方向:电力系统数据分析
  • 基金资助:
    国家电网有限公司科技项目(52199922000M)

Abstract:

Power work order records in substation areas reflect the areas' operating conditions and user requirements, and are an important basis for establishing electricity safety management systems and meeting users' electricity demands. To address the difficulties that the high complexity and strongly specialized content of these work orders bring to their classification, a classification model for power work orders in substation areas, Mini RoBERTa-Long Short-Term Memory-Graph Attention neTwork (MiniRBT-LSTM-GAT), was proposed, which integrates Label Smoothing (LS) with a pre-trained language model. Firstly, the pre-trained model was used to compute character-level feature vector representations of the power work order text. Secondly, a Bidirectional Long Short-Term Memory (BiLSTM) network was employed to capture dependencies within the power text sequence. Thirdly, a Graph Attention neTwork (GAT) was applied to focus on the feature information that contributes most to text classification. Finally, LS was used to modify the loss function to improve the classification accuracy of the model. The proposed model was compared with mainstream text classification algorithms on the Rural power Station area Power Work Order dataset (RSPWO), the ZheJiang province 95598 Power Work Order dataset (ZJPWO), and the THUCNews (TsingHua University Chinese News) dataset. Experimental results show that, compared with EPAT-BERT (Bidirectional Encoder Representations from Transformers model for Electric Power Audit Text classification), the proposed model improves precision by 2.76 percentage points and F1 value by 2.02 percentage points on RSPWO, and improves precision by 1.77 percentage points and F1 value by 1.40 percentage points on ZJPWO. Compared with BRsyn-caps (capsule network based on BERT and dependency syntax), the proposed model improves precision by 0.76 percentage points and accuracy by 0.71 percentage points on the THUCNews dataset. These results confirm that the proposed model effectively improves the classification of power work orders in substation areas, and its good performance on the THUCNews dataset verifies the generality of the model.
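
The LS step in the abstract can be illustrated with a minimal sketch. The abstract does not state which smoothing variant or smoothing factor the authors used, so the uniform-smoothing form and the value `epsilon=0.1` below are assumptions, and the function names are hypothetical. The sketch replaces the one-hot target with a softened distribution and computes the cross-entropy against it.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Soft target: eps/K on every class, plus (1 - eps) on the true class.

    Uniform smoothing and epsilon=0.1 are assumptions, not values from the paper.
    """
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def label_smoothing_loss(logits, true_class, epsilon=0.1):
    """Cross-entropy between the smoothed target and the predicted distribution."""
    targets = smooth_labels(true_class, len(logits), epsilon)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


if __name__ == "__main__":
    # Hypothetical logits for a 4-class work order example; class 2 is the true label.
    example_logits = [0.2, -1.0, 3.1, 0.5]
    print("hard-label loss:", label_smoothing_loss(example_logits, 2, epsilon=0.0))
    print("smoothed loss:  ", label_smoothing_loss(example_logits, 2, epsilon=0.1))
```

With `epsilon=0.0` the function reduces to the ordinary cross-entropy. A positive `epsilon` penalizes over-confident predictions, which is the usual motivation for modifying the loss in this way.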

Key words: power work order in substation area, text classification, pre-trained model, Graph Attention neTwork (GAT), Label Smoothing (LS)

摘要:

台区电力工单记录反映了台区运行工况和用户需求,是制定台区用电安全管理制度和满足台区用户用电需求的重要依据。针对台区电力工单高复杂性和强专业性给台区工单分类带来的难题,提出一种融合标签平滑(LS)与预训练语言模型的台区电力工单分类模型(MiniRBT-LSTM-GAT)。首先,利用预训练模型计算电力工单文本中的字符级特征向量表示;其次,采用双向长短期记忆网络(BiLSTM)捕捉电力文本序列中的依赖关系;再次,通过图注意力网络(GAT)聚焦对文本分类贡献大的特征信息;最后,利用LS改进损失函数以提高模型的分类精度。所提模型与当前主流的文本分类算法在农网台区电力工单数据集(RSPWO)、浙江省95598电力工单数据集(ZJPWO)和THUCNews(TsingHua University Chinese News)数据集上的实验结果表明,与电力审计文本多粒度预训练语言模型(EPAT-BERT)相比,所提模型在RSPWO、ZJPWO上的查准率和F1值分别提升了2.76、2.02个百分点和1.77、1.40个百分点;与胶囊神经网络模型BRsyn-caps(capsule network based on BERT and dependency syntax)相比,所提模型在THUCNews数据集上的查准率和准确率分别提升了0.76和0.71个百分点。可见,所提模型有效提升了台区电力工单分类的性能,并在THUCNews数据集上表现良好,验证了模型的通用性。

关键词: 台区电力工单, 文本分类, 预训练模型, 图注意力网络, 标签平滑
