Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (4): 1104-1112. DOI: 10.11772/j.issn.1001-9081.2024030386
Jie HU1,2,3, Qiyang ZHENG1, Jun SUN1,2,3, Yan ZHANG1,2,3
Received: 2024-04-08
Revised: 2024-07-04
Accepted: 2024-07-17
Online: 2024-10-12
Published: 2025-04-10
Contact: Jun SUN
About author: HU Jie, born in 1977, Ph.D., professor. Her research interests include complex semantic big data management and natural language processing.
Supported by:
Abstract: In multi-label classification tasks, existing models build label dependencies mainly from whether labels co-occur in the training set, overlooking the different types of relations between labels and their dynamic interactions across samples. To address this, a multi-label relation graph and a local dynamic reconstruction graph were combined to learn more complete label dependencies. First, a multi-label relation graph was constructed in a data-driven way from global label co-occurrence, learning the different types of dependencies among labels; next, a label attention mechanism was used to explore the correlation between text information and label semantics; finally, dynamic reconstruction learning was applied to the label graph to capture locally specific relations among labels. Experimental results on three public datasets, BibTeX, Delicious, and Reuters-21578, show that the proposed model improves the macro-averaged F1 (maF1) score over MrMP (Multi-relation Message Passing) by 1.6, 1.0, and 2.2 percentage points respectively, improving overall performance.
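The "data-driven" construction of the multi-label relation graph starts from global label co-occurrence statistics in the training set. As an illustration only (the paper's exact graph construction is not reproduced on this page), a global co-occurrence count matrix can be built directly from the binary label matrix:

```python
import numpy as np

def cooccurrence_matrix(Y: np.ndarray) -> np.ndarray:
    """Count global label co-occurrence from a binary label matrix.

    Y has shape (num_samples, num_labels); Y[i, j] = 1 if sample i
    carries label j. Entry (j, k) of the result counts how often
    labels j and k appear together in the training set; the diagonal
    holds each label's overall frequency.
    """
    return Y.T @ Y

# Toy example: 4 samples, 3 labels.
Y = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
C = cooccurrence_matrix(Y)
# C[0, 1] == 2: labels 0 and 1 co-occur in two samples.
```

The resulting symmetric matrix is the raw statistic from which edge types and weights of the relation graph can then be derived.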
Jie HU, Qiyang ZHENG, Jun SUN, Yan ZHANG. Multi-label classification model based on multi-label relational graph and local dynamic reconstruction learning[J]. Journal of Computer Applications, 2025, 45(4): 1104-1112.
| Type | BibTeX | Delicious | Reuters-21578 |
| --- | --- | --- | --- |
| Text input type | Binary Vector | Binary Vector | Sequential |
| Training samples | 4 377 | 11 597 | 6 993 |
| Validation samples | 487 | 1 289 | 777 |
| Test samples | 2 515 | 3 185 | 3 019 |
| Labels | 159 | 983 | 90 |
| Input features | 1 836 | 500 | 23 662 |

Tab. 1 Experimental dataset description
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Transformer hidden size | 512 | Learning rate | 0.000 2 |
| GCN embedding dimension | 512 | Weight decay | 0.01 |
| GCN layers | 2 | Optimizer | Adam |
| Batch size | 32 | Dropout (Delicious) | 0.1 |
| Attention heads | 4 | Dropout (BibTeX and Reuters-21578) | 0.2 |
| Encoder layers | 2 | Noise threshold | 0.005 |
| Epoch | 50 |  |  |

Tab. 2 Parameter setting
| Model | BibTeX |  |  |  | Delicious |  |  |  | Reuters-21578 |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | ACC | ebF1 | maF1 | miF1 | ACC | ebF1 | maF1 | miF1 | ACC | ebF1 | maF1 | miF1 |
| ML-KNN | 7.4 | 20.5 | 13.4 | 26.8 | 0.3 | 22.3 | 8.0 | 24.5 | 65.1 | 71.0 | 25.3 | 73.2 |
| ML-ARAM | 10.7 | 25.8 | 9.7 | 23.8 | 0.5 | 15.5 | 4.1 | 16.7 | 47.4 | 67.3 | 16.3 | 62.8 |
| LaMP | 18.5 | 44.7 | 37.6 | 47.3 | 0.6 | 37.2 | 19.6 | 38.6 | 83.5 | 90.6 | 56.0 | 88.9 |
| MPVAE | 17.9 | 45.3 | 38.6 | 47.5 | 0.0 | 37.3 | 18.1 | 38.4 | 81.6 | 89.8 | 54.2 | 88.7 |
| HOT-VAE | — | — | — | — |  |  |  |  | 84.0 | 91.2 | 57.4 | 89.1 |
| MrMP | 19.9 | 46.0 | 39.3 | 48.1 |  |  |  |  |  |  |  |  |
| CFTC | — | — | — | — | — | — | — | — | 83.3 | 90.0 | 55.7 | 88.8 |
| Proposed model | 21.3 | 47.9 | 40.9 | 48.9 | 0.8 | 38.9 | 20.9 | 41.0 | 85.6 | 92.4 | 61.3 | 90.5 |
| Difference | +6.0 | +2.1 | +3.5 | +1.5 | +14.3 | +3.2 | +5.0 | +4.9 | +1.4 | +1.1 | +3.7 | +1.3 |

Tab. 3 Evaluation index comparison among different models on BibTeX, Delicious, and Reuters-21578 datasets (%)
| Model | BibTeX |  |  |  | Delicious |  |  |  | Reuters-21578 |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | ACC | ebF1 | maF1 | miF1 | ACC | ebF1 | maF1 | miF1 | ACC | ebF1 | maF1 | miF1 |
| Proposed model | 21.3 | 47.9 | 40.3 | 48.9 | 0.8 | 38.9 | 20.9 | 40.0 | 85.6 | 92.4 | 61.3 | 90.5 |
| w/o local dynamic reconstruction graph | 19.8 | 45.8 | 39.0 | 47.8 | 0.7 | 37.5 | 19.7 | 39.0 | 84.0 | 91.0 | 58.8 | 89.0 |
| w/o multi-label relation graph | 20.1 | 46.3 | 39.4 | 48.2 | 0.7 | 38.0 | 20.2 | 39.3 | 84.6 | 91.6 | 59.5 | 89.5 |
| w/o label attention mechanism | 20.4 | 46.8 | 39.9 | 48.5 | 0.7 | 38.2 | 20.5 | 39.6 | 85.0 | 92.1 | 60.4 | 89.9 |
| w/o all three | 19.0 | 45.2 | 38.3 | 47.1 | 0.6 | 36.9 | 19.1 | 38.5 | 83.2 | 89.8 | 57.9 | 88.2 |

Tab. 4 Ablation experimental results (%)
| Number of GCN layers | ACC/% | ebF1/% | maF1/% | miF1/% |
| --- | --- | --- | --- | --- |
| 1 | 85.4 | 92.0 | 61.0 | 90.2 |
| 2 | 85.6 | 92.4 | 61.3 | 90.5 |
| 3 | 85.0 | 91.6 | 59.7 | 89.7 |
| 4 | 84.6 | 91.0 | 58.9 | 89.1 |

Tab. 5 Comparison of experimental results with different numbers of GCN layers
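Table 5 shows performance peaking at two GCN layers and degrading with deeper stacks, consistent with the over-smoothing commonly seen in graph convolutional networks. A generic sketch of stacked GCN propagation with symmetric normalization (a standard Kipf-Welling-style formulation, not necessarily the paper's exact layer definition):

```python
import numpy as np

def normalize_adj(A: np.ndarray) -> np.ndarray:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_forward(A: np.ndarray, X: np.ndarray, weights: list) -> np.ndarray:
    """Stack len(weights) GCN layers; ReLU between layers, none after the last."""
    H = X
    A_norm = normalize_adj(A)
    for i, W in enumerate(weights):
        H = A_norm @ H @ W
        if i < len(weights) - 1:
            H = np.maximum(H, 0.0)  # ReLU on hidden layers
    return H
```

With two weight matrices this yields the two-layer depth that Table 5 finds optimal; each added layer mixes features one hop further over the label graph, which is why too many layers blur label embeddings together.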
| Modeling method | ACC | ebF1 | maF1 | miF1 |
| --- | --- | --- | --- | --- |
| Conditional probability modeling | 84.8 | 91.2 | 60.6 | 89.7 |
| Data-driven modeling | 85.6 | 92.4 | 61.3 | 90.5 |

Tab. 6 Comparison of experimental results of different correlation matrix construction methods (%)
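Table 6 contrasts conditional-probability modeling of the label correlation matrix with the data-driven construction. A toy contrast on a binary label matrix (the normalizations below are illustrative assumptions, not the paper's exact definitions): conditional probabilities P(k | j) are asymmetric and thus define a directed graph, while a frequency-normalized co-occurrence statistic is symmetric:

```python
import numpy as np

# 4 training samples x 3 labels (binary indicator matrix).
Y = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]], dtype=float)

C = Y.T @ Y                      # raw co-occurrence counts
P = C / np.diag(C)[:, None]      # P[j, k] = P(label k | label j): asymmetric
M = C / len(Y)                   # symmetric co-occurrence frequency
```

The asymmetry of `P` (e.g. a rare label almost always implies a frequent one, but not vice versa) is exactly what makes conditional-probability graphs directed and sensitive to label frequency imbalance.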
| Type | ACC | ebF1 | maF1 | miF1 |
| --- | --- | --- | --- | --- |
| With excitation-inhibition relation edges | 85.6 | 92.4 | 61.3 | 90.5 |
| Without excitation-inhibition relation edges | 84.6 | 91.2 | 60.1 | 89.3 |

Tab. 7 Comparison of experimental results of models with and without excitation-inhibition relation edges (%)
| Model | ACC | ebF1 | maF1 | miF1 |
| --- | --- | --- | --- | --- |
| Proposed model | 85.6 | 92.4 | 61.3 | 90.5 |
| Without the embedding distance loss | 81.2 | 91.0 | 59.7 | 88.6 |
| Without the cross-entropy loss | 80.1 | 89.9 | 58.8 | 87.4 |

Tab. 8 Comparison of experimental results of different loss functions (%)
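Table 8 indicates the model is trained with both a cross-entropy loss and an embedding distance loss, and that removing either hurts all four metrics. A generic sketch of such a combined objective (the weighting factor `lam` and the squared-distance form are assumptions; the paper's exact formulation is not given on this page):

```python
import numpy as np

def binary_cross_entropy(y_true: np.ndarray, p: np.ndarray, eps: float = 1e-9) -> float:
    """Multi-label cross-entropy: mean BCE over all label slots."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def embedding_distance(E_a: np.ndarray, E_b: np.ndarray) -> float:
    """Mean squared distance between two sets of label embeddings."""
    return float(np.mean((E_a - E_b) ** 2))

def total_loss(y_true, p, E_a, E_b, lam: float = 1.0) -> float:
    # Classification loss plus an embedding-distance regularizer.
    return binary_cross_entropy(y_true, p) + lam * embedding_distance(E_a, E_b)
```

The ablation pattern in Table 8 (larger drop without cross-entropy than without the distance term) fits this reading: cross-entropy carries the classification signal, while the distance term regularizes the label embedding space.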
| Noise threshold | ACC/% | ebF1/% | maF1/% | miF1/% |
| --- | --- | --- | --- | --- |
| 0.010 | 83.3 | 90.7 | 58.5 | 88.9 |
| 0.005 | 85.6 | 92.4 | 61.3 | 90.5 |
| 0.001 | 84.5 | 91.3 | 59.8 | 89.7 |

Tab. 9 Comparison of experimental results of different noise thresholds
| Feature type | ACC | ebF1 | maF1 | miF1 |
| --- | --- | --- | --- | --- |
| Global co-occurrence features | 83.0 | 89.8 | 57.5 | 87.9 |
| Local dynamic features | 83.5 | 90.3 | 58.2 | 88.5 |
| Global + local features | 85.0 | 92.1 | 60.4 | 89.9 |

Tab. 10 Comparison of experimental results of different features (%)
| Reference label | Visualization result |
| --- | --- |
| money-fx | JAPAN STILL WANTS SPECULATIVE DLR DEALS LIMITED The Finance Ministry is still asking financial institutions to limit speculative dollar dealings, Finance Minister Kiichi Miyazawa told reporters. He was responding to rumours in the New York currency market overnight that the Ministry was reducing its pressure on institutions to refrain from excessively speculative dollar dealings. |
| dlr | JAPAN STILL WANTS SPECULATIVE DLR DEALS LIMITED The Finance Ministry is still asking financial institutions to limit speculative dollar dealings, Finance Minister Kiichi Miyazawa told reporters. He was responding to rumours in the New York currency market overnight that the Ministry was reducing its pressure on institutions to refrain from excessively speculative dollar dealings. |

Tab. 11 Case study of label attention weight visualization
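Table 11 visualizes which tokens each label attends to (the per-token highlighting does not survive in this text rendering, so both rows show the raw passage). A generic scaled dot-product sketch of per-label attention weights over token embeddings — illustrative only, not necessarily the paper's exact attention formulation:

```python
import numpy as np

def label_attention_weights(label_emb: np.ndarray, token_emb: np.ndarray) -> np.ndarray:
    """Each row: one label's softmax attention over the document's tokens.

    label_emb: (num_labels, d), token_emb: (num_tokens, d).
    Rows sum to 1; the high-weight tokens are the ones a label
    'attends' to, which is what Table 11 visualizes.
    """
    d = label_emb.shape[1]
    scores = label_emb @ token_emb.T / np.sqrt(d)   # scaled dot product
    scores -= scores.max(axis=1, keepdims=True)     # stabilize softmax
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
W = label_attention_weights(rng.normal(size=(2, 8)), rng.normal(size=(5, 8)))
# W has shape (2 labels, 5 tokens); each row is a distribution over tokens.
```

In the Table 11 example, the rows for `money-fx` and `dlr` would place their mass on different spans of the same Reuters passage, which is the correlation between text information and label semantics the label attention mechanism is meant to expose.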
1 | LI B H, XIANG Y X, FENG D, et al. Short text classification model combining knowledge aware and dual attention [J]. Journal of Software, 2022, 33(10): 3565-3581. |
2 | LIU S M, CHEN J H. A multi-label classification based approach for sentiment classification [J]. Expert Systems with Applications, 2015, 42(3):1083-1093. |
3 | LANCHANTIN J, SEKHON A, QI Y. Neural message passing for multi-label classification [C]// Proceedings of the 2019 European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 11907. Cham: Springer, 2020: 138-163. |
4 | BAI J, KONG S, GOMES C. Disentangled variational autoencoder based multi-label classification with covariance-aware multivariate Probit model [C]// Proceedings of the 29th International Joint Conference on Artificial Intelligence Special Track on AI for CompSust and Human Well-being. California: ijcai.org, 2020: 4313-4321. |
5 | ZHANG M L, ZHOU Z H. ML-KNN: a lazy learning approach to multi-label learning [J]. Pattern Recognition, 2007, 40(7): 2038-2048. |
6 | READ J, PFAHRINGER B, HOLMES G. Multi-label classification using ensembles of pruned sets [C]// Proceedings of the 8th IEEE International Conference on Data Mining. Piscataway: IEEE, 2008: 995-1000. |
7 | WANG J, YANG Y, MAO J, et al. CNN-RNN: a unified framework for multi-label image classification [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2285-2294. |
8 | YU H F, JAIN P, KAR P, et al. Large-scale multi-label learning with missing labels [C]// Proceedings of the 31st International Conference on Machine Learning. New York: JMLR.org, 2014: 593-601. |
9 | CHEN C, WANG H, LIU W, et al. Two-stage label embedding via neural factorization machine for multi-label classification [C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019: 3304-3311. |
10 | WANG S J, LIU P F, QIU X P. Sequential graph neural networks for multi-label sequence labeling [J]. Journal of Chinese Information Processing, 2020, 34(6): 18-26. |
11 | YEH C K, WU W C, KO W J, et al. Learning deep latent spaces for multi-label classification [C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2017: 2838-2844. |
12 | LIU X L, LIU B S, WANG Y Y. Text multi-label learning method based on graph convolutional networks [J]. Journal of Chinese Computer Systems, 2021, 42(3): 531-535. |
13 | ZHAO W, KONG S, BAI J, et al. HOT-VAE: learning high-order label correlation for multi-label classification via attention-based variational autoencoders [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021: 15016-15024. |
14 | OZMEN M, ZHANG H, WANG P, et al. Multi-relation message passing for multi-label text classification [C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 3583-3587. |
15 | VASHISHTH S, SANYAL S, NITIN V, et al. Composition-based multi-relational graph convolutional networks [EB/OL]. [2023-09-27]. |
16 | OZMEN M, COTNAREANU J, COATES M. Substituting data annotation with balanced updates and collective loss in multi-label text classification [C]// Proceedings of the 2nd Conference on Lifelong Learning Agents. New York: JMLR.org, 2023: 909-922. |
17 | FAN C, CHEN W, TIAN J, et al. Accurate use of label dependency in multi-label text classification through the lens of causality [J]. Applied Intelligence, 2023, 53: 21841-21857. |
18 | KATAKIS I, TSOUMAKAS G, VLAHAVAS I. Multilabel text classification for automated tag suggestion [EB/OL]. [2023-09-13]. |
19 | TSOUMAKAS G, KATAKIS I, VLAHAVAS I. Effective and efficient multilabel classification in domains with large number of labels [EB/OL]. [2023-09-25]. |
20 | LEWIS D D, YANG Y, ROSE T G, et al. RCV1: a new benchmark collection for text categorization research [J]. Journal of Machine Learning Research, 2004, 5: 361-397. |
21 | BENITES F, SAPOZHNIKOVA E. HARAM: a hierarchical ARAM neural network for large-scale text classification [C]// Proceedings of the 2015 IEEE International Conference on Data Mining Workshop. Piscataway: IEEE, 2015: 847-854. |