Cross-layer data sharing based multi-task model

Journal of Computer Applications, 2022, Vol. 42, Issue 5: 1447-1454. DOI: 10.11772/j.issn.1001-9081.2021030516

Ying CHEN¹, Jiong YU¹,², Jiaying CHEN², Xusheng DU²

¹ School of Software, Xinjiang University, Urumqi, Xinjiang 830091, China
² College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China

Received: 2021-04-06 Revised: 2021-06-22 Accepted: 2021-06-22 Online: 2022-06-11 Published: 2022-05-10
Corresponding author: Jiong YU (yujiong@xju.edu.cn)
About the authors: CHEN Ying, born in 1999 in Loudi, Hunan, M. S. candidate. Her research interests include data mining and machine learning.
YU Jiong, born in 1964 in Beijing, Ph. D., professor, doctoral supervisor. His research interests include green computing, machine learning and data mining.
CHEN Jiaying, born in 1988 in Shawan, Xinjiang, Ph. D. candidate. Her research interests include recommender systems and data mining.
DU Xusheng, born in 1995 in Qingyang, Gansu, Ph. D. candidate, CCF member. His research interests include machine learning and data mining.
Supported by:
National Natural Science Foundation of China (61862060)

Abstract:

To address negative transfer and the difficulty of sharing information between loosely correlated tasks in multi-task learning models, a multi-task model based on cross-layer data sharing was proposed. The model focuses on fine-grained knowledge sharing and retains both the memorization ability of shallow shared experts and the generalization ability of deep task-specific experts. Firstly, the multi-layer shared experts were unified to capture the common knowledge among tasks with complex correlations. Then, the shared information was transferred to the task-specific experts at different layers, so that part of the common knowledge was shared between the upper and lower layers. Finally, a gating network based on data samples was used to select the information needed by each task autonomously, thereby mitigating the adverse effect of sample dependence on the model. Compared with the Multi-gate Mixture-Of-Experts (MMOE) model, the proposed model improved the F1-score of the two tasks by 7.87 and 1.19 percentage points respectively on the UCI census-income dataset. On the MovieLens dataset, it reduced the Mean Square Error (MSE) of the regression task to 0.004 7 and raised the Area Under Curve (AUC) of the classification task to 0.642. Experimental results show that the proposed model is suitable for mitigating the effects of negative transfer and learns the common information among tasks with complex correlations more efficiently.

Key words: multi-task learning, information sharing, negative transfer, neural network, transfer learning
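The abstract describes a gating network that uses each data sample to decide how much of each expert's output a task receives. The following is a minimal sketch of that general idea in the mixture-of-experts style, not the authors' architecture: the helper names (`softmax`, `linear`, `relu`, `gated_mixture`), the layer sizes, the random weights, and the choice of two shared experts plus one task-specific expert per task are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights):
    """Multiply input vector x by a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def gated_mixture(x, expert_weights, gate_weights):
    """Mix expert outputs using weights computed from the sample x itself."""
    expert_outputs = [relu(linear(x, w)) for w in expert_weights]
    gate = softmax(linear(x, gate_weights))  # one weight per expert, sums to 1
    dim = len(expert_outputs[0])
    mixed = [
        sum(g * out[d] for g, out in zip(gate, expert_outputs))
        for d in range(dim)
    ]
    return mixed, gate


# Hypothetical setup: 2 shared experts and 1 task-specific expert per task.
rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random weights in [-1, 1]; stands in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


IN_DIM, HIDDEN = 4, 3
shared_experts = [rand_matrix(HIDDEN, IN_DIM) for _ in range(2)]
task_experts = {name: rand_matrix(HIDDEN, IN_DIM) for name in ("task1", "task2")}
task_gates = {name: rand_matrix(3, IN_DIM) for name in task_experts}

sample = [0.5, -1.0, 0.25, 2.0]
for name, own_expert in task_experts.items():
    experts = shared_experts + [own_expert]
    mixed, gate = gated_mixture(sample, experts, task_gates[name])
    print(name, "gate weights:", [round(g, 3) for g in gate])
```

Because the gate weights are a function of the input sample, two different samples generally receive different mixtures of shared and task-specific information, which is the behaviour the abstract attributes to the sample-based gate.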

CLC number: TP311.1

Ying CHEN, Jiong YU, Jiaying CHEN, Xusheng DU. Cross-layer data sharing based multi-task model[J]. Journal of Computer Applications, 2022, 42(5): 1447-1454.

Figures/Tables: 9

Dataset	Training samples	Validation samples	Test samples
Synthetic data 1	1 000 000	100 000	100 000
Synthetic data 2	100 000	10 000	10 000

Dataset	Total samples	Training samples	Validation samples	Test samples
UCI census-income	299 285	199 523	49 881	49 881
MovieLens	100 000	70 000	—	30 000

Model	Task1-Income AUC	Task1-Income F1-score	Task1-Income ACC	Task2-Marital AUC	Task2-Marital F1-score	Task2-Marital ACC	MCV-AUC	MCV-F1	MCV-ACC
Single-task	0.932 5	0.693 1	0.952 0	0.970 8	0.927 0	0.928 3	—	—	—
Shared-bottom	0.904 9	0.643 6	0.845 1	0.974 2	0.931 3	0.932 7	1.879 1	1.574 9	1.777 8
Cross-stitch	0.929 4	0.742 3	0.950 5	0.984 3	0.933 4	0.934 5	1.913 7	1.675 7	1.885 0
PLE	0.941 5	0.713 9	0.950 9	0.980 6	0.927 2	0.929 0	1.922 1	1.641 1	1.879 9
MMOE	0.939 3	0.679 0	0.948 2	0.984 9	0.932 5	0.933 6	1.924 2	1.611 5	1.881 8
CLS-0	0.946 1	0.753 4	0.953 2	0.986 0	0.933 5	0.934 6	1.932 1	1.687 8	1.887 8
CLS	0.946 8	0.757 7	0.953 3	0.988 7	0.944 4	0.945 8	1.935 5	1.702 1	1.899 1
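
The F1 gains over MMOE quoted in the abstract can be recomputed from the F1-score columns of the table above. The snippet below is a small sketch of that arithmetic; the dictionary layout and the helper name `gain_in_points` are hypothetical, and the numbers are copied from the table rather than taken from any released code.

```python
# F1-scores copied from the results table (UCI census-income dataset),
# stored as (Task1-Income, Task2-Marital).
f1_scores = {
    "Single-task": (0.6931, 0.9270),
    "Shared-bottom": (0.6436, 0.9313),
    "Cross-stitch": (0.7423, 0.9334),
    "PLE": (0.7139, 0.9272),
    "MMOE": (0.6790, 0.9325),
    "CLS-0": (0.7534, 0.9335),
    "CLS": (0.7577, 0.9444),
}


def gain_in_points(model, baseline, scores=f1_scores):
    """Per-task F1 gain of `model` over `baseline`, in percentage points."""
    return tuple(
        round((m - b) * 100, 2)
        for m, b in zip(scores[model], scores[baseline])
    )


print(gain_in_points("CLS", "MMOE"))  # (7.87, 1.19)
```

The result matches the 7.87 and 1.19 percentage points reported in the abstract; calling `gain_in_points("CLS", "PLE")` or another baseline name gives the corresponding comparison.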
