Cross-layer data sharing based multi-task model

doi:10.11772/j.issn.1001-9081.2021030516

Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1447-1454.DOI: 10.11772/j.issn.1001-9081.2021030516

• Data science and technology •

Ying CHEN¹, Jiong YU¹,², Jiaying CHEN², Xusheng DU²

¹ School of Software, Xinjiang University, Urumqi, Xinjiang 830091, China
² College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China

Received: 2021-04-06  Revised: 2021-06-22  Accepted: 2021-06-22  Online: 2022-06-11  Published: 2022-05-10
Contact: Jiong YU
About author: CHEN Ying, born in 1999, from Loudi, Hunan, M. S. candidate. Her research interests include data mining and machine learning.
YU Jiong, born in 1964, from Beijing, Ph. D., professor and doctoral supervisor. His research interests include green computing, machine learning and data mining. E-mail: yujiong@xju.edu.cn
CHEN Jiaying, born in 1988, from Shawan, Xinjiang, Ph. D. candidate. Her research interests include recommender systems and data mining.
DU Xusheng, born in 1995, from Qingyang, Gansu, Ph. D. candidate, CCF member. His research interests include machine learning and data mining.
Supported by:
National Natural Science Foundation of China (61862060)

Abstract:

To address negative transfer and the difficulty of sharing information between loosely correlated tasks in multi-task learning models, a cross-layer data sharing based multi-task model was proposed. The model focuses on fine-grained knowledge sharing and retains both the memorization ability of shallow-layer shared experts and the generalization ability of deep-layer task-specific experts. Firstly, multi-layer shared experts were unified to obtain the public knowledge among tasks with complex correlations. Then, the shared information was transferred to the task-specific experts at different layers, so that part of the public knowledge was shared between upper and lower layers. Finally, a data-sample-based gating network was used to select the information needed by each task autonomously, thereby alleviating the harmful effect of sample dependence on the model. Compared with the Multi-gate Mixture-of-Experts (MMOE) model, the proposed model improved the F1-scores of the two tasks on the UCI census-income dataset by 7.87 and 1.19 percentage points respectively. On the MovieLens dataset, the proposed model decreased the Mean Square Error (MSE) of the regression task to 0.0047 and increased the Area Under Curve (AUC) of the classification task to 0.642. Experimental results demonstrate that the proposed model mitigates negative transfer and learns the public information among tasks with complex correlations more efficiently.
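The gating step in the abstract can be illustrated with a minimal sketch of one layer, in which each task mixes the outputs of shared and task-specific experts using softmax weights computed from the input sample. This is only the general mixture-of-experts gating idea under stated assumptions, not the authors' architecture: the abstract does not specify layer sizes, the number of experts, or how information is passed across layers, so every name, dimension, and random weight below is hypothetical, and the cross-layer transfer is not reproduced.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, a) for a in v]

def gated_mixture(x, expert_weights, gate_weights):
    """Mix expert outputs with gate weights that depend on the sample x."""
    expert_outputs = [relu(linear(w, x)) for w in expert_weights]
    gate = softmax(linear(gate_weights, x))  # one weight per expert
    dim = len(expert_outputs[0])
    mixed = [sum(g * out[d] for g, out in zip(gate, expert_outputs))
             for d in range(dim)]
    return mixed, gate

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: 4 input features, 3 hidden units, 2 shared experts,
# and 1 task-specific expert per task.
rng = random.Random(0)
IN_DIM, HID_DIM = 4, 3
shared_experts = [random_matrix(rng, HID_DIM, IN_DIM) for _ in range(2)]
task_experts = {t: [random_matrix(rng, HID_DIM, IN_DIM)] for t in ("task1", "task2")}
# Assumption: each task's gate scores the shared experts plus its own expert.
task_gates = {t: random_matrix(rng, 3, IN_DIM) for t in task_experts}

x = [0.5, -1.0, 0.25, 2.0]  # one made-up input sample
task_repr, task_gate = {}, {}
for t in task_experts:
    experts = shared_experts + task_experts[t]
    task_repr[t], task_gate[t] = gated_mixture(x, experts, task_gates[t])
    print(t, [round(g, 3) for g in task_gate[t]])
```

Because the gate weights are recomputed for every sample, two tasks can draw differently on the same shared experts, which is the mechanism the abstract credits with selecting the information each task needs.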

Key words: multi-task learning, information sharing, negative transfer, neural network, transfer learning

CLC Number: TP311.1

Ying CHEN, Jiong YU, Jiaying CHEN, Xusheng DU. Cross-layer data sharing based multi-task model[J]. Journal of Computer Applications, 2022, 42(5): 1447-1454.

Figures/Tables 9

Dataset	Training samples	Validation samples	Test samples
Synthetic data 1	1,000,000	100,000	100,000
Synthetic data 2	100,000	10,000	10,000

Dataset	Total samples	Training samples	Validation samples	Test samples
UCI census-income	299,285	199,523	49,881	49,881
MovieLens	100,000	70,000	—	30,000

Model	Task1-Income AUC	Task1-Income F1-score	Task1-Income ACC	Task2-Marital AUC	Task2-Marital F1-score	Task2-Marital ACC	MCV-AUC	MCV-F1	MCV-ACC
Single-task	0.9325	0.6931	0.9520	0.9708	0.9270	0.9283	—	—	—
Shared-bottom	0.9049	0.6436	0.8451	0.9742	0.9313	0.9327	1.8791	1.5749	1.7778
Cross-stitch	0.9294	0.7423	0.9505	0.9843	0.9334	0.9345	1.9137	1.6757	1.8850
PLE	0.9415	0.7139	0.9509	0.9806	0.9272	0.9290	1.9221	1.6411	1.8799
MMOE	0.9393	0.6790	0.9482	0.9849	0.9325	0.9336	1.9242	1.6115	1.8818
CLS-0	0.9461	0.7534	0.9532	0.9860	0.9335	0.9346	1.9321	1.6878	1.8878
CLS	0.9468	0.7577	0.9533	0.9887	0.9444	0.9458	1.9355	1.7021	1.8991
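The percentage-point gains quoted in the abstract can be recomputed from the F1-score columns of this table. The short check below copies the MMOE and CLS rows and takes their difference. It also sums the two per-task F1-scores, which for these two rows matches the MCV-F1 column; that reading of MCV is an observation from the numbers, not a definition given in the source. The helper name `gain_in_points` is made up for illustration.

```python
# F1-scores copied from the table: (Task1-Income, Task2-Marital).
f1 = {
    "MMOE": (0.6790, 0.9325),
    "CLS": (0.7577, 0.9444),
}

def gain_in_points(new, old):
    """Difference between two scores, expressed in percentage points."""
    return round((new - old) * 100, 2)

gains = [gain_in_points(n, o) for n, o in zip(f1["CLS"], f1["MMOE"])]

# Sum of the per-task scores, compared against the MCV-F1 column by eye.
f1_sum = {model: round(sum(scores), 4) for model, scores in f1.items()}

print(gains)   # gains for Task1-Income and Task2-Marital
print(f1_sum)  # per-model sum of the two F1-scores
```

The two gains come out at 7.87 and 1.19 percentage points, which agrees with the figures stated in the abstract.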
