Transfer learning model based on improved domain separation network

doi:10.11772/j.issn.1001-9081.2022071103

Journal of Computer Applications, 2023, Vol. 43, Issue 8: 2382-2389. DOI: 10.11772/j.issn.1001-9081.2022071103

• Artificial intelligence •

Zexi JIN¹, Lei LI¹,², Ji LIU¹,²

1. Institute of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012, China
2. Xinjiang Social and Economic Statistics and Big Data Application Research Center (Xinjiang University of Finance and Economics), Urumqi, Xinjiang 830012, China

Received: 2022-07-29  Revised: 2022-11-21  Accepted: 2022-11-30  Online: 2023-01-15  Published: 2023-08-10
Contact: Lei LI
About the authors: JIN Zexi (born 1998), from Yancheng, Jiangsu, M.S. candidate. His research interests include machine learning and big data analysis.
LIU Ji (born 1974), from Dazhou, Sichuan, Ph.D., professor. His research interests include intelligent data analysis.
Supported by: National Natural Science Foundation of China (71762028)

Abstract:

To further improve the efficiency of feature recognition and extraction in transfer learning, reduce negative transfer, and enhance the learning performance of the model, a transfer learning model based on an improved Domain Separation Network (DSN), named AMCN-DSN (Attention Mechanism Capsule Network-DSN), is proposed. First, a Multi-Head Attention CapsNet (MHAC) extracts and reconstructs the feature information of the source and target domains: the attention mechanism filters the feature information, and the capsule network improves the extraction quality of deep information. Second, a dynamic adversarial factor is introduced to optimize the reconstruction loss function, so that the reconstructor can dynamically measure the relative importance of source-domain and target-domain information, which improves the robustness and convergence speed of transfer learning. Finally, a multi-head self-attention mechanism is incorporated into the classifier to enhance the semantic understanding of the shared (public) features and improve classification performance. In the sentiment analysis experiments, compared with the other transfer learning models, the proposed model transfers the learned knowledge to tasks with less data but high similarity with the smallest degradation in classification performance. In the intent recognition experiments, the proposed model improves precision, recall and F1 score by 4.5%, 4.3% and 4.4% respectively over the second-best model, the capsule-network-improved Domain Adversarial Neural Network (DANN+CapsNet), which indicates certain advantages in handling small-data and personalization problems. Compared with DSN, AMCN-DSN improves the target-domain F1 scores in the two types of experiments by 6.0% and 12.4% respectively, further validating the effectiveness of the improved model.
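As a rough illustration of the multi-head self-attention idea named in the abstract, the following is a minimal pure-Python sketch. It is not the authors' MHAC or classifier: the learned projection matrices are omitted (treated as identity), each token vector is simply split into equal chunks, one per head, and the function names `attention` and `multi_head_self_attention` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output vector is a weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_self_attention(tokens, num_heads):
    """Split each token into num_heads chunks, attend per chunk, concatenate.

    Assumption: learned projections (W_Q, W_K, W_V, W_O) are left out,
    so this only shows the data flow, not a trainable layer.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        chunk = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        heads.append(attention(chunk, chunk, chunk))
    # Concatenate the per-head outputs for each token.
    return [sum((heads[h][i] for h in range(num_heads)), [])
            for i in range(len(tokens))]


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
    for row in multi_head_self_attention(toy_tokens, num_heads=2):
        print([round(x, 3) for x in row])
```

Each head attends only over its own slice of the features, which is what lets different heads weight different parts of the representation.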

Key words: transfer learning, Domain Separation Network (DSN), capsule network, attention mechanism, Natural Language Processing (NLP)

CLC Number: TP391.1

Zexi JIN, Lei LI, Ji LIU. Transfer learning model based on improved domain separation network[J]. Journal of Computer Applications, 2023, 43(8): 2382-2389.

Figures/Tables 10

Fig. 1 Architecture of DSN

Fig. 2 Model framework and training process of AMCN-DSN

Fig. 3 Structure of MHAC

Tab.1 Parameter setting of AMCN-DSN model

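Parameter type	Tunable parameter	Value
Capsule network parameters	Initial capsule dimension	16
	Primary capsule dimension	32
	Number of dynamic routing iterations (Routing)	3
	Capsule network optimizer	Adam
Model training parameters	Learning rate (lr)	0.001
	Number of epochs (epoch)	20
	Batch size (batch_size)	32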
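To make the "dynamic routing iterations" setting in Tab.1 concrete, here is a minimal pure-Python sketch of the standard capsule squash nonlinearity and routing-by-agreement loop. It is a generic illustration, not the AMCN-DSN implementation: only the iteration count of 3 comes from Tab.1, the toy capsule sizes are much smaller than the 16 and 32 dimensions listed there, and the names `squash` and `dynamic_routing` are hypothetical.

```python
import math

ROUTING_ITERATIONS = 3  # "Routing" value from Tab.1


def squash(v):
    """Capsule nonlinearity: keeps the direction, maps the length into [0, 1)."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=ROUTING_ITERATIONS):
    """Routing by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j. Returns one output vector per output capsule.
    """
    n_in, n_out = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    logits = [[0.0] * n_out for _ in range(n_in)]  # routing logits b_ij
    outputs = []
    for _ in range(iterations):
        # Coupling coefficients: softmax over output capsules for each input.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            s = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            outputs.append(squash(s))
        # Raise the logit where a prediction agrees with the output capsule.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], outputs[j]))
    return outputs


if __name__ == "__main__":
    toy = [[[1.0, 0.0], [0.0, 0.1]], [[1.0, 0.0], [0.1, 0.0]]]
    for capsule in dynamic_routing(toy):
        print([round(x, 4) for x in capsule])
```

In the toy input both input capsules agree on output capsule 0, so its output vector ends up much longer than that of capsule 1.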

Tab.2 Performance comparison of transfer learning models in sentiment analysis

Model	Source domain	Target domain		
	F1	P	R	F1
DCGAN [23]	0.8553	0.7739	0.7841	0.7790
DANN [12]	0.8954	0.8104	0.8126	0.8115
DANN+CapsNet [4]	0.9155	0.8242	0.8315	0.8278
ADDA [14]	0.9021	0.8192	0.8165	0.8178
Res-CapsNet [24]	0.9174	0.8322	0.8307	0.8314
DAAN [15]	0.9102	0.8295	0.8322	0.8308
ASDA [17]	0.8896	0.8079	0.7970	0.8024
AMCN-DSN	0.9239	0.8577	0.8429	0.8502
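The P, R and F1 columns can be related with a short check. The sketch below assumes F1 is the harmonic mean of the reported precision and recall (the averaging scheme is not stated on this page) and uses only the AMCN-DSN target-domain row of Tab.2; `f1_score` is a hypothetical helper name.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# AMCN-DSN, target domain, sentiment analysis (values from Tab.2).
amcn_dsn_target = {"P": 0.8577, "R": 0.8429, "F1": 0.8502}

computed = f1_score(amcn_dsn_target["P"], amcn_dsn_target["R"])
print(f"computed F1 = {computed:.4f}, reported F1 = {amcn_dsn_target['F1']:.4f}")
```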

Tab.3 Performance comparison of transfer learning models in intent recognition

Model	Source domain	Target domain		
	F1	P	R	F1
DCGAN [23]	0.7463	0.6587	0.6514	0.6550
DANN [12]	0.7910	0.7027	0.7069	0.7048
DANN+CapsNet [4]	0.8231	0.7426	0.7411	0.7418
ADDA [14]	0.8025	0.7122	0.7016	0.7069
Res-CapsNet [24]	0.8158	0.7409	0.7403	0.7406
DAAN [15]	0.8214	0.7359	0.7332	0.7345
ASDA [17]	0.7719	0.6733	0.6914	0.6822
AMCN-DSN	0.8439	0.7758	0.7733	0.7745
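The improvements over DANN+CapsNet quoted in the abstract can be reproduced from the target-domain columns of Tab.3. This sketch assumes they are relative improvements, which is the reading that matches the table values; `relative_gain` is a hypothetical helper name.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Target-domain scores on intent recognition (values from Tab.3).
dann_capsnet = {"P": 0.7426, "R": 0.7411, "F1": 0.7418}
amcn_dsn = {"P": 0.7758, "R": 0.7733, "F1": 0.7745}

gains = {metric: round(relative_gain(amcn_dsn[metric], dann_capsnet[metric]), 1)
         for metric in ("P", "R", "F1")}
print(gains)
```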

Tab.4 Ablation experimental results on sentiment analysis and intent recognition tasks

Model	Sentiment analysis task				Intent recognition task
	Source domain	Target domain			Source domain	Target domain
	F1	P	R	F1	F1	P	R	F1
DSN [16]	0.8924	0.8092	0.7955	0.8022	0.7742	0.6861	0.6918	0.6889
AMCN-DSN (-MSHA, $-\hat{\omega}$)	0.9102	0.8166	0.8195	0.8227	0.8147	0.7323	0.7398	0.7360
AMCN-DSN ($-\hat{\omega}$)	0.9037	0.8350	0.8416	0.8383	0.8268	0.7475	0.7466	0.7470
AMCN-DSN (-MSHA)	0.9145	0.8442	0.8369	0.8405	0.8335	0.7563	0.7535	0.7549
AMCN-DSN	0.9239	0.8577	0.8429	0.8502	0.8439	0.7758	0.7733	0.7745
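The ablation rows can be summarized by how much each variant's target-domain F1 exceeds the DSN baseline. The sketch below copies the target-domain F1 values from Tab.4 and again assumes relative improvements; the dictionary keys are plain-text labels chosen here for illustration.

```python
# Target-domain F1 per model: (sentiment analysis, intent recognition), from Tab.4.
target_f1 = {
    "DSN": (0.8022, 0.6889),
    "AMCN-DSN (-MSHA, -omega_hat)": (0.8227, 0.7360),
    "AMCN-DSN (-omega_hat)": (0.8383, 0.7470),
    "AMCN-DSN (-MSHA)": (0.8405, 0.7549),
    "AMCN-DSN": (0.8502, 0.7745),
}

base_sentiment, base_intent = target_f1["DSN"]

# Relative gain over DSN in percent, rounded to one decimal place.
gain_over_dsn = {
    name: (
        round((sentiment - base_sentiment) / base_sentiment * 100.0, 1),
        round((intent - base_intent) / base_intent * 100.0, 1),
    )
    for name, (sentiment, intent) in target_f1.items()
}

for name, (sentiment_gain, intent_gain) in gain_over_dsn.items():
    print(f"{name}: sentiment +{sentiment_gain}%, intent +{intent_gain}%")
```

For the full model, the computed gains match the 6.0% and 12.4% figures stated in the abstract.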

Fig. 4 Effect of improving feature extractor

Fig. 5 Loss-epoch curves based on different reconstruction loss functions

Fig. 6 Heatmaps of sentiment analysis for different models

References 24

1	SCHMIDHUBER J. Deep learning in neural networks: an overview[J]. Neural Networks, 2015, 61: 85-117. DOI: 10.1016/j.neunet.2014.09.003
2	ZHUANG F Z, QI Z Y, DUAN K Y, et al. A comprehensive survey on transfer learning[J]. Proceedings of the IEEE, 2021, 109(1): 43-76. DOI: 10.1109/jproc.2020.3004555
3	WU D Y, GUI L, CHEN Z, et al. Sentiment analysis based on deep representation learning and Gaussian processes transfer learning[J]. Journal of Chinese Information Processing, 2017, 31(1): 169-176. (in Chinese)
4	ZHAO P F, LI Y L, LIN M. Intent detection of domain adaptation combined with capsule network[J]. Computer Engineering and Applications, 2021, 57(21): 188-194. (in Chinese)
5	FAN T, WANG H, CHEN Y T. Research on multimodal named entity recognition of local history based on deep transfer learning[J]. Journal of the China Society for Scientific and Technical Information, 2022, 41(4): 412-423. DOI: 10.3772/j.issn.1000-0135.2022.04.008 (in Chinese)
6	ZHAO P F, LI Y L, LIN M. Research progress on intent detection oriented to transfer learning[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(8): 1261-1274. (in Chinese)
7	PAN S J, YANG Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359. DOI: 10.1109/tkde.2009.191
8	DAI W Y, YANG Q, XUE G R, et al. Boosting for transfer learning[C]// Proceedings of the 24th International Conference on Machine Learning. New York: ACM, 2007: 193-200. DOI: 10.1145/1273496.1273521
9	BO C, LAM W, TSANG I, et al. Extracting discriminative concepts for domain adaptation in text mining[C]// Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2009: 179-188. DOI: 10.1145/1557019.1557045
10	CHANG H, HAN J, ZHONG C, et al. Unsupervised transfer learning via multi-scale convolutional sparse coding for biomedical applications[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(5): 1182-1194. DOI: 10.1109/tpami.2017.2656884
11	GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2. Cambridge: MIT Press, 2014: 2672-2680.
12	GANIN Y, USTINOVA E, AJAKAN H, et al. Domain-adversarial training of neural networks[J]. Journal of Machine Learning Research, 2016, 17: 1-35. DOI: 10.1007/978-3-319-58347-1_10
13	TZENG E, HOFFMAN J, DARRELL T, et al. Simultaneous deep transfer across domains and tasks[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 4068-4076. DOI: 10.1109/iccv.2015.463
14	TZENG E, HOFFMAN J, SAENKO K, et al. Adversarial discriminative domain adaptation[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2962-2971. DOI: 10.1109/cvpr.2017.316
15	YU C H, WANG J D, CHEN Y Q, et al. Transfer learning with dynamic adversarial adaptation network[C]// Proceedings of the 2019 IEEE International Conference on Data Mining. Piscataway: IEEE, 2019: 778-786. DOI: 10.1109/icdm.2019.00088
16	BOUSMALIS K, TRIGEORGIS G, SILBERMAN N, et al. Domain separation networks[C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2016: 343-351.
17	TSAI J C, CHIEN J T. Adversarial domain separation and adaptation[C]// Proceedings of the IEEE 27th International Workshop on Machine Learning for Signal Processing. Piscataway: IEEE, 2017: 1-6. DOI: 10.1109/mlsp.2017.8168121
18	BEN-DAVID S, BLITZER J, CRAMMER K, et al. Analysis of representations for domain adaptation[C]// Proceedings of the 19th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2006: 137-144. DOI: 10.7551/mitpress/7503.003.0022
19	BEN-DAVID S, BLITZER J, CRAMMER K, et al. A theory of learning from different domains[J]. Machine Learning, 2010, 79(1/2): 151-175. DOI: 10.1007/s10994-009-5152-4
20	WANG J Q, GONG Z H, XUE Y, et al. Aspect-based sentiment analysis based on hybrid multi-head attention and capsule networks[J]. Journal of Chinese Information Processing, 2020, 34(5): 100-110. DOI: 10.3969/j.issn.1003-0077.2020.05.014 (in Chinese)
21	China Computer Federation. NLPCC 2014 Evaluation tasks test data download[DB/OL]. [2022-11-20].
22	Corporation iFLYTEK. SMP2018-ECDT task 1 dataset[DS/OL]. [2022-11-20].
23	FANG W, ZHANG F H, SHENG V S, et al. A method for improving CNN-based image recognition using DCGAN[J]. Computers, Materials and Continua, 2018, 57(1): 167-178. DOI: 10.32604/cmc.2018.02356
24	DAI H, SHENG L J, MIAO Q G. Adversarial discriminative domain adaptation algorithm with CapsNet[J]. Journal of Computer Research and Development, 2021, 58(9): 1997-2012. DOI: 10.7544/issn1000-1239.2021.20200569 (in Chinese)
