Relation extraction between discipline knowledge entities based on improved piecewise convolutional neural network and knowledge distillation

doi:10.11772/j.issn.1001-9081.2023081065

Journal of Computer Applications, 2024, Vol. 44, Issue (8): 2421-2429.

Yubo ZHAO, Liping ZHANG, Sheng YAN, Min HOU, Mao GAO

College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot, Inner Mongolia 010022, China

Received: 2023-08-08 Revised: 2023-11-06 Accepted: 2023-11-15 Online: 2023-12-18 Published: 2024-08-10
Contact: Liping ZHANG (cieczlp@imnu.edu.cn)
About the authors:
ZHAO Yubo (赵宇博), born in 1999, from Chifeng, Inner Mongolia, M. S. candidate, CCF member. His research interests include knowledge graphs and educational data mining.
ZHANG Liping (张丽萍), born in 1974, from Hohhot, Inner Mongolia, M. S., professor, CCF member. Her research interests include intelligent education and software engineering.
YAN Sheng (闫盛), born in 1984, from Baotou, Inner Mongolia, M. S., CCF member. His research interests include computer education.
HOU Min (侯敏), born in 1973, from Ulanqab, Inner Mongolia, M. S., associate professor, CCF member. Her research interests include software analysis and intelligent education.
GAO Mao (高茂), born in 1997, from Hohhot, Inner Mongolia, M. S. His research interests include computer education.
Supported by:
This work is partially supported by the Inner Mongolia Natural Science Foundation (2023LHMS06009); the 2023 Project of the 14th Five-Year Plan for Inner Mongolia Education Science Research (2023NGHZX-ZH119, NGJGH2023234); the Graduate Research & Innovation Fund of Inner Mongolia Normal University (CXJJS23067, CXJJS22137); and the Special Project for Basic Scientific Research Business Expenses of Inner Mongolia Normal University (2022JBXC018).

Abstract:

Relation extraction is an important means of organizing discipline knowledge and a key step in constructing educational knowledge graphs. In current research, most pre-trained language models based on the Transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT), have a large number of parameters and excessive complexity, which makes them difficult to deploy on end devices and limits their application in real educational scenarios. In addition, most traditional lightweight relation extraction models do not model the data according to text structure, so they tend to ignore the structural information between entities. The word embedding vectors they generate also struggle to capture the contextual features of the text and handle polysemy poorly. As a result, these models fit discipline knowledge texts badly, because such texts are unstructured and contain a high proportion of proper nouns, and this hinders high-quality relation extraction. To solve these problems, a relation extraction method between discipline knowledge entities based on an improved Piecewise Convolutional Neural Network (PCNN) and Knowledge Distillation (KD) was proposed. Firstly, BERT was used to generate high-quality domain text word vectors to improve the input layer of the PCNN model, so as to capture text context features effectively and alleviate the polysemy problem to a certain extent. Then, convolution and piecewise max pooling operations were used to mine inter-entity structural information in depth, and the BERT-PCNN model was constructed to achieve high-quality relation extraction. Lastly, considering the demand for efficient and lightweight models in educational scenarios, the knowledge of the output layer and the intermediate layers of the BERT-PCNN model was distilled to guide the PCNN model, completing the construction of the KD-PCNN model.
The experimental results show that the weighted-average F1 of the BERT-PCNN model reaches 94%, which is 1 and 2 percentage points higher than that of the R-BERT and EC_BERT models respectively; the weighted-average F1 of the KD-PCNN model reaches 92%, the same as that of the EC_BERT model; and the parameter quantity of the KD-PCNN model is 3 orders of magnitude lower than that of the BERT-PCNN and KD-RB-l models. The proposed method therefore achieves a better trade-off between performance evaluation metrics and network parameter quantity, which helps raise the level of automation in educational knowledge graph construction and supports the development and deployment of new educational applications.

Key words: relation extraction, Piecewise Convolutional Neural Network (PCNN), knowledge distillation, knowledge graph, discipline knowledge, neural network
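
The piecewise max pooling operation named in the abstract can be illustrated with a small sketch. It assumes the usual PCNN formulation, in which the two entity positions cut the sentence into three segments and every convolution filter is max-pooled inside each segment. The function name `piecewise_max_pool`, the toy feature values and the zero fill for empty segments are assumptions of this sketch, not details taken from the paper.

```python
from typing import List, Sequence


def piecewise_max_pool(
    feature_map: Sequence[Sequence[float]],
    head_pos: int,
    tail_pos: int,
) -> List[float]:
    """Max-pool each filter separately over three entity-delimited segments.

    feature_map[k][t] is the response of convolution filter k at token t.
    """
    first, second = sorted((head_pos, tail_pos))
    seq_len = len(feature_map[0])
    # Segments: tokens up to the first entity, tokens up to the second
    # entity, and the remaining tokens.
    bounds = [(0, first + 1), (first + 1, second + 1), (second + 1, seq_len)]

    pooled: List[float] = []
    for filter_row in feature_map:
        for start, end in bounds:
            segment = filter_row[start:end]
            # Assumption: an empty segment (entity at the sentence edge)
            # contributes 0.0 so the output length stays fixed.
            pooled.append(max(segment) if segment else 0.0)
    return pooled


# Two hypothetical filters over a six-token sentence, entities at tokens 1 and 3.
features = [
    [0.1, 0.9, 0.3, 0.4, 0.2, 0.7],
    [0.5, 0.2, 0.8, 0.1, 0.6, 0.3],
]
print(piecewise_max_pool(features, head_pos=1, tail_pos=3))
# → [0.9, 0.4, 0.7, 0.5, 0.8, 0.6]
```

Each filter yields three values instead of one, so the pooled vector keeps coarse information about where a feature fired relative to the two entities. With 230 convolution kernels, as in the parameter settings of this paper, the pooled vector would have 690 entries under this formulation.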

CLC number:

TP183

Yubo ZHAO, Liping ZHANG, Sheng YAN, Min HOU, Mao GAO. Relation extraction between discipline knowledge entities based on improved piecewise convolutional neural network and knowledge distillation[J]. Journal of Computer Applications, 2024, 44(8): 2421-2429.

Figures/Tables 14

References 32

1	BAIG M I, SHUIB L, YADEGARIDEHKORDI E. Big data in education: a state of the art, limitations, and future research directions[J]. International Journal of Educational Technology in Higher Education, 2020, 17: 44.
2	LAURI L, VIRKUS S, HEIDMETS M. Information cultures and strategies for coping with information overload: case of Estonian higher education institutions[J]. Journal of Documentation, 2020, 77(2): 518-541.
3	JI S, PAN S, CAMBRIA E, et al. A survey on knowledge graphs: representation, acquisition, and applications[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(2): 494-514.
4	LIN J, ZHAO Y, HUANG W, et al. Domain knowledge graph-based research progress of knowledge representation[J]. Neural Computing and Applications, 2021, 33: 681-690.
5	E H H, ZHANG W J, XIAO S Q, et al. Survey of entity relationship extraction based on deep learning[J]. Journal of Software, 2019, 30(6): 1793-1818.
6	GAO M, ZHANG L P. Research on connotation, technology and application of educational knowledge graph based on multimodal resources[J]. Application Research of Computers, 2022, 39(8): 2257-2267.
7	ZHAO Y B, ZHANG L P, YAN S, et al. Construction and application of discipline knowledge graph in personalized learning[J]. Computer Engineering and Applications, 2023, 59(10): 1-21.
8	ZHAO Z H, YANG Z H, SUN C, et al. Protein-protein interaction extraction from biomedical literature[J]. Journal of Chinese Information Processing, 2018, 32(7): 82-90.
9	PAN L, LI C, LI J, et al. Prerequisite relation learning for concepts in MOOCs[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 1447-1456.
10	HAN M, LI W Q. Relationship extraction of Chinese STEM course knowledge based on feature enhancement[J]. Application Research of Computers, 2020, 37(S1): 40-42.
11	SONG M, ZHAO J, GAO X. Research on entity relation extraction in education field based on multi-feature deep learning[C]// Proceedings of the 3rd International Conference on Big Data Technologies. New York: ACM, 2020: 102-106.
12	WANG H, QIN K, ZAKARI R Y, et al. Deep neural network-based relation extraction: an overview[J]. Neural Computing and Applications, 2022, 34: 4781-4801.
13	SONG D, XU J, PANG J, et al. Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data[J]. Information Sciences, 2021, 573: 222-238.
14	DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186.
15	ZENG D, LIU K, CHEN Y, et al. Distant supervision for relation extraction via piecewise convolutional neural networks[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 1753-1762.
16	HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network[EB/OL]. (2015-03-09)[2023-11-06].
17	State Council. Notice of the State Council on issuing the development plan on the new generation of artificial intelligence[J]. The Gazette of the State Council of the People's Republic of China, 2017(22): 7-21.
18	LUO L, YANG Z, CAO M, et al. A neural network-based joint learning approach for biomedical entity and relation extraction from biomedical literature[J]. Journal of Biomedical Informatics, 2020, 103: 103384.
19	PERIDE A, TURDI T, ASKAR H. An entity relation extraction method based on deep learning[J]. Computer Engineering & Science, 2023, 45(5): 895-902.
20	GE Y, DU K Y, DU J W, et al. Entity relation extraction based on hybrid neural network[J]. Journal of Chinese Information Processing, 2021, 35(10): 81-89.
21	BUCKMAN J, ROY A, RAFFEL C, et al. Thermometer encoding: one hot way to resist adversarial examples[C/OL]// Proceedings of the 2015 International Conference on Learning Representations [2023-08-01].
22	WEI M, ZHANG L P, YAN S. Student program algorithm recognition based on program vector tree and clustering[J]. Computer Engineering and Design, 2022, 43(10): 2790-2798.
23	CHEN D G, MA J L, MA Z P, et al. Review of pre-training techniques for natural language processing[J]. Journal of Frontiers of Computer Science & Technology, 2021, 15(8): 1359-1389.
24	WU S, HE Y. Enriching pre-trained language model with entity information for relation classification[C]// Proceedings of the 28th ACM International Conference on Information and Knowledge Management. New York: ACM, 2019: 2361-2364.
25	WAN Y, SUN L Y, ZHAO P, et al. Relation classification based on information enhanced BERT[J]. Journal of Chinese Information Processing, 2021, 35(3): 69-77.
26	GOU J, YU B, MAYBANK S J, et al. Knowledge distillation: a survey[J]. International Journal of Computer Vision, 2021, 129: 1789-1819.
27	ZHANG L, SU J, MIN Z, et al. Exploring self-distillation based relational reasoning training for document-level relation extraction[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(11): 13967-13975.
28	TAN Q, HE R, BING L, et al. Document-level relation extraction with adaptive focal loss and knowledge distillation[C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2022: 1672-1681.
29	HAO S, TAN B, TANG K, et al. BertNet: harvesting knowledge graphs with arbitrary relations from pretrained language models[C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2023: 5000-5015.
30	SCHUSTER M, NAKAJIMA K. Japanese and Korean voice search[C]// Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2012: 5149-5152.
31	DOU H, ZHANG L M, HAN F, et al. Survey on convolutional neural network interpretability[J]. Journal of Software, 2024, 35(1): 159-184.
32	SHAO R R, LIU Y A, ZHANG W, et al. A survey of knowledge distillation in deep learning[J]. Chinese Journal of Computers, 2022, 45(8): 1638-1673.

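Entity type	Relation type	Relation range	Examples
Course	contains	chapter, section, knowledge point	1) "C Language Programming" contains Chapter 1, Overview of C. 2) "C Language Programming" contains Section 1 of Chapter 1 (Overview of C), Features of C. 3) "C Language Programming" contains the knowledge point "output function".
Chapter	contains, sequence	chapter, section, knowledge point	1) Chapter 1, Overview of C, contains the section "Features of C". 2) Chapter 1, Overview of C, contains the knowledge point "output function". 3) Chapter 1, Overview of C, and Chapter 2, The Soul of a Program: Algorithms, are in a sequence relation.
Section	contains, sequence	section, knowledge point	1) Chapter 1, Section 3, Input and Output Functions, contains the knowledge point "output function". 2) Chapter 1, Section 3, Input and Output Functions, and Chapter 1, Section 4, Structural Features of C Source Programs, are in a sequence relation.
Knowledge point	sequence, related	knowledge point	1) "constant" and "direct constant" are in a sequence relation. 2) "input function" and "output function" are in a related relation.
Programming problem	step i	knowledge point	The programming problem "convert a Fahrenheit temperature to Celsius" contains, in order, the knowledge points preprocessing directive, main function, integer variable, variable initialization and output function (step 1: write the preprocessing directive; step 2: define the main function main; step 3: define two integer variables for the Fahrenheit and Celsius temperatures; step 4: assign an initial value to the Fahrenheit variable; step 5: call the output function to print the result according to the temperature conversion formula).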
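
The schema in the table above can be written down as a small lookup structure, which makes it easy to reject candidate triples that the schema does not allow. This is only a sketch: the English labels, the `Rule` type and the `is_valid_triple` helper are hypothetical names chosen here. Following the table, the allowed relations and the allowed tail types are recorded per head entity type, without guessing which relation pairs with which tail type.

```python
import re
from typing import Dict, NamedTuple, Set


class Rule(NamedTuple):
    relations: Set[str]   # relation types allowed for this head entity type
    tail_types: Set[str]  # entity types allowed as the tail (relation range)


STEP_LABEL = "step i"  # generic label used in the table for ordered steps
STEP_PATTERN = re.compile(r"step [1-9]\d*")  # concrete labels: "step 1", ...

# Transcribed from the table; keys and labels are this sketch's own naming.
SCHEMA: Dict[str, Rule] = {
    "course": Rule({"contains"}, {"chapter", "section", "knowledge_point"}),
    "chapter": Rule({"contains", "sequence"},
                    {"chapter", "section", "knowledge_point"}),
    "section": Rule({"contains", "sequence"}, {"section", "knowledge_point"}),
    "knowledge_point": Rule({"sequence", "related"}, {"knowledge_point"}),
    "programming_problem": Rule({STEP_LABEL}, {"knowledge_point"}),
}


def is_valid_triple(head_type: str, relation: str, tail_type: str) -> bool:
    """Return True if (head_type, relation, tail_type) is allowed by SCHEMA."""
    rule = SCHEMA.get(head_type)
    if rule is None:
        return False
    # Map "step 1", "step 2", ... onto the generic "step i" relation.
    label = STEP_LABEL if STEP_PATTERN.fullmatch(relation) else relation
    return label in rule.relations and tail_type in rule.tail_types


print(is_valid_triple("chapter", "sequence", "chapter"))                   # True
print(is_valid_triple("programming_problem", "step 3", "knowledge_point"))  # True
print(is_valid_triple("course", "sequence", "chapter"))                    # False
```

A check of this kind could be used to filter annotation errors or implausible model predictions before triples are written to a knowledge graph.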

Dataset	Size
Training set	1 541
Validation set	193
Test set	192

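Parameter	Value	Parameter	Value
Learning rate	1E-5	Number of convolutional layers	1
Epochs	50	Number of convolution kernels	230
Batch size	64	Convolution kernel size	1×3
Dropout	0.15	Convolution kernel stride	1
Word embedding dimension	128	Temperature T	4
Position embedding dimension	50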
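
The temperature T = 4 in the table enters the distillation objective, which combines output-layer (soft label) knowledge with intermediate-layer knowledge. The sketch below shows one common way to write such an objective: a temperature-softened KL term in the style of Hinton et al. [16] plus a mean squared error between intermediate features. The exact loss used for KD-PCNN is not given in this section, so the MSE choice, the weights `alpha` and `beta`, and the assumption that teacher and student features have the same length are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax of logits / temperature, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits: Sequence[float],
                    teacher_logits: Sequence[float],
                    temperature: float = 4.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
    return kl * temperature ** 2


def hidden_loss(student_hidden: Sequence[float],
                teacher_hidden: Sequence[float]) -> float:
    """Mean squared error between intermediate features of equal length."""
    squared = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(squared) / len(squared)


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      student_hidden: Sequence[float],
                      teacher_hidden: Sequence[float],
                      temperature: float = 4.0,
                      alpha: float = 0.5,   # assumed weight, output layer
                      beta: float = 0.5) -> float:  # assumed weight, hidden
    """Weighted sum of output-layer and intermediate-layer distillation terms."""
    return (alpha * soft_label_loss(student_logits, teacher_logits, temperature)
            + beta * hidden_loss(student_hidden, teacher_hidden))


# Hypothetical logits over four relation classes and a tiny feature vector.
loss = distillation_loss([2.0, 0.5, 0.1, -1.0], [3.0, 0.2, 0.0, -2.0],
                         [0.3, 0.7], [0.4, 0.9])
print(loss)
```

A higher temperature flattens both distributions, so the student also learns from the relative scores the teacher assigns to incorrect relation types. In practice a hard-label cross-entropy term is often added as well; it is left out here to keep the sketch short.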

Model	Weighted-average precision/%	Weighted-average recall/%	Weighted-average F1/%	Total params
PCNN [15]	84	85	84	233 864
BiLSTM-CNN-Attention [19]	88	87	87	430 817
BiGRU-Att-PCNN [20]	89	87	88	466 772
R-BERT [24]	95	94	93	114 204 672
EC_BERT [25]	93	92	92	113 531 136
KD-RB-l [28]	96	95	95	351 357 596
BERT-PCNN	95	94	94	102 542 675
KD-PCNN	93	91	92	846 434
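
The weighted-average metrics in the table can be reproduced from per-class counts. The sketch below uses the standard definition, in which each class's F1 is weighted by its share of the true labels; the function name `weighted_f1` and the toy labels are illustrative and not taken from the paper's data.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / count  # count = tp + fn for this class
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        # Weight each class by its share of the true labels.
        score += count / total * f1
    return score


# Toy example with two relation types.
y_true = ["contains", "contains", "contains", "sequence"]
y_pred = ["contains", "contains", "sequence", "sequence"]
print(round(weighted_f1(y_true, y_pred), 4))  # → 0.7667
```

Because the averaging is done per class before weighting, the weighted-average F1 of a model is not the harmonic mean of its weighted-average precision and recall, so it need not lie between those two values.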

Distillation method	Weighted-average precision/%	Weighted-average recall/%	Weighted-average F1/%
Label knowledge only	91	88	88
Intermediate-layer knowledge only	91	90	90
Label and intermediate-layer knowledge	93	91	92
