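Algorithm for analyzing investment risks by integrating large language model with graph structure

doi:10.11772/j.issn.1001-9081.2024081210

Journal of Computer Applications ›› 0, Vol. ›› Issue (): 7-11.

Xiaobin LYU, Yuanquan TANG, Huaiqiang SU, Maoyao ZHAO, Fengzheng XI, Xin ZHOU, Ya HE

Chengdu Information Technology of Chinese Academy of Sciences Company Limited, Chengdu Sichuan 610213, China

Received: 2024-08-26 Revised: 2024-10-17 Accepted: 2024-10-21 Online: 2025-01-24 Published: 2024-12-31
Contact: Ya HE
About the authors: LYU Xiaobin, born in 1978, male, from Luzhou, Sichuan, senior engineer. His research interests include machine learning, data mining, intelligent Web, and smart cities.
TANG Yuanquan, born in 1992, male, from Chengdu, Sichuan, engineer. His research interests include artificial intelligence and big data.
SU Huaiqiang, born in 1992, male, from Guang'an, Sichuan, engineer. His research interests include machine learning, data mining, and intelligent Web.
ZHAO Maoyao, born in 1996, female, from Chengdu, Sichuan, engineer. Her research interests include mathematical statistics, data mining, and statistical surveys.
XI Fengzheng, born in 1980, male, from Peixian, Jiangsu. His research interests include smart cities, big data, machine learning, and data mining.
ZHOU Xin, born in 1990, male, from Zizhong, Sichuan, engineer. His research interests include machine learning, big data, data mining, and intelligent Web.
HE Ya, born in 1981, male, from Chengdu, Sichuan, senior engineer, M.S. His research interests include smart cities, big data, data mining, and machine learning.
Supported by:
"Light of the West" Young Scholars program (西部之光青年学者)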

Abstract

Enterprise investment attraction involves risks along multiple dimensions. Because of information distortion and the complex relationships in economic behavior, traditional risk assessment methods struggle to identify these risks promptly and accurately. To address these problems, a risk analysis framework integrating a Large Language Model (LLM) with a Graph Neural Network (GNN) was proposed. The semantic understanding capability of the LLM was used to help the GNN construct a comprehensive and accurate dynamic heterogeneous knowledge graph of enterprises, thereby mitigating the information distortion caused by static data. On this basis, to address the limitations of GNNs in depth and semantic expressiveness, a knowledge-based semantic structure mining module was designed and combined with the Qwen2 large model to improve the semantic precision of node representations. Furthermore, an Integrated One Graph (IOG) module was proposed to unify node classification and graph classification tasks as the prediction of "focus nodes". This unified prediction mechanism enabled prediction across different graph structure types and thereby significantly improved the model's generalization across datasets. The IOG-CIQAN (In One Graph with Collective Intelligence and Qwen2 Assistance Network) model built on this framework achieved accuracy above 87% on all three risk analysis datasets (labor, finance, and administration), outperforming multiple baseline models such as Capsule Network (CapsNet).

Key words: Graph Neural Network (GNN), Large Language Model (LLM), graph structure awareness, enterprise risk prediction, unified graph structure representation
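
The abstract describes the IOG module as reducing both node classification and graph classification to a prediction on a "focus node". A minimal Python sketch of that idea follows. The abstract does not say how the focus node is chosen, so the construction here is an assumption: a node-level task uses the target node itself, and a graph-level task adds one virtual node linked to every existing node. The names `with_focus_node` and `mean_aggregate` are hypothetical, and the mean aggregation is only a generic stand-in for a GNN layer, not the IOG-CIQAN model.

```python
from typing import Dict, List, Optional, Tuple

Graph = Dict[int, List[int]]  # adjacency list: node id -> neighbour ids


def with_focus_node(adj: Graph, task: str,
                    target: Optional[int] = None) -> Tuple[Graph, int]:
    """Return (graph, focus_node) so both task types reduce to one-node prediction.

    Assumption (not from the paper): a graph-level task adds a virtual node
    linked to every existing node; a node-level task keeps the graph unchanged
    and uses the target node itself as the focus node.
    """
    if task == "node":
        if target is None or target not in adj:
            raise ValueError("node task needs a target that exists in the graph")
        return {n: list(nbrs) for n, nbrs in adj.items()}, target
    if task == "graph":
        focus = max(adj) + 1 if adj else 0
        new_adj = {n: list(nbrs) + [focus] for n, nbrs in adj.items()}
        new_adj[focus] = list(adj)  # virtual node sees every original node
        return new_adj, focus
    raise ValueError(f"unknown task: {task}")


def mean_aggregate(adj: Graph, feats: Dict[int, List[float]],
                   dim: int) -> Dict[int, List[float]]:
    """One round of mean aggregation over each node and its neighbours.

    Nodes without features (such as the virtual node) count as zero vectors.
    """
    zero = [0.0] * dim
    out = {}
    for node, nbrs in adj.items():
        group = [feats.get(node, zero)] + [feats.get(n, zero) for n in nbrs]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


# Toy path graph 0 - 1 - 2 with two-dimensional node features.
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}

# Node-level task: the focus node is the target node.
g_node, focus_a = with_focus_node(adj, "node", target=1)
# Graph-level task: the focus node is the added virtual node.
g_graph, focus_b = with_focus_node(adj, "graph")

# In both cases a downstream classifier would read only the focus node's vector.
h_node = mean_aggregate(g_node, feats, dim=2)[focus_a]
h_graph = mean_aggregate(g_graph, feats, dim=2)[focus_b]
```

Under this assumption a single prediction head can serve both task types, because each input is reduced to the representation of one designated node.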

CLC Number: TP391.1

Xiaobin LYU, Yuanquan TANG, Huaiqiang SU, Maoyao ZHAO, Fengzheng XI, Xin ZHOU, Ya HE. Algorithm for analyzing investment risks by integrating large language model with graph structure[J]. Journal of Computer Applications, 0, (): 7-11.

Figures/Tables: 4

Method	Accuracy	F1 (N)	F1 (F)	F1 (T)	F1 (U)
Topic-Cap	59.23	60.14	61.76	53.89	57.33
MCapsNet	59.27	58.41	58.73	54.63	52.95
CredGNN	71.14	62.19	62.54	64.08	63.72
FraudGNN	80.10	64.84	66.17	67.53	68.81
FinLLM	81.29	74.19	73.61	75.21	69.84
IOG-CIQAN	87.25	79.34	78.14	81.16	82.14

Method	Accuracy	F1 (N)	F1 (F)	F1 (T)	F1 (U)
Topic-Cap	61.36	60.21	53.94	62.15	67.48
MCapsNet	59.27	58.63	59.12	54.89	52.76
CredGNN	74.80	62.37	61.85	63.42	62.93
FraudGNN	88.30	63.82	65.24	66.73	67.95
FinLLM	89.44	77.76	74.18	75.68	71.39
IOG-CIQAN	90.13	83.66	81.91	82.77	83.17

Method	Accuracy	F1 (N)	F1 (F)	F1 (T)	F1 (U)
Topic-Cap	61.35	59.94	52.76	62.77	56.72
MCapsNet	59.27	57.19	59.47	53.26	51.38
CredGNN	74.48	60.29	60.29	60.29	60.29
FraudGNN	88.30	60.48	63.57	64.51	65.89
FinLLM	89.29	77.19	73.61	75.21	69.84
IOG-CIQAN	92.13	83.31	81.45	82.36	93.14
