Medical insurance fraud detection algorithm based on graph convolutional neural network

doi:10.11772/j.issn.1001-9081.2019101766

Journal of Computer Applications, 2020, Vol. 40, Issue 5: 1272-1277.




YI Dongyi, DENG Genqiang, DONG Chaoxiong, ZHU Miaomiao, LYU Zhouping, ZHU Suisong

Union Shenzhen Hospital, Huazhong University of Science and Technology, Shenzhen, Guangdong 518060, China

Received: 2019-10-18; Revised: 2019-12-12; Online: 2020-05-15; Published: 2020-05-10
Corresponding author: DENG Genqiang
About the authors:
YI Dongyi, born in 1993, native of Qianjiang, Hubei, M.S. candidate. His research interests include deep learning and data mining.
DENG Genqiang, born in 1991, native of Zhuzhou, Hunan, M.S. candidate. His research interests include pattern recognition and data mining.
DONG Chaoxiong, born in 1976, native of Tianmen, Hubei, M.S. candidate. His research interests include hospital informatization and data mining.
ZHU Miaomiao, born in 1992, native of Chenzhou, Hunan, M.S. candidate. Her research interests include intelligent algorithms and multi-objective optimization.
LYU Zhouping, born in 1985, native of Cixi, Zhejiang. His research interests include intelligent algorithm optimization.
ZHU Suisong, born in 1964, native of Xianning, Hubei. His research interests include medical insurance policy and hospital informatization.
Supported by:
This work was partially supported by the Shenzhen Nanshan District Technology Research and Development and Creative Design Project (No. 2018042).

Abstract:

To address the shortage of fraud samples, the high cost of data labeling, and the low accuracy of traditional models based on Euclidean space in medical insurance fraud detection, a new One-Class medical insurance fraud detection model based on Graph convolution and Variational Auto-Encoder (OCGVAE) was proposed. Firstly, a social network was built from patient visit records, the weights of the relationships between patients and doctors were calculated, and a 2-layer Graph Convolutional neural Network (GCN) was designed to take the social network data as input and reduce its dimensionality. Secondly, a Variational Auto-Encoder (VAE) was designed to enable model training when only one-class fraud sample labels are available. Finally, a Logistic Regression (LR) model was designed to discriminate the data category. The experimental results show that the detection accuracy of the OCGVAE model reaches 87.26%, which is 16.1%, 70.2%, 31.7%, 36.5% and 27.6% higher than that of the One-Class Adversarial Net (OCAN), One-Class Gaussian Process (OCGP), One-Class Nearest Neighbor (OCNN), One-Class Support Vector Machine (OCSVM) and Semi-supervised GCN (Semi-GCN) algorithms, respectively, demonstrating that the proposed model effectively improves the accuracy of medical insurance fraud screening.
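
The first step described above (a weighted patient-doctor graph passed through a 2-layer GCN to obtain low-dimensional node embeddings) might be sketched in NumPy as follows. This is only an illustration under stated assumptions: the abstract does not say how edge weights are computed, so visit counts are used here; node features are one-hot; the weight matrices are random and untrained; and the symmetric normalization is the one commonly used in GCN formulations. All function and variable names are hypothetical.

```python
import numpy as np

def build_adjacency(visits, n_patients, n_doctors):
    """Weighted patient-doctor graph; edge weight = visit count (an assumption)."""
    n = n_patients + n_doctors
    A = np.zeros((n, n))
    for patient, doctor in visits:
        j = n_patients + doctor  # doctor nodes are indexed after patient nodes
        A[patient, j] += 1.0
        A[j, patient] += 1.0     # keep the graph undirected
    return A

def normalize(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees >= 1 due to self-loops
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_embed(A, X, W0, W1):
    """Two-layer GCN forward pass: A_norm @ relu(A_norm @ X @ W0) @ W1."""
    A_norm = normalize(A)
    H = np.maximum(A_norm @ X @ W0, 0.0)  # first layer with ReLU
    return A_norm @ H @ W1                # second layer, linear output

rng = np.random.default_rng(0)
visits = [(0, 0), (0, 0), (1, 0), (2, 1)]  # hypothetical (patient, doctor) pairs
A = build_adjacency(visits, n_patients=3, n_doctors=2)
X = np.eye(5)                              # one-hot node features, illustrative only
W0 = rng.normal(size=(5, 4))               # untrained weights
W1 = rng.normal(size=(4, 2))
Z = gcn_embed(A, X, W0, W1)
print(Z.shape)  # (5, 2): one 2-dimensional embedding per node
```

In the pipeline the abstract outlines, embeddings such as `Z` would then be passed to the VAE and LR stages; those stages are not sketched here because the abstract gives no detail on their architecture or loss.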

Key words: medical insurance fraud detection, Graph Convolutional neural Network (GCN), Variational Auto-Encoder (VAE), social network, one-class, active learning

CLC number:

TP39.4

YI Dongyi, DENG Genqiang, DONG Chaoxiong, ZHU Miaomiao, LYU Zhouping, ZHU Suisong. Medical insurance fraud detection algorithm based on graph convolutional neural network[J]. Journal of Computer Applications, 2020, 40(5): 1272-1277.

