Journal of Computer Applications, 2018, Vol. 38, Issue (7): 2124-2129. DOI: 10.11772/j.issn.1001-9081.2018010123


Grading of diabetic retinopathy based on cost-sensitive semi-supervised ensemble learning

REN Fulong1,2, CAO Peng1, WAN Chao3, ZHAO Dazhe1   

  1. College of Computer Science and Engineering, Northeastern University, Shenyang Liaoning 110089, China;
  2. State Key Laboratory of Software Architecture, Northeastern University, Shenyang Liaoning 110179, China;
  3. Department of Ophthalmology, the First Hospital of China Medical University, Shenyang Liaoning 110001, China
  • Received: 2018-01-15  Revised: 2018-03-14  Online: 2018-07-10  Published: 2018-07-12
  • Supported by:
    This work was partially supported by the National Natural Science Foundation of China (61502091), the Fundamental Research Funds for Shenyang Municipal Science and Technology Bureau (17-134-8-00), and the Fundamental Research Funds for the Central Universities (N161604001, N150408001).

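  • Corresponding author: WAN Chao
  • About the authors: REN Fulong (1978-), male, born in Shenyang, Liaoning, Ph.D. candidate; research interests: image processing, pattern recognition. CAO Peng (1982-), male, born in Dalian, Liaoning, lecturer, Ph.D.; research interests: machine learning, data mining. WAN Chao (1979-), female, born in Shenyang, Liaoning, associate professor, Ph.D.; research interests: basic and clinical research on diabetic retinopathy. ZHAO Dazhe (1960-), female, born in Shenyang, Liaoning, professor, Ph.D.; research interests: image processing, software engineering.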

Abstract: In traditional Diabetic Retinopathy (DR) grading systems, the lack of lesion labels and the imbalanced class distribution in datasets prevent supervised classification models from classifying lesions effectively. To address this problem, a Cost-Sensitive Semi-supervised Bagging (CS-SemiBagging) algorithm for DR grading was proposed. Firstly, retinal vessels were removed from the fundus image, and suspicious red lesions (MicroAneurysms (MAs) and HEMorrhages (HEMs)) were detected on the vessel-free image. Secondly, a 22-dimensional feature vector based on color, shape and texture was extracted to describe each candidate lesion region. Thirdly, a CS-SemiBagging model was constructed to classify MAs and HEMs. Finally, the severity of DR was graded into four levels according to the numbers of the different lesions. The proposed method was evaluated on the publicly available MESSIDOR database, where it achieved an average accuracy of 90.2%, 4.9 percentage points higher than the classical Co-training semi-supervised learning method. The results indicate that the CS-SemiBagging algorithm can grade DR effectively without label information for the suspicious lesions, which avoids both the time-consuming labeling of lesions by specialists and the adverse effect of imbalanced samples on classification.

Key words: Diabetic Retinopathy (DR), classification, cost-sensitive learning, semi-supervised learning, ensemble learning
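
The final step described in the abstract, grading DR into four levels from lesion counts, can be illustrated with a minimal sketch. The abstract does not state the thresholds, so the values below are an assumption borrowed from the retinopathy grade definitions distributed with the MESSIDOR database, with the neovascularization criterion omitted because the abstract only mentions MAs and HEMs. The function name `grade_dr` and its parameters are hypothetical, and the authors' actual rule may differ.

```python
def grade_dr(n_ma: int, n_hem: int) -> int:
    """Map lesion counts to a DR grade in {0, 1, 2, 3}.

    Assumed thresholds (MESSIDOR-style, neovascularization ignored):
      grade 0: no MAs and no HEMs
      grade 1: 1 to 5 MAs and no HEMs
      grade 2: 6 to 14 MAs, or 1 to 4 HEMs
      grade 3: 15 or more MAs, or 5 or more HEMs
    """
    if n_ma < 0 or n_hem < 0:
        raise ValueError("lesion counts must be non-negative")
    # Check the most severe condition first so it takes precedence.
    if n_ma >= 15 or n_hem >= 5:
        return 3
    if n_ma > 5 or n_hem > 0:
        return 2
    if n_ma > 0:
        return 1
    return 0


if __name__ == "__main__":
    # Counts would come from the lesion classifier's predictions per image.
    for counts in [(0, 0), (3, 0), (8, 0), (2, 1), (20, 0), (0, 6)]:
        print(counts, "->", grade_dr(*counts))
```

Under these assumed thresholds, the grade depends only on how many candidates the classifier labels as MAs and as HEMs, which is why lesion-level misclassification feeds directly into image-level grading error.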

