Imbalanced image classification approach based on convolution neural network and cost-sensitivity

doi:10.11772/j.issn.1001-9081.2018010152

Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (7): 1862-1865. DOI: 10.11772/j.issn.1001-9081.2018010152

TAN Jiefan¹, ZHU Yan¹, CHEN Tung-shou², CHANG Chin-chen³

1. College of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 611756, China;
2. Department of Computer Science and Information Engineering, Taichung University of Science and Technology, Taichung, Taiwan 404, China;
3. Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan 407, China

Received: 2018-01-17 Revised: 2018-03-06 Online: 2018-07-10 Published: 2018-07-12
Corresponding author: ZHU Yan
About the authors: TAN Jiefan (born 1992 in Changning, Sichuan), female, M.S. candidate; her research interests include image data mining. ZHU Yan (born 1965 in Guilin, Guangxi), female, Ph.D., professor, CCF member; her research interests include data mining, Web anomaly detection, and big data management and intelligent analysis. CHEN Tung-shou (born 1964 in Huoqiu, Anhui), male, Ph.D., professor; his research interests include image processing, data mining, and information security. CHANG Chin-chen (born 1954 in Taichung, Taiwan), male, Ph.D., professor; his research interests include database design, e-commerce security, electronic multimedia imaging technology, and cryptography.
Supported by:
This work is partially supported by the Academic and Technological Leadership Training Foundation of Sichuan Province (WZ0100112371408, YH1500411031402), the Academic and Technological Leadership Research Foundation of Sichuan Province (WZ0100112371601/004), and the Demonstration Project in Technology Service Industry of Sichuan Province (2016GFW0166).

Abstract: To address the low recall of the minority class, the high total cost of classification results, and the subjectivity and labor cost of manual feature extraction in imbalanced image classification, an approach called Triplet-CSSVM was proposed, which combines a Triplet-sampling Convolutional Neural Network (Triplet-sampling CNN) with a Cost-Sensitive Support Vector Machine (CSSVM). The method divides classification into two parts: feature learning and cost-sensitive classification. Firstly, a CNN that uses the triplet loss as its loss function learned, end to end, an encoding that maps images to a Euclidean space. Then, the dataset was reconstructed by a sampling method to balance its distribution. Finally, the CSSVM classification algorithm assigned different cost factors to different classes to obtain the classification result with the minimum cost. Experiments were conducted with the portrait dataset FaceScrub on the deep learning framework Caffe. The experimental results show that, at an imbalance ratio of 1:3, the proposed method increases the precision of the minority class by 31 percentage points and its recall by 71 percentage points compared with VGGNet-SVM (Visual Geometry Group Net-Support Vector Machine).
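
The feature-learning step relies on a triplet loss, whose exact form and margin the abstract does not give. The following is a minimal sketch, assuming the commonly used hinge form max(0, d(a, p)² − d(a, n)² + margin) over squared Euclidean distances. The function names, the margin value of 0.2, and the toy 2-D embeddings are illustrative assumptions, not the authors' Caffe implementation.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin: float = 0.2) -> float:
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    Assumption: the margin of 0.2 is a placeholder, not a value from the paper.
    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (measured in squared distance).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings, made up for illustration.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same identity as the anchor
far_negative = [3.0, 0.0]    # different identity, already well separated
near_negative = [0.0, 0.5]   # different identity, closer than the positive

loss_satisfied = triplet_loss(anchor, positive, far_negative)
loss_violated = triplet_loss(anchor, positive, near_negative)
print(loss_satisfied, loss_violated)
```

Only the second triplet yields a positive loss, so only it would push the network to move the negative away from the anchor and pull the positive closer.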

Key words: Convolution Neural Network (CNN), cost sensitive, image classification, data balance, Support Vector Machine (SVM)
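
The two remaining steps, rebalancing by sampling and cost-sensitive classification, can be sketched in the same spirit. The abstract names neither the sampling method nor the cost factors, so the sketch below assumes plain random oversampling of the smaller class and a hinge loss scaled by a per-class cost. The function names, the cost of 3.0 for the minority class, and the toy scores are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes match
    the size of the largest one. Assumption: the paper's sampling method may differ."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def cost_weighted_hinge(scores, labels, cost):
    """Mean hinge loss in which each sample is scaled by the cost of its true class.

    `scores` are signed decision values f(x); `labels` are +1 (minority) or
    -1 (majority); `cost` maps each label to its cost factor.
    """
    total = sum(cost[y] * max(0.0, 1.0 - y * s) for s, y in zip(scores, labels))
    return total / len(labels)


# A toy dataset at the 1:3 imbalance ratio mentioned in the abstract.
labels = [+1, -1, -1, -1]
samples = ["minority_0", "majority_0", "majority_1", "majority_2"]
balanced_x, balanced_y = oversample_minority(samples, labels)
balanced_counts = Counter(balanced_y)

# Hypothetical decision values that misclassify the single minority sample.
scores = [-0.5, -2.0, -2.0, -2.0]
uniform = cost_weighted_hinge(scores, labels, {+1: 1.0, -1: 1.0})
sensitive = cost_weighted_hinge(scores, labels, {+1: 3.0, -1: 1.0})
print(balanced_counts, uniform, sensitive)
```

With equal costs the missed minority sample contributes little to the average loss; raising its cost factor makes the same mistake three times as expensive, which is the effect a cost-sensitive SVM exploits during training.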

CLC number: TP181

TAN Jiefan, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen. Imbalanced image classification approach based on convolution neural network and cost-sensitivity[J]. Journal of Computer Applications, 2018, 38(7): 1862-1865.

