Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1383-1390.DOI: 10.11772/j.issn.1001-9081.2021071240
Special Issue: Artificial Intelligence
• Artificial intelligence •
Received: 2021-07-16
Revised: 2021-08-31
Accepted: 2021-09-14
Online: 2021-09-28
Published: 2022-05-10
Contact: Wei REN
About author: REN Wei, born in 1996 in Xiangfen, Shanxi, M. S. candidate. His research interests include deep learning and computer vision. E-mail: 2783800599@qq.com
Wei REN, Hexiang BAI. Multi-label image classification method based on global and local label relationship[J]. Journal of Computer Applications, 2022, 42(5): 1383-1390.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021071240
Method | mAP | CP (All) | CR (All) | CF1 (All) | OP (All) | OR (All) | OF1 (All) | CP (Top-3) | CR (Top-3) | CF1 (Top-3) | OP (Top-3) | OR (Top-3) | OF1 (Top-3) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CNN-RNN | 61.2 | ― | ― | ― | ― | ― | ― | 66.0 | 55.6 | 60.4 | 69.2 | 66.4 | 67.8 |
SRN | 77.1 | 81.6 | 65.4 | 71.2 | 82.7 | 69.9 | 75.8 | 85.2 | 58.8 | 67.4 | 87.4 | 62.5 | 72.9 |
Multi-Evidence | ― | 80.4 | 70.2 | 74.9 | 85.2 | 72.5 | 78.4 | 84.5 | 62.2 | 70.6 | 89.1 | 64.3 | 74.7 |
Res-101 | 80.1 | 78.2 | 71.9 | 74.9 | 82.3 | 75.0 | 78.5 | 82.8 | 63.4 | 71.8 | 87.6 | 65.5 | 75.0 |
CNN-LSTM-Att | ― | 80.9 | 70.9 | 75.6 | 83.7 | 74.9 | 79.1 | ― | ― | ― | ― | ― | ― |
ML-GCN | 83.0 | 85.1 | 72.0 | 78.0 | 85.8 | 75.4 | 80.3 | 89.2 | 64.1 | 74.6 | 90.5 | 66.5 | 76.7 |
SSGRL | 83.8 | 89.9 | 68.5 | 76.8 | 91.3 | 70.8 | 79.7 | 91.9 | 62.5 | 72.7 | 93.8 | 64.1 | 76.2 |
LLR | 83.8 | 86.0 | 72.6 | 78.8 | 86.9 | 75.8 | 81.0 | 89.4 | 64.6 | 75.0 | 90.7 | 67.0 | 77.0 |
ML-GLLR | 84.0 | 86.5 | 72.4 | 78.8 | 87.1 | 75.8 | 81.1 | 90.0 | 64.0 | 74.8 | 91.3 | 66.7 | 77.1 |
Tab. 1 Comparison of evaluation metrics of different methods on the MSCOCO2014 dataset
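The column abbreviations in Tab. 1 are not defined on this page. The sketch below assumes the definitions commonly used in multi-label classification: CP/CR are per-class precision/recall averaged over labels, OP/OR pool the counts over all labels, and CF1/OF1 are the harmonic means of the corresponding pair. The helper names `f1` and `multilabel_pr` are illustrative and do not come from the paper.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def multilabel_pr(y_true, y_pred):
    """Return (CP, CR, OP, OR) in percent for binary label matrices.

    y_true, y_pred: lists of rows, one row per image, one 0/1 entry per label.
    """
    n_labels = len(y_true[0])
    correct = [0] * n_labels    # correctly predicted positives per label
    predicted = [0] * n_labels  # predicted positives per label
    actual = [0] * n_labels     # ground-truth positives per label
    for t_row, p_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            correct[j] += t * p
            predicted[j] += p
            actual[j] += t
    # Per-class: average the per-label ratios (an empty denominator counts as 0).
    cp = 100 * sum(c / p if p else 0.0 for c, p in zip(correct, predicted)) / n_labels
    cr = 100 * sum(c / a if a else 0.0 for c, a in zip(correct, actual)) / n_labels
    # Overall: pool the counts over all labels first, then take the ratio.
    op = 100 * sum(correct) / sum(predicted) if sum(predicted) else 0.0
    o_r = 100 * sum(correct) / sum(actual) if sum(actual) else 0.0
    return cp, cr, op, o_r


# ML-GLLR row of Tab. 1: (precision, recall, reported F1)
ml_gllr = {
    "ALL CF1": (86.5, 72.4, 78.8),
    "ALL OF1": (87.1, 75.8, 81.1),
    "Top-3 CF1": (90.0, 64.0, 74.8),
    "Top-3 OF1": (91.3, 66.7, 77.1),
}
for name, (p, r, reported) in ml_gllr.items():
    print(f"{name}: recomputed {f1(p, r):.1f}, reported {reported}")
```

Under these assumed definitions, the F1 values recomputed from the ML-GLLR precision and recall columns agree with the reported ones to one decimal place.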
Method | mAP | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | table | dog | horse | motorbike | person | plant | sheep | sofa | train | TV |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CNN-RNN | 84.0 | 96.7 | 83.1 | 94.2 | 92.8 | 61.2 | 82.1 | 89.1 | 94.2 | 64.2 | 83.6 | 70.0 | 92.4 | 91.7 | 84.2 | 93.7 | 59.8 | 93.2 | 75.3 | 99.7 | 78.6 |
RLSD | 88.5 | 96.4 | 92.7 | 93.8 | 94.1 | 71.2 | 92.5 | 94.2 | 95.7 | 74.3 | 90.0 | 74.2 | 95.4 | 96.2 | 92.1 | 97.9 | 66.9 | 93.5 | 73.7 | 97.5 | 87.6 |
VGG | 89.7 | 98.9 | 95.0 | 96.8 | 95.4 | 69.7 | 90.4 | 93.5 | 96.0 | 74.2 | 86.6 | 87.8 | 96.0 | 96.3 | 93.1 | 97.2 | 70.0 | 92.1 | 80.3 | 98.1 | 87.0 |
HCP | 90.9 | 98.6 | 97.1 | 98.0 | 95.6 | 75.3 | 94.7 | 95.8 | 97.3 | 73.1 | 90.2 | 80.0 | 97.3 | 96.1 | 94.9 | 96.3 | 78.3 | 94.7 | 76.2 | 97.9 | 91.5 |
Res-101 | 91.9 | 99.1 | 97.6 | 96.5 | 95.1 | 74.2 | 91.3 | 96.0 | 95.8 | 75.5 | 92.2 | 88.5 | 96.2 | 96.6 | 94.3 | 98.5 | 83.2 | 94.8 | 84.7 | 98.6 | 90.1 |
ML-GCN | 94.0 | 99.5 | 98.5 | 98.6 | 98.1 | 80.8 | 94.6 | 97.2 | 98.2 | 82.3 | 95.7 | 86.4 | 98.2 | 98.4 | 96.7 | 99.0 | 84.7 | 96.7 | 84.3 | 98.9 | 93.7 |
SSGRL | 95.0 | 99.7 | 98.4 | 98.0 | 97.6 | 85.7 | 96.2 | 98.2 | 98.8 | 82.0 | 98.1 | 89.7 | 98.8 | 98.7 | 97.0 | 99.0 | 86.9 | 98.1 | 85.8 | 99.0 | 93.7 |
LLR | 94.6 | 99.4 | 97.5 | 97.9 | 97.1 | 83.9 | 95.2 | 97.7 | 98.0 | 83.6 | 95.4 | 90.0 | 97.7 | 98.0 | 96.3 | 99.0 | 86.8 | 96.5 | 88.4 | 98.7 | 94.4 |
ML-GLLR | 95.9 | 99.8 | 98.4 | 98.2 | 98.2 | 86.2 | 97.6 | 98.2 | 98.8 | 85.7 | 97.2 | 92.6 | 98.7 | 98.9 | 97.1 | 99.2 | 89.2 | 98.3 | 90.7 | 99.3 | 96.1 |
Tab. 2 Comparison of per-label AP of different methods on the VOC2007 dataset
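As a quick consistency check on Tab. 2, the mAP column should equal the unweighted mean of the 20 per-label AP values in the same row. The snippet below is a minimal sketch that copies two rows from the table; the function name `mean_ap` is illustrative.

```python
def mean_ap(per_label_ap):
    """Unweighted mean of per-label average precision values."""
    return sum(per_label_ap) / len(per_label_ap)


# Per-label AP values copied from Tab. 2 (VOC2007), in column order.
rows = {
    "Res-101": [99.1, 97.6, 96.5, 95.1, 74.2, 91.3, 96.0, 95.8, 75.5, 92.2,
                88.5, 96.2, 96.6, 94.3, 98.5, 83.2, 94.8, 84.7, 98.6, 90.1],
    "ML-GLLR": [99.8, 98.4, 98.2, 98.2, 86.2, 97.6, 98.2, 98.8, 85.7, 97.2,
                92.6, 98.7, 98.9, 97.1, 99.2, 89.2, 98.3, 90.7, 99.3, 96.1],
}
reported_map = {"Res-101": 91.9, "ML-GLLR": 95.9}

for method, aps in rows.items():
    print(f"{method}: recomputed {mean_ap(aps):.1f}, reported {reported_map[method]}")
```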
Method | mAP (MSCOCO2014) | T-OF1 (MSCOCO2014) | T-CF1 (MSCOCO2014) | A-OF1 (MSCOCO2014) | A-CF1 (MSCOCO2014) | mAP (VOC2007) | T-OF1 (VOC2007) | T-CF1 (VOC2007) | A-OF1 (VOC2007) | A-CF1 (VOC2007) |
---|---|---|---|---|---|---|---|---|---|---|
Res-101 | 80.1 | 75.0 | 71.8 | 78.5 | 74.9 | 91.9 | 87.7 | 85.5 | 87.7 | 85.5 |
LLR (without DLSA module) | 81.4 | 75.6 | 72.9 | 79.2 | 76.5 | 92.7 | 89.3 | 86.9 | 89.3 | 86.9 |
LLR (without semantic module) | 82.1 | 76.0 | 72.8 | 79.6 | 77.0 | 93.6 | 89.9 | 87.7 | 89.9 | 87.7 |
LLR | 83.8 | 77.0 | 75.0 | 81.0 | 78.8 | 94.6 | 90.5 | 88.6 | 90.4 | 88.5 |
ML-GLLR | 84.0 | 77.1 | 74.8 | 81.1 | 78.8 | 95.9 | 90.9 | 89.6 | 90.9 | 89.5 |
Tab. 3 Ablation experimental results
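To read the ablation in Tab. 3 as increments, the mAP columns can be differenced against a chosen reference row. The sketch below copies only the mAP values from the table; the helper `gains_over` and the shortened row labels are illustrative.

```python
# mAP values copied from Tab. 3: (MSCOCO2014, VOC2007)
map_scores = {
    "Res-101": (80.1, 91.9),
    "LLR (without DLSA module)": (81.4, 92.7),
    "LLR (without semantic module)": (82.1, 93.6),
    "LLR": (83.8, 94.6),
    "ML-GLLR": (84.0, 95.9),
}


def gains_over(reference, scores):
    """mAP difference of every other row relative to the reference row."""
    base = scores[reference]
    return {
        name: tuple(round(v - b, 1) for v, b in zip(values, base))
        for name, values in scores.items()
        if name != reference
    }


# Gains over the Res-101 baseline, then the extra gain of ML-GLLR over LLR.
for name, (coco, voc) in gains_over("Res-101", map_scores).items():
    print(f"{name}: +{coco} mAP on MSCOCO2014, +{voc} mAP on VOC2007")
print("ML-GLLR vs. LLR:", gains_over("LLR", map_scores)["ML-GLLR"])
```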