Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (6): 1826-1832. DOI: 10.11772/j.issn.1001-9081.2022071008
Special Issue: Artificial Intelligence
Ancient mural dynasty identification based on attention mechanism and transfer learning
Huibin ZHANG1,2, Liping FENG1, Yaojun HAO1, Yining WANG1
Received: 2022-07-11
Revised: 2022-11-18
Accepted: 2022-11-30
Online: 2023-01-04
Published: 2023-06-10
Contact: Huibin ZHANG
About author:
ZHANG Huibin, born in 1971, associate professor, Ph. D. candidate. His research interests include deep learning, applied mathematics. E-mail: 927433441@qq.com
FENG Liping, born in 1976, Ph. D., professor. Her research interests include distributed optimization, deep learning.
Huibin ZHANG, Liping FENG, Yaojun HAO, Yining WANG. Ancient mural dynasty identification based on attention mechanism and transfer learning[J]. Journal of Computer Applications, 2023, 43(6): 1826-1832.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022071008
Tab. 1 ResNet20 structure based on attention mechanism

| Layer name | Output map size | Output channels | Convolution operation | Number of convolutions |
| --- | --- | --- | --- | --- |
| Linear | 64×6 average pool, 64-6 fc + Softmax | | | |
| Conv1.X | 112×112 | 16 | 3×3, S=2 | 1 |
| Conv2.X | 112×112 | 16 | 3×3, S=1; 3 residual blocks | 6 |
| Attention | | 16 | | |
| Residual connection | 56×56 | 32 | Improved residual connection method | |
| Conv2.X | 56×56 | 32 | First conv 3×3, S=2; other convs 3×3, S=1; 3 residual blocks | 6 |
| Residual connection | 28×28 | 64 | Improved residual connection method | |
| Conv3.X | 28×28 | 64 | First conv 3×3, S=2; other convs 3×3, S=1; 3 residual blocks | 6 |
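To make the Tab. 1 layout concrete, the sketch below builds an equivalent backbone in PyTorch. It is only an approximation of the proposed network: standard two-convolution residual blocks with a plain 1×1-convolution shortcut stand in for the improved residual connection, and the attention stage is left as a pluggable placeholder rather than the POSA module of the paper.

```python
# Minimal PyTorch sketch of the Tab. 1 layout (assumptions: standard basic
# residual blocks, a plain 1x1-conv shortcut standing in for the improved
# residual connection, and a placeholder slot for the attention module).
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """Two 3x3 convolutions with a shortcut; the first conv may downsample."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        if stride != 1 or in_ch != out_ch:
            # Stand-in for the "improved residual connection" rows of Tab. 1.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))


class AttentionResNet20(nn.Module):
    """Stem -> 3 blocks (16 ch) -> attention -> 3 blocks (32 ch) -> 3 blocks (64 ch) -> pool + fc."""
    def __init__(self, num_classes=6, attention=None):
        super().__init__()
        self.stem = nn.Sequential(                                              # Conv1.X: 224 -> 112, 16 ch
            nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(16), nn.ReLU(inplace=True))
        self.stage1 = nn.Sequential(*[BasicBlock(16, 16) for _ in range(3)])    # 112x112, 16 ch
        self.attention = attention if attention is not None else nn.Identity()  # attention slot
        self.stage2 = nn.Sequential(BasicBlock(16, 32, stride=2),               # 56x56, 32 ch
                                    BasicBlock(32, 32), BasicBlock(32, 32))
        self.stage3 = nn.Sequential(BasicBlock(32, 64, stride=2),               # 28x28, 64 ch
                                    BasicBlock(64, 64), BasicBlock(64, 64))
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(64, num_classes))                   # 64-6 fc; softmax in the loss

    def forward(self, x):
        x = self.stem(x)
        x = self.attention(self.stage1(x))
        x = self.stage3(self.stage2(x))
        return self.head(x)


if __name__ == "__main__":
    model = AttentionResNet20()
    print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 6])
```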
Tab. 2 Numbers of images in different dynasties in DH1926 dataset

| Dynasty | Total samples | Training samples | Test samples |
| --- | --- | --- | --- |
| Total | 1 926 | 1 158 | 768 |
| Northern Wei | 303 | 175 | 128 |
| Northern Zhou | 276 | 148 | 128 |
| Sui | 271 | 143 | 128 |
| Tang | 341 | 213 | 128 |
| Five Dynasties | 270 | 142 | 128 |
| Western Wei | 465 | 337 | 128 |
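A hypothetical data-loading sketch that reproduces the Tab. 2 split is shown below: 128 test images are held out per dynasty and the remainder is used for training. The "DH1926" directory name, the folder-per-dynasty layout, and the 224×224 resize are assumptions made only for illustration.

```python
# Hypothetical split matching Tab. 2: hold out 128 test images per dynasty.
import random
from collections import defaultdict

from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

TEST_PER_CLASS = 128

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

full = datasets.ImageFolder("DH1926", transform=transform)  # assumed: 6 dynasty sub-folders

# Group sample indices by dynasty label.
by_class = defaultdict(list)
for idx, (_, label) in enumerate(full.samples):
    by_class[label].append(idx)

# Shuffle each class and reserve 128 images for the test set.
rng = random.Random(0)
train_idx, test_idx = [], []
for idxs in by_class.values():
    rng.shuffle(idxs)
    test_idx += idxs[:TEST_PER_CLASS]
    train_idx += idxs[TEST_PER_CLASS:]

train_loader = DataLoader(Subset(full, train_idx), batch_size=32, shuffle=True)
test_loader = DataLoader(Subset(full, test_idx), batch_size=32)
```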
Tab. 3 Comparison of experimental results of different network models

| Model | Total samples | Training samples | Training proportion/% | Test samples | Test proportion/% | Accuracy/% |
| --- | --- | --- | --- | --- | --- | --- |
| DunNet[ | 3 860 | 3 000 | 77.7 | 700 | 18.1 | 71.64 |
| Ref. [ | 9 630 | 8 430 | 87.5 | 1 200 | 12.5 | 84.44 |
| Ref. [ | 2 538 | 2 030 | 80.0 | 254 | 10.0 | 88.46 |
| Ref. [ | 9 700 | 7 760 | 80.0 | 970 | 10.0 | 88.70 |
| Proposed model | 1 926 | 1 158 | 60.1 | 768 | 39.9 | 98.05 |
Tab. 4 Comparative analysis of classifier performance

| Classifier | Test accuracy/% |
| --- | --- |
| Baseline | 97.00 |
| Baseline++ | 96.61 |
| Proposed classifier | 98.05 |
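For reference, the two baseline heads compared in Tab. 4 can be sketched as follows, assuming "Baseline" is an ordinary linear layer and "Baseline++" the cosine-similarity head of Chen et al. [21]; the paper's own classifier is not reproduced here.

```python
# Sketch of the two reference classifier heads of Tab. 4 on top of a 64-d
# feature vector (assumption: Baseline = plain fc layer, Baseline++ = cosine
# similarity head as in Chen et al. [21]).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearHead(nn.Module):
    """Baseline: a plain fully connected layer producing class logits."""
    def __init__(self, feat_dim=64, num_classes=6):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):
        return self.fc(feats)


class CosineHead(nn.Module):
    """Baseline++: cosine similarity between features and learned class weights."""
    def __init__(self, feat_dim=64, num_classes=6, scale=10.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale  # temperature that sharpens the softmax over cosine logits

    def forward(self, feats):
        feats = F.normalize(feats, dim=-1)
        weight = F.normalize(self.weight, dim=-1)
        return self.scale * feats @ weight.t()
```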
Tab. 5 Comparison of test accuracy for training and testing sets with different sample sizes

| Total samples | Training samples | Training proportion/% | Test samples | Test proportion/% | Test accuracy/% |
| --- | --- | --- | --- | --- | --- |
| 1 926 | 964 | 50.1 | 962 | 49.9 | 97.56 |
| 1 926 | 1 158 | 60.1 | 768 | 39.9 | 98.05 |
| 1 926 | 1 542 | 80.1 | 384 | 19.9 | 98.70 |
Tab. 6 Performance analysis of POSA module

| Network model | Training samples | Test samples | Test accuracy/% |
| --- | --- | --- | --- |
| ResNet20 without POSA module | 1 158 | 768 | 92.84 |
| ResNet20 with POSA module | 1 158 | 768 | 96.00 |
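The Tab. 6 ablation can be expressed by toggling the attention slot of the backbone sketched after Tab. 1. The POSA (polarized self-attention [20]) module itself is not reproduced here; a squeeze-and-excitation style channel gate serves purely as a hypothetical stand-in to show how the with/without variants are built.

```python
# Hypothetical with/without-attention variants for the Tab. 6 ablation, reusing
# the AttentionResNet20 sketch given after Tab. 1. ChannelGate is only a
# stand-in attention module; it is not the POSA module of the paper.
import torch.nn as nn


class ChannelGate(nn.Module):
    """Channel gating: global average pool -> bottleneck 1x1 convs -> sigmoid scale."""
    def __init__(self, channels=16, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(self.pool(x))


# Same backbone and training setup, attention slot toggled:
model_without_attention = AttentionResNet20(attention=nn.Identity())
model_with_attention = AttentionResNet20(attention=ChannelGate(16))
```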
References
[1] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10) [2022-05-10].
[2] CAO J F, YAN M M, JIA Y M, et al. Application of Inception-v3 model integrated with transfer learning in dynasty identification of ancient murals[J]. Journal of Computer Applications, 2021, 41(11): 3219-3227. 10.11772/j.issn.1001-9081.2020121924
[3] BALAKRISHNAN T, ROSSTON S, TANG E. Using CNN to classify and understand artists from the Rijksmuseum[R/OL]. [2022-05-10].
[4] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90
[5] LI Q Q, ZOU Q, MA D, et al. Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes[J]. Science China Information Sciences, 2018, 61(9): No.092105. 10.1007/s11432-017-9308-x
[6] CAO J F, YAN M M, TIAN X D, et al. A dynasty classification algorithm of ancient murals based on adaptively enhanced capsule network[J]. Journal of Graphics, 2021, 42(5): 744-754.
[7] LI X Y, ZENG Y, GONG Y. Chronological classification of ancient paintings of Mogao Grottoes using convolutional neural networks[C]// Proceedings of the IEEE 4th International Conference on Signal and Image Processing. Piscataway: IEEE, 2019: 51-55. 10.1109/siprocess.2019.8868392
[8] ZHU Z D, LIN K X, JAIN A K, et al. Transfer learning in deep reinforcement learning: a survey[EB/OL]. (2022-05-16) [2022-06-10].
[9] KRIZHEVSKY A. Learning multiple layers of features from tiny images[R/OL]. (2009-04-08) [2022-05-10].
[10] YOSINSKI J, CLUNE J, BENGIO Y, et al. How transferable are features in deep neural networks?[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 3320-3328.
[11] DONAHUE J, JIA Y Q, VINYALS O, et al. DeCAF: a deep convolutional activation feature for generic visual recognition[C]// Proceedings of the 31st International Conference on Machine Learning. New York: JMLR.org, 2014: 647-655.
[12] LONG M S, CAO Y, WANG J M, et al. Learning transferable features with deep adaptation networks[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 97-105.
[13] GANIN Y, LEMPITSKY V. Unsupervised domain adaptation by backpropagation[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 1180-1189.
[14] GUO M H, XU T X, LIU J J, et al. Attention mechanisms in computer vision: a survey[J]. Computational Visual Media, 2022, 8(3): 331-368. 10.1007/s41095-022-0271-y
[15] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. (2016-05-19) [2022-05-10].
[16] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[17] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. 10.1109/cvpr.2018.00813
[18] MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3138-3147. 10.1109/wacv48630.2021.00318
[19] QIN Z, SUN W X, DENG H, et al. cosFormer: rethinking softmax in attention[EB/OL]. (2022-02-17) [2022-05-10].
[20] LIU H J, LIU F Q, FAN X Y, et al. Polarized self-attention: towards high-quality pixel-wise regression[EB/OL]. (2021-07-08) [2022-05-10]. 10.1016/j.neucom.2022.07.054
[21] CHEN W Y, LIU Y H, KIRA Z, et al. A closer look at few-shot classification[EB/OL]. (2020-01-12) [2022-05-10].
[22] HE T, ZHANG Z, ZHANG H, et al. Bag of tricks for image classification with convolutional neural networks[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 558-567. 10.1109/cvpr.2019.00065
[23] HE K, ZHANG X, REN S, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1026-1034. 10.1109/iccv.2015.123
[24] ZHU C, NI R K, XU Z, et al. GradInit: learning to initialize neural networks for stable and efficient training[C/OL]// Proceedings of the 35th Conference on Neural Information Processing Systems [2022-05-10].
[25] DE S, SMITH S L. Batch normalization biases residual blocks towards the identity function in deep networks[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 19964-19975.
[26] ZHANG H B, FENG L P, ZHANG X H, et al. Necessary conditions for convergence of CNNs and initialization of convolution kernels[J]. Digital Signal Processing, 2022, 123: No.103397. 10.1016/j.dsp.2022.103397
[27] KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30) [2022-05-10].