Journal of Computer Applications, 2022, Vol. 42, Issue (7): 2072-2077. DOI: 10.11772/j.issn.1001-9081.2021050740
Special Issue: Artificial intelligence
Wanjun LIU, Jiaming WANG, Haicheng QU, Libing DONG, Xinyu CAO
Received: 2021-05-10
Revised: 2021-11-05
Accepted: 2021-11-24
Online: 2021-12-31
Published: 2022-07-10
Contact: Jiaming WANG
About author: LIU Wanjun, born in 1959, a native of Jinzhou, Liaoning, M. S., professor, senior member of CCF. His research interests include digital image processing and moving target detection and tracking.
Wanjun LIU, Jiaming WANG, Haicheng QU, Libing DONG, Xinyu CAO. Music genre classification algorithm based on attention spectral-spatial feature[J]. Journal of Computer Applications, 2022, 42(7): 2072-2077.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021050740
Tab.1 Genre classification accuracy in the ablation experiment on feature preprocessing
Preprocessing method | Genre classification accuracy/% |
---|---|
Traditional Fourier transform | 85.35 |
Mel spectrogram | 87.27 |
Tab.2 Genre classification accuracies in the ablation experiment on the main modules of the model
Experiment No. | Quadruple convolution | Spatial attention | Residual module | Accuracy/% |
---|---|---|---|---|
a | - | - | - | 87.27 |
b | - | √ | - | 88.38 |
c | √ | √ | - | 89.01 |
d | - | √ | √ | 90.10 |
e | √ | √ | √ | 91.62 |
Tab.3 Genre classification accuracy comparison of different networks on the validation set
Network | Genre classification accuracy/% |
---|---|
GoogLeNet | 81.18 |
ResNet-34B | 84.67 |
VGGNet19 | 86.11 |
AlexNet | 86.26 |
DCNN-SSA | 91.62 |
Tab.4 Genre classification accuracy comparison of different networks on the test set
Network | Genre classification accuracy/% |
---|---|
GoogLeNet | 70.00 |
ResNet-34B | 72.00 |
VGGNet19 | 76.00 |
AlexNet | 76.00 |
DCNN-SSA | 82.00 |