Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (3): 891-894. DOI: 10.11772/j.issn.1001-9081.2017092175

• Virtual Reality and Multimedia Computing •

Musical instrument identification based on multiscale time-frequency modulation and multilinear principal component analysis

WANG Fei, YU Fengqing

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214100, China
  • Received: 2017-09-11 Revised: 2017-10-20 Online: 2018-03-10 Published: 2018-03-07
  • Corresponding author: WANG Fei
  • About the authors: WANG Fei (born 1991), male, from Wuxi, Jiangsu, is a master's student whose research interests are speech signal processing and deep learning. YU Fengqing (born 1962), female, from Beizhen, Liaoning, is a professor with a Ph.D. whose research interests are speech signal processing and time-frequency analysis of non-stationary signals.

Abstract: Time-domain and frequency-domain features, cepstral features, sparse features, and probability features misclassify instruments of the same family at a high rate and recognize percussion instruments poorly. To address this, a model that extracts time-frequency information with low redundancy is proposed for musical instrument identification. First, a cochlear model decomposes the musical tone into harmonics and produces an Auditory Spectrum (AS) that contains time-frequency information and approximates human auditory perception. Next, multiscale filters apply Multiscale Time-Frequency Modulation (MTFM) to the AS to capture how it varies over time and frequency. Then, Multilinear Principal Component Analysis (MPCA) reduces the dimensionality of the modulation output while preserving the structure and intrinsic correlation of the data. Finally, a Support Vector Machine (SVM) performs the classification. Simulation experiments show that the method reaches an average accuracy of 92.74% on the IOWA database, with error rates of 3% for percussion instruments and 9.12% for same-family instruments, both better than the features listed above. Compared with Principal Component Analysis (PCA) for dimension reduction, MPCA raises identification accuracy by 6.43%. The proposed model is therefore suitable for identifying same-family and percussion instruments.

Key words: Multiscale Time-Frequency Modulation (MTFM), Multilinear Principal Component Analysis (MPCA), Auditory Spectrum (AS), Support Vector Machine (SVM), musical instrument identification
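
The MPCA step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the tensor shape, the retained ranks, and the iteration count are all hypothetical choices made for illustration. The sketch assumes each sample is a three-way feature tensor (for example, the output of a multiscale modulation stage) and follows the usual MPCA recipe: center the data, initialize one projection matrix per mode from the mode-wise scatter matrices, then refine each matrix in turn while the other modes stay projected.

```python
import numpy as np


def mode_scatter(samples, mode):
    """Sum over samples of X_(mode) @ X_(mode).T, where X_(mode) is the mode unfolding.

    `samples` has shape (N, I1, I2, ...); axis 0 is the sample axis.
    """
    axis = mode + 1
    unfolded = np.moveaxis(samples, axis, 0).reshape(samples.shape[axis], -1)
    return unfolded @ unfolded.T


def project(samples, factors, skip=None):
    """Apply U_n^T along every feature mode n, optionally leaving one mode untouched."""
    out = samples
    for n, u in enumerate(factors):
        if n == skip:
            continue
        moved = np.moveaxis(out, n + 1, -1)      # bring mode n to the last axis
        out = np.moveaxis(moved @ u, -1, n + 1)  # (..., I_n) @ (I_n, P_n)
    return out


def top_eigvecs(scatter, k):
    """Eigenvectors of a symmetric matrix for its k largest eigenvalues."""
    _, vecs = np.linalg.eigh(scatter)            # eigenvalues come back ascending
    return vecs[:, ::-1][:, :k]


def mpca_fit(samples, ranks, n_iter=3):
    """Estimate the sample mean and one projection matrix per feature mode.

    `ranks` (assumed, user-chosen) gives the retained dimension P_n for each mode.
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    n_modes = centered.ndim - 1
    # Initialization: truncate the eigenvectors of the unprojected scatter matrices.
    factors = [top_eigvecs(mode_scatter(centered, n), ranks[n]) for n in range(n_modes)]
    for _ in range(n_iter):
        for n in range(n_modes):
            partial = project(centered, factors, skip=n)
            factors[n] = top_eigvecs(mode_scatter(partial, n), ranks[n])
    return mean, factors


def mpca_transform(samples, mean, factors):
    """Project centered samples and flatten each reduced tensor into a feature vector."""
    return project(samples - mean, factors).reshape(len(samples), -1)


# Synthetic stand-in data: 20 samples, each a 6 x 5 x 4 tensor (shape is an assumption).
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 6, 5, 4))
mean, factors = mpca_fit(data, ranks=(3, 2, 2))
features = mpca_transform(data, mean, factors)
print(features.shape)
```

Unlike ordinary PCA, which would first flatten each sample into a vector of length 120, this sketch keeps the three modes separate and learns a small projection matrix for each, which is what lets the reduction preserve the tensor structure. The flattened `features` array could then be passed to any SVM implementation for classification.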

CLC Number: