Automatic bird vocalization identification based on Mel-subband parameterized feature

Journal of Computer Applications, 2017, Vol. 37, Issue 4: 1111-1115. DOI: 10.11772/j.issn.1001-9081.2017.04.1111

ZHANG Saihua, ZHAO Zhao, XU Zhiyong, ZHANG Yi

School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing Jiangsu 210094, China

Received: 2016-09-14 Revised: 2016-12-26 Online: 2017-04-10 Published: 2017-04-19
Corresponding author: ZHAO Zhao
About the authors: ZHANG Saihua (born 1993), female, from Nantong, Jiangsu, M.S. candidate; research interests: signal processing, pattern recognition. ZHAO Zhao (born 1979), male, from Xiangyang, Hubei, associate professor, Ph.D.; research interests: acoustic detection systems, signal processing, time-frequency analysis. XU Zhiyong (born 1968), male, from Nanjing, Jiangsu, associate professor, Ph.D.; research interests: acoustic detection systems, array signal processing. ZHANG Yi (born 1994), female, from Suzhou, Jiangsu, M.S. candidate; research interests: signal processing, pattern recognition.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61401203, 61171167) and the Natural Science Foundation of Jiangsu Province (BK20130776).

Abstract: To address vocalization-based bird species classification in complex natural acoustic environments, an automatic bird vocalization identification method based on a new Mel-subband parameterized feature is proposed. Continuous field recordings are first divided into frames, and the distribution of the frame log-energies is fitted with a two-component Gaussian Mixture Model (GMM); frames with high likelihood are selected to form candidate acoustic events, which completes the automatic segmentation. A Mel band-pass filter bank is then applied to the spectrogram of each event, and the output of each subband, i.e. a time series of time-varying band-limited energy, is modeled with an AutoRegressive (AR) model. The model coefficients of all subbands form a parameterized feature set that describes the time-frequency characteristics of different bird vocalizations. Finally, a Support Vector Machine (SVM) classifier is used for identification. In automatic segmentation and identification experiments on field recordings containing vocalizations of eleven bird species, the precision, recall, and F1-measure of the proposed method are no lower than 89% for every species. The proposed method considerably outperforms an existing texture-feature-based method and is better suited to automatic data analysis in continuous acoustic monitoring of wild birds.
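The segmentation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, the hand-written EM fit, and the rule "keep frames whose posterior for the higher-mean component exceeds 0.5" are all assumptions made here for illustration, since the abstract only states that a two-mixture GMM is fitted to frame log-energies and that high-likelihood frames are kept.

```python
import numpy as np

def frame_log_energy(x, frame_len=512, hop=256):
    """Log-energy of each frame of a 1-D signal (frame sizes are assumed values)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.log(np.sum(x[idx] ** 2, axis=1) + 1e-12)

def posteriors(v, w, mu, var):
    """Posterior probability of each of the two components for every value in v."""
    logp = np.log(w) - 0.5 * (np.log(2 * np.pi * var) + (v[:, None] - mu) ** 2 / var)
    logp -= logp.max(axis=1, keepdims=True)  # stabilize before exponentiating
    r = np.exp(logp)
    return r / r.sum(axis=1, keepdims=True)

def fit_gmm2(v, n_iter=100):
    """Plain EM for a two-component 1-D Gaussian mixture."""
    w = np.array([0.5, 0.5])
    mu = np.percentile(v, [25, 75])          # crude initial means
    var = np.full(2, v.var() + 1e-6)
    for _ in range(n_iter):
        r = posteriors(v, w, mu, var)        # E-step
        nk = r.sum(axis=0) + 1e-12           # M-step
        w = nk / nk.sum()
        mu = (r * v[:, None]).sum(axis=0) / nk
        var = (r * (v[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var

def select_event_frames(v, threshold=0.5):
    """Boolean mask of frames assigned to the higher-energy component (assumed rule)."""
    w, mu, var = fit_gmm2(v)
    r = posteriors(v, w, mu, var)
    return r[:, np.argmax(mu)] > threshold

def frames_to_events(mask):
    """Group consecutive selected frames into (start, end) pairs, end exclusive."""
    padded = np.concatenate(([False], mask, [False])).astype(int)
    d = np.diff(padded)
    return list(zip(np.flatnonzero(d == 1).tolist(), np.flatnonzero(d == -1).tolist()))
```

Calling `frames_to_events(select_event_frames(frame_log_energy(x)))` on a recording `x` returns candidate events as frame-index ranges; any further merging or minimum-duration filtering would be an additional design choice not covered here.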

Key words: bird vocalization, automated identification, Mel-subband, time-series modeling, Support Vector Machine (SVM)
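As a rough illustration of the "Mel-subband" and "time-series modeling" keywords, the sketch below builds a Mel filter bank, computes per-subband energy sequences from a power spectrogram, and fits an AR model to each sequence. Everything specific is an assumption for illustration only: the triangular filter shape, the number of bands (8), the AR order (4), the least-squares fit, and the mean removal are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sr):
    """Triangular band-pass filters with edges equally spaced on the Mel scale."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_bands + 2))
    fb = np.zeros((n_bands, freqs.size))
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def power_spectrogram(x, n_fft=512, hop=256):
    """Hann-windowed power spectrogram, shape (n_frames, n_fft // 2 + 1)."""
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ar_coefficients(series, order):
    """Least-squares AR fit: s[t] ~= sum_k a[k] * s[t-k-1], after mean removal."""
    s = series - series.mean()
    X = np.column_stack([s[order - k - 1 : len(s) - k - 1] for k in range(order)])
    a, *_ = np.linalg.lstsq(X, s[order:], rcond=None)
    return a

def mel_subband_ar_feature(x, sr, n_bands=8, order=4, n_fft=512, hop=256):
    """Concatenate the AR coefficients of every Mel subband energy sequence."""
    spec = power_spectrogram(x, n_fft, hop)
    energy = spec @ mel_filterbank(n_bands, n_fft, sr).T   # (n_frames, n_bands)
    return np.concatenate([ar_coefficients(energy[:, b], order) for b in range(n_bands)])
```

For one segmented event `x` sampled at `sr` Hz, `mel_subband_ar_feature(x, sr)` returns a vector of length `n_bands * order`, which could then be passed to a classifier such as an SVM.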

CLC number: TP391.4

ZHANG Saihua, ZHAO Zhao, XU Zhiyong, ZHANG Yi. Automatic bird vocalization identification based on Mel-subband parameterized feature[J]. Journal of Computer Applications, 2017, 37(4): 1111-1115.


