Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Homologous spectrogram feature fusion with self-attention mechanism for bird sound classification
Zhihua LIU, Wenjie CHEN, Aibin CHEN
Journal of Computer Applications    2022, 42 (4): 1260-1268.   DOI: 10.11772/j.issn.1001-9081.2021071258
Abstract492)   HTML11)    PDF (1376KB)(190)       Save

At present, most deep learning models are difficult to deal with the classification of bird sound under complex background noise. Because bird sound has the continuity characteristic in time domain and high-low characteristic in frequency domain, a fusion model of homologous spectrogram features was proposed for bird sound classification under complex background noise. Firstly, Convolutional Neural Network (CNN) was used to extract Mel-spectrogram features of bird sound. Then, the time domain and frequency domain dimensions of the same Mel-spectrogram feature were compressed to 1 by specific convolution and down-sampling operations, so that frequency domain feature with only high-low characteristics and the time domain feature with only continuous characteristics were obtained. Based on the above operation to extract frequency domain and time domain features, the features of Mel-spectrogram were extracted both in time domain and frequency domain, the time-frequency domain features with continuity and high-low characteristics were obtained. Then the self-attention mechanism was applied to the obtained time domain, frequency domain and time-frequency domain features, strengthening their own characteristics. Finally, the results of these three homologous spectrogram features after decision fusion were used for bird sound classification. The proposed model was used for audio classification of 8 bird species on Xeno-canto website, achieved the better result in the comparison experiment with the Mean Average Precision (MAP) of 0.939. The experimental results show that the proposed model can deal with the problem of the poor classification effect of bird sound under complex background noise.

Table and Figures | Reference | Related Articles | Metrics