Journal of Computer Applications, 2018, Vol. 38, Issue (8): 2411-2415. DOI: 10.11772/j.issn.1001-9081.2018020311


Improved pitch contour creation and selection algorithm for melody extraction

LI Qiang, YU Fengqin   

  1. College of Internet of Things Engineering, Jiangnan University, Wuxi Jiangsu 214122, China
  • Received: 2018-02-05; Revised: 2018-03-14; Online: 2018-08-10; Published: 2018-08-11

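  • Corresponding author: LI Qiang
  • About the authors: LI Qiang (李强, born 1991), male, from Bozhou, Anhui, is a master's student whose main research interest is speech signal analysis and processing. YU Fengqin (于凤芹, born 1962), female, from Beizhen, Liaoning, is a professor with a Ph.D. whose main research interests are speech signal processing and time-frequency analysis of non-stationary signals.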

Abstract: In polyphonic music, interference between sound sources breaks the continuity of the pitch sequence of a single source and reduces pitch estimation accuracy. To address this problem, an improved pitch contour creation and selection algorithm for melody extraction is proposed. First, the pitch salience of each point in the time-frequency spectrum is calculated, and pitch contours are created based on auditory streaming cues and the continuity of pitch salience. Second, to select the melodic pitch contours, non-melodic pitch contours are removed according to the repetitive characteristics of the accompaniment, with the Dynamic Time Warping (DTW) algorithm used to calculate the similarity between melodic and non-melodic pitch contours. Finally, octave errors in the melodic pitch contours are detected from the long-term relationship between adjacent pitch contours, and the melodic pitch contours are smoothed to form the melody pitch line. Simulation experiments on the ORCHSET dataset show that the proposed algorithm improves pitch estimation accuracy by 2.86% and overall accuracy by 3.32% compared with the original algorithm, indicating that it can effectively solve the pitch estimation problem.
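
The contour selection step can be illustrated with a minimal sketch of DTW-based similarity. This is not the authors' implementation: the absolute-difference local cost, the length normalization, the threshold value, and the names `dtw_distance` and `drop_accompaniment_like` are all assumptions made for illustration, and contours are assumed to be lists of pitch values in cents.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two pitch sequences.

    Assumption: local cost is the absolute pitch difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # repeat b[j-1] for another frame of a
                cost[i][j - 1],      # repeat a[i-1] for another frame of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def drop_accompaniment_like(candidates, accompaniment, threshold=50.0):
    """Keep candidates that are not close to any accompaniment contour.

    Assumption: the DTW distance is normalized by the longer contour
    length, and `threshold` (in cents per frame) is a hypothetical value.
    """
    kept = []
    for contour in candidates:
        distances = [
            dtw_distance(contour, ref) / max(len(contour), len(ref))
            for ref in accompaniment
        ]
        if not distances or min(distances) > threshold:
            kept.append(contour)
    return kept


candidates = [[6000, 6100, 6200], [4800, 4800, 5000]]
accompaniment = [[4800, 4800, 4800, 5000]]
melodic = drop_accompaniment_like(candidates, accompaniment)
```

In this toy input the second candidate warps onto the accompaniment contour at zero cost and is discarded, while the first is kept. How the repeated accompaniment contours are identified in the first place is not specified in the abstract and is left outside the sketch.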

Key words: melody extraction, pitch contour, continuity of pitch salience, Dynamic Time Warping (DTW), octave error

