Indoor speech separation and sound source localization system based on dual-microphone

CHEN Binjie, LU Zhihua, ZHOU Yu, YE Qingwei   

  1. Faculty of Information Science and Engineering, Ningbo University, Ningbo Zhejiang 315211, China
  • Received: 2018-04-27  Revised: 2018-07-03  Online: 2018-12-10  Published: 2018-12-15
    This work is partially supported by the National Natural Science Foundation of China (51675286, 61071198) and the Science and Technology Innovation Team of Zhejiang Province (2013TD21).


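  • About the authors: CHEN Binjie (陈斌杰, born 1993), male, from Jingzhou, Hubei, M.S. candidate; research interests: speech signal processing, indoor sound source localization. LU Zhihua (陆志华, born 1983), male, from Jinhua, Zhejiang, lecturer, Ph.D.; research interests: speech signal processing, real-time tracking of multiple moving targets. ZHOU Yu (周宇, born 1960), male, from Weihai, Shandong, professor, M.S.; research interests: signal processing, network and information security. YE Qingwei (叶庆卫, born 1970), male, from Quzhou, Zhejiang, professor, Ph.D.; research interests: signal processing, optimization search.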
Abstract: To explore whether two microphones suffice to separate and localize multiple sound sources in a two-dimensional plane, an indoor speech separation and sound source localization system based on dual microphones was proposed. From the signals collected by the microphones, a dual-microphone delay-attenuation model was established. The Degenerate Unmixing Estimation Technique (DUET) algorithm was then used to estimate the delay-attenuation parameters of the model, and a histogram of these parameters was drawn. In the speech separation stage, a Binary Time-Frequency Mask (BTFM) was constructed from the parameter histogram and applied to separate the mixed speech. In the sound source localization stage, the equations that determine the source location were obtained by deriving the relationship between the model's attenuation parameters and the signal energy ratio. The Roomsimove toolbox was used to simulate the indoor acoustic environment. Through Matlab simulation and geometric coordinate calculation, multiple sound sources were separated and simultaneously localized in the two-dimensional plane. The experimental results show that the localization errors of the proposed system for multiple sound source signals are all less than 2%. The system therefore contributes to the research and development of small-scale systems.
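
The abstract does not give the localization equations themselves. As a rough illustration of how a delay-attenuation pair could be turned into a position, the sketch below assumes a free-field model (not stated in the source) in which amplitude falls off as 1/r, so that the relative attenuation is a = r1/r2 and the relative delay is δ = (r2 − r1)/c. The microphone placement at (−d/2, 0) and (d/2, 0), the function names, the speed of sound, and the restriction to the half-plane y ≥ 0 are all assumptions made for illustration; the paper's actual equations may differ.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def delay_attenuation(source, mic_spacing, c=SPEED_OF_SOUND):
    """Forward model (assumed): mics at (-d/2, 0) and (d/2, 0), 1/r amplitude decay.

    Returns (a, delta), where a = r1 / r2 is the amplitude at mic 2 relative
    to mic 1 and delta = (r2 - r1) / c is the arrival delay at mic 2
    relative to mic 1, in seconds.
    """
    x, y = source
    r1 = math.hypot(x + mic_spacing / 2, y)
    r2 = math.hypot(x - mic_spacing / 2, y)
    return r1 / r2, (r2 - r1) / c


def locate_source(a, delta, mic_spacing, c=SPEED_OF_SOUND):
    """Invert the assumed model: recover (x, y) with y >= 0 from (a, delta)."""
    if math.isclose(a, 1.0, abs_tol=1e-9):
        # Equidistant source: r2 - r1 = 0, so the distances cannot be resolved.
        raise ValueError("a == 1: source is equidistant from both microphones")
    # From r1 = a * r2 and r2 - r1 = c * delta.
    r2 = c * delta / (1.0 - a)
    r1 = a * r2
    # Intersect the two circles centred on the microphones.
    x = (r1 ** 2 - r2 ** 2) / (2 * mic_spacing)
    y_sq = r1 ** 2 - (x + mic_spacing / 2) ** 2
    if y_sq < -1e-9:
        raise ValueError("inconsistent (a, delta): circles do not intersect")
    return x, math.sqrt(max(y_sq, 0.0))


if __name__ == "__main__":
    a, delta = delay_attenuation((1.0, 2.0), mic_spacing=0.2)
    print(locate_source(a, delta, mic_spacing=0.2))
```

With closely spaced microphones, a is close to 1 and the division by (1 − a) amplifies estimation noise, so this inversion is only a geometric sketch; in a reverberant room the 1/r assumption itself is approximate.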

Key words: dual-microphone, speech separation, sound source localization, Degenerate Unmixing Estimation Technique (DUET) algorithm, two-dimensional plane


