Journal of Computer Applications, 2024, Vol. 44, Issue (2): 556-562. DOI: 10.11772/j.issn.1001-9081.2023020157

• Multimedia computing and computer simulation •

Channel compensation algorithm for speaker recognition based on probabilistic spherical discriminant analysis

Weipeng JING, Qingxin XIAO, Hui LUO

  1. School of Information and Computer Engineering, Northeast Forestry University, Harbin Heilongjiang 150006, China
  • Received: 2023-02-21; Revised: 2023-04-18; Accepted: 2023-04-21; Online: 2023-08-14; Published: 2024-02-10
  • Contact: Hui LUO
  • About author:JING Weipeng, born in 1979, Ph. D., professor. His research interests include artificial intelligence.
    XIAO Qingxin, born in 1999, M. S. candidate. His research interests include speaker recognition.
  • Supported by:
    National Natural Science Foundation of China (62101114)


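  • About author: JING Weipeng (born 1979), male, native of Hegang, Heilongjiang; professor, Ph. D., CCF senior member. Main research interest: artificial intelligence.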


In speaker recognition, the Probabilistic Linear Discriminant Analysis (PLDA) model is a commonly used classification backend. However, the distribution assumption of the Gaussian PLDA model does not accurately fit the real distribution of speaker features, and channel compensation methods based on length normalization, which rely on the Gaussian distribution assumption, may destroy the independence of the within-class distribution of speaker features. As a result, Gaussian PLDA cannot fully utilize the speaker information extracted by the upstream feature extraction task, which degrades recognition results. To address this issue, a Channel Compensation algorithm for speaker recognition based on Probabilistic Spherical Discriminant Analysis (CC-PSDA) was proposed. CC-PSDA replaced the PLDA method based on the Gaussian distribution assumption with a Probabilistic Spherical Discriminant Analysis (PSDA) model based on the von Mises-Fisher (VMF) distribution assumption, together with a feature transformation method, so that channel compensation does not affect the independence of the within-class distribution of speaker features. Firstly, to make the speaker features conform to the VMF prior assumption and fit the backend classification model, a nonlinear transformation was applied to the distribution of the speaker features at the feature level. Then, exploiting the property that the VMF-based PSDA model does not destroy the within-class distribution structure of speaker features, the transformed speaker features were defined on a hypersphere of a specific dimension, maximizing the inter-class distance of the features. The proposed model was solved with the Expectation-Maximization (EM) algorithm, and the classification task was then completed. Experimental results show that the improved algorithm achieves lower equal error rates than both the PSDA and Gaussian PLDA models on three test sets. Therefore, the proposed algorithm can effectively distinguish speaker features and improve recognition performance.
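
For orientation, the three generic building blocks named in the abstract (length normalization, the VMF density on a hypersphere, and the equal error rate) can be sketched in a few lines of Python. This is only an illustrative sketch and not the CC-PSDA algorithm itself: the paper's nonlinear feature transformation and its EM training procedure are not specified in the abstract and are not reproduced here. All function names are hypothetical, and the series-based Bessel evaluation is an assumption that should only be adequate for moderate concentration values.

```python
import math


def length_normalize(vec):
    """Scale a feature vector to unit Euclidean norm (a point on the hypersphere)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def log_bessel_i(order, x, terms=200):
    """Log of the modified Bessel function I_order(x) for x > 0, via its power series.

    Assumption: 200 terms is enough for moderate x; a library routine would be
    preferable in practice.
    """
    logs = [
        (2 * m + order) * math.log(x / 2.0)
        - math.lgamma(m + 1)
        - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(term - peak) for term in logs))


def vmf_log_density(x, mu, kappa):
    """Log-density of a VMF distribution at unit vector x.

    mu is the unit mean direction and kappa > 0 the concentration.
    """
    d = len(x)
    order = d / 2.0 - 1.0
    log_norm = (
        order * math.log(kappa)
        - (d / 2.0) * math.log(2.0 * math.pi)
        - log_bessel_i(order, kappa)
    )
    return log_norm + kappa * sum(a * b for a, b in zip(mu, x))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: sweep thresholds and return the error rate at the point
    where false acceptance and false rejection rates are closest."""
    best = None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]
```

In this sketch a feature vector would first pass through `length_normalize`, after which `vmf_log_density` can be evaluated against a speaker's mean direction; `equal_error_rate` takes the scores of same-speaker and different-speaker trials, whatever backend produced them.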

Key words: speaker recognition, i-vector, Probabilistic Spherical Discriminant Analysis (PSDA), channel compensation, von Mises-Fisher (VMF) distribution, length normalization



