Channel compensation algorithm for speaker recognition based on probabilistic spherical discriminant analysis
Weipeng JING, Qingxin XIAO, Hui LUO
Journal of Computer Applications    2024, 44 (2): 556-562.   DOI: 10.11772/j.issn.1001-9081.2023020157

In speaker recognition tasks, the Probabilistic Linear Discriminant Analysis (PLDA) model is a commonly used classification backend. However, the distribution assumption of the Gaussian PLDA model does not accurately fit the real distribution of speaker features, and channel compensation methods that rely on length normalization under the Gaussian assumption may destroy the independence of the within-class distribution of speaker features. As a result, Gaussian PLDA cannot fully utilize the speaker information produced by the upstream feature extraction task, which degrades the recognition results. To address this issue, a Channel Compensation algorithm for speaker recognition based on Probabilistic Spherical Discriminant Analysis (CC-PSDA) was proposed. The algorithm introduced a Probabilistic Spherical Discriminant Analysis (PSDA) model with the Von Mises-Fisher (VMF) distribution assumption, together with a feature transformation method, to replace the PLDA method based on the Gaussian distribution assumption, thereby avoiding the impact of channel compensation on the independence of the within-class distribution of speaker features. Firstly, a nonlinear transformation was applied at the feature level so that the speaker features conformed to the VMF prior assumption and fit the backend classification model. Then, because the PSDA model under the VMF assumption does not destroy the within-class distribution structure of speaker features, the transformed speaker features were defined on a hypersphere of a specific dimension, maximizing the inter-class distance of the features. The proposed model was solved by the Expectation-Maximization (EM) algorithm, and the classification task was finally completed. Experimental results show that, compared with the PSDA and Gaussian PLDA models, the improved algorithm has the lowest equal error rates on three test sets. Therefore, the proposed algorithm can effectively distinguish speaker features and improve recognition performance.
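
The abstract refers to length normalization, features defined on a hypersphere, and the equal error rate without showing them. The following is a minimal illustrative sketch of those three notions, not the CC-PSDA method itself: the function names are hypothetical, plain cosine scoring stands in for the PSDA backend, and the nonlinear feature transformation is not reproduced because the abstract does not specify it.

```python
import math


def length_normalize(vec):
    """Scale a feature vector to unit length, i.e. place it on the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def cosine_score(a, b):
    """Cosine similarity of two embeddings (a stand-in for a real backend score)."""
    a_n, b_n = length_normalize(a), length_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= threshold. The EER is taken at
    the threshold where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # Miss rate: same-speaker trials that are rejected.
        fnr = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: different-speaker trials that are accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(fnr - far)
        if best is None or gap < best[0]:
            best = (gap, (fnr + far) / 2)
    return best[1]
```

In this sketch, a lower value returned by `equal_error_rate` means the same-speaker and different-speaker scores are better separated, which is the sense in which the abstract compares the three backends.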
