Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (2): 374-379. DOI: 10.11772/j.issn.1001-9081.2021122043

• Artificial Intelligence •

Cross-corpus speech emotion recognition based on decision boundary optimized domain adaptation

Yang WANG1, Hongliang FU1, Huawei TAO1, Jing YANG1, Yue XIE2, Li ZHAO3

  1. Key Laboratory of Grain Information Processing and Control, Ministry of Education (Henan University of Technology), Zhengzhou, Henan 450001, China
  2. School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, Jiangsu 211167, China
  3. School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China
  • Received: 2021-12-06 Revised: 2022-04-27 Accepted: 2022-05-11 Online: 2022-06-13 Published: 2023-02-10
  • Contact: Huawei TAO
  • About the authors: WANG Yang, born in 1999 in Xinyang, Henan, M.S. candidate, CCF member. His research interests include speech signal processing.
    FU Hongliang, born in 1965 in Anyang, Henan, Ph.D., professor. His research interests include communication and information systems.
    YANG Jing, born in 1983 in Shangqiu, Henan, Ph.D., associate professor. Her research interests include communication signal processing.
    XIE Yue, born in 1991 in Huai'an, Jiangsu, Ph.D., lecturer. His research interests include artificial intelligence and affective computing.
    ZHAO Li, born in 1958 in Nanjing, Jiangsu, Ph.D., professor. His research interests include speech signal processing and affective information processing.
  • Supported by:
    National Natural Science Foundation of China (62001215); Natural Science Project of Education Department of Henan Province (21A120003); Start-up Fund for High-level Talents of Henan University of Technology (2018BS037)

Abstract:

Domain adaptation algorithms are widely used in cross-corpus speech emotion recognition. However, while reducing the domain discrepancy, many of these algorithms lose the discriminability of the target-domain samples, so that the samples lie densely around the model's decision boundary and degrade its performance. To address this problem, a cross-corpus speech emotion recognition method based on Decision Boundary Optimized Domain Adaptation (DBODA) is proposed. First, features are processed by a convolutional neural network. The features are then fed into the Maximum Nuclear-norm and Mean Discrepancy (MNMD) module, which maximizes the nuclear norm of the target-domain emotion prediction probability matrix while reducing the inter-domain discrepancy, thereby enhancing the discriminability of the target-domain samples and optimizing the decision boundary. In six cross-corpus experiments built on the Berlin, eNTERFACE and CASIA speech databases, the average recognition accuracy of the proposed method is 1.68 to 11.01 percentage points higher than that of the other algorithms, indicating that the proposed model effectively reduces the sample density around the decision boundary and improves prediction accuracy.

Key words: cross-corpus speech emotion recognition, convolutional neural network, decision boundary optimization, domain adaptation, feature distribution discrepancy
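
The abstract describes the MNMD objective only in words: reduce the discrepancy between domains while maximizing the nuclear norm of the target-domain prediction probability matrix. A minimal NumPy sketch of that idea might look as follows. It is not the authors' implementation: the linear-kernel form of the mean discrepancy, the weight `lam`, the division by batch size, and all function names are assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def mean_discrepancy(src_feat, tgt_feat):
    """Squared distance between the source and target feature means.

    Assumption: a linear-kernel mean discrepancy; the paper may use a
    different kernel or estimator.
    """
    diff = src_feat.mean(axis=0) - tgt_feat.mean(axis=0)
    return float(diff @ diff)


def nuclear_norm(probs):
    """Sum of singular values of a (batch x classes) probability matrix."""
    return float(np.linalg.norm(probs, ord="nuc"))


def mnmd_loss(src_feat, tgt_feat, tgt_logits, lam=1.0):
    """Hypothetical combined loss: discrepancy minus weighted nuclear norm.

    Minimizing this value pulls the domain means together while pushing
    the target predictions toward confident and diverse outputs.
    """
    probs = softmax(tgt_logits)
    batch_size = probs.shape[0]
    return mean_discrepancy(src_feat, tgt_feat) - lam * nuclear_norm(probs) / batch_size


if __name__ == "__main__":
    # Two samples per class, predicted with full confidence.
    confident = np.repeat(np.eye(3), 2, axis=0)
    # Every sample predicted as equally likely to be any class.
    uniform = np.full((6, 3), 1.0 / 3.0)
    print("confident:", nuclear_norm(confident))
    print("uniform:  ", nuclear_norm(uniform))
```

In this toy case the confident, class-balanced predictions give a larger nuclear norm than the uniform ones, which is the property the abstract relies on: a larger nuclear norm corresponds to target samples sitting farther from the decision boundary.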

CLC Number: