Single-channel speech separation model based on auditory modulation Siamese network
Yuan SONG, Xin CHEN, Yarong LI, Yongwei LI, Yang LIU, Zhen ZHAO
Journal of Computer Applications    2025, 45 (6): 2025-2033.   DOI: 10.11772/j.issn.1001-9081.2024050724

To address the poor separation performance of single-channel speech separation methods that take spectrogram features as input, which is caused by overlapping time-frequency points among different speakers, a single-channel speech separation model based on an auditory modulation Siamese network was proposed. First, the modulation signals were computed through frequency band division and envelope demodulation, and the modulation amplitude spectrum was extracted using the Fourier transform. Second, the mapping relationship between the modulation amplitude spectrum features and the speech segments was obtained using a mutation point detection and matching method, so as to segment the speech effectively. Third, a Fusion of Co-attention Mechanisms in Siamese Neural Network (FCMSNN) was designed to extract discriminative features of the speech segments of different speakers. Fourth, a Neighborhood-based Self-Organizing Map (N-SOM) network was proposed; by defining a dynamic neighborhood range, it clusters features without pre-specifying the number of speakers and thereby yields mask matrices for the different speakers. Finally, to avoid artifacts in signals reconstructed in the modulation domain, a time-domain filter was designed to convert the modulation-domain masks into time-domain masks, and the speech signals were reconstructed by combining phase information. Experimental results show that the proposed model outperforms the Double-Density Dual-Tree Complex Wavelet Transform (DDDTCWT) method in terms of Perceptual Evaluation of Speech Quality (PESQ), Signal-to-Distortion Ratio improvement (SDRi) and Scale-Invariant Signal-to-Distortion Ratio improvement (SI-SDRi). On the WSJ0-2mix dataset, PESQ, SDRi and SI-SDRi are improved by 3.47%, 6.91% and 7.79%, respectively; on the WSJ0-3mix dataset, they are improved by 3.08%, 6.71% and 7.51%, respectively.
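The first step of the abstract (frequency band division, envelope demodulation, then a Fourier transform of each envelope) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the ideal FFT-mask band split, the Hilbert-style envelope, the band edges, and the names `analytic_envelope` and `modulation_amplitude_spectrum` are all assumptions chosen for illustration, and a real auditory front end would more likely use a gammatone-style filterbank.

```python
import numpy as np

def analytic_envelope(x):
    """Envelope of x via the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep Nyquist once
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))

def modulation_amplitude_spectrum(x, fs, band_edges):
    """Return (modulation frequencies, array of shape [bands, bins]).

    Assumption: bands are split with an ideal FFT mask over [lo, hi).
    """
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    rows = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(np.where(in_band, spectrum, 0.0), n=n)
        env = analytic_envelope(band)              # envelope demodulation
        env = env - env.mean()                     # drop the DC component
        rows.append(np.abs(np.fft.rfft(env)) / n)  # modulation amplitudes
    return freqs, np.array(rows)

# Toy input: a 1 kHz carrier amplitude-modulated at 4 Hz (hypothetical data).
fs = 8000
t = np.arange(fs) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
mod_freqs, spec = modulation_amplitude_spectrum(x, fs, [0, 500, 2000, 4000])
peak_hz = mod_freqs[int(np.argmax(spec[1]))]  # strongest modulation, band 2
```

In this toy case the band containing the carrier shows its strongest modulation component at the 4 Hz modulation rate, which is the kind of per-band feature the later segmentation and clustering stages would consume.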

3D domain parameterization method based on high-dimensional quasi-conformal mapping
Yuanyuan SONG, Maodong PAN
Journal of Computer Applications    2025, 45 (6): 1963-1970.   DOI: 10.11772/j.issn.1001-9081.2024060850

To address the problem of constructing a high-quality parameterization of a complex three-dimensional (3D) computational domain with given boundary conditions in isogeometric analysis, a 3D domain parameterization method based on high-dimensional quasi-conformal mapping was proposed. The core of the proposed method is a nonlinear optimization model that describes the bijectivity, angular distortion, and volume distortion of the mapping simultaneously. First, high-dimensional quasi-conformal mapping theory was used to derive a new formula for measuring angular distortion in 3D space. Then, an exponential variable and a volume constant were introduced into the optimization model, and the geometric meaning of the Jacobian matrix was exploited to incorporate volume distortion while preserving the bijectivity of the mapping. Finally, the Alternating Direction Method of Multipliers (ADMM) framework was combined with the L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) method to decompose the original problem into tractable subproblems, which were solved alternately. Experimental results show that the proposed method guarantees global bijectivity on the experimental model; compared to ADMM-LRP (ADMM algorithm for Low-Rank Parameterization), it increases orthogonality by about 5.8%, and compared to TTS (Tet-To-Spline optimization strategy), it improves volume uniformity by about 34.4%. It can be seen that the proposed method achieves high-quality parameterization, ensures the bijectivity of the mapping, and reduces both angular distortion and volume distortion.
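The quantities the abstract names (bijectivity, angular distortion, orthogonality) are all read off the Jacobian matrix of the mapping. The sketch below shows generic, textbook-style pointwise measures, assuming a smooth map from R^3 to R^3: a positive Jacobian determinant as the local bijectivity condition, the scaled Jacobian as an orthogonality measure, and the ratio ||J||_F^2 / (3 det(J)^(2/3)) as a conformal (angle) distortion that equals 1 for an angle-preserving map. These are not the paper's formulas, which the abstract does not give; the function names and the shear test map are hypothetical.

```python
import numpy as np

def jacobian(f, p, h=1e-6):
    """Central-difference Jacobian of f: R^3 -> R^3 at point p."""
    p = np.asarray(p, dtype=float)
    cols = []
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        cols.append((f(p + step) - f(p - step)) / (2.0 * h))
    return np.stack(cols, axis=1)  # column k holds df/dp_k

def quality_at(f, p):
    """Return (det J, scaled Jacobian, conformal distortion) at p."""
    J = jacobian(f, p)
    det = np.linalg.det(J)
    # Scaled Jacobian: 1 when the three tangent directions are orthogonal.
    scaled = det / np.prod(np.linalg.norm(J, axis=0))
    # Conformal distortion: 1 for an angle-preserving map, larger otherwise.
    # It is only meaningful where the map is locally bijective (det > 0).
    conformal = np.sum(J ** 2) / (3.0 * det ** (2.0 / 3.0)) if det > 0 else np.inf
    return det, scaled, conformal

def shear(p):
    """Hypothetical test map: shear the unit cube along the first axis."""
    u, v, w = p
    return np.array([u + 0.5 * v, v, w])

det, scaled, conformal = quality_at(shear, [0.3, 0.4, 0.5])

# Local bijectivity check: det J > 0 at every point of a sample grid.
grid = np.linspace(0.1, 0.9, 3)
all_positive = all(
    quality_at(shear, (a, b, c))[0] > 0 for a in grid for b in grid for c in grid
)
```

For the shear map the determinant is 1 (volume preserved, locally bijective), while the scaled Jacobian drops below 1 and the conformal distortion rises above 1, reflecting the lost orthogonality. An optimization model of the kind the abstract describes would trade such terms off against each other over the whole domain.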
