Single-channel speech separation model based on auditory modulation Siamese network
Yuan SONG, Xin CHEN, Yarong LI, Yongwei LI, Yang LIU, Zhen ZHAO
Journal of Computer Applications    2025, 45 (6): 2025-2033.   DOI: 10.11772/j.issn.1001-9081.2024050724

To address the problem that overlapping time-frequency points among different speakers lead to poor separation performance in single-channel speech separation methods based on spectrogram feature input, a single-channel speech separation model based on an auditory modulation Siamese network was proposed. Firstly, the modulation signals were computed through frequency band division and envelope demodulation, and the modulation amplitude spectrum was extracted using the Fourier transform. Secondly, the mapping relationship between the modulation amplitude spectrum features and speech segments was obtained using a mutation point detection and matching method to achieve effective segmentation of speech segments. Thirdly, a Fusion of Co-attention Mechanisms in Siamese Neural Network (FCMSNN) was designed to extract discriminative features of the speech segments of different speakers. Fourthly, a Neighborhood-based Self-Organizing Map (N-SOM) network was proposed to perform feature clustering without pre-specifying the number of speakers by defining a dynamic neighborhood range, so as to obtain mask matrices for different speakers. Finally, to avoid artifacts in the signals reconstructed in the modulation domain, a time-domain filter was designed to convert modulation-domain masks into time-domain masks and reconstruct the speech signals by combining phase information. The experimental results show that the proposed model outperforms the Double-Density Dual-Tree Complex Wavelet Transform (DDDTCWT) method in terms of Perceptual Evaluation of Speech Quality (PESQ), Signal-to-Distortion Ratio improvement (SDRi) and Scale-Invariant Signal-to-Distortion Ratio improvement (SI-SDRi): on the WSJ0-2mix and WSJ0-3mix datasets, PESQ, SDRi and SI-SDRi are improved by 3.47%, 6.91% and 7.79%, and by 3.08%, 6.71% and 7.51%, respectively.
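The front end of this pipeline (band division, envelope demodulation, Fourier-based modulation amplitude spectrum) is concrete enough to illustrate in code. The sketch below is an illustration under stated assumptions, not the authors' implementation: the band edges, filter order, window and framing parameters are placeholders chosen for readability.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def modulation_amplitude_spectrum(x, fs,
                                  bands=((100, 500), (500, 1500), (1500, 3400)),
                                  frame_len=1024, hop=256):
    """Band-split the signal, demodulate each band's envelope, and take the
    Fourier magnitude of framed envelopes as modulation amplitude features.
    Band edges must lie below fs/2; the values here are assumptions."""
    feats = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)            # frequency band division (zero-phase)
        env = np.abs(hilbert(band))           # envelope demodulation
        frames = [env[i:i + frame_len]
                  for i in range(0, len(env) - frame_len + 1, hop)]
        # FFT magnitude of each windowed envelope frame = modulation amplitude spectrum
        windowed = np.stack(frames) * np.hanning(frame_len)
        feats.append(np.abs(np.fft.rfft(windowed, axis=1)))
    return np.stack(feats)                    # (bands, frames, modulation bins)
```

A zero-phase band-pass filter is used here so the band envelopes stay time-aligned across bands before framing, which matters when segment boundaries are later matched across features.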

Image inpainting algorithm of multi-scale generative adversarial network based on multi-feature fusion
Gang CHEN, Yongwei LIAO, Zhenguo YANG, Wenying LIU
Journal of Computer Applications    2023, 43 (2): 536-544.   DOI: 10.11772/j.issn.1001-9081.2022010015

Aiming at the problems of the Multi-scale Generative Adversarial Networks Image Inpainting algorithm (MGANII), such as unstable training in the process of image inpainting, poor structural consistency, and insufficient details and textures of the inpainted image, an image inpainting algorithm of multi-scale generative adversarial network based on multi-feature fusion was proposed. Firstly, aiming at the problems of poor structural consistency and insufficient details and textures, a Multi-Feature Fusion Module (MFFM) was introduced into the traditional generator, and a perception-based feature reconstruction loss function was introduced to improve the feature extraction ability of the dilated convolutional network, thereby supplying more details and texture features for the inpainted image. Then, a perception-based feature matching loss function was introduced into the local discriminator to enhance the discrimination ability of the discriminator, thereby improving the structural consistency of the inpainted image. Finally, a risk penalty term was introduced into the adversarial loss function to meet the Lipschitz continuity condition, so that the network was able to converge rapidly and stably during training. On the CelebA dataset, compared with MGANII, the proposed multi-feature fusion image inpainting algorithm converges faster. Meanwhile, the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) of the images inpainted by the proposed algorithm are improved by 0.45% to 8.67% and 0.88% to 8.06% respectively compared with those of the images inpainted by the baseline algorithms, and the Frechet Inception Distance score (FID) of the images inpainted by the proposed algorithm is reduced by 36.01% to 46.97% compared with that of the baseline algorithms. Experimental results show that the inpainting performance of the proposed algorithm is better than that of the baseline algorithms.
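The risk penalty term mentioned above enforces the Lipschitz continuity condition on the discriminator; one common way to realize such a term is a gradient penalty on interpolated samples, as in WGAN-GP. The sketch below shows that formulation only as a plausible reading of the abstract, not the paper's actual loss; the discriminator, image tensors and the weight lambda_gp are hypothetical placeholders.

```python
import torch

def gradient_penalty(discriminator, real, fake):
    """Penalty encouraging a 1-Lipschitz discriminator: penalize the gradient
    norm of D at points interpolated between real and inpainted samples
    (WGAN-GP-style formulation, assumed here for illustration)."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = eps * real.detach() + (1.0 - eps) * fake.detach()
    interp.requires_grad_(True)
    d_out = discriminator(interp)
    grads = torch.autograd.grad(outputs=d_out, inputs=interp,
                                grad_outputs=torch.ones_like(d_out),
                                create_graph=True)[0]
    grads = grads.reshape(grads.size(0), -1)
    return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

# Adversarial loss with the penalty term (lambda_gp is an assumed weight):
# d_loss = d_fake.mean() - d_real.mean() + lambda_gp * gradient_penalty(D, real, fake)
```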

Situation assessment method based on improved evidence theory
WANG Yongwei, LIU Yunan, ZHAO Cairong, SI Cheng, QIU Wei
Journal of Computer Applications    2014, 34 (2): 491-495.  
Evidence theory is one of the main approaches to implementing rule-based situation assessment, but it can produce a paradox when combining highly conflicting evidence. Concerning this problem, the importance of each piece of evidence was measured by dissimilarity calculation and the original evidence was modified accordingly, and a new approach based on the improved evidence theory was proposed. The new approach contained four steps: rule measurement, evidence modification, rule fusion and situation decision. The experiments show that the new approach avoids the paradox problem in the process of fusion based on evidence theory, and it is superior to typical approaches, such as the Dempster, Yager and Leung approaches, in the efficiency and accuracy of situation assessment.
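For context, the Dempster combination rule that the improved approach modifies redistributes the conflict mass K through a 1/(1-K) normalization, which is where the paradox arises when K approaches 1. The sketch below implements only the classical rule over discrete focal elements; the frozenset representation and the example threat-level hypotheses are illustrative, and the paper's dissimilarity-based evidence modification step is not reproduced here.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Classical Dempster's rule: combine two basic probability assignments
    (dicts mapping frozenset focal elements to masses). The conflict mass K
    from empty intersections is redistributed by the 1/(1-K) normalization;
    K close to 1 is the highly conflicting case the improved approach targets
    by re-weighting the evidence before fusion."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Illustrative example (hypothetical threat levels, not data from the paper):
m1 = {frozenset({"high"}): 0.6, frozenset({"low"}): 0.3, frozenset({"high", "low"}): 0.1}
m2 = {frozenset({"high"}): 0.5, frozenset({"low"}): 0.4, frozenset({"high", "low"}): 0.1}
print(dempster_combine(m1, m2))
```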