Generative adversarial network underwater image enhancement model based on Swin Transformer
Hui LI, Bingzhi JIA, Chenxi WANG, Ziyu DONG, Jilong LI, Zhaoman ZHONG, Yanyan CHEN
Journal of Computer Applications    2025, 45 (5): 1439-1446.   DOI: 10.11772/j.issn.1001-9081.2024050730

To address the problems of low contrast, heavy noise, and color deviation in underwater images, a new underwater image enhancement model, SwinGAN (GAN based on Swin Transformer), was proposed with the Generative Adversarial Network (GAN) as its core framework. Firstly, the generative network was designed with an encoder-bottleneck-decoder structure, in which the input feature maps were divided into multiple non-overlapping local windows at the bottleneck layer. Secondly, a Dual-path Window Multi-head Self-Attention (DWMSA) mechanism was introduced to enhance local attention while capturing global information and long-range dependencies. Finally, the decoder recombined the windows into feature maps of the original size, and a Markovian discriminator was employed as the discriminator network. Compared with the URSCT-SESR model, the SwinGAN model improves Peak Signal-to-Noise Ratio (PSNR) by 0.8372 dB and Structural SIMilarity index (SSIM) by 0.0036 on the UFO-120 dataset. On the EUVP-515 dataset, the SwinGAN model achieves more significant improvements: a 0.8439 dB gain in PSNR, an increase of 0.0051 in SSIM, an enhancement of 0.1124 in Underwater Image Quality Measure (UIQM), and a slight increase of 0.0010 in Underwater Color Image Quality Evaluation (UCIQE). Experimental results demonstrate that the SwinGAN model performs well on both subjective and objective evaluation metrics and achieves notable improvements in correcting the color deviation of underwater images.
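
The bottleneck step described in the abstract, dividing a feature map into non-overlapping local windows, applying multi-head self-attention inside each window, and recombining the windows to the original size, can be illustrated with a minimal PyTorch sketch. This is not the paper's DWMSA; the window size, head count, and the WindowMSA module name are illustrative assumptions.

```python
# Minimal sketch of windowed multi-head self-attention (not the paper's exact DWMSA):
# partition a feature map into non-overlapping windows, attend within each window,
# then merge the windows back to the original spatial size.
import torch
import torch.nn as nn


def window_partition(x, ws):
    """(B, H, W, C) -> (num_windows*B, ws*ws, C); H and W must be divisible by ws."""
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)


def window_reverse(windows, ws, H, W):
    """Inverse of window_partition: back to (B, H, W, C)."""
    B = windows.shape[0] // ((H // ws) * (W // ws))
    x = windows.view(B, H // ws, W // ws, ws, ws, -1)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)


class WindowMSA(nn.Module):
    """Multi-head self-attention applied independently within each local window."""
    def __init__(self, dim, window_size=8, num_heads=4):
        super().__init__()
        self.ws = window_size
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                            # x: (B, H, W, C)
        B, H, W, C = x.shape
        win = window_partition(x, self.ws)           # (nW*B, ws*ws, C)
        win = self.norm(win)
        out, _ = self.attn(win, win, win)            # attention restricted to each window
        return window_reverse(out, self.ws, H, W)    # (B, H, W, C)


if __name__ == "__main__":
    feat = torch.randn(1, 32, 32, 64)                # assumed bottleneck feature map
    print(WindowMSA(dim=64)(feat).shape)             # torch.Size([1, 32, 32, 64])
```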

Multimodal sarcasm detection model integrating contrastive learning with sentiment analysis
Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA
Journal of Computer Applications    2025, 45 (5): 1432-1438.   DOI: 10.11772/j.issn.1001-9081.2024050731

Comments on social media platforms sometimes express attitudes towards events through sarcasm, so sarcasm detection allows user sentiments and opinions to be analyzed more accurately. However, traditional models based on vocabulary and syntactic structure ignore the role of text sentiment information in sarcasm detection and suffer from performance degradation caused by data noise. To address these limitations, a Multimodal Sarcasm Detection model integrating Contrastive learning with Sentiment analysis (MSDCS) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was used to extract text features, and ViT (Vision Transformer) was used to extract image features. Then, the contrastive loss from contrastive learning was employed to train a shallow model that aligns the image and text features before fusion. Finally, the cross-modal features were combined with the sentiment features for classification, maximizing the use of information across modalities to achieve sarcasm detection. Experimental results on a public multimodal sarcasm detection dataset show that the accuracy and F1 score of MSDCS are at least 1.85% and 1.99% higher, respectively, than those of the baseline model Decomposition and Relation Network (D&R Net), verifying the effectiveness of sentiment information and contrastive learning in multimodal sarcasm detection.
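
The alignment-before-fusion idea in the abstract can be sketched with a symmetric InfoNCE-style contrastive loss over paired text/image embeddings, followed by a small classifier that concatenates the aligned features with sentiment features. This is a minimal sketch, not the exact MSDCS architecture: the feature dimensions, temperature, and the sentiment-feature source are illustrative assumptions, and the BERT/ViT extractors are assumed to run upstream.

```python
# Sketch of contrastive alignment of text/image features plus sentiment-aware fusion
# (illustrative, not the paper's exact MSDCS implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


def contrastive_loss(txt, img, temperature=0.07):
    """Symmetric InfoNCE: matched (text, image) pairs in the batch are positives."""
    txt = F.normalize(txt, dim=-1)
    img = F.normalize(img, dim=-1)
    logits = txt @ img.t() / temperature                       # (B, B) similarity matrix
    labels = torch.arange(txt.size(0), device=txt.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2


class SarcasmClassifier(nn.Module):
    """Fuse aligned text/image features with sentiment features, then classify."""
    def __init__(self, txt_dim=768, img_dim=768, sent_dim=128, hidden=256):
        super().__init__()
        self.proj_t = nn.Linear(txt_dim, hidden)                # project text features
        self.proj_i = nn.Linear(img_dim, hidden)                # project image features
        self.head = nn.Sequential(
            nn.Linear(hidden * 2 + sent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),                               # sarcastic / not sarcastic
        )

    def forward(self, txt_feat, img_feat, sent_feat):
        t, i = self.proj_t(txt_feat), self.proj_i(img_feat)
        align = contrastive_loss(t, i)                          # auxiliary alignment loss
        logits = self.head(torch.cat([t, i, sent_feat], dim=-1))
        return logits, align


if __name__ == "__main__":
    B = 8
    model = SarcasmClassifier()
    logits, align = model(torch.randn(B, 768), torch.randn(B, 768), torch.randn(B, 128))
    print(logits.shape, float(align))                           # torch.Size([8, 2]) and a scalar loss
```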
