Generative adversarial network underwater image enhancement model based on Swin Transformer
Hui LI, Bingzhi JIA, Chenxi WANG, Ziyu DONG, Jilong LI, Zhaoman ZHONG, Yanyan CHEN
Journal of Computer Applications    2025, 45 (5): 1439-1446.   DOI: 10.11772/j.issn.1001-9081.2024050730

To address the problems of low contrast, heavy noise, and color deviation in underwater images, a new underwater image enhancement model, SwinGAN (GAN based on Swin Transformer), was proposed with the Generative Adversarial Network (GAN) as its core framework. Firstly, the generative network was designed with an encoder-bottleneck-decoder structure, in which the input feature maps were divided into multiple non-overlapping local windows at the bottleneck layer. Secondly, a Dual-path Window Multi-head Self-Attention (DWMSA) mechanism was introduced to enhance local attention while capturing global information and long-range dependencies. Finally, the decoder recombined the windows into feature maps of the original size, and a Markovian discriminator was employed as the discriminator network. Compared with the URSCT-SESR model, the SwinGAN model improves Peak Signal-to-Noise Ratio (PSNR) by 0.8372 dB and Structural SIMilarity index (SSIM) by 0.0036 on the UFO-120 dataset. On the EUVP-515 dataset, the SwinGAN model achieves more significant improvements: a 0.8439 dB gain in PSNR, an increase of 0.0051 in SSIM, an enhancement of 0.1124 in Underwater Image Quality Measure (UIQM), and a slight increase of 0.0010 in Underwater Color Image Quality Evaluation (UCIQE). Experimental results demonstrate that the SwinGAN model performs well on both subjective and objective evaluation metrics and achieves notable improvements in correcting the color deviation of underwater images.
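
The bottleneck step described in the abstract, dividing a feature map into non-overlapping local windows, applying multi-head self-attention inside each window, and recombining the windows to the original size, can be illustrated with a minimal PyTorch sketch. This is not the paper's DWMSA; the window size, head count, and the WindowMSA module name are illustrative assumptions.

```python
# Minimal sketch of windowed multi-head self-attention (not the paper's exact DWMSA):
# partition a feature map into non-overlapping windows, attend within each window,
# then merge the windows back to the original spatial size.
import torch
import torch.nn as nn


def window_partition(x, ws):
    """(B, H, W, C) -> (num_windows*B, ws*ws, C); H and W must be divisible by ws."""
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)


def window_reverse(windows, ws, H, W):
    """Inverse of window_partition: back to (B, H, W, C)."""
    B = windows.shape[0] // ((H // ws) * (W // ws))
    x = windows.view(B, H // ws, W // ws, ws, ws, -1)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)


class WindowMSA(nn.Module):
    """Multi-head self-attention applied independently within each local window."""
    def __init__(self, dim, window_size=8, num_heads=4):
        super().__init__()
        self.ws = window_size
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                            # x: (B, H, W, C)
        B, H, W, C = x.shape
        win = window_partition(x, self.ws)           # (nW*B, ws*ws, C)
        win = self.norm(win)
        out, _ = self.attn(win, win, win)            # attention restricted to each window
        return window_reverse(out, self.ws, H, W)    # (B, H, W, C)


if __name__ == "__main__":
    feat = torch.randn(1, 32, 32, 64)                # assumed bottleneck feature map
    print(WindowMSA(dim=64)(feat).shape)             # torch.Size([1, 32, 32, 64])
```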

Multimodal sarcasm detection model integrating contrastive learning with sentiment analysis
Wenbin HU, Tianxiang CAI, Tianle HAN, Zhaoman ZHONG, Changxia MA
Journal of Computer Applications    2025, 45 (5): 1432-1438.   DOI: 10.11772/j.issn.1001-9081.2024050731

Comments on social media platforms sometimes express attitudes towards events through sarcasm, so sarcasm detection allows user sentiments and opinions to be analyzed more accurately. However, traditional models based on vocabulary and syntactic structure ignore the role of text sentiment information in sarcasm detection and suffer from performance degradation caused by data noise. To address these limitations, a Multimodal Sarcasm Detection model integrating Contrastive learning with Sentiment analysis (MSDCS) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was used to extract text features, and ViT (Vision Transformer) was used to extract image features. Then, the contrastive loss from contrastive learning was employed to train a shallow model that aligns the image and text features before fusion. Finally, the cross-modal features were combined with the sentiment features for classification, maximizing the use of information across modalities to achieve sarcasm detection. Experimental results on a public multimodal sarcasm detection dataset show that the accuracy and F1 score of MSDCS are at least 1.85% and 1.99% higher, respectively, than those of the baseline model Decomposition and Relation Network (D&R Net), verifying the effectiveness of sentiment information and contrastive learning in multimodal sarcasm detection.
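
The alignment-before-fusion idea in the abstract can be sketched with a symmetric InfoNCE-style contrastive loss over paired text/image embeddings, followed by a small classifier that concatenates the aligned features with sentiment features. This is a minimal sketch, not the exact MSDCS architecture: the feature dimensions, temperature, and the sentiment-feature source are illustrative assumptions, and the BERT/ViT extractors are assumed to run upstream.

```python
# Sketch of contrastive alignment of text/image features plus sentiment-aware fusion
# (illustrative, not the paper's exact MSDCS implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


def contrastive_loss(txt, img, temperature=0.07):
    """Symmetric InfoNCE: matched (text, image) pairs in the batch are positives."""
    txt = F.normalize(txt, dim=-1)
    img = F.normalize(img, dim=-1)
    logits = txt @ img.t() / temperature                       # (B, B) similarity matrix
    labels = torch.arange(txt.size(0), device=txt.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2


class SarcasmClassifier(nn.Module):
    """Fuse aligned text/image features with sentiment features, then classify."""
    def __init__(self, txt_dim=768, img_dim=768, sent_dim=128, hidden=256):
        super().__init__()
        self.proj_t = nn.Linear(txt_dim, hidden)                # project text features
        self.proj_i = nn.Linear(img_dim, hidden)                # project image features
        self.head = nn.Sequential(
            nn.Linear(hidden * 2 + sent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),                               # sarcastic / not sarcastic
        )

    def forward(self, txt_feat, img_feat, sent_feat):
        t, i = self.proj_t(txt_feat), self.proj_i(img_feat)
        align = contrastive_loss(t, i)                          # auxiliary alignment loss
        logits = self.head(torch.cat([t, i, sent_feat], dim=-1))
        return logits, align


if __name__ == "__main__":
    B = 8
    model = SarcasmClassifier()
    logits, align = model(torch.randn(B, 768), torch.randn(B, 768), torch.randn(B, 128))
    print(logits.shape, float(align))                           # torch.Size([8, 2]) and a scalar loss
```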
