Digital image tampering detection is critically important in fields such as digital forensics and media content verification. However, in real-world applications, tampered images are often post-processed with brightness and contrast adjustments, which weaken tampering traces and degrade the performance of existing algorithms. To address this challenge, a restoration-assisted image tampering detection network, ReConWave-Net, was proposed. The network consisted of two key modules: a classification-guided image restoration module, which performed targeted restoration according to the category of image disturbance, thereby reducing the impact of brightness and contrast disturbances; and a tampering localization module, which strengthened the feature expression and localization of tampered regions through multi-scale wavelet features and a contrastive learning mechanism. The proposed network was evaluated on multiple datasets under various brightness and contrast disturbances. In terms of restoration quality, compared with the unrestored post-processed images, the proposed method increased the average Peak Signal-to-Noise Ratio (PSNR) in tampered regions from 10.86 dB to 31.57 dB and the average Structural SIMilarity index (SSIM) from 0.40 to 0.92; in terms of detection performance, under typical disturbances, the network achieved an F1 score of 0.730 and an Intersection over Union (IoU) of 0.653. These results indicate that combining targeted restoration with detection can significantly enhance the robustness of tampering localization on post-processed images.
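The detection metrics quoted above can be illustrated with a minimal sketch. It assumes pixel-level scoring of a predicted binary tampering mask against a ground-truth mask; the function name `f1_and_iou` and the nested-list mask format are illustrative choices, not taken from the paper.

```python
def f1_and_iou(pred, truth):
    """Pixel-level F1 and IoU for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # tampered pixel correctly flagged
            elif p and not t:
                fp += 1      # clean pixel wrongly flagged
            elif t and not p:
                fn += 1      # tampered pixel missed
    denom = tp + fp + fn
    if denom == 0:
        # Both masks are empty: treat as perfect agreement (an assumption).
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / denom
    return f1, iou


truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0]]
pred_mask = [[0, 1, 1, 1],
             [0, 1, 0, 0]]
f1, iou = f1_and_iou(pred_mask, truth_mask)
print(f"F1={f1:.3f} IoU={iou:.3f}")
```

Both metrics ignore true negatives, which is why they are preferred over plain accuracy when the tampered region is a small fraction of the image.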
Existing models still face multiple challenges in the Aspect Sentiment Quad Prediction (ASQP) task. They have difficulty handling implicit sentiment expressions (such as implicit aspects or opinions) that lack explicit lexical cues, which makes it hard to capture sentiment tendencies accurately. A quad prediction is considered correct only when every predicted element exactly matches the corresponding gold element. However, models may generate easily confused synonyms or near-synonyms, which renders the whole quad prediction incorrect. Moreover, existing models focus on raising the probability of predicting correct words and ignore the suppression of easily confused words. Additionally, the cross-entropy loss used by these models makes them overconfident about incorrect predictions; lacking uncertainty modeling, they fail to actively suppress high-risk errors. These problems limit the performance of existing models in Aspect-Based Sentiment Analysis (ABSA) tasks. To address these problems, a Supervised Contrastive generative sentiment analysis method with Uncertainty-Aware Unlikelihood Learning (SCUAUL) was proposed. Firstly, supervised contrastive learning was used to pull similar samples (e.g., those with the same sentiment polarity) closer in the semantic space through a contrastive loss, enhancing the model's ability to distinguish key features (e.g., sentiment polarity, implicit aspects) of the input data. Secondly, Monte Carlo Dropout (MC Dropout) was used to capture the model's inherent uncertainty and identify easily confused words. Through marginalized unlikelihood learning, the generation probability of easily confused words was dynamically suppressed while the probability of generating correct words was maintained, and a minimum entropy constraint was combined to balance generation diversity and accuracy.
Average results of five experiments on the Rest15 and Rest16 datasets showed that, compared with the second-best model AugABSA (data Augmentation by text generation for ABSA) and the classic model PARAPHRASE, SCUAUL improved the precision by 0.40, 3.98 and 0.38, 3.83 percentage points, the recall by 0.30, 2.87 and 0.48, 2.88 percentage points, and the F1 score by 0.35, 3.43 and 0.42, 3.37 percentage points, respectively, verifying the effectiveness of SCUAUL in ABSA tasks.
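The idea of suppressing easily confused words can be sketched as follows. This is a simplified, token-level illustration and not the paper's exact formulation: it assumes that several stochastic forward passes (MC Dropout) have already produced probability vectors over a tiny vocabulary, selects confusable tokens with a hypothetical probability threshold, and uses the generic unlikelihood form -log p(target) - w * sum(log(1 - p(negative))). The minimum entropy constraint and the contrastive term are omitted.

```python
import math


def mc_dropout_mean(prob_samples):
    """Average the probability vectors from T stochastic forward passes."""
    t = len(prob_samples)
    vocab_size = len(prob_samples[0])
    return [sum(sample[i] for sample in prob_samples) / t for i in range(vocab_size)]


def confusable_tokens(mean_probs, target, threshold=0.2):
    """Non-target tokens whose mean probability is high enough to be risky.

    The threshold rule is an assumption made for this sketch.
    """
    return [i for i, p in enumerate(mean_probs) if i != target and p >= threshold]


def unlikelihood_loss(probs, target, negatives, weight=1.0, eps=1e-12):
    """Likelihood term for the target plus unlikelihood terms for the negatives."""
    likelihood = -math.log(probs[target] + eps)
    unlikelihood = -sum(math.log(1.0 - probs[i] + eps) for i in negatives)
    return likelihood + weight * unlikelihood


# Three hypothetical dropout passes over a 3-token vocabulary; token 0 is correct.
samples = [
    [0.50, 0.40, 0.10],
    [0.60, 0.30, 0.10],
    [0.40, 0.50, 0.10],
]
mean_probs = mc_dropout_mean(samples)
negatives = confusable_tokens(mean_probs, target=0)
loss = unlikelihood_loss(mean_probs, target=0, negatives=negatives)
print(negatives, round(loss, 4))
```

The extra term grows as probability mass shifts toward the confusable token, so minimizing it pushes that mass away without directly changing the likelihood term of the correct word.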
Traditional data augmentation techniques, such as synonym substitution, random insertion, and random deletion, may change the original semantics of text and even result in the loss of critical information. Moreover, data in text classification tasks typically consist of both a textual part and a label part, yet traditional data augmentation methods focus only on the textual part. To address these issues, a Label Confusion incorporated Data Augmentation (LCDA) technique was proposed to enhance data comprehensively from both the textual and the label aspects. In terms of text, punctuation marks were randomly inserted and replaced, and end-of-sentence punctuation marks were completed, so that textual diversity was increased while all textual information and its order were preserved. In terms of labels, a simulated label distribution was generated using a label confusion approach and used to replace the traditional one-hot label distribution, so as to better reflect the relationships between instances and labels as well as among labels. In experiments conducted on few-shot datasets constructed from the THUCNews (TsingHua University Chinese News) and Toutiao Chinese news datasets, the proposed technique was combined with the TextCNN, TextRNN, BERT (Bidirectional Encoder Representations from Transformers), and RoBERTa-CNN (Robustly optimized BERT approach Convolutional Neural Network) text classification models. The experimental results indicate that all models show significant performance improvements over their versions before enhancement. Specifically, on 50-THU, a dataset constructed from THUCNews, the accuracies of the four models combined with the LCDA technique are improved by 1.19, 6.87, 3.21, and 2.89 percentage points, respectively, compared to those before enhancement, and by 0.78, 7.62, 1.75, and 1.28 percentage points, respectively, compared to those of the four models combined with the softEDA (Easy Data Augmentation with soft labels) method.
By processing both text and labels, the LCDA technique significantly improves model accuracy, particularly in application scenarios characterized by limited data availability.
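The two sides of the augmentation can be sketched under stated assumptions. On the text side, punctuation marks are inserted at random positions so that the original tokens and their order are untouched. On the label side, the sketch follows the general form used by label confusion models, softmax(alpha * one_hot + confusion), where the confusion vector comes from instance-to-label similarity scores. The punctuation set, the ratio, the value of `alpha`, and all function names are illustrative, not taken from the paper.

```python
import math
import random

PUNCTUATION = [".", ",", "!", "?", ";", ":"]


def insert_punctuation(tokens, ratio=0.3, rng=None):
    """Insert random punctuation marks; original tokens keep their order."""
    if rng is None:
        rng = random.Random(0)
    n_insert = max(1, int(len(tokens) * ratio))
    out = list(tokens)
    for _ in range(n_insert):
        position = rng.randint(0, len(out))  # inclusive bounds, so appending is possible
        out.insert(position, rng.choice(PUNCTUATION))
    return out


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_label_distribution(true_label, label_similarity, alpha=4.0):
    """Blend the one-hot label with a similarity-based confusion distribution."""
    confusion = softmax(label_similarity)
    mixed = [
        alpha * (1.0 if i == true_label else 0.0) + c
        for i, c in enumerate(confusion)
    ]
    return softmax(mixed)


tokens = ["stock", "markets", "rallied", "today"]
print(insert_punctuation(tokens))
# Hypothetical similarity of one instance to three labels; label 0 is the gold label.
print(simulated_label_distribution(0, [2.0, 1.5, -1.0]))
```

The gold label still dominates the resulting distribution, but labels that are semantically close to the instance receive more mass than unrelated ones, which a one-hot target cannot express.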
Unsupervised contrastive learning for Chinese faces multiple challenges: first, Chinese sentence structure is highly flexible and semantically ambiguous, which makes it difficult for models to capture deep semantic features accurately; second, on small-scale datasets, the feature expression ability of contrastive learning models is insufficient, and effective semantic representations are hard to learn fully; third, the data augmentation process may introduce redundant noise, further increasing training instability. Together, these issues limit model performance in Chinese semantic understanding. To solve these problems, an unsupervised contrastive learning method for Chinese with Mutual Information (MI) and Prompt Learning (CMIPL) was proposed. Firstly, a prompt-learning-based data augmentation approach was adopted to construct the sample pairs required for contrastive learning, so that all text information and its order were maintained, text diversity was increased, the input structure of samples was standardized, and prompt templates were provided as context for input samples to guide the model to learn fine-grained semantics more deeply. Secondly, based on the output representation of the pre-trained language model, a prompt template denoising method was used to remove the redundant noise introduced by data augmentation. Finally, the structural information of positive samples was incorporated into model training: the MI of the attention tensors of the augmented views was calculated and introduced into the loss function. By minimizing the loss function, the attention distribution of the model was optimized and the structural alignment of the augmented views was maximized, enabling the model to better narrow the distance between positive pairs. Comparison experiments were conducted on few-shot data constructed from three public Chinese text similarity datasets: ATEC, BQ, and PAWSX.
The results show that the proposed method has the best average performance, especially when the training data size is small. With 1% and 10% of the samples, compared with the baseline contrastive learning model SimCSE (Simple Contrastive learning of Sentence Embeddings), CMIPL increases the average accuracy and the Spearman's Rank correlation coefficient (SR) by 3.45, 4.07 and 1.64, 2.61 percentage points, respectively, verifying the effectiveness of CMIPL in unsupervised few-shot contrastive learning for Chinese.
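The contrastive objective that pulls positive pairs together can be illustrated with a minimal sketch of the in-batch InfoNCE loss used by SimCSE-style baselines. It covers only the contrastive term; the attention MI term and the prompt template denoising described in the abstract are not specified in enough detail to reproduce and are left out. The embeddings and the temperature value are illustrative assumptions.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """In-batch InfoNCE: positives[i] is the positive view of anchors[i];
    every other positives[j] acts as a negative for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_views = [[0.9, 0.1], [0.1, 0.9]]     # each view is close to its own anchor
misaligned_views = [[0.1, 0.9], [0.9, 0.1]]  # views swapped between anchors
print(info_nce(anchors, aligned_views), info_nce(anchors, misaligned_views))
```

The loss is small when each anchor is most similar to its own augmented view and large when another sample in the batch is closer, which is the sense in which minimizing it narrows the distance between positive pairs.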
As a kind of side information, Knowledge Graph (KG) can effectively improve the recommendation quality of recommendation models, but the existing knowledge-aware recommendation methods based on Graph Neural Network (GNN) suffer from unbalanced utilization of node information. To address this problem, a new recommendation method based on Knowledge-awareness and Cross-level Contrastive Learning (KCCL) was proposed. To alleviate the unbalanced utilization of node information, which is caused by sparse interaction data and a noisy knowledge graph that deviate from the true inter-node dependencies during information aggregation, a contrastive learning paradigm was introduced into the GNN-based knowledge-aware recommendation model. Firstly, the user-item interaction graph and the item knowledge graph were integrated into a heterogeneous graph, and the node representations of users and items were obtained by a GNN based on the graph attention mechanism. Secondly, consistent noise was added to the information propagation and aggregation layers for data augmentation to obtain node representations at different levels, and the outermost node representation was contrasted with the innermost node representation for cross-level contrastive learning. Finally, the supervised recommendation task and the auxiliary contrastive learning task were jointly optimized to obtain the final representation of each node. Experimental results on DBbook2014 and MovieLens-1m datasets show that compared to the second prior contrastive method, the Recall@10 of KCCL is improved by 3.66% and 0.66%, respectively, and the NDCG@10 is improved by 3.57% and 3.29%, respectively, which verifies the effectiveness of KCCL.
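For readers unfamiliar with the two ranking metrics reported above, here is a minimal sketch of Recall@K and NDCG@K for a single user with binary relevance. The function names and the toy ranking are illustrative; evaluation protocols (candidate sampling, averaging over users) vary between papers and are not specified here.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(k, len(relevant_items))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


ranking = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(recall_at_k(ranking, relevant, 3), ndcg_at_k(ranking, relevant, 3))
```

Recall@K only counts whether relevant items made it into the top K, while NDCG@K also rewards placing them nearer the top, so the two can move by different amounts for the same model change.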
Focusing on the failure of intrusion detection resulting from the insufficient width of images captured by traditional Wireless Visual Sensor Network (WVSN) target-barriers, a Wireless visual sensor network β Quality of Monitoring (β-QoM) Target-Barrier coverage Construction (WβTBC) algorithm was proposed to ensure that the captured image width is not less than β. Firstly, the geometric model of the visual sensor β-QoM region was established, and it was proven that the width of the intruder image captured by a target-barrier formed by the intersection of all adjacent visual sensor β-QoM regions must be greater than or equal to β. Then, based on linear programming modeling of the optimal β-QoM target-barrier coverage of WVSN, it was proven that this coverage problem is NP-hard. Finally, to obtain a suboptimal solution to the problem, the heuristic algorithm WβTBC was proposed. In this algorithm, the directed graph of WVSN was constructed according to the counterclockwise β neighbor relationship between sensors, and Dijkstra algorithm was used to search for β-QoM target-barriers in WVSN. Experimental results show that WβTBC algorithm can construct β-QoM target-barriers effectively, and saves about 23.3%, 10.8% and 14.8% of sensor nodes compared with Spiral Periphery Outer Coverage (SPOC), Spiral Periphery Inner Coverage (SPIC) and Target-Barrier Construction (TBC) algorithms, respectively. In addition, provided that the requirements of intrusion detection are met, with WβTBC algorithm, the smaller β is, the higher the success rate of building a β-QoM target-barrier is, the fewer nodes are needed to form the barrier, and the longer the working period of WVSN for β-QoM intrusion detection is.
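The search step can be sketched with a standard Dijkstra shortest-path routine on a directed graph. The sketch assumes the graph has already been built so that an edge u -> v means sensor v is a counterclockwise β neighbor of sensor u; the sensor names, the unit edge weights (one hop costs one node), and the choice of start and end sensors are hypothetical, since the abstract does not describe how the barrier is closed.

```python
import heapq


def dijkstra_path(graph, source, target):
    """Shortest path in a directed graph given as {node: [(neighbor, weight), ...]}.

    Returns the list of nodes from source to target, or None if unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical sensor graph: two candidate chains from s1 to s4.
sensor_graph = {
    "s1": [("s2", 1), ("s3", 1)],
    "s2": [("s4", 1)],
    "s3": [("s5", 1)],
    "s5": [("s4", 1)],
    "s4": [],
}
print(dijkstra_path(sensor_graph, "s1", "s4"))
```

With unit weights the shortest path is the chain that uses the fewest sensors, which matches the goal of forming a barrier with as few nodes as possible.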
A multi-stage low-illuminance image enhancement network based on attention mechanism was proposed to solve the problem that details of low-illuminance images are lost during enhancement due to the overlapping of image contents and large brightness differences in some regions. At the first stage, an improved multi-scale fusion module was used to perform preliminary image enhancement. At the second stage, the enhanced image information of the first stage was cascaded with the input of this stage, and the result was used as the input of the multi-scale fusion module in this stage. At the third stage, the enhanced image information of the second stage was cascaded with the input of this stage, and the result was used as the input of the multi-scale fusion module in this stage. In this way, with the use of multi-stage fusion, the brightness of the image was improved adaptively and the details were retained adaptively. Experimental results on the open datasets LOL and SICE show that compared to algorithms and networks such as MSR (Multi-Scale Retinex) algorithm, gray Histogram Equalization (HE) algorithm and RetinexNet (Retina cortex Network), the proposed network has the Peak Signal-to-Noise Ratio (PSNR) increased by 11.0% to 28.9% and the Structural SIMilarity (SSIM) increased by 6.8% to 46.5%. By using the multi-stage method and attention mechanism to realize low-illuminance image enhancement, the proposed network effectively solves the problems of image content overlapping and large brightness differences, and the images obtained by this network are more detailed and subjectively recognizable, with clearer textures.
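As a reminder of how the PSNR figures are computed, the following minimal sketch evaluates PSNR = 10 * log10(MAX^2 / MSE) for two small grayscale images stored as nested lists. The 8-bit peak value of 255 and the toy pixel values are assumptions for illustration only.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized grayscale images."""
    ref_pixels = [p for row in reference for p in row]
    enh_pixels = [p for row in enhanced for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref_pixels, enh_pixels)) / len(ref_pixels)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference_image = [[100, 100], [100, 100]]
enhanced_image = [[110, 90], [100, 100]]
print(round(psnr(reference_image, enhanced_image), 2))
```

Because PSNR is logarithmic, the relative gains quoted in the abstract (a percentage of the dB value) correspond to much larger reductions in mean squared error.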
To address the problems of category vocabulary noise and label noise in weakly-supervised text classification tasks, a weakly-supervised text classification model with label semantic enhancement was proposed. Firstly, the category vocabulary was denoised on the basis of the contextual semantic representation of the words in order to construct a highly accurate category vocabulary. Then, a word category prediction task based on MASK mechanism was constructed to fine-tune the pre-trained model BERT (Bidirectional Encoder Representations from Transformers), so as to learn the relationship between words and categories. Finally, a self-training module with label semantics introduced was used to make full use of all data information and reduce the impact of label noise, achieving word-level to sentence-level semantic conversion and thereby accurately predicting text sequence categories. Experimental results show that compared with the current state-of-the-art weakly-supervised text classification model LOTClass (Label-name-Only Text Classification), the proposed method improves the classification accuracy by 5.29, 1.41 and 1.86 percentage points on the public datasets THUCNews, AG News and IMDB, respectively.
The existing knowledge concept recommendation systems do not consider the short-term interests of users. To solve this problem, a Knowledge Concept Recommendation system based on Interest Enhancement (KCRec-IE) was proposed. Firstly, users' short-term interests were captured according to the users' knowledge concept click sequences, and a heterogeneous graph was constructed by using the side information. Then, the representation learning of knowledge concept entities and user entities was carried out on the heterogeneous graph by using meta-path-guided graph convolution. Different from the representation learning of knowledge concept entities, when learning the representation of user entities, the contributions of different neighbor users to target users were distinguished according to the short-term interests of users. Finally, the score prediction was realized according to the knowledge concept entities, the user entities and the users' short-term interests. Experimental results on the public dataset XuetangX show that compared with KCRec-SEIGNN, KCRec-IE is improved by 3.60 percentage points on HR@5; compared with KCRec-IEn, KCRec-IE is improved by 1.02 percentage points on HR@10; compared with KCRec-SEIGNN, KCRec-IE is improved by 1.60 and 1.18 percentage points on NDCG@5 and NDCG@10, respectively, verifying the effectiveness of the proposed method.
Concerning the lack of flexibility in adversarial training of Deep Convolutional Generative Adversarial Network (DCGAN) and the problems of inflexible optimization and unclear convergence state of the Binary Cross-Entropy loss (BCE loss) function used in DCGAN, an improved algorithm of Generative Adversarial Network (GAN) based on arbitration mechanism was proposed. In this algorithm, the proposed arbitration mechanism was added on the basis of DCGAN. Firstly, the network structure of the improved algorithm was composed of a generator, a discriminator, and an arbiter. Secondly, the adversarial training was conducted by the generator and discriminator according to the training plan, and their abilities to generate images and to verify the authenticity of images, respectively, were strengthened according to the characteristics learned from the dataset. Thirdly, the arbiter was formed from the generator and the discriminator obtained after the last round of adversarial training together with a metric score calculation module, and the adversarial training results of the generator and the discriminator were measured by this arbiter and fed back into the training plan. Finally, a winning limit was added to the network structure to improve the stability of model training, and the Circle loss function was used to replace the BCE loss function, which made the model optimization process more flexible and the convergence state clearer. Experimental results show that the proposed algorithm has a good generation effect on the architectural and face datasets. On the Large-scale Scene UNderstanding (LSUN) dataset, the proposed algorithm has the Fréchet Inception Distance (FID) index decreased by 1.04% compared with the original DCGAN algorithm; on the CelebA dataset, the proposed algorithm has the Inception Score (IS) index increased by 4.53% compared with the original DCGAN algorithm. The images generated by the proposed algorithm have better diversity and higher quality.
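To show why Circle loss gives a more flexible optimization and a clearer convergence target than BCE, here is a minimal sketch of its general pairwise-similarity form. How the similarity scores are derived from the discriminator outputs in the proposed algorithm is not stated in the abstract, so the inputs below are plain hypothetical scores in [0, 1]; the margin and scale values are common defaults, not the paper's settings.

```python
import math


def _logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def circle_loss(pos_scores, neg_scores, margin=0.25, gamma=64.0):
    """Circle loss over within-class (pos) and between-class (neg) similarity scores.

    Each score gets its own weight: scores far from their optimum are
    penalized more strongly than scores that are already well optimized.
    """
    optimum_pos, optimum_neg = 1.0 + margin, -margin
    delta_pos, delta_neg = 1.0 - margin, margin
    pos_logits = [
        -gamma * max(optimum_pos - s, 0.0) * (s - delta_pos) for s in pos_scores
    ]
    neg_logits = [
        gamma * max(s - optimum_neg, 0.0) * (s - delta_neg) for s in neg_scores
    ]
    z = _logsumexp(pos_logits) + _logsumexp(neg_logits)
    # Numerically stable softplus: log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


well_separated = circle_loss(pos_scores=[0.9], neg_scores=[0.1])
poorly_separated = circle_loss(pos_scores=[0.4], neg_scores=[0.6])
print(well_separated, poorly_separated)
```

The self-paced weights are what make the optimization flexible, and the margins define an explicit target (positive scores near 1 - margin, negative scores near margin), which is the clearer convergence state referred to above.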
To solve the problem of retinal vascular structure division in fundus images, an adaptive breadth-first search algorithm was proposed. Firstly, based on the structure of retinal vessels, the concept of hierarchical features was proposed and feature extraction was carried out. Then, the segmented retinal vessels were analyzed and processed, and several undirected subgraphs were extracted. Finally, the adaptive breadth-first search algorithm was used to classify the hierarchical features in each subgraph. In this way, the division of retinal vascular structure was transformed into the classification of hierarchical features: by classifying the hierarchical features of retinal vessels, the hierarchical structures of the retinal vascular segments containing these features could be determined, thus realizing the division of retinal vascular structures. The algorithm has excellent performance when applied to public fundus image databases.
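The traversal underlying this approach can be sketched with an ordinary breadth-first search that assigns a hierarchy level to every node of one undirected subgraph. This is plain BFS, not the adaptive variant proposed in the paper, whose adaptation rule the abstract does not describe; the node names and the choice of root are hypothetical.

```python
from collections import deque


def bfs_levels(adjacency, root):
    """Assign each reachable node its BFS depth from the root.

    adjacency maps a node to the list of its neighbors (undirected graph).
    """
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in levels:
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels


# Hypothetical vessel subgraph: a trunk splitting into two branches, one of which splits again.
vessel_graph = {
    "trunk": ["branch_1", "branch_2"],
    "branch_1": ["trunk", "twig_1"],
    "branch_2": ["trunk"],
    "twig_1": ["branch_1"],
}
print(bfs_levels(vessel_graph, "trunk"))
```

Here the BFS depth plays the role of a hierarchy label, which illustrates how a structure division problem can be recast as classifying features level by level.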