The traditional Knowledge Graph (KG) provides a unified, machine-interpretable representation of information on the web, but its limitations in handling multimodal applications are increasingly recognized. To address these limitations, the Multi-Modal Knowledge Graph (MMKG) was proposed as an effective solution. However, integrating multi-modal data into a KG often leads to problems such as inadequate modality fusion and reasoning difficulties, which constrain the application and development of MMKGs. Therefore, Multi-Modal Knowledge Graph Completion (MMKGC) techniques were introduced to fully integrate cross-modal information during the construction phase and to predict missing links after construction, thereby addressing the issues of modality fusion and reasoning. Accordingly, an overview of MMKGC methods was presented. Firstly, the basic concepts, widely used benchmark datasets, and evaluation metrics of MMKGC were elaborated in detail. Secondly, the existing methods were classified into fusion tasks during the MMKG construction phase and reasoning tasks after construction: the former focused on key techniques such as entity alignment and entity linking, while the latter encompassed three techniques, namely relation inference, missing-information completion, and multi-modal expansion. Thirdly, the MMKGC methods in each category were introduced thoroughly and their characteristics were analyzed. Finally, the problems and challenges faced by MMKGC methods were examined, and a summary was provided.
Precise segmentation of colon polyps in gastrointestinal endoscopy images holds significant clinical value. However, traditional segmentation methods often struggle to capture fine details and rely heavily on large-scale data, leading to poor performance on complex polyp morphologies. Although the Segment Anything Model (SAM) has made notable progress in natural image segmentation, SAM-based methods fail to achieve ideal results on the polyp segmentation task due to domain differences between natural and medical images. To address this issue, a lightweight fine-tuning method based on the SAM architecture, named Segment Anything Model for Colon Polyps (SAMCP), was proposed. In this method, a streamlined adapter module focusing on channel-dimension information was introduced, a simplified joint loss function combining Dice and Intersection over Union (IoU) was adopted, and the parameters of the original image encoder and prompt encoder were frozen during training, so as to enhance polyp segmentation performance at low training cost. Experimental results on three public datasets comparing SAMCP with nine advanced methods demonstrate that SAMCP outperforms the other SAM-based methods. Specifically, SAMCP improves the Dice and IoU values by 56.7% and 84.5%, respectively, on the Kvasir-SEG dataset, by 46.0% and 86.0%, respectively, on the CVC-ClinicDB dataset, and by 95.3% and 122.2%, respectively, on the CVC-ColonDB dataset, surpassing the best performance of existing SAM-based methods. With the introduction of point-based prompts, SAMCP also outperforms the other SAM-based methods even with a single click. The above results validate that SAMCP handles complex shapes and local details well, providing physicians with more precise segmentation guidance.
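The joint Dice-IoU supervision described above can be illustrated with a minimal sketch; the soft-loss formulation, the equal weighting, and the function names are assumptions for illustration, since the abstract does not give the exact formulation used by SAMCP.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|)."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def iou_loss(pred, target, eps=1e-6):
    """Soft IoU (Jaccard) loss: 1 - |P∩T| / |P∪T|."""
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)

def joint_loss(pred, target, w_dice=0.5, w_iou=0.5):
    """Weighted combination of the two region-overlap losses."""
    return w_dice * dice_loss(pred, target) + w_iou * iou_loss(pred, target)
```

With soft (probability-valued) predictions, both terms remain differentiable, so the same combination can serve directly as a training objective in a deep learning framework.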
For fine-grained sentiment analysis in Natural Language Processing (NLP), in order to explore the influence of Pre-trained Language Models (PLMs) with structural biases on the end-to-end sentiment triple extraction task, and to solve the problem of low fault tolerance in the aspect semantic feature dependence common to previous studies, an Aspect-aware attention Enhanced GCN (AE-GCN) model combining an aspect-aware attention mechanism and a Graph Convolutional Network (GCN) was proposed for the aspect sentiment triple extraction task. Firstly, multiple types of relations were introduced for the aspect sentiment triple extraction task. Then, these relations were embedded into the adjacency tensors between words in the sentence by using a biaffine attention mechanism; at the same time, the aspect-aware attention mechanism was introduced to obtain the sentence attention scoring matrix, and the aspect-related semantic features were further mined. Next, each sentence was converted into a multi-channel graph processed by the graph convolutional network, in which words were treated as nodes and relation adjacency tensors as edges, so as to learn relation-aware node representations. Finally, an effective word-pair representation refinement strategy was used to determine whether word pairs matched, taking the implicit results of aspect and opinion extraction into account. Experimental results show that, on the ASTE-D1 benchmark dataset, the F1 values of the proposed model on the 14res, 14lap, 15res and 16res sub-datasets are improved by 0.20, 0.21, 1.25 and 0.26 percentage points, respectively, compared with the Enhanced Multi-Channel Graph Convolutional Network (EMC-GCN) model; on the ASTE-D2 benchmark dataset, the F1 values of the proposed model on the 14lap, 15res and 16res sub-datasets are increased by 0.42, 0.31 and 2.01 percentage points, respectively, compared with the EMC-GCN model.
It can be seen that the proposed model achieves considerable improvements in precision and effectiveness compared with the EMC-GCN model.
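The biaffine scoring step that embeds relations into word-pair adjacency tensors can be sketched roughly as below; the tensor shapes and the parameter names `U`, `W`, `b` are assumptions for illustration, not the model's actual implementation.

```python
import numpy as np

def biaffine_scores(H, U, W, b):
    """Score every word pair (i, j) for each of R relation types.

    H: (n, d) word representations; U: (R, d, d) bilinear weights;
    W: (R, 2d) linear weights on the pair [h_i; h_j]; b: (R,) biases.
    Returns an (n, n, R) relation adjacency tensor."""
    n, d = H.shape
    # Bilinear term h_i^T U_r h_j for all pairs and all relation channels
    bilinear = np.einsum("id,rde,je->ijr", H, U, H)
    # Linear term on the concatenated pair [h_i; h_j]
    linear = np.einsum("rd,id->ir", W[:, :d], H)[:, None, :] + \
             np.einsum("rd,jd->jr", W[:, d:], H)[None, :, :]
    return bilinear + linear + b
```

Each channel of the resulting tensor can then act as one edge-weight matrix of the multi-channel graph fed to the GCN.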
To address the problems that UAV (Unmanned Aerial Vehicle) image recognition is vulnerable to environmental interference, and that traditional signal recognition struggles to extract features accurately and has poor real-time performance, a UAV detection and recognition method based on an improved CNN (Convolutional Neural Network) and RF (Radio Frequency) fingerprints was proposed. Firstly, a USRP (Universal Software Radio Peripheral) was used to capture radio signals in the environment, and a deviation value obtained through multi-resolution analysis was used to detect whether a radio signal was a UAV RF signal. Secondly, the detected UAV RF signal was processed by wavelet transform and PCA (Principal Component Analysis) to obtain an RF signal spectrum, which served as the input of the neural network. Finally, an LRCNN (Lightweight Residual Convolutional Neural Network) was constructed, and the RF spectrum was input to train the network for UAV classification and recognition. Experimental results show that LRCNN can detect and recognize UAV signals effectively, with an average recognition accuracy of 84%. When the SNR (Signal-to-Noise Ratio) is greater than 20 dB, the recognition accuracy of LRCNN reaches 88%, which is 31 and 7 percentage points higher than those of the SVM (Support Vector Machine) and the original OracleCNN, respectively. Compared with these two methods, LRCNN achieves better recognition accuracy and robustness.
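The multi-resolution deviation statistic used in the detection stage can be illustrated with a minimal Haar-wavelet sketch; using total detail energy as the deviation value and a fixed number of levels are assumptions here, and the paper's actual detector and threshold calibration may differ.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar wavelet transform: approximation and detail."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def multires_deviation(signal, levels=3):
    """Deviation statistic: total detail energy across decomposition levels.

    A captured signal whose deviation exceeds a calibrated threshold would be
    flagged as a candidate UAV RF signal (signal length must be divisible
    by 2**levels)."""
    a = signal
    energy = 0.0
    for _ in range(levels):
        a, d = haar_dwt(a)
        energy += float(np.sum(d ** 2))
    return energy
```

A flat (interference-free) capture yields near-zero detail energy, while bursty transmissions concentrate energy in the detail coefficients, which is what makes such a statistic usable as a simple detector.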
To address the issue that existing information diffusion models overlook user subjectivity and social network dynamics, an SCBRD (Susceptible-Commented-Believed-Recovered-Defensed) opinion propagation model considering user initiative and mobility in heterogeneous networks was proposed. Firstly, the basic reproduction number was determined using the next-generation matrix method, and the system's dynamics and optimal control were investigated by applying Lyapunov's stability theorem and Pontryagin's maximum principle. Then, a simulation analysis was performed on a BA (Barabási-Albert) scale-free network to identify the significant factors affecting opinion propagation. The results reveal that users' curiosity, forwarding behavior, and admission rate play dominant roles in information diffusion, and that the system has an optimal control solution. Finally, the model's rationality was validated on real data. Compared to the SCIR (Susceptible-inCubation-Infective-Refractory) model, the SCBRD model improves fitting accuracy by 27.40% and reduces the Root Mean Square Error (RMSE) of prediction by 39.02%. Therefore, the proposed model can adapt to the complex and changing circumstances of information diffusion and provide better guidance for official public opinion regulation.
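A compartment model of this SCBRD kind can be simulated with a simple forward-Euler scheme, as in the sketch below; the transition structure and the rate names (`alpha`, `beta`, `gamma`, `delta`, `mu`) are illustrative placeholders rather than the paper's fitted model.

```python
import numpy as np

def simulate_scbrd(y0, params, dt=0.01, steps=5000):
    """Forward-Euler integration of an SCBRD-style compartment model.

    States: Susceptible, Commented, Believed, Recovered, Defensed.
    Rates (all assumed): alpha = exposure, beta = belief, gamma = recovery,
    delta = direct defense, mu = population exit."""
    alpha, beta, gamma, delta, mu = params
    y = np.array(y0, dtype=float)
    for _ in range(steps):
        S, C, B, R, D = y
        dS = -alpha * S * C - delta * S - mu * S       # exposure and defense outflow
        dC = alpha * S * C - (beta + gamma + mu) * C   # commenters believe or recover
        dB = beta * C - (gamma + mu) * B               # believers eventually recover
        dR = gamma * (C + B) - mu * R
        dD = delta * S - mu * D                        # defensed (debunking) users
        y = y + dt * np.array([dS, dC, dB, dR, dD])
    return y
```

With the exit rate `mu` set to zero, the five derivatives sum to zero, so the total population is conserved over the simulation, which is a quick sanity check for such a scheme.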
Existing text sentiment classification methods face serious challenges due to the complex semantics of natural language, the multiple sentiment polarities of words, and the long-term dependency of text. To solve these problems, a semantically enhanced sentiment classification model based on multi-level attention was proposed. Firstly, contextualized dynamic word embedding technology was used to mine the multiple semantic information of words, and the context semantics was modeled. Secondly, the long-term dependency within the text was captured by multi-layer parallel multi-head self-attention in the internal attention layer to obtain comprehensive text feature information. Thirdly, in the external attention layer, the summary information in the review metadata was integrated into the review features through a multi-level attention mechanism to enhance the sentiment information and semantic expression ability of the review features. Finally, a global average pooling layer and the Softmax function were used to realize sentiment classification. Experimental results on four Amazon review datasets show that, compared with TE-GRU (Transformer Encoder with Gated Recurrent Unit), the best-performing baseline model, the proposed model improves the sentiment classification accuracy on the App, Kindle, Electronic and CD datasets by at least 0.36, 0.34, 0.58 and 0.66 percentage points, respectively, verifying that the proposed model can further improve sentiment classification performance.
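The multi-head self-attention used in the internal attention layer can be sketched in plain NumPy; the single-layer setup and square weight matrices are simplifying assumptions rather than the model's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Scaled dot-product self-attention with n_heads parallel heads.

    X: (seq_len, d_model); Wq, Wk, Wv, Wo: (d_model, d_model);
    d_model must be divisible by n_heads."""
    seq, d = X.shape
    dh = d // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv

    def split(M):
        # (seq, d) -> (n_heads, seq, dh)
        return M.reshape(seq, n_heads, dh).transpose(1, 0, 2)

    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(dh)  # (heads, seq, seq)
    ctx = softmax(scores) @ Vh                         # (heads, seq, dh)
    out = ctx.transpose(1, 0, 2).reshape(seq, d)       # concatenate heads
    return out @ Wo
```

Running several such blocks in parallel and stacking them in layers gives the "multi-layer parallel multi-head" structure the abstract refers to.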
Body size parameters are important indicators for evaluating the growth status of sheep, and how to measure them with a non-stress instrument is an urgent problem in the sheep breeding process. In this paper, corresponding machine vision methods were introduced to measure these parameters. The sheep body in a complex environment was detected by a gray-based background subtraction method and the chromaticity invariance principle. By virtue of a grid method, the contour envelope of the sheep body was extracted. After analyzing the contour sequence with the Douglas-Peucker (D-P) algorithm and the Heron-Qin Jiushao formula, the point with maximum curvature on the contour was acquired and chosen as the measurement point at the hip of the sheep. Based on the above information, the other three measurement points were obtained using a four-point method, and by combining the spatial resolution, the body size parameters of the sheep were acquired, thereby achieving contactless measurement. The experimental results show that the proposed method can effectively extract the sheep body in a complex environment, and that the measurement point at the hip and the height of the sheep can be determined stably. Due to the complexity of the ambient light, some problems still exist when determining the shoulder points.
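The maximum-curvature search built on the Heron-Qin Jiushao formula can be illustrated via the Menger curvature of three contour points, where the triangle area comes from Heron's formula and the curvature is 4·Area divided by the product of the side lengths; the neighbor step size used below is an assumed detail, not taken from the paper.

```python
import numpy as np

def menger_curvature(p, q, r):
    """Curvature of the circle through three points: 4*Area / (|pq||qr||rp|).

    The triangle area is computed with Heron's (Qin Jiushao's) formula."""
    a = np.linalg.norm(q - p)
    b = np.linalg.norm(r - q)
    c = np.linalg.norm(p - r)
    s = (a + b + c) / 2.0
    area = np.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))
    denom = a * b * c
    return 0.0 if denom == 0 else 4.0 * area / denom

def max_curvature_point(contour, step=5):
    """Index of the contour point with maximum discrete curvature,
    estimated from each point and its neighbors `step` samples away."""
    n = len(contour)
    curv = [menger_curvature(contour[(i - step) % n],
                             contour[i],
                             contour[(i + step) % n])
            for i in range(n)]
    return int(np.argmax(curv))
```

For three points on a circle of radius R, this estimate equals 1/R exactly, which makes it a convenient discrete curvature for contour sequences simplified by the D-P algorithm.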
In this paper, aiming at the selection of the Gaussian kernel parameter (β) in Kernel Principal Component Analysis (KPCA), a kernel parameter discriminant method for KPCA was proposed. The kernel window widths within each class and between classes were calculated for the training samples, and the kernel parameter was determined by a discriminant method based on these window widths. The kernel matrix built from the discriminantly selected kernel parameter could exactly describe the structural characteristics of the training space. Finally, Principal Component Analysis (PCA) was used to decompose the feature space, and the principal components were obtained to realize dimensionality reduction and feature extraction. The discriminant kernel window width method chose smaller window widths in the dense regions of a class and larger window widths in the sparse ones. Simulations on a numerical process and the Tennessee Eastman Process (TEP) using the Discriminant Kernel Principal Component Analysis (Dis-KPCA) method, compared with KPCA and PCA, show that Dis-KPCA is effective for dimension reduction of sample data and separates the three classes of data with 100% accuracy; therefore, the proposed method achieves higher dimension reduction precision.
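The underlying Gaussian-kernel KPCA step can be sketched as follows, with a fixed β rather than the discriminantly selected one; the kernel form exp(-‖x-y‖²/β) and the component count are assumptions for illustration.

```python
import numpy as np

def kpca(X, beta, n_components=2):
    """Kernel PCA with Gaussian kernel k(x, y) = exp(-||x - y||^2 / beta).

    Returns the projections of the rows of X onto the top principal
    components of the centred kernel matrix."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    K = np.exp(-d2 / beta)
    # Centre the kernel matrix in feature space
    J = np.ones((n, n)) / n
    Kc = K - J @ K - K @ J + J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Normalise eigenvectors so the feature-space axes have unit norm
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return Kc @ alphas
```

In this formulation β plays exactly the window-width role discussed in the abstract: a small β makes the kernel sensitive only to very close samples, while a large β smooths over sparse regions, which is why its selection matters.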