Commonsense Question Answering (CQA) aims to automatically answer questions described in natural language by using commonsense knowledge, and it belongs to the field of intelligent question answering. Typically, this task demands background commonsense knowledge to enhance the model's problem-solving capability. Most related methods rely on extracting and utilizing commonsense from textual data; however, commonsense is often implicit and not always represented in the text directly, which limits the application range and effectiveness of these methods. Therefore, a cross-modal contrastive learning-based CQA model was proposed to fully utilize cross-modal information for enriching the expression of commonsense knowledge. Firstly, a cross-modal commonsense representation module was designed to integrate commonsense bases with a cross-modal large model, thereby obtaining cross-modal commonsense representations. Secondly, to enhance the model's ability to distinguish among different options, contrastive learning was carried out on the cross-modal representations of questions and options. Finally, a softmax layer was used to generate relevance scores for the question-option pairs, and the option with the highest score was taken as the final predicted answer. Experimental results on the public datasets CommonSenseQA (CSQA) and OpenBookQA (OBQA) show that, compared with DEKCOR (DEscriptive Knowledge for COmmonsense question answeRing), the proposed model improves accuracy by 1.46 and 0.71 percentage points respectively.
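The abstract does not specify the exact contrastive objective or scoring head; the following is a minimal sketch assuming an InfoNCE-style loss over normalized question and option embeddings, where the function names (`score_options`, `contrastive_loss`) and the `temperature` value are hypothetical.

```python
import torch
import torch.nn.functional as F

def score_options(q_emb, opt_embs, temperature=0.1):
    """Softmax relevance scores of k options against one question.

    q_emb:    (d,) cross-modal question representation
    opt_embs: (k, d) cross-modal option representations
    """
    q = F.normalize(q_emb, dim=-1)
    opts = F.normalize(opt_embs, dim=-1)
    logits = opts @ q / temperature        # cosine-similarity logits
    return F.softmax(logits, dim=-1)       # relevance score per option

def contrastive_loss(q_emb, opt_embs, gold_idx, temperature=0.1):
    """InfoNCE-style loss: pull the gold option toward the question
    representation and push the distractor options away."""
    scores = score_options(q_emb, opt_embs, temperature)
    return -torch.log(scores[gold_idx] + 1e-12)
```

At inference time, `scores.argmax()` yields the predicted answer, matching the softmax scoring step described in the abstract.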
The task of Visual Relationship Detection (VRD) is to detect the relationships between target objects on the basis of target recognition, and it is a key technology for visual understanding and reasoning. Because objects interact and combine with each other, the number of candidate relationships between objects explodes combinatorially, producing many entity pairs with weak correlation and thus lowering the recall of subsequent relationship detection. To solve the above problems, a knowledge-guided visual relationship detection model was proposed. Firstly, visual knowledge was constructed: data analysis and statistics were carried out on the entity labels and relationship labels in common visual relationship detection datasets, and the interaction co-occurrence frequencies between entities and relationships were obtained as visual knowledge. Then, the constructed visual knowledge was used to optimize the combination process of entity pairs, so that the scores of weakly correlated entity pairs decreased while the scores of strongly correlated entity pairs increased; the entity pairs were then ranked by score and those with lower scores were deleted. The relationship scores between entities were also optimized in a knowledge-guided way, so as to improve the recall of the model. The effectiveness of the proposed model was verified on the public datasets VG (Visual Genome) and VRD. In the predicate classification task, compared with the existing model PE-Net (Prototype-based Embedding Network), the recall rates Recall@50 and Recall@100 are improved by 1.84 and 1.14 percentage points respectively on the VG dataset; compared with Coacher, Recall@20, Recall@50 and Recall@100 are improved by 0.22, 0.32 and 0.31 percentage points respectively on the VRD dataset.
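The abstract does not state how the co-occurrence statistics are fused with the visual scores; the sketch below assumes a simple convex combination followed by ranking and pruning, with `alpha`, `top_k` and all function names hypothetical.

```python
import numpy as np

def rescore_entity_pairs(pairs, base_scores, cooccur, alpha=0.5, top_k=100):
    """Knowledge-guided scoring and pruning of candidate (subject, object) pairs.

    pairs:       list of (subj_class, obj_class) index tuples
    base_scores: visual confidence of each candidate pair
    cooccur:     cooccur[s, o] = normalized co-occurrence frequency of
                 entity classes s and o in the training annotations
    Pairs whose classes rarely co-occur are down-weighted and pruned,
    while frequently co-occurring combinations are boosted.
    """
    knowledge = np.array([cooccur[s, o] for s, o in pairs])
    scores = (1.0 - alpha) * np.asarray(base_scores) + alpha * knowledge
    keep = np.argsort(-scores)[:top_k]      # rank pairs and keep the best
    return [pairs[i] for i in keep], scores[keep]
```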
Due to the high dimensionality and complex variable distributions of Multivariate Time Series (MTS) data, existing anomaly detection models generally suffer from high error rates and training difficulties on MTS datasets. Moreover, most models only consider the spatial-temporal features of time series samples, which is not sufficient to learn the features of time series. To solve the above problems, a multivariate Time Series anomaly detection model based on Multi-domain Feature Extraction (MFE-TS) was proposed. Firstly, starting from the original data domain, a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN) were used to extract the temporal correlation and spatial correlation features of the MTS respectively. Secondly, the Fourier transform was used to convert the original time series into the frequency domain, and a Transformer was used to learn the amplitude and phase features of the data in the frequency domain. Multi-domain feature learning models time series features more comprehensively, thereby improving the anomaly detection performance of the model on MTS. In addition, a masking strategy was introduced to further enhance the feature learning ability of the model and give it a certain degree of noise resistance. Experimental results show that MFE-TS has superior performance on multiple real MTS datasets and maintains good detection accuracy on datasets with noise.
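The exact architecture is not given in the abstract; the following is a minimal sketch of the three-domain feature extraction idea (LSTM for temporal, CNN for spatial, FFT plus Transformer for frequency), with all layer sizes and the `MultiDomainEncoder` name chosen for illustration.

```python
import torch
import torch.nn as nn

class MultiDomainEncoder(nn.Module):
    """Sketch of multi-domain feature extraction for MTS windows.

    Input x: (batch, time, channels). Temporal correlation via LSTM,
    spatial (cross-variable) correlation via 1D CNN, and frequency
    features (amplitude and phase) via FFT + Transformer encoder.
    """
    def __init__(self, n_channels, d_model=64):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, d_model, batch_first=True)
        self.cnn = nn.Conv1d(n_channels, d_model, kernel_size=3, padding=1)
        self.freq_proj = nn.Linear(2 * n_channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.freq_encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        t_feat, _ = self.lstm(x)                              # temporal domain
        s_feat = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # spatial domain
        spec = torch.fft.rfft(x, dim=1)                       # frequency domain
        amp_phase = torch.cat([spec.abs(), spec.angle()], dim=-1)
        f_feat = self.freq_encoder(self.freq_proj(amp_phase))
        # pool each domain over its sequence axis and concatenate
        return torch.cat([t_feat.mean(1), s_feat.mean(1), f_feat.mean(1)], dim=-1)
```

The concatenated multi-domain feature could then feed a reconstruction or scoring head for anomaly detection; the masking strategy mentioned above would zero out random input segments during training.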
An important strategy for lightweighting a 3D model is to use a mesh simplification algorithm to reduce the number of triangular meshes on the model surface. The widely used edge collapse algorithm is more efficient and achieves a better simplification effect than other mesh simplification algorithms, but some detailed geometric features may be damaged or lost during its simplification process. Therefore, the approximate curve curvature and the average area of the first-order neighborhood triangles of the edge to be collapsed were added as penalty factors to optimize the edge collapse cost of the original algorithm. First, according to the definition of curve curvature in geometry, a calculation formula for the approximate curve curvature was proposed. Then, in the calculation of the vertex normal vector, two stages, area weighting and interior angle weighting, were used to modify the initial normal vector, thereby taking richer geometric information of the model into account. The performance of the optimized algorithm was verified by experiments. Compared with the classical Quadratic Error Metric (QEM) algorithm and the mesh simplification algorithm considering the angle error, the optimized algorithm reduces the maximum error by at least 73.96% and 49.77% respectively. Compared with the QEM algorithm, the optimized algorithm reduces the Hausdorff distance by at least 17.69%. It can be seen that in the process of model lightweighting, the optimized algorithm reduces the deformation of the model and better preserves its detailed geometric features.
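The paper's actual curvature formula is not reproduced in the abstract, so the sketch below substitutes a simple normal-variation proxy for the approximate curvature; the penalty weights `beta` and `gamma`, the multiplicative form of the cost, and all function names are assumptions. Vertex coordinates are assumed to be NumPy arrays.

```python
import numpy as np

def tri_area(a, b, c):
    """Area of the triangle with vertices a, b, c."""
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

def tri_normal(a, b, c):
    """Unit normal of the triangle with vertices a, b, c."""
    n = np.cross(b - a, c - a)
    return n / (np.linalg.norm(n) + 1e-12)

def collapse_cost(qem_cost, neighbor_tris, beta=1.0, gamma=1.0):
    """Edge collapse cost with curvature and area penalty factors.

    qem_cost:      quadric error of collapsing this edge (QEM)
    neighbor_tris: list of (a, b, c) vertex coordinates of the first-order
                   neighborhood triangles of the edge
    High-curvature, large-area neighborhoods get a higher cost, so they
    are collapsed later and detailed features are preserved longer.
    """
    areas = [tri_area(*t) for t in neighbor_tris]
    normals = np.array([tri_normal(*t) for t in neighbor_tris])
    mean_area = float(np.mean(areas))
    # curvature proxy: disagreement of the neighborhood normals
    # (0 on a flat patch, growing toward 1 in sharply bent regions)
    curvature = 1.0 - np.linalg.norm(normals.mean(axis=0))
    return qem_cost * (1.0 + beta * curvature) * (1.0 + gamma * mean_area)
```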
To truly understand a piece of text, it is very important to grasp the main clues of the original text during reading comprehension. Aiming at questions about main clues in machine reading comprehension, a machine reading comprehension method based on event representation was proposed. Firstly, a textual event graph, covering the representation of events, the extraction of event elements and the extraction of event relations, was built from the reading material using clue phrases. Secondly, after considering the temporal elements and emotional elements of events as well as the importance of each word in the document, the TextRank algorithm was used to select the events related to the clues. Finally, the answers to the questions were constructed based on the selected clue events. Experimental results show that on a test set composed of 339 collected clue questions, the proposed method outperforms the sentence ranking method based on the TextRank algorithm on the BiLingual Evaluation Understudy (BLEU) and Consensus-based Image Description Evaluation (CIDEr) metrics: the BLEU-4 score is increased by 4.1 percentage points and the CIDEr score by 9 percentage points.
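The event selection step applies standard TextRank to an event graph; a plain implementation of that ranking is sketched below, where how the similarity matrix is built (from shared event elements, temporal and emotional features, and word importance) is left to the paper and assumed given.

```python
import numpy as np

def textrank(sim_matrix, d=0.85, n_iter=100, tol=1e-6):
    """Plain TextRank over an event similarity graph.

    sim_matrix[i, j]: similarity between events i and j.
    Returns a relevance score per event; the top-scoring events are
    kept as clue events for answer construction.
    """
    n = sim_matrix.shape[0]
    # row-normalize similarities to get transition weights
    w = sim_matrix / (sim_matrix.sum(axis=1, keepdims=True) + 1e-12)
    scores = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        new = (1 - d) / n + d * w.T @ scores
        if np.abs(new - scores).max() < tol:
            return new
        scores = new
    return scores
```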
Existing privacy protection methods for data with multiple numerical sensitive attributes not only cause a large loss of information about quasi-identifier attributes, but also cannot satisfy users' personalized needs for ranking the importance of numerical sensitive attributes. To solve the above problems, a personalized privacy protection method based on clustering and weighted Multi-Sensitive Bucketization (MSB) was proposed. Firstly, according to the similarity of quasi-identifiers, the dataset was divided into several subsets with similar quasi-identifier attribute values. Then, considering the different sensitivities of users to sensitive attributes, the sensitivity and the bucket capacity of the multi-dimensional buckets were used to calculate the weighted selectivity and to construct the weighted multi-dimensional buckets. Finally, the data were grouped and anonymized accordingly. Eight attributes in UCI's standard Adult dataset were selected for experiments, and the proposed method was compared with MNSACM and WMNSAPM. Experimental results show that the proposed method is generally better and is significantly superior to the comparison methods in reducing information loss and running time, improving data quality and operating efficiency.
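The abstract does not give the weighted selectivity formula, so the sketch below is only an assumed form: the first step clusters records by quasi-identifier similarity (here with k-means, one plausible choice), and the second step weights each bucket's capacity by a user-assigned sensitivity; both function names and the formula itself are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_quasi_identifiers(qi_matrix, n_clusters):
    """Step 1: split records into subsets with similar quasi-identifier
    attribute values (qi_matrix: one row of numeric QI values per record)."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(qi_matrix)

def weighted_selectivity(bucket_capacities, sensitivity_weights):
    """Step 2 (assumed form): weighted selectivity of multi-dimensional buckets.

    bucket_capacities:   record count of each bucket
    sensitivity_weights: user-assigned importance of the sensitive
                         attribute value that defines each bucket
    Buckets holding more important sensitive values receive larger
    selectivity, so grouping diversifies those attributes first.
    """
    caps = np.asarray(bucket_capacities, dtype=float)
    w = np.asarray(sensitivity_weights, dtype=float)
    return w * caps / caps.sum()
```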