Visualization reconstruction technology aims to transform graphics into data forms that can be parsed and manipulated by machines, providing the basic information necessary for large-scale analysis, reuse, and retrieval of visualizations. However, existing reconstruction methods focus on recovering visual information while ignoring the key role of interaction information in data analysis and understanding. To address this problem, a visualization interaction information reconstruction method for machine understanding was proposed. Firstly, interactions were defined formally to divide visual elements into different visual groups, and automated tools were used to extract the interaction information of visual graphics. Secondly, the associations between interactions and visual elements were decoupled, and the interactions were split into independent experimental variables to build an interaction entity library. Thirdly, a standardized declarative language was formulated to enable querying of the interaction information. Finally, migration rules were designed to achieve migration and adaptation of interactions among different visualizations, based on visual element matching and an adaptive adjustment mechanism. The experimental cases focused on downstream tasks for machine understanding, such as visual question answering, querying, and migration. The results show that adding interaction information enables machines to understand the semantics of visual interaction, thereby expanding the application scope of the above tasks. These experimental results verify that the proposed method achieves structural integrity of the reconstructed visual graphics by integrating dynamic interaction information.
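A minimal sketch of the interaction entity library and a declarative-style query over it may clarify the idea; the paper's actual schema and query language are not specified here, so every field name and trigger value below is an illustrative assumption.

# Hypothetical interaction entity library and a declarative-style query;
# all field names ("trigger", "target_group", "effect") are assumptions.
from dataclasses import dataclass

@dataclass
class InteractionEntity:
    trigger: str        # e.g. "hover", "click", "brush"
    target_group: str   # visual group the interaction is bound to
    effect: str         # e.g. "highlight", "filter", "tooltip"

# A tiny interaction entity library extracted from one visualization.
library = [
    InteractionEntity("hover", "bars", "tooltip"),
    InteractionEntity("click", "legend", "filter"),
    InteractionEntity("brush", "scatter", "highlight"),
]

def query(entities, **conditions):
    """Return entities whose fields match every given condition."""
    return [e for e in entities
            if all(getattr(e, k) == v for k, v in conditions.items())]

# The declarative spirit of "SELECT * WHERE trigger = 'click'".
print(query(library, trigger="click"))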
Affective computing can improve teaching effectiveness and the learning experience in intelligent education. Current research on affective computing in the classroom domain still suffers from limited adaptability and weak perception in complex scenarios. To address these challenges, a novel hybrid architecture, SC-ACNet, was proposed, aiming at accurate affective computing for students in the classroom. The architecture includes: a multi-scale student face detection module capable of adapting to small targets; an affective computing module with an adaptive spatial structure that adapts to different facial postures to recognize five emotions (calm, confused, jolly, sleepy, and surprised) of students in the classroom; and a self-attention module that visualizes the regions of the model contributing most to the results. In addition, a new student classroom dataset, SC-ACD, was constructed to alleviate the lack of facial emotion image datasets for classrooms. Experimental results on the SC-ACD dataset show that SC-ACNet improves the mean Average Precision (mAP) by 4.2 percentage points and the accuracy of affective computing by 9.1 percentage points compared with the baseline method YOLOv7. Furthermore, SC-ACNet achieves accuracies of 0.972 and 0.994 on the common sentiment datasets KDEF and RaFD, respectively, validating the viability of the proposed method as a promising solution for elevating the quality of teaching and learning in the intelligent classroom.
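The following is a minimal PyTorch sketch of the five-emotion output stage only; SC-ACNet's actual YOLOv7-based backbone, adaptive spatial structure, and self-attention module are not reproduced, and the feature dimensions are assumptions.

# Stand-in five-class emotion head; backbone features are faked with
# random tensors, and feat_dim=256 is an assumed dimension.
import torch
import torch.nn as nn

EMOTIONS = ["calm", "confused", "jolly", "sleepy", "surprised"]

class EmotionHead(nn.Module):
    def __init__(self, feat_dim=256, n_classes=len(EMOTIONS)):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # collapse spatial dims
        self.fc = nn.Linear(feat_dim, n_classes)

    def forward(self, feats):                 # feats: (B, C, H, W)
        x = self.pool(feats).flatten(1)       # (B, C)
        return self.fc(x)                     # raw logits per emotion

head = EmotionHead()
logits = head(torch.randn(2, 256, 7, 7))      # fake backbone features
probs = logits.softmax(dim=1)
print([EMOTIONS[i] for i in probs.argmax(dim=1).tolist()])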
Focusing on the issue that current classification models are generally effective only on texts of a single length, whereas long and short texts occur together in real-world scenarios, a General Long and Short Text Classification Model based on Hybrid Neural Network (GLSTCM-HNN) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was applied to encode texts dynamically. Then, convolution operations were used to extract local semantic information, and a Dual Channel ATTention mechanism (DCATT) was built to enhance key text regions. Meanwhile, a Recurrent Neural Network (RNN) was utilized to capture global semantic information, and a Long Text Cropping Mechanism (LTCM) was established to filter critical texts. Finally, the extracted local and global features were fused and fed into a Softmax function to obtain the output category. In comparison experiments on four public datasets, GLSTCM-HNN improves the F1 score by up to 3.87 and 5.86 percentage points over the baseline model BERT-TextCNN and the best-performing comparison model BERT, respectively. In two generality experiments on mixed texts, GLSTCM-HNN improves the F1 score by 6.63 and 37.22 percentage points, respectively, over the generality model CBLGA (CNN-BiLSTM/BiGRU hybrid text classification model based on Attention) proposed in existing research. Experimental results show that the proposed model can effectively improve the accuracy of text classification, and that it generalizes both to texts whose lengths differ from the training data and to mixed long and short texts.
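A small PyTorch sketch of the local/global fusion idea follows: a convolution branch for local features and an RNN branch for global features, concatenated before classification. A random embedding stands in for BERT, and DCATT and LTCM are omitted, so this is only the skeleton of the hybrid design.

# Skeleton of the hybrid local/global fusion; vocab, dim, and n_classes
# are placeholder values, and nn.Embedding substitutes for BERT.
import torch
import torch.nn as nn

class HybridTextClassifier(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)          # stand-in for BERT
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.fc = nn.Linear(2 * dim, n_classes)

    def forward(self, ids):                          # ids: (B, L)
        x = self.emb(ids)                            # (B, L, D)
        local = self.conv(x.transpose(1, 2)).amax(dim=2)  # max over positions
        _, h = self.rnn(x)                           # final hidden state
        fused = torch.cat([local, h[-1]], dim=1)     # local + global features
        return self.fc(fused)                        # logits; softmax at loss

model = HybridTextClassifier()
print(model(torch.randint(0, 1000, (2, 32))).shape)  # torch.Size([2, 4])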
In the Capacitated Vehicle Routing Problem (CVRP), uncertain factors such as traffic congestion, resource supply, and customer demand can easily render a single optimal solution infeasible or suboptimal. To solve this problem, a Multimodal Differential Evolution (MDE) algorithm was proposed to obtain multiple alternative vehicle routing schemes with similar objective values. Firstly, combined with the characteristics of CVRP, an efficient encoding and decoding strategy for solution individuals was constructed, and the quality of solution individuals was improved with a repair mechanism. Secondly, within the framework of the Differential Evolution (DE) algorithm, a dynamic-radius niche generation method was introduced from the perspective of multimodal optimization, and the Jaccard coefficient was used to measure the similarity between solution individuals, thereby realizing the calculation of distances between them. Finally, the neighborhood search strategy was modified, and a multimodal optimal solution set was obtained with an elite archiving and updating strategy. Simulation results on typical datasets show that the average number of optimal solutions obtained by the proposed MDE algorithm reaches 1.7434, and the deviation between its average optimal solution and the known optimal solution is 0.03%, better than the 0.8486 and 0.63% obtained by the DE algorithm. It can be seen that the proposed algorithm has high effectiveness and stability in solving CVRP, and can obtain multiple optimal solutions for CVRP simultaneously.
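A minimal sketch of the Jaccard-based distance between two CVRP solutions, as used for niching, is given below. Here a solution is represented by the set of directed edges of its routes with 0 denoting the depot; the encoding/decoding scheme and the DE operators are omitted, and the edge-set representation is an assumption.

# Jaccard distance between two CVRP solutions via their route edge sets.
def route_edges(routes):
    """Collect (from, to) edges of all routes, each starting and ending at depot 0."""
    edges = set()
    for r in routes:
        tour = [0] + list(r) + [0]
        edges.update(zip(tour, tour[1:]))
    return edges

def jaccard_distance(sol_a, sol_b):
    """1 - |intersection| / |union| of the two solutions' edge sets."""
    a, b = route_edges(sol_a), route_edges(sol_b)
    return 1.0 - len(a & b) / len(a | b)

s1 = [[1, 2, 3], [4, 5]]          # two vehicles serving customers 1..5
s2 = [[1, 2], [3, 4, 5]]
print(jaccard_distance(s1, s2))   # 0.6; smaller values mean more similar routes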
Integrating cost sensitivity and resampling methods into ensemble algorithms is an effective hybrid strategy for imbalanced data classification. Concerning the problem that the misclassification cost calculation and the undersampling process in existing hybrid methods rarely consider the intra-class and inter-class distributions of samples, a Boosting algorithm based on Ball Cluster Partitioning and UnderSampling with Density Peak optimization (DPBCPUSBoost) was proposed. Firstly, density peak information was used to define the sampling weights of majority samples: each majority ball cluster with a “neighbor cluster” was divided into an “easily misclassified area” and a “hardly misclassified area”, and the sampling weights of samples in the “easily misclassified area” were increased. Secondly, the majority samples were undersampled according to these sampling weights in the first iteration, and according to the sample distribution weights in every subsequent iteration; the weak classifier was then trained on a temporary training set combining the undersampled majority samples with all minority samples. Finally, the density peak information of samples was combined with their class distribution to define different misclassification costs for all samples, and the weights of samples with higher misclassification costs were increased by a cost adjustment function. Experimental results on 10 KEEL datasets indicate that DPBCPUSBoost achieves the highest performance on more datasets than imbalanced data classification algorithms such as Adaptive Boosting (AdaBoost), Cost-sensitive AdaBoost (AdaCost), Random UnderSampling Boosting (RUSBoost), and UnderSampling and Cost-sensitive Boosting (USCBoost), in terms of evaluation metrics such as Accuracy, F1-Score, Geometric Mean (G-mean), and Area Under the Receiver Operating Characteristic Curve (AUC). These results verify that the definitions of sample misclassification cost and sampling weight in the proposed DPBCPUSBoost are effective.
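The following is a minimal sketch of the weight-based majority undersampling step only; the density-peak weighting itself requires the full algorithm, so uniform placeholder weights are used here, and the data are synthetic.

# Weighted undersampling of the majority class to build a balanced
# temporary training set; the weights would come from density peaks.
import numpy as np

rng = np.random.default_rng(0)

X_maj = rng.normal(0, 1, size=(100, 2))     # majority class samples
X_min = rng.normal(3, 1, size=(10, 2))      # minority class samples

# In the paper these weights come from density-peak information, with
# samples in the "easily misclassified area" upweighted; uniform here.
w = np.ones(len(X_maj)) / len(X_maj)

# Draw as many majority samples as there are minority samples, without
# replacement, proportionally to the sampling weights.
idx = rng.choice(len(X_maj), size=len(X_min), replace=False, p=w)
X_train = np.vstack([X_maj[idx], X_min])
y_train = np.array([0] * len(idx) + [1] * len(X_min))
print(X_train.shape, np.bincount(y_train))  # balanced temporary training set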
A large number of duplicate images in a database not only degrades the performance of the learner but also consumes considerable storage space. For massive image deduplication, a duplicate detection algorithm for massive images was proposed based on pHash (perceptual Hashing). Firstly, the pHash values of all images were generated. Secondly, each pHash value was divided into several parts of equal length; if any one of the corresponding pHash parts of two images was equal, the two images were regarded as potential duplicates. Finally, the transitivity of image duplication was discussed, and corresponding algorithms were proposed for the transitive and non-transitive cases. Experimental results show that the proposed algorithms are effective in processing massive images: with a similarity threshold of 13, detecting duplicates among nearly 300 000 images with the proposed transitive algorithm takes only about two minutes, with an accuracy of around 53%.
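A minimal sketch of the part-matching idea follows: each 64-bit pHash is split into equal parts, images sharing any part become candidate duplicates, and candidates are confirmed with the Hamming-distance threshold (13 in the experiments). The part count of 4 and the toy hash values are assumptions for illustration.

# Bucket images by hash parts, then verify candidates by Hamming distance.
from collections import defaultdict
from itertools import combinations

N_PARTS, BITS = 4, 64                      # part count is an assumption here

def parts(h):
    """Split a 64-bit hash into N_PARTS equal-width integer chunks."""
    width = BITS // N_PARTS
    return [(h >> (i * width)) & ((1 << width) - 1) for i in range(N_PARTS)]

def find_duplicates(hashes, threshold=13):
    buckets = defaultdict(list)            # (part index, part value) -> images
    for img, h in hashes.items():
        for i, p in enumerate(parts(h)):
            buckets[(i, p)].append(img)
    pairs = set()
    for cand in buckets.values():          # candidates share at least one part
        for a, b in combinations(cand, 2):
            if bin(hashes[a] ^ hashes[b]).count("1") <= threshold:
                pairs.add((a, b))
    return pairs

hashes = {"a.jpg": 0x1234_5678_9ABC_DEF0,
          "b.jpg": 0x1234_5678_9ABC_DEF1,   # differs in 1 bit from a.jpg
          "c.jpg": 0xFFFF_0000_FFFF_0000}
print(find_duplicates(hashes))             # {('a.jpg', 'b.jpg')}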