To reduce the heavy demand for labeled information in supervised learning, a self-supervised learning method based on minimal prior knowledge was proposed. Firstly, the unlabeled data were clustered on the basis of prior knowledge about the data, or initial labels were generated for the unlabeled data based on their center distances to the labeled data. Secondly, subsets of the labeled data were drawn randomly, and a machine learning method was selected to build sub-models on them. Thirdly, the weight and error of each extraction were calculated, and the average error over the data was taken as the data-label degree of the dataset; an iteration threshold was then set from the initial data-label degree. Finally, the termination condition was determined by comparing the data-label degree against this threshold during iteration. Experimental results on 10 UCI public datasets show that, compared with unsupervised learning algorithms such as K-means, supervised learning methods such as Support Vector Machine (SVM), and mainstream self-supervised learning methods such as TabNet (Tabular Network), the proposed method achieves high classification accuracy on unbalanced datasets without using any labels and on balanced datasets using only limited labels.
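A minimal sketch of the iterative self-labeling loop described above, assuming K-means pseudo-labels and decision-tree sub-models; the relabeling rule and the threshold of half the initial data-label degree are illustrative assumptions, not the authors' exact implementation.

```python
# Self-labeling sketch: cluster for initial labels, fit sub-models on
# random subsets, and stop when the average error ("data-label degree")
# falls below a threshold derived from its initial value.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, _ = load_iris(return_X_y=True)        # treat the data as unlabeled

# Step 1: generate initial labels by clustering (prior: 3 classes).
pseudo_y = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

rng = np.random.default_rng(0)
degrees = []
for it in range(20):
    # Step 2: draw a random subset and fit a sub-model on it.
    idx = rng.choice(len(X), size=len(X) // 2, replace=False)
    model = DecisionTreeClassifier(max_depth=3).fit(X[idx], pseudo_y[idx])

    # Step 3: the average error over the full set serves as the
    # data-label degree for this iteration.
    err = np.mean(model.predict(X) != pseudo_y)
    degrees.append(err)

    # Relabel with the sub-model's predictions for the next round.
    pseudo_y = model.predict(X)

    # Step 4: terminate once the degree drops below an (assumed)
    # threshold of half the initial data-label degree.
    if err == 0.0 or err < 0.5 * degrees[0]:
        break

print(f"stopped after {it + 1} iterations, final label degree {err:.3f}")
```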
A personalized exercise recommendation method combining cognitive diagnosis and a deep factorization machine was proposed to address the single modeling perspective and unreasonable recommendation results of existing cognitive-diagnosis-based exercise recommendation. Firstly, a new method for calculating the relationships between knowledge points was designed to construct a course knowledge tree, and the concept of an enhanced Q matrix was proposed to accurately represent the relationships among the knowledge points contained in exercises. Secondly, the Neural Cognitive Diagnosis with Knowledge-based Discernment (NeuralCD-KD) model was proposed to compute the enhanced Q matrix. In this model, second-order feature crossing and an attention mechanism were used to fuse the internal and external factors of exercise difficulty and to simulate students' cognitive states. The effectiveness of the proposed cognitive diagnosis model was verified on private and public datasets, and the model was able to give reasonable explanations of students' cognitive states. To personalize exercise recommendation, the Neural Knowledge-based Cognitive Diagnosis with Deep Bilinear Factorization Machine (NKD-DBFM) method was proposed by combining the diagnostic model with a deep bilinear factorization machine, and its effectiveness was verified on the private dataset. Compared with the optimal baseline model, Neural Cognitive Diagnosis Model (NeuralCDM), the proposed method improves the Area Under Curve (AUC) by 3.7 percentage points.
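A minimal sketch of the second-order feature cross at the core of a factorization machine, the mechanism the abstract names for fusing difficulty factors; the split into "internal" and "external" factors and all dimensions are illustrative assumptions, not the NeuralCD-KD architecture itself.

```python
# Factorization-machine pairwise interaction:
# 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
import numpy as np

def fm_second_order(x, V):
    """Second-order FM cross of feature vector x with latent factors V."""
    linear = x @ V                  # (k,) per-factor mixed term
    square = (x ** 2) @ (V ** 2)    # (k,) per-factor squared term
    return 0.5 * np.sum(linear ** 2 - square)

rng = np.random.default_rng(0)
n_features, k = 8, 4                # 8 difficulty factors, 4 latent dims
V = rng.normal(scale=0.1, size=(n_features, k))

# Assumed layout: first 4 entries are internal factors (e.g. knowledge-
# point depth), last 4 are external factors (e.g. response statistics).
x = rng.random(n_features)
print("second-order interaction score:", fm_second_order(x, V))
```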
The Multi-Object Tracking (MOT) task needs to track multiple objects simultaneously while ensuring the continuity of object identities. To solve problems in the current MOT process such as object occlusion, object ID Switch (IDSW), and object loss, the Transformer-based MOT model was improved, and a multi-object tracking method based on a dual-decoder Transformer was proposed. Firstly, a set of trajectories was generated by model initialization in the first frame, and in each subsequent frame, attention was used to establish associations between frames. Secondly, the dual decoder was used to correct the tracked object information: one decoder detected the objects, and the other tracked them. Thirdly, histogram template matching was applied to recover lost objects after tracking. Finally, a Kalman filter was used to track and predict occluded objects, and the occlusion results were associated with newly detected objects to ensure the continuity of the tracking results. In addition, on the basis of TrackFormer, modeling of appearance statistics and motion features was added to fuse the different structures. Experimental results on the MOT17 dataset show that, compared with TrackFormer, the proposed algorithm increases the IDentity F1 score (IDF1) by 0.87 percentage points and the Multiple Object Tracking Accuracy (MOTA) by 0.41 percentage points, while reducing the number of IDSWs by 16.3%. The proposed method also achieves good results on the MOT16 and MOT20 datasets. Consequently, it can effectively handle object occlusion, maintain object identity information, and reduce identity loss.
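A minimal sketch of the histogram template matching step used to recover lost objects: compare a stored object histogram against candidate detections and pick the best correlation. Synthetic patches stand in for real frame crops, and the H-S color histogram is an assumed choice of template feature.

```python
# Re-find a lost object by comparing HSV color histograms with OpenCV.
import cv2
import numpy as np

def hsv_hist(patch):
    """Normalized hue-saturation histogram of a BGR patch."""
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

rng = np.random.default_rng(0)
template = rng.integers(0, 256, (64, 32, 3), dtype=np.uint8)  # lost object
candidates = [rng.integers(0, 256, (64, 32, 3), dtype=np.uint8)
              for _ in range(3)]
candidates.append(template.copy())          # the re-appearing object

t_hist = hsv_hist(template)
scores = [cv2.compareHist(t_hist, hsv_hist(c), cv2.HISTCMP_CORREL)
          for c in candidates]
best = int(np.argmax(scores))
print(f"best match: candidate {best}, correlation {scores[best]:.3f}")
```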
To solve the problems that the low-level features of the backbone are not fully utilized and that effective features are lost due to large-factor upsampling in DeepLabV3+ semantic segmentation, a Cumulative Distribution Channel Attention DeepLabV3+ (CDCA-DLV3+) model was proposed. Firstly, a Cumulative Distribution Channel Attention (CDCA) module was designed on the basis of the cumulative distribution function and channel attention. Then, this module was used to obtain effective low-level features from the backbone. Finally, a Feature Pyramid Network (FPN) was adopted for feature fusion and gradual upsampling to avoid the feature loss caused by large-factor upsampling. On the Pascal Visual Object Classes (VOC) 2012 validation set and the Cityscapes dataset, the mean Intersection over Union (mIoU) of the CDCA-DLV3+ model was 80.09% and 80.11% respectively, which was 1.24 and 1.02 percentage points higher than that of the DeepLabV3+ model. Experimental results show that the proposed model produces more accurate segmentation results.
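A minimal sketch of channel attention gated by a cumulative distribution function, assuming a standard-normal CDF applied to squeeze-and-excitation-style logits; the paper's exact CDCA formulation may differ, so this only illustrates the general mechanism of CDF-based gating.

```python
# Channel attention with a Gaussian CDF in place of the usual sigmoid.
import torch
import torch.nn as nn

class CDFChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                   # squeeze: per-channel mean
        z = self.fc(s)                           # excitation logits
        # Standard-normal CDF maps logits to gates in (0, 1).
        w = 0.5 * (1 + torch.erf(z / 2 ** 0.5))
        return x * w.unsqueeze(-1).unsqueeze(-1)

feat = torch.randn(2, 64, 32, 32)                # low-level backbone feature
out = CDFChannelAttention(64)(feat)
print(out.shape)                                 # torch.Size([2, 64, 32, 32])
```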
Popular science text classification aims to classify popular science articles according to the popular science classification system. Since such articles often exceed 1 000 words, traditional models struggle to focus on the key points, and their classification performance suffers. To reduce the interference of topic-irrelevant information and improve classification performance, a long-text classification model that uses a knowledge graph to perform two-level screening was proposed. First, a four-step method was used to construct a knowledge graph for popular science domains. Then, this knowledge graph was used as a source of distant supervision to train sentence filters that remove irrelevant information. Finally, an attention mechanism was used to further screen the information in the filtered sentence set, completing the attention-based topic classification model. Experimental results on the constructed Popular Science Classification Dataset (PSCD) show that the model, enhanced with domain knowledge graph information, achieves a higher F1-score: compared with the TextCNN model and the BERT (Bidirectional Encoder Representations from Transformers) model, its F1-score is 2.88 and 1.88 percentage points higher respectively, verifying the effectiveness of knowledge graphs for screening information in long texts.
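A minimal sketch of the two-level screening idea: first discard sentences with no knowledge-graph entity overlap, then weight the survivors with a soft attention over entity-hit counts. The toy knowledge graph, the overlap-count filter, and the softmax scoring are all illustrative assumptions standing in for the trained sentence filters and attention model.

```python
# Two-level screening: KG-based sentence filter, then soft attention.
import numpy as np

kg_entities = {"black hole", "event horizon", "gravity"}  # toy domain KG

def sentence_filter(sentences, min_hits=1):
    """Level 1: keep sentences mentioning at least one KG entity."""
    kept = []
    for s in sentences:
        hits = sum(e in s.lower() for e in kg_entities)
        if hits >= min_hits:
            kept.append((s, hits))
    return kept

def attention_weights(kept):
    """Level 2: softmax over entity-hit counts as a crude attention."""
    scores = np.array([h for _, h in kept], dtype=float)
    w = np.exp(scores) / np.exp(scores).sum()
    return w

doc = [
    "A black hole bends light near its event horizon.",
    "The author grew up in a small coastal town.",
    "Gravity dominates the dynamics of a black hole.",
]
kept = sentence_filter(doc)
for (s, _), w in zip(kept, attention_weights(kept)):
    print(f"{w:.2f}  {s}")
```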
In view of the common failure to fully utilize historical information and the slow parameter optimization in research on clustering algorithms, an adaptive classification algorithm based on the data field was proposed in combination with edge intelligent computing; it can be deployed on Edge Computing (EC) nodes to provide local intelligent classification services. By introducing supervision information to modify the structure of the traditional data-field clustering model, the proposed algorithm made the traditional data field applicable to classification problems, extending the applicable scope of data field theory. Following the data-field idea, the algorithm transformed the domain value space of the data into a data potential field space and divided the data into several unlabeled clusters according to the spatial potential values. After comparing the clusters with historical supervision information by cloud-model similarity, each cluster was assigned to the most similar category. Besides, a parameter search strategy based on a sliding step length was proposed to speed up the parameter optimization of the algorithm. On this basis, a distributed data processing scheme was proposed: through the cooperation of the cloud center and edge devices, classification tasks were partitioned and distributed to nodes at different levels to achieve modularity and low coupling. Simulation results show that the precision and recall of the proposed algorithm remained above 96% and the Hamming loss was below 0.022. Experimental results show that the proposed algorithm classifies accurately, accelerates parameter optimization, and outperforms the Logistic Regression (LR) and Random Forest (RF) algorithms in overall performance.
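A minimal sketch of the data-field idea: each labeled point radiates a Gaussian potential, and a query point is assigned to the class whose accumulated potential is highest. The impact factor sigma, the Gaussian kernel, and the toy two-class data are assumptions; the paper's cluster-then-match pipeline with cloud-model similarity is simplified to direct per-class potential comparison.

```python
# Classify a point by the class-wise data-field potential it sits in.
import numpy as np

def potential(query, points, sigma=1.0):
    """Data-field potential of `query` induced by source `points`."""
    d2 = np.sum((points - query) ** 2, axis=1)
    return np.sum(np.exp(-d2 / (2 * sigma ** 2)))

rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0, 0], scale=0.5, size=(30, 2))
class_b = rng.normal(loc=[3, 3], scale=0.5, size=(30, 2))

query = np.array([2.5, 2.8])
scores = {"A": potential(query, class_a), "B": potential(query, class_b)}
print("potentials:", {k: round(v, 3) for k, v in scores.items()})
print("assigned class:", max(scores, key=scores.get))
```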
When big-data stream computation tasks with different attributes generated by connected-vehicle nodes are transmitted and offloaded, issues such as delay jitter, high computational energy consumption, and high system overhead usually arise. Therefore, according to the actual communication environment, a task offloading and resource allocation scheme based on the Simulated Annealing Algorithm (SAA) in Cellular Vehicle to Everything (C-V2X) Internet of Vehicles (IoV) was proposed. Firstly, according to task processing priority, tasks with high priority were processed by collaborative offloading and computing. Secondly, an SAA-based task offloading strategy was developed by globally searching for the optimal offloading scale factor, which was then analyzed and optimized. Finally, during the update of the offloading scale factor, the problem of minimizing the system overhead was transformed into a convex optimization problem of power and computational resource allocation, and the Lagrange multiplier method was used to obtain the optimal solution. Comparisons with local offloading and the adaptive genetic algorithm show that, as the task data size increases, the delay, power consumption, and system overhead of the adaptive genetic algorithm are 5.97%, 49.40%, and 49.36% lower respectively than those of local offloading, and those of the proposed SAA-based scheme are a further 6.35%, 92.27%, and 91.7% lower than those of the adaptive genetic algorithm. As the CPU cycles of the task increase, the delay, power consumption, and system overhead of the adaptive genetic algorithm are 16.4%, 49.58%, and 49.23% lower respectively than those of local offloading, and those of the proposed SAA-based scheme are a further 19.61%, 94.39%, and 89.88% lower than those of the adaptive genetic algorithm. Experimental results show that the SAA-based scheme can not only reduce the delay, power consumption, and system overhead of the communication system but also accelerate convergence of the results.
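A minimal sketch of simulated annealing over a single offloading scale factor in [0, 1]. The cost model (weighted delay plus local computing energy, with parallel local and offloaded execution) and all constants are stand-ins, not the paper's exact overhead function.

```python
# Simulated annealing over the offloading scale factor alpha.
import math
import random

def system_cost(alpha, task_bits=8e6, f_local=1e9, f_edge=8e9,
                rate=20e6, kappa=1e-27, cycles_per_bit=1000):
    local_bits = (1 - alpha) * task_bits
    off_bits = alpha * task_bits
    t_local = local_bits * cycles_per_bit / f_local        # local compute
    t_off = off_bits / rate + off_bits * cycles_per_bit / f_edge
    e_local = kappa * local_bits * cycles_per_bit * f_local ** 2
    delay = max(t_local, t_off)          # local and edge run in parallel
    return 0.5 * delay + 0.5 * e_local   # weighted system overhead

random.seed(0)
alpha, cost = 0.5, system_cost(0.5)
T = 1.0
while T > 1e-4:
    # Propose a neighbor and accept it if better, or with the
    # Metropolis probability exp(-(new - old) / T) if worse.
    cand = min(1.0, max(0.0, alpha + random.uniform(-0.1, 0.1)))
    c = system_cost(cand)
    if c < cost or random.random() < math.exp((cost - c) / T):
        alpha, cost = cand, c
    T *= 0.95                            # geometric cooling schedule

print(f"offloading factor ~ {alpha:.3f}, overhead {cost:.4f}")
```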
Existing privacy protection methods for data with multiple numerical sensitive attributes not only incur large information loss on quasi-identifier attributes but also fail to satisfy users' personalized needs for ranking the importance of numerical sensitive attributes. To solve these problems, a personalized privacy protection method based on clustering and weighted Multi-Sensitive Bucketization (MSB) was proposed. Firstly, according to the similarity of quasi-identifiers, the dataset was divided into several subsets with similar quasi-identifier attribute values. Then, considering users' different sensitivities to the sensitive attributes, the sensitivity and the bucket capacity of multi-dimensional buckets were used to calculate the weighted selectivity and to construct the weighted multi-dimensional buckets. Finally, the data were grouped and anonymized accordingly. Eight attributes of the standard UCI Adult dataset were selected for experiments, and the proposed method was compared with MNSACM and WMNSAPM. Experimental results show that the proposed method performs better overall and is significantly superior to the comparison methods in reducing information loss and running time, improving both data quality and operating efficiency.
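A minimal sketch of the two-stage structure: cluster records by quasi-identifiers, then bucket each cluster along a sensitive attribute with a bucket count scaled by a user-supplied sensitivity weight. The weight, bucket-count rule, and synthetic data are illustrative assumptions, not the paper's weighted-selectivity formula.

```python
# Cluster by quasi-identifiers, then weighted bucketization per cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 200
quasi = np.column_stack([rng.integers(20, 70, n),       # age
                         rng.integers(1, 60, n)])       # hours per week
sensitive = rng.normal(50_000, 15_000, n)               # e.g. income

# Stage 1: group records with similar quasi-identifier values.
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(quasi)

# Stage 2: per cluster, cut the sensitive attribute into quantile
# buckets; a higher sensitivity weight yields finer buckets.
weight = 0.8
for c in range(5):
    vals = sensitive[clusters == c]
    n_buckets = max(2, int(round(weight * np.sqrt(len(vals)))))
    edges = np.quantile(vals, np.linspace(0, 1, n_buckets + 1))
    buckets = np.digitize(vals, edges[1:-1])
    print(f"cluster {c}: {len(vals)} records in {n_buckets} buckets")
```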
Aiming at the energy limitation and secure communication problems of Unmanned Aerial Vehicle (UAV) communication, a security scheme for a mobile edge computing system based on UAV Wireless Power Transfer (WPT) was proposed, which jointly optimizes the UAV's transmit power and flight trajectory to maximize the average secrecy capacity of the system. Because the objective function is non-smooth, it is difficult to solve directly. Therefore, the non-smooth objective function was first reconstructed into an equivalent smooth one. Then, the reconstructed problem was decoupled into two sub-problems: optimizing the transmit power for a given flight trajectory, and optimizing the flight trajectory for a given transmit power; their optimal solutions were obtained by a two-dimensional binary search method and a successive convex approximation algorithm respectively. Finally, the two sub-problems were solved iteratively and alternately through the block coordinate descent method to find a suboptimal solution of the original problem. Comparing the proposed scheme with the trajectory optimization scheme without transmit power control, the optimal trajectory optimization scheme with transmit power control, and the optimal trajectory optimization scheme without transmit power control shows that: as the flight time increases, the average secrecy rate of the proposed scheme is 36.0%, 9.2%, and 34.8% higher respectively; as the transmit power increases, the average secrecy rate of the proposed scheme is 12.4%, 3.0%, and 14.4% higher respectively. Therefore, the proposed scheme can effectively improve the average secrecy rate of the system.
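A minimal sketch of the alternating block coordinate descent scheme: fix the trajectory variable and optimize the power by binary search on the slope, then fix the power and update the trajectory variable, repeating until convergence. The smooth toy objective below stands in for the reconstructed secrecy-capacity function, and both one-dimensional searches replace the paper's two-dimensional binary search and successive convex approximation steps.

```python
# Block coordinate descent alternating two 1-D binary-search maximizations.
import numpy as np

def secrecy(p, q):
    """Toy smooth surrogate: concave in p and unimodal in q."""
    return np.log1p(p / (1 + q ** 2)) - 0.1 * p - 0.05 * (q - 2) ** 2

def bisect_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal f on [lo, hi] by bisection on its slope."""
    while hi - lo > tol:
        m = 0.5 * (lo + hi)
        if f(m + tol) > f(m - tol):   # slope still positive at m
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

p, q = 1.0, 0.0                       # transmit power, trajectory variable
for _ in range(30):                   # alternate the two blocks
    p = bisect_max(lambda v: secrecy(v, q), 0.0, 10.0)
    q = bisect_max(lambda v: secrecy(p, v), 0.0, 5.0)

print(f"p* = {p:.4f}, q* = {q:.4f}, secrecy = {secrecy(p, q):.4f}")
```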