To address the problems of missing distribution information and distribution collapse encountered by deep variational text clustering models in practical applications, a Deep Variational text Clustering Model based on Distribution augmentation (DVCMD) was proposed. In this model, augmented latent semantic distributions were integrated into the original latent semantic distribution to enrich the distribution information, thereby improving the completeness and accuracy of the latent distribution. At the same time, a distribution consistency constraint strategy was employed to encourage the model to learn consistent semantic representations, strengthening the ability of the learned semantic distributions to express the true information of the data and thus improving clustering performance. Experimental results show that, compared with existing deep clustering models and structural semantic-enhanced clustering models, DVCMD improves the Normalized Mutual Information (NMI) metric by at least 0.16, 9.01, 2.30, and 2.72 percentage points on four real-world datasets (Abstract, BBC, Reuters-10k, and BBCSports), respectively, validating the effectiveness of the model.
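The two ingredients above, fusing an augmented latent distribution into the original one and penalizing disagreement between them, can be sketched for 1-D Gaussian latents. This is a minimal pure-Python illustration, not the paper's implementation: the fusion weight `alpha` and the symmetric-KL form of the consistency constraint are assumptions.

```python
import math

def kl_gaussian(mu1, var1, mu2, var2):
    """KL divergence KL(N(mu1,var1) || N(mu2,var2)) between 1-D Gaussians."""
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def consistency_loss(dist_a, dist_b):
    """Symmetric KL used as a distribution-consistency penalty (assumed form)."""
    (m1, v1), (m2, v2) = dist_a, dist_b
    return 0.5 * (kl_gaussian(m1, v1, m2, v2) + kl_gaussian(m2, v2, m1, v1))

def fuse(dist_orig, dist_aug, alpha=0.5):
    """Convex fusion of the original and augmented latent distributions
    (alpha is a hypothetical mixing weight)."""
    (m1, v1), (m2, v2) = dist_orig, dist_aug
    return (alpha * m1 + (1 - alpha) * m2, alpha * v1 + (1 - alpha) * v2)
```

When the two distributions agree, the penalty vanishes, so minimizing it pushes the encoder toward consistent semantic representations.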
Aiming at the problem of low data availability caused by existing perturbation mechanisms that do not consider the semantic relationships of location points, a Trajectory Location Privacy protection Mechanism based on Differential Privacy, namely DP-TLPM, was proposed. Firstly, sliding windows were used to extract trajectory dwell points and generate fuzzy regions, and the regions were sampled using the exponential and Laplacian mechanisms. Secondly, a road network matching algorithm was proposed to eliminate possible semantically free location points among the sampled points, with the trajectory segmented and iteratively matched using Error Ellipse Matching (EEM). Finally, a perturbed trajectory was formed from the matched location points and sent to the server by the user. The mechanism was evaluated comprehensively in terms of confusion quality and Root Mean Square Error (RMSE). Compared with the GeoInd algorithm, DP-TLPM reduces data quality loss by 24% and improves trajectory confusion quality by 52%, verifying the effectiveness of DP-TLPM in terms of both privacy protection strength and data quality.
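The Laplace-mechanism sampling step above can be sketched in a few lines. This is a generic sketch, not DP-TLPM itself: it perturbs a dwell point's coordinates with Laplace noise calibrated as `sensitivity / epsilon` (the standard Laplace-mechanism scale), drawn by inverse-CDF sampling; the sensitivity value and per-coordinate noising are illustrative assumptions.

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Draw one sample from Laplace(0, scale) via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_point(lat, lon, epsilon, sensitivity=1.0, rng=random):
    """Perturb a dwell point with Laplace noise of scale sensitivity/epsilon.
    Smaller epsilon means stronger privacy and larger expected displacement."""
    b = sensitivity / epsilon
    return lat + laplace_noise(b, rng), lon + laplace_noise(b, rng)
```

In the full mechanism, the perturbed point would then pass through the road network matching step to discard semantically implausible locations.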
Aiming at the problems of noise arising from users' unexpected interactions in practical recommendation scenarios and the difficulty of capturing short-term demand biases due to the dispersed attention in the self-attention mechanism, a model namely FTARec (sequential Recommendation based on hierarchical Filter and Temporal convolution enhanced self-Attention network) was proposed. Firstly, a hierarchical filter was used to filter noise in the original data. Then, user embeddings were obtained by combining a temporal convolution enhanced self-attention network with decoupled hybrid position encoding; in this process, the deficiency of the self-attention network in modeling short-term dependencies among items was remedied by the temporal convolution enhancement. Finally, contrastive learning was incorporated to refine the user embeddings, and predictions were made based on the final user embeddings. Compared with existing sequential recommendation models such as Self-Attentive Sequential Recommendation (SASRec) and the Filter-enhanced Multi-Layer Perceptron approach for sequential Recommendation (FMLP-Rec), FTARec achieves higher Hit Rate (HR) and Normalized Discounted Cumulative Gain (NDCG) on three publicly available datasets: Beauty, Clothing, and Sports. Compared with the second-best model DuoRec, FTARec improves HR@10 by 7.91%, 13.27%, and 12.84%, and NDCG@10 by 5.52%, 8.33%, and 9.88% on the three datasets, respectively, verifying the effectiveness of the proposed model.
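The HR@K and NDCG@K metrics reported above are standard and easy to state precisely. The sketch below assumes the common single-ground-truth-item evaluation protocol for sequential recommendation (one held-out target per user), under which the ideal DCG is 1.

```python
import math

def hit_rate_at_k(ranked_items, target, k=10):
    """HR@K: 1 if the ground-truth item appears in the top-K list, else 0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k=10):
    """NDCG@K with a single ground-truth item: 1/log2(rank+1) if it is
    ranked within the top K (1-based rank), else 0."""
    topk = ranked_items[:k]
    if target not in topk:
        return 0.0
    rank = topk.index(target) + 1
    return 1.0 / math.log2(rank + 1)
```

Averaging these per-user scores over the test set gives the dataset-level HR@10 and NDCG@10 figures compared in the abstract.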
With the rapid development of cloud computing technology, the number of data centers has increased significantly, and the resulting energy consumption problem has gradually become one of the research hotspots. Aiming at the problem of server energy consumption optimization, a data center server Energy Consumption Optimization algorithm combining eXtreme Gradient Boosting (XGBoost) and Multi-Gated Recurrent Unit (Multi-GRU), named ECOXG, was proposed. Firstly, data such as the resource occupation information and the energy consumption of each server component were collected by Linux terminal monitoring commands and power consumption meters, and the data were preprocessed to obtain resource utilization rates. Secondly, the resource utilization rates were concatenated into time series in vector form, which were used to train the Multi-GRU load prediction model, and simulated frequency reduction was applied to the servers according to the prediction results to obtain the load data after frequency reduction. Thirdly, the resource utilization rates of the servers were combined with the energy consumption data at the same moments to train the XGBoost energy consumption prediction model. Finally, the load data after frequency reduction were input into the trained XGBoost model to predict the energy consumption of the servers after frequency reduction. Experiments on the actual resource utilization data of 6 physical servers showed that the ECOXG algorithm reduced the Root Mean Square Error (RMSE) by 50.9%, 31.0%, 32.7%, and 22.9% compared with the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, CNN-GRU, and CNN-LSTM models, respectively. Meanwhile, compared with the LSTM, CNN-GRU, and CNN-LSTM models, the ECOXG algorithm saved 43.2%, 47.1%, and 59.9% of training time, respectively.
Experimental results show that the ECOXG algorithm can provide a theoretical basis for the prediction and optimization of server energy consumption, and that it is significantly better than the comparison algorithms in both accuracy and operating efficiency. In addition, the power consumption of the servers after the simulated frequency reduction is significantly lower than the real power consumption, and the energy-saving effect is particularly pronounced when server utilization rates are low.
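Two mechanical pieces of the pipeline above are easy to make concrete: turning a utilization series into (window, next value) training pairs for the load predictor, and the RMSE used for evaluation. This is a generic sketch of those two steps; the window length is an arbitrary choice, not the paper's setting.

```python
def make_windows(series, window):
    """Turn a utilization series into (input_window, next_value) pairs,
    the supervised form used to train a sequence load predictor."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs

def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted values."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5
```

In the multivariate case, each window element would be a vector of the utilization rates of several resources rather than a scalar.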
As the existing dynamic programming algorithm cannot quickly solve the Discounted {0-1} Knapsack Problem (D{0-1}KP), a Greedy Core Acceleration Dynamic Programming (GCADP) algorithm was proposed based on the idea of dynamic programming combined with the New Greedy Repair Optimization Algorithm (NGROA) and the core algorithm, which accelerates solving by reducing the problem scale. Firstly, the incomplete item was obtained from the greedy solution of the problem given by NGROA. Then, the radius and range of the fuzzy core interval were determined by calculation. Finally, the Basic Dynamic Programming (BDP) algorithm was used to solve for the items in the fuzzy core interval and the items in the same item set. The experimental results show that the GCADP algorithm is suitable for solving D{0-1}KP. Meanwhile, the average solution speed of GCADP is improved by 76.24% and 75.07% respectively compared with that of the BDP algorithm and FirEGA (First Elitist reservation strategy Genetic Algorithm).
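The greedy step that yields the incomplete (break) item can be illustrated with the classical density-ordered greedy for the plain 0-1 knapsack. This is a simplified stand-in, not NGROA: the discount structure of D{0-1}KP (item sets with discounted combinations) is omitted, and only the role of the first item that no longer fits, around which the core interval is built, is shown.

```python
def greedy_knapsack(weights, values, capacity):
    """Greedy 0-1 knapsack by value density. Returns the chosen item indices,
    their total value, and the first item that did not fit (the 'incomplete'
    or break item used to locate the core interval)."""
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    taken, total_w, total_v, incomplete = [], 0, 0, None
    for i in order:
        if total_w + weights[i] <= capacity:
            taken.append(i)
            total_w += weights[i]
            total_v += values[i]
        elif incomplete is None:
            incomplete = i  # break item: core interval is centered near here
    return taken, total_v, incomplete
```

A core-based method then runs exact dynamic programming only over items whose densities lie near the break item, which is what shrinks the problem scale.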
Since the Scale Invariant Local Ternary Pattern (SILTP) background modeling algorithm has high complexity and slow computing speed, making it unsuitable for real-time video processing, a new method named Uniform Scale Invariant Local Ternary Pattern (USILTP) background modeling algorithm was proposed. Firstly, the USILTP feature was extracted by restricting the frequency of SILTP coding jumps in order to reduce the feature dimension of SILTP. Secondly, a USILTP background modeling parallel algorithm based on Intel core graphics (Intel HD) and Open Computing Language (OpenCL) technology was designed and implemented to further accelerate the USILTP background modeling algorithm. Finally, the foreground result of the USILTP background modeling algorithm was optimized by combining multiple color channel models. The experimental results show that the proposed algorithm can process 320×240 resolution video at a rate of 98 frame/s on the Intel HD 4600, which is 4 times faster than the SILTP background modeling algorithm. In terms of foreground detection, the performance of the proposed algorithm is improved by 2.1% compared with the SILTP background modeling algorithm on the public dataset.
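The underlying SILTP encoding being compressed here can be sketched for a single pixel: each neighbor is encoded with two bits relative to a scale-invariant tolerance band around the center intensity. This follows the standard SILTP definition; the uniform restriction on coding jumps that defines USILTP is the paper's contribution and is not reproduced here.

```python
def siltp_code(center, neighbors, tau=0.05):
    """Scale Invariant Local Ternary Pattern for one pixel: a neighbor above
    (1+tau)*center codes '01', below (1-tau)*center codes '10', else '00'.
    The band scales with the center value, giving illumination invariance."""
    upper = (1 + tau) * center
    lower = (1 - tau) * center
    bits = []
    for p in neighbors:
        if p > upper:
            bits.append("01")
        elif p < lower:
            bits.append("10")
        else:
            bits.append("00")
    return "".join(bits)
```

Restricting how often the 2-bit pattern may change between adjacent neighbors (analogous to uniform LBP patterns) is what shrinks the feature dimension in the USILTP variant.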
Aiming at the problems of low tampering detection efficiency and low localization accuracy, a homologous video copy-move tampering detection and recovery method based on Geometric Mean Decomposition (GMD) and Structural SIMilarity (SSIM) was proposed. Firstly, the videos were converted into grayscale image sequences. Then, geometric mean decomposition was adopted as the feature, and a block-based search strategy was put forward to locate the starting frame of the duplicated sequence. In addition, SSIM was extended for the first time to measure the similarity between two frames of a video, and the starting frame of the duplicated sequence was rechecked using the structural similarity. Since the similarity between duplicated frames is higher than that between normal inter-frames, a coarse-to-fine method based on SSIM was put forward to locate the tail frame. Finally, the video was recovered. The experimental results show that, in comparison with other classical algorithms, the proposed method can not only detect copy-move forgery but also accurately locate duplicated clips in different kinds of videos, with a great improvement in precision, recall, and computation time.
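The frame-similarity measure used above follows the standard SSIM formulation. The sketch below computes SSIM over a whole flattened grayscale frame as a single window, with the usual constants C1 = (0.01L)² and C2 = (0.03L)²; practical implementations (and presumably the paper's) use local windows, so this is a simplification.

```python
def ssim_global(x, y, data_range=255.0):
    """Single-window SSIM between two equal-size flattened grayscale frames:
    (2*mx*my + C1)(2*cov + C2) / ((mx^2 + my^2 + C1)(vx + vy + C2))."""
    n = len(x)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Duplicated frames score near 1 while ordinary neighboring frames score lower, which is the gap the coarse-to-fine tail-frame search exploits.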
This paper analysed the functions of each part of the USB transfer subsystem in a real-time human motion capture device, and introduced the detailed design of this subsystem. The USB transfer subsystem can transfer data accurately in real time while the device is running and collecting human motion data.