Aiming at the reduction problem in fuzzy relation decision systems, a fuzzy relation decision system with two universes and its attribute reduction concept were proposed by combining the framework of rough set theory with two universes. Firstly, the binary relations induced by conditional attributes and decision attributes were defined as fuzzy relations on different universes, leading to the introduction of the fuzzy relation decision system with two universes. Secondly, to obtain a deeper understanding of the essence of reduction, the concept of approximate reduction in the fuzzy relation decision system with two universes was proposed. Thirdly, based on the definition of approximate reduction, a discernibility matrix corresponding to approximate reduction was designed and constructed, and, after proving its validity, two discernibility matrix-based approximate reduction algorithms, LRFT and URFT, were proposed. Finally, the feasibility and effectiveness of the proposed algorithms were verified through experiments comparing the classification accuracy of datasets before and after reduction.
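As an illustration of the discernibility-matrix idea underlying reduction, the minimal Python sketch below builds the classical (crisp) discernibility matrix of a decision table and extracts a reduct with a greedy set-cover heuristic; the fuzzy relations over two universes and the exact LRFT/URFT procedures are not modeled here, and all names are illustrative.

```python
# Minimal sketch of discernibility-matrix-based attribute reduction
# (classical crisp rough-set version; the paper's LRFT/URFT work on
# fuzzy relations over two universes, which this sketch does not model).

def discernibility_matrix(objects, decisions):
    """objects: list of attribute-value tuples; decisions: list of labels.
    Returns one set of discerning attribute indices per object pair
    with different decisions."""
    entries = []
    n = len(objects)
    for i in range(n):
        for j in range(i + 1, n):
            if decisions[i] != decisions[j]:
                diff = {a for a in range(len(objects[i]))
                        if objects[i][a] != objects[j][a]}
                if diff:
                    entries.append(diff)
    return entries

def greedy_reduct(entries):
    """Greedy set-cover heuristic: repeatedly pick the attribute that
    discerns the most still-uncovered object pairs."""
    reduct, remaining = set(), list(entries)
    while remaining:
        counts = {}
        for e in remaining:
            for a in e:
                counts[a] = counts.get(a, 0) + 1
        best = max(counts, key=counts.get)
        reduct.add(best)
        remaining = [e for e in remaining if best not in e]
    return reduct

# Example: four objects, three conditional attributes, binary decision.
objs = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
dec = [0, 1, 0, 1]
print(greedy_reduct(discernibility_matrix(objs, dec)))  # e.g. {1}
```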
To address the insufficient design of the edge-weight window threshold in Text Graph Convolutional Network (Text GCN), so as to mine word association structures more accurately and improve prediction accuracy, a fake review detection algorithm combining Gaussian Mixture Model (GMM) and Text GCN, named F-Text GCN, was proposed. The edge signal strength of fake reviews, which is relatively weak compared to that of normal reviews given the training data size, was improved by using the ability of GMM to separate noise edge-weight distributions. Additionally, considering the diversity of information sources, the adjacency matrix was constructed by combining documents, words, reviews and non-text features. Finally, the fake review association structure in the adjacency matrix was extracted through the spectral decomposition of Text GCN. Validation experiments were performed on 126 086 real Chinese reviews collected from a large Chinese e-commerce platform. Experimental results show that, for detecting fake reviews, the F1 value of F-Text GCN is 82.92%, outperforming BERT (Bidirectional Encoder Representations from Transformers) and Text CNN by 10.46% and 11.60%, respectively, and exceeding Text GCN by 2.94%. For highly imitated fake reviews, which are challenging to detect, F-Text GCN achieves an overall prediction accuracy of 94.71% by performing secondary detection on the samples that Support Vector Machine (SVM) found difficult to detect, which is 2.91% and 14.54% higher than those of Text GCN and SVM, respectively. The study also finds that the lexical interference exerted on consumer decision-making by fake reviews is evident in their second-order graph neighbor structure, indicating that the proposed algorithm is especially suitable for extracting long-range word collocation structures and global sentence feature pattern variations for fake review detection.
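The following minimal sketch illustrates the GMM-based edge-weight denoising step: a two-component Gaussian mixture is fitted to the word-word edge weights, and edges assigned to the lower-mean (noise) component are zeroed out before the adjacency matrix is built. The component count, PMI-style weighting and function names are assumptions; the fusion of review and non-text features in F-Text GCN is not shown.

```python
# Hedged sketch: separating "signal" edge weights from a noise
# component with a 2-component GMM before building the Text GCN
# adjacency matrix. Component count and weighting are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def denoise_edge_weights(weights):
    """weights: 1-D array of word-word edge weights (e.g., PMI values).
    Fits a GMM and zeroes out edges assigned to the lower-mean
    (noise) component."""
    w = np.asarray(weights, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(w)
    signal = int(np.argmax(gmm.means_.ravel()))  # higher-mean component
    keep = gmm.predict(w) == signal
    return np.where(keep, w.ravel(), 0.0)

# Small edges collapse to 0, strong co-occurrence edges survive.
print(denoise_edge_weights([0.05, 0.07, 0.9, 1.1, 0.02, 0.95]))
```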
High Utility Itemset Mining (HUIM) can mine items of high significance from a transaction database, thereby helping users make better decisions. In view of the fact that applying intelligent optimization algorithms can significantly improve the mining efficiency of high utility itemsets in massive data, a survey of intelligent optimization algorithm-based HUIM methods was presented. Firstly, a detailed analysis and summary of these methods were given from three aspects: swarm intelligence optimization-based, evolution-based and other intelligent optimization algorithm-based methods. Meanwhile, the Particle Swarm Optimization (PSO)-based HUIM methods were sorted out in detail in terms of particle update methods, including traditional update strategy-based, sigmoid function-based, greedy-based, roulette-based and ensemble-based methods. Additionally, the swarm intelligence optimization algorithm-based HUIM methods were compared and analyzed from the perspectives of population update methods, comparison algorithms, parameter settings, advantages and disadvantages. Next, the evolution-based HUIM methods were summarized from both genetic and bionic aspects. Finally, future research directions were proposed for the problems of the existing intelligent optimization algorithm-based HUIM methods.
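As an illustration of one particle update family covered by the survey, the sketch below shows a sigmoid-function-based binary PSO update, in which each particle is a bit vector over items and the sigmoid of the velocity gives the probability of selecting an item; the inertia and acceleration parameters are assumptions rather than values from any specific surveyed method.

```python
# Hedged sketch of a sigmoid-function-based binary PSO particle update
# as used by several HUIM methods: particles are bit vectors over items.
# Parameter values (w, c1, c2) are illustrative assumptions.
import math
import random

def update_particle(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """x, pbest, gbest: bit lists over items; v: velocity list."""
    for i in range(len(x)):
        v[i] = (w * v[i]
                + c1 * random.random() * (pbest[i] - x[i])
                + c2 * random.random() * (gbest[i] - x[i]))
        prob = 1.0 / (1.0 + math.exp(-v[i]))       # sigmoid transfer
        x[i] = 1 if random.random() < prob else 0  # resample the bit
    return x, v
```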
Traditional classifiers struggle to cope with complex types of data streams with concept drift, and the obtained classification results are often unsatisfactory. Focusing on methods for dealing with concept drift in different types of data streams, classification methods for complex data streams with concept drift were summarized from four aspects: imbalance, concept evolution, multi-label and noise. Firstly, the methods of the four aspects were introduced and analyzed: block-based and online learning approaches for classifying imbalanced concept drift data streams, clustering-based and model-based learning approaches for classifying concept drift data streams with concept evolution, problem transformation-based and algorithm adaptation-based learning approaches for classifying multi-label concept drift data streams, and approaches for classifying noisy concept drift data streams. Then, the experimental results and performance metrics of the above classification methods were compared and analyzed in detail. Finally, the shortcomings of the existing methods and future research directions were given.
With network security threats becoming increasingly severe and security defense means increasingly complex, the zero trust network offers a re-evaluation of the traditional perimeter security architecture. Zero trust emphasizes never trusting anything by default and verifying continuously. A zero trust network requires that identity is not determined by location, that all access controls strictly enforce least privilege, and that all access processes are tracked in real time and evaluated dynamically. Firstly, the basic definition of the zero trust network was given, the main problems of traditional perimeter security were pointed out, and the zero trust network model was described. Secondly, the key technologies of the zero trust network, such as Software Defined Perimeter (SDP), identity and access management, micro-segmentation and Automated Configuration Management System (ACMS), were analyzed. Finally, the zero trust network was summarized and its future development was prospected.
In data stream ensemble classification, to make the classifiers adapt to the constantly changing data stream and to adjust the weights of base classifiers so that an appropriate set of classifiers is selected, an ensemble classification algorithm based on a dynamic weighting function was proposed. Firstly, a new weighting function was proposed to adjust the weights of the base classifiers, and the classifiers were trained on continuously updated data blocks. Then, the weighting function was used to make a reasonable selection among candidate classifiers. Finally, the incremental nature of decision trees was exploited in the base classifiers to realize the classification of the data stream. Extensive experiments show that the performance of the proposed algorithm is not affected by the block size. Compared with the AUE2 algorithm, the average number of leaves is reduced by 681.3, the average number of nodes is reduced by 1 192.8, and the average depth of the tree is reduced by 4.42, while the accuracy is relatively improved and the time consumption is reduced. Experimental results show that the algorithm can not only guarantee accuracy but also save a large amount of memory and time when classifying data streams.
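A minimal sketch of the block-based, dynamically weighted ensemble idea is given below: each new data block re-weights the base classifiers by their error on that block and retires the weakest member. The concrete weight function 1/(error + eps) and the use of a batch decision tree (rather than an incremental one, as in the paper) are assumptions.

```python
# Hedged sketch of block-based ensemble classification with a dynamic
# weighting function. The weight function and the batch tree learner
# are assumptions; the paper uses its own weighting and incremental trees.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class DynamicWeightedEnsemble:
    def __init__(self, max_members=10):
        self.members, self.weights = [], []
        self.max_members = max_members

    def process_block(self, X, y):
        # Re-weight existing members on the newest block.
        self.weights = [1.0 / (np.mean(m.predict(X) != y) + 1e-6)
                        for m in self.members]
        # Train a candidate classifier on the block and add it.
        cand = DecisionTreeClassifier().fit(X, y)
        self.members.append(cand)
        self.weights.append(1.0 / (np.mean(cand.predict(X) != y) + 1e-6))
        # Keep only the highest-weighted members.
        if len(self.members) > self.max_members:
            worst = int(np.argmin(self.weights))
            del self.members[worst], self.weights[worst]

    def predict(self, X):
        # Weighted majority vote across members.
        preds = np.array([m.predict(X) for m in self.members])
        w = np.array(self.weights)
        out = []
        for col in preds.T:
            classes, idx = np.unique(col, return_inverse=True)
            out.append(classes[np.argmax(np.bincount(idx, weights=w))])
        return np.array(out)
```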
High Utility Pattern Mining (HUPM) is an emerging research topic in data science. It considers both the unit profit and the quantity of items in a transaction database to extract more useful information. Traditional HUPM methods assume that the utility value of each item is positive, but in practical applications the utility values of some items may be negative (for example, a product sold at a loss has a negative profit), and pattern mining with negative items is as important as pattern mining with only positive items. Firstly, the relevant concepts of HUPM were explained, and examples of the corresponding positive and negative utilities were given. Then, the HUPM methods were divided into positive and negative perspectives: the pattern mining methods with positive utility were further divided into dynamic and static database perspectives, while the pattern mining methods with negative utility were grouped by key technology into Apriori-based, tree-based, utility list-based and array-based methods. The HUPM methods were discussed and summarized from these different aspects. Finally, the shortcomings of the existing HUPM methods and future research directions were given.
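To make the positive/negative utility notions concrete, the toy example below computes the utility of an itemset in each transaction as the sum of unit profit times quantity over its items, with one item carrying a negative unit profit; the item names and profit values are invented for illustration.

```python
# Worked toy example of utility with a negative item: item "d" is sold
# at a loss (negative unit profit). All values here are illustrative.
profit = {"a": 4, "b": 1, "c": 2, "d": -2}   # unit profits per item
db = [                                        # transactions: item -> quantity
    {"a": 1, "b": 5, "d": 2},
    {"b": 2, "c": 3, "d": 1},
    {"a": 2, "c": 1},
]

def utility(itemset, tx):
    """Utility of an itemset in one transaction: sum of profit * quantity,
    or 0 if the transaction does not contain the whole itemset."""
    if not set(itemset) <= tx.keys():
        return 0
    return sum(profit[i] * tx[i] for i in itemset)

total = sum(utility(("b", "d"), tx) for tx in db)
print(total)  # (5*1 + 2*(-2)) + (2*1 + 1*(-2)) + 0 = 1
```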
In traditional ensemble classification algorithms, the ensemble size is generally set to a fixed value, which may lead to low classification accuracy. Aiming at this problem, an accuracy Climbing Ensemble Classification Algorithm (C-ECA) was proposed. Firstly, instead of always replacing the same number of worst-performing base classifiers, the algorithm updated the base classifiers according to their accuracy and then determined the optimal ensemble size. Secondly, on the basis of C-ECA, a Dynamic Weighted Ensemble Classification Algorithm based on Climbing (C-DWECA) was proposed, in which a newly proposed weighting function obtains the best weight for each base classifier trained on data streams with different features, thereby improving the performance of the ensemble classifier. Finally, in order to detect concept drift earlier and improve the final accuracy, the Fast Hoeffding Drift Detection Method (FHDDM) was adopted. Experimental results show that the accuracy of C-DWECA can reach 97.44%, and the average accuracy of the proposed algorithm is about 40% higher than that of the Adaptable Diversity-based Online Boosting (ADOB) algorithm and also better than those of other comparison algorithms such as Leveraging Bagging (LevBag) and Adaptive Random Forest (ARF).
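The drift detector can be sketched from its published definition: FHDDM slides a fixed-size window over the 0/1 prediction outcomes and signals drift when the windowed accuracy drops below its historical maximum by more than a Hoeffding bound. The window size and delta below are typical values, not necessarily those used in the paper.

```python
# Hedged sketch of the Fast Hoeffding Drift Detection Method (FHDDM).
# n (window size) and delta are typical values, assumed here.
import math
from collections import deque

class FHDDM:
    def __init__(self, n=100, delta=1e-6):
        self.n = n
        self.win = deque(maxlen=n)
        # Hoeffding bound on the allowed accuracy drop.
        self.eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
        self.p_max = 0.0

    def add(self, correct):
        """correct: 1 if the ensemble predicted this sample right, else 0.
        Returns True when concept drift is detected."""
        self.win.append(correct)
        if len(self.win) < self.n:
            return False
        p = sum(self.win) / self.n          # windowed accuracy
        self.p_max = max(self.p_max, p)
        drift = (self.p_max - p) > self.eps
        if drift:                           # reset after signaling drift
            self.win.clear()
            self.p_max = 0.0
        return drift
```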
As the standard Backtracking Search Optimization Algorithm (BSA) suffers from slow convergence, a new mutation scale factor based on the Maxwell-Boltzmann distribution and a crossover strategy with a greedy property were introduced to improve it. The Maxwell-Boltzmann distribution was used to generate the mutation scale factor, which enhanced search efficiency and convergence speed. In the crossover strategy with fewer exchanged dimensions, the mutation population learned from outstanding individuals, adding a greedy property to the crossover while fully preserving population diversity, and thus avoiding the problem that most existing algorithms easily become trapped in local minima when a greedy property is added. Simulation experiments were conducted on fifteen benchmark functions. The results show that the improved algorithm has faster convergence speed and higher convergence precision; even on high-dimensional multimodal functions, its search results are nearly 14 orders of magnitude better than those of the original BSA after the same number of iterations, and its convergence precision can reach 10⁻¹⁰ or less.
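The improved mutation step can be sketched as follows: the scale factor F is sampled from a Maxwell-Boltzmann distribution (realized here as the norm of three i.i.d. standard Gaussians times a scale parameter) and applied in the standard BSA mutation Mutant = P + F * (oldP - P); the scale parameter value is an assumption, and the greedy crossover is not shown.

```python
# Hedged sketch of the Maxwell-Boltzmann mutation scale factor for BSA.
# The scale parameter a = 1.0 is an assumption.
import numpy as np

def mb_scale_factor(a=1.0, rng=None):
    """Sample F from a Maxwell-Boltzmann distribution: a * ||N(0, I_3)||."""
    rng = rng or np.random.default_rng()
    return a * np.linalg.norm(rng.standard_normal(3))

def bsa_mutation(P, oldP, rng=None):
    """Standard BSA mutation with the Maxwell-Boltzmann scale factor:
    Mutant = P + F * (oldP - P), where oldP is the historical population."""
    return P + mb_scale_factor(rng=rng) * (oldP - P)

# Example: population of 5 individuals in 4 dimensions.
rng = np.random.default_rng(0)
P, oldP = rng.random((5, 4)), rng.random((5, 4))
print(bsa_mutation(P, oldP, rng))
```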
Element-sizing fields are essential for guiding the generation of high-quality meshes used in finite element analyses, and the smoothness of such a field heavily affects the element quality of the resulting mesh. A new algorithm was proposed to smooth element-sizing fields. Based on the H-variation proposed by Borouchaki et al. (BOROUCHAKI H, HECHT F, FREY P J. Mesh gradation control. International Journal for Numerical Methods in Engineering, 1998, 43(6): 1143-1165), some basic geometric concepts and a theory for quantitatively analyzing the smoothness of 2D element-sizing fields were established. A rule for reasonable size transition over 2D domains was derived. Based on this rule, an improved version of Borouchaki's size-correction method defined on unstructured background meshes was designed. The algorithm adjusts the size values attached to a small set of background mesh nodes to ensure that the output is a well-graded size field. Finally, mesh examples were given to validate the proposed theory and algorithm. Compared with other smoothing methods, the proposed algorithm yields meshes of better quality.
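A minimal sketch of the size-correction idea, assuming the common Borouchaki-style gradation condition h_j <= h_i + (beta - 1) * l along each background-mesh edge of length l: violating node sizes are reduced and sweeps repeat until the field is beta-graded. The gradation bound beta and the full-sweep loop are assumptions; the paper's improved algorithm restricts corrections to a small subset of nodes.

```python
# Hedged sketch of Borouchaki-style size-gradation correction on a
# background mesh. beta = 1.2 is an assumed gradation bound.
def grade_size_field(sizes, edges, lengths, beta=1.2):
    """sizes: list of node sizes; edges: list of (i, j) node index pairs;
    lengths: corresponding edge lengths. Reduces sizes in place until
    h_j <= h_i + (beta - 1) * l holds in both directions on every edge."""
    changed = True
    while changed:
        changed = False
        for (i, j), l in zip(edges, lengths):
            cap = sizes[i] + (beta - 1.0) * l
            if sizes[j] > cap + 1e-12:
                sizes[j] = cap
                changed = True
            cap = sizes[j] + (beta - 1.0) * l
            if sizes[i] > cap + 1e-12:
                sizes[i] = cap
                changed = True
    return sizes

# Tiny example: three nodes on a line; the abrupt size jump is smoothed.
print(grade_size_field([0.1, 1.0, 0.1], [(0, 1), (1, 2)], [1.0, 1.0]))
# -> [0.1, 0.3, 0.1]
```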