Text sentiment analysis has gradually become an important part of Natural Language Processing (NLP) in fields such as recommendation systems, acquisition of user sentiment information, and public opinion reference for governments and enterprises. The methods in the field of sentiment analysis were compared and summarized through literature research. Firstly, a literature investigation of sentiment analysis methods was carried out along the dimensions of time and method. Then, the main methods and application scenarios of sentiment analysis were summarized and compared. Finally, the advantages and disadvantages of each method were analyzed. According to the analysis results, for different task scenarios there are mainly three kinds of sentiment analysis methods: sentiment analysis based on sentiment dictionaries, sentiment analysis based on machine learning, and sentiment analysis based on deep learning. Methods based on a mixture of multiple strategies have become the trend of improvement. The literature investigation shows that there is still room for improvement in the techniques and methods of text sentiment analysis, and that text sentiment analysis has a large market and broad development prospects in e-commerce, psychotherapy and public opinion monitoring.
Knowledge graph representation learning aims to map entities and relations into a low-dimensional dense vector space. Most existing related models pay more attention to learning the structural features of the triples while ignoring the semantic information of the entity relations within the triples and the entity description information outside the triples, so the knowledge representation ability of these models is poor. In response to the above problem, a knowledge representation learning method called BAGAT (knowledge representation learning based on the BERT model And Graph Attention network) was proposed by fusing multi-source information. First, the entity target nodes and neighbor nodes of the triples were constructed by combining knowledge graph features, and Graph Attention Network (GAT) was used to aggregate the semantic information representation of the triple structure. Then, the Bidirectional Encoder Representations from Transformers (BERT) word vector model was used to embed the entity description information. Finally, both representations were mapped to the same vector space for joint knowledge representation learning. Experimental results show that BAGAT achieves a large improvement compared to other models. On the indicators Hits@1 and Hits@10 of the public dataset FB15K-237, BAGAT outperforms the translation model TransE (Translating Embeddings) by 25.9 and 22.0 percentage points respectively, and outperforms the graph neural network model KBGAT (Learning attention-based embeddings for relation prediction in knowledge graphs) by 1.8 and 3.5 percentage points respectively, indicating that the multi-source information representation method incorporating entity description information and the semantic information of the triple structure can obtain stronger representation learning capability.
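To make the fusion step concrete, below is a minimal, hedged PyTorch sketch of the two ingredients named above: a single graph-attention aggregation over a triple's neighbor nodes (structure side) and a projection of a precomputed BERT description vector into the same space (text side), fused by addition. The class names, dimensions and additive fusion are illustrative assumptions, not BAGAT's exact configuration.

```python
# Hedged sketch: joint entity representation combining one graph-attention
# aggregation step (structure side) with a projected description embedding
# (text side), mapped into one vector space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    """One attention-based neighbourhood aggregation step over triples."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, target, neighbors):
        # target: (d,), neighbors: (n, d)
        h_t = self.W(target)                              # (d_out,)
        h_n = self.W(neighbors)                           # (n, d_out)
        pairs = torch.cat([h_t.expand_as(h_n), h_n], dim=-1)
        scores = F.leaky_relu(self.a(pairs)).squeeze(-1)  # (n,)
        alpha = torch.softmax(scores, dim=0)              # attention weights
        return torch.relu((alpha.unsqueeze(-1) * h_n).sum(dim=0))

class JointEntityEncoder(nn.Module):
    def __init__(self, struct_dim=100, bert_dim=768, joint_dim=100):
        super().__init__()
        self.gat = SimpleGATLayer(struct_dim, joint_dim)
        self.desc_proj = nn.Linear(bert_dim, joint_dim)   # maps a BERT [CLS] vector

    def forward(self, target_emb, neighbor_embs, desc_cls_vector):
        structural = self.gat(target_emb, neighbor_embs)
        textual = self.desc_proj(desc_cls_vector)
        return structural + textual                       # joint representation

# toy usage with random tensors standing in for learned embeddings / BERT output
enc = JointEntityEncoder()
joint = enc(torch.randn(100), torch.randn(5, 100), torch.randn(768))
print(joint.shape)  # torch.Size([100])
```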
Most current Chinese question-and-answer matching technologies require word segmentation first, and segmenting Chinese medical text requires maintaining medical dictionaries to reduce the impact of segmentation errors on subsequent tasks. However, maintaining dictionaries requires a lot of manpower and domain knowledge, so word segmentation remains a great challenge. At the same time, the existing Chinese medical question-and-answer matching methods model the questions and the answers separately, without considering the relationship between the keywords contained in the questions and in the answers. Therefore, an Attention mechanism based Stack Convolutional Neural Network (Att-StackCNN) model was proposed to solve the problem of Chinese medical question-and-answer matching. Firstly, character embedding was used to encode the questions and answers to obtain their respective character embedding matrices. Then, the respective feature attention mapping matrices were obtained by constructing an attention matrix from the character embedding matrices of the questions and answers. After that, a Stack Convolutional Neural Network (Stack-CNN) model was used to perform convolution on the above matrices simultaneously to obtain the respective semantic representations of the questions and answers. Finally, the similarity was calculated and used to compute the max-margin loss for updating the network parameters. On the cMedQA dataset, the Top-1 accuracy of the proposed model was about 1 percentage point higher than that of the Stack-CNN model and about 0.5 percentage points higher than that of the Multi-CNNs model. Experimental results show that the Att-StackCNN model can improve the matching effect of Chinese medical questions and answers.
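The following is a small, hedged PyTorch sketch of the two components described above: an attention matrix built from the character embedding matrices of a question and an answer to produce feature attention maps, and a max-margin (hinge) loss over similarities. Tensor sizes, the softmax normalisation and the margin value are illustrative assumptions.

```python
# Hedged sketch: attention maps from character embedding matrices and a
# max-margin loss over cosine similarities.
import torch
import torch.nn.functional as F

def attention_maps(q_emb, a_emb):
    """q_emb: (Lq, d), a_emb: (La, d) character embedding matrices."""
    att = q_emb @ a_emb.T                          # (Lq, La) attention matrix
    q_focus = torch.softmax(att, dim=1) @ a_emb    # answer-aware question map
    a_focus = torch.softmax(att, dim=0).T @ q_emb  # question-aware answer map
    return q_focus, a_focus

def max_margin_loss(q_vec, pos_vec, neg_vec, margin=0.5):
    """Hinge loss pushing the positive answer above the negative by a margin."""
    pos = F.cosine_similarity(q_vec, pos_vec, dim=0)
    neg = F.cosine_similarity(q_vec, neg_vec, dim=0)
    return torch.clamp(margin - pos + neg, min=0.0)

q = torch.randn(20, 128)       # question of 20 characters
a = torch.randn(60, 128)       # answer of 60 characters
q_map, a_map = attention_maps(q, a)
print(q_map.shape, a_map.shape)   # (20, 128) (60, 128)
loss = max_margin_loss(torch.randn(256), torch.randn(256), torch.randn(256))
```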
Wild animal object detection based on infrared camera images is conducive to the research and protection of wild animals. Because the numbers of individuals differ greatly among species, wildlife datasets collected by infrared cameras suffer from the long-tail data problem of unevenly distributed species counts, which limits the overall performance of object detection neural network models. In order to solve the problem of low object detection accuracy caused by long-tail wildlife data, a method based on two-stage learning and re-weighting was proposed and applied to wildlife object detection based on YOLOv4-Tiny. Firstly, a new wildlife dataset with obvious long-tail characteristics was collected, labelled and constructed. Secondly, a two-stage method based on transfer learning was used to train the neural network: in the first stage, the classification loss function was trained without weighting; in the second stage, two improved re-weighting methods were proposed, and the weights obtained in the first stage were used as the pre-training weights for re-weighting training. Finally, the model was evaluated on the wildlife test set. Experimental results show that the proposed long-tail data solution achieved 60.47% and 61.18% mAP (mean Average Precision) with the cross-entropy loss function and the focal loss function as the classification loss respectively, which was 3.30 and 5.16 percentage points higher than the no-weighting method under the two loss functions, and 2.14 percentage points higher than the proposed improved effective sample weighting method under the focal loss function. This shows that the proposed method can improve the object detection performance of the YOLOv4-Tiny network on wildlife datasets with long-tail characteristics.
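As one concrete way to realise the re-weighting stage, the hedged sketch below computes class weights from the "effective number of samples" (Cui et al., 2019), a standard re-weighting rule for long-tail data; it is a stand-in for illustration and does not reproduce the two improved weighting methods proposed in the paper.

```python
# Hedged sketch: class-frequency-based re-weighting of the classification loss
# using effective-number-of-samples weights.
import numpy as np

def effective_number_weights(samples_per_class, beta=0.999):
    """Weight each class inversely to its effective number of samples."""
    samples_per_class = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = 1.0 - np.power(beta, samples_per_class)
    weights = (1.0 - beta) / effective_num
    # normalise so the weights sum to the number of classes
    return weights / weights.sum() * len(samples_per_class)

# toy long-tail distribution: a few head species, many tail species
counts = [5000, 1200, 300, 80, 25]
print(effective_number_weights(counts))
# tail classes receive the largest weights in the second-stage training
```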
Popular science text classification aims to classify popular science articles according to the popular science classification system. Popular science articles are often longer than 1 000 words, which makes it hard for a model to focus on key points and leads to poor classification performance of traditional models. To address this problem, a long text classification model that combines a knowledge graph to perform two-level screening was proposed to reduce the interference of topic-irrelevant information and improve classification performance. First, a four-step method was used to construct a knowledge graph for the popular science domains. Then, this knowledge graph was used as a source of distant supervision to train sentence filters that filter out irrelevant information. Finally, the attention mechanism was used to further screen the information of the filtered sentence set, completing the attention-based topic classification model. Experimental results on the constructed Popular Science Classification Dataset (PSCD) show that the text classification model based on domain knowledge graph information enhancement achieves a higher F1-score. Compared with the TextCNN model and the BERT (Bidirectional Encoder Representations from Transformers) model, the proposed model improves the F1-score by 2.88 and 1.88 percentage points respectively, verifying the effectiveness of the knowledge graph in long text information screening.
Wearable ElectroEncephaloGram (EEG) devices are wireless EEG systems for daily real-time monitoring. They have developed rapidly and been widely applied because of their portability, real-time performance, non-invasiveness and low cost. Such a system is mainly composed of hardware parts such as the signal acquisition module, signal processing module, micro-control module, communication module and power supply module, and software parts such as the mobile terminal module and cloud storage module. The key technologies of wearable EEG devices were discussed. First, improvements to the EEG signal acquisition module were explained. In addition, the signal preprocessing module, signal noise reduction, artifact processing and feature extraction technologies of wearable EEG devices were compared. Then, the advantages and disadvantages of machine learning and deep learning classification algorithms were analyzed, and the application fields of wearable EEG devices were summarized. Finally, future development trends of the key technologies of wearable EEG devices were proposed.
Aiming at the problem of insufficient adaptive ability of network intrusion detection models, the large-scale fast search ability of the Sparrow Search Algorithm (SSA) was introduced into the Particle Swarm Optimization (PSO) algorithm, and a network intrusion detection algorithm based on the Sparrow Search Algorithm and improved Particle Swarm Optimization algorithm (SSAPSO) was proposed. In this algorithm, by optimizing the parameters that are difficult to set in the Light Gradient Boosting Machine (LightGBM) algorithm, the PSO algorithm converged quickly while ensuring optimization accuracy, and an optimal network intrusion detection model was obtained. Simulation results show that on four benchmark functions, SSAPSO converged faster than the basic PSO algorithm. Compared with the Categorical features+gradient Boosting (CatBoost) algorithm, SSAPSO-optimized LightGBM (SSAPSO-LightGBM) improves the accuracy, recall, precision and F1_score by 15.12%, 3.25%, 21.26% and 12.25% respectively on the KDDCUP99 dataset. Compared with the LightGBM algorithm, SSAPSO-LightGBM improves the detection accuracy for Normal traffic, Remote-to-Login (R2L) attacks, User-to-Root (U2R) attacks and Probing (PROBE) attacks on the above dataset by 0.61%, 3.14%, 4.24%, 1.04% and 5.03% respectively.
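For illustration, the hedged sketch below runs a plain particle swarm over two LightGBM hyper-parameters and scores each particle by cross-validated accuracy. It shows the optimisation loop only; the sparrow-search improvements of SSAPSO, the choice of parameters and the search bounds are assumptions.

```python
# Hedged sketch: plain PSO over two LightGBM hyper-parameters
# (learning_rate, num_leaves), scored by 3-fold cross-validation.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
bounds = np.array([[0.01, 0.3],    # learning_rate
                   [8,    64]])    # num_leaves

def fitness(pos):
    clf = LGBMClassifier(learning_rate=float(pos[0]),
                         num_leaves=int(round(pos[1])),
                         n_estimators=50, verbose=-1)
    return cross_val_score(clf, X, y, cv=3).mean()

rng = np.random.default_rng(0)
n_particles, n_iter, w, c1, c2 = 8, 10, 0.7, 1.5, 1.5
pos = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()]

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, bounds[:, 0], bounds[:, 1])
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()]

print("best learning_rate=%.3f num_leaves=%d" % (gbest[0], round(gbest[1])))
```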
Clustering is a technique for finding the internal structure of data and is a basic problem in many data-driven applications. Clustering performance depends largely on the quality of data representation. In recent years, deep learning has been widely used in clustering tasks because of its powerful feature extraction ability, in order to learn better feature representations and significantly improve clustering performance. Firstly, traditional clustering tasks were introduced. Then, representative deep learning based clustering methods were introduced according to their network structures, the existing problems were pointed out, and the applications of deep learning based clustering in different fields were presented. At last, the development of deep learning based clustering was summarized and prospected.
Virtual try-on technologies based on the image synthesis mask strategy can better retain the details of the clothing when the warped clothing is fused with the human body. However, because the positions and structures of the human body and the clothing are difficult to align during the try-on process, the try-on result is likely to contain severe occlusion, affecting the visual effect. In order to solve the occlusion in the try-on process, a U-Net based generator was proposed. In the generator, a cascaded spatial attention module and channel attention module were added to the U-Net decoder, thereby achieving cross-domain fusion between the local features of the warped clothes and the global features of the human body. Specifically, first, the Thin Plate Spline (TPS) transformation was predicted by a convolutional network and used to warp the clothing according to the target human body pose. Then, the dressed person representation information and the warped clothing were input into the proposed generator, and the mask image of the corresponding clothing area was obtained to render the intermediate result. Finally, the mask synthesis strategy was used to synthesize the warped clothing with the intermediate result through mask processing to obtain the final try-on result. Experimental results show that the proposed method can not only reduce occlusion but also enhance image details. Compared with the Characteristic-Preserving Virtual Try-On Network (CP-VTON) method, the proposed method increases the average Peak Signal-to-Noise Ratio (PSNR) of the generated images by 10.47%, decreases the average Fréchet Inception Distance (FID) by 47.28%, and increases the average Structural SIMilarity (SSIM) by 4.16%.
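A minimal, hedged PyTorch sketch of the cascaded attention idea is given below: a spatial attention module followed by a channel attention module applied to a decoder feature map. The kernel size, reduction ratio and block wiring are illustrative assumptions rather than the paper's exact generator.

```python
# Hedged sketch: cascaded spatial and channel attention in one decoder block.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                       # emphasise clothing regions spatially

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))       # global average pool per channel
        return x * w.unsqueeze(-1).unsqueeze(-1)

class AttentiveDecoderBlock(nn.Module):
    """One U-Net decoder block with cascaded spatial and channel attention."""
    def __init__(self, channels):
        super().__init__()
        self.spatial = SpatialAttention()
        self.channel = ChannelAttention(channels)

    def forward(self, x):
        return self.channel(self.spatial(x))

feat = torch.randn(1, 64, 32, 32)             # fused decoder feature map
print(AttentiveDecoderBlock(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```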
Convolutional Neural Network (CNN) is one of the important research directions in the field of deep learning based computer vision at present. It performs well in applications such as image classification, image segmentation and object detection, and its powerful feature learning and feature representation capabilities have attracted increasing attention from researchers. However, CNN still has problems such as incomplete feature extraction and overfitting during sample training. Aiming at these issues, the development of CNN, the classical CNN network models and their components were introduced, and methods to solve the above issues were provided. By reviewing the current status of research on CNN models in image classification, suggestions were provided for the further development and research directions of CNN.
Aiming at the problem of large clustering error of Sparse Subspace Clustering (SSC) methods, an SSC method based on random blocking was proposed. First, the original problem dataset was randomly divided into several subsets to construct several sub-problems. Then, after the coefficient matrices of the sub-problems were obtained by the sparse subspace Alternating Direction Method of Multipliers (ADMM), these coefficient matrices were expanded to the same size as the original problem and integrated into one coefficient matrix. Finally, a similarity matrix was calculated from the integrated coefficient matrix, and the clustering result of the original problem was obtained with the Spectral Clustering (SC) algorithm. Compared with the best-performing algorithm among SSC, Stochastic Sparse Subspace Clustering via Orthogonal Matching Pursuit with Consensus (S3COMP-C), scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit (SSCOMP), SC and K-Means, the SSC method based on random blocking reduces the subspace clustering error by 3.12 percentage points on average, and its mutual information, Rand index and entropy are all significantly better than those of the comparison algorithms. Experimental results show that the SSC method based on random blocking can significantly reduce the subspace clustering error and improve the clustering performance.
In view of the problems of current Convolutional Neural Networks (CNN) that use end-layer features to recognize facial expressions, such as complex model structure, too many parameters and unsatisfactory recognition performance, an optimization algorithm combining an improved CNN with Support Vector Machine (SVM) was proposed. First, the network model was designed with the idea of continuous convolution to obtain more nonlinear activations. Then, an adaptive Global Average Pooling (GAP) layer was used to replace the fully connected layer of the traditional CNN to reduce the number of network parameters. Finally, in order to improve the generalization ability of the model, an SVM classifier was used instead of the traditional Softmax function to realize expression recognition. Experimental results show that the proposed algorithm achieves 73.4% and 98.06% recognition accuracy on the Fer2013 and CK+ datasets respectively, which is 2.2 percentage points higher than the traditional LeNet-5 algorithm on the Fer2013 dataset. Moreover, the network model has a simple structure, few parameters and good robustness.
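The hedged sketch below illustrates the two structural changes: convolutional features pooled by an adaptive Global Average Pooling layer in place of fully connected layers, and an SVM (scikit-learn SVC) trained on those pooled features instead of a Softmax classifier. The toy network and data shapes are assumptions for illustration.

```python
# Hedged sketch: GAP head replacing FC layers, SVM trained on pooled features.
import torch
import torch.nn as nn
from sklearn.svm import SVC

class GAPFeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),   # continuous convolution
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.gap = nn.AdaptiveAvgPool2d(1)                # replaces FC layers

    def forward(self, x):
        return self.gap(self.features(x)).flatten(1)      # (N, 64) feature vectors

# toy pipeline: pooled CNN features -> SVM classifier
extractor = GAPFeatureExtractor().eval()
with torch.no_grad():
    feats = extractor(torch.randn(100, 1, 48, 48)).numpy()   # 48x48 grey faces
labels = torch.randint(0, 7, (100,)).numpy()                  # 7 expression classes
svm = SVC(kernel="rbf").fit(feats, labels)
print(svm.predict(feats[:5]))
```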
Concerning the problems of the lack of standardized words, fuzzy semantics and sparse features in news topic text, a news topic text classification method based on Bidirectional Encoder Representations from Transformers (BERT) and the Feature Projection network (FPnet) was proposed. The method includes two implementation modes. In mode 1, the features of multiple fully connected layers were extracted from the output of the BERT model for news topic text, and the final extracted text features were purified in combination with the feature projection method, thereby strengthening the classification. In mode 2, the feature projection network was fused into the hidden layers inside the BERT model for feature projection, so that the classification features were enhanced and purified through hidden-layer feature projection. Experimental results on the Toutiao, Sohu News, THUCNews-L and THUCNews-S datasets show that both modes achieve better accuracy and macro-averaged F1 value than the baseline BERT method, with the highest accuracies reaching 86.96%, 86.17%, 94.40% and 93.73% respectively, which proves the feasibility and effectiveness of the proposed method.
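To show what feature projection purification means in code, the hedged sketch below removes from a feature vector its projection onto a "common" (class-indistinct) feature vector, keeping only the orthogonal, class-discriminative component; how the common feature is produced by FPnet's auxiliary branch is omitted, and the dimensions are assumptions.

```python
# Hedged sketch: purify a feature by projecting out its common-feature component.
import torch

def purify(f: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """Return the component of f orthogonal to the common feature c."""
    proj = (f @ c) / (c @ c) * c      # projection of f onto c
    return f - proj                   # purified, class-discriminative part

f = torch.randn(768)                  # e.g. a BERT sentence feature
c = torch.randn(768)                  # common feature from the auxiliary branch
f_pure = purify(f, c)
print(torch.dot(f_pure, c).item())    # ~0: purified feature is orthogonal to c
```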
The parity blocks of a Maximum-Distance-Separable (MDS) code are all global parity blocks; the length of the reconstruction chain increases as the storage system expands, and the reconstruction performance gradually decreases. Aiming at the above problems, a new type of Non-Maximum-Distance-Separable (Non-MDS) code called local redundant hybrid code Code-LM(s,c) was proposed. Firstly, two types of local parity blocks, the horizontal parity block within a strip-set and the horizontal-diagonal parity block, were added to each strip-set to reduce the length of the reconstruction chain, and the parity layout of the local redundant hybrid code was designed. Then, four reconstruction formulations of lost data blocks were designed according to the generation rules of the parity blocks and the common blocks existing in the reconstruction chains of different data blocks. Finally, double-disk failures were divided into three situations depending on the distance between the strip-sets where the failed disks were located, and the corresponding reconstruction methods were designed. Theoretical analysis and experimental results show that with the same storage scale, compared with RDP (Row-Diagonal Parity), the reconstruction time of Code-LM(s,c) for single-disk failure and double-disk failure can be reduced by 84% and 77% respectively; compared with V2-Code, the reconstruction time of Code-LM(s,c) for single-disk failure and double-disk failure can be reduced by 67% and 73% respectively. Therefore, the local redundant hybrid code can support fast recovery from disk failures and improve the reliability of the storage system.
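As a toy illustration of why local parity shortens the reconstruction chain, the hedged sketch below builds a horizontal parity block as the XOR of the data blocks in one strip-set and rebuilds a single lost block from that short local chain. Block counts and sizes are placeholders, not the Code-LM(s,c) layout.

```python
# Hedged toy illustration: a local horizontal parity block rebuilt by XOR.
import numpy as np

def xor_parity(blocks):
    parity = np.zeros_like(blocks[0])
    for b in blocks:
        parity ^= b
    return parity

rng = np.random.default_rng(0)
strip_set = [rng.integers(0, 256, 8, dtype=np.uint8) for _ in range(4)]  # 4 data blocks
parity = xor_parity(strip_set)                   # local horizontal parity block

lost_index = 2                                   # one data block fails
survivors = [b for i, b in enumerate(strip_set) if i != lost_index]
rebuilt = xor_parity(survivors + [parity])       # short local reconstruction chain
print(np.array_equal(rebuilt, strip_set[lost_index]))   # True
```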
Sentiment analysis, as a subfield of Natural Language Processing (NLP), has gone through the stages of sentiment lexicon based, machine learning based and deep learning based analysis. When a generalized deep learning model is used as a text classifier to analyze Web text reviews in a specific field, it suffers from low accuracy and overfitting during training, while compiling a sentiment lexicon suffers from low coverage and a heavy workload. To address these problems, a sentiment analysis model based on a sentiment lexicon and a stacked residual Bidirectional Long Short-Term Memory (Bi-LSTM) network was proposed. Firstly, the sentiment words in the sentiment lexicon were designed to cover the professional words of the research field of "educational robot", thereby making up for the lack of accuracy of the Bi-LSTM model in analyzing such texts. Then, Bi-LSTM and SnowNLP were used to reduce the compilation workload of the sentiment lexicon. The memory gate and forget gate structures of the Long Short-Term Memory (LSTM) network ensured that the relevance between the preceding and following words in a comment text was fully considered while some analyzed words were selectively forgotten, thereby avoiding the problem of gradient explosion during back propagation. With the introduction of the stacked residual Bi-LSTM, the number of layers of the model was deepened to 8, and the "degradation" problem caused by stacking LSTM layers was avoided by the residual connections. Finally, the score weights of the two parts were set and adjusted appropriately, the sigmoid activation function was used to normalize the total score to the interval [0,1], the intervals [0,0.5] and (0.5,1] were taken to represent negative and positive sentiment respectively, and the sentiment classification was completed. Experimental results show that the sentiment classification accuracy of the proposed model on the review dataset about "educational robot" is improved by about 4.5 percentage points compared with the standard LSTM model and by about 2.0 percentage points compared with BERT (Bidirectional Encoder Representations from Transformers). In conclusion, the proposed model generalizes the sentiment classification model that combines a sentiment lexicon with a deep learning classifier; by modifying the sentiment words in the lexicon and appropriately adjusting the number of layers and the structure of the deep learning model, it can be applied to accurate sentiment analysis of shopping reviews of all kinds of goods on e-commerce platforms, thereby helping enterprises understand consumers' shopping psychology and market demand, and providing consumers with a reference standard for the quality of goods.
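The hedged sketch below shows only the final scoring step described above: a lexicon score and a model score combined with adjustable weights, squashed to [0, 1] with a sigmoid, and split at 0.5 into negative/positive. The weights and raw scores are illustrative placeholders.

```python
# Hedged sketch: weighted combination of lexicon and model scores,
# sigmoid-normalised and thresholded at 0.5.
import math

def combined_sentiment(lexicon_score, model_score, w_lex=0.4, w_model=0.6):
    total = w_lex * lexicon_score + w_model * model_score
    prob = 1.0 / (1.0 + math.exp(-total))        # sigmoid to [0, 1]
    label = "positive" if prob > 0.5 else "negative"
    return prob, label

print(combined_sentiment(lexicon_score=1.2, model_score=0.8))    # positive
print(combined_sentiment(lexicon_score=-0.9, model_score=-1.5))  # negative
```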
Aiming at the problem that the improved federated averaging algorithm based on the analytic hierarchy process was affected by subjective factors when calculating data quality, an improved federated weighted averaging algorithm was proposed to process multi-source data from the perspective of data quality. Firstly, the training samples were divided into pre-training samples and pre-testing samples. Then, the accuracy of the initial global model on the pre-training data was used as the quality weight of each data source. Finally, the quality weight was introduced into the federated averaging algorithm to re-update the weights of the global model. Simulation results show that the model trained by the improved federated weighted averaging algorithm achieves higher accuracy than the model trained by the traditional federated averaging algorithm, with improvements of 1.59% and 1.24% on equally divided and unequally divided datasets respectively. At the same time, compared with the traditional multi-party data retraining method, although the accuracy of the proposed model is slightly reduced, the security of the data and the model is improved.
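A minimal, hedged sketch of the re-weighted aggregation step follows: each source's weight is its pre-training accuracy normalised over all sources, and the global model is the weighted average of the local models (represented here as flat parameter vectors for illustration).

```python
# Hedged sketch: quality-weighted federated averaging of local model parameters.
import numpy as np

def federated_weighted_average(local_models, quality_accuracies):
    """local_models: list of parameter vectors; quality_accuracies: list of floats."""
    weights = np.asarray(quality_accuracies, dtype=np.float64)
    weights = weights / weights.sum()                 # quality weights sum to 1
    stacked = np.stack(local_models)                  # (n_clients, n_params)
    return (weights[:, None] * stacked).sum(axis=0)   # weighted global model

models = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
accs = [0.9, 0.6, 0.3]                                # higher quality counts more
print(federated_weighted_average(models, accs))
```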
Designing a unified solution for combinatorial optimization problems without hand-designed heuristic algorithms has become a research hotspot in the field of machine learning. At present, mature techniques mainly target static combinatorial optimization problems, while combinatorial optimization problems with dynamic changes have not been fully solved. In order to solve the above problems, a lightweight model called Dy4TSP (Dynamic model for Traveling Salesman Problems) was proposed, which combines the multi-head attention mechanism with distributed reinforcement learning to solve the traveling salesman problem on a dynamic graph. Firstly, the node representation vectors from a graph convolutional neural network were processed by a prediction network based on the multi-head attention mechanism. Then, a distributed reinforcement learning algorithm was used to quickly predict the possibility of each node in the graph being output as part of the optimal solution, and the optimal solution spaces of the problems under different possibilities were explored comprehensively. Finally, the trained model generated action decision sequences that could meet specific reward functions in real time. The model was evaluated on three typical combinatorial optimization problems, and the experimental results show that the solution quality of the proposed model is 0.15 to 0.37 units higher than that of the open source solver LKH3 (Lin-Kernighan-Helsgaun 3), and significantly better than those of the latest algorithms such as Graph Attention Network with Edge Embedding (EGATE). The proposed model can reach an optimal path gap of 0.1 to 1.05 in other dynamic traveling salesman problems, with slightly better results.
Aiming at how to use the food safety standard reference network to find the key standards that have a great impact on food safety inspection and detection among the many national food safety standards, a method of finding important nodes in the food safety standard reference network based on multi-attribute comprehensive evaluation was proposed. Firstly, the importance of standard nodes was evaluated using degree centrality, closeness centrality and betweenness centrality from social network analysis as well as the Web page importance evaluation algorithm PageRank. Secondly, the Analytic Hierarchy Process (AHP) was used to calculate the weight of each evaluation index, and the multi-attribute decision-making method TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) was used to comprehensively evaluate the importance of the standard nodes and find the important nodes. Thirdly, the important nodes obtained from the comprehensive evaluation and those obtained from the degree based evaluation were deleted from their own reference networks respectively, and the connectivity of the reference networks after deleting the important nodes was tested: the worse the connectivity, the more important the deleted nodes. Finally, the Louvain community detection algorithm was used to test the network connectivity, that is, to find the communities of the network nodes; the more nodes not included in communities, the worse the network connectivity. Experimental results show that after deleting the important nodes found by the multi-attribute comprehensive evaluation method, more nodes cannot be included in communities than with the degree based evaluation method, proving that the proposed method can better find the important nodes in the reference network. The proposed method helps standard makers quickly grasp the core contents and key nodes when revising and updating standards, and plays a guiding role in the construction of the national food safety standard system.
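For concreteness, the hedged sketch below applies TOPSIS to a node-by-index matrix (degree, closeness, betweenness, PageRank) with given index weights and ranks the nodes by closeness to the ideal solution; the weights and scores are placeholders rather than the paper's AHP results.

```python
# Hedged sketch: TOPSIS ranking of standard nodes over four importance indices.
import numpy as np

def topsis(matrix, weights):
    m = matrix / np.linalg.norm(matrix, axis=0)        # vector-normalise columns
    v = m * weights                                    # weighted normalised matrix
    ideal, anti = v.max(axis=0), v.min(axis=0)         # all indices are benefits
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    return d_neg / (d_pos + d_neg)                     # closeness coefficient

scores = np.array([[0.40, 0.55, 0.10, 0.012],          # one row per standard node
                   [0.10, 0.35, 0.02, 0.004],
                   [0.25, 0.60, 0.30, 0.020]])
weights = np.array([0.3, 0.2, 0.2, 0.3])               # hypothetical AHP weights
print(np.argsort(-topsis(scores, weights)))            # nodes ranked by importance
```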
Aiming at the problems of slow model convergence and low diagnosis accuracy caused by strongly noisy time-series fault diagnosis data in the industrial field, an improved one-Dimensional Convolutional and Bidirectional Long Short-Term Memory (1DCNN-BiLSTM) neural network fault diagnosis method was proposed. The method includes preprocessing of fault vibration signals, automatic feature extraction and vibration signal classification. Firstly, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) technology was used to preprocess the original vibration signal. Secondly, the 1DCNN-BiLSTM dual-channel model was constructed, and the processed signal was input into the Bidirectional Long Short-Term Memory (BiLSTM) channel and the one-Dimensional Convolutional Neural Network (1DCNN) channel to fully extract the temporal correlation features, the non-correlated features of the local space and the weak periodic laws of the signal. Thirdly, in response to the strong noise in the signal, the Squeeze-and-Excitation Network (SENet) module was improved and applied to the two different channels. Finally, the features extracted from the two channels were fused through the fully connected layer, and accurate identification of equipment faults was realized with the help of the Softmax classifier. The bearing dataset of Case Western Reserve University was used for experimental comparison and verification. The results show that after the improved SENet module is applied to the 1DCNN channel and the stacked BiLSTM channel at the same time, the 1DCNN-BiLSTM dual-channel model achieves the highest diagnosis accuracy of 96.87% with fast convergence, which is better than traditional single-channel models, thereby effectively improving the efficiency of equipment fault diagnosis.
Aiming at the problems that existing knowledge graph recommendation models do not consider the periodic features of users or the influence of the items to be recommended on users' recent interests, a knowledge graph recommendation model with Multiple Time scales and Feature Enhancement (MTFE) was proposed. Firstly, Long Short-Term Memory (LSTM) network was used to mine users' periodic features on different time scales and integrate them into the user representations. Then, the attention mechanism was used to mine the features strongly correlated with users' recent features in the items to be recommended and integrate them, after enhancement, into the item representations. Finally, a scoring function was used to calculate users' ratings of the items to be recommended. The proposed model was compared with the PER (Personalized Entity Recommendation), CKE (Collaborative Knowledge base Embedding), LibFM, RippleNet, KGCN (Knowledge Graph Convolutional Network) and CKAN (Collaborative Knowledge-aware Attentive Network) knowledge graph recommendation models on the real datasets Last.FM, MovieLens-1M and MovieLens-20M. Experimental results show that compared with the best-performing comparison model, the MTFE model improves the F1 value by 0.78, 1.63 and 1.92 percentage points and the Area Under the ROC Curve (AUC) metric by 3.94, 2.73 and 1.15 percentage points on the three datasets respectively. In summary, compared with the comparative knowledge graph recommendation models, the proposed model has a better recommendation effect.
Current research on knowledge graph recommendation mainly focuses on model establishment and training. However, in practical applications, the model needs to be updated regularly with an incremental updating method to adapt to changes in the preferences of new and old users. Most of these models only use users' long-term interest representations for recommendation without considering users' short-term interests; the methods used to aggregate neighborhood entities into the item vector representation lack interpretability; and there is the problem of catastrophic forgetting during model updating. Therefore, a Knowledge Graph Preference ATtention network based Long- and Short-term recommendation (KGPATLS) model and its updating method were proposed. Firstly, the aggregation method using a preference attention network and the user representation method combining users' long- and short-term interests were proposed through the KGPATLS model. Then, in order to alleviate the catastrophic forgetting problem during model updating, an incremental updating method Fusing Predict Sampling and Knowledge Distillation (FPSKD) was proposed. The proposed model and incremental updating method were tested on the MovieLens-1M and Last.FM datasets. Compared with the optimal baseline model Knowledge Graph Convolutional Network (KGCN), KGPATLS increases the Area Under Curve (AUC) by 2.2% and 1.4% and the Accuracy (Acc) by 2.5% and 2.9% on the two datasets respectively. Compared with three baseline incremental updating methods on the two datasets, the AUC and Acc of FPSKD are better than those of the Fine Tune and Random Sampling methods, and the training time of FPSKD is reduced to about one eighth and one quarter of that of the Full Batch method respectively. Experimental results verify the performance of the KGPATLS model and show that FPSKD can update the model efficiently while maintaining the model performance.
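The hedged sketch below illustrates only the knowledge-distillation part of such an incremental update: a supervised loss on new interactions plus a term that keeps the new model's predictions on sampled old data close to those of the frozen previous model. The loss form and weight are assumptions, not FPSKD's exact formulation.

```python
# Hedged sketch: incremental-update loss with a knowledge-distillation term
# against the frozen previous model to mitigate catastrophic forgetting.
import torch
import torch.nn.functional as F

def incremental_loss(new_scores_new_data, labels_new_data,
                     new_scores_old_sample, old_scores_old_sample,
                     distill_weight=0.5):
    # supervised loss on newly arrived interactions
    task_loss = F.binary_cross_entropy_with_logits(new_scores_new_data,
                                                   labels_new_data)
    # keep predictions on sampled old data close to the frozen previous model
    distill_loss = F.mse_loss(torch.sigmoid(new_scores_old_sample),
                              torch.sigmoid(old_scores_old_sample))
    return task_loss + distill_weight * distill_loss

loss = incremental_loss(torch.randn(32), torch.randint(0, 2, (32,)).float(),
                        torch.randn(16), torch.randn(16))
print(loss.item())
```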
In view of the difficulty of traditional mechanical fault diagnosis methods in handling the uncertainty of manual feature extraction, a large number of deep learning feature extraction methods have been proposed, which has greatly promoted the development of mechanical fault diagnosis. As a typical representative of deep learning, convolutional neural networks have made significant progress in image classification, object detection, image semantic segmentation and other fields, and a large body of literature has also emerged in the field of mechanical fault diagnosis. Based on the published literature, in order to further understand how convolutional neural networks are used in mechanical fault diagnosis, the relevant theories of convolutional neural networks were briefly introduced, and then the applications of convolutional neural networks in mechanical fault diagnosis were summarized from aspects such as data input type, transfer learning and prediction. Finally, the development directions of convolutional neural networks and their applications in mechanical fault diagnosis were prospected.
In view of the large amount of new data in the Industrial Internet of Things (IIoT) and the imbalance of data at the factory sub-ends, a data sharing method of IIoT based on Federated Incremental Learning (FIL-IIoT) was proposed. Firstly, the industry federation model was distributed to the factory sub-ends as the local initial model. Then, a federated sub-end optimization algorithm was proposed to dynamically adjust the participating subset. Finally, the incremental weights of the factory sub-ends were calculated through the federated incremental learning algorithm, thereby quickly integrating the new state data with the original industry federation model. Experimental results on the Case Western Reserve University (CWRU) bearing failure dataset show that the proposed FIL-IIoT achieves a bearing fault diagnosis accuracy of 93.15%, which is 6.18 percentage points and 2.59 percentage points higher than those of the Federated Averaging (FedAvg) algorithm and the FIL-IIoT of Non Increment (FIL-IIoT-NI) method respectively. The proposed method meets the needs of continuous optimization of the industry federation model based on industrial incremental data.
Label modeling is the basic task of label system construction and portrait construction. Traditional label modeling methods have problems such as difficulty in processing fuzzy labels, unreasonable label extraction, and ineffective integration of multi-modal entities and multi-dimensional relationships. Aiming at these problems, an enterprise portrait construction method based on label layering and deepening modeling, called EPLLD (Enterprise Portrait of Label Layering and Deepening), was proposed. Firstly, multi-characteristic information was extracted through multi-source information fusion, and the fuzzy labels of enterprises (such as labels in the wholesale and retail industries that cannot fully summarize the characteristics of an enterprise) were counted and screened. Secondly, a professional domain lexicon was established for feature expansion, and the BERT (Bidirectional Encoder Representations from Transformers) language model was combined for multi-feature extraction. Thirdly, Bidirectional Long Short-Term Memory (BiLSTM) was used to obtain the fuzzy label deepening results. Finally, keywords were extracted through TF-IDF (Term Frequency-Inverse Document Frequency), TextRank and the Latent Dirichlet Allocation (LDA) model to achieve label layering and deepening modeling. Experimental analysis on the same enterprise dataset shows that the precision of EPLLD in the fuzzy label deepening task is 91.11%, which is higher than those of 8 label processing methods such as BiLSTM+Attention and BERT+Deep CNN.
Focusing on the disadvantages of slow convergence and easily falling into local optima of the Harris Hawks Optimization (HHO) algorithm, an improved HHO algorithm called Chemotaxis Correction HHO (CC-HHO) was proposed. Firstly, the state of the convergence curve was identified by calculating the rate of decline and the change weight of the optimal solution. Secondly, the chemotaxis correction mechanism of the Bacterial Foraging Optimization (BFO) algorithm was introduced into the local search stage to improve the optimization accuracy. Thirdly, the law of energy consumption was integrated into the updating of the escape energy factor and the jump distance to balance exploration and exploitation. Fourthly, elite selection over different combinations of the optimal and sub-optimal solutions was used to improve the universality of the global search of the algorithm. Finally, when the search fell into a local optimum, the escape energy was perturbed to force the search to jump out. The performance of the improved algorithm was tested on ten benchmark functions. The results show that the search accuracy of the CC-HHO algorithm on unimodal functions is better than those of the Gravitational Search Algorithm (GSA), the Particle Swarm Optimization (PSO) algorithm, the Whale Optimization Algorithm (WOA) and four other improved HHO algorithms by more than ten orders of magnitude, and by more than one order of magnitude on multimodal functions; with search stability improved by more than 10% on average, the proposed algorithm converges significantly faster than the above comparative optimization algorithms with a more obvious convergence trend. Experimental results show that the CC-HHO algorithm effectively improves the efficiency and robustness of the original algorithm.
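As an illustration of the last point, the hedged sketch below computes the standard HHO escape energy, which decays over iterations, and adds a perturbation when stagnation is detected so the search is forced back into exploration; the stagnation flag and perturbation range are assumptions, not CC-HHO's exact rule.

```python
# Hedged sketch: standard HHO escape energy with a forced perturbation
# when the search is judged to be stuck in a local optimum.
import random

def escape_energy(t, max_iter, stagnated=False):
    e0 = 2 * random.random() - 1            # initial energy in [-1, 1]
    e = 2 * e0 * (1 - t / max_iter)         # standard HHO decay over iterations
    if stagnated:
        e += random.uniform(0.5, 1.0) * random.choice([-1, 1])  # forced jump out
    return e

for t in (10, 50, 90):
    print(t, round(escape_energy(t, 100), 3), round(escape_energy(t, 100, True), 3))
```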
In order to solve the problems of insufficient stability and poor accuracy of label propagation algorithms, a label propagation overlapping community detection algorithm combining K-shell and label entropy, called OCKELP (Overlapping Community detection algorithm combining K-shell and label Entropy in Label Propagation), was proposed. Firstly, the K-shell algorithm was used to reduce the label initialization time, and the update sequence based on label entropy was used to improve the stability of the algorithm. Secondly, comprehensive influence was introduced for label selection, and community-level information and node-local information were fused to improve the accuracy of the algorithm. Compared with the Community Overlap PRopagation Algorithm (COPRA), Overlapping community detection in complex networks based on Multi Kernel Label Propagation (OMKLP) and the Speaker-listener Label Propagation Algorithm (SLPA), the OCKELP algorithm achieves the greatest modularity improvements of about 68.64%, 53.99% and 42.29% respectively on real network datasets. It also has obvious advantages over the other three algorithms in the Normalized Mutual Information (NMI) value on artificial network datasets, and as the number of communities to which overlapping nodes belong increases, the real structures of the communities can still be discovered.
Aiming at the problems of strong interference and low detection precision in existing safety helmet wearing detection, a safety helmet detection algorithm based on the improved YOLOv5 (You Only Look Once version 5) model was proposed. Firstly, to handle the different sizes of safety helmets, the K-Means++ algorithm was used to redesign the sizes of the anchor boxes and match them to the corresponding feature layers. Secondly, a multi-spectral channel attention module was embedded in the feature extraction network, so that the network could learn the weight of each channel autonomously and enhance the information dissemination between features, thereby strengthening the ability of the network to distinguish foreground from background. Finally, images of different sizes were input randomly during the training iterations to enhance the generalization ability of the algorithm. Experimental results show that on the self-built safety helmet wearing detection dataset, the proposed algorithm achieves a mean Average Precision (mAP) of 96.0%, an Average Precision (AP) of 96.7% for workers wearing safety helmets, and an AP of 95.2% for workers without safety helmets. Compared with the YOLOv5 algorithm, the proposed algorithm increases the mAP of safety helmet wearing detection by 3.4 percentage points, and it meets the accuracy requirement of safety helmet wearing detection in construction scenarios.
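A hedged sketch of the anchor redesign step is shown below: the (width, height) pairs of labelled boxes are clustered with k-means++ initialisation, and the nine centres, sorted by area, serve as anchors. The box data are random placeholders for a real annotation file.

```python
# Hedged sketch: redesigning anchor boxes by clustering labelled box sizes
# with k-means++ initialisation.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
box_wh = rng.uniform(10, 300, size=(1000, 2))       # (width, height) in pixels

km = KMeans(n_clusters=9, init="k-means++", n_init=10, random_state=0).fit(box_wh)
anchors = km.cluster_centers_[np.argsort(km.cluster_centers_.prod(axis=1))]
print(np.round(anchors))    # 9 anchors, sorted by area, 3 per detection layer
```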
Traditional machine learning methods fail to fully mine the semantic information and association information when classifying the sentiment polarity of online comment text. Although existing deep learning methods can extract semantic and contextual information, the extraction process is often one-way and has deficiencies in obtaining the deep semantic information of comment text. Aiming at the above problems, a text sentiment analysis method combining XLNet (generalized autoregressive pretraining for language understanding) and RCNN (Recurrent Convolutional Neural Network) was proposed. Firstly, XLNet was used to represent the text features; by introducing the segment-level recurrence mechanism and relative positional encoding, the contextual information of comment text was fully considered, thereby effectively improving the expressive ability of the text features. Then, RCNN was used to train the text features in both directions and extract the contextual semantic information of the text at a deeper level, thereby improving the comprehensive performance in the sentiment analysis task. Experiments with the proposed method were carried out on three public datasets, weibo-100k, waimai-10k and ChnSentiCorp. The results show that the accuracy reaches 96.4%, 91.8% and 92.9% respectively, which proves the effectiveness of the proposed method in the sentiment analysis task.
At present, most deep learning models have difficulty in classifying bird sounds under complex background noise. Because bird sound is continuous in the time domain and has high-low (pitch) characteristics in the frequency domain, a fusion model of homologous spectrogram features was proposed for bird sound classification under complex background noise. Firstly, a Convolutional Neural Network (CNN) was used to extract Mel-spectrogram features of the bird sound. Then, the time-domain and frequency-domain dimensions of the same Mel-spectrogram feature were compressed to 1 by specific convolution and down-sampling operations, yielding a frequency-domain feature with only the high-low characteristics and a time-domain feature with only the continuity characteristics. On the basis of these operations, the Mel-spectrogram features were further extracted in both the time domain and the frequency domain to obtain time-frequency-domain features with both continuity and high-low characteristics. Then, the self-attention mechanism was applied to the obtained time-domain, frequency-domain and time-frequency-domain features to strengthen their own characteristics. Finally, the decision-fusion results of these three homologous spectrogram features were used for bird sound classification. The proposed model was used for audio classification of 8 bird species from the Xeno-canto website and achieved the best result in the comparison experiments with a Mean Average Precision (MAP) of 0.939. The experimental results show that the proposed model can deal with the problem of poor classification of bird sound under complex background noise.
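For illustration, the hedged sketch below computes a Mel-spectrogram and collapses it along each axis to obtain a frequency-only and a time-only feature; simple averaging stands in for the learned convolution and down-sampling compression described above, and the example clip is a librosa sample rather than bird audio.

```python
# Hedged sketch: Mel-spectrogram plus frequency-only and time-only profiles
# obtained by collapsing one axis (averaging stands in for learned compression).
import numpy as np
import librosa

y, sr = librosa.load(librosa.ex("trumpet"))        # stand-in audio clip
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel)                  # (128 mel bins, T frames)

freq_feature = mel_db.mean(axis=1)                 # (128,) high-low profile only
time_feature = mel_db.mean(axis=0)                 # (T,)  continuity profile only
print(mel_db.shape, freq_feature.shape, time_feature.shape)
```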
When Bscan images generated by Ground Penetrating Radar (GPR) are used to detect underground targets, current deep learning based object detection network models have some problems, such as high demand for training samples, long time consumption, inability to distinguish the saliency of targets, and difficulty in identifying complex targets. To solve the above problems, a double-threshold segmentation algorithm based on the histogram was proposed. Firstly, based on the distribution characteristics of the histogram of the GPR image of an underground target, two thresholds for underground target segmentation were calculated quickly from the histogram. Then, a combined classifier model with Support Vector Machine (SVM) and LeNet was used to classify the segmentation results. Finally, the classification results were integrated and the accuracy was counted. Compared with traditional threshold segmentation algorithms such as the Otsu and iterative methods, the structure of the underground target segmentation results obtained by the proposed algorithm is more complete and almost free of noise. On the real dataset, the average recognition accuracy of the proposed algorithm reaches more than 90%, which is more than 40% higher than that of the algorithm using a single classifier. The experimental results show that salient and non-salient underground targets can be effectively segmented at the same time, and the combined classifier can obtain better classification results, which is suitable for automatic detection and recognition of underground targets with small sample datasets.
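A minimal, hedged sketch of the double-threshold idea follows: two thresholds derived from the image's grey-level distribution split a B-scan into background, non-salient target and salient target regions. Percentile thresholds stand in for the paper's histogram-based rule, and the B-scan is a random placeholder.

```python
# Hedged sketch: double-threshold segmentation of a GPR B-scan into
# background / non-salient target / salient target.
import numpy as np

def double_threshold_segment(image, low_pct=90, high_pct=98):
    # the paper derives both thresholds from the grey-level histogram; here
    # simple percentiles of the intensity distribution stand in for that rule
    t_low, t_high = np.percentile(image, [low_pct, high_pct])
    labels = np.zeros_like(image, dtype=np.uint8)
    labels[image >= t_low] = 1        # non-salient target region
    labels[image >= t_high] = 2       # salient target region
    return labels

bscan = np.random.randint(0, 256, size=(256, 256))   # placeholder B-scan
seg = double_threshold_segment(bscan)
print(np.bincount(seg.ravel()))       # pixel counts per class
```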