In power monitoring scenarios, real-time monitoring and analysis of personnel behavior are crucial for ensuring the safe and stable operation of power systems. However, exposing unprocessed facial information in monitoring videos poses serious privacy risks. Traditional blurring methods based on face detection suffer from insufficient robustness and high computational cost in complex power environments, so they struggle to meet accuracy and real-time requirements simultaneously. To address these issues, a real-time face blurring method based on head skeleton point detection was proposed. Firstly, a lightweight head skeleton point detection framework based on a hierarchical processing strategy was designed: personnel regions were located rapidly in compressed video, and the corresponding regions were cropped at the original resolution and stitched together so that the head skeleton points of all people could be detected in one batch, thus improving detection efficiency and accuracy. Secondly, an adaptive inter-frame optimization strategy was introduced, in which frame differencing was used to detect changes in the number of personnel quickly and the detection frequency was adjusted dynamically with the help of a tracking mechanism for personnel detection boxes, thereby effectively reducing redundant computation. Finally, a prototype system for real-time face blurring was built on edge nodes, and its performance was validated through experiments. Experimental results indicate that, taking the KAPAO-S model as an example, the proposed method improves the face blurring accuracy in monitoring videos by 3.6 percentage points and reduces the processing time per frame by approximately 2.5 ms compared with the original model, thereby ensuring both accuracy and real-time performance.
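The inter-frame idea can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists, and the thresholds and the doubling rule for the detection interval are hypothetical choices, not values from the paper. The tracking of detection boxes that the method also relies on is omitted here.

```python
def frame_diff_ratio(prev, curr, pixel_thresh=25):
    """Fraction of pixels whose grayscale value changed by more than pixel_thresh."""
    changed = 0
    total = 0
    for row_p, row_c in zip(prev, curr):
        for p, c in zip(row_p, row_c):
            total += 1
            if abs(p - c) > pixel_thresh:
                changed += 1
    return changed / total if total else 0.0


def next_interval(interval, diff_ratio, change_thresh=0.05,
                  min_interval=1, max_interval=8):
    """Hypothetical rule: detect on every frame after a scene change,
    otherwise double the gap between full detections up to max_interval."""
    if diff_ratio > change_thresh:
        return min_interval
    return min(interval * 2, max_interval)
```

When the scene is static, the full detector would run only every few frames and the tracked boxes would be reused in between, which is where the reduction in redundant computation would come from.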
To address the problems that existing few-shot object detection models are insensitive to the feature parameters of new categories and have difficulty distinguishing category-related from category-unrelated parameters accurately, which leads to unclear feature boundaries and category confusion, a Few-Shot Object Detection algorithm based on new-category Feature Enhancement and Metric Mechanism (FEMM-FSOD) was proposed. Firstly, a Cross-Domain Parameter perception Module (CDPM) was introduced to improve the neck network: the re-weighting operations on channel and spatial features were reconstructed, and dilated convolution was combined with cross-stage information transfer and feature fusion to provide rich gradient information guidance and enhance sensitivity to new-category parameters. Meanwhile, an Integrated Correlated Multi-Feature module (ICMF) was constructed before Region of Interest Pooling (RoI Pooling) to establish correlations between features and optimize the fusion of relevant features dynamically, thereby enhancing salient features. Together, CDPM and ICMF enhanced the feature representation of new categories effectively and alleviated feature boundary ambiguity. Additionally, to further reduce category confusion, an orthogonal loss function based on the metric mechanism, Coherence-Separation Loss (CohSep Loss), was proposed in the detection head to achieve intra-class feature aggregation and inter-class feature separation by measuring feature vector similarity. Experimental results show that, compared with the baseline algorithm TFA (Two-stage Fine-tuning Approach), the proposed algorithm improves the mAP50 (mean Average Precision (mAP) of new categories at a threshold of 0.50) by 5.3 percentage points on the PASCAL VOC dataset across 15 few-shot instance-number settings; on the COCO dataset, it improves the mAP (mAP of new categories with thresholds from 0.50 to 0.95) by 3.6 and 5.2 percentage points in the 10-shot and 30-shot settings, respectively, achieving higher accuracy in few-shot object detection.
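The abstract does not give the form of CohSep Loss, so the following is only a sketch of the general idea of a similarity-based loss with an aggregation term and a separation term. The function name `coh_sep_loss` and the specific terms (one minus cosine similarity for same-class pairs, absolute cosine similarity for different-class pairs) are assumptions made for illustration, and the feature vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def coh_sep_loss(features, labels):
    """Hypothetical loss: pull same-class features together (cos -> 1)
    and push different-class features toward orthogonality (cos -> 0)."""
    intra, inter = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(features[i], features[j])
            if labels[i] == labels[j]:
                intra.append(1.0 - s)
            else:
                inter.append(abs(s))

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra) + mean(inter)
```

The loss is zero when same-class vectors are parallel and different-class vectors are orthogonal, which matches the "orthogonal loss" description in spirit.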
Aspect-Based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task that aims to determine the sentiment polarity of specific aspect words in a given text. Existing ABSA methods use Graph Convolutional Networks (GCNs) to process syntactic and semantic information, but they treat all syntactic dependencies of aspect words equally and ignore the impact of distant, unrelated words on the target aspect words, which results in inappropriate weight allocation between target aspect words and opinion words and insufficient extraction of semantic information. To address these issues, an ABSA model integrating syntax and sentiment knowledge was proposed. Firstly, a reachability matrix was constructed from syntactic information, and on this basis a syntactic enhancement graph was constructed by weighting positions with the aspect words as the center. Secondly, a semantic enhancement graph was constructed from external sentiment knowledge and aspect enhancement, and graph convolution was used to model the syntactic enhancement graph and the semantic enhancement graph separately, so as to form different feature channels. Thirdly, biaffine attention was used to integrate syntactic and semantic information effectively. Finally, average-pooling and concatenation operations were used to obtain the final feature vectors corresponding to the aspect words. Experimental results indicate that, compared with the deep dependency-aware graph convolutional network model DA-GCN-BERT (deep Dependency Aware GCN + BERT (Bidirectional Encoder Representations from Transformers)), the proposed model achieves accuracy improvements of 1.71, 1.41, 1.27, 0.17, and 0.43 percentage points on five publicly available datasets, respectively, which demonstrates its strong applicability to ABSA.
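One plausible reading of the reachability matrix is a k-hop reachability relation over the dependency tree, which keeps nearby words and drops distant ones. The sketch below assumes that reading. The edge list, the undirected treatment of dependencies, and the `max_hops` parameter are illustrative assumptions rather than details from the paper.

```python
from collections import deque


def hop_distances(n, edges):
    """All-pairs hop distances over an undirected dependency graph (BFS)."""
    adj = [[] for _ in range(n)]
    for head, dep in edges:
        adj[head].append(dep)
        adj[dep].append(head)
    dist = [[None] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] is None:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


def reachability_matrix(n, edges, max_hops):
    """Entry (i, j) is 1 if token j is within max_hops dependency arcs of token i."""
    dist = hop_distances(n, edges)
    return [[1 if d is not None and d <= max_hops else 0 for d in row]
            for row in dist]
```

For the tokens "The food was great" with arcs food→The, great→food, and great→was, a two-hop limit links "The" to "food" and "great" but not to "was".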
To address the problems of low detection accuracy and high missed detection rate caused by the laterally narrow, multi-scale, and long-range-dependent morphology of pavement defects, a pavement defect detection algorithm with enhanced morphological perception, built on an improved YOLOv8_n, was proposed. Firstly, an Edge-Enhancement Focus Module (EEFM) was introduced in the backbone fusion stage, in which a strip pooling kernel was used to capture directional and position-aware information, thereby enhancing edge details in deep features and improving the representation of elongated features. Secondly, a Dual Chain Feature Redistribution Pyramid Network (DCFRPN) was designed to reconstruct the fusion method, so as to provide multi-scale features with a wide receptive range and rich localization information, thereby improving the fusion of multi-scale defects. Additionally, a Morphological Aware Task Interaction Detection Head (MATIDH) was constructed to enhance task interaction between classification and localization, adjusting the data representation dynamically and integrating multi-scale strip convolutions to optimize the classification and regression of elongated defects. Finally, a PWIoU (Penalized Weighted Intersection over Union) loss function was proposed to allocate gradient gains dynamically to prediction boxes of different quality, thereby optimizing bounding box regression. Experimental results show that, on the RDD2022 dataset, the proposed algorithm improves precision and recall by 3.5 and 2.3 percentage points, respectively, compared with YOLOv8_n, and increases the mean Average Precision (mAP) at 50% Intersection over Union (IoU) by 3.2 percentage points, verifying the effectiveness of the proposed algorithm.
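Strip pooling, as used in EEFM, can be illustrated with a toy single-channel feature map. The sketch below averages over full rows and full columns and broadcasts the two results back to the grid. It omits the convolutions and gating that a real module would apply, and the function names are hypothetical.

```python
def strip_pool(feature_map):
    """Average over horizontal strips (one value per row) and
    vertical strips (one value per column)."""
    h = len(feature_map)
    w = len(feature_map[0])
    row_means = [sum(row) / w for row in feature_map]
    col_means = [sum(feature_map[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means


def strip_response(feature_map):
    """Broadcast the two strip descriptors back to an H x W map by addition."""
    row_means, col_means = strip_pool(feature_map)
    return [[r + c for c in col_means] for r in row_means]
```

A thin horizontal crack activates one row strongly in `row_means` while leaving `col_means` flat, which is why strip-shaped kernels suit elongated defects better than square pooling windows.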
Mining users’ multi-dimensional opinions on products from the complaint texts of new energy vehicles can support product design decisions. Because complaint texts have high entity density and lengthy sentence structures, existing methods for Aspect-Opinion Pair Extraction (AOPE) suffer from weak correlations between aspect terms and opinion terms. To address this problem, an Aspect-Opinion pair Extraction model based on Context Enhancement (AOE-CE) was proposed, which fuses topic features and text features into a contextual representation to enhance the correlations between entities. The model consisted of an entity recognition module and a relation detection module. In the entity recognition module, the text was first encoded by using a pre-trained model and a part-of-speech tagging tool. Then, a Bi-directional Long Short-Term Memory (Bi-LSTM) network combined with multi-head attention was employed to capture contextual information and derive text features. Subsequently, these text features were input into a Conditional Random Field (CRF) model to obtain the entity set. In the relation detection module, topic features were obtained through BERT (Bidirectional Encoder Representations from Transformers) and fused with the text features to obtain the enhanced contextual representation. Then, a tri-affine mechanism was used to enhance the correlations between entities with the help of the contextual representation. Finally, the extraction result was obtained through a Sigmoid function. Experimental results show that the precision, recall, and F1 value of AOE-CE are 2.19, 1.08, and 1.60 percentage points higher, respectively, than those of the SDRN (Synchronous Double-channel Recurrent Network) model, indicating that AOE-CE performs better on AOPE.
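A tri-affine score followed by a Sigmoid can be sketched as a three-way product between an aspect vector, an opinion vector, and a context vector. The weight tensor `W`, the vector sizes, and the absence of bias terms are simplifying assumptions for illustration, and the actual parameterization in AOE-CE may differ.

```python
import math


def triaffine_score(aspect, opinion, context, W):
    """score = sum_{i,j,k} W[i][j][k] * aspect[i] * opinion[j] * context[k]"""
    return sum(
        W[i][j][k] * aspect[i] * opinion[j] * context[k]
        for i in range(len(aspect))
        for j in range(len(opinion))
        for k in range(len(context))
    )


def pair_probability(aspect, opinion, context, W):
    """Sigmoid of the tri-affine score: probability that the pair is related."""
    return 1.0 / (1.0 + math.exp(-triaffine_score(aspect, opinion, context, W)))
```

Compared with a biaffine score over the two entities alone, the third factor lets the contextual representation modulate how strongly an aspect and an opinion term are linked.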
A complex causal relationship extraction model based on prompt enhancement and Bi-Graph ATtention network (BiGAT), named PE-BiGAT (Prompt Enhancement and Bi-Graph Attention Network), was proposed to address the problems of insufficient external information and forgetting during information transmission caused by the high density and long sentence patterns of complex causal sentences. Firstly, the result entities were extracted from the sentence and combined with a prompt learning template to form the prompt information, which was then enhanced through an external knowledge base. Then, the prompt information was input into the BiGAT, where the attention layer was combined with syntactic and semantic dependency graphs, and a biaffine attention mechanism was used to alleviate feature overlapping and enhance the model’s perception of relational features. Finally, all causal entities in the sentence were predicted iteratively by the classifier, and all causal pairs in the sentence were analyzed through a scoring function. Experimental results on the SemEval-2010 Task 8 and AltLex datasets show that, compared with RPA-GCN (Relationship Position and Attention-Graph Convolutional Network), the proposed model improves the F1 score by 1.65 percentage points, with improvements of 2.16 and 4.77 percentage points on chain causal and multi-causal sentences, respectively, which confirms that the proposed model has an advantage in dealing with complex causal sentences.
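A biaffine scorer in its commonly used form combines a bilinear term with a linear term over the concatenated inputs. The sketch below shows that generic form for a single pair of vectors, given as Python lists. The parameter names `U`, `w`, and `b` are illustrative, and how PE-BiGAT wires this into its attention layers is not specified in the abstract.

```python
def biaffine(x, y, U, w, b):
    """score = x^T U y + w . [x; y] + b"""
    bilinear = sum(
        x[i] * U[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )
    linear = sum(wi * vi for wi, vi in zip(w, x + y))  # x + y concatenates the lists
    return bilinear + linear + b
```

The bilinear term models the interaction between the two representations, while the linear term captures how likely each one is to participate in a relation on its own.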
As one of the basic tasks of natural language processing, new word identification provides theoretical support for building Chinese dictionaries and analyzing the sentiment tendency of words. However, current new word identification methods do not consider homophonic neologisms, so their precision on such words is low. To solve this problem, a Chinese homophonic neologism discovery method based on Pinyin similarity was proposed, which improves the precision of homophonic neologism identification by comparing the pronunciations of new and old words. Firstly, the text was preprocessed, the Average Mutual Information (AMI) was calculated to determine the degree of internal cohesion of candidate words, and an improved branch entropy was used to determine the boundaries of candidate new words. Then, the retained words were converted into Chinese Pinyin and compared with the Pinyin of the old words in the Chinese dictionary, and the most similar comparison results were retained. Finally, if a comparison result exceeded the threshold, the new word in the result was taken as a homophonic neologism, and its corresponding old word was taken as the original word. Experimental results on self-built Weibo datasets show that, compared with BNshCNs (Blended Numeric and symbolic homophony Chinese Neologisms) and DSSCNN (similarity computing model based on Dependency Syntax and Semantics), the proposed method improves precision by 0.51 and 5.27 percentage points, recall by 2.91 and 6.31 percentage points, and F1 score by 1.75 and 5.81 percentage points, respectively, indicating that the proposed method identifies Chinese homophonic neologisms more effectively.
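The Pinyin comparison step can be sketched with a normalized edit distance. This is an assumed similarity measure, since the abstract does not say which one the method uses. The Pinyin strings are hard-coded rather than produced by a converter, and the threshold of 0.8 is a hypothetical value.

```python
def edit_distance(a, b):
    """Levenshtein distance with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def pinyin_similarity(p, q):
    """1.0 for identical strings, falling toward 0.0 as they diverge."""
    longest = max(len(p), len(q))
    return 1.0 if longest == 0 else 1.0 - edit_distance(p, q) / longest


def best_match(candidate_pinyin, dictionary, threshold=0.8):
    """dictionary maps an old word to its Pinyin. Returns (word, score),
    or None if the best score does not exceed the threshold."""
    word, score = max(
        ((w, pinyin_similarity(candidate_pinyin, p)) for w, p in dictionary.items()),
        key=lambda item: item[1],
    )
    return (word, score) if score > threshold else None
```

For example, the candidate 神马 (Pinyin "shenma") matched against a dictionary containing 什么 ("shenme") gives a similarity of 5/6, so 什么 would be returned as the original word under this threshold.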
To address the shortcomings of the Teaching-Learning-Based Optimization (TLBO) algorithm in solving optimization problems, namely unbalanced search, a tendency to fall into local optima, and weak overall solution performance, an improved TLBO based on equilibrium optimization and a Lévy flight strategy, named ELMTLBO (Equilibrium-Lévy-Mutation TLBO), was proposed. Firstly, an elite equilibrium guidance strategy was designed to improve the global optimization ability of the algorithm through the equilibrium guidance of multiple elite individuals in the population. Secondly, a strategy combining Lévy flight with an adaptive weight was added after the learner phase of TLBO, in which the weight adaptively scaled the step size generated by Lévy flight, improving the local optimization ability of the population and enhancing the adaptability of individuals to complex environments. Finally, a mutation operator pool escape strategy was designed to improve population diversity through the cooperative guidance of multiple mutation operators. To verify the effectiveness of these improvements, the overall convergence performance of ELMTLBO was compared on 15 international test functions with that of 7 state-of-the-art intelligent optimization algorithms, such as the Dwarf Mongoose Optimization Algorithm (DMOA), and with algorithms of the same type, such as Balanced TLBO (BTLBO) and standard TLBO. The statistical results show that, compared with the advanced intelligent optimization algorithms and the TLBO variants, ELMTLBO balances its search ability effectively: it solves both unimodal and multimodal problems and also shows significant optimization ability on complex multimodal problems. Thus, with the combined effect of the different strategies, ELMTLBO has outstanding overall optimization performance and stable global convergence. In addition, ELMTLBO was successfully applied to the Multiple Sequence Alignment (MSA) problem based on the Hidden Markov Model (HMM); the high-quality aligned sequences it obtains can be used in disease diagnosis, gene tracing, and other fields, providing algorithmic support for the development of bioinformatics.
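The Lévy flight step with an adaptive weight can be sketched with Mantegna's algorithm, a standard way to draw Lévy-distributed step lengths. The linearly decreasing weight and the update toward the best individual are assumptions made for illustration, since the abstract does not state the exact update rule used in ELMTLBO.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """One Lévy-distributed step length drawn with Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def levy_update(x, best, t, max_iter, w_max=0.9, w_min=0.1, rng=random):
    """Hypothetical update: move each coordinate relative to the best solution
    by a Lévy step scaled with a weight that decays linearly over iterations."""
    w = w_max - (w_max - w_min) * t / max_iter
    return [xi + w * levy_step(rng=rng) * (bi - xi) for xi, bi in zip(x, best)]
```

Early iterations use a larger weight so the occasional long Lévy jump helps exploration, while later iterations shrink the step for finer local search.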
To address the problems of disturbed car-following behavior and unstable traffic flow caused by uncertainty in the driver’s acquisition of road velocity limit information and by time delay, a car-following model named TD-VDVL (Time-Delayed Velocity Difference and Velocity Limit), which considers the time-delayed velocity difference and velocity limit information in the Internet of Vehicles (IoV) environment, was proposed. Firstly, the velocity change caused by time delay and the road velocity limit information were introduced to improve the Full Velocity Difference (FVD) model. Then, the linear spectral wave perturbation method was used to derive the traffic flow stability criterion of the TD-VDVL model, and the influence of each model parameter on system stability was analyzed separately. Finally, numerical simulation experiments and comparative analysis were carried out using Matlab. In the simulation experiments, straight roads and circular roads were selected, and a slight disturbance was imposed on the fleet during driving. Under the same conditions, the TD-VDVL model had the smallest velocity fluctuation rate and the smallest fluctuation of fleet headway compared with the Optimal Velocity (OV) and FVD models. In particular, when the sensitivity coefficient of the velocity limit information was 0.3 and the sensitivity coefficient of the time-delayed velocity difference was 0.3, the average fluctuation rate of the fleet velocity under the proposed model was 2.35% at 500 s, and the peak-to-valley difference of the fleet headway was only 0.0194 m. Experimental results show that the TD-VDVL model has an improved stable region after introducing the time-delayed velocity difference and velocity limit information, and can significantly enhance the ability of a car-following fleet to absorb disturbances.
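For orientation, the FVD model sets a vehicle's acceleration to a relaxation term toward an optimal velocity plus a term proportional to the velocity difference with the leader. The sketch below implements that baseline and then adds two terms in the spirit of TD-VDVL. The form of the added terms, the function names, and all parameter values except the two sensitivity coefficients of 0.3 are assumptions, because the abstract does not give the model equation.

```python
import math


def optimal_velocity(headway, v_max=2.0, h_c=4.0):
    """Bando-type optimal velocity function of the headway."""
    return 0.5 * v_max * (math.tanh(headway - h_c) + math.tanh(h_c))


def fvd_acceleration(headway, v, dv, kappa=0.4, lam=0.5):
    """FVD model: a = kappa * (V(headway) - v) + lam * dv,
    where dv is the leader's velocity minus the follower's."""
    return kappa * (optimal_velocity(headway) - v) + lam * dv


def td_vdvl_acceleration(headway, v, dv, dv_delayed, v_limit,
                         gamma=0.3, beta=0.3):
    """Hypothetical extension: add a time-delayed velocity-difference term
    and a term pulling the velocity toward the road velocity limit."""
    return (fvd_acceleration(headway, v, dv)
            + gamma * dv_delayed
            + beta * (v_limit - v))
```

In a simulation, `dv_delayed` would be read from a buffer of past velocity differences at time t minus the delay, and the fleet would be integrated step by step on a straight or circular road.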
To address the problems that multi-view data analysis is susceptible to noise in the original dataset and requires additional steps to compute the clustering results, a Robust Multi-view subspace clustering algorithm based on Consistency Graph Learning (RMCGL) was proposed. Firstly, a potential robust representation of the data in the subspace was learned in each view, and the similarity matrix of each view was obtained from these representations. Then, a unified similarity graph was learned from the obtained similarity matrices. Finally, by adding a rank constraint to the Laplacian matrix corresponding to the similarity graph, the obtained similarity graph had the optimal clustering structure, so that the final clustering results could be obtained directly from this graph. The whole process was completed in a unified optimization framework, in which the potential robust representations, similarity matrices, and consistency graph could be learned simultaneously. On the BBC, 100leaves, and MSRC datasets, the clustering Accuracy (ACC) of the RMCGL algorithm is 3.36, 5.82, and 5.71 percentage points higher, respectively, than that of the Graph-based Multi-view Clustering (GMC) algorithm. Experimental results show that the proposed algorithm achieves a good clustering effect.
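The reason a rank constraint removes the extra clustering step is a standard graph-theory fact: a Laplacian of rank n − c corresponds to a graph with exactly c connected components, so each component is a cluster. The sketch below shows only that final read-out on an already learned similarity matrix. The optimization that produces such a matrix is not shown, and the function name is hypothetical.

```python
def clusters_from_graph(S, eps=1e-12):
    """Label each node by its connected component in the similarity graph S.
    Returns (labels, number_of_components)."""
    n = len(S)
    labels = [-1] * n
    count = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and (S[u][v] > eps or S[v][u] > eps):
                    labels[v] = count
                    stack.append(v)
        count += 1
    return labels, count
```

With a block-diagonal similarity matrix, the labels fall out directly, with no k-means or spectral rounding step afterwards.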
To address the lack of theoretical support in the policy-making process of traffic guidance management, a method for studying choice behavior under the constraint mechanism of traffic information was proposed. From the perspective of human perception, the constraint rules of Multi-Source Traffic Information (MSTI) were analyzed in depth based on a fuzzy clustering algorithm. Then, the road network environment was simulated with VISSIM, and a traffic state pattern recognition model was established to simulate the mental activity of travelers under the constraint of information. Next, a choice model was constructed with the Biogeme software from behavior survey data, which were collected for the example road network by using the Stated Preference (SP) survey method. Results show that the constraining effect of traffic information on travel behavior is very limited and travelers keep to their preferred path when traffic on it is not very heavy, whereas the effect strengthens gradually and information-induced path changes become more frequent as the preferred path becomes more congested. The conclusions provide a new idea and reference for research on boundedly rational behavior in an information environment, and also provide decision support for traffic management departments.
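The kind of discrete choice model that Biogeme estimates can be illustrated with a two-route logit model. The utility specification and the coefficients below (a constant favoring the preferred path and a negative travel-time coefficient) are invented for illustration and are not estimates from the survey data.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit choice probabilities (numerically stable softmax)."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def route_choice(pref_time, alt_time, asc_pref=1.0, b_time=-0.2):
    """Return [P(preferred path), P(alternative path)] given the travel
    times (in minutes) reported by the traffic information."""
    v_pref = asc_pref + b_time * pref_time
    v_alt = b_time * alt_time
    return logit_probabilities([v_pref, v_alt])
```

With these made-up coefficients, travelers mostly keep the preferred path when its reported travel time is close to that of the alternative, and switching becomes likely only once it is clearly longer, which mirrors the qualitative finding described above.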