
Table of Contents

    10 December 2021, Volume 41 Issue 12
    The 18th China Conference on Machine Learning
    Research advances in disentangled representation learning
    Keyang CHENG, Chunyun MENG, Wenshan WANG, Wenxi SHI, Yongzhao ZHAN
    2021, 41(12):  3409-3418.  DOI: 10.11772/j.issn.1001-9081.2021060895
    Abstract | HTML | PDF (877KB)

    The purpose of disentangled representation learning is to model the key factors that affect the form of data, so that a change in one key factor only changes the data along a certain feature while the other features are unaffected. This helps machine learning meet the challenges of model interpretability, object generation and manipulation, zero-shot learning and other issues; therefore, disentangled representation learning has long been a research hotspot in the field of machine learning. Starting from the history and motivations of disentangled representation learning, the research status and applications of disentangled representation learning were summarized, the invariance, reusability and other characteristics of disentangled representation learning were analyzed, and the research on the factors of variation via generative entangling, the factors of variation with manifold interaction, and the factors of variation using adversarial training was introduced, together with the latest research trends such as β-VAE, a variant of the Variational Auto-Encoder (VAE). At the same time, typical applications of disentangled representation learning were shown, and future research directions were discussed.
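As a minimal, hypothetical illustration of the β-VAE objective mentioned in this abstract (a generic sketch, not the surveyed authors' code), the loss can be written in numpy as a reconstruction term plus a β-weighted Gaussian KL term:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Sketch of the beta-VAE objective: reconstruction error plus a
    beta-weighted KL divergence between the approximate posterior
    N(mu, sigma^2) and the standard normal prior."""
    recon = np.sum((x - x_recon) ** 2)  # squared-error reconstruction term
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I).
    kl = -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))
    return recon + beta * kl
```

Setting β > 1 strengthens the pressure toward a factorized posterior, which is what encourages disentangled latent factors.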

    Constrained multi-objective evolutionary algorithm based on space shrinking technique
    Erchao LI, Yuyan MAO
    2021, 41(12):  3419-3425.  DOI: 10.11772/j.issn.1001-9081.2021060887
    Abstract | HTML | PDF (979KB)

    When constrained multi-objective evolutionary algorithms solve optimization problems with large infeasible regions, reasonable exploration of the infeasible region not only helps the population converge quickly to the optimal solutions in the feasible region, but also reduces the impact of unpromising infeasible regions on the performance of the algorithm. Based on this, a Constrained Multi-Objective Evolutionary Algorithm based on Space Shrinking Technique (CMOEA-SST) was proposed. Firstly, an adaptive elite retention strategy was proposed to improve the initial population in the Pull phase of Push and Pull Search for solving constrained multi-objective optimization problems (PPS), so as to increase the diversity and feasibility of the initial population in the Pull phase. Then, the space shrinking technique was used to gradually reduce the search space during evolution, which reduced the impact of unpromising infeasible regions on the algorithm performance. As a result, the algorithm was able to improve the convergence accuracy while balancing convergence and diversity. To verify its performance, the proposed algorithm was simulated and compared with four representative algorithms, C-MOEA/D (adaptive Constraint handling approach embedded MOEA/D), ToP (handling constrained multi-objective optimization problems with constraints in both the decision and objective spaces), C-TAEA (Two-Archive Evolutionary Algorithm for Constrained multi-objective optimization) and PPS, on the LIRCMOP series of test problems. Experimental results show that CMOEA-SST has better convergence and diversity when dealing with constrained optimization problems with large infeasible regions.
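The feasibility and Pareto-dominance checks that constrained multi-objective evolutionary algorithms such as CMOEA-SST build on can be sketched as follows (a generic illustration for minimization problems, not the paper's implementation):

```python
import numpy as np

def constraint_violation(g):
    """Overall violation of inequality constraints g(x) <= 0:
    zero if and only if the solution is feasible."""
    return float(np.sum(np.maximum(g, 0.0)))

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization):
    no worse in every objective and strictly better in at least one."""
    f_a, f_b = np.asarray(f_a), np.asarray(f_b)
    return bool(np.all(f_a <= f_b) and np.any(f_a < f_b))
```

In such algorithms, selection typically prefers feasible solutions (zero violation) and breaks ties among feasible ones by Pareto dominance.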

    Specific knowledge learning based on knowledge distillation
    Zhaoxia DAI, Yudong CAO, Guangming ZHU, Peiyi SHEN, Xu XU, Lin MEI, Liang ZHANG
    2021, 41(12):  3426-3431.  DOI: 10.11772/j.issn.1001-9081.2021060923
    Abstract | HTML | PDF (648KB)

    In the framework of traditional knowledge distillation, the teacher network transfers all of its knowledge to the student network, and there is almost no research on the transfer of partial or specific knowledge. Considering that industrial applications are characterized by a single scene and a small number of classes, the recognition performance of neural network models on specific categories needs particular attention. Based on the attention feature transfer distillation algorithm, three specific knowledge learning algorithms were proposed to improve the classification performance of student networks on specific categories. Firstly, the training dataset was filtered for specific classes to exclude the training data of the other, non-specific classes. On this basis, the non-specific classes were treated as background and the background knowledge was suppressed during distillation, so as to further reduce the impact of irrelevant knowledge on the specific classes. Finally, the network structure was changed: the background knowledge was suppressed only at the high levels of the network, while the learning of basic graphic features was retained at the bottom levels. Experimental results show that, on specific-category classification, a student network trained by the specific knowledge learning algorithms can match or even outperform a teacher network with six times as many parameters.
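The attention feature transfer distillation term that these algorithms build on can be sketched as follows (a generic illustration of attention transfer; the paper's specific-knowledge variants additionally suppress background classes):

```python
import numpy as np

def attention_map(feat):
    """Spatial attention map of a feature tensor of shape (C, H, W):
    channel-wise sum of squared activations, flattened and L2-normalized."""
    a = np.sum(feat ** 2, axis=0).ravel()
    return a / (np.linalg.norm(a) + 1e-12)

def at_loss(feat_teacher, feat_student):
    """Attention-transfer distillation term: distance between the
    teacher's and student's normalized spatial attention maps."""
    return float(np.linalg.norm(attention_map(feat_teacher) - attention_map(feat_student)))
```

The student is trained to minimize this term alongside its usual classification loss, so its intermediate layers attend to the same spatial regions as the teacher's.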

    Dynamic graph representation learning method based on deep neural network and gated recurrent unit
    Huibo LI, Yunxiao ZHAO, Liang BAI
    2021, 41(12):  3432-3437.  DOI: 10.11772/j.issn.1001-9081.2021060994
    Abstract | HTML | PDF (869KB)

    Learning latent vector representations of the nodes in a graph is an important and ubiquitous task, which aims to capture various attributes of the nodes. Much work demonstrates that static graph representation learning can learn part of the node information; however, real-world graphs evolve over time. In order to solve the problem that most dynamic network algorithms cannot effectively retain node neighborhood structure and temporal information, a dynamic network representation learning method based on Deep Neural Network (DNN) and Gated Recurrent Unit (GRU), namely DynAEGRU, was proposed. With an Auto-Encoder (AE) as its framework, the encoder of DynAEGRU aggregated neighborhood information with a DNN to obtain low-dimensional feature vectors; then the node temporal information was extracted by a GRU network; finally, the adjacency matrix was reconstructed by the decoder and compared with the real graph to construct the loss. Experimental results on three real-world datasets show that the DynAEGRU method achieves better performance than several static and dynamic graph representation learning algorithms.

    Robust multi-view subspace clustering based on consistency graph learning
    Zhenjun PAN, Cheng LIANG, Huaxiang ZHANG
    2021, 41(12):  3438-3446.  DOI: 10.11772/j.issn.1001-9081.2021061056
    Abstract | HTML | PDF (781KB)

    Concerning that multi-view data analysis is susceptible to the noise of the original dataset and requires additional steps to calculate the clustering results, a Robust Multi-view subspace clustering based on Consistency Graph Learning (RMCGL) algorithm was proposed. Firstly, the potential robust representation of the data in the subspace was learned in each view, and the similarity matrix of each view was obtained based on these representations. Then, a unified similarity graph was learned from the obtained similarity matrices. Finally, by adding a rank constraint to the Laplacian matrix corresponding to the similarity graph, the learned similarity graph was given an optimal clustering structure, from which the final clustering results could be obtained directly. The whole process was completed in a unified optimization framework, in which the potential robust representations, similarity matrices and consistency graph were learned simultaneously. The clustering Accuracy (ACC) of the RMCGL algorithm is 3.36, 5.82 and 5.71 percentage points higher than that of the Graph-based Multi-view Clustering (GMC) algorithm on the BBC, 100leaves and MSRC datasets, respectively. Experimental results show that the proposed algorithm has a good clustering effect.
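The rank constraint on the Laplacian mentioned in this abstract exploits a standard spectral fact: the multiplicity of the Laplacian's zero eigenvalue equals the number of connected components of the similarity graph, so constraining the rank forces the graph to split into exactly c clusters. A numpy sketch of that check (generic illustration, not the paper's code):

```python
import numpy as np

def n_connected_components(W, tol=1e-8):
    """Number of connected components of a similarity graph W, read off
    as the multiplicity of the (near-)zero eigenvalue of the
    unnormalized graph Laplacian L = D - W."""
    W = (W + W.T) / 2                   # symmetrize the affinity matrix
    L = np.diag(W.sum(axis=1)) - W      # unnormalized Laplacian
    eigvals = np.linalg.eigvalsh(L)     # real eigenvalues, ascending
    return int(np.sum(eigvals < tol))
```

When the learned graph satisfies the rank constraint, its components can be returned directly as the clustering, with no extra k-means step.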

    Directed graph clustering algorithm based on kernel nonnegative matrix factorization
    Xian CHEN, Liying HU, Xiaowei LIN, Lifei CHEN
    2021, 41(12):  3447-3454.  DOI: 10.11772/j.issn.1001-9081.2021061129
    Abstract | HTML | PDF (653KB)

    Most of the existing directed graph clustering algorithms are based on the assumption of an approximately linear relationship between nodes in vector space, ignoring the non-linear correlations between nodes. To address this problem, a directed graph clustering algorithm based on Kernel Nonnegative Matrix Factorization (KNMF) was proposed. First, the adjacency matrix of the directed graph was projected into a kernel space by a kernel learning method, and the node similarities in both the original and kernel spaces were constrained by a specific regularization term. Second, the objective function of the graph-regularized kernel asymmetric NMF algorithm was formulated, and a clustering algorithm was derived by gradient descent under non-negativity constraints. By modeling the non-linear relationships between nodes with kernel learning and taking the directivity of links into account, the algorithm accurately reveals the latent structural information in the directed graph. Finally, experimental results on the Patent Citation Network (PCN) dataset show that, when the number of clusters is 2, the proposed algorithm improves the Davies-Bouldin (DB) index and the Distance-based Quality Function (DQF) by about 0.25 and 8% respectively over the comparison algorithms, achieving better clustering quality.

    Graph learning regularized discriminative non-negative matrix factorization based face recognition
    Han DU, Xianzhong LONG, Yun LI
    2021, 41(12):  3455-3461.  DOI: 10.11772/j.issn.1001-9081.2021060979
    Abstract | HTML | PDF (790KB)

    The Non-negative Matrix Factorization (NMF) algorithm based on graph regularization exploits the assumption that high-dimensional data usually lie in a low-dimensional manifold space to construct the Laplacian matrix. Its disadvantage is that the Laplacian matrix is computed in advance and is not updated during the multiplicative update process. To solve this problem, the self-representation method from subspace learning was adopted to generate the representation coefficients, from which the similarity matrix and then the Laplacian matrix were computed, and the Laplacian matrix was updated iteratively during the optimization. In addition, the label information of the training set was used to construct a class indicator matrix, and two different regularization terms were introduced to reconstruct this class indicator matrix respectively. The resulting algorithm was called Graph Learning Regularized Discriminative Non-negative Matrix Factorization (GLDNMF), and the corresponding multiplicative update rules and the convergence proof of the objective function were given. Face recognition experimental results on two standard datasets show that the accuracy of the proposed algorithm is 1% - 5% higher than that of existing classic algorithms, verifying the effectiveness of the proposed method.
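The graph-regularized multiplicative updates that GLDNMF extends can be sketched as follows (standard GNMF-style Lee-Seung updates under the assumption V ≈ WH; the paper's method additionally re-learns the graph via self-representation and adds label-based regularization terms):

```python
import numpy as np

def gnmf_update(V, W, H, A, lam=0.1):
    """One round of multiplicative updates for graph-regularized NMF,
    V ~ W @ H with all factors nonnegative, where the affinity matrix A
    regularizes the columns of H through the Laplacian L = D - A."""
    eps = 1e-12                      # avoid division by zero
    D = np.diag(A.sum(axis=1))       # degree matrix of the affinity graph
    W *= (V @ H.T) / (W @ H @ H.T + eps)                              # Lee-Seung step for W
    H *= (W.T @ V + lam * H @ A) / (W.T @ W @ H + lam * H @ D + eps)  # graph-regularized step for H
    return W, H
```

Because the updates are multiplicative with nonnegative numerators and denominators, nonnegativity of W and H is preserved automatically and the objective is non-increasing.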

    Multi-kernel learning method based on neural tangent kernel
    Mei WANG, Chuanhai XU, Yong LIU
    2021, 41(12):  3462-3467.  DOI: 10.11772/j.issn.1001-9081.2021060998
    Abstract | HTML | PDF (510KB)

    The multi-kernel learning method is an important type of kernel learning method, but most multi-kernel learning methods have the following problems: their basis kernel functions are traditional kernels with shallow structures, which have weak representation ability on problems with large data scale and uneven distribution; and the generalization error convergence rates of the existing multi-kernel learning methods are mostly O(1/√n), so convergence is slow. Therefore, a multi-kernel learning method based on the Neural Tangent Kernel (NTK) was proposed. Firstly, the NTK, which has a deep structure, was used as the basis kernel function of the multi-kernel learning method, so as to enhance its representation ability. Then, a generalization error bound with a convergence rate of O(1/n) was proved based on the principal eigenvalue proportion measure. On this basis, a new multi-kernel learning algorithm was designed in combination with the kernel alignment measure. Finally, experiments were carried out on several datasets. Experimental results show that, compared with classification algorithms such as AdaBoost and K-Nearest Neighbors (KNN), the proposed multi-kernel learning algorithm has higher accuracy and better representation ability, which verifies its feasibility and effectiveness.
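The kernel alignment measure used to weight basis kernels in multi-kernel learning can be sketched as the Frobenius-inner-product cosine between two kernel matrices (generic illustration, not the paper's code):

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Uncentered kernel alignment: cosine similarity between two kernel
    (Gram) matrices under the Frobenius inner product. Values close to 1
    indicate the kernels induce similar geometries on the sample."""
    num = np.sum(K1 * K2)  # Frobenius inner product <K1, K2>_F
    return float(num / (np.linalg.norm(K1) * np.linalg.norm(K2)))
```

In practice, alignment with an ideal target kernel built from labels (e.g. y yᵀ) is used to score and weight each candidate basis kernel.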

    Multiple kernel clustering algorithm based on capped simplex projection graph tensor learning
    Haoyun LEI, Zenwen REN, Yanlong WANG, Shuang XUE, Haoran LI
    2021, 41(12):  3468-3474.  DOI: 10.11772/j.issn.1001-9081.2021061393
    Abstract | HTML | PDF (6316KB)

    Because multiple kernel learning can effectively avoid the selection of kernel functions and parameters, and graph clustering can fully mine the complex structural information between samples, Multiple Kernel Graph Clustering (MKGC) has received widespread attention in recent years. However, the existing MKGC methods suffer from the following problems: the graph learning technique complicates the model; the high rank of the graph Laplacian matrix cannot ensure that the learned affinity graph contains exactly c connected components (the block diagonal property); and most methods ignore the high-order structural information among the candidate affinity graphs, making it difficult to fully utilize the multiple kernel information. To tackle these problems, a novel MKGC method was proposed. First, a new graph learning method based on capped simplex projection was proposed to directly project the kernel matrices onto the graph simplex, which reduced the computational complexity. Meanwhile, a new block diagonal constraint was introduced to keep the accurate block diagonal property of the learned affinity graphs. Moreover, low-rank tensor learning was introduced in the capped simplex projection space to fully mine the high-order structural information of multiple candidate affinity graphs. Experimental results on multiple datasets show that, compared with the existing MKGC methods, the proposed method has lower computational cost and higher stability, and has great advantages in Accuracy (ACC) and Normalized Mutual Information (NMI).
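A sort-based Euclidean projection onto the probability simplex, the building block behind the capped simplex projection above, can be sketched as follows (the Duchi et al. algorithm; the paper's capped variant additionally enforces a per-entry upper bound):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of vector v onto the probability simplex
    {x : x >= 0, sum(x) = 1}, via the sort-and-threshold algorithm."""
    u = np.sort(v)[::-1]                 # sort entries in descending order
    css = np.cumsum(u)
    # Largest index rho with u_rho * rho > (cumsum_rho - 1).
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1.0)  # shift that makes the result sum to 1
    return np.maximum(v - theta, 0.0)
```

Projecting each row of a kernel matrix this way yields a valid affinity graph (nonnegative rows summing to one) without solving a separate graph-learning subproblem.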

    BNSL-FIM: Bayesian network structure learning algorithm based on frequent item mining
    Xuanyi LI, Yun ZHOU
    2021, 41(12):  3475-3479.  DOI: 10.11772/j.issn.1001-9081.2021060898
    Abstract | HTML | PDF (542KB)

    Bayesian networks can represent uncertain knowledge and support inferential computation, but due to the noise and size limitations of actual sample data and the complexity of searching the network space, Bayesian network structure learning always incurs certain errors. To improve the accuracy of Bayesian network structure learning, a Bayesian network structure learning algorithm that takes the results of maximal frequent itemset mining and association rule analysis as prior knowledge, namely BNSL-FIM (Bayesian Network Structure Learning algorithm based on Frequent Item Mining), was proposed. Firstly, the maximal frequent itemsets were mined from the data and structure learning was performed on them, and the association rule analysis results were then used for correction, thereby determining the prior knowledge based on frequent item mining and association rule analysis. Secondly, a Bayesian Dirichlet equivalent uniform (BDeu) scoring algorithm combined with the prior knowledge was proposed for Bayesian network structure learning. Finally, experiments were carried out on 6 public standard datasets to compare the Hamming distances between the original network structure and the structures learned with and without the prior. The results show that the proposed algorithm can effectively improve the structure learning accuracy of Bayesian networks compared with the original BDeu scoring algorithm.

    Deep distance metric learning method based on optimized triplet loss
    Zilong LI, Yong ZHOU, Rong BAO, Hongdong WANG
    2021, 41(12):  3480-3484.  DOI: 10.11772/j.issn.1001-9081.2021061107
    Abstract | HTML | PDF (581KB)

    Focusing on the issues that a single deep distance metric based on the triplet loss adapts poorly to diversified datasets and easily leads to overfitting, a deep distance metric learning method based on an optimized triplet loss was proposed. Firstly, the relative distances of triplet training samples mapped by the neural network were thresholded, and a piecewise linear function was used as the evaluation function of the relative distance. Secondly, the evaluation function was added to the Boosting algorithm as a weak classifier to generate a strong classifier. Finally, an alternating optimization method was used to learn the parameters of the weak classifiers and the neural network. Evaluation of various deep distance metric learning methods on the image retrieval task shows that the Recall@1 of the proposed method is 4.2, 3.2 and 0.6 higher than the previous best scores on the CUB-200-2011, Cars-196 and SOP datasets respectively. Experimental results show that the proposed method outperforms the comparison methods while avoiding overfitting to a certain extent.
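The standard triplet loss that this method starts from can be sketched as follows (the paper replaces this fixed hinge with a learned piecewise-linear evaluation function inside a Boosting ensemble):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on embedded samples: penalizes the anchor
    being closer to the negative than to the positive by less than the
    margin; zero when the embedding already separates them."""
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return float(max(0.0, d_pos - d_neg + margin))
```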

    Label noise filtering method based on dynamic probability sampling
    Zenghui ZHANG, Gaoxia JIANG, Wenjian WANG
    2021, 41(12):  3485-3491.  DOI: 10.11772/j.issn.1001-9081.2021061026
    Abstract | HTML | PDF (1379KB)

    In machine learning, data quality has a far-reaching impact on the accuracy of system prediction. Because information is hard to obtain and human cognition is subjective and limited, experts cannot label all samples accurately, and some probability sampling methods proposed in recent years fail to avoid unreasonable, subjective sample division by humans. To solve this problem, a label noise filtering method based on Dynamic Probability Sampling (DPS) was proposed, which fully considers the differences between the samples of each dataset. By counting the frequency of the built-in confidence distribution in each interval and analyzing the trend of the information entropy of this distribution, a reasonable threshold was determined. Fourteen datasets were selected from the classic UCI datasets, and the proposed algorithm was compared with Random Forest (RF), High Agreement Random Forest Filter (HARF), Majority Vote Filter (MVF) and Local Probability Sampling (LPS) methods. Experimental results show that the proposed method performs well in both label noise recognition and classification generalization.
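The interval-entropy computation described above can be sketched as follows (a generic illustration, not the paper's code; `confidences` stands for whatever built-in confidence score in [0, 1] the model produces per sample):

```python
import numpy as np

def interval_entropy(confidences, n_bins=10):
    """Information entropy of the distribution of per-sample confidence
    scores over equal-width intervals: low entropy means the scores are
    concentrated, which can guide the choice of a filtering threshold."""
    counts, _ = np.histogram(confidences, bins=n_bins, range=(0.0, 1.0))
    p = counts / max(counts.sum(), 1)  # empirical probability per interval
    p = p[p > 0]                       # drop empty intervals (0*log 0 = 0)
    return float(-np.sum(p * np.log2(p)))
```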

    Manifold regularized nonnegative matrix factorization based on clean data
    Hua LI, Guifu LU, Qinru YU
    2021, 41(12):  3492-3498.  DOI: 10.11772/j.issn.1001-9081.2021060962
    Abstract | HTML | PDF (663KB)

    The existing Nonnegative Matrix Factorization (NMF) algorithms are often designed based on Euclidean distance, which makes them sensitive to noise. In order to enhance their robustness, a Manifold Regularized Nonnegative Matrix Factorization based on Clean Data (MRNMF/CD) algorithm was proposed, in which low-rank constraints, manifold regularization and NMF were seamlessly integrated, giving the algorithm relatively excellent performance. Firstly, by adding the low-rank constraints, MRNMF/CD can recover clean data from noisy data and obtain the global structure of the data. Secondly, in order to use the local geometric structure information of the data, manifold regularization was incorporated into the objective function. In addition, an iterative algorithm for solving MRNMF/CD was proposed, and the convergence of this algorithm was analyzed theoretically. Experimental results on the ORL, Yale and COIL20 datasets show that MRNMF/CD achieves better accuracy than existing algorithms including k-means, Principal Component Analysis (PCA), NMF and Graph Regularized Nonnegative Matrix Factorization (GNMF).

    Staged variational autoencoder for heterogeneous one-class collaborative filtering
    Xiancong CHEN, Weike PAN, Zhong MING
    2021, 41(12):  3499-3507.  DOI: 10.11772/j.issn.1001-9081.2021060894
    Abstract | HTML | PDF (785KB)

    In the field of recommender systems, most existing works mainly focus on the One-Class Collaborative Filtering (OCCF) problem with only one type of user feedback, e.g., purchase feedback. However, user feedback is usually heterogeneous in real applications, so modeling users' heterogeneous feedback to capture their true preferences has become a new challenge. Focusing on the Heterogeneous One-Class Collaborative Filtering (HOCCF) problem (with both purchase feedback and browsing feedback), a transfer learning solution named the Staged Variational AutoEncoder (SVAE) model was proposed. Firstly, latent feature vectors were generated by a Multinomial Variational AutoEncoder (Multi-VAE) from the auxiliary data of users' browsing feedback. Then, the obtained latent feature vectors were transferred to another Multi-VAE to assist its modeling of the target data, i.e., the purchase feedback. Experimental results on three real-world datasets show that, in most cases, the performance of the SVAE model on important metrics such as Precision@5 and Normalized Discounted Cumulative Gain@5 (NDCG@5) is significantly better than that of the state-of-the-art recommendation algorithms, demonstrating the effectiveness of the proposed model.

    Unbiased recommendation model based on improved propensity score estimation
    Jinwei LUO, Dugang LIU, Weike PAN, Zhong MING
    2021, 41(12):  3508-3514.  DOI: 10.11772/j.issn.1001-9081.2021060910
    Abstract | HTML | PDF (567KB)

    In reality, recommender systems usually suffer from various biases, such as exposure bias, position bias and selection bias. A recommendation model that ignores these biases cannot reflect the real performance of the recommender system and may be untrustworthy for users. Previous works show that a recommendation model based on propensity score estimation can effectively alleviate the exposure bias of implicit feedback data in recommender systems, but usually only item information is considered when estimating propensity scores, which may make the estimates inaccurate. To improve the accuracy of propensity score estimation, a Match Propensity Estimator (MPE) method was proposed. Specifically, the concept of a user's popularity preference was introduced first, and then a more accurate model of the sample exposure rate was obtained by calculating the matching degree between the user's popularity preference and the item's popularity. The proposed estimation method was integrated with a traditional recommendation model and an unbiased recommendation model, and the integrated models were compared with three baseline models including the above two. Experimental results on a public dataset show that, compared with the corresponding baselines, the models combining the MPE method achieve significant improvements on three evaluation metrics: recall, Discounted Cumulative Gain (DCG) and Mean Average Precision (MAP). In addition, the experiments demonstrate that a large part of the performance gain comes from long-tail items, showing that the proposed method helps improve the diversity and coverage of recommended items.
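The inverse-propensity-scoring idea that unbiased recommendation models build on can be sketched as follows (a generic illustration; the paper's contribution, MPE, is a better estimate of the propensities themselves, not this estimator):

```python
import numpy as np

def ips_estimate(rewards, propensities):
    """Inverse-propensity-scored estimate of the average reward: each
    observed interaction is up-weighted by 1 / propensity so that items
    with low exposure probability are not systematically undercounted."""
    return float(np.mean(rewards / propensities))
```

With correct propensities this estimator is unbiased for the full-exposure average, which is why its accuracy hinges on how well the propensity scores are estimated.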

    Social collaborative ranking recommendation algorithm by exploiting both explicit and implicit feedback
    Gai LI, Lei LI, Jiaqiang ZHANG
    2021, 41(12):  3515-3520.  DOI: 10.11772/j.issn.1001-9081.2021060908
    Abstract | HTML | PDF (631KB)

    Traditional social collaborative filtering algorithms based on rating prediction have the inherent deficiency that the predicted values do not match the real ranking, while social collaborative ranking algorithms based on ranking prediction are more suitable for practical application scenarios. However, most existing social collaborative ranking algorithms focus on explicit feedback data only or implicit feedback data only, and do not make full use of the information in the dataset. In order to fully exploit both the explicit and implicit scoring information of users' social networks and recommendation objects, and to overcome the inherent deficiency of traditional rating-prediction-based social collaborative filtering algorithms, a new social collaborative ranking model based on the newest xCLiMF model and the TrustSVD model, namely SPR_SVD++, was proposed. In this algorithm, both the explicit and implicit information of the user scoring matrix and the social network matrix were exploited simultaneously, and the learning-to-rank evaluation metric Expected Reciprocal Rank (ERR) was optimized. Experimental results on real datasets show that the SPR_SVD++ algorithm outperforms the existing state-of-the-art algorithms TrustSVD, MERR_SVD++ and SVD++ on two different evaluation metrics, Normalized Discounted Cumulative Gain (NDCG) and ERR. Thanks to its good performance and high extensibility, the SPR_SVD++ algorithm has a good application prospect in the field of Internet information recommendation.
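The ERR metric optimized by SPR_SVD++ can be sketched as follows (the standard cascade-model definition, not the paper's code):

```python
def expected_reciprocal_rank(grades, g_max=4):
    """Expected Reciprocal Rank for a ranked list of relevance grades.
    Under the cascade model, R_i = (2^g_i - 1) / 2^g_max is the chance
    the user is satisfied at rank i and stops examining the list."""
    err, p_continue = 0.0, 1.0
    for i, g in enumerate(grades, start=1):
        r = (2 ** g - 1) / (2 ** g_max)   # satisfaction probability at rank i
        err += p_continue * r / i          # contribution if user stops here
        p_continue *= (1 - r)              # user keeps scanning otherwise
    return err
```

Unlike NDCG, ERR discounts a result by how likely earlier results have already satisfied the user, which is why it rewards placing highly relevant items first.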

    Hybrid K-anonymous feature selection algorithm
    Liu YANG, Yun LI
    2021, 41(12):  3521-3526.  DOI: 10.11772/j.issn.1001-9081.2021060980
    Abstract | HTML | PDF (619KB)

    The K-anonymity algorithm makes data satisfy the K-anonymity condition by generalizing and suppressing the data. It can be seen as a special feature selection method, named K-anonymous feature selection, that considers both data privacy and classification performance: the characteristics of K-anonymity and feature selection are combined, and multiple evaluation criteria are used to select a K-anonymous feature subset. It is difficult for a filter-based K-anonymous feature selection method to search all the candidate feature subsets satisfying the K-anonymity condition, and the classification performance of the obtained feature subset cannot be guaranteed to be optimal, while a wrapper feature selection method incurs a very high computational cost. Therefore, a hybrid K-anonymous feature selection method was designed that combines filter-based feature ranking with wrapper feature selection by improving the forward search strategy of existing methods, using classification performance as the evaluation criterion to select the K-anonymous feature subset with the best classification performance. Experiments were carried out on multiple public datasets, and the results show that the proposed algorithm outperforms existing algorithms in classification performance and has less information loss.
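A minimal check of the K-anonymity condition that such methods enforce can be sketched as follows (a generic illustration; `quasi_ids` names the hypothetical quasi-identifier attributes of each record):

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True if every combination of quasi-identifier values occurring in
    the dataset is shared by at least k records, so no individual can be
    singled out by those attributes alone."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return min(groups.values()) >= k
```

Generalization (e.g. replacing an exact age with a range) enlarges these groups until the condition holds for the chosen k.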

    Extractive and abstractive summarization model based on pointer-generator network
    Wei CHEN, Yan YANG
    2021, 41(12):  3527-3533.  DOI: 10.11772/j.issn.1001-9081.2021060899
    Abstract | HTML | PDF (562KB)

    As a hot issue in natural language processing, summary generation has important research significance. The abstractive method based on the Seq2Seq (Sequence-to-Sequence) model has achieved good results; meanwhile, the extractive method has the potential of mining effective features and extracting important sentences from articles, so improving the abstractive method with extractive techniques is a promising research direction. In view of this, a fusion model of the abstractive and extractive methods was proposed. Firstly, incorporating topic similarity, the TextRank algorithm was used to extract significant sentences from the article. Then, an abstractive framework based on the Seq2Seq model, integrating the semantics of the extracted information, was designed to carry out the summarization task; at the same time, a pointer-generator network was introduced to solve the Out-Of-Vocabulary (OOV) problem. The final summary obtained through the above steps was verified on the CNN/Daily Mail dataset. The results show that the proposed model outperforms the traditional TextRank algorithm on all three indexes ROUGE-1, ROUGE-2 and ROUGE-L, verifying the effectiveness of fusing extractive and abstractive methods for summarization.
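The TextRank extraction step can be sketched as a PageRank power iteration over a sentence-similarity matrix (a generic illustration; the paper additionally mixes topic similarity into the similarity scores before ranking):

```python
import numpy as np

def textrank_scores(sim, d=0.85, n_iter=100):
    """Power-iteration PageRank over a nonnegative sentence-similarity
    matrix `sim` (sim[i, j] = similarity of sentences i and j, zero
    diagonal). Higher-scoring sentences are extracted for the summary."""
    n = sim.shape[0]
    # Column-normalize so each sentence distributes its score to neighbors.
    M = sim / np.maximum(sim.sum(axis=0, keepdims=True), 1e-12)
    scores = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        scores = (1 - d) / n + d * M @ scores  # damped PageRank update
    return scores
```

The top-ranked sentences are then fed, alongside the source article, into the Seq2Seq pointer-generator stage.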

    Event detection without trigger words incorporating syntactic information
    Cui WANG, Yafei ZHANG, Junjun GUO, Shengxiang GAO, Zhengtao YU
    2021, 41(12):  3534-3539.  DOI: 10.11772/j.issn.1001-9081.2021060928
    Abstract | HTML | PDF (697KB)

    Event Detection (ED) is one of the most important tasks in the field of information extraction, aiming to identify instances of specific event types in text. Existing ED methods usually use an adjacency matrix to express syntactic dependencies; however, the adjacency matrix often needs to be encoded with a Graph Convolutional Network (GCN) to obtain syntactic information, which increases the complexity of the model. Therefore, an event detection method without trigger words that incorporates syntactic information was proposed. After converting the dependency parent word and its context into a position marker vector, the word embedding of the dependent child word was incorporated at the source end of the model in a parameter-free manner to strengthen the semantic representation of the context, without needing a GCN for encoding. In addition, since labeling trigger words is time-consuming and laborious, a type perceptron based on the multi-head attention mechanism was designed, which models the potential trigger words in the sentence to accomplish event detection without trigger words. To verify the performance of the proposed method, experiments were conducted on the ACE2005 dataset and a low-resource Vietnamese dataset. Compared with the Event Detection Using Graph Transformer Network (GTN-ED) method, the F1-score of the proposed method was increased by 3.7% on the ACE2005 dataset; compared with the binary classification method Type-aware Bias Neural Network with Attention Mechanisms (TBNNAM), the F1-score was increased by 9% on the Vietnamese dataset. The results show that integrating syntactic information into the Transformer can effectively connect the scattered event information in a sentence and improve the accuracy of event detection.

    Rumor detection model based on user propagation network and message content
    Haitao XUE, Li WANG, Yanjie YANG, Biao LIAN
    2021, 41(12):  3540-3545.  DOI: 10.11772/j.issn.1001-9081.2021060963

    Under the constraints of very short message content on social media platforms, a large number of empty forwards in the propagation structure, and the mismatch between user roles and contents, a rumor detection model based on user attribute information and message content in the propagation network, namely GMB_GMU, was proposed. Firstly, a user propagation network was constructed with user attributes as nodes and propagation chains as edges, and Graph Attention neTwork (GAT) was introduced to obtain an enhanced representation of user attributes; meanwhile, based on this user propagation network, the structural representation of users was obtained by using node2vec and enhanced by using a mutual attention mechanism. In addition, BERT (Bidirectional Encoder Representations from Transformers) was introduced to establish the content representation of the source post. Finally, the Gated Multimodal Unit (GMU) was used to integrate the user attribute representation, structural representation and source post content representation into the final message representation. Experimental results show that the GMB_GMU model achieves an accuracy of 0.952 on publicly available Weibo data and can effectively identify rumor events, significantly outperforming propagation algorithms based on Recurrent Neural Network (RNN) and other neural network benchmark models.
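
    The Gated Multimodal Unit used for the final fusion can be sketched as follows for two modalities (the paper fuses three representations); the random weight matrices below stand in for trained parameters, so this only illustrates the gating mechanism, not the trained model:

```python
import numpy as np

# Two-modality GMU sketch: each modality gets a tanh hidden vector, and a
# sigmoid gate z decides, per dimension, how much of each modality to keep.
def gmu(x1, x2, rng):
    d = x1.shape[0]
    W1 = rng.standard_normal((d, d))        # stand-in for trained weights
    W2 = rng.standard_normal((d, d))
    Wz = rng.standard_normal((d, 2 * d))
    h1 = np.tanh(W1 @ x1)                   # modality-specific hidden vectors
    h2 = np.tanh(W2 @ x2)
    z = 1.0 / (1.0 + np.exp(-(Wz @ np.concatenate([x1, x2]))))  # gate
    return z * h1 + (1.0 - z) * h2          # gated fusion
```

Extending to three modalities replaces the scalar gate pair with a softmax over per-modality gates.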

    Microblog rumor detection model based on heterogeneous graph attention network
    Bei BI, Huiyao PAN, Feng CHEN, Jingyan SUI, Yang GAO, Yaojun WANG
    2021, 41(12):  3546-3550.  DOI: 10.11772/j.issn.1001-9081.2021060981

    Social media greatly facilitates people's daily communication and information dissemination, but it is also a breeding ground for rumors. Therefore, how to automatically monitor rumor dissemination at an early stage is of great practical significance; however, existing detection methods fail to take full advantage of the semantic information of the microblog information propagation graph. To solve this problem, a rumor monitoring model based on Heterogeneous graph Attention Network (HAN), namely MicroBlog-HAN, was built. In the model, a hierarchical attention mechanism including node-level attention and semantic-level attention was adopted. First, the neighbors of microblog nodes were combined by the node-level attention to generate two groups of node embeddings with specific semantics. After that, different semantics were fused by the semantic-level attention to obtain the final node embeddings of the microblog, which were then treated as the classifier's input to perform the binary classification task. In the end, the classification result of whether the input microblog is a rumor or not was given. Experimental results on two real-world microblog rumor datasets convincingly prove that the MicroBlog-HAN model can accurately identify microblog rumors with an accuracy over 87%.

    Vietnamese scene text detection based on modified Mask R-CNN
    Yate FENG, Yimin WEN
    2021, 41(12):  3551-3557.  DOI: 10.11772/j.issn.1001-9081.2021050821

    In view of the lack of training data for Vietnamese scene text detection and the incomplete detection of Vietnamese tone marks, a text detection algorithm for Vietnamese scenes based on a modified instance segmentation method, Mask R-CNN, was proposed. In order to segment Vietnamese scene text with tone marks accurately, only the P2 feature layer was utilized to segment the text area, and the mask matrix size of the text area was adjusted from 14 × 14 to 14 × 28 to adapt to the shape of most texts. Aiming at the problem that duplicate text detection boxes cannot be eliminated by the conventional Non-Maximum Suppression (NMS) algorithm, a filtering module for the text areas, named Text region filtering branch, was designed and added after the detection module to effectively eliminate duplicate detection boxes. A joint training method was used to train the network. The training process consists of two parts: the first part is the training of the Feature Pyramid Network (FPN) and the Region Proposal Network (RPN) of the model, which used large-scale open Latin text data to enhance the generalization ability of the model to detect text in different scenes; the second part is the training of the candidate box coordinate regression module and the segmentation module, named Box branch and Mask branch, which used pixel-level labelled Vietnamese scene text data to enable the model to segment the Vietnamese text area including tone marks. Extensive cross-validation and comparison experiments verify that the proposed algorithm achieves better precision and recall than Mask R-CNN under different Intersection over Union (IoU) thresholds.

    Robot path planning based on B-spline curve and ant colony algorithm
    Erchao LI, Kuankuan QI
    2021, 41(12):  3558-3564.  DOI: 10.11772/j.issn.1001-9081.2021060888

    In view of the problems of the ant colony algorithm in global path planning under a static environment, such as failing to find the shortest path, slow convergence speed, great blindness of path search and many inflection points, an improved ant colony algorithm was proposed. Taking the grid map as the running environment of the robot, the initial pheromones were distributed unevenly, so that the path search tended toward the vicinity of the line between the starting point and the target point; the information of the current node, the next node and the target point was added into the heuristic function, and a dynamic adjustment factor was introduced at the same time, so that the heuristic function provides strong guidance in the early stage and the pheromone guidance is strengthened in the later stage; a pseudo-random transfer strategy was introduced to reduce the blindness of path selection and speed up finding the shortest path; the volatilization coefficient was adjusted dynamically to be larger in the early stage and smaller in the later stage, avoiding premature convergence of the algorithm; based on the optimal solution, a B-spline curve smoothing strategy was introduced to further optimize the optimal solution, resulting in a shorter and smoother path. A sensitivity analysis of the main parameters of the improved algorithm was conducted, the feasibility and effectiveness of each improvement step were tested, and simulations compared with the traditional ant colony algorithm and other improved ant colony algorithms under 20×20 and 50×50 environments were given; the experimental results verify the feasibility, effectiveness and superiority of the improved algorithm.
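
    The pseudo-random transfer strategy mentioned above is commonly implemented as the ACS-style pseudo-random proportional rule sketched below; the parameter values and the helper name `next_node` are illustrative assumptions rather than the paper's exact settings:

```python
import random

# Pseudo-random proportional transfer rule: with probability q0 the ant
# greedily picks the node with the best pheromone/heuristic product
# (exploitation); otherwise it samples by roulette wheel (exploration).
def next_node(candidates, tau, eta, alpha=1.0, beta=2.0, q0=0.9, rng=random):
    weights = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in candidates}
    if rng.random() < q0:                      # exploit: greedy choice
        return max(candidates, key=lambda j: weights[j])
    total = sum(weights.values())              # explore: roulette wheel
    r, acc = rng.random() * total, 0.0
    for j in candidates:
        acc += weights[j]
        if acc >= r:
            return j
    return candidates[-1]
```

Raising q0 over the iterations reduces the blindness of path selection in exactly the sense the abstract describes.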

    Object tracking algorithm based on spatio-temporal context information enhancement
    Jing WEN, Qiang LI
    2021, 41(12):  3565-3570.  DOI: 10.11772/j.issn.1001-9081.2021061034

    Making full use of the spatio-temporal context information in a video can significantly improve the performance of object tracking. However, most current deep learning-based object tracking algorithms only use the feature information of the current frame to locate the object, without using the spatio-temporal context information of the same object in the frames before and after the current frame, which makes the tracked object susceptible to interference from similar objects nearby, so that a potential cumulative error is introduced during tracking and locating. In order to retain spatio-temporal context information, a short-term memory storage pool was introduced based on the SiamMask algorithm to store features of historical frames; meanwhile, an Appearance Saliency Boosting Module (ASBM) was proposed, which not only enhances the saliency features of the tracked object but also suppresses interference from similar objects around it. On this basis, an object tracking algorithm based on spatio-temporal context information enhancement was proposed. To verify the performance of the proposed algorithm, experiments were carried out on four datasets: VOT2016, VOT2018, DAVIS-2016 and DAVIS-2017. Experimental results show that compared with the SiamMask algorithm, the proposed algorithm improves the accuracy and Expected Average Overlap rate (EAO) by 4 and 2 percentage points respectively on the VOT2016 dataset, improves the accuracy, robustness and EAO by 3.7, 2.8 and 1 percentage points respectively on the VOT2018 dataset, reduces the decay of the regional similarity and contour accuracy indicators by 0.2 percentage points each on the DAVIS-2016 dataset, and reduces the decay of the same indicators by 1.3 and 0.9 percentage points respectively on the DAVIS-2017 dataset.

    Unsupervised salient object detection based on graph cut refinement and differentiable clustering
    Xiaoyu LI, Tiyu FANG, Yingjie XIA, Jinping LI
    2021, 41(12):  3571-3577.  DOI: 10.11772/j.issn.1001-9081.2021061054

    Concerning that traditional saliency detection algorithms have low segmentation accuracy and deep learning-based saliency detection algorithms depend heavily on pixel-level manual annotation data, an unsupervised salient object detection algorithm based on graph cut refinement and differentiable clustering was proposed. In the algorithm, a coarse-to-fine idea was adopted to achieve accurate salient object detection by only using the characteristics of a single image. Firstly, the Frequency-tuned algorithm was used to obtain a coarse saliency map according to the color and brightness of the image itself. Then, the candidate regions of the salient object were obtained by binarization according to the image's statistical characteristics combined with the central priority hypothesis. After that, the GrabCut algorithm based on graph cut of a single image was used to segment the salient object finely. Finally, in order to overcome the difficulty of imprecise detection when the background is very similar to the object, an unsupervised differentiable clustering algorithm with good boundary segmentation effect was introduced to further optimize the saliency map. Experimental results show that compared with seven existing algorithms, the optimized saliency map obtained by the proposed algorithm is closer to the ground truth, achieving a Mean Absolute Error (MAE) of 14.3% and 23.4% on the ECSSD and SOD datasets, respectively.
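
    The first stage, Frequency-tuned saliency, can be sketched as the color distance of each smoothed pixel from the mean image color; here a 3×3 mean filter stands in for the Gaussian blur of the original Frequency-tuned method, so this is a simplified illustration:

```python
import numpy as np

# Frequency-tuned saliency sketch: a pixel is salient in proportion to how
# far its (smoothed) color lies from the mean color of the whole image.
def ft_saliency(img):
    img = img.astype(float)                       # H x W x C image
    mean_color = img.reshape(-1, img.shape[2]).mean(axis=0)
    pad = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    smooth = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
                 for i in range(3) for j in range(3)) / 9.0  # 3x3 mean filter
    return np.linalg.norm(smooth - mean_color, axis=2)
```

Binarizing this map against a statistical threshold then yields the candidate regions fed to GrabCut.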

    Spatio-temporal hyper-relationship graph convolutional network for traffic flow forecasting
    Yongkai ZHANG, Zhihao WU, Youfang LIN, Yiji ZHAO
    2021, 41(12):  3578-3584.  DOI: 10.11772/j.issn.1001-9081.2021060956

    Traffic flow forecasting is an important research topic for intelligent transportation systems; however, it is very challenging because of the complex local spatio-temporal relationships among traffic objects such as stations and sensors. Although some previous studies have made great progress by transforming the traffic flow forecasting problem into a spatio-temporal graph forecasting problem, they ignore the direct correlations among traffic objects across spatio-temporal dimensions, and a comprehensive modeling approach for the local spatio-temporal relationships is still lacking. To address this problem, a novel spatio-temporal hypergraph modeling scheme was first proposed, which constructs a kind of spatio-temporal hyper-relationship to comprehensively model the complex local spatio-temporal relationships. Then, a Spatio-Temporal Hyper-Relationship Graph Convolutional Network (STHGCN) forecasting model was proposed to capture these relationships for traffic flow forecasting. Extensive comparative experiments were conducted on four public traffic datasets. Experimental results show that compared with spatio-temporal forecasting models such as Attention based Spatial-Temporal Graph Convolutional Network (ASTGCN) and Spatial-Temporal Synchronous Graph Convolutional Network (STSGCN), STHGCN achieves better results in Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE); the comparison of the running time of different models also shows that STHGCN has higher inference speed.

    Multi-track music generative adversarial network based on Transformer
    Tao WANG, Cong JIN, Xiaobing LI, Yun TIE, Lin QI
    2021, 41(12):  3585-3589.  DOI: 10.11772/j.issn.1001-9081.2021060909

    Symbolic music generation is still an unsolved problem in the field of artificial intelligence and faces many challenges. It has been found that existing methods for generating polyphonic music fail to meet market requirements in terms of melody, rhythm and harmony, and most of the generated music does not conform to basic music theory. In order to solve these problems, a new Transformer-based multi-track music Generative Adversarial Network (Transformer-GAN) was proposed to generate music with high musicality under the guidance of music rules. Firstly, the decoding part of Transformer and the Cross-Track Transformer (CT-Transformer) adapted from Transformer were used to learn the information within a single track and between multiple tracks respectively. Then, a combination of music rules and cross-entropy loss was employed to guide the training of the generative network, and the well-designed objective loss function was optimized while training the discriminative network. Finally, multi-track music works with melody, rhythm and harmony were generated. Experimental results show that compared with other multi-instrument music generation models, for the piano, guitar and bass tracks, Transformer-GAN improves Prediction Accuracy (PA) by at least 12%, 11% and 22%, improves Sequence Similarity (SS) by at least 13%, 6% and 10%, and improves the rest index by at least 8%, 4% and 17%. It can be seen that Transformer-GAN can effectively improve the PA, SS and other indicators of the generated music after adding CT-Transformer and the music rule reward module, leading to a relatively high overall improvement of the generated music.

    Person re-identification method based on grayscale feature enhancement
    Yunpeng GONG, Zhiyong ZENG, Feng YE
    2021, 41(12):  3590-3595.  DOI: 10.11772/j.issn.1001-9081.2021061011

    Whether the learned features remain invariant under significant intra-class changes determines the upper limit of performance of a Person Re-identification (ReID) model. Environmental light, image resolution change, motion blur and other factors may cause color deviation of pedestrian images, and these problems cause the model to overfit the color information of the data, thus limiting its performance. By simulating the color information loss of the data samples and highlighting their structural information, the model was helped to learn more robust features. Specifically, during model training, a training batch was randomly selected according to a set probability; then, for each RGB image sample in the selected batch, a rectangular area of the image or the entire image was randomly selected, and the pixels of the selected area were replaced with the pixels of the same rectangular area in the corresponding grayscale image, thus generating training images with different grayscale areas. Experimental results demonstrate that compared with the benchmark model, the proposed method achieves a significant performance improvement of up to 3.3 percentage points on the mean Average Precision (mAP) evaluation index, and performs well on multiple datasets.
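
    The grayscale-patch replacement described above can be sketched as follows; the fixed half-size patch and the BT.601 luma weights are simplifying assumptions for illustration, not the paper's exact sampling scheme:

```python
import numpy as np

# Grayscale-patch augmentation sketch: replace a random rectangle of an RGB
# image with the same region of its grayscale version, so the model must
# rely on structure rather than color inside that region.
def gray_patch(img, rng, p=1.0):
    out = img.astype(float).copy()
    if rng.random() >= p:                      # apply with probability p
        return out
    h, w, _ = img.shape
    y0, x0 = rng.integers(0, h // 2), rng.integers(0, w // 2)
    y1, x1 = y0 + h // 2, x0 + w // 2          # fixed-size patch for the sketch
    region = out[y0:y1, x0:x1]
    gray = region @ np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma
    out[y0:y1, x0:x1] = gray[..., None]        # replicate luma over channels
    return out
```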

    Cloth-changing person re-identification based on joint loss capsule network
    Qian LIU, Hongyuan WANG, Liang CAO, Boyan SUN, Yu XIAO, Ji ZHANG
    2021, 41(12):  3596-3601.  DOI: 10.11772/j.issn.1001-9081.2021061090

    Current research on Person Re-Identification (Re-ID) mainly concentrates on short-term situations in which a person's clothing usually remains unchanged. However, the more common practical case is the long-term situation, in which a person is more likely to change clothes, and this should be considered by Re-ID models. Therefore, a method of cloth-changing person re-identification based on a joint loss capsule network was proposed. The proposed method was based on ReIDCaps, a capsule network for cloth-changing person re-identification. In the method, vector-neuron capsules that contain more information than traditional scalar neurons were used: the length of a capsule was used to represent the identity of the person, and its direction to represent the clothing information. Soft Embedding Attention (SEA) was used to avoid model over-fitting, and a Feature Sparse Representation (FSR) mechanism was adopted to extract discriminative features. The joint loss of label-smoothing regularized cross-entropy loss and Circle Loss was added to improve the generalization ability and robustness of the model. Experimental results on three datasets, Celeb-reID, Celeb-reID-light and NKUP, prove that the proposed method has certain advantages compared with existing person re-identification methods.

    Smoking behavior detection algorithm based on human skeleton key points
    Wanqing XU, Baodong WANG, Yimei HUANG, Jinping LI
    2021, 41(12):  3602-3607.  DOI: 10.11772/j.issn.1001-9081.2021061063

    In view of the small size of cigarette butts in surveillance videos of public places and the easy dispersion of smoke generated by smoking, it is difficult to determine smoking behavior by a target detection algorithm alone. Considering that pose estimation based on skeleton key points is becoming more and more mature, a smoking behavior detection algorithm was proposed using the relationship between human skeleton key points and smoking behavior. Firstly, AlphaPose and RetinaFace were used to detect the key points of the human skeleton and face respectively. According to the ratio of the distance between the wrist and the midpoint of the two mouth corners to the distance between the wrist and the eye on the same side, a method was proposed to calculate whether the Smoking Action Ratio (SAR) falls within the Golden Ratio of Smoking Actions (GRSA), so as to distinguish smoking from non-smoking behaviors. Then, YOLOv4 was used to detect whether cigarette butts existed in the video. Finally, the GRSA determination and the YOLOv4 results were combined to determine the possibility of smoking behavior in the video and make a final determination. Tests on a self-recorded dataset show that the proposed algorithm can accurately detect smoking behavior, with an accuracy of 92%.
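
    The SAR test can be sketched as a simple key-point distance ratio; the GRSA interval bounds below are illustrative placeholders, not the values determined in the paper:

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D key points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# SAR = d(wrist, mouth midpoint) / d(wrist, same-side eye); a smoking-like
# pose is flagged when SAR falls inside the assumed GRSA interval.
def is_smoking_pose(wrist, mouth_left, mouth_right, eye, grsa=(0.0, 0.8)):
    mouth_mid = ((mouth_left[0] + mouth_right[0]) / 2,
                 (mouth_left[1] + mouth_right[1]) / 2)
    sar = dist(wrist, mouth_mid) / dist(wrist, eye)
    return grsa[0] <= sar <= grsa[1]
```

In the full pipeline this pose check is combined with the YOLOv4 cigarette-butt detection before a smoking event is reported.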

    Prediction method of liver transplantation complications based on transfer component analysis and support vector machine
    Hongliang CAO, Ying ZHANG, Bin WU, Fanyu LI, Xubo NA
    2021, 41(12):  3608-3613.  DOI: 10.11772/j.issn.1001-9081.2021060886

    Many machine learning algorithms cope well with prediction and classification, but they suffer from poor prediction accuracy and F1 score on medical datasets with small samples and large feature spaces. To improve the accuracy and F1 score of liver transplantation complication prediction, a prediction and classification method based on Transfer Component Analysis (TCA) and Support Vector Machine (SVM) was proposed. In this method, TCA was used for mapping and dimension reduction of the feature space, with the source domain and the target domain mapped to the same reproducing kernel Hilbert space, thereby achieving marginal distribution adaptation. The SVM was trained in the source domain after transfer, and the complications were predicted in the target domain after training. In the liver transplantation complication prediction experiments for complication Ⅰ, complication Ⅱ, complication Ⅲa, complication Ⅲb and complication Ⅳ, compared with traditional machine learning and Heterogeneous Domain Adaptation (HDA), the accuracy of the proposed method was improved by 7.8% to 42.8%, and the F1 score reached 85.0% to 99.0%, while traditional machine learning and HDA achieved high accuracy but low recall due to the imbalance of positive and negative samples. Experimental results show that TCA combined with SVM can effectively improve the accuracy and F1 score of liver transplantation complication prediction.
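
    The TCA step can be sketched with a linear kernel as the standard generalized-eigenproblem formulation below; this is a generic TCA sketch under simplifying assumptions (linear kernel, fixed trade-off mu), not the paper's exact configuration:

```python
import numpy as np

# Linear-kernel TCA sketch: find a projection W that minimizes the MMD
# between source and target in the embedded space (tr(W'KLKW)) while
# preserving variance, via the eigenvectors of (KLK + mu*I)^{-1} KHK.
def tca(Xs, Xt, dim=2, mu=1.0):
    X = np.vstack([Xs, Xt])
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    K = X @ X.T                                   # linear kernel matrix
    e = np.vstack([np.full((ns, 1), 1.0 / ns),
                   np.full((nt, 1), -1.0 / nt)])
    L = e @ e.T                                   # MMD coefficient matrix
    H = np.eye(n) - np.full((n, n), 1.0 / n)      # centering matrix
    A = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(A)
    W = np.real(vecs[:, np.argsort(-np.real(vals))[:dim]])
    Z = K @ W                                     # embedded samples
    return Z[:ns], Z[ns:]
```

An SVM trained on the embedded source samples is then applied directly to the embedded target samples.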

    Prediction model of lncRNA-encoded short peptides based on representation learning and deep forest
    Tengqi JI, Jun MENG, Siyuan ZHAO, Hehuan HU
    2021, 41(12):  3614-3619.  DOI: 10.11772/j.issn.1001-9081.2021061082

    Small Open Reading Frames (sORFs) in long non-coding RNA (lncRNA) can encode short peptides with a length of no more than 100 amino acids. Aiming at the problems that the features of sORFs in lncRNA are not distinct and highly reliable data are insufficient in short peptide prediction research, a Deep Forest (DF) model based on representation learning was proposed. Firstly, a conventional lncRNA feature extraction method was used to encode the sORFs. Secondly, an AutoEncoder (AE) was used to perform representation learning and obtain a highly efficient representation of the input data. Finally, a DF model was trained to predict the short peptides encoded by lncRNA. Experimental results show that the accuracy of this model reaches 92.08% on the Arabidopsis thaliana dataset, which is higher than those of traditional machine learning models, deep learning models and combined models, and the model has better stability. In addition, the prediction accuracy of this method reaches 78.16% and 74.92% on the Glycine max and Zea mays datasets respectively, verifying the good generalization ability of the proposed model.

    IIoT hidden anomaly detection based on locality sensitive Bloom filter
    Ruliang XIAO, Zhixia ZENG, Chenkai XIAO, Shi ZHANG
    2021, 41(12):  3620-3625.  DOI: 10.11772/j.issn.1001-9081.2021061115

    Damage to sensors in an Industrial Internet of Things (IIoT) system due to continuous use and normal wear leads to hidden anomalies in the collected and recorded sensing data. To solve this problem, an anomaly detection algorithm based on a Locality Sensitive Bloom Filter (LSBF) model, namely LSBFAD, was proposed. Firstly, the Spatial Partition based Fast Johnson-Lindenstrauss Transform (SP-FJLT) was used to hash-map the data, then the Mutual Competition (MC) strategy was used to reduce noise, and finally the Bloom filter was constructed by 0-1 coding. In simulation experiments conducted on three benchmark datasets, SIFT, MNIST and FMA, the false detection rate of the LSBFAD algorithm is less than 10%. Experimental results show that compared with current mainstream anomaly detection algorithms, the proposed LSBF-based anomaly detection algorithm has a higher Detection Rate (DR) and lower False Alarm Rate (FAR) and can be effectively applied to anomaly detection of IIoT data.
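
    A locality sensitive Bloom filter can be sketched with random-hyperplane LSH signatures indexing the bit array, so that near-duplicate normal readings hit the same bits while unseen readings miss them; this generic sketch omits the paper's SP-FJLT mapping and MC denoising:

```python
import numpy as np

# Locality sensitive Bloom filter sketch: each of several LSH tables hashes
# a vector to a bucket via the sign pattern of random projections; a query
# is anomalous if any table has never seen its bucket.
class LSBF:
    def __init__(self, dim, n_planes=8, n_tables=4, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_tables, n_planes, dim))
        self.bits = np.zeros((n_tables, 2 ** n_planes), dtype=bool)

    def _keys(self, x):
        signs = (self.planes @ x) > 0            # (tables, planes) sign bits
        return signs.astype(int) @ (1 << np.arange(signs.shape[1]))

    def add(self, x):
        for t, k in enumerate(self._keys(x)):
            self.bits[t, k] = True

    def is_anomaly(self, x):
        return not all(self.bits[t, k] for t, k in enumerate(self._keys(x)))
```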

    Transfer learning based on graph convolutional network in bearing service fault diagnosis
    Xueying PENG, Yongquan JIANG, Yan YANG
    2021, 41(12):  3626-3631.  DOI: 10.11772/j.issn.1001-9081.2021060974

    Deep learning methods are widely used in bearing fault diagnosis, but in actual engineering applications, real fault data collected during bearing service are scarce and lack labels, making adequate training difficult. Focusing on this difficulty, a transfer learning model based on Graph Convolutional Network (GCN) for bearing service fault diagnosis was proposed. In the model, fault knowledge was learned from artificially simulated damage fault data, which are sufficient, and transferred to real service faults, so as to improve the diagnostic accuracy of service faults. Specifically, the original vibration signals of artificially simulated damage fault data and service fault data were converted by wavelet transform into time-frequency maps containing both time and frequency information, and the obtained maps were input into graph convolutional layers for learning, so as to effectively extract fault feature representations in the source and target domains. Then the Wasserstein distance between the data distributions of the source domain and the target domain was calculated to measure the difference between the two distributions, and a fault diagnosis model able to diagnose bearing service faults was constructed by minimizing this difference. A variety of tasks were designed for experiments with different bearing fault datasets and different operating conditions. Experimental results show that the proposed model is able to diagnose bearing service faults, can be transferred from one working condition to another, and can perform fault diagnosis between different component types and different working conditions.
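
    For equal-size one-dimensional samples, the Wasserstein-1 distance used above to measure the gap between source- and target-domain feature distributions reduces to the mean absolute difference of the sorted values; a minimal sketch of that special case:

```python
# 1-D Wasserstein-1 distance between two equal-size samples: sort both and
# average the element-wise absolute differences (optimal transport pairs
# the i-th smallest values with each other).
def wasserstein_1d(u, v):
    assert len(u) == len(v), "sketch assumes equal-size samples"
    u, v = sorted(u), sorted(v)
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)
```

Minimizing this quantity over the learned features is what draws the two domains together during training.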

    Stock index forecasting method based on corporate financial statement data
    Jihou WANG, Peiguang LIN, Jiaqian ZHOU, Qingtao LI, Yan ZHANG, Muwei JIAN
    2021, 41(12):  3632-3636.  DOI: 10.11772/j.issn.1001-9081.2021061006

    All market activities of stock market participants combine to affect stock market changes, making stock market volatility fraught with complexity and making accurate prediction of stock prices a challenge. Among the activities that affect stock market changes, financial disclosure is an attractive and potentially financially rewarding means of predicting stock index changes. In order to deal with the complex changes in the stock market, a stock index prediction method incorporating data from financial statements disclosed by corporations was proposed. Firstly, the stock index historical data and corporate financial statement data were preprocessed, with the main task being dimension reduction of the high-dimensional matrix generated from the corporate financial statement data; then, a dual-channel Long Short-Term Memory (LSTM) network was used to forecast the normalized data. Experimental results on the SSE 50 and CSI 300 Index datasets show that the prediction effect of the proposed method is better than that of using only the historical data of stock indexes.

    Artificial intelligence
    Chinese character relation extraction model based on pre-training and multi-level information
    Bowen YAO, Biqing ZENG, Jian CAI, Meirong DING
    2021, 41(12):  3637-3644.  DOI: 10.11772/j.issn.1001-9081.2021010090

    The relation extraction task aims to extract the relationships between entity pairs from text, and is one of the hot directions in the field of Natural Language Processing (NLP). In view of the problems that the grammatical structure of Chinese character relation extraction corpora is complex and the semantic features of the text cannot be learned effectively, a Chinese Character Relation Extraction model based on Pre-training and Multi-level Information (CCREPMI) was proposed. Firstly, word vectors were generated by using the powerful semantic representation ability of the pre-trained model. Then, the original sentence was divided into the sentence level, entity level and entity-adjacent level for feature extraction. Finally, relation classification and prediction were performed by fusing the sentence structure features, entity meanings, and the dependencies between entities and adjacent words. Experimental results on a Chinese character relationship dataset show that the proposed model achieves a precision of 81.5%, a recall of 82.3% and an F1 value of 81.9%, an improvement over baseline models such as BERT (Bidirectional Encoder Representations from Transformers) and BERT-LSTM (BERT-Long Short-Term Memory). Moreover, the F1 score of the model on the SemEval2010-task8 English dataset reaches 81.2%, indicating its ability to generalize to English corpora.

    Improved subspace clustering model based on spectral clustering
    Ran GAO, Huazhu CHEN
    2021, 41(12):  3645-3651.  DOI: 10.11772/j.issn.1001-9081.2021010081

    The purpose of subspace clustering is to segment data from different subspaces into the low-dimensional subspaces which the data essentially belong to. Existing methods based on data self-representation and spectral clustering divide this problem into two consecutive stages: first, the affinity matrix of the data is learned from the high-dimensional data; then, the cluster membership of the data is inferred by applying spectral clustering to the learned affinity matrix. A new data-adaptive sparse regularization term was defined and combined with the Structural Sparse Subspace Clustering (SSSC) model and the improved Sparse Spectral Clustering (SSpeC) model, and a new unified optimization model was proposed. In the new model, by using the mutual guidance of data similarity and clustering indicators, the blindness of the SSpeC sparsity penalty was overcome and the similarity was made discriminative, which is conducive to dividing data from different subspaces into different classes, and the defect that the SSSC model only forces data from the same subspace to have the same labels was remedied. Experimental results on common datasets show that the proposed model enhances the clustering discrimination ability and is superior to some classical two-stage methods and the SSSC model.

    Advanced computing
    Service composition optimization based on improved krill herd algorithm
    Shuicong LIAO, Peng SUN, Xingchen LIU, Yun ZHONG
    2021, 41(12):  3652-3657.  DOI: 10.11772/j.issn.1001-9081.2021040699
    Abstract ( )   HTML ( )   PDF (703KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In Service Oriented Architecture (SOA), service composition optimization tends to fall into local optima and incurs high time cost. To address these problems, an improved Krill Herd algorithm (PRKH) with adaptive crossover and a random perturbation operator was proposed. Firstly, a service composition optimization model was established based on Quality of Service (QoS), and the QoS calculation formulas and normalization methods under different composition structures were given. Then, based on the Krill Herd (KH) algorithm, an adaptive crossover probability and a random perturbation based on the actual offset were added to strike a good balance between the global and local search abilities of the krill herd. Finally, the proposed algorithm was compared in simulation with the KH algorithm, Particle Swarm Optimization (PSO) algorithm, Artificial Bee Colony (ABC) algorithm and Flower Pollination Algorithm (FPA). Experimental results show that the PRKH algorithm finds composite services with better QoS faster.
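    The QoS aggregation and normalization step can be sketched for a simple sequential structure: response times and costs add up, reliabilities multiply, and min-max normalization makes cost and benefit attributes comparable. The attribute names and the sequential-only aggregation are illustrative assumptions; the paper covers further composition structures.

```python
def normalize(value, vmin, vmax, benefit=True):
    """Min-max normalization of a QoS attribute. Benefit attributes
    (e.g. reliability) keep their direction; cost attributes (e.g.
    response time, price) are inverted so larger is always better."""
    if vmax == vmin:
        return 1.0
    score = (value - vmin) / (vmax - vmin)
    return score if benefit else 1.0 - score

def sequential_qos(services):
    """Aggregate QoS along a sequential structure: times and costs
    add up, reliabilities multiply."""
    time = sum(s["time"] for s in services)
    cost = sum(s["cost"] for s in services)
    rel = 1.0
    for s in services:
        rel *= s["reliability"]
    return {"time": time, "cost": cost, "reliability": rel}
```

A weighted sum of the normalized attributes then gives the scalar fitness an optimizer such as PRKH would maximize.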

    Multi-objective optimization based on dynamic mixed flow entry timeouts in software defined network
    Xiaohang MA, Lingxia LIAO, Zhi LI, Bin QIN, Han-chieh CHAO
    2021, 41(12):  3658-3665.  DOI: 10.11772/j.issn.1001-9081.2021010079
    Abstract ( )   HTML ( )   PDF (1321KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In Software Defined Network (SDN), flow entries are forwarding rules generated by controllers to guide switches in processing data packets. Every flow entry is stored in switch memory and has a timeout, which affects the bandwidth cost of the SDN control channel, the memory consumption of switches, and the system's resource management and performance. As most existing SDN performance optimization schemes target only a single objective and do not consider the impact of the type and duration of flow entry timeouts, a multi-objective optimization scheme based on dynamic mixed flow entry timeouts was proposed to simultaneously optimize three objectives: the detection of elephant flows, the memory consumption of flow entries in switches, and the control channel bandwidth occupation. In the dynamic mixed timeout scheme, the two flow entry timeout mechanisms, hard timeout and idle timeout, were combined, and the timeout type and duration of flow entries were adjusted in a two-dimensional dynamic way. The NSGA-Ⅱ algorithm was used to solve the proposed optimization problem and to evaluate the impact of different timeout mechanisms and durations on the three optimization objectives. The solution set for specific timeouts was combined with the solution set of a Bayesian multi-objective optimization algorithm to improve the quality of the solution set. The results show that the proposed scheme provides higher detection accuracy, lower bandwidth occupation, and smaller switch memory consumption, significantly improving the overall performance of SDNs.
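    The difference between the two timeout mechanisms can be sketched by computing when a single flow entry is evicted under an idle timeout (refreshed by every matching packet), a hard timeout (fixed from installation), or both at once. This is an illustrative simplification; the paper's two-dimensional dynamic adjustment of type and duration is not modeled here.

```python
def eviction_time(arrivals, idle=None, hard=None):
    """Return the time at which a flow entry installed at arrivals[0]
    is evicted. idle: evict after this long with no matching packet;
    hard: evict this long after installation, regardless of traffic.
    With both set, whichever fires first wins (a mixed timeout)."""
    install = arrivals[0]
    hard_expiry = install + hard if hard is not None else float("inf")
    if idle is None:
        return hard_expiry
    expiry = install + idle
    for t in arrivals[1:]:
        if t >= expiry:        # the gap exceeded the idle timeout
            break
        expiry = t + idle      # each matching packet refreshes the timer
    return min(expiry, hard_expiry)
```

A long-lived elephant flow keeps refreshing an idle timer (high memory cost), while a hard timeout bounds the entry's lifetime at the price of possible re-installation traffic on the control channel.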

    Multimedia computing and computer simulation
    Single image super-resolution reconstruction method based on dense Inception
    Haiyong WANG, Kaixin ZHANG, Weizheng GUAN
    2021, 41(12):  3666-3671.  DOI: 10.11772/j.issn.1001-9081.2021010070
    Abstract ( )   HTML ( )   PDF (740KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In recent years, single image Super-Resolution (SR) reconstruction methods based on Convolutional Neural Network (CNN) have become mainstream. In general, the deeper the reconstruction model, the more features it extracts and the better the reconstruction effect. However, as the number of network layers increases, the reconstruction model not only suffers from the vanishing gradient problem, but also has a significantly increased number of parameters and training difficulty. To solve these problems, a single image SR reconstruction method based on dense Inception was proposed. In this method, image features were extracted by introducing the Inception-Residual Network (Inception-ResNet) structure, and a simplified dense network was adopted globally: only the path from each module's output to the reconstruction layer was constructed, avoiding the extra computation caused by generating redundant data. With a magnification factor of 4, model performance was tested on the Set5 dataset. The results show that the Structural SIMilarity (SSIM) of the proposed model is 0.013 6 higher than that of accurate image Super-Resolution using Very Deep convolutional network (VDSR), and that the proposed method has an SSIM 0.002 9 higher and 78% fewer model parameters than Multi-scale residual Network for Image Super-Resolution (MSRN). The experimental results show that, while ensuring the depth and width of the model, the proposed method significantly reduces the number of parameters and the training difficulty, and achieves better Peak Signal-to-Noise Ratio (PSNR) and SSIM than the comparison methods.
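    The PSNR metric reported above can be computed as follows for images given as flat intensity lists in [0, 1]; this is the standard definition, not code from the paper.

```python
import math

def psnr(img_a, img_b, peak=1.0):
    """Peak Signal-to-Noise Ratio between two equal-sized images,
    given as flat lists of pixel intensities in [0, peak]."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

For example, a uniform error of 0.1 per pixel gives an MSE of 0.01 and hence a PSNR of 20 dB.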

    Underwater image enhancement algorithm based on artificial under-exposure fusion and white-balancing technique
    Ye TAO, Wenhai XU, Luqiang XU, Fucheng GUO, Haibo PU, Guangtong CHEN
    2021, 41(12):  3672-3679.  DOI: 10.11772/j.issn.1001-9081.2021010065
    Abstract ( )   HTML ( )   PDF (2675KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Acquiring clear and accurate underwater images is an important prerequisite for exploring the underwater world. However, compared with regular images, underwater images often suffer from low contrast, detail loss and color distortion, resulting in poor visual quality. To solve these problems, a new underwater image enhancement algorithm based on Artificial Under-exposure Fusion and White-Balancing technique (AUF+WB) was proposed. Firstly, Gamma correction was applied to the original underwater image to generate 5 corresponding under-exposed images. Then, contrast, saturation and well-exposedness were employed as fusion weights, and the multi-scale fusion method was used to generate the fused image. Finally, images compensated on different color channels were each combined with the Gray-World white balance assumption to generate the corresponding white-balanced images, and the obtained white-balanced images were evaluated with the Underwater Color Image Quality Evaluation (UCIQE) and the Underwater Image Quality Measure (UIQM). Selecting different types of underwater images as experimental samples, the proposed AUF+WB algorithm was compared with existing state-of-the-art underwater image defogging algorithms. The results show that the proposed AUF+WB algorithm outperforms the comparison algorithms in both qualitative and quantitative analysis of image quality, and can effectively improve the visual quality of underwater images by removing color distortion, enhancing contrast, and recovering detail.
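    Two building blocks of this pipeline can be sketched directly: Gamma correction with gamma > 1 darkens intensities in [0, 1] (an artificial under-exposure), and the Gray-World assumption rescales each channel so its mean matches the overall mean. The flat pixel-list representation is an illustrative simplification, not the paper's implementation.

```python
def gamma_correct(pixels, gamma):
    """pixels: list of (r, g, b) tuples with channels in [0, 1].
    gamma > 1 darkens the image, yielding an artificial under-exposure."""
    return [tuple(c ** gamma for c in px) for px in pixels]

def gray_world_balance(pixels):
    """Gray-World assumption: the scene average should be achromatic,
    so rescale each channel until its mean equals the global mean.
    (Outputs may exceed 1.0 and would be clipped in practice.)"""
    n = len(pixels)
    means = [sum(px[ch] for px in pixels) / n for ch in range(3)]
    gray = sum(means) / 3.0
    scale = [gray / m if m > 0 else 1.0 for m in means]
    return [tuple(px[ch] * scale[ch] for ch in range(3)) for px in pixels]
```

In the full algorithm, several gamma values produce the under-exposure stack, which is then blended with contrast/saturation/well-exposedness weights before white balancing.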

    Real-time binocular foreground depth estimation algorithm based on sparse convolution
    Zhehan QIU, Yang LI
    2021, 41(12):  3680-3685.  DOI: 10.11772/j.issn.1001-9081.2021010076
    Abstract ( )   HTML ( )   PDF (1709KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    To improve the computational efficiency of stereo matching for foreground disparity estimation, and to address the input redundancy of general networks, which take the complete binocular image as input even though the foreground occupies only a small proportion of the scene, a real-time target stereo matching algorithm based on sparse convolution was proposed. To realize and improve sparse foreground disparity estimation, firstly, the sparse foreground mask and scene semantic features were obtained simultaneously by a segmentation algorithm. Secondly, sparse convolution was used to extract the spatial features of the sparse foreground region, and the scene semantic features were fused with them. Then, the fused features were fed into the decoding module for disparity regression. Finally, the ground-truth foreground disparity map was used to compute the loss for generating the disparity map. Test results on the ApolloScape dataset show that the accuracy and real-time performance of the proposed algorithm are better than those of the state-of-the-art algorithms PSMNet (Pyramid Stereo Matching Network) and GANet (Guided Aggregation Network), with a single run time as low as 60.5 ms. In addition, the proposed algorithm has certain robustness to foreground occlusion and can be used for real-time depth estimation of targets.
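    The idea behind sparsity-aware convolution can be sketched in one dimension: only entries marked valid by the mask contribute, and the result is renormalized by the kernel weight that actually landed on valid inputs, so missing regions neither pollute nor dilute the output. This is a generic sketch of the technique, not the layer used in the paper.

```python
def sparse_conv1d(signal, mask, kernel):
    """Convolve only over valid entries (mask[i] == 1), renormalizing by
    the kernel weight that covered valid inputs, so invalid (background)
    values are ignored instead of dragging the output toward zero."""
    k = len(kernel)
    r = k // 2
    out = []
    for i in range(len(signal)):
        acc, wsum = 0.0, 0.0
        for j in range(k):
            idx = i + j - r
            if 0 <= idx < len(signal) and mask[idx]:
                acc += kernel[j] * signal[idx]
                wsum += kernel[j]
        out.append(acc / wsum if wsum > 0 else 0.0)
    return out
```

Note how a masked-out spike in the input leaves the output unaffected, which is exactly what a foreground mask buys over dense convolution.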

    Single image dehazing based on conditional generative adversarial network with enhanced generator
    Yang ZHAO, Bo LI
    2021, 41(12):  3686-3691.  DOI: 10.11772/j.issn.1001-9081.2021010092
    Abstract ( )   HTML ( )   PDF (947KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Particles such as smoke in the atmosphere reduce the visibility of scenes captured by the naked eye. Most traditional dehazing methods estimate the transmission and atmospheric light of the hazy scene and restore the haze-free image via the atmospheric scattering model. Although these methods have made significant progress, their over-reliance on strict prior conditions makes the dehazing effect unsatisfactory when those priors do not hold. Therefore, an end-to-end integrated dehazing network was proposed, in which a Conditional Generative Adversarial Network (CGAN) with an enhanced generator was used to directly restore the haze-free image. On the generator side, U-Net was used as the basic structure, and a simple and effective enhanced decoder was built through an “integration-enhance-subtraction” promotion strategy to strengthen feature recovery in the decoder. In addition, the Multi-Scale Structural SIMilarity (MS-SSIM) loss function was added to enhance the restoration of edge details. In experiments on synthetic and real datasets, the model significantly outperformed traditional dehazing models such as Dark Channel Prior (DCP), All-in-One Dehazing Network (AOD-Net), Progressive Feature Fusion Network (PFFNet) and Conditional Wasserstein Generative Adversarial Network (CWGAN) in Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM). Experimental results show that, compared with the comparison algorithms, the proposed network recovers haze-free images closer to the ground truth, with a better dehazing effect.
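    The atmospheric scattering model underlying the traditional methods is I(x) = J(x)t(x) + A(1 - t(x)); given estimates of the transmission t and atmospheric light A, the haze-free image J is recovered by inverting it. A minimal per-pixel sketch (single channel, flat lists; the t0 floor is the usual stabilization, not a detail from this paper):

```python
def add_haze(J, t, A):
    """Atmospheric scattering model: I = J*t + A*(1 - t)."""
    return [j * ti + A * (1.0 - ti) for j, ti in zip(J, t)]

def dehaze(I, t, A, t0=0.1):
    """Invert the model: J = (I - A) / max(t, t0) + A.
    t0 bounds the transmission away from zero for numerical stability."""
    return [(i - A) / max(ti, t0) + A for i, ti in zip(I, t)]
```

End-to-end networks like the proposed CGAN skip the explicit estimation of t and A entirely and map I to J directly.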

    Frontier and comprehensive applications
    Dynamic testing resource allocation algorithm based on software architecture and generalized differential evolution
    Zhisheng SHAO, Guofu ZHANG, Zhaopin SU, Lei LI
    2021, 41(12):  3692-3701.  DOI: 10.11772/j.issn.1001-9081.2021010095
    Abstract ( )   HTML ( )   PDF (717KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Testing resource allocation is one of the basic problems in software testing. However, most existing studies focus on parallel-series modular software models and rarely consider architecture-based software models. To this end, firstly, for a test environment in which reliability and error number change dynamically, a multi-stage, multi-objective testing resource allocation model was constructed based on the software architecture. Then, a multi-stage, multi-objective testing resource allocation algorithm for dynamic reliability and error number was designed on the basis of parameter re-estimation, population re-initialization, generalized differential evolution, and weighted normalized sum. Finally, in simulation experiments, compared with the existing Multi-Objective Differential Evolution based on Weighted Normalized Sum (WNS-MODE) algorithm, the proposed algorithm obtained better solution sets on architecture-based software model instances with different structures: the capacity values increased by about 16 times, the coverage values increased by about 84 percentage points, and the hypervolume values increased by about 6 times. Experimental results demonstrate that the proposed algorithm adapts better to dynamic changes of reliability and error number, and can provide more and better testing resource allocation schemes for the dynamic testing of architecture-based software models.
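    The differential-evolution core can be sketched as a single-objective DE/rand/1/bin generation; the generalized (multi-objective) variant used in the paper extends the selection step with Pareto dominance, which is not modeled in this illustrative sketch.

```python
import random

def de_step(pop, fitness, f=0.5, cr=0.9, rng=random):
    """One generation of DE/rand/1/bin minimizing `fitness`.
    pop: list of real-valued vectors (at least 4 individuals)."""
    dim = len(pop[0])
    new_pop = []
    for i, target in enumerate(pop):
        # mutation: v = a + F * (b - c) from three distinct others
        a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
        j_rand = rng.randrange(dim)  # guarantee at least one mutated gene
        trial = [a[d] + f * (b[d] - c[d])
                 if rng.random() < cr or d == j_rand else target[d]
                 for d in range(dim)]
        # greedy selection: keep whichever is better
        new_pop.append(trial if fitness(trial) <= fitness(target) else target)
    return new_pop
```

Because selection is greedy per individual, the best fitness in the population can never get worse from one generation to the next.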

    B-cell epitope prediction model with overlapping subgraph mining based on L-Metric
    Chuang GAO, Mian TANG, Liang ZHAO
    2021, 41(12):  3702-3706.  DOI: 10.11772/j.issn.1001-9081.2021010017
    Abstract ( )   HTML ( )   PDF (499KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Existing epitope prediction methods perform poorly on the prediction of overlapping antigen epitopes. To solve this problem, a novel epitope prediction model with an overlapping subgraph mining algorithm based on the Local Metric (L-Metric) was proposed. Firstly, an atom graph was constructed from the surface atoms of the antigen and then upgraded to an amino acid residue graph. Then, the amino acid residue graph was divided into non-overlapping seed subgraphs by an information-flow-based graph partitioning algorithm, and these seed subgraphs were expanded into overlapping subgraphs by the L-Metric based overlapping subgraph mining algorithm. Finally, the expanded subgraphs were classified into epitopes and non-epitopes by a classification model built on a Graph Convolutional Network (GCN) and a Fully Connected Network (FCN). Experimental results show that the F1 score of the proposed model is 267.3%, 57.0%, 65.4% and 3.5% higher than those of existing epitope prediction models, namely Discontinuous epiTope prediction 2 (DiscoTope 2), Ellipsoid and Protrusion (ElliPro), Epitope Prediction server (EpiPred) and overlapping Graph cLustering-based B-cell epitope predictor (Glep), respectively, on the same dataset. Meanwhile, ablation results show that the proposed overlapping subgraph mining algorithm improves prediction performance effectively: the model with it achieves an F1 score 19.2% higher than the model without it.

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
Address:
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
  028-85222239-803
Website: www.joca.cn
E-mail: bjb@joca.cn