Table of Contents

    10 August 2023, Volume 43 Issue 8
    The 19th International Conference on Web Information Systems and Applications (WISA 2022)
    Maximal clique searching algorithm for hypergraphs
    Lantian XU, Ronghua LI, Yongheng DAI, Guoren WANG
    2023, 43(8):  2319-2324.  DOI: 10.11772/j.issn.1001-9081.2022091334

    Most entity relationships in the real world cannot be represented by simple binary relations, whereas hypergraphs can represent n-ary relations among entities well. Therefore, definitions of hypergraph clique and maximal clique were proposed, and an exact algorithm and an approximation algorithm for searching hypergraph maximal cliques were given. First, the reason why the existing maximal clique searching algorithms on ordinary graphs cannot be applied to hypergraphs directly was analyzed. Then, based on the characteristics of hypergraphs and the definition of maximal clique, a novel data structure for preserving the adjacency relations among hyperpoints was proposed, and an exact maximal clique searching algorithm on hypergraphs was proposed. As the exact algorithm runs slowly, the pivot-based pruning idea was incorporated to reduce the number of recursive layers, and an approximation maximal clique searching algorithm on hypergraphs was proposed. Experimental results on multiple real hypergraph datasets show that, on the premise of finding most maximal cliques, the proposed approximation algorithm improves the search speed. When the number of test hypergraph cliques on a 3-uniform hypergraph is 22, the acceleration ratio reaches over 1000.
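
    The pivot pruning idea mentioned above comes from maximal clique enumeration on ordinary graphs. As a point of reference only, the following is a minimal sketch of the classic Bron-Kerbosch algorithm with pivoting on an ordinary graph; it is not the hypergraph algorithm of the paper, and the function name and the example graph are assumptions made for illustration.

```python
from typing import Dict, List, Set


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[Set[int]]:
    """Enumerate all maximal cliques of an ordinary (2-uniform) graph."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates, x: already explored vertices
        if not p and not x:
            cliques.append(set(r))
            return
        # Pivot: the vertex of P | X with the most neighbours in P.
        # Neighbours of the pivot are skipped, which prunes branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques


# Hypothetical example: a triangle {1, 2, 3} with a pendant vertex 4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
found = sorted(sorted(c) for c in maximal_cliques(graph))
print(found)
```

    Skipping the neighbours of the pivot is what reduces the recursion; every maximal clique must contain either the pivot or one of its non-neighbours, so no clique is lost on an ordinary graph.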

    Few-shot object detection algorithm based on Siamese network
    Junjian JIANG, Dawei LIU, Yifan LIU, Yougui REN, Zhibin ZHAO
    2023, 43(8):  2325-2329.  DOI: 10.11772/j.issn.1001-9081.2022121865

    Deep learning based algorithms such as YOLO (You Only Look Once) and Faster Region-Convolutional Neural Network (Faster R-CNN) require a huge amount of training data to ensure the precision of the model, but in many scenarios data is difficult to obtain and costly to label. Moreover, the lack of massive training data limits the detection range. To address these problems, a few-shot object Detection algorithm based on Siamese Network was proposed, namely SiamDet, with the purpose of training an object detection model with certain generalization ability by using a few annotated images. Firstly, a Siamese network based on depthwise separable convolution was proposed, and a feature extraction network ResNet-DW was designed to solve the overfitting problem caused by insufficient samples. Secondly, an object detection algorithm SiamDet was proposed based on the Siamese network, and on top of ResNet-DW, a Region Proposal Network (RPN) was introduced to locate the objects of interest. Thirdly, binary cross entropy loss was introduced for training, and a contrast training strategy was used to increase the distinction among categories. Experimental results show that SiamDet has good object detection ability for few-shot objects, and SiamDet improves AP50 by 4.1% on MS-COCO 20-way 2-shot and 2.6% on PASCAL VOC 5-way 5-shot compared with the suboptimal algorithm DeFRCN (Decoupled Faster R-CNN).

    Spatial-temporal co-occurrence pattern mining algorithm for video data
    Xiaoyu ZHANG, Ziqiang YU, Chengdong LIU, Bohan LI, Changfeng JING
    2023, 43(8):  2330-2337.  DOI: 10.11772/j.issn.1001-9081.2022101566

    Spatial-temporal co-occurrence patterns refer to the video object combinations with spatial-temporal correlations. In order to mine the spatial-temporal co-occurrence patterns meeting the query conditions from a huge volume of video data quickly, a spatial-temporal co-occurrence pattern mining algorithm with a triple-pruning matching strategy, the Multi-Pruning Algorithm (MPA), was proposed. Firstly, the video objects were extracted in a structured way by the existing video object detection and tracking models. Secondly, the repeatedly occurring video objects extracted from a sequence of frames were stored and compressed, and an index of the objects was created. Finally, a spatial-temporal co-occurrence pattern mining algorithm based on the prefix tree was proposed to discover the spatial-temporal co-occurrence patterns that meet query conditions. Experimental results on real and synthetic datasets show that the proposed algorithm improves the efficiency by about 30% compared with Brute Force Algorithm (BFA), and the greater the data volume, the more obvious the efficiency improvement. Therefore, the proposed algorithm can discover the spatial-temporal co-occurrence patterns satisfying the query conditions from a large volume of video data quickly.

    Attribute network representation learning with dual auto-encoder
    Jinghong WANG, Zhixia ZHOU, Hui WANG, Haokang LI
    2023, 43(8):  2338-2344.  DOI: 10.11772/j.issn.1001-9081.2022091337

    On the premise of ensuring the properties of nodes in the network, the purpose of attribute network representation learning is to learn the low-dimensional dense vector representation of nodes by combining structure and attribute information. In the existing attribute network representation learning methods, the learning of attribute information in the network is ignored, and the interaction of attribute information with the network topology is insufficient, so that the network structure and attribute information cannot be fused efficiently. In response to the above problems, a Dual auto-Encoder Network Representation Learning (DENRL) algorithm was proposed. Firstly, the high-order neighborhood information of nodes was captured through a multi-hop attention mechanism. Secondly, a low-pass Laplacian filter was designed to remove the high-frequency signals and iteratively obtain the attribute information of important neighbor nodes. Finally, an adaptive fusion module was constructed to increase the acquisition of important information through the consistency and difference constraints of the two kinds of information, and the encoder was trained by supervising the joint reconstruction loss function of the two auto-encoders. Experimental results on Cora, Citeseer, Pubmed and Wiki datasets show that, compared with DeepWalk, ANRL (Attributed Network Representation Learning) and other algorithms, DENRL has the highest clustering accuracy and the lowest running time on the three citation network datasets, with these two indicators reaching 0.775 and 0.4602 s respectively on the Cora dataset, and has the highest link prediction precision on the Cora and Citeseer datasets, reaching 0.961 and 0.970 respectively. It can be seen that the fusion and interactive learning of attribute and structure information can obtain stronger node representation capability.
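
    To illustrate what a low-pass Laplacian filter does to node attributes, the following is a minimal sketch that assumes the common form H = I - k L with the random-walk normalized Laplacian L = I - D^{-1}A. The abstract does not give the exact filter used by DENRL, so the form of the filter, the parameter `k`, and the function name are all assumptions.

```python
from typing import Dict, List


def low_pass_filter(
    adj: Dict[int, List[int]],
    features: Dict[int, List[float]],
    k: float = 0.5,
    steps: int = 1,
) -> Dict[int, List[float]]:
    """Apply (I - k * L_rw) to the features `steps` times.

    With L_rw = I - D^{-1} A, one step replaces each node's features by
    (1 - k) * own features + k * mean of neighbour features.
    """
    x = {v: list(f) for v, f in features.items()}
    for _ in range(steps):
        nxt: Dict[int, List[float]] = {}
        for v, nbrs in adj.items():
            if not nbrs:  # isolated node: nothing to average over
                nxt[v] = list(x[v])
                continue
            dim = len(x[v])
            mean = [sum(x[u][d] for u in nbrs) / len(nbrs) for d in range(dim)]
            nxt[v] = [(1 - k) * x[v][d] + k * mean[d] for d in range(dim)]
        x = nxt
    return x


# Hypothetical path graph 0 - 1 - 2 with an oscillating (high-frequency) signal.
path = {0: [1], 1: [0, 2], 2: [1]}
signal = {0: [1.0], 1: [0.0], 2: [1.0]}
smoothed = low_pass_filter(path, signal, k=0.5, steps=1)
print(smoothed)
```

    In this toy example the oscillating signal is flattened after one step, which is the sense in which the filter suppresses high-frequency components while keeping the smooth part of the attributes.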

    Constrained multi-objective evolutionary algorithm based on multi-stage search
    Saijuan XU, Zhenyu PEI, Jiawei LIN, Genggeng LIU
    2023, 43(8):  2345-2351.  DOI: 10.11772/j.issn.1001-9081.2022091355

    Constraint handling strategies of the existing constrained multi-objective algorithms fail to solve the problems with large infeasible regions effectively, resulting in population stagnation at the edge of infeasible regions. Besides, discontinuous problems with constraints place higher requirements on the global search ability and diversity maintenance of the algorithms. To solve the above problems, a Constrained Multi-Objective Evolutionary Algorithm based on Multi-Stage Search (CMOEA-MSS) was proposed, with different search strategies used in three stages. To make the population cross large infeasible regions and approximate the Pareto front quickly, in the first stage, a convergence indicator was used to guide the population search without considering the constraints. In the second stage, a set of uniformly distributed weight vectors was utilized to maintain the population diversity, and an improved epsilon constraint handling strategy was presented to retain high-quality solutions in infeasible regions. In the third stage, the constraint dominance principle was adopted, and the search preference was focused on the feasible regions to ensure the feasibility of the final solution set. CMOEA-MSS was compared with NSGA-Ⅱ+ARSBX (Nondominated Sorting Genetic Algorithm Ⅱ using Adaptive Rotation-based Simulated Binary crossover) and other algorithms on MW and DASCMOP test sets. Experimental results show that CMOEA-MSS obtains the best IGD (Inverted Generational Distance) values on seven test problems and the best HV (HyperVolume) values on five test problems on MW test set, and obtains the best IGD values on three test problems, the second best IGD values on two test problems and the best HV values on five test problems on DASCMOP test set. It can be seen that CMOEA-MSS has obvious advantages in solving discontinuous and multi-modal constrained multi-objective problems.
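
    The constraint dominance principle used in the third stage is commonly stated as a three-case comparison. The sketch below assumes minimization of all objectives and a single aggregated constraint violation value per solution (0 meaning feasible); the function names are hypothetical and the code is not taken from CMOEA-MSS.

```python
from typing import Sequence


def pareto_dominates(f_a: Sequence[float], f_b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(f_a, f_b))
    better = any(x < y for x, y in zip(f_a, f_b))
    return no_worse and better


def constraint_dominates(
    f_a: Sequence[float], cv_a: float, f_b: Sequence[float], cv_b: float
) -> bool:
    """Constraint dominance: cv_* is the total constraint violation (>= 0)."""
    if cv_a == 0 and cv_b > 0:   # feasible beats infeasible
        return True
    if cv_a > 0 and cv_b == 0:   # infeasible never beats feasible
        return False
    if cv_a > 0 and cv_b > 0:    # both infeasible: smaller violation wins
        return cv_a < cv_b
    return pareto_dominates(f_a, f_b)  # both feasible: Pareto dominance


# A feasible but worse solution still dominates an infeasible better one.
print(constraint_dominates([5.0, 5.0], 0.0, [1.0, 1.0], 0.3))
```

    Because feasibility always takes priority under this rule, applying it from the start can trap a population behind a large infeasible region, which is consistent with the abstract deferring it to the final stage.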

    Spectrum combinatorial auction mechanism based on random walk algorithm
    Jingyi WANG, Chao LI, Heng SONG, Di LI, Junwu ZHU
    2023, 43(8):  2352-2357.  DOI: 10.11772/j.issn.1001-9081.2022091351

    How to allocate spectra to users efficiently and improve the revenue of providers are popular research topics recently. To address the problem of low revenue of providers in spectrum combinatorial auctions, Random Walk for Spectrum Combinatorial Auctions (RWSCA) mechanism was designed to maximize the revenue of spectrum providers by combining the characteristics of asymmetric distribution of user valuations. First, the idea of virtual valuation was introduced, the random walk algorithm was used to search for a set of optimal parameters in the parameter space, and the valuations of buyers were linearly mapped according to the parameters. Then, VCG (Vickrey-Clarke-Groves) mechanism based on virtual valuation was run to determine the users who won the auction and calculate the corresponding payments. Theoretical analysis proves that the proposed mechanism is incentive compatible and individually rational. In spectrum combinatorial auction simulation experiments, the RWSCA mechanism increases the provider’s revenue by at least 16.84%.

    Hierarchical and phased attention network model for personalized course recommendation
    Yuan LIU, Yongquan DONG, Rui JIA, Haolin YANG
    2023, 43(8):  2358-2363.  DOI: 10.11772/j.issn.1001-9081.2022091336

    With the widespread applications of Massive Open Online Courses (MOOCs) platforms, an effective method is needed for personalized course recommendation. In view of the existing course recommendation methods, which usually use the course learning records to establish the overall static representation for users’ learning interests, while ignoring the dynamic changes of learning interests and users’ short-term learning interests, a Hierarchical and Phased Attention Network (HPAN) was proposed to carry out personalized course recommendation. In the first layer of the network, the attention network was used to obtain the user’s long- and short-term learning interests. In the second layer of the network, the user’s long- and short-term learning interests and short-term interaction sequence were combined to obtain the user’s interest vector through the attention network, then the preference value of the user’s interest vector with each course vector was calculated, and courses were recommended for the user according to the result. Experimental results on public dataset XuetangX show that, compared with the second best SHAN (Sequential Hierarchical Attention Network) model, HPAN model has the Recall@5 increased by 12.7%; compared with FPMC (Factorizing Personalized Markov Chains) model, HPAN model has the MRR@20 increased by 15.6%. HPAN model has better recommendation effect than the comparison models, and can be used for practical personalized course recommendation.
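
    For readers unfamiliar with the two reported metrics, the following is a minimal sketch of Recall@K and MRR@K under the assumption that each test case has exactly one ground-truth next course; the evaluation protocol of the paper is not described in the abstract, and the function names and course identifiers are hypothetical.

```python
from typing import List, Sequence


def recall_at_k(ranked: Sequence[List[str]], targets: Sequence[str], k: int) -> float:
    """Fraction of cases whose target appears in the top-k recommendations."""
    hits = sum(1 for items, t in zip(ranked, targets) if t in items[:k])
    return hits / len(targets)


def mrr_at_k(ranked: Sequence[List[str]], targets: Sequence[str], k: int) -> float:
    """Mean reciprocal rank of the target, counting 0 when it is outside top-k."""
    total = 0.0
    for items, t in zip(ranked, targets):
        top = items[:k]
        if t in top:
            total += 1.0 / (top.index(t) + 1)
    return total / len(targets)


# Three hypothetical users, each with a ranked list and one true next course.
ranked_lists = [["c1", "c2", "c3"], ["c4", "c5", "c6"], ["c7", "c8", "c9"]]
true_next = ["c2", "c6", "c0"]
print(recall_at_k(ranked_lists, true_next, k=2))
print(mrr_at_k(ranked_lists, true_next, k=3))
```

    Recall@K only records whether the target is retrieved, while MRR@K also rewards placing it near the top of the list, which is why the two metrics can favor different models.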

    Structured deep text clustering model based on multi-layer semantic fusion
    Shengwei MA, Ruizhang HUANG, Lina REN, Chuan LIN
    2023, 43(8):  2364-2369.  DOI: 10.11772/j.issn.1001-9081.2022091356

    In recent years, due to the advantages of the structural information of Graph Neural Network (GNN) in machine learning, people have begun to combine GNN into deep text clustering. The current deep text clustering algorithm combined with GNN ignores the important role of the decoder on semantic complementation in the fusion of text semantic information, resulting in the lack of semantic information in the data generation part. In response to the above problem, a Structured Deep text Clustering Model based on multi-layer Semantic fusion (SDCMS) was proposed. In this model, a GNN was utilized to integrate structural information into the decoder, the representation of text data was enhanced through layer-by-layer semantic complement, and better network parameters were obtained through a triple self-supervision mechanism. Results of experiments carried out on 5 real datasets Citeseer, Acm, Reuters, Dblp and Abstract show that, compared with the current optimal Attention-driven Graph Clustering Network (AGCN) model, SDCMS improves accuracy, Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI) by at most 5.853%, 9.922% and 8.142% respectively.

    DDDC: deep dynamic document clustering model
    Hui LU, Ruizhang HUANG, Jingjing XUE, Lina REN, Chuan LIN
    2023, 43(8):  2370-2375.  DOI: 10.11772/j.issn.1001-9081.2022091354

    The rapid development of the Internet leads to the explosive growth of news data. How to capture the topic evolution process of current popular events from massive news data has become a hot research topic in the field of document analysis. However, the commonly used traditional dynamic clustering models are inflexible and inefficient when dealing with large-scale datasets, while the existing deep document clustering models lack a general method to capture the topic evolution process of time series data. To address these problems, a Deep Dynamic Document Clustering (DDDC) model was designed. In this model, based on the existing deep variational inference algorithms, the topic distributions incorporating the content of previous time slices on different time slices were captured, and the evolution process of event topics was captured from these distributions through clustering. Experimental results on real news datasets show that compared with Dynamic Topic Model (DTM), Variational Deep Embedding (VaDE) and other algorithms, DDDC model has the clustering accuracy and Normalized Mutual Information (NMI) improved on average by at least 4 percentage points and at least 3 percentage points respectively in each time slice on different datasets, verifying the effectiveness of DDDC model.

    Hierarchical storyline generation method for hot news events
    Dong LIU, Chuan LIN, Lina REN, Ruizhang HUANG
    2023, 43(8):  2376-2381.  DOI: 10.11772/j.issn.1001-9081.2022091377

    The development of hot news events is very rich, and each stage of the development has its own unique narrative. With the development of events, a trend of hierarchical storyline evolution is presented. Aiming at the problem of poor interpretability and insufficient hierarchy of storyline in the existing storyline generation methods, a Hierarchical Storyline Generation Method (HSGM) for hot news events was proposed. First, an improved hotword algorithm was used to select the main seed events to construct the trunk. Second, the hotwords of branch events were selected to enhance the branch interpretability. Third, in the branch, a storyline coherence selection strategy fusing hotword relevance and dynamic time penalty was used to enhance the connection of parent-child events, so as to build hierarchical hotwords, and then a multi-level storyline was built. In addition, considering the incubation period of hot news events, a hatchery was added during the storyline construction process to solve the problem of neglecting the initial events due to insufficient hotness. Experimental results on two real self-constructed datasets show that in the event tracking process, compared with the methods based on singlePass and k-means respectively, HSGM has the F score increased by 4.51% and 6.41%, 20.71% and 13.01% respectively; in the storyline construction process, HSGM performs well in accuracy, comprehensibility and integrity on two self-constructed datasets compared with Story Forest and Story Graph.

    Artificial intelligence
    Transfer learning model based on improved domain separation network
    Zexi JIN, Lei LI, Ji LIU
    2023, 43(8):  2382-2389.  DOI: 10.11772/j.issn.1001-9081.2022071103

    In order to further improve the feature recognition and extraction efficiency of transfer learning, reduce negative transfer and enhance the learning performance of the model, a transfer learning model based on improved Domain Separation Network (DSN), named AMCN-DSN (Attention Mechanism Capsule Network-DSN), was proposed. Firstly, the extraction and reconstruction of feature information in the source and target domains were accomplished by using Multi-Head Attention CapsNet (MHAC), the feature information was filtered effectively based on the attention mechanism, and the capsule network was adopted to improve the extraction quality of deep information. Secondly, a dynamic adversarial factor was introduced to optimize the reconstruction loss function, so that the reconstructor was able to dynamically measure the relative importance of the source and target domain information to improve the robustness and convergence speed of transfer learning. Finally, a multi-head self-attention mechanism was incorporated into the classifier to enhance the semantic understanding of the public features and improve the classification performance. In the sentiment analysis experiments, compared to other transfer learning models, the proposed model can transfer the learned knowledge to tasks with less data but high similarity with the least degradation of classification performance and good transfer performance. In the intent recognition experiments, the proposed model improves the precision, recall and F1 score by 4.5%, 4.3% and 4.4% respectively, compared to the model with suboptimal classification performance, the Capsule Network improved Domain Adversarial Neural Network (DANN+CapsNet) model, showing certain advantages of the proposed model in dealing with small data problems and personalization problems. In comparison with DSN, AMCN-DSN has the F1 scores on the target domain in the above-mentioned two types of experiments improved by 6.0% and 12.4% respectively, further validating the effectiveness of the improved model.

    Graph to equation tree model based on expression layer-by-layer aggregation and dynamic selection
    Bin LIU, Qian ZHANG, Yaqin WEI, Xueying CUI, Hongying ZHI
    2023, 43(8):  2390-2395.  DOI: 10.11772/j.issn.1001-9081.2022071054

    Existing tree decoders are only suitable for solving single-variable problems and perform poorly on multivariate problems. At the same time, most mathematical solvers select the truth expression wrongly, which leads to learning deviation during training. Aiming at the above problems, a Graph to Equation Tree (GET) model based on expression layer-by-layer aggregation and dynamic selection was proposed. Firstly, text semantics was learned through the graph encoder. Then, subexpressions were obtained by aggregating quantities and unknown variables iteratively from the bottom of the equation tree, layer by layer. Finally, combined with the longest prefix of the output expression, the truth expression was selected dynamically to minimize the deviation. Experimental results show that the precision of the proposed model reaches 83.10% on Math23K dataset, which is 5.70 percentage points higher than that of Graph to Tree (Graph2Tree) model. Therefore, the proposed model can be applied to the solution of complex multivariate mathematical problems, and can reduce the influence of learning deviation on experimental results.

    General text classification model combining attention and cropping mechanism
    Yumeng CUI, Jingya WANG, Xiaowen LIU, Shangyi YAN, Zhizhong TAO
    2023, 43(8):  2396-2405.  DOI: 10.11772/j.issn.1001-9081.2022071071

    Focused on the issue that current classification models are generally effective on texts of one length, and a large number of long and short texts occur in actual scenes in a mixed way, a General Long and Short Text Classification Model based on Hybrid Neural Network (GLSTCM-HNN) was proposed. Firstly, BERT (Bidirectional Encoder Representations from Transformers) was applied to encode texts dynamically. Then, convolution operations were used to extract local semantic information, and a Dual Channel ATTention mechanism (DCATT) was built to enhance key text regions. Meanwhile, Recurrent Neural Network (RNN) was utilized to capture global semantic information, and a Long Text Cropping Mechanism (LTCM) was established to filter critical texts. Finally, the extracted local and global features were fused and input into Softmax function to obtain the output category. In comparison experiments on four public datasets, compared with the baseline model (BERT-TextCNN) and the best performing comparison model BERT, GLSTCM-HNN has the F1 scores increased by up to 3.87 and 5.86 percentage points respectively. In two generality experiments on mixed texts, compared with the generality model — CNN-BiLSTM/BiGRU hybrid text classification model based on Attention (CBLGA) proposed by existing research, GLSTCM-HNN has the F1 scores increased by 6.63 and 37.22 percentage points respectively. Experimental results show that the proposed model can improve the accuracy of text classification task effectively, and has generality of classification on texts with different lengths from training data and on long and short mixed texts.

    Cross-lingual zero-resource named entity recognition model based on sentence-level generative adversarial network
    Xiaoyan ZHANG, Zhengyu DUAN
    2023, 43(8):  2406-2411.  DOI: 10.11772/j.issn.1001-9081.2022071124

    To address the problem of lack of labeled data in low-resource languages, which prevents the use of existing mature deep learning methods for Named Entity Recognition (NER), a cross-lingual NER model based on sentence-level Generative Adversarial Network (GAN), namely SLGAN-XLM-R (Sentence Level GAN based on XLM-R), was proposed. Firstly, the labeled data of the source language was used to train the NER model on the basis of the pre-trained model XLM-R (XLM-Robustly optimized BERT pretraining approach). At the same time, the linguistic adversarial training was performed on the embedding layer of XLM-R model by combining the unlabeled data of the target language. Then, the soft labels of the unlabeled data of the target language were predicted by using the NER model. Finally, the labeled data of the source language and the target language were mixed to fine-tune the model again to obtain the final NER model. Experiments were conducted on four languages, English, German, Spanish, and Dutch, in two datasets, CoNLL2002 and CoNLL2003. The results show that with English as the source language, the F1 scores of SLGAN-XLM-R model on the test sets of German, Spanish, and Dutch are 72.70%, 79.42%, and 80.03%, respectively, which are 5.38, 5.38, and 3.05 percentage points higher than those of direct fine-tuning on XLM-R model.

    Knowledge enhanced aspect word interactive graph neural network
    Hongjun HENG, Dingcheng YANG
    2023, 43(8):  2412-2419.  DOI: 10.11772/j.issn.1001-9081.2022071041

    Existing aspect-based sentiment analysis methods do not use enough information of syntactic dependency trees, ignore the associations between multiple aspect words, and lack the use of external knowledge. Aiming at these problems, a Knowledge Enhanced Aspect word Interactive Graph neural network (KEAIG) model was proposed. Firstly, BERT-PT (Bidirectional Encoder Representation from Transformers with Post-Train) fused with domain knowledge was used to encode text, and the knowledge graph was used to add sentiment information to the syntactic trees. The information contained in the syntactic dependency tree was extracted by the model in two parts: in the first part, the association relationships in the syntactic dependency tree and the part-of-speech tag of each word were used to extract sentence features, and in the second part, the feature extraction was performed on the syntactic dependency tree combined with the knowledge graph. Afterwards, the fusion gated unit was used to fuse the association features of multiple aspect words. Finally, the two parts of the sentence representations were concatenated together as the final classification basis. Experimental results on four datasets show that compared with the benchmark model Relational Graph Attention Network (RGAT), the proposed model improves the accuracy by 2.17%, 5.54%, 2.60%, and 2.83%, respectively, and the F1 score (Macro-F1) by 2.69%, 6.87%, 8.77%, and 14.70%, respectively, fully demonstrating the effectiveness of using syntactic trees, introducing external knowledge and extracting multi-aspect word associations.

    Citation recommendation algorithm fusing knowledge graph and graph attention network
    Haiwei FAN, Xinsiyu LU, Limiao ZHANG, Yisheng AN
    2023, 43(8):  2420-2425.  DOI: 10.11772/j.issn.1001-9081.2022071110

    Aiming at problems of data sparseness and cold start in traditional Collaborative Filtering (CF) and problem that meta-path and random walk algorithms do not fully utilize node information, a Citation Recommendation Algorithm Fusing Knowledge Graph and Graph Attention Network (C-KGAT) was proposed. Firstly, knowledge graph information was mapped into low-dimensional dense vectors by using TransR algorithm to obtain embedded feature representation of the nodes. Secondly, through multi-channel fusion mechanism, graph attention network was used to aggregate neighbor node information to enrich semantics of target nodes and capture high-order connectivity between nodes. Thirdly, without affecting depth or width of network, dynamic convolutional layer was introduced to aggregate information of neighbor nodes dynamically to improve expression ability of the model. Finally, the interaction probabilities of users and citations were calculated through the prediction layer. Experimental results on public datasets AAN (ACL Anthology Network) and DataBase systems and Logic Programming (DBLP) show that the proposed algorithm performs better than all comparison models. The MRR (Mean Reciprocal Rank) of the proposed algorithm is increased by 6.0 and 3.4 percentage points respectively compared with that of the suboptimal model NNSelect, and the Precision and Recall indicators of the proposed algorithm also have different degrees of improvement, which verifies the effectiveness of the algorithm.

    Relation extraction method based on negative training and transfer learning
    Kezheng CHEN, Xiaoran GUO, Yong ZHONG, Zhenping LI
    2023, 43(8):  2426-2430.  DOI: 10.11772/j.issn.1001-9081.2022071004

    In relation extraction tasks, distant supervision is a common method for automatic data labeling. However, this method introduces a large amount of noisy data, which affects the performance of the model. In order to solve the problem of noisy data, a relation extraction method based on negative training and transfer learning was proposed. Firstly, a noisy data recognition model was trained through the negative training method. Then, the noisy data were filtered and relabeled according to the predicted probability value of the sample. Finally, a transfer learning method was used to solve the domain shift problem existing in distant supervision tasks, and the precision and recall of the model were further improved. Based on Thangka culture, a relation extraction dataset with national characteristics was constructed. Experimental results show that the F1 score of the proposed method reaches 91.67%, which is 3.95 percentage points higher than that of SENT (Sentence level distant relation Extraction via Negative Training) method, and is much higher than those of the relation extraction methods based on BERT (Bidirectional Encoder Representations from Transformers), BiLSTM+ATT (Bi-directional Long Short-Term Memory and Attention), and PCNN (Piecewise Convolutional Neural Network).

    Data science and technology
    Enhancement and expansion of full-text search in relational databases based on lightweight caching strategy
    Ting YANG, Ruoyu MO, Xiujuan ZHANG, Zhousen ZHU
    2023, 43(8):  2431-2438.  DOI: 10.11772/j.issn.1001-9081.2022071108

    Aiming at the problems of low efficiency and high resource consumption in the existing full-text search schemes of Relational DataBase (RDB), a lightweight full-text search model for relational databases with enhanced secondary cache was proposed. Firstly, an inverted index based on Redis was built in the proposed model and the cache index was used to reduce the search scope, which relieved the I/O bottleneck of the relational database through efficient in-memory data processing and improved the overall performance of the system. Secondly, in order to ensure the accuracy and real-time performance of the search results, an index synchronization strategy was further proposed, and an incremental index component was designed and implemented to hide the index processing details, so as to improve the usability and universality of the model. Finally, an index update mechanism based on access heat was provided for hotspot data to reduce memory usage of the inverted index. Experimental results show that on the premise of ensuring the response speed and accuracy of full-text search in relational databases, the space resource consumption of the proposed model is 48.8% to 60.9% lower than that of MySQL full-text index and 85.2% to 96.2% lower than that of Elasticsearch, verifying that the proposed model is feasible and effective in practical applications.
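
    The role of the cached inverted index can be sketched as follows. This is a simplified illustration in which a plain in-memory dictionary stands in for Redis and whitespace tokenization stands in for a real analyzer; the class name, method names and sample rows are assumptions, and the synchronization and access-heat mechanisms described in the abstract are not modeled.

```python
from collections import defaultdict
from typing import Dict, List, Set


class InvertedIndex:
    """Maps each term to the set of row ids whose text contains it."""

    def __init__(self) -> None:
        # In the described model this mapping would live in Redis;
        # a dict is used here only to keep the sketch self-contained.
        self.postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, row_id: int, text: str) -> None:
        for term in text.lower().split():
            self.postings[term].add(row_id)

    def remove(self, row_id: int, text: str) -> None:
        for term in text.lower().split():
            self.postings[term].discard(row_id)

    def search(self, query: str) -> List[int]:
        """Return ids of rows containing every query term (AND semantics)."""
        terms = query.lower().split()
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)


# Hypothetical table rows keyed by primary key.
rows = {
    1: "fast full text search",
    2: "relational database search",
    3: "full text index in memory",
}
index = InvertedIndex()
for row_id, text in rows.items():
    index.add(row_id, text)

print(index.search("full text"))
```

    The returned ids narrow the search scope: only those rows would then need to be fetched from the relational database by primary key, instead of scanning the text column.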

    Adaptive social recommendation based on negative similarity
    Yinying ZHOU, Yunsheng ZHOU, Dunhui YU, Jun SUN
    2023, 43(8):  2439-2447.  DOI: 10.11772/j.issn.1001-9081.2022071003

    Social recommendation aims to improve recommendation effect of traditional recommendation algorithms by integrating social relations. Currently, social recommendation algorithms based on Network Embedding (NE) face two problems: one is that inconsistency between objects is not considered when constructing network, and algorithms are often restricted by positive objects that are difficult to obtain and have many constraints; the other is that the elimination of overfitting in algorithm training process based on the number of ratings cannot be realized by these algorithms. Therefore, an Adaptive Social Recommendation algorithm based on Negative Similarity (ASRNS) was proposed. Firstly, homogeneous networks with positive correlations were constructed by consistency analysis. Then, embedded vectors were obtained by combining weighted random walk with Skip-Gram algorithm. Next, similarities were calculated, and Matrix Factorization (MF) algorithm was constrained from the perspective of negative similarity. Finally, the number of ratings was mapped to the ideal rating range based on adaptive mechanism, and different penalties were imposed on bias terms of the algorithm. Experiments were conducted on FilmTrust and CiaoDVD datasets. The results show that compared with algorithms such as Collaborative User Network Embedding (CUNE) algorithm and Consistent neighbor aggregation for Recommendation (ConsisRec) algorithm, ASRNS has the Root Mean Square Error (RMSE) reduced by at least 2.60% and 5.53% respectively, and the Mean Absolute Error (MAE) reduced by at least 1.47% and 2.46% respectively. It can be seen that ASRNS can not only reduce rating prediction error effectively, but also improve over-fitting problem in algorithm training process significantly, and has good robustness for objects with different ratings.

    Hybrid point-of-interest recommendation model based on geographic preference ranking
    Shijie PENG, Hongmei CHEN, Lizhen WANG, Qing XIAO
    2023, 43(8):  2448-2455.  DOI: 10.11772/j.issn.1001-9081.2022071029

    With the development of Location-Based Social Network (LBSN), Point-Of-Interest (POI) recommendation, an effective way to alleviate information overload, has attracted much attention. As user check-in data are implicit feedback data and very sparse, a hybrid POI recommendation model based on geographic preference ranking was proposed to effectively capture the user preference for POIs from check-in data. First, considering the implicit feedback characteristics of check-in data and the spatial constraint of user activities, a weighted BPR model named GWBPR (Geo-Weighted Bayesian Personalized Ranking) was proposed by incorporating the influence of POI distances on POI ranking into the traditional Bayesian Personalized Ranking (BPR) model. Then, aiming at the sparsity of user check-in data, a hybrid model GWBPR-LMF (GWBPR with LMF) was proposed by further integrating the Logistic Matrix Factorization (LMF) model with the GWBPR model. Experimental results on two real datasets, Foursquare and Gowalla, show that the GWBPR-LMF model outperforms comparison models such as BPR, LMF and SAE-NAD (Self-Attentive Encoder and Neighbor-Aware Decoder). Compared with SAE-NAD, the best-performing comparison model, the GWBPR-LMF model improves the precision, recall, F1 score, mean Average Precision (mAP) and Normalized Discounted Cumulative Gain (NDCG) on average by 44.9%, 57.1%, 78.4%, 55.3%, and 40.0% respectively on Foursquare dataset, and by 3.0%, 6.4%, 4.6%, 11.7%, and 4.2% respectively on Gowalla dataset.
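
    As a rough illustration of a weighted pairwise ranking objective, the sketch below computes the standard BPR loss for one (visited POI, unvisited POI) pair and scales it by a weight. The abstract does not state how GWBPR derives its geographic weight, so `weight` is a placeholder assumption and `bpr_pair_loss` is a hypothetical name.

```python
import math


def bpr_pair_loss(score_pos, score_neg, weight=1.0):
    """Weighted BPR loss for one pair: -weight * log(sigmoid(score_pos - score_neg)).

    The loss is evaluated as weight * softplus(-(score_pos - score_neg)),
    written in a numerically stable form.
    """
    x = -(score_pos - score_neg)
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return weight * softplus


# Hypothetical predicted preference scores for a visited and an unvisited POI.
loss_well_ranked = bpr_pair_loss(2.0, 0.0)
loss_badly_ranked = bpr_pair_loss(0.0, 2.0)
```

    The loss is small when the visited POI is scored above the unvisited one and grows when the order is reversed; a larger weight makes a pair count more in the total objective.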

    Point-of-interest category representation model with spatial and textual information
    Zelin XU, Min YANG, Meng CHEN
    2023, 43(8):  2456-2461.  DOI: 10.11772/j.issn.1001-9081.2022071037

    Representing Point-Of-Interest (POI) categories (e.g., universities, restaurants) accurately is key to understanding urban space and assisting urban computing. Existing models for POI category representation usually only mine users’ mobility behaviors among POIs and learn sequential features, while ignoring spatial and textual semantic features of POI data. In order to solve the above problems, a POI category representation learning model incorporating spatial and textual information, named Cat2Vec, was proposed. Firstly, a POI category co-occurrence Point-wise Mutual Information (PMI) matrix was constructed by using the spatial co-occurrence relationships of POIs. Then, the text semantic features of POIs were learnt by a pre-trained text representation model. Finally, a new mapping matrix was introduced, and based on matrix factorization, the PMI matrix was decomposed into a product of a POI category representation matrix, a text semantic feature matrix and the mapping matrix. In the evaluation of semantic overlapping of POIs on two real-world datasets, Yelp and AMap, compared with Doc2Vec, the best model among the baselines, the proposed model improves the performance on average by 5.53% and 8.17% respectively. Experimental results show that the proposed model can embed the semantics of POIs more effectively.
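
    The PMI matrix mentioned above can be illustrated with the textbook definition PMI(i, j) = log(p(i, j) / (p(i) p(j))), estimated from co-occurrence counts. The sketch below is an assumption-laden example: the counts are made up, cells with zero co-occurrence are set to 0, and the exact variant used by Cat2Vec is not given in the abstract.

```python
import math


def pmi_matrix(cooc):
    """Compute a PMI matrix from a square matrix of co-occurrence counts.

    Cells with a zero count are left at 0.0 (an assumption made here to
    avoid log(0); other conventions exist).
    """
    n = len(cooc)
    total = sum(sum(row) for row in cooc)
    row_sums = [sum(row) for row in cooc]
    col_sums = [sum(cooc[i][j] for i in range(n)) for j in range(n)]
    pmi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if cooc[i][j] > 0:
                # p(i,j) / (p(i) p(j)) = count * total / (row_sum * col_sum)
                pmi[i][j] = math.log(cooc[i][j] * total / (row_sums[i] * col_sums[j]))
    return pmi


# Hypothetical spatial co-occurrence counts for three POI categories.
cooc = [
    [0, 4, 0],
    [4, 0, 2],
    [0, 2, 0],
]
pmi = pmi_matrix(cooc)
```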

    Cyber security
    Conditional differential cryptanalysis method of KATAN48 algorithm based on neural distinguishers
    Dongdong LIN, Manman LI, Shaozhen CHEN
    2023, 43(8):  2462-2470.  DOI: 10.11772/j.issn.1001-9081.2022060886

    Aiming at the security analysis problem of KATAN48 algorithm, a conditional differential cryptanalysis method of KATAN48 algorithm based on neural distinguishers was proposed. First, the basic principle of multiple output differences neural distinguishers was studied and applied to KATAN48 algorithm. According to the data format of KATAN48 algorithm, the input format and hyperparameters of the deep residual neural network were adjusted. Then, the Mixed-Integer Linear Programming (MILP) model of KATAN48 algorithm was established to search the prepended differential paths and the corresponding constraint conditions. At last, using the multiple output differences neural distinguishers, practical key recovery attack results on at most 80 rounds of KATAN48 algorithm were given. Experimental results show that in the single-key setting, the number of practical attack rounds of KATAN48 algorithm is increased by 10, the number of recoverable key bits is increased by 22, and the data complexity and time complexity are reduced from 2^34 and 2^34 to 2^16.39 and 2^19.68 respectively. Compared with the previous practical attack in the single-key setting, the proposed method effectively increases the number of attack rounds and recoverable key bits, and reduces the computational complexity of the attack.

    Encrypted traffic classification method based on improved Inception-ResNet
    Xiang GUO, Wengang JIANG, Yuhang WANG
    2023, 43(8):  2471-2476.  DOI: 10.11772/j.issn.1001-9081.2022071030

    Most classification models in deep learning-based encrypted traffic classification methods have a deep and straight structure that suffers from vanishing gradients, and increasing the number of network layers leads to a significant increase in model size and computational complexity. To address these problems, an encrypted traffic classification method based on improved Inception-ResNet was proposed. In the method, the classification model was constructed by improving the Inception module and embedding it into the convolutional neural network as a residual block through residual connections. In addition, the loss function of the classification model was improved, and the effectiveness of the proposed method was verified by using VPN-nonVPN dataset. Experimental results show that the proposed method achieves the precision, recall, and F1 score of more than 94.21%, 92.53%, and 93.31%, respectively, in the classification experiments of two scenarios. In the comparison experiments with other methods, taking the 12-class classification experiment, which is the most difficult one, as an example, the proposed method is higher than C4.5 decision tree algorithm and 1D-CNN (1 Dimensional-Convolutional Neural Network) by 13.91 and 9.50 percentage points in precision and by 14.87 and 1.59 percentage points in recall, respectively. Compared with algorithms such as CAE (Convolutional Auto Encoding) and SAE (Stacked Auto Encoder), the proposed method not only has obvious improvement on the indicators, but also has a significantly shorter single training time, demonstrating that the proposed method is a state-of-the-art method.

    Efficient collaborative defense scheme against distributed denial of service attacks in software defined network
    Chenyang GE, Qinrang LIU, Xue PEI, Shuai WEI, Zhengbin ZHU
    2023, 43(8):  2477-2485.  DOI: 10.11772/j.issn.1001-9081.2022060940

    Aiming at the problem that traditional defense schemes against Distributed Denial of Service (DDoS) attacks in Software Defined Network (SDN) tend to ignore the importance of reducing the workload of SDN, as well as do not consider the timeliness of attack mitigation, an efficient collaborative defense scheme against DDoS attacks in SDN was proposed. Firstly, the overhead of the control plane was reduced and the data plane’s resources were entirely used by offloading some of the defense tasks into the data plane. Then, if an anomaly was detected, eXpress Data Path (XDP) rules were generated to mitigate the attack promptly, and the statistical information of the data plane was handed over to the control plane to further detect and mitigate the attack, thereby improving the accuracy and further reducing the controller overhead. Finally, the rules of XDP were updated according to the anomaly source determined by the control plane. To validate the effectiveness of the proposed scheme, the Hyenae attack tool was used to generate three different types of attack data. Compared with the Support Vector Machine (SVM) scheme that relies on the control plane, the new architecture defense scheme, and the cross-plane collaborative defense scheme, the proposed scheme has the timeliness of defense improved by 33.33%, 28.57%, and 21.05%, respectively; the proposed scheme has the Central Processing Unit (CPU) consumption reduced by 33, 11, and 4 percentage points. Experimental results show that the proposed scheme can defend against DDoS attacks well and has a low performance overhead.

    Advanced computing
    Acceleration and optimization of quantum computing simulator implemented on new Sunway supercomputer
    Xinmin SHI, Yong LIU, Yaojian CHEN, Jiawei SONG, Xin LIU
    2023, 43(8):  2486-2492.  DOI: 10.11772/j.issn.1001-9081.2022091456

    Aiming at the problems of gradual scaling of quantum hardware and insufficient classical simulation speed, two optimization methods for the quantum simulator implemented on Sunway supercomputer were proposed. Firstly, the tensor contraction operator library SWTT was reconstructed by improving the tensor transposition strategy and computation strategy, which improved the computing kernel efficiency of partial tensor contraction and reduced redundant memory access. Secondly, the balance between complexity and efficiency of path computation was achieved by the contraction path adjustment method based on data locality optimization. Test results show that the improved operator library raises the simulation efficiency of the "Sycamore" quantum supremacy circuit by 5.4% and the single-step tensor contraction efficiency by up to 49.7 times; the path adjustment method improves the floating-point efficiency by about 4 times while the path computational complexity is inflated by a factor of 2. For the simulation of Google’s 53-qubit, 20-layer quantum chip random circuit with a million amplitude sampling, the two optimization methods improve the efficiencies of single-precision and mixed-precision floating-point operations from 3.98% and 1.69% to 18.48% and 7.42% respectively, and reduce the theoretical estimated simulation time from 470 s to 226 s for single precision and from 304 s to 134 s for mixed precision, verifying that the two methods significantly improve the quantum computational simulation speed.

    Quantum K-Means algorithm based on Hamming distance
    Jing ZHONG, Chen LIN, Zhiwei SHENG, Shibin ZHANG
    2023, 43(8):  2493-2498.  DOI: 10.11772/j.issn.1001-9081.2022091469

    The K-Means algorithms typically utilize Euclidean distance to calculate the similarity between data points when dealing with large-scale heterogeneous data. However, this method has problems of low efficiency and high computational complexity. Inspired by the significant advantage of Hamming distance in handling data similarity calculation, a Quantum K-Means Hamming (QKMH) algorithm was proposed to calculate similarity. First, the data was prepared and made into quantum state, and the quantum Hamming distance was used to calculate similarity between the points to be clustered and the K cluster centers. Then, the Grover’s minimum search algorithm was improved to find the cluster center closest to the points to be clustered. Finally, these steps were repeated until the designated number of iterations was reached or the clustering centers no longer changed. Based on the quantum simulation computing framework QisKit, the proposed algorithm was validated on the MNIST handwritten digit dataset and compared with various traditional and improved methods. Experimental results show that the F1 score of the QKMH algorithm is improved by 10 percentage points compared with that of the Manhattan distance-based quantum K-Means algorithm and by 4.6 percentage points compared with that of the latest optimized Euclidean distance-based quantum K-Means algorithm, and the time complexity of the QKMH algorithm is lower than those of the above comparison algorithms.
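
    For orientation, a purely classical analogue of the clustering loop described above can be sketched as follows. It is not the quantum algorithm: the quantum Hamming distance computation and the improved Grover minimum search are replaced by ordinary XOR/popcount and `min`, and all function names and sample points are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two bit strings stored as integers."""
    return bin(a ^ b).count("1")


def assign(points, centers):
    """Assign each point to the index of its nearest center (classical minimum search)."""
    return [min(range(len(centers)), key=lambda k: hamming(p, centers[k])) for p in points]


def update_center(members, n_bits):
    """New center by per-bit majority vote, which minimizes total Hamming distance."""
    center = 0
    for bit in range(n_bits):
        ones = sum((m >> bit) & 1 for m in members)
        if 2 * ones > len(members):
            center |= 1 << bit
    return center


def kmeans_hamming(points, centers, n_bits, max_iter=10):
    """Repeat assignment and update until centers stop changing or max_iter is reached."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(update_center(members, n_bits) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers)
    return labels, centers


points = [0b0000, 0b0001, 0b0010, 0b1111, 0b1110, 0b1101]
labels, centers = kmeans_hamming(points, [0b0001, 0b1110], n_bits=4)
```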

    Generalized polar complex exponential transform with more uniform zeros distribution
    Zezhi ZENG, Jianwei YANG
    2023, 43(8):  2499-2504.  DOI: 10.11772/j.issn.1001-9081.2022071020

    In order to address the information suppression problem of Polar Complex Exponential Transform (PCET), which is caused by the nonuniform zeros distribution in the real part and imaginary part of PCET’s radial function, a generalized PCET with more uniform zeros distribution was proposed. First, PCET was modified, and the exponential part of PCET’s radial function was generalized to a more general constructor. The recently proposed Exponent-Fourier Moment (EFM), fractional-order polar harmonic transform, generic polar complex exponential transform and modified generic polar complex exponential transform are all special cases of the proposed generalized PCET. Second, a constructor was chosen to make the zeros distribution in the real part and imaginary part of the radial function of generalized PCET more uniform, and the proof of this property was given. Image reconstruction experiments were conducted on the selected Chinese character image, Coil-20 and COREL databases, and the rotation invariance and anti-noise performance of generalized PCET were tested. When the noise intensity is 0, the recognition rates of both PCET and generalized PCET are 100%, verifying the rotation invariance of PCET and generalized PCET. Compared with PCET, the proposed generalized PCET has lower reconstruction error and higher recognition rate. Theoretical analysis and experimental results show that the proposed generalized PCET, whose zeros distribution is more uniform than that of PCET, also has rotation invariance and orthogonality, has better reconstruction performance and anti-noise performance than PCET, solves the information suppression problem of PCET to a certain extent, and is numerically stable at the origin.

    Temporal network motif discovery method based on null model
    Boren HU, Zhongmin PEI, Zhangkai LUO, Jie DING
    2023, 43(8):  2505-2510.  DOI: 10.11772/j.issn.1001-9081.2022071033

    In temporal networks with time attributes, conventional network motif discovery methods based on frequent subgraph statistics are easily affected by differences in network size and structure. A null model network with the same scale and some of the same properties as the empirical network can provide an accurate benchmark for mining the characteristics of the empirical network. Therefore, a temporal network motif discovery method based on null model was proposed, in which relative values obtained by comparing the subgraph features of the two networks were used to identify the subgraphs with significant structural meaning in temporal networks. At the same time, in order to determine when the null model network reached stability, the number of successful scrambles was adopted to improve the temporal network’s null model construction methods based on time scrambling or time randomization. In the experiment stage, simulations were conducted on a 46-node Global Positioning System (GPS) constellation containing satellites and ground stations, and the number of successful scrambles at which the subgraph features of the null model network reached stability was determined. Ten null model networks were constructed and compared with the satellite network. It was found that the number of occurrences of the subgraph reflecting the continuity characteristics of node connection is only 1/34 of that of the subgraph with the highest frequency, but the former subgraph is the most important motif in the satellite network. Experimental results show that the temporal network motif discovery method with null model as reference can identify motifs that reflect network structural characteristics and the dynamic change process more accurately.
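
    One common way to build a time-scrambled null model is to swap timestamps between randomly chosen events while keeping the node pairs fixed, counting only swaps that actually change the network. The sketch below follows that general idea under stated assumptions: events are unique `(u, v, t)` tuples, a swap counts as successful when the two timestamps differ and no duplicate event is created, and the function name and stopping rule are hypothetical rather than taken from the paper.

```python
import random


def time_shuffle(events, n_success, seed=0):
    """Swap timestamps between random event pairs until n_success swaps succeed.

    events: list of unique (u, v, t) tuples, at least two of them.
    The set of node pairs and the multiset of timestamps are preserved.
    """
    rng = random.Random(seed)
    events = list(events)
    existing = set(events)
    done = 0
    attempts = 0
    max_attempts = 100 * n_success + 100  # guard against endless loops
    while done < n_success and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(events)), 2)
        (u1, v1, t1), (u2, v2, t2) = events[i], events[j]
        a, b = (u1, v1, t2), (u2, v2, t1)
        if t1 == t2 or a in existing or b in existing:
            continue  # unsuccessful scramble: nothing would change or a duplicate appears
        existing -= {events[i], events[j]}
        existing |= {a, b}
        events[i], events[j] = a, b
        done += 1
    return events


# Hypothetical contact events (node, node, time step).
original = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 4)]
null_model = time_shuffle(original, n_success=10, seed=42)
```

    Subgraph counts in several such null networks could then be compared with those of the empirical network to obtain relative significance values.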

    Network and communications
    Hybrid beamforming for multi-user mmWave relay networks using deep learning
    Xiaolin LI, Songjia YANG
    2023, 43(8):  2511-2516.  DOI: 10.11772/j.issn.1001-9081.2022081231

    In order to solve the problem of high computational complexity of traditional multi-user mmWave relay system beamforming methods, a Singular Value Decomposition (SVD) method based on Deep Learning (DL) was proposed to design hybrid beamforming for the optimization of the transmitter, relay and receiver. Firstly, DL method was used to design the beamforming matrix of transmitter and relay to maximize the achievable spectral efficiency. Then, the beamforming matrix of relay and receiver was designed to maximize the equivalent channel gain. Finally, a Minimum Mean Square Error (MMSE) filter was designed at the receiver to eliminate the inter-user interference. Theoretical analysis and simulation results show that compared with Alternating Maximization (AltMax) and the traditional SVD method, the hybrid beamforming method based on DL reduces the computational complexity by 12.5% and 23.44% respectively in the case of high dimensional channel matrix and many users, and has the spectral efficiency improved by 2.277% and 21.335% respectively with known Channel State Information (CSI), and the spectral efficiency improved by 11.452% and 43.375% respectively with imperfect CSI.

    Computer software technology
    Source code vulnerability detection based on hybrid code representation
    Kun ZHANG, Fengyu YANG, Fa ZHONG, Guangdong ZENG, Shijian ZHOU
    2023, 43(8):  2517-2526.  DOI: 10.11772/j.issn.1001-9081.2022071135

    Software vulnerabilities pose a great threat to network and information security, and the root of vulnerabilities lies in software source code. Existing traditional static detection tools and deep learning based detection methods do not fully represent code features and simply use word embedding to transform the code representation, so that their detection results have low accuracy and a high false positive rate or a high false negative rate. Therefore, a source code vulnerability detection method based on hybrid code representation was proposed to solve the problem of incomplete code representation and improve detection performance. Firstly, source code was compiled into Intermediate Representation (IR), and the program dependency graph was extracted. Then, structural features were obtained through program slicing based on data flow and control flow analysis. At the same time, unstructured features were obtained by embedding node statements using doc2vec. Next, Graph Neural Network (GNN) was used to learn the hybrid features. Finally, the trained GNN was used for prediction and classification. In order to verify the effectiveness of the proposed method, experimental evaluation was performed on Software Assurance Reference Dataset (SARD) and real-world datasets, and the F1 score of detection results reached 95.3% and 89.6% respectively. Experimental results show that the proposed method has good vulnerability detection ability.

    Model repair method based on behavioral profile and logical Petri nets
    Haoyu ZHANG, Lili WANG
    2023, 43(8):  2527-2536.  DOI: 10.11772/j.issn.1001-9081.2022070980

    As real business processes change constantly, the original business process model needs to be repaired to better represent the real business process. The key step of model repair is to analyze the deviation between the real log and the model. However, the current methods to find the deviation mainly use the alignment repetition technique, and do not quantitatively analyze the abstract structure from the perspective of behavior. Therefore, a method of analyzing the deviation between log and model by behavioral profile was proposed, and on this basis, a model repair method based on logical Petri nets was further proposed. Firstly, based on the behavioral profile, the compliance between the log and the model was calculated to identify the deviation trace. Secondly, the logic transitions were selected from deviant activities through the deviant triple set in the deviation trace. Finally, the logic function was set based on the logic transitions, and the original model was repaired by adding new branches or reconstructing new structures. The fitness and precision of the repair models were verified. Simulation results show that when all fitness values are 1, the repair model obtained by the proposed repair method has higher precision than those of Fahland method and Goldratt method, while maintaining the similarity between the repair model and the original model as much as possible.

    Multimedia computing and computer simulation
    Research progress on motion segmentation of visual localization and mapping in dynamic environment
    Dongying ZHU, Yong ZHONG, Guanci YANG, Yang LI
    2023, 43(8):  2537-2545.  DOI: 10.11772/j.issn.1001-9081.2022070972

    In a dynamic environment, a visual localization and mapping system is affected by dynamic objects, which increases its localization and mapping errors and decreases its robustness. Motion segmentation of input images can significantly improve the performance of visual localization and mapping system in dynamic environment. Dynamic objects in dynamic environment can be divided into moving objects and potential moving objects. Current dynamic object recognition methods have problems of chaotic moving subjects and poor real-time performance. Therefore, motion segmentation strategies of visual localization and mapping system in dynamic environment were reviewed. Firstly, the strategies were divided into three types of methods according to preset conditions of the scene: methods based on static assumption of image subject, methods based on prior semantic knowledge and multi-sensor fusion methods without assumption. Then, these three types of methods were summarized, and their accuracy and real-time performance were analyzed. Finally, aiming at the difficulty of balancing accuracy and real-time performance of motion segmentation strategy of visual localization and mapping system in dynamic environment, development trends of the motion segmentation methods in dynamic environment were discussed.

    Review of object pose estimation in RGB images based on deep learning
    Yi WANG, Jie XIE, Jia CHENG, Liwei DOU
    2023, 43(8):  2546-2555.  DOI: 10.11772/j.issn.1001-9081.2022071022

    6 Degree of Freedom (DoF) pose estimation is a key technology in computer vision and robotics, and has become a crucial task in the fields such as robot operation, automatic driving, augmented reality by estimating 6 DoF pose of an object from a given input image, that is, 3 DoF translation and 3 DoF rotation. Firstly, the concept of 6 DoF pose and the problems of traditional methods based on feature point correspondence, template matching, and three-dimensional feature descriptors were introduced. Then, the current mainstream 6 DoF pose estimation algorithms based on deep learning were introduced in detail from different angles of feature correspondence-based, pixel voting-based, regression-based and multi-object instances-oriented, synthesis data-oriented, and category level-oriented. At the same time, the datasets and evaluation indicators commonly used in pose estimation were summarized and sorted out, and some algorithms were evaluated experimentally to show their performance. Finally, the challenges and the key research directions in the future of pose estimation were given.

    Fine-grained image recognition based on mid-level subtle feature extraction and multi-scale feature fusion
    Ailing QI, Xuanlin WANG
    2023, 43(8):  2556-2563.  DOI: 10.11772/j.issn.1001-9081.2022071090

    In the field of fine-grained visual recognition, due to subtle differences between highly similar categories, precise extraction of subtle image features has a crucial impact on recognition accuracy. Using attention mechanisms to extract categorical features has become a trend among existing algorithms. However, these algorithms ignore the subtle but distinguishable features, and isolate the feature relationships between different discriminative regions of objects. Aiming at these problems, a fine-grained image recognition algorithm based on mid-level subtle feature extraction and multi-scale feature fusion was proposed. First, the salient features of the image were extracted by using the weight variance measures of mid-level features fused with channel and position information. Then, the mask matrix was obtained through channel average pooling to suppress salient features and enhance the extraction of subtle features in other discriminative regions. Finally, channel weight information and pixel complementary information were used to obtain multi-scale fusion features of channels and pixels to enhance the diversity and richness of different discriminative regional features. Experimental results show that the proposed algorithm achieves 89.52% Top-1 accuracy and 98.46% Top-5 accuracy on dataset CUB-200-2011, 94.64% Top-1 accuracy and 98.62% Top-5 accuracy on dataset Stanford Cars, and 93.20% Top-1 accuracy and 97.98% Top-5 accuracy on dataset Fine-Grained Visual Classification of Aircraft (FGVC-Aircraft). Compared with recurrent collaborative attention feature learning network PCA-Net (Progressive Co-Attention Network) algorithm, the proposed algorithm has the Top-1 accuracy increased by 1.22, 0.34 and 0.80 percentage points respectively, and the Top-5 accuracy increased by 1.03, 0.88 and 1.12 percentage points respectively.

    Robotic grasp detection in low-light environment by incorporating visual feature enhancement mechanism
    Gan LI, Mingdi NIU, Lu CHEN, Jing YANG, Tao YAN, Bin CHEN
    2023, 43(8):  2564-2571.  DOI: 10.11772/j.issn.1001-9081.2023050586

    Existing robotic grasping operations are usually performed under well-illuminated conditions with clear object details and high regional contrast. However, under low-light conditions caused by night and occlusion, where the objects’ visual features are weak, the detection accuracies of existing robotic grasp detection models decrease dramatically. In order to improve the representation ability of sparse and weak grasp features in low-light scenarios, a grasp detection model incorporating visual feature enhancement mechanism was proposed to use the visual enhancement sub-task to impose feature enhancement constraints on grasp detection. In the grasp detection module, a U-Net-like encoder-decoder structure was adopted to achieve efficient feature fusion. In the low-light enhancement module, the texture and color information were extracted at the local and global levels respectively, thereby balancing the object details and visual effect in feature enhancement. In addition, two low-light grasp datasets called low-light Cornell dataset and low-light Jacquard dataset were constructed as new benchmark datasets for low-light grasp and used to conduct the comparative experiments. Experimental results show that the accuracies of the proposed low-light grasp detection model are 95.5% and 87.4% on the benchmark datasets respectively, which are 11.1 and 1.2 percentage points higher on low-light Cornell dataset and 5.5 and 5.0 percentage points higher on low-light Jacquard dataset than those of the existing grasp detection models Generative Grasping Convolutional Neural Network (GG-CNN) and Generative Residual Convolutional Neural Network (GR-ConvNet) respectively, indicating that the proposed model has good grasp detection performance.

    ShuffaceNet: face recognition neural network based on ThetaMEX global pooling
    Kansong CHEN, Yuan ZHENG, Lijun XU, Zhouyu WANG, Zhe ZHANG, Fujuan YAO
    2023, 43(8):  2572-2580.  DOI: 10.11772/j.issn.1001-9081.2022070985

    Current large-scale networks are not suitable for resource-starved mobile devices such as smart phones and tablet computers, and the pooling layer leads to sparsity of the feature map, which ultimately affects the recognition accuracy of the neural network. To address these issues, a lightweight face recognition neural network named ShuffaceNet was proposed, a smooth nonlinear Log-Mean-Exp function ThetaMEX was designed, and an end-to-end trainable ThetaMEX Global Pool Layer (TGPL) was proposed, so as to reduce network parameters and improve computing speed while ensuring the accuracy of the algorithm, so that the network can be effectively deployed on mobile devices with limited resources. ShuffaceNet has about 3 600 parameters, and the model size is only 3.5 MB. The recognition test results on LFW (Labeled Faces in the Wild), AgeDB-30 (Age Database-30) and CFP (Celebrities in Frontal Profile) face datasets show that the accuracy of ShuffaceNet reaches 99.32%, 93.17% and 94.51% respectively. Compared with the traditional networks such as MobileNetV1, SqueezeNet and Xception, the proposed network has the size reduced by 73.1%, 82.1% and 78.5% respectively, and the accuracy on AgeDB-30 dataset improved by 5.0%, 6.3% and 6.7% respectively. It can be seen that the proposed network based on ThetaMEX global pooling can improve the model accuracy.
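
    A generic Log-Mean-Exp pooling operator can be written as (1/θ) · log(mean(exp(θ · x))), which approaches the mean as θ tends to 0 and the maximum as θ grows. The sketch below implements that generic form over a flattened feature map; the precise definition of ThetaMEX and how θ is trained in TGPL are not given in the abstract, so the function name and the fixed `theta` argument are assumptions.

```python
import math


def log_mean_exp_pool(values, theta):
    """Log-Mean-Exp pooling: (1/theta) * log(mean(exp(theta * v))).

    Computed with the usual max-shift trick for numerical stability.
    theta must be non-zero; large positive theta approaches max pooling,
    theta near zero approaches average pooling.
    """
    if theta == 0:
        raise ValueError("theta must be non-zero")
    scaled = [theta * v for v in values]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return (shift + math.log(mean_exp)) / theta


# Hypothetical activations of one channel, flattened over spatial positions.
feature_map = [1.0, 2.0, 3.0]
near_mean = log_mean_exp_pool(feature_map, 1e-6)
near_max = log_mean_exp_pool(feature_map, 50.0)
```

    Because the operator is smooth in both the inputs and θ, it can in principle be trained end to end, which is consistent with the trainable pooling layer described above.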

    Skeleton-based action recognition based on feature interaction and adaptive fusion
    Doudou LI, Wanggen LI, Yichun XIA, Yang SHU, Kun GAO
    2023, 43(8):  2581-2587.  DOI: 10.11772/j.issn.1001-9081.2022071105

    Skeleton-based action recognition still suffers from shortcomings such as unreasonable data preprocessing, too many model parameters and low recognition accuracy. To solve these problems, a skeleton-based action recognition method based on feature interaction and adaptive fusion, namely AFFGCN (Adaptively Feature Fusion Graph Convolutional Neural Network), was proposed. Firstly, an adaptive pooling method for data preprocessing was proposed to solve the problems of uneven data frame distribution and poor data frame representation. Secondly, a multi-information feature interaction method was introduced to mine deeper features and so improve model performance. Finally, an Adaptive Feature Fusion (AFF) module was proposed to fuse graph convolutional features, further improving model performance. Experimental results show that, on the NTU-RGB+D 60 dataset, the proposed method improves accuracy by 1.2 percentage points over the baseline method Lightweight Multi-Information Graph Convolutional Neural Network (LMI-GCN) in both the Cross-Subject (CS) and Cross-View (CV) evaluation settings. On the NTU-RGB+D 120 dataset, it improves on LMI-GCN by 1.5 and 1.4 percentage points in the CS and Cross-Setup (SS) evaluation settings respectively. Experimental results on single-stream and multi-stream networks further show that, compared with current mainstream skeleton-based action recognition methods such as Semantics-Guided Neural network (SGN), the proposed method has fewer parameters and higher accuracy, making it more suitable for deployment on mobile devices.

    Face liveness detection algorithm based on GhostNet and feature fusion
    Chungang HAN, Yonghui LIU
    2023, 43(8):  2588-2592.  DOI: 10.11772/j.issn.1001-9081.2022071100

    The wide application of face recognition technology brings convenience to users, but also problems such as face spoofing and presentation attacks. To counter frequent presentation attacks and print attacks, a face liveness detection algorithm based on GhostNet and feature fusion was proposed. Firstly, the feature extraction process of the GhostNet model was divided into three stages, namely low-level, medium-level and high-level features. Then, the feature map of each stage was output separately. Finally, the feature maps carrying different semantic information were sent into the feature fusion module for adaptive weighted fusion, so as to obtain a more discriminative feature mapping. Experiments were conducted on the public datasets NUAA and CelebA-Spoof. The results show that the accuracy of the proposed algorithm is 99.97% and 93.41% respectively, which is 8.00 and 9.20 percentage points higher respectively than that of directly training the GhostNet model. Compared with Heterogeneous Kernel-Convolutional Neural Network (HK-CNN), the lightweight convolutional neural network FeatherNet, the block-based multi-stream network FaceBagNet and other algorithms, the proposed algorithm shows better performance on the NUAA and CelebA-Spoof datasets. Moreover, as GhostNet is a lightweight network model, the proposed algorithm takes only 3.6 ms for single-image inference on the CelebA-Spoof dataset.
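
    The adaptive weighted fusion step can be pictured with a small sketch. The abstract does not specify how the weights are produced, so the version below assumes one learnable logit per stage, normalised with a softmax, and assumes the three stage features have already been brought to the same length. All names here are hypothetical.

```python
import math

def softmax(logits):
    """Turn arbitrary real-valued logits into positive weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_fusion(stage_features, stage_logits):
    """Fuse equal-length feature vectors as a softmax-weighted sum.

    stage_features: one vector per stage (low, medium, high level).
    stage_logits:   one scalar per stage; in a real network these would be
                    trainable parameters (an assumption of this sketch).
    """
    if len(stage_features) != len(stage_logits):
        raise ValueError("need exactly one logit per stage")
    weights = softmax(stage_logits)
    length = len(stage_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, stage_features))
        for i in range(length)
    ]

low, mid, high = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
fused_equal = weighted_fusion([low, mid, high], [0.0, 0.0, 0.0])   # plain average
fused_high = weighted_fusion([low, mid, high], [0.0, 0.0, 10.0])   # dominated by the high-level stage
```

    With equal logits the fusion reduces to a plain average; training would shift the logits so that the more discriminative stages receive larger weights.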

    Classification and recognition method of copper alloy metallograph based on feature aggregation
    Xueyu HUANG, Huaiyu HE, Huimin LIN, Jinshui CHEN
    2023, 43(8):  2593-2601.  DOI: 10.11772/j.issn.1001-9081.2022060893

    To address the long delay in detecting copper alloy composition, a classification and recognition method for copper alloy metallographs based on feature aggregation was proposed. Firstly, in the feature extraction stage, the Gray-Level Co-occurrence Matrix (GLCM) and a Residual Network (ResNet) model based on the convolutional block attention module were constructed to extract the global and local features of the image, respectively. Secondly, in the feature aggregation stage, the extracted features were normalized and then simply concatenated. Finally, in the classification and recognition stage, a Support Vector Machine (SVM) was used for accurate classification. Experimental results show that the proposed method achieves an accuracy of 98.963% and a macro-F1 of 98.996%, which are better than those of machine learning methods based on a single feature. It can be seen that, after aggregation, the features extracted by different methods describe the texture and edge information of copper alloy metallographs more comprehensively, and that the proposed method can identify different copper alloys from their metallographs with improved accuracy and good robustness.
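
    The GLCM extraction and the normalize-then-concatenate aggregation can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the choice of pixel offset, the three texture statistics, the L2 normalization and the placeholder deep-feature vector are all assumptions, and the SVM stage is omitted.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy and homogeneity of a normalised GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    energy = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i, j in cells)
    return [contrast, energy, homogeneity]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def aggregate(texture_features, deep_features):
    """Normalise each feature group separately, then concatenate them."""
    return l2_normalize(texture_features) + l2_normalize(deep_features)

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
texture = glcm_features(glcm(image, levels=4))
deep = [0.2, 0.9, 0.4]  # placeholder standing in for CNN features
combined = aggregate(texture, deep)
```

    Normalizing each group before concatenation keeps one feature family from dominating the other purely because of its scale, which matters for a margin-based classifier such as an SVM.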

    Leukocyte detection method based on twice-fusion-feature CenterNet
    Huan LIU, Lianghong WU, Lyu ZHANG, Liang CHEN, Bowen ZHOU, Hongqiang ZHANG
    2023, 43(8):  2602-2610.  DOI: 10.11772/j.issn.1001-9081.2022071009

    Leukocyte detection is difficult in complex real-world scenarios because leukocytes vary in shape and degree of staining. To solve this problem, a leukocyte detection method based on a twice-fusion-feature CenterNet, namely TFF-CenterNet (Twice-Fusion-Feature CenterNet), was proposed. Firstly, the features of the backbone network were fused with the features of the deconvolution layers through a Feature Pyramid Network (FPN), improving the feature extraction ability of the method so as to handle the individual differences and varying degrees of staining of leukocytes. Then, to address the severe imbalance between the image area occupied by leukocytes and the background area, the heatmap loss function was improved to strengthen the focus on positive leukocyte samples and raise the detection mean Average Precision (mAP). Finally, in view of the tiny targets, random locations and cell adhesion found in leukocyte images, coordinate attention and coordinate convolution were introduced to improve attention and sensitivity to leukocyte location information. For leukocytes in complex scenarios, TFF-CenterNet achieves an mAP of 97.01% and a detection speed of 167 frame/s, which are 3.24 percentage points higher and 42 frame/s faster than those of CenterNet respectively. Experimental results show that the proposed method improves both the mAP and the robustness of leukocyte detection in complex situations while meeting real-time requirements, so that it can provide technical support for rapid automatic leukocyte detection in complementary medical diagnosis.

    Small target detection algorithm for train operating environment image based on improved YOLOv3
    Meijia LIANG, Xinwu LIU, Xiaopeng HU
    2023, 43(8):  2611-2618.  DOI: 10.11772/j.issn.1001-9081.2022091343

    Train assisted driving depends on real-time detection of the train operating environment, whose images contain abundant small targets. Compared with large and medium targets, small targets occupying less than 1% of the original image suffer from a high missed detection rate and poor detection accuracy due to their low resolution. Therefore, a target detection algorithm for the train operating environment based on improved YOLOv3 was proposed, namely YOLOv3-TOEI (YOLOv3-Train Operating Environment Image). Firstly, the k-means clustering algorithm was used to optimize the anchors and so speed up the convergence of the network. Then, dilated convolution was embedded in DarkNet-53 to expand the receptive field, and Dense convolutional Network (DenseNet) was introduced to obtain richer low-level details of the image. Finally, the unidirectional feature fusion structure of the original YOLOv3 was improved to a bidirectional and adaptive feature fusion structure, which effectively combines deep and shallow features and improves the detection performance of the network on multi-scale targets, especially small targets. Experimental results show that YOLOv3-TOEI reaches a mean Average Precision (mAP)@0.5 of 84.5%, which is 12.2% higher than that of the original YOLOv3 algorithm, at 83 Frames Per Second (FPS), verifying that the algorithm detects small targets in images of the train operating environment more effectively.
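
    The anchor optimization step can be illustrated with the k-means variant commonly used for YOLO-style detectors, where boxes are clustered by width and height and similarity is measured by IoU. The abstract does not say which distance or initialisation the authors used, so both are assumptions here, and the box sizes in the example are made up.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height) and aligned at one corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100):
    """Cluster (width, height) pairs into k anchors using IoU as similarity."""
    # Deterministic initialisation: pick anchors spread across boxes sorted by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        updated = []
        for old, members in zip(anchors, clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(old)  # keep an anchor that attracted no boxes
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])

# Hypothetical ground-truth box sizes in pixels: three small, three large.
boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58), (48, 62)]
anchors = kmeans_anchors(boxes, k=2)
```

    Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is useful when small targets are the main concern.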

    Dam surface disease detection algorithm based on improved YOLOv5
    Shengwei DUAN, Xinyu CHENG, Haozhou WANG, Fei WANG
    2023, 43(8):  2619-2629.  DOI: 10.11772/j.issn.1001-9081.2022081207

    To address the high operating costs and low efficiency of the manual on-site inspections on which water conservancy dams currently rely, an improved detection algorithm based on YOLOv5 was proposed. Firstly, a modified multi-scale visual Transformer structure was used to improve the backbone, and the multi-scale global information captured by the multi-scale Transformer structure and the local information extracted by the Convolutional Neural Network (CNN) were combined to construct aggregated features, thereby making full use of multi-scale semantic and location information to improve the feature extraction capability of the network. Then, a coordinate attention mechanism was added in front of each feature detection layer of the network to encode features along the height and width directions of the image, and the encoded features were used to construct long-distance associations between pixels on the feature map, enhancing the target localization ability of the network in complex environments. Next, the sampling algorithm for the positive and negative training samples was improved: by constructing the average fit and difference between the prior frames and the ground-truth frames, candidate positive samples were helped to respond to prior frames of similar shape, so that the network converges faster and better and its overall performance and generalization improve. Finally, the network was made lightweight to meet application requirements, by pruning the network structure and applying structural re-parameterization. Experimental results show that, on the dam disease data currently adopted, the improved network raises the mAP (mean Average Precision)@0.5 by 10.5 percentage points and the mAP@0.5:0.95 by 17.3 percentage points compared with the original YOLOv5s algorithm; compared with the network before lightening, the lightweight network reduces the number of parameters and the FLOPs (FLoating point Operations Per second) by 24% and 13% respectively and improves the detection speed by 42%, verifying that the network meets the precision and speed requirements of disease detection in current application scenarios.

    Frontier and comprehensive applications
    Iterative learning output consensus of multi-agent systems with feedback control
    Jiaxin WANG, Chenglin LIU
    2023, 43(8):  2630-2635.  DOI: 10.11772/j.issn.1001-9081.2022070976

    To improve the learning process of multi-agent systems and their robustness to external disturbances, an iterative learning consensus control algorithm with feedback control was proposed. Firstly, the learning process of the agents was improved by sharing control input information among agents, and the robustness of the system was improved by designing a feedback controller for the case of non-iterative repetitive external disturbances. Then, the system consensus was analyzed using the contraction mapping method, and the convergence condition of the algorithm was rigorously derived. Finally, the correctness and effectiveness of the algorithm were verified through simulations. Compared with the P-type algorithm, the improved algorithm converges faster and has a smoother convergence curve in the presence of external disturbances.
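
    For readers unfamiliar with the P-type baseline mentioned above, the sketch below shows it in its simplest single-agent form; it is not the proposed multi-agent algorithm and has no feedback term or disturbance. The scalar plant, its coefficients, the learning gain and the reference signal are all assumptions chosen for illustration. For this plant the standard contraction condition is |1 - gamma * c * b| < 1.

```python
import math

def run_trial(u, a=0.5, b=1.0, c=1.0):
    """Simulate x(t+1) = a*x(t) + b*u(t), y(t) = c*x(t) from x(0) = 0.

    Returns the outputs y(1), ..., y(T) for inputs u(0), ..., u(T-1).
    """
    x, outputs = 0.0, []
    for u_t in u:
        x = a * x + b * u_t
        outputs.append(c * x)
    return outputs

def p_type_ilc(reference, gamma=0.5, iterations=60):
    """Repeat the same task and refine the input from the previous trial's error.

    reference[t] holds the desired output y_d(t+1). Returns the final input
    and the maximum absolute tracking error recorded at each iteration.
    """
    u = [0.0] * len(reference)
    history = []
    for _ in range(iterations):
        y = run_trial(u)
        error = [r - v for r, v in zip(reference, y)]
        history.append(max(abs(e) for e in error))
        # P-type update: u_{k+1}(t) = u_k(t) + gamma * e_k(t+1)
        u = [u_t + gamma * e for u_t, e in zip(u, error)]
    return u, history

reference = [math.sin(0.3 * (t + 1)) for t in range(10)]
u_final, history = p_type_ilc(reference)
# history[0] is the error with zero input; history[-1] is the error after learning.
```

    With gamma = 0.5 and c = b = 1 the contraction factor is 0.5, so the tracking error shrinks across iterations; choosing gamma outside the range (0, 2) would violate the condition for this plant.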

    Air combat maneuver decision-making of unmanned aerial vehicle based on guided Minimax-DDQN
    Yu WANG, Tianjun REN, Zilin FAN
    2023, 43(8):  2636-2643.  DOI: 10.11772/j.issn.1001-9081.2022071069

    A guided Minimax-DDQN (Minimax-Double Deep Q-Network) algorithm was designed to solve the problems of unpredictable enemy aircraft maneuver strategy and low winning rate, which are caused by the complex environment information and strong confrontation faced by Unmanned Aerial Vehicles (UAVs) in air combat. Firstly, on the basis of the Minimax decision-making method, a guided strategy exploration mechanism was proposed. Then, combined with the guided Minimax strategy, a type of DDQN (Double Deep Q-Network) algorithm was designed to improve the update efficiency of the Q-network. Finally, an advanced three-stage network training method was proposed, and a better optimized decision model was obtained through confrontation training between different decision models. Experimental results show that, compared with Minimax-DQN (Minimax Deep Q-Network), Minimax-DDQN and other algorithms, the proposed algorithm improves the success rate of chasing a straight-line target by 14% to 60% and achieves a winning rate of over 60% against the DDQN algorithm. It can be seen that, compared with algorithms such as DDQN and Minimax-DDQN, the proposed algorithm has stronger decision-making capability and better adaptability in highly confrontational combat environments.
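
    One way to picture how a Minimax rule and a Double DQN update can be combined is the target computation sketched below. It assumes Q-values indexed by own action and opponent action, a pure-strategy minimax (the original Minimax-Q formulation uses mixed strategies), and the usual Double DQN split in which the online network selects actions and the target network evaluates them. The guided exploration mechanism from the abstract is not modelled, and all names are hypothetical.

```python
def minimax_action(q_values):
    """Pick the own action with the best worst case, and the opponent's reply.

    q_values[a][o] is the estimated value of own action a against opponent action o.
    """
    best_a = max(range(len(q_values)), key=lambda a: min(q_values[a]))
    worst_o = min(range(len(q_values[best_a])), key=lambda o: q_values[best_a][o])
    return best_a, worst_o

def minimax_double_q_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Bootstrap target: select with the online net, evaluate with the target net."""
    if done:
        return reward
    a, o = minimax_action(q_online_next)
    return reward + gamma * q_target_next[a][o]

q_online_next = [[1.0, 5.0],   # own action 0: worst case 1.0
                 [2.0, 3.0]]   # own action 1: worst case 2.0, so it is selected
q_target_next = [[10.0, 20.0],
                 [30.0, 40.0]]
target = minimax_double_q_target(1.0, False, q_online_next, q_target_next, gamma=0.9)
```

    Separating selection from evaluation is what Double DQN uses to reduce value overestimation; the minimax selection additionally makes the estimate conservative with respect to the opponent's choice.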

    Cooperative obstacle avoidance algorithm based on improved artificial potential field and consensus protocol
    Zhongyuan ZHANG, Wei DAI, Guangyu LI, Xiaoqing CHEN, Qibo DENG
    2023, 43(8):  2644-2650.  DOI: 10.11772/j.issn.1001-9081.2022070967

    Cooperative obstacle avoidance is one of the key technologies of Unmanned Aerial Vehicle (UAV) systems. However, UAV swarms suffer from formation loss, mission failure and increased energy consumption during obstacle avoidance. To solve these problems, a cooperative obstacle avoidance algorithm based on an improved artificial potential field and a consensus protocol was proposed. First, according to the control law of multi-rotor UAVs, a control protocol that maintains speed and position consistency was designed, and the artificial potential field force was scaled and transformed by normalization and high-order exponents, thereby solving the problem of oscillation failure caused by excessive variation of the potential field force. Then, the artificial potential field force was introduced to modify the expected formation of the consensus protocol, resolving the control conflict that arises when the artificial potential field method is combined with the consensus protocol. The proposed algorithm was simulated and compared with the formation division obstacle avoidance algorithm and the dynamic window obstacle avoidance algorithm in a complex obstacle environment. The results show that the proposed algorithm reduces the average formation loss degree by 82.60% and 64.38% respectively, the average task failure degree by 98.66% and 86.01% respectively, and the total flight path length by 9.95% and 17.63% respectively. It can be seen that the proposed algorithm is suitable for complex flight environments with multiple obstacles.

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
Website: www.joca.cn
E-mail: bjb@joca.cn