
Table of Contents

    10 October 2024, Volume 44 Issue 10
    Artificial intelligence
    Survey of neural architecture search
    Renke SUN, Zhiyu HUANGFU, Hu CHEN, Zhongnian LI, Xinzheng XU
    2024, 44(10):  2983-2994.  DOI: 10.11772/j.issn.1001-9081.2023101374
    Abstract | HTML | PDF (3686KB)

    In recent years, deep learning has made breakthroughs in many fields due to its powerful representation capability, and the architecture of a neural network is crucial to its final performance. However, designing a high-performance neural network architecture relies heavily on the prior knowledge and experience of researchers, and because neural networks have a large number of parameters, designing an optimal architecture is difficult. Therefore, automated Neural Architecture Search (NAS) has gained significant attention. NAS is a technique that uses machine learning to search for optimal network architectures automatically, without extensive human effort, and is an important means of future neural network design. NAS is essentially a search optimization problem: by designing a search space, a search strategy, and a performance evaluation strategy, NAS can automatically find the optimal network structure. A detailed and comprehensive analysis, comparison, and summary of the latest research progress of NAS was provided from three aspects: search space, search strategy, and performance evaluation strategy, which helps readers quickly understand the development of NAS. Finally, future research directions of NAS were proposed.
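The three components named above (search space, search strategy, performance evaluation strategy) can be made concrete with a minimal sketch. This is not any method from the survey: random search stands in for the search strategy, a toy dictionary for the search space, and a made-up analytic score for the (normally very expensive) performance evaluation.

```python
import random

def evaluate(arch):
    # Stand-in for an expensive performance evaluation: score a
    # (depth, width, activation) tuple with a made-up analytic function.
    # A real NAS loop would train the candidate network instead.
    depth, width, act = arch
    bonus = {"relu": 0.05, "tanh": 0.0}[act]
    return 1.0 - abs(depth - 4) * 0.1 - abs(width - 64) / 640 + bonus

def random_search(space, n_trials, seed=0):
    # Simplest search strategy: sample architectures uniformly from the
    # search space and keep the best-scoring one.
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = (rng.choice(space["depth"]),
                rng.choice(space["width"]),
                rng.choice(space["activation"]))
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

space = {"depth": [2, 4, 8], "width": [32, 64, 128], "activation": ["relu", "tanh"]}
best, score = random_search(space, n_trials=50)
```

The surveyed methods replace the random sampler with smarter strategies (reinforcement learning, evolution, gradient-based relaxation) and the full retraining with cheaper evaluation proxies.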

    Overview of deep metric learning
    Wenze CHAI, Jing FAN, Shukui SUN, Yiming LIANG, Jingfeng LIU
    2024, 44(10):  2995-3010.  DOI: 10.11772/j.issn.1001-9081.2023101415
    Abstract | HTML | PDF (3329KB)

    With the rise of deep neural networks, Deep Metric Learning (DML) has attracted widespread attention. To gain a deeper understanding of DML, firstly, the limitations of traditional metric learning methods were organized and analyzed. Secondly, DML methods were discussed in three categories: those based on sample pairs, on proxies, and on classification. For the sample-pair-based category, divergence methods, ranking methods, and methods based on Generative Adversarial Network (GAN) were introduced in detail. The proxy-based category was mainly discussed in terms of proxy samples and proxy categories. For the classification-based category, cross-modal metric learning, intra-class and inter-class margin problems, hypergraph classification, and combinations with other methods (such as reinforcement learning-based and adversarial learning-based methods) were discussed. Thirdly, various metrics for evaluating the performance of DML were introduced, and the applications of DML in different tasks, including face recognition, image retrieval, and person re-identification, were summarized and compared. Finally, the challenges faced by DML were discussed and some possible solution strategies were proposed.
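The sample-pair-based ranking methods mentioned above are typified by the classic triplet loss, which a short sketch can illustrate (a generic example, not a specific method from this overview; the embeddings here are hand-picked toy vectors):

```python
import math

def dist(u, v):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Pull the positive closer to the anchor than the negative by at
    # least `margin`; the loss is zero once the ranking holds.
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

a, p, n = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
loss = triplet_loss(a, p, n)   # d(a,p)=1, d(a,n)=5, so the constraint is satisfied
```

Proxy-based methods replace the sampled positive/negative points with learned class proxies, which reduces the number of pairs that must be mined.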

    Automatically adjusted clustered federated learning for double-ended clustering
    Chunyong YIN, Yongcheng ZHOU
    2024, 44(10):  3011-3020.  DOI: 10.11772/j.issn.1001-9081.2023101475
    Abstract | HTML | PDF (2248KB)

    Federated Learning (FL) is a distributed machine learning method that aims to jointly train a global model, but a single global model struggles to handle data drawn from multiple distributions. To deal with this multi-distribution challenge, clustered federated learning was introduced to optimize multiple shared models by grouping clients. Among existing approaches, server-side clustering has difficulty correcting classification errors, while client-side clustering is sensitive to the selection of the initial models. To solve these problems, an Automatically Adjusted Clustered Federated Learning (AACFL) framework was proposed, which used double-ended clustering to integrate server-side and client-side clustering. Firstly, double-ended clustering was used to divide clients into adjustable clusters. Then, local client identities were adjusted automatically. Finally, the correct client clusters were obtained. AACFL was evaluated on three classical federated datasets under non-independent and identically distributed conditions. Experimental results show that AACFL can obtain correct clusters through adjustment when there are errors in the double-ended clustering results. Compared with the FedAvg (Federated Averaging) algorithm, CFL (Clustered Federated Learning), IFCA (Iterative Federated Clustering Algorithm) and other methods, AACFL effectively improves both the model convergence speed and the speed of obtaining correct clustering results, with the accuracy improved by 0.20-23.16 percentage points on average when the number of clusters is 4 and the number of clients is 100. Therefore, the proposed framework can cluster efficiently and improve model convergence speed and accuracy.
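The aggregation step underlying clustered federated learning can be sketched as FedAvg-style weighted averaging applied per cluster rather than globally. This is only an illustration of that baseline idea with toy two-parameter "models", not AACFL's adjustment mechanism:

```python
def federated_average(client_weights, client_sizes):
    # Aggregate client parameter vectors into one model, weighting each
    # client by its local sample count (as in FedAvg).
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

def clustered_average(clusters):
    # One shared model per cluster instead of a single global model.
    # `clusters` maps cluster id -> (list of weight vectors, list of sizes).
    return {cid: federated_average(ws, ns) for cid, (ws, ns) in clusters.items()}

clusters = {
    0: ([[1.0, 2.0], [3.0, 4.0]], [10, 30]),   # cluster 0: two clients
    1: ([[10.0, 0.0]], [5]),                   # cluster 1: one client
}
models = clustered_average(clusters)
```

AACFL's contribution, per the abstract, lies in how the cluster assignments themselves are formed and corrected from both the server and client ends.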

    Agent model for hyperparameter self-optimization of deep classification model
    Rui ZHANG, Junming PAN, Xiaolu BAI, Jing HU, Rongguo ZHANG, Pengyun ZHANG
    2024, 44(10):  3021-3031.  DOI: 10.11772/j.issn.1001-9081.2023091313
    Abstract | HTML | PDF (2779KB)

    To further improve the efficiency of multi-objective adaptive hyperparameter optimization for deep classification models, a Filter Enhanced Dropout Agent (FEDA) model was proposed. Firstly, a dual-channel Dropout neural network with an enhanced point-to-point mutual information constraint was constructed to better fit deep classification models with high-dimensional hyperparameters, and the selection of candidate solution sets was accelerated by combining an aggregation solution selection strategy. Secondly, FEDA was combined with A novel preference-based dominance Relation for Multi-Objective Evolutionary Algorithm and a model management strategy, yielding FEDA-ARMOEA, to balance the convergence and diversity of population individuals and to assist FEDA in improving the efficiency of deep classification model training and hyperparameter self-optimization. Comparative experiments were conducted among FEDA-ARMOEA, EDN-ARMOEA (Efficient Dropout neural Network-assisted AR-MOEA), HeE-MOEA (Heterogeneous Ensemble-based infill criterion for Multi-Objective Evolutionary Algorithm), and other algorithms. Experimental results show that FEDA-ARMOEA performs well on 41 of the 56 sets of test problems. Experiments on the industrial weld dataset MTF and the public dataset CIFAR-10 show that the accuracy of the classification model optimized by FEDA-ARMOEA is 96.16% and 93.79%, respectively, and the training time is reduced by 6.94%-47.04% and 4.44%-39.07% compared with the contrast algorithms, respectively, all superior to the contrast algorithms, which verifies the effectiveness and generalization of the proposed algorithm.
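The general idea behind dropout-based surrogate (agent) models is that repeated stochastic forward passes give both a cheap prediction and an uncertainty estimate for screening candidate hyperparameters. The sketch below shows generic Monte-Carlo dropout on a toy one-layer "surrogate"; it is an assumption-laden illustration of that principle, not FEDA's dual-channel design:

```python
import random

def dropout_predict(x, weights, drop_rate, rng):
    # One stochastic forward pass: each weight is kept with
    # probability (1 - drop_rate) and the output is rescaled.
    kept = [w for w in weights if rng.random() >= drop_rate]
    scale = 1.0 / (1.0 - drop_rate) if drop_rate < 1.0 else 0.0
    return x * scale * sum(kept) / len(weights)

def mc_dropout(x, weights, drop_rate=0.2, n_samples=200, seed=0):
    # Repeat stochastic passes; the spread across passes serves as a
    # cheap uncertainty signal when screening candidate solutions.
    rng = random.Random(seed)
    preds = [dropout_predict(x, weights, drop_rate, rng) for _ in range(n_samples)]
    mean = sum(preds) / n_samples
    var = sum((p - mean) ** 2 for p in preds) / n_samples
    return mean, var

mean, var = mc_dropout(2.0, [0.5, -0.25, 1.0])
```

In a surrogate-assisted evolutionary loop, such mean/variance pairs decide which candidate hyperparameter settings are worth a real (expensive) training run.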

    Robust weight matrix combination selection method of broad learning system
    Han WANG, Yuan WAN, Dong WANG, Yiming DING
    2024, 44(10):  3032-3038.  DOI: 10.11772/j.issn.1001-9081.2023101422
    Abstract | HTML | PDF (3288KB)

    Broad Learning System (BLS) has excellent computational efficiency and prediction accuracy. However, in the traditional BLS framework, the weight matrix is randomly generated, with the risk of unstable learning results. Therefore, a Robust Weight matrix Selection of BLS (RWS-BLS) method was proposed. Firstly, the significant influence of the randomized weight matrix on the overall training error of the samples was revealed through validation on four sets of function data. Secondly, the combination forms of the weight matrices were studied: the strict optimality restriction of the screening conditions was relaxed into a near-optimality restriction, the minimum value of the error was limited to a specified range, and conditions such as elite combinations were defined. Finally, combinations of reliable weight matrices were obtained, so that the influence of randomness was effectively reduced and a robust model could be established. Experimental results show that on 16 sets of simulated data, the NORB dataset, and 5 UCI regression datasets, with data replacement or under noise perturbation, the proposed method decreases the Mean Square Error (MSE) by 7.32%, 8.73%, and 1.63% compared with the BLS method. RWS-BLS provides a direction for studying model smoothness in BLS, can improve the efficiency and stability of models with stochastic parameters, and is useful for other machine learning methods with stochastic parameters.
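The "relax strict optimality into a near-optimality range" idea can be illustrated with a heavily simplified sketch: draw random scalar weights (standing in for random weight matrices), fit a closed-form readout for each, and keep every weight whose error falls within a tolerance of the best. All names, the 1-D feature map, and the tolerance rule are inventions of this sketch, not the paper's screening conditions:

```python
import math
import random

def fit_and_error(w, xs, ys):
    # Broad-learning-style random feature: z = tanh(w * x), then a
    # closed-form least-squares readout beta minimising sum (beta*z - y)^2.
    zs = [math.tanh(w * x) for x in xs]
    denom = sum(z * z for z in zs)
    beta = sum(z * y for z, y in zip(zs, ys)) / denom if denom else 0.0
    return sum((beta * z - y) ** 2 for z, y in zip(zs, ys)) / len(xs)

def elite_weights(xs, ys, n_candidates=50, tol=1.10, seed=0):
    # Relax "strictly optimal" to "within tol of the best": keep every
    # random weight whose error is at most tol times the minimum error.
    rng = random.Random(seed)
    cands = [rng.uniform(-2.0, 2.0) for _ in range(n_candidates)]
    errs = [fit_and_error(w, xs, ys) for w in cands]
    best = min(errs)
    return [w for w, e in zip(cands, errs) if e <= best * tol], best

xs = [0.1 * i for i in range(1, 11)]
ys = [2.0 * x for x in xs]
elite, best_err = elite_weights(xs, ys)
```

Keeping a set of near-optimal weights rather than a single argmin is what buys robustness against the randomness of any one draw.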

    Knowledge tracing based on personalized learning and deep refinement
    Linhao LI, Xiaoqian ZHANG, Yao DONG, Xu WANG, Yongfeng DONG
    2024, 44(10):  3039-3046.  DOI: 10.11772/j.issn.1001-9081.2023101452
    Abstract | HTML | PDF (2200KB)

    In response to the problems that Knowledge Tracing (KT) models neither consider differences between students nor explore the close match between knowledge states and exercises, a two-layer network architecture, Knowledge Tracing based on Personalized Learning and Deep Refinement (PLDRKT), was proposed. Firstly, an attention enhancement mechanism was used to obtain a deeply refined representation of the exercises. Then, the initial knowledge state was modeled in a personalized way from the perspectives of different students’ perceptions of the difficulty and learning benefits of the exercises. Finally, the initial knowledge states and the deep exercise representations were used to obtain the students’ deep knowledge states and predict their future answers. Comparative experiments were conducted on the Statics2011, ASSIST09, ASSIST15, and ASSIST17 datasets among the PLDRKT model and seven models such as ATKT (enhancing Adversarial Training based Knowledge Tracing) and ENKT (ENsemble Knowledge Tracing). Experimental results show that the Area Under the Curve (AUC) of the PLDRKT model is increased by 0.61, 1.32, 5.29, and 0.19 percentage points, respectively, on the four datasets compared to the optimal baseline models without considering exercise embedding. It can be seen that the PLDRKT model can model students’ knowledge states and predict answers effectively.

    Reasoning question answering model of complex temporal knowledge graph with graph attention
    Wenjuan JIANG, Yi GUO, Jiaojiao FU
    2024, 44(10):  3047-3057.  DOI: 10.11772/j.issn.1001-9081.2023101391
    Abstract | HTML | PDF (2228KB)

    In the task of Temporal Knowledge Graph Question Answering (TKGQA), capturing and utilizing the implicit temporal information in questions to enhance complex reasoning ability is a challenge for models. To address this problem, a Graph Attention mechanism-integrated Complex Temporal knowledge graph Reasoning question answering (GACTR) model was proposed. The proposed model was pretrained on a temporal Knowledge Base (KB) in the form of quadruples, and a Graph Attention neTwork (GAT) was introduced to effectively capture the implicit temporal information in the question. The relationship representation trained by the Robustly optimized Bidirectional Encoder Representations from Transformers pretraining approach (RoBERTa) was integrated to enhance the temporal relationship representation of the question. This representation was combined with the pretrained Temporal Knowledge Graph (TKG) embedding, and the final prediction result was the entity or timestamp with the highest score. On the largest benchmark dataset CRONQUESTIONS, compared to the baseline model Knowledge Graph Question Answering on CRONQUESTIONS (CRONKGQA), the GACTR model achieved improvements of 34.6 and 13.2 percentage points on complex question and time answer types, respectively; compared to the Temporal Question Reasoning (TempoQR) model, the improvements were 8.3 and 2.8 percentage points, respectively.

    Dual-branch real-time semantic segmentation network based on detail enhancement
    Qiumei ZHENG, Weiwei NIU, Fenghua WANG, Dan ZHAO
    2024, 44(10):  3058-3066.  DOI: 10.11772/j.issn.1001-9081.2023101424
    Abstract | HTML | PDF (2649KB)

    Real-time semantic segmentation methods often use dual-branch structures to store the shallow spatial information and deep semantic information of images separately. However, current real-time semantic segmentation methods based on dual-branch structures focus on mining semantic features and neglect the maintenance of spatial features, which makes the network unable to accurately capture detailed features such as the boundaries and textures of objects in the image and degrades the final segmentation quality. To solve the above problems, a Dual-Branch real-time semantic segmentation Network based on Detail Enhancement (DEDBNet) was proposed to enhance spatial detail information in multiple stages. Firstly, a Detail-Enhanced Bidirectional Interaction Module (DEBIM) was proposed: in the interaction stage between branches, a lightweight spatial attention mechanism was used to enhance the ability of high-resolution feature maps to express detailed information and to promote the flow of spatial detail features between the high- and low-resolution branches, improving the network’s ability to learn detailed information. Secondly, a Local Detail Attention Feature Fusion (LDAFF) module was designed to model global semantic information and local spatial information simultaneously during feature fusion at the ends of the two branches, so as to solve the problem of detail discontinuity between feature maps at different levels. In addition, boundary loss was introduced to guide the shallow layers of the network to learn object boundary information without affecting the speed of the model. The proposed network achieved a mean Intersection over Union (mIoU) of 78.2% on the Cityscapes validation set at a speed of 92.3 frame/s, and an mIoU of 79.2% on the CamVid test set at a speed of 202.8 frame/s; compared with the Deep Dual Resolution Network (DDRNet-23-slim), the mIoU of the proposed network increased by 1.1 and 4.5 percentage points, respectively. The experimental results show that DEDBNet can accurately segment scene images and meet real-time requirements.

    Large language model-driven stance-aware fact-checking
    Yushan JIANG, Yangsen ZHANG
    2024, 44(10):  3067-3073.  DOI: 10.11772/j.issn.1001-9081.2023101407
    Abstract | HTML | PDF (1036KB)

    To address the issues of evidence stance imbalance and the neglect of stance information in the field of Fact-Checking (FC), a Large Language Model-driven Stance-Aware fact-checking (LLM-SA) method was proposed. Firstly, a series of dialectical claims that differed from the original claim were generated by a large language model to capture different perspectives for fact-checking. Secondly, through semantic similarity calculation, the relevance of each evidence sentence to the original claim and to the dialectical claims was assessed separately, and the top k sentences with the highest semantic similarity to each were selected as the evidence supporting or opposing the original claim, which yielded evidence representing different stances and helped the fact-checking model integrate information from multiple perspectives and evaluate the veracity of the claim more accurately. Finally, the BERT-StuSE (BERT-based Stance-infused Semantic Encoding network) model was introduced to fully incorporate the semantic and stance information of the evidence through the multi-head attention mechanism and to make a more comprehensive and objective judgment on the relationship between the claim and the evidence. Experimental results on the CHEF dataset show that, compared to the BERT method, the Micro F1 and Macro F1 values of the proposed method on the test set are improved by 3.52 and 3.90 percentage points, respectively, achieving a good level of performance. These results demonstrate the effectiveness of the proposed method and the significance of considering evidence from different stances and leveraging the stance information of evidence to enhance fact-checking performance.
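The top-k evidence selection step described above can be sketched generically. A bag-of-words cosine similarity stands in for the real semantic similarity model (which would use sentence embeddings), and the claim/sentences are toy data:

```python
import math
from collections import Counter

def bow_cosine(a, b):
    # Bag-of-words cosine similarity between two token lists; a real
    # system would compare sentence embeddings instead.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_evidence(claim, sentences, k=2):
    # Rank candidate evidence sentences by similarity to the claim and
    # keep the k most similar ones.
    scored = sorted(sentences, key=lambda s: bow_cosine(claim, s), reverse=True)
    return scored[:k]

claim = "the earth orbits the sun".split()
pool = ["the earth orbits the sun yearly".split(),
        "cats are mammals".split(),
        "the sun is orbited by the earth".split()]
evidence = top_k_evidence(claim, pool)
```

In LLM-SA this selection is run twice, once against the original claim and once against the LLM-generated dialectical claims, so that both supporting and opposing stances are represented in the evidence set.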

    Contradiction separation super-deduction method and application
    Feng CAO, Xiaoling YANG, Jianbing YI, Jun LI
    2024, 44(10):  3074-3080.  DOI: 10.11772/j.issn.1001-9081.2023101404
    Abstract | HTML | PDF (1422KB)

    As a common inference mechanism in current automated theorem provers, the traditional hyper-resolution method based on binary deduction is limited to only two clauses participating in each deduction step. The separated deduction steps leave binary chain deduction without guidance or prediction, so its efficiency needs to be improved. To improve deduction efficiency, the idea of multi-clause deduction was introduced into the traditional hyper-resolution method, and the definition and method of contradiction separation super-deduction were proposed, which have the deduction characteristics of being multi-clause, dynamic, and guided. In the algorithm implementation, considering that the clauses participating in deduction are multi-clause and synergized, and flexibly setting the deduction conditions, a contradiction separation super-deduction algorithm with a backtracking mechanism was proposed. The proposed algorithm was applied to the Eprover3.1 prover, taking the problems of the International Automated Theorem Prover Competition 2023 and the most difficult problems, with a difficulty rating of 1, in the TPTP (Thousands of Problems for Theorem Provers) benchmark database as test objects. Within 300 s, the Eprover3.1 prover with the proposed algorithm solved 15 more theorems than the original Eprover3.1 prover, the average proof time was reduced by 1.326 s for the same total number of solved theorems, and 7 theorems with a rating of 1 could be solved. The test results show that the proposed algorithm can be effectively applied to automated theorem proving in first-order logic, improving the proving capability and efficiency of automated theorem provers.
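The binary deduction step that the paper generalizes can be illustrated with plain propositional resolution: exactly two clauses cancel one complementary literal pair per step. This is a textbook sketch of the baseline mechanism (first-order resolution would also need unification), not the proposed multi-clause super-deduction:

```python
def resolve(clause_a, clause_b):
    # One binary resolution step on propositional clauses. Literals are
    # strings; "~P" is the negation of "P". Returns every resolvent
    # obtained by cancelling one complementary pair.
    def negate(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit
    resolvents = []
    for lit in clause_a:
        if negate(lit) in clause_b:
            merged = (clause_a - {lit}) | (clause_b - {negate(lit)})
            resolvents.append(merged)
    return resolvents

# Resolving {P, Q} with {~P, R} on the complementary pair P / ~P:
out = resolve({"P", "Q"}, {"~P", "R"})
```

Contradiction separation deduction, by contrast, lets more than two clauses participate in a single step, which is exactly the limitation of this two-clause rule that the abstract targets.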

    Complex causal relationship extraction based on prompt enhancement and bi-graph attention network
    Jinke DENG, Wenjie DUAN, Shunxiang ZHANG, Yuqing WANG, Shuyu LI, Jiawei LI
    2024, 44(10):  3081-3089.  DOI: 10.11772/j.issn.1001-9081.2023101486
    Abstract | HTML | PDF (2643KB)

    To address the problems of insufficient external information and forgetting during information transmission caused by the high density and long sentence patterns of complex causal sentences, a complex causal relationship extraction model based on Prompt Enhancement and Bi-Graph ATtention network (BiGAT), named PE-BiGAT, was proposed. Firstly, the result entities were extracted from the sentence and combined with a prompt learning template to form the prompt information, and the prompt information was enhanced through an external knowledge base. Then, the prompt information was input into the BiGAT, the attention layer was combined with syntactic and semantic dependency graphs, and the biaffine attention mechanism was used to alleviate feature overlapping and enhance the model’s perception of relational features. Finally, all causal entities in the sentence were predicted iteratively by the classifier, and all causal pairs in the sentence were analyzed through a scoring function. Experimental results on the SemEval-2010 Task 8 and AltLex datasets show that, compared with RPA-GCN (Relationship Position and Attention-Graph Convolutional Network), the proposed model improves the F1 score by 1.65 percentage points, with 2.16 and 4.77 percentage point improvements on chain causal and multi-causal sentences, confirming that the proposed model has an advantage in dealing with complex causal sentences.

    Data science and technology
    Sequential recommendation based on hierarchical filter and temporal convolution enhanced self-attention network
    Xingyao YANG, Hongtao SHEN, Zulian ZHANG, Jiong YU, Jiaying CHEN, Dongxiao WANG
    2024, 44(10):  3090-3096.  DOI: 10.11772/j.issn.1001-9081.2023091352
    Abstract | HTML | PDF (1877KB)

    Aiming at the noise arising from users’ unexpected interactions in practical recommendation scenarios and the difficulty of capturing short-term demand biases due to the dispersed attention of the self-attention mechanism, a model named FTARec (sequential Recommendation based on hierarchical Filter and Temporal convolution enhanced self-Attention network) was proposed. Firstly, a hierarchical filter was used to remove noise from the original data. Then, user embeddings were obtained by combining a temporal convolution enhanced self-attention network with decoupled hybrid position encoding; in this process, the self-attention network was enhanced with temporal convolution to compensate for deficiencies in modeling short-term dependencies among items. Finally, contrastive learning was incorporated to refine the user embeddings, and predictions were made based on the final user embeddings. Compared to existing sequential recommendation models such as Self-Attentive Sequential Recommendation (SASRec) and the Filter-enhanced Multi-Layer Perceptron approach for sequential Recommendation (FMLP-Rec), FTARec achieves higher Hit Rate (HR) and Normalized Discounted Cumulative Gain (NDCG) on three publicly available datasets: Beauty, Clothing, and Sports. Compared with the suboptimal model DuoRec, FTARec has the HR@10 increased by 7.91%, 13.27%, and 12.84%, and the NDCG@10 increased by 5.52%, 8.33%, and 9.88%, respectively, verifying the effectiveness of the proposed model.
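Filter-based denoising of interaction sequences (the idea FMLP-Rec popularized and that the hierarchical filter above builds on) can be sketched as frequency-domain low-pass filtering. The naive O(n^2) DFT below is for clarity only; real models use FFTs over embedding sequences and learn which frequencies to keep:

```python
import cmath

def dft(seq):
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)).real / n
            for t in range(n)]

def lowpass_filter(seq, keep=2):
    # Frequency-domain filtering: transform the sequence, zero out all
    # but the `keep` lowest frequencies (and their mirror images), and
    # transform back, suppressing high-frequency noise.
    spec = dft(seq)
    n = len(spec)
    filtered = [s if f < keep or f > n - keep else 0.0 for f, s in enumerate(spec)]
    return idft(filtered)

# An alternating (noisy) signal collapses to its smooth component:
smoothed = lowpass_filter([1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0], keep=1)
```

In FTARec the filtering is applied hierarchically before the temporal-convolution-enhanced attention layers, so the attention operates on a denoised sequence.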

    Fuzzy multi-granularity anomaly detection for incomplete mixed data
    Yuhao TANG, Dezhong PENG, Zhong YUAN
    2024, 44(10):  3097-3104.  DOI: 10.11772/j.issn.1001-9081.2023101419
    Abstract | HTML | PDF (827KB)

    In view of the inadequacy of most existing anomaly detection methods in handling incomplete mixed data, a fuzzy multi-granularity anomaly detection algorithm for incomplete mixed data, ADFIIS (Anomaly Detection in Fuzzy Incomplete Information System), was designed, which accounts for missing values in both nominal and numeric attributes and can handle mixed attribute data. Firstly, the fuzzy similarity between attributes was defined, and the fuzzy entropy of each attribute was calculated. Based on the entropy values, a multi-granularity approach was employed to construct multiple attribute sequences. Subsequently, the outlier score of each sample was calculated to characterize its degree of anomaly. Finally, the corresponding ADFIIS algorithm was designed and its complexity was analyzed. Experiments were conducted on publicly available datasets, and the proposed algorithm was compared with some mainstream outlier detection algorithms such as ILGNI (Incomplete Local and Global Neighborhood Information network). Experimental results show that ADFIIS has better Receiver Operating Characteristic (ROC) curve performance on incomplete mixed datasets: on average, the Area Under the ROC Curve (AUC) of ADFIIS is better than those of 90% of the comparison methods, and compared with ILGNI, which can also handle incomplete mixed data, the average AUC of ADFIIS is improved by 7 percentage points. The proposed algorithm uses the model expansion method to detect anomalies in incomplete datasets without changing the original datasets, which expands the application scope of anomaly detection.
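The fuzzy similarity / fuzzy entropy machinery can be sketched on a single attribute. The concrete choices here (exact match for nominal values, 1 - |a-b|/span for numeric ones, a neutral 0.5 for any comparison involving a missing value, and a mean-log cardinality entropy) are assumptions of this sketch, not ADFIIS's actual definitions:

```python
import math

MISSING = None

def fuzzy_sim(a, b, span=None):
    # Pairwise similarity on one attribute. A missing value contributes
    # a neutral 0.5 (an assumption of this sketch).
    if a is MISSING or b is MISSING:
        return 0.5
    if span is None:                 # nominal attribute
        return 1.0 if a == b else 0.0
    return 1.0 - abs(a - b) / span   # numeric attribute

def fuzzy_entropy(values, span=None):
    # Attribute-level fuzzy entropy: the more similar the samples are
    # to each other under this attribute, the lower the entropy.
    n = len(values)
    total = 0.0
    for i in range(n):
        card = sum(fuzzy_sim(values[i], values[j], span) for j in range(n))
        total -= math.log2(card / n)
    return total / n

e_const = fuzzy_entropy(["a", "a", "a"])             # identical nominal values
e_mixed = fuzzy_entropy([0.0, 5.0, MISSING], span=10.0)
```

Sorting attributes by such entropy values is what drives the multi-granularity attribute sequences mentioned in the abstract.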

    Incomplete ordinal preference prediction using mixture of Plackett-Luce models
    Shengmin ZHENG, Xiaodong FU
    2024, 44(10):  3105-3113.  DOI: 10.11772/j.issn.1001-9081.2023101378
    Abstract | HTML | PDF (995KB)

    When aggregating the preferences of different users, the problem of inconsistent evaluation criteria among users can be solved by using ordinal preferences. However, users are often unable to provide complete ordinal preferences because of the large number of candidate options and high communication costs, which affects the reliability and accuracy of aggregation results in scenarios such as online service reputation measurement and group decision making. Therefore, users’ complete ordinal preferences need to be predicted, but existing prediction methods do not fully consider the diversity of preference distributions within a user group. To address this problem, a Mixture of Plackett-Luce (PL) models based Preference Prediction method for incomplete ordinal preferences (MixPLPP) was proposed. Firstly, linear extensions were sampled based on the user’s existing preferences. Then, a mixture of PL models was learned from the sampled linear extensions. Next, a model selection strategy based on maximizing the posterior probability was designed to select a model for the user. Finally, the user’s complete preferences were predicted based on the selected model. Experimental results on the public dataset Movielens show that the proposed method improves the prediction accuracy and Kendall rank Correlation Coefficient (Kendall CC) by 5.0% and 9.2% compared to the VSRank (Vector Similarity Rank) algorithm, by 1.5% and 3.5% compared to Certainty-based Preference Completion (CPC), and by 0.9% and 2.2% compared to BayesMallows-4. These results verify that the proposed method has good prediction ability and shows better prediction performance across multiple datasets and measurements.
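The Plackett-Luce model itself, and what a mixture of such models computes, can be shown directly: a ranking's probability is a product of stage-wise choices, each item chosen in proportion to its weight among the items not yet placed, and a mixture averages this over components. The weights and mixing coefficients below are toy values:

```python
def pl_probability(ranking, weights):
    # Plackett-Luce: the probability of a full ranking is a product of
    # stage-wise choices, each item picked in proportion to its weight
    # among the items not yet placed.
    prob, remaining = 1.0, list(ranking)
    for item in ranking:
        prob *= weights[item] / sum(weights[r] for r in remaining)
        remaining.remove(item)
    return prob

def mixture_probability(ranking, components):
    # A mixture of PL models: components are (mixing coefficient, weights)
    # pairs, capturing several preference sub-populations at once.
    return sum(alpha * pl_probability(ranking, w) for alpha, w in components)

w_uniform = {"a": 1.0, "b": 1.0, "c": 1.0}
p = pl_probability(["a", "b", "c"], w_uniform)      # (1/3) * (1/2) * 1
mix = mixture_probability(["a", "b"], [(0.5, {"a": 1.0, "b": 1.0}),
                                       (0.5, {"a": 3.0, "b": 1.0})])
```

MixPLPP fits such component weights from sampled linear extensions of each user's partial ranking, then predicts the completion under the component selected by maximum posterior probability.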

    Cyber security
    Review of histogram publication methods based on differential privacy
    Xuebin CHEN, Liyang SHAN, Rumin GUO
    2024, 44(10):  3114-3121.  DOI: 10.11772/j.issn.1001-9081.2023101520
    Abstract | HTML | PDF (1422KB)

    In the era of the digital economy, data publication plays a crucial role in data sharing, and histogram publication is a common method of data publication. However, histogram data publication faces privacy leakage issues; to address this concern, histogram publication methods based on Differential Privacy (DP) have been studied. Firstly, a brief description of DP and histogram properties was given, the research of the past five years, at home and abroad, on histogram publication methods for both static datasets and streaming data was reviewed, and the balance among the grouping number and types of histograms, noise, and grouping errors in static data, as well as the privacy budget allocation problem, was discussed. Secondly, the issues of data sampling, data prediction, and sliding windows for dynamic data grouping were explored. Additionally, DP histogram publication methods oriented to interval tree structures were investigated, in which the original data is transformed into a tree structure, and noise addition for tree-structured data, tree-structure based optimization, and privacy budget allocation for tree structures were discussed. Moreover, the feasibility and privacy of published histogram data, as well as the query range and accuracy of published histogram data, were discussed. Finally, comparative analysis of the relevant algorithms was conducted and their advantages and disadvantages were summarized, quantitative analysis and applicable scenarios for some algorithms were provided, and future research directions of DP-based histograms in various data scenarios were presented.
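The basic mechanism all of these methods build on is the Laplace mechanism: because adding or removing one record changes exactly one histogram bin by 1, the sensitivity is 1 and noise drawn from Laplace(0, 1/epsilon) per bin suffices for epsilon-DP. A minimal sketch with toy counts:

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_histogram(counts, epsilon, seed=0):
    # Laplace mechanism: a unit change of one record shifts exactly one
    # bin by 1, so the sensitivity is 1 and the noise scale is 1/epsilon.
    rng = random.Random(seed)
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]

noisy = dp_histogram([120, 45, 230, 8], epsilon=1.0)
```

The grouping, tree-structure, and budget-allocation techniques the review covers all aim to reduce the error this per-bin noise introduces, especially for range queries over many bins.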

    Patient-centric medical information sharing scheme based on IPFS and blockchain
    Xiaoyu DU, Shuaiqi LIU, Zhijie HAN, Zhenxiang HUO, Yujing WANG
    2024, 44(10):  3122-3133.  DOI: 10.11772/j.issn.1001-9081.2023101398
    Abstract | PDF (4391KB)

    The storage and sharing of Electronic Medical Records (EMR) among healthcare institutions play a crucial role in achieving cross-hospital diagnosis and hierarchical treatment, effectively reducing the burden on patients and avoiding redundant examinations. To address the difficulty of securely storing and sharing EMR, a patient-centric scheme for the secure storage and efficient sharing of EMR based on the InterPlanetary File System (IPFS) and blockchain, named Patient-Centric Medical Information Sharing based on IPFS and Blockchain (PCIB-MIS), was proposed. Firstly, a hybrid encryption strategy was employed to store and share EMR securely while reducing encryption and decryption time. Then, the ciphertext index of EMR was stored using blockchain technology. Next, a combination of consortium and private blockchains was utilized to decrease storage pressure, with EMR indices stored on hospitals’ private chains. Finally, the EMR ciphertext was stored on IPFS, ensuring data security and immutability. When retrieval of EMR across hospitals was needed, cross-chain calls and proxy re-encryption centered on the consortium chain were conducted. Security analysis and experimental results demonstrate that only authorized physicians can access patient records. Compared with a scheme using the public key encryption algorithm RSA (Rivest-Shamir-Adleman), the encryption and decryption time is reduced to milliseconds, and storing EMR on this system saves 98.8% of block storage space compared with storing it solely on the blockchain. The proposed scheme effectively achieves the secure storage and sharing of medical records, substantially reduces EMR encryption and decryption time, and alleviates blockchain storage pressure.

    Construction and benchmark detection of multimodal partial forgery dataset
    Shengyou ZHENG, Yanxiang CHEN, Zuxing ZHAO, Haiyang LIU
    2024, 44(10):  3134-3140.  DOI: 10.11772/j.issn.1001-9081.2023101506
    Abstract | HTML | PDF (1323KB)

    Aiming at the lack of multimodal forgery scenarios and partial forgery scenarios in existing video forgery datasets, a multimodal partial forgery dataset with adjustable forgery ratios, PartialFAVCeleb, was constructed by using a wide variety of audio and video forgery methods. The proposed dataset was based on the FakeAVCeleb multimodal forgery dataset and was built by splicing real and forged data, in which the forged data were generated by four methods: FaceSwap, FSGAN (Face Swapping Generative Adversarial Network), Wav2Lip (Wave to Lip), and SV2TTS (Speaker Verification to Text-To-Speech). In the splicing process, probabilistic methods were used to generate the locations of the forgery segments in the time domain and across modalities, the boundaries were then randomized to fit actual forgery scenarios, and background hopping was avoided through material screening. The final dataset contains forgery videos of different ratios, with each ratio corresponding to 3 970 videos. In the benchmark detection, several audio and video feature extractors were used, and the data were tested under strongly supervised and weakly supervised conditions, with Hierarchical Multi-Instance Learning (HMIL) used for the latter. The test results indicate that, for each tested model, the performance on data with a low forgery ratio is significantly inferior to that on data with a high forgery ratio, and the performance under the weakly supervised condition is significantly inferior to that under the strongly supervised condition, verifying the difficulty of weakly supervised detection on the proposed partial forgery dataset. Experimental results show that the multimodal partial forgery scenario represented by the proposed dataset has sufficient research value.

    Advanced computing
    Quantum intermediate representation and translation based on power-of-two matrix
    Wenxuan TAO, Gang CHEN
    2024, 44(10):  3141-3150.  DOI: 10.11772/j.issn.1001-9081.2023091358
    Abstract ( )   HTML ( )   PDF (970KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    All quantum gates, states and measurement operators in qubit systems can be represented as power-of-two matrices. However, existing quantum programming frameworks have not taken this into account. Therefore, a power-of-two matrix type system was proposed, and a corresponding quantum intermediate representation was designed. Firstly, the power-of-two matrix system was implemented using a recursive dual structure in the theorem prover Coq, accurately describing quantum gates, states and measurement operators. Then, a quantum intermediate representation was designed as a programming tool, capable of automatically translating quantum programs into power-of-two matrix expressions. Finally, the writing and translation process of the quantum Fourier transform was demonstrated. The power-of-two matrix system provides a more accurate and concise type system for quantum programming frameworks. The quantum intermediate representation facilitates a transition from power-of-two matrices to programming languages, serving as an effective means to write quantum programs in the power-of-two matrix system.
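
    The recursive dual structure can be pictured as follows: a 2^n×2^n matrix is either a 1×1 scalar or four 2^(n-1)×2^(n-1) blocks. This is a hypothetical Python rendering of the idea the paper formalizes in Coq; the type and function names are ours.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Scalar:            # base case: a 1x1 matrix
    value: complex

@dataclass
class Quad:              # recursive case: block matrix [[a, b], [c, d]]
    a: "P2M"
    b: "P2M"
    c: "P2M"
    d: "P2M"

P2M = Union[Scalar, Quad]

def dim(m: P2M) -> int:
    """Side length of the matrix: always a power of two by construction."""
    return 1 if isinstance(m, Scalar) else 2 * dim(m.a)

def add(x: P2M, y: P2M) -> P2M:
    """Entrywise sum of two matrices of the same recursive shape."""
    if isinstance(x, Scalar):
        return Scalar(x.value + y.value)
    return Quad(add(x.a, y.a), add(x.b, y.b), add(x.c, y.c), add(x.d, y.d))
```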

    Dynamic surface asymptotic compensation algorithm for multi-agent systems
    Antai SUN, Ye LIU, Dongmei XU
    2024, 44(10):  3151-3157.  DOI: 10.11772/j.issn.1001-9081.2023101414
    Abstract ( )   HTML ( )   PDF (2187KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Aiming at a class of cooperative control problems for multi-agent systems with hysteresis inputs, an asymptotic compensation control algorithm with neural network finite-time performance based on a dynamic surface was designed. Firstly, Funnel control was combined with a finite-time performance function to ensure that the consensus error could enter a predefined range in finite time. Secondly, the unfavorable effects of unknown nonlinear functions within the system and unknown external perturbations were eliminated using a Radial Basis Function Neural Network (RBFNN) as well as inequality transformations. In addition, by estimating the upper bounds of some unknown variables, the number of adaptive laws required in the design process was greatly reduced. At the same time, a nonlinear filter with a hyperbolic tangent function was proposed to avoid the problem of “differential explosion” in traditional backstepping control and to eliminate the filter error. Finally, a hysteresis pseudo-inverse compensation signal was designed based on the proposed nonlinear filter to effectively compensate the unknown hysteresis without constructing the hysteresis inverse. Using the Lyapunov stability theory, it is verified that all signals within the closed-loop system are bounded and the consensus error converges to zero asymptotically. Simulation examples also show the effectiveness of the proposed algorithm.

    Symbolic regression method for integer sequence based on self-learning
    Kaiming SUN, Dongfeng CAI, Yu BAI
    2024, 44(10):  3158-3166.  DOI: 10.11772/j.issn.1001-9081.2023101427
    Abstract ( )   HTML ( )   PDF (1820KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the problem that existing symbolic regression methods are difficult to generalize effectively to sequences in the On-line Encyclopedia of Integer Sequences (OEIS), a symbolic regression method for integer sequences based on Self-Learning (SL) was proposed. Firstly, a variety of learning data were constructed through programs and integrated into high-order linear recursive data according to the characteristics of OEIS data, and the OEIS initial terms were used to generate recursive sequences. Secondly, the learning data were converted into OEIS data, and a strategy of fusing multiple OEIS data as the data of the initial iteration was proposed. Finally, the formulas of the OEIS sequences were gradually discovered through self-learning iteration. The iteration process was divided into four stages: Learn, Search, Check and Select. Experimental results show that the proposed method is better than the Deep Symbolic Regression (DSR) method and Mathematica’s built-in function. Compared with DSR on the three test sets of Easy, Sign and Base, the accuracy of the proposed method was improved by 9.66, 4.17, and 5.14 percentage points respectively. A total of 27 433 formulas of OEIS sequences were found. The newly discovered formulas can assist mathematicians in conducting related theoretical research.
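
    The Check stage of the Learn/Search/Check/Select iteration reduces to verifying that a candidate formula reproduces the known OEIS terms. A minimal sketch, assuming the candidate is given as linear recurrence coefficients (the function name is ours, not the paper's):

```python
def check_linear_recurrence(seq, coeffs):
    """Return True if seq[n] == sum(coeffs[i] * seq[n-1-i]) for all valid n.

    Illustrates only the 'Check' stage of the self-learning loop on
    high-order linear recursive data; not the paper's implementation.
    """
    k = len(coeffs)
    return all(
        seq[n] == sum(c * seq[n - 1 - i] for i, c in enumerate(coeffs))
        for n in range(k, len(seq))
    )
```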

    Computer software technology
    Predictive business process monitoring method based on concept drift
    Hua HUANG, Ziyi YANG, Xiaolong LI, Chuang LI
    2024, 44(10):  3167-3176.  DOI: 10.11772/j.issn.1001-9081.2023101460
    Abstract ( )   HTML ( )   PDF (2286KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    A Predictive business Process Monitoring (PPM) method based on concept drift was proposed to solve the problems of model accuracy decreasing over time and poor real-time performance in existing Business Process Monitoring (BPM) methods. Firstly, the event log data was preprocessed and encoded. Secondly, a Bidirectional Long Short-Term Memory (Bi-LSTM) network model was used to capture enough sequence information from both forward and backward directions in order to build the business process model. At the same time, the attention mechanism was utilized to fully consider the contributions of different events to the model’s prediction results, and assign different weights to event logs to reduce the influence of noise on the prediction results. Finally, the executing instances were input into the constructed model to obtain the predicted execution results, and the results were used as historical data to fine-tune the model. By testing on 8 publicly available and real datasets, the results show that the proposed method has an average prediction accuracy improvement of 5.4%-23.8% compared to existing BPM methods such as Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), and the proposed method also outperforms existing research methods in terms of earliness and timeliness.

    Multimedia computing and computer simulation
    Robust splicing forensic algorithm against high-intensity salt-and-pepper noise
    Pengbo WANG, Wuyang SHAN, Jun LI, Mao TIAN, Deng ZOU, Zhanfeng FAN
    2024, 44(10):  3177-3184.  DOI: 10.11772/j.issn.1001-9081.2023101462
    Abstract ( )   HTML ( )   PDF (2871KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    In the field of image forensics, image splicing detection technology can identify splicing and locate the spliced area through the analysis of image content. However, in common scenarios like transmission and scanning, salt-and-pepper (s&p) noise appears randomly and inevitably, and as the intensity of the noise increases, current splicing forensic methods progressively lose effectiveness and may ultimately fail. Therefore, a splicing forensic algorithm robust against high-intensity s&p noise was proposed. The proposed algorithm was divided into two main parts: preprocessing and splicing forensics. Firstly, in the preprocessing part, a fusion of ResNet32 and a median filter was employed to remove s&p noise from the image, and the damaged image content was restored through the convolutional layers, so as to minimize the influence of s&p noise on the splicing forensic part and restore image details. Then, in the splicing forensics part, based on the Siamese network structure, the noise artifacts associated with the image’s uniqueness were extracted, and the spliced area was identified through inconsistency assessment. Experimental results on widely used tampering datasets show that the proposed algorithm achieves good results on both RGB and grayscale images. In a 10% noise scenario, the proposed algorithm increases the Matthews Correlation Coefficient (MCC) value by over 50% compared to the FS (Forensic Similarity) and PSCC-Net (Progressive Spatio-Channel Correlation Network) forensic algorithms, validating the effectiveness and advancement of the proposed algorithm in forensic analysis of tampered images with noise.

    ORB-SLAM2 algorithm based on dynamic feature point filtering and optimization of keyframe selection
    Xukang KAN, Gefei SHI, Xuerong YANG
    2024, 44(10):  3185-3190.  DOI: 10.11772/j.issn.1001-9081.2023101465
    Abstract ( )   HTML ( )   PDF (3326KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    The Simultaneous Localization And Mapping (SLAM) algorithm suffers from a decrease in localization accuracy when moving targets appear. Introducing instance segmentation and similar algorithms can handle dynamic scenes, but it is difficult to ensure the real-time performance of the SLAM algorithm. Additionally, camera shake during motion may lead to inaccurate keyframe selection and tracking loss. In response to these issues, an ORB-SLAM2 algorithm based on dynamic feature point filtering and optimization of keyframe selection was proposed to ensure the real-time performance of the SLAM algorithm, effectively reduce the influence of dynamic feature points on its positioning accuracy, and simultaneously address the inaccurate keyframe selection caused by camera shake. In the proposed algorithm, the YOLOv5 algorithm was introduced on the basis of the ORB-SLAM2 algorithm to identify moving targets. In the tracking thread, dynamic target feature points were filtered out, thereby achieving a balance between real-time performance and positioning accuracy. At the same time, a discriminative criterion based on the inter-frame relative motion quantity was proposed for keyframe selection, thereby enhancing the accuracy of keyframe selection. Experimental results on the freiburg3_walking_xyz dataset indicate that, compared to the ORB-SLAM2 algorithm, the proposed algorithm has a 38.54% reduction in average processing time and a 95.2% improvement in Root Mean Square Error (RMSE) accuracy of the absolute trajectory error. It can be seen that the proposed algorithm can address the issues mentioned above effectively, enhance the positioning accuracy and precision of the SLAM algorithm, and thus improve the usability of the maps.
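
    A keyframe criterion based on inter-frame relative motion quantity could take the following shape; the motion measure, weights and threshold below are illustrative assumptions, not values from the paper.

```python
import math

def relative_motion(translation, yaw_delta, w_t=1.0, w_r=0.5):
    """Scalar inter-frame motion: weighted translation norm plus rotation."""
    t_norm = math.sqrt(sum(v * v for v in translation))
    return w_t * t_norm + w_r * abs(yaw_delta)

def is_keyframe(translation, yaw_delta, threshold=0.25):
    """Accept a frame as keyframe only if it moved enough since the last one."""
    return relative_motion(translation, yaw_delta) >= threshold
```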

    YOLOv7-MSBP target location algorithm for character recognition of power distribution cabinet
    Cheng WANG, Yang WANG, Yingjiao RONG
    2024, 44(10):  3191-3199.  DOI: 10.11772/j.issn.1001-9081.2023101496
    Abstract ( )   HTML ( )   PDF (3829KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Accurately locating the instrument position of a power distribution cabinet through machine vision is the key to realizing intelligent identification of instruments. Aiming at the problem of low target positioning accuracy caused by the complex background of power distribution cabinets, varied character scales, and small camera pixels, a YOLOv7-MSBP target location algorithm for character recognition of power distribution cabinets was proposed. Firstly, a Micro-branch detection branch was designed and the initial anchor box laying interval was changed to improve the detection accuracy for small targets. Secondly, the Bi-directional Feature Pyramid Network (BiFPN) was introduced to fuse the feature values of different layers across scales, thereby mitigating the loss of detailed features and insufficient feature fusion caused by downsampling. Meanwhile, a Synchronous Convolutional Block Attention Module (Syn-CBAM) was designed, in which channel and spatial attention features were fused with weights, thereby improving the feature extraction ability of the algorithm. And a Partial Convolution (PConv) module was introduced in the backbone network to reduce model redundancy and delay, and increase detection speed. Finally, the positioning results of YOLOv7-MSBP were sent to the Paddle OCR (Optical Character Recognition) model for character recognition. Experimental results show that the mean Average Precision (mAP) of the YOLOv7-MSBP algorithm reaches 93.2%, which is 4.3 percentage points higher than that of the YOLOv7 algorithm. It can be seen that the proposed algorithm can locate and recognize the characters of the power distribution cabinet quickly and accurately, which verifies its effectiveness.

    Few-shot cervical cell classification combining weighted prototype and adaptive tensor subspace
    Li XIE, Weiping SHU, Junjie GENG, Qiong WANG, Hailin YANG
    2024, 44(10):  3200-3208.  DOI: 10.11772/j.issn.1001-9081.2023101416
    Abstract ( )   HTML ( )   PDF (2195KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Deep learning image classification algorithms typically rely on a large amount of training data. However, for cervical cell classification tasks in the medical field, collecting a large amount of image data is difficult. To accurately classify cervical cells with a limited number of image samples, a few-shot classification algorithm Combining Weighted Prototype and Adaptive Tensor Subspace (CWP-ATS) was proposed. Firstly, the pre-training technique was combined with meta-learning to ensure that the feature extraction network learned more prior knowledge from the meta-training set. Subsequently, the maximum mean discrepancy algorithm was adopted in the prototype computation procedure to assign an appropriate weight to each support set sample, and the transductive learning algorithm was further employed to make corrections and obtain more accurate prototypes. Finally, the multilinear principal component analysis algorithm was utilized to project each class of samples into its respective low-dimensional tensor subspace, enabling efficient adaptive subspace classifiers to be learned in the low-dimensional space without compromising the natural structures of the original tensor features. In the 2-way 10-shot and 3-way 10-shot classification tasks of few-shot Herlev cervical cell images, compared with the DeepBDC (Deep Brownian Distance Covariance) algorithm, CWP-ATS improved the classification accuracy by 2.43 and 3.23 percentage points, respectively; when 30% of the meta-test set samples were corrupted by noise, in comparison with the prototype network, the classification accuracy of CWP-ATS was improved by more than 20 percentage points. The experimental results demonstrate that the proposed algorithm can effectively improve the classification accuracy and robustness of few-shot cervical cell classification.
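
    The weighted-prototype idea can be sketched as follows: support samples far from the class mean (potential outliers) receive smaller weights. This is a simplified surrogate for the maximum mean discrepancy weighting described in the abstract, with illustrative names and parameters.

```python
import math

def weighted_prototype(support, temperature=1.0):
    """Class prototype as a weighted mean of support-set feature vectors.

    Simplified stand-in for MMD-based weighting: samples close to the
    plain mean get larger softmax weights, down-weighting outliers.
    """
    n, d = len(support), len(support[0])
    mean = [sum(x[j] for x in support) / n for j in range(d)]
    dists = [math.dist(x, mean) for x in support]
    weights = [math.exp(-dist / temperature) for dist in dists]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * x[j] for w, x in zip(weights, support)) for j in range(d)]
```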

    Fish image classification based on positional overlapping patch embedding and multi-scale channel interactive attention
    Wen ZHOU, Yuzhang CHEN, Zhiyuan WEN, Shiqi WANG
    2024, 44(10):  3209-3216.  DOI: 10.11772/j.issn.1001-9081.2023101466
    Abstract ( )   HTML ( )   PDF (2604KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Underwater fish image classification is a highly challenging task. The traditional Vision Transformer (ViT) network backbone is limited in processing local continuous features, and does not perform well in fish classification with lower image quality. To solve this problem, a Transformer-based image classification network based on Overlapping Patch Embedding (OPE) and Multi-scale Channel Interactive Attention (MCIA), called PIFormer (Positional overlapping and Interactive attention transFormer), was proposed. PIFormer was built as multiple stages, each stacked a different number of times, to facilitate the extraction of features at different depths. Firstly, the deep Positional Overlapping Patch Embedding (POPE) module was introduced to overlap and slice the feature map and edge information, so as to retain the local continuous features of the fish body; at the same time, position information was added for ordering, helping PIFormer integrate detailed features and build a global map. Then, the MCIA module was proposed to process local and global features in parallel and establish long-distance dependencies among different parts of the fish body. Finally, the high-level features were processed by Group Multi-Layer Perceptron (GMLP) to improve the efficiency of the network and realize the final fish classification. To verify the effectiveness of PIFormer, a self-built dataset of freshwater fishes in East Lake was proposed, and the public datasets Fish4Knowledge and NCFM (Nature Conservancy Fisheries Monitoring) were used to ensure experimental fairness. Experimental results demonstrate that the Top-1 classification accuracy of the proposed network on these datasets reaches 97.99%, 99.71% and 90.45% respectively. Compared with ViT, Swin Transformer and PVT (Pyramid Vision Transformer) of the same depth, the proposed network has the number of parameters reduced by 72.62×10⁶, 14.34×10⁶ and 11.30×10⁶ respectively, and the FLoating-point OPerations (FLOPs) reduced by 14.52×10⁹, 2.02×10⁹ and 1.48×10⁹ respectively. It can be seen that PIFormer has strong fish image classification capability with reduced computational burden, achieving superior performance.

    Dual branch synthetic speech detection based on attention and squeeze-excitation inception
    Han WANG, Lasheng ZHAO, Qiang ZHANG, Yinqing CHENG, Zepeng QIU
    2024, 44(10):  3217-3222.  DOI: 10.11772/j.issn.1001-9081.2023101458
    Abstract ( )   HTML ( )   PDF (1218KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Synthetic speech attacks can pose a significant threat to people’s lives. To address the existing models’ lack of ability to extract key information from redundant data and the limitations of a single model in exploiting the advantages of multiple detection models, a synthetic speech detection model based on Dual branch with Attention Branch and Squeeze-Excitation Inception (SE-Inc) Branch (Dual-ABIB) was proposed. Firstly, the initial feature maps extracted by the Sinc-based Convolutional Neural Network (SincNet) were utilized to train the attention branch of the synthetic speech detection model, and the attention maps were output. Secondly, the attention maps were multiplied and superposed with the original feature maps, and the result was used as the input to train the SE-Inc branch. Finally, the classification scores obtained by the two branches were combined through decision-level weighted fusion to achieve synthetic speech detection. Experimental results show that the proposed model achieves a minimum tandem Detection Cost Function (min t-DCF) of 0.033 2 and an Equal Error Rate (EER) of 1.15% on the ASVspoof2019 dataset with 539×10³ parameters. Compared with SE-ResABNet (Squeeze-Excitation ResNet Attention Branch Network), the proposed model reduces the min t-DCF and EER by 34.5% and 39.2% respectively while using only 56% of the parameters. At the same time, the proposed model shows better generalization ability on the ASVspoof2015 and ASVspoof2021 datasets. These results verify that Dual-ABIB can obtain lower min t-DCF and EER with fewer parameters.
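
    Decision-level weighted fusion of the two branch scores amounts to a convex combination; the weight alpha below is an illustrative value, not the paper's tuned setting.

```python
def fuse_scores(attn_score, inc_score, alpha=0.6):
    """Fuse the attention-branch and SE-Inc-branch classification scores
    at the decision level; alpha weights the attention branch."""
    return alpha * attn_score + (1.0 - alpha) * inc_score
```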

    Frontier and comprehensive applications
    Industrial chain risk assessment and early warning model combining hierarchical graph neural network and long short-term memory
    Xiaoyu HUA, Dongfen LI, You FU, Kejun BI, Shi YING, Ruijin WANG
    2024, 44(10):  3223-3231.  DOI: 10.11772/j.issn.1001-9081.2023101387
    Abstract ( )   HTML ( )   PDF (2113KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Industrial chain risk assessment and early warning are essential measures to effectively protect the interests of upstream and downstream companies in the industrial chain and mitigate company risks. However, existing methods often fail to account for the propagation effects between upstream and downstream companies in the industrial chain and the opacity of company information, resulting in inaccurate risk assessment of companies and the failure to perceive risks in advance for early warning. To address the above problems, the Hierarchical Graph Neural Network (HiGNN), an industrial chain risk assessment and early warning model that combined a Hierarchical Graph (HG) neural network and Long Short-Term Memory (LSTM), was proposed. Firstly, an “industrial chain-investment” HG was constructed based on the relationships between upstream and downstream companies and investment activities. Then, a financial feature extraction module was utilized to extract features from multi-quarter financial data of companies, while an investment feature extraction module was utilized to extract features from the investment relationship graph. Finally, an attention mechanism was employed to integrate the financial features with the investment features, enabling risk classification of company nodes through graph representation learning methods. The experimental results on a real integrated circuit manufacturing dataset showed that, compared with the Graph ATtention network (GAT) model and the Recurrent Neural Network (RNN) model, the accuracy of the proposed model increased by 14.87% and 22.10%, and the F1-score increased by 12.63% and 16.67% at a 60% training ratio. The proposed model can effectively capture the contagion effect in the industrial chain and improve risk identification capability, outperforming traditional machine learning methods and graph neural network methods.

    Joint optimization of UAV swarm path planning and task allocation balance in earthquake scenarios
    Jian SUN, Baoquan MA, Zhuiwei WU, Xiaohuan YANG, Tao WU, Pan CHEN
    2024, 44(10):  3232-3239.  DOI: 10.11772/j.issn.1001-9081.2023101432
    Abstract ( )   HTML ( )   PDF (1573KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Unmanned Aerial Vehicle (UAV) swarm path planning and task allocation are the core of UAV swarm rescue applications. However, traditional methods solve path planning and task allocation separately, resulting in uneven resource allocation. To solve this problem, the Ant Colony Optimization (ACO) algorithm was improved by combining the physical attributes and application environment factors of UAV swarms, and a Joint Parallel ACO (JPACO) was proposed. Firstly, the pheromone was updated through a hierarchical pheromone enhancement coefficient mechanism to improve the task allocation balance and energy consumption balance of JPACO. Secondly, a path balance factor and a dynamic probability transfer factor were designed to optimize the ant colony model, which tends to fall into local convergence, so as to improve the global search capability of JPACO. Finally, a cluster parallel processing mechanism was introduced to reduce the running time of JPACO. JPACO was compared with Adaptive Dynamic ACO (ADACO), Scanning Motion ACO (SMACO), Greedy Strategy ACO (GSACO) and Intersecting ACO (IACO) in terms of optimal path, task allocation balance, energy consumption balance and running time on the open dataset CVRPLIB. Experimental results show that the average value of the optimal paths of JPACO is 7.4% and 16.3% lower than that of IACO and ADACO respectively when processing small-scale operations. Compared with GSACO and ADACO, JPACO reduces the solution time by 8.2% and 22.1% in large-scale operations. It is verified that JPACO can improve the optimal path when dealing with small-scale operations, and is clearly superior to the comparison algorithms in terms of task allocation balance, energy consumption balance, and running time when processing large-scale operations.

    Double auction carbon trading based on consortium blockchain
    Chaoying YAN, Ziyi ZHANG, Yingnan QU, Qiuyu LI, Dixiang ZHENG, Lijun SUN
    2024, 44(10):  3240-3245.  DOI: 10.11772/j.issn.1001-9081.2023101433
    Abstract ( )   HTML ( )   PDF (1654KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Carbon trading is an important way to reduce greenhouse gas emissions and develop a low-carbon economy. Traditional carbon trading mainly suffers from problems such as the wide distribution of subjects, poor data interoperability and low efficiency. Taking a consortium blockchain with an access mechanism as the infrastructure of carbon trading, the security and traceability of transaction data can be ensured. Therefore, a double auction transaction algorithm based on consortium blockchain was proposed, which was divided into two phases with user satisfaction taken into account. In the first phase, price ranges of quotation were proposed by all nodes, and transactions were concluded immediately by the nodes meeting the conditions of this phase. Then, the remaining transaction nodes entered the second phase, where the transaction volume matching degree was calculated and the overall user satisfaction was taken as the optimization goal, so that the result with the maximum overall satisfaction was output. The proposed algorithm was compared with the Hungarian algorithm and the Gale-Shapley (GS) algorithm. Experimental results showed that the proposed algorithm improved user satisfaction and reduced the average matching time by 26.2% and 36.0%, respectively. With HLF (HyperLedger Fabric) used to deploy a double auction smart contract that calculates and processes user transaction requests automatically, and with the transaction results recorded on the channel ledger of the consortium blockchain through consensus, the proposed algorithm can achieve stable transaction throughput under different block sizes and transaction request loads.
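
    The two-phase matching can be sketched as below: phase one closes any deal whose bid covers the ask immediately, and phase two pairs the leftovers by volume-matching degree. A simplified illustration only; the paper's satisfaction model and smart-contract logic are richer.

```python
def double_auction(buyers, sellers):
    """Two-phase double auction sketch.

    buyers/sellers: lists of (price, volume) tuples. Phase 1 trades at
    the midpoint price whenever bid >= ask; phase 2 matches remaining
    nodes by the closest trade volume (a crude satisfaction proxy).
    """
    deals, rest_buyers, rest_sellers = [], [], list(sellers)
    for bid, b_vol in sorted(buyers, reverse=True):
        match = next((s for s in rest_sellers if s[0] <= bid), None)
        if match:
            rest_sellers.remove(match)
            deals.append(((bid + match[0]) / 2, min(b_vol, match[1])))
        else:
            rest_buyers.append((bid, b_vol))
    # Phase 2: pair remaining nodes by volume-matching degree.
    for bid, b_vol in rest_buyers:
        if not rest_sellers:
            break
        s = min(rest_sellers, key=lambda x: abs(x[1] - b_vol))
        rest_sellers.remove(s)
        deals.append(((bid + s[0]) / 2, min(b_vol, s[1])))
    return deals
```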

    Self-recovery adaptive Monte Carlo localization algorithm based on support vector machine
    Enbao QIAO, Xiangyang GAO, Jun CHENG
    2024, 44(10):  3246-3251.  DOI: 10.11772/j.issn.1001-9081.2023101389
    Abstract ( )   HTML ( )   PDF (2828KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    The localization technology of robots is crucial for the efficient, precise, and safe operation of intelligent robots. However, in actual localization processes, robots often encounter the “kidnapping” problem. In response to this challenge, a Self-Recovery Adaptive Monte Carlo Localization (AMCL) algorithm based on Support Vector Machine (SVM-SRAMCL) was proposed. Firstly, a detection model was constructed to identify the “kidnapping” state of the robot, known as the Kidnapping Detection Model based on SVM (SVM-KDM). Then, particle characteristic values were calculated from the particle set obtained through the AMCL algorithm and used as inputs for SVM-KDM. Once a “kidnapping” event was detected, an Extended Kalman Filter (EKF) was employed to fuse data from the Inertial Measurement Unit (IMU) and Odometry (Odom) to estimate the robot’s new pose. Finally, the AMCL algorithm was utilized for particle prediction, update, and resampling, ultimately achieving the robot’s relocalization. Compared to the Self-Recovery Monte Carlo Localization (SR-MCL) algorithm, the proposed algorithm reduced the number of updates required for post-kidnapping recovery by 4.1 and increased the success rate of relocalization by 3 percentage points. The experimental results validate the higher efficiency and success rate of the proposed algorithm when addressing the “kidnapping” issue in mobile robot localization.
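
    The particle characteristic values fed to SVM-KDM can be sketched as simple statistics over the particle set; the specific features below (mean weight, weight variance, spatial spread) are illustrative assumptions, not the paper's exact feature set.

```python
import statistics

def particle_features(particles):
    """Characteristic values of an AMCL particle set.

    particles: list of (x, y, weight). A sudden drop in mean weight or a
    jump in spread would hint at a 'kidnapping' event for the detector.
    """
    xs = [p[0] for p in particles]
    ys = [p[1] for p in particles]
    ws = [p[2] for p in particles]
    return {
        "mean_weight": statistics.fmean(ws),
        "weight_var": statistics.pvariance(ws),
        "spread": statistics.pstdev(xs) + statistics.pstdev(ys),
    }
```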

    The 40th CCF National Database Conference (NDBC 2023)
    Recommendation method using knowledge graph embedding propagation
    Beijing ZHOU, Hairong WANG, Yimeng WANG, Lisi ZHANG, He MA
    2024, 44(10):  3252-3259.  DOI: 10.11772/j.issn.1001-9081.2023101508
    Abstract ( )   HTML ( )   PDF (1719KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    According to the richness of user and item information in Knowledge Graph (KG), the existing recommendation methods with graph embedding propagation can be summarized into three categories: user embedding propagation, item embedding propagation, and hybrid embedding propagation. The user embedding propagation method focuses on using items interacted with by users together with the KG to learn user representations; the item embedding propagation method uses entities in the KG to represent items; the hybrid embedding propagation method integrates user-item interaction information and the KG, addressing the insufficient information utilization of the first two methods. The technical characteristics of these three categories were compared in depth by analyzing the key technologies of the three core tasks in recommendation methods with graph embedding propagation: graph construction, embedding propagation, and prediction. At the same time, by replicating mainstream models of each category on common datasets such as MovieLens, Book-Crossing, and Last.FM, and comparing their effects using the CTR (Click-Through Rate) metric, it is found that the recommendation method with hybrid embedding propagation has the best recommendation performance, as it combines the advantages of the user and item embedding propagation methods, utilizing both interaction information and the KG to enhance the representations of users and items. Additionally, a comparative analysis of the various categories of methods was performed, their advantages and disadvantages were elaborated, and future research directions were proposed.

    Information diffusion prediction model of prototype-aware dual-channel graph convolutional neural network
    Nengqiang XIANG, Xiaofei ZHU, Zhaoze GAO
    2024, 44(10):  3260-3266.  DOI: 10.11772/j.issn.1001-9081.2023101557
    Abstract ( )   HTML ( )   PDF (1549KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the problem that existing information diffusion prediction models struggle to mine users’ dependency on cascades, a Prototype-aware Dual-channel Graph Convolutional neural Network (PDGCN) information diffusion prediction model was proposed. Firstly, a HyperGraph Convolutional Network (HGCN) was used to learn user representations and cascade representations at the cascade hypergraph level, while a Graph Convolutional Network (GCN) was used to learn user representations based on the dynamic friendship forwarding graph. Secondly, for a given target cascade, the user representations that matched the current cascade were found from the above two levels of user representations, and the two representations were fused. Thirdly, prototypes of cascade representations were obtained through a clustering algorithm. Finally, the prototype that best matched the current cascade was found, and this prototype was integrated into each user representation in the current cascade to calculate the diffusion probability of candidate users. Compared with the Memory-enhanced Sequential HyperGraph ATtention network (MS-HGAT), PDGCN improved Hits@100 by 1.17% and MAP@100 by 5.02% on the Twitter dataset, and improved Hits@100 by 3.88% and MAP@100 by 0.72% on the Android dataset. Experimental results show that the proposed model outperforms the comparison model in the information diffusion prediction task and has better prediction performance.
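
    The prototype-matching step (finding the prototype that best fits the current cascade representation) can be sketched with cosine similarity; the similarity measure is our illustrative choice, not necessarily the paper's.

```python
import math

def nearest_prototype(cascade_vec, prototypes):
    """Index of the prototype most similar to the cascade representation."""
    def cosine(u, v):
        num = sum(a * b for a, b in zip(u, v))
        return num / (math.hypot(*u) * math.hypot(*v))
    return max(range(len(prototypes)),
               key=lambda k: cosine(cascade_vec, prototypes[k]))
```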

    Multi-view clustering network guided by graph contrastive learning
    Yunhua ZHU, Bing KONG, Lihua ZHOU, Hongmei CHEN, Chongming BAO
    2024, 44(10):  3267-3274.  DOI: 10.11772/j.issn.1001-9081.2023101481
    Abstract ( )   HTML ( )   PDF (2492KB) ( )
    Figures and Tables | References | Related Articles | Metrics

    Multi-view clustering has attracted much attention due to its ability to utilize information from multiple perspectives. However, current multi-view clustering algorithms generally suffer from the following issues: 1) they focus on either attribute features or structural features of the data without fully integrating both to improve the quality of the latent embeddings; 2) methods based on graph neural networks can utilize attribute and structural data simultaneously, but models based on graph convolution or graph attention tend to produce over-smoothed results when the network becomes too deep. To address these problems, a Multi-view Clustering Network guided by Graph Contrastive Learning (MCNGCL) was proposed. Firstly, the private representation of each view was captured using a multi-view autoencoder module. Secondly, a common representation was constructed through adaptively weighted fusion. Thirdly, a graph contrastive learning module was incorporated to make adjacent nodes more easily partitioned into the same cluster, while also alleviating the over-smoothing problem when aggregating neighbor node information. Finally, a self-supervised clustering module was used to optimize the common representation and the private representations of views towards more favorable clustering directions. The experimental results demonstrate that MCNGCL achieves promising performance on multiple datasets. For instance, on the 3sources dataset, compared with the second-best Consistent Multiple Graph Embedding for multi-view Clustering (CMGEC), the accuracy of MCNGCL was improved by 2.83 percentage points and the Normalized Mutual Information (NMI) by 3.70 percentage points. The effectiveness of MCNGCL was also confirmed by ablation experiments and parameter sensitivity analysis.

    Aspect sentiment triplet extraction integrating semantic and syntactic information
    Yanbo LI, Qing HE, Shunyi LU
    2024, 44(10):  3275-3280.  DOI: 10.11772/j.issn.1001-9081.2023101479

    Aspect Sentiment Triplet Extraction (ASTE) is a challenging subtask of aspect-based sentiment analysis, which aims to extract aspect terms, opinion terms, and the corresponding sentiment polarities from a given sentence. Existing models for the ASTE task fall into pipeline models and end-to-end models. To address the error propagation in pipeline models and the fact that most end-to-end models overlook the rich semantic information in sentences, a Semantic and Syntax Enhanced Dual-channel model for ASTE (SSED-ASTE) was proposed. First, the BERT (Bidirectional Encoder Representation from Transformers) encoder was used to encode the context. Then, a Bi-directional Long Short-Term Memory (Bi-LSTM) network was used to capture contextual semantic dependencies. Next, two parallel Graph Convolution Networks (GCNs) were utilized to extract semantic features and syntactic features using a self-attention mechanism and dependency syntactic parsing, respectively. Finally, the Grid Tagging Scheme (GTS) was used for triplet extraction. Experimental analysis was conducted on four public datasets; compared with the GTS-BERT model, the F1 values of the proposed model increased by 0.29, 1.50, 2.93, and 0.78 percentage points, respectively. The experimental results demonstrate that the proposed model effectively utilizes implicit semantic and syntactic information in sentences and achieves more accurate triplet extraction.
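One way the semantic channel can build a soft graph over tokens for a GCN is via scaled dot-product self-attention on the token features. The sketch below is a generic, parameter-free illustration under that assumption, not the paper's exact parameterization (which would include learned projections).

```python
import numpy as np

def self_attention_adjacency(H):
    """Build a soft 'semantic graph' from token features H (n_tokens, d)
    via scaled dot-product self-attention. Each row of the returned
    matrix is a probability distribution over neighbor tokens."""
    d = H.shape[1]
    scores = H @ H.T / np.sqrt(d)                 # pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)   # stabilize the softmax
    A = np.exp(scores)
    return A / A.sum(axis=1, keepdims=True)       # row-normalize

# Toy token features: three tokens with one-hot embeddings.
H = np.eye(3)
A = self_attention_adjacency(H)
```

The resulting `A` can serve as the adjacency input of the semantic-channel GCN, in contrast to the hard adjacency derived from dependency parsing in the syntax channel.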

    Semi-supervised stance detection based on category-aware curriculum learning
    Zhaoze GAO, Xiaofei ZHU, Nengqiang XIANG
    2024, 44(10):  3281-3287.  DOI: 10.11772/j.issn.1001-9081.2023101558

    Pseudo-label generation is an effective strategy in semi-supervised stance detection. In practical applications, the quality of generated pseudo-labels varies; however, existing work treats the quality of these labels as equivalent. Furthermore, the influence of category imbalance on the quality of pseudo-label generation is not fully considered. To address these issues, a Semi-supervised stance Detection model based on Category-aware curriculum Learning (SDCL) was proposed. Firstly, a pre-trained classification model was employed to generate pseudo-labels for unlabeled tweets. Then, tweets were sorted by category according to the quality of their pseudo-labels, and the top k high-quality tweets of each category were selected. Finally, the selected tweets of all categories were merged, re-sorted, and fed with their pseudo-labels into the classification model to further optimize the model parameters. Experimental results indicate that, compared to the best-performing baseline model SANDS (Stance Analysis via Network Distant Supervision), the proposed model improves the Mac-F1 (Macro-averaged F1) score on the StanceUS dataset by 2, 1, and 3 percentage points respectively under three different splits (with 500, 1 000, and 1 500 labeled tweets). Similarly, on the StanceIN dataset, the proposed model improves the Mac-F1 score by 1 percentage point under each of the three splits, validating the effectiveness of the proposed model.
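The category-aware selection step (sort per category by pseudo-label quality, keep the top k, then merge and re-sort) can be sketched as follows. The triple fields and the use of classifier confidence as the quality proxy are illustrative assumptions, not the paper's exact quality measure.

```python
from collections import defaultdict

def select_topk_per_category(pseudo, k):
    """pseudo: list of (tweet_id, category, confidence) triples produced by
    the pre-trained classifier. Keep the k highest-confidence tweets per
    category, then merge all categories and re-sort by confidence."""
    by_cat = defaultdict(list)
    for tid, cat, conf in pseudo:
        by_cat[cat].append((tid, cat, conf))
    chosen = []
    for items in by_cat.values():
        items.sort(key=lambda t: t[2], reverse=True)  # per-category sort
        chosen.extend(items[:k])                      # top-k of this category
    chosen.sort(key=lambda t: t[2], reverse=True)     # merged re-sort
    return chosen

# Toy pseudo-labels: three 'favor' tweets and one 'against' tweet.
pseudo = [(1, "favor", 0.9), (2, "favor", 0.5),
          (3, "favor", 0.7), (4, "against", 0.8)]
selected = select_topk_per_category(pseudo, k=2)
```

Selecting per category before merging is what keeps a minority stance class from being crowded out of the curriculum by a majority class.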

    Machine reading comprehension event detection based on relation-enhanced graph convolutional network
    Wanting JI, Wenyi LU, Yuhang MA, Linlin DING, Baoyan SONG, Haolin ZHANG
    2024, 44(10):  3288-3293.  DOI: 10.11772/j.issn.1001-9081.2023101542

    Aiming at the problem that existing machine reading comprehension-based event detection models have difficulty mining long-distance dependencies between keywords in long texts with complex syntactic relations, a Machine Reading Comprehension event detection model based on a Relation-Enhanced Graph Convolutional Network (MRC-REGCN) was proposed. Firstly, a pre-trained language model was utilized to jointly encode the question and the text to obtain word vector representations incorporating prior information. Secondly, dynamic relations were introduced to enhance the label information, and an REGCN was utilized to deeply learn syntactic dependencies between words and enhance the model's ability to perceive the syntactic structure of long texts. Finally, the probability distributions of the textual words over all event types were obtained by a multi-classifier. Experimental results on the ACE2005 English corpus show that the F1 score of the proposed model on trigger classification is improved by 2.49% and 1.23% compared to the comparable machine reading comprehension-based model EEQA (Event Extraction by Answering (almost) natural Questions) and the best baseline model DEGREE (Data-Efficient GeneRation-based Event Extraction) respectively, verifying that MRC-REGCN achieves better event detection performance.
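For reference, a single plain graph-convolution step over a dependency adjacency matrix looks like the sketch below. The paper's relation-enhanced variant additionally injects dynamic relation and label information, which is not reproduced here; this shows only the generic propagation rule H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step over adjacency A (n, n) with node
    features H (n, d_in) and weights W (d_in, d_out):
    symmetric normalization with self-loops, then a ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^-1/2
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Toy dependency graph: two words connected by one edge.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
H = np.eye(2)
W = np.eye(2)
out = gcn_layer(A, H, W)
```

Stacking such layers is what lets information travel across multi-hop syntactic paths, which is the mechanism the abstract credits for capturing long-distance dependencies.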

    Symmetric positive definite autoencoder method for multivariate time series anomaly detection
    Hui JIANG, Qiuyan YAN, Zhujun JIANG
    2024, 44(10):  3294-3299.  DOI: 10.11772/j.issn.1001-9081.2023101521

    Detecting abnormal patterns in multivariate time series is of great importance for the normal operation of complex systems in industrial production, Internet services, and other scenarios. Multidimensional data over continuous time have both temporal and spatial relationships, but most existing methods are deficient in modeling the spatial relationships between dimensions. Due to the complexity of the spatial topology constructed from multidimensional data, traditional neural network models have difficulty preserving well-modeled spatial relationships. To address these problems, an SPDAE (Symmetric Positive Definite AutoEncoder) method for multivariate time series anomaly detection was proposed. A Gaussian kernel function was used to calculate the mutual relationship between every two dimensions of the original data, and multi-step, multi-window SPD (Symmetric Positive Definite) matrices were generated to capture the spatiotemporal features of the multivariate time series. At the same time, a convolution-like AutoEncoder (AE) network was designed: the SPD feature matrix was taken as input at the encoder stage, and an attention mechanism was introduced at the decoder stage to aggregate the multi-step data obtained by each layer of the encoder, achieving multi-scale spatiotemporal feature reconstruction. In particular, in order to preserve the spatial structure of the input data, a convolution-like operation conforming to the manifold topology was used by each layer of the encoder and the decoder to update the model weights, and a Log-Euclidean metric was used to calculate the reconstruction error. Experimental results on a private dataset show that the SPDAE method improves precision by 2.3 percentage points compared to the suboptimal baseline model MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder) and the F1 score by 3.0 percentage points compared to the suboptimal baseline model LSTM-ED (Long Short-Term Memory network based Encoder-Decoder). At the same time, because SPD matrices are used to represent the spatial relationships of multidimensional data, abnormal dimensions can be preliminarily located from the element-wise differences of the reconstructed matrices.
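The Gaussian-kernel construction of an SPD feature matrix and the Log-Euclidean distance used for the reconstruction error can be sketched as follows. This is a minimal illustration: `sigma`, the window layout, and all names are assumptions, and the multi-step/multi-window stacking is omitted.

```python
import numpy as np

def gaussian_kernel_spd(X, sigma=1.0):
    """X: (n_dims, T) window of a multivariate series. The pairwise
    Gaussian kernel between dimension series yields a symmetric
    positive definite feature matrix."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    return np.exp(-sq / (2.0 * sigma ** 2))

def spd_logm(S):
    """Matrix logarithm of an SPD matrix via symmetric eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def log_euclidean_dist(P, Q):
    """Log-Euclidean metric ||logm(P) - logm(Q)||_F between SPD matrices,
    as used for the reconstruction error."""
    return np.linalg.norm(spd_logm(P) - spd_logm(Q))

# Toy window: 3 dimensions observed over 3 time steps.
X = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 2.0, 2.0]])
K = gaussian_kernel_spd(X)
```

Comparing an input SPD matrix with its reconstruction entry by entry is what enables the preliminary localization of abnormal dimensions mentioned above.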

    Multiscale time series anomaly detection incorporating wavelet decomposition
    Lishuo YE, Zhixue HE
    2024, 44(10):  3300-3306.  DOI: 10.11772/j.issn.1001-9081.2023101480
    Asbtract ( )   HTML ( )   PDF (2412KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Time series anomaly detection is one of the important tasks in time series analysis, but real-world multi-dimensional tasks involve problems such as complex temporal patterns and difficult representation learning. A WMAD (Wavelet transform for Multiscale time series Anomaly Detection) method incorporating wavelet decomposition was proposed. Specifically, the ability to extract multiple temporal patterns was enhanced by uniformly fusing the temporal patterns of time series into stacked 2D time windows through a multi-temporal-window approach. At the same time, the wavelet transform was introduced from a frequency-domain perspective to decompose the original sequence into time-varying patterns with different frequency components, capturing complex temporal patterns from the viewpoints of both long-term trend changes and short-term transient changes. Based on the feature extraction capability of convolutional networks, a multiscale convolutional network was used to adaptively aggregate time series features at different scales. By adding an attention module containing both spatial and channel attention mechanisms, the extraction of key information was improved on top of the enhanced multiscale feature extraction capability, thereby improving accuracy. Anomaly detection results on five public datasets such as SWaT (Secure Water Treatment), SMD (Server Machine Dataset) and MSL (Mars Science Laboratory) show that the F1 value of the WMAD method is 3.62 to 9.44 percentage points higher than that of the MSCRED (MultiScale Convolutional Recurrent Encoder-Decoder) method, 3.86 to 11.00 percentage points higher than that of the TranAD (deep Transformer networks for Anomaly Detection) method, and higher than those of other representative methods. The experimental results show that the WMAD method can capture complex temporal patterns in time series and alleviate the difficulty of representation learning while delivering good anomaly detection performance.
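A one-level Haar discrete wavelet transform illustrates how wavelet decomposition splits a sequence into a long-term trend (approximation) component and a short-term transient (detail) component. WMAD's actual wavelet basis and decomposition depth are not restated here, and the even-length requirement below is a simplifying assumption.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform (x must have even
    length). Approximation coefficients track the long-term trend;
    detail coefficients capture short-term transients."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-frequency component
    detail = (even - odd) / np.sqrt(2.0)   # high-frequency component
    return approx, detail

# A piecewise-constant toy signal: no transients, so details vanish.
x = np.array([4.0, 4.0, 2.0, 2.0])
approx, detail = haar_dwt(x)
```

The transform is orthogonal, so the signal energy is preserved across the two coefficient bands, which is why anomalies concentrated in one band remain visible after decomposition.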

    Multi-organization collaborative data sharing scheme with dual authorization
    Huan ZHANG, Jingyu WANG, Lixin LIU, Xiaoyu JIANG
    2024, 44(10):  3307-3314.  DOI: 10.11772/j.issn.1001-9081.2023101494
    Asbtract ( )   HTML ( )   PDF (2796KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    To address the lack of a trust mechanism in existing multi-organization collaborative data sharing frameworks, as well as the risks to data privacy and security, data consistency, and supervision of shared-data usage, a multi-organization collaborative data sharing scheme with dual authorization was proposed with the help of the properties of blockchain; it solves the access problem of collaboratively managed shared data among organizational entities through dual authorization. Firstly, Attribute-Based Access Control (ABAC) technology was utilized to manage shared data using the attribute sets of different organizations, achieving the first layer of authorization and preventing access by unauthorized users. Secondly, on top of access control, a multi-signature protocol was introduced as the second layer of authorization to regulate access to the shared data of collaborative organizations, thereby enhancing access security. Experimental results show that when the number of collaborative organizations is 4, the overall time overhead of the system is 21 s; when the number of collaborative organizations increases to 10, the proposed scheme still maintains a low time overhead. Therefore, the proposed scheme can meet the needs of both security and practicality in actual production.
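The dual-authorization decision (ABAC as the first layer, a multi-signature check as the second) can be sketched as follows. The attribute-subset rule, the signature threshold, and all names are illustrative assumptions, not the scheme's exact on-chain logic or cryptographic verification.

```python
def dual_authorize(user_attrs, policy_attrs, signatures, threshold):
    """Two-layer access decision.

    Layer 1 (ABAC): the requester must hold every attribute the policy
    requires.
    Layer 2 (multi-signature): at least `threshold` collaborating
    organizations must have signed off on the request.
    Both layers must pass for access to be granted.
    """
    abac_ok = policy_attrs.issubset(user_attrs)
    multisig_ok = len(signatures) >= threshold
    return abac_ok and multisig_ok

# An analyst with the required attribute and two organization sign-offs.
granted = dual_authorize({"org:A", "role:analyst"}, {"role:analyst"},
                         {"sigA", "sigB"}, threshold=2)
```

In a deployment, the second layer would verify actual cryptographic signatures against the organizations' public keys rather than counting opaque tokens as done here.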

2024 Vol.44 No.10

Honorary Editor-in-Chief: ZHANG Jingzhong
Editor-in-Chief: XU Zongben
Associate Editor: SHEN Hengtao XIA Zhaohui
Domestic Post Distribution Code: 62-110
Foreign Distribution Code: M4616
Address:
No. 9, 4th Section of South Renmin Road, Chengdu 610041, China
Tel: 028-85224283-803
  028-85222239-803
Website: www.joca.cn
E-mail: bjb@joca.cn