Table of Contents

    10 December 2022, Volume 42 Issue 12
    Artificial intelligence
    Survey on interpretability research of deep learning
    Lingmin LI, Mengran HOU, Kun CHEN, Junmin LIU
    2022, 42(12):  3639-3650.  DOI: 10.11772/j.issn.1001-9081.2021091649

    In recent years, deep learning has been widely used in many fields. However, due to the highly nonlinear operations of deep neural network models, the interpretability of these models is poor: they are often referred to as “black box” models and cannot be applied to some key fields with high performance requirements. Therefore, it is very necessary to study the interpretability of deep learning. Firstly, deep learning was introduced briefly. Then, around the interpretability of deep learning, the existing research work was analyzed from eight aspects, including hidden layer visualization, Class Activation Mapping (CAM), sensitivity analysis, frequency principle, robust disturbance test, information theory, interpretable module and optimization method. At the same time, the applications of deep learning in the fields of network security, recommender systems, medicine and social networks were demonstrated. Finally, the existing problems and future development directions of deep learning interpretability research were discussed.

    Federated learning survey: concepts, technologies, applications and challenges
    Tiankai LIANG, Bi ZENG, Guang CHEN
    2022, 42(12):  3651-3662.  DOI: 10.11772/j.issn.1001-9081.2021101821

    Under the background of emphasizing data rights confirmation and privacy protection, federated learning, as a new machine learning paradigm, can solve the problem of data islands and privacy protection without exposing the data of all participants. Since modeling methods based on federated learning have become mainstream and achieved good results, it is of great significance to summarize and analyze the concepts, technologies, applications and challenges of federated learning. Firstly, the development process of machine learning and the inevitability of the appearance of federated learning were elaborated, and the definition and classification of federated learning were given. Secondly, the three types of federated learning currently recognized by the industry (horizontal federated learning, vertical federated learning and federated transfer learning) were introduced and analyzed. Thirdly, concerning the privacy protection issue of federated learning, the existing common privacy protection technologies were generalized and summarized. In addition, the recent mainstream open-source frameworks were introduced and compared, and the application scenarios of federated learning were given at the same time. Finally, the challenges and future research directions of federated learning were prospected.

    Adaptive hybrid attention hashing for deep cross-modal retrieval
    Xinghua LIU, Guitao CAO, Qiubin LIN, Wenming CAO
    2022, 42(12):  3663-3670.  DOI: 10.11772/j.issn.1001-9081.2021101806

    In the feature learning process, the existing hashing methods cannot distinguish the importance of the feature information of each region, and cannot utilize the label information to explore the correlation between modalities. Therefore, an Adaptive Hybrid Attention Hashing for deep cross-modal retrieval (AHAH) model was proposed. Firstly, channel attention and spatial attention were combined through autonomously learned weights to strengthen the attention to relevant target areas and weaken the attention to irrelevant target areas. Secondly, with the proposed similarity measurement method, modality labels were analyzed statistically and similarity degrees were quantified to numbers between 0 and 1, so that the similarity between modalities was expressed more finely. Compared with the most advanced method Multi-Label Semantics Preserving Hashing (MLSPH) on four commonly used datasets MIRFLICKR-25K, NUS-WIDE, MSCOCO, and IAPR TC-12, when the hash code length is 16 bits, the proposed method has the retrieval mean Average Precision (mAP) increased by 2.25%, 1.75%, 6.8%, and 2.15%, respectively. In addition, ablation experiments and efficiency analysis also prove the effectiveness of the proposed method.
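
    To make the “adaptive hybrid attention” idea concrete, here is a minimal PyTorch sketch (not the authors' code) of fusing a channel-attention branch and a spatial-attention branch with two autonomously learned weights; the module sizes and the 7×7 spatial kernel are assumptions.

        import torch
        import torch.nn as nn

        class HybridAttention(nn.Module):
            def __init__(self, channels, reduction=16):
                super().__init__()
                # channel attention: squeeze spatially, excite per channel
                self.channel_mlp = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
                    nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
                # spatial attention: squeeze channels, excite per location
                self.spatial_conv = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())
                # two scalar weights learned jointly with the network to balance the branches
                self.alpha = nn.Parameter(torch.tensor(0.5))
                self.beta = nn.Parameter(torch.tensor(0.5))

            def forward(self, x):
                ca = x * self.channel_mlp(x)      # channel-refined features
                sa = x * self.spatial_conv(x)     # spatially-refined features
                return self.alpha * ca + self.beta * sa

        feat = torch.randn(2, 64, 32, 32)
        print(HybridAttention(64)(feat).shape)    # torch.Size([2, 64, 32, 32])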

    Social recommendation combining trust implicit similarity and score similarity
    Yinying ZHOU, Mengyi ZHANG, Dunhui YU, Ming ZHU
    2022, 42(12):  3671-3678.  DOI: 10.11772/j.issn.1001-9081.2021101782

    Focused on the issue that most existing social recommendation algorithms ignore the influence of the association relationship between items on recommendation accuracy and fail to effectively combine user ratings with trust data, a Social recommendation algorithm combining Trust implicit similarity and Score similarity (SocialTS) was proposed. Firstly, the score similarity and trust implicit similarity between users were combined linearly to obtain reliable similar friends among users. Then, the trust relationship was integrated into the correlation analysis of items, and the modified similar items were obtained. Finally, similar users and items were added to the Matrix Factorization (MF) model as regularization terms, thereby obtaining more accurate feature representations of users and items. Experimental results show that on FilmTrust and CiaoDVD datasets, when the latent feature dimension is 10, compared with the mainstream social recommendation algorithm Trust-based Singular Value Decomposition (TrustSVD), SocialTS has the Root Mean Square Error (RMSE) reduced by 4.23% and 8.38% respectively, and the Mean Absolute Error (MAE) reduced by 4.66% and 6.88% respectively. SocialTS can not only effectively alleviate the user cold start problem, but also accurately predict users' actual ratings under different numbers of ratings, and has good robustness.
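
    As a rough illustration of the first step, the following Python sketch linearly combines a rating-based similarity with a trust-based implicit similarity to rank candidate friends; the cosine measure and the fusion weight lam are assumptions, not the paper's exact formulas.

        import numpy as np

        def cosine_sim(M):
            norm = np.linalg.norm(M, axis=1, keepdims=True) + 1e-12
            U = M / norm
            return U @ U.T

        ratings = np.random.rand(5, 8)                        # toy user-item rating matrix
        trust = (np.random.rand(5, 5) > 0.6).astype(float)    # toy directed trust links

        score_sim = cosine_sim(ratings)    # similarity computed from ratings
        trust_sim = cosine_sim(trust)      # implicit similarity from trust structure
        lam = 0.6                          # fusion weight (assumed)
        combined = lam * score_sim + (1 - lam) * trust_sim

        top_friends = np.argsort(-combined, axis=1)[:, 1:4]   # 3 most similar users per user
        print(top_friends)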

    Neural machine translation integrating bidirectional-dependency self-attention mechanism
    Zhijin LI, Hua LAI, Yonghua WEN, Shengxiang GAO
    2022, 42(12):  3679-3685.  DOI: 10.11772/j.issn.1001-9081.2021101805

    Aiming at the problem of resource scarcity in neural machine translation, a method for fusion of dependency syntactic knowledge based on a Bidirectional-Dependency self-attention mechanism (Bi-Dependency) was proposed. Firstly, an external parser was used to parse the source sentence to obtain dependency parsing data. Then, the dependency parsing data was transformed into the position vector of the parent word and the weight matrix of the child word. Finally, the dependency knowledge was integrated into the multi-head attention mechanism of the Transformer encoder. By using Bi-Dependency, the translation model was able to simultaneously pay attention to the dependency information in both directions: the parent word to the child word and the child word to the parent word. Experimental results of bi-directional translation show that compared with the Transformer model, in the case of rich resources, the proposed method has the BLEU (BiLingual Evaluation Understudy) value on Chinese-Thai translation improved by 1.07 and 0.86 respectively, and the BLEU value on Chinese-English translation improved by 0.79 and 0.68 respectively; in the case of low resources, the proposed model has the BLEU value increased by 0.51 and 1.06 respectively on Chinese-Thai translation, and the BLEU value increased by 1.04 and 0.40 respectively on Chinese-English translation. It can be seen that Bi-Dependency provides the model with richer dependence information, which can effectively improve the translation performance.

    Single direction projected Transformer method for aliasing text detection
    Zhida FENG, Li CHEN
    2022, 42(12):  3686-3691.  DOI: 10.11772/j.issn.1001-9081.2021101749

    To address the performance degradation of segmentation-based text detection methods in aliasing text scenes, a Single Direction Projected Transformer (SDPT) was proposed for aliasing text detection. Firstly, multi-scale features were extracted and fused by using deep Residual Network (ResNet) and Feature Pyramid Network (FPN). Then, the feature map was projected into a vector sequence by horizontal projection and fed into the Transformer module for modeling, thereby mining the relationships between text lines. Finally, joint optimization was performed with multiple objectives. Extensive experiments were conducted on the synthetic dataset BDD-SynText and the real dataset RealText. The results show that the proposed SDPT achieves the best results for text detection with high aliasing levels, and improves F1-Score (IoU75) by at least 21.36 percentage points on BDD-SynText and 18.11 percentage points on RealText compared with state-of-the-art text detection algorithms such as Progressive Scale Expansion Network (PSENet) under the same backbone network (ResNet50), verifying the important role of the proposed method in improving the performance of aliasing text detection.
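
    The horizontal-projection step can be pictured with the short PyTorch sketch below, in which each row of a fused feature map becomes one token of a sequence fed to a standard Transformer encoder; the tensor sizes and the mean-pooling operator are illustrative assumptions.

        import torch
        import torch.nn as nn

        feat = torch.randn(2, 256, 160, 160)     # fused FPN feature map (toy size)
        seq = feat.mean(dim=3)                   # horizontal projection: (B, C, H)
        seq = seq.permute(2, 0, 1)               # (H, B, C): one token per image row

        encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=8)
        encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        row_features = encoder(seq)              # models relations between rows / text lines
        print(row_features.shape)                # torch.Size([160, 2, 256])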

    Text segmentation model based on graph convolutional network
    Yuqi DU, Jin ZHENG, Yang WANG, Cheng HUANG, Ping LI
    2022, 42(12):  3692-3699.  DOI: 10.11772/j.issn.1001-9081.2021101768

    The main task of text segmentation is to divide the text into several relatively independent text blocks according to topic relevance. Aiming at the shortcomings of the existing text segmentation models in extracting fine-grained features such as text paragraph structural information, semantic correlation and context interaction, a text segmentation model TS-GCN (Text Segmentation-Graph Convolutional Network) based on Graph Convolutional Network (GCN) was proposed. Firstly, a text graph based on the structural information and semantic logic of text paragraphs was constructed. Then, semantic similarity attention was introduced to capture the fine-grained correlation between text paragraph nodes, and the information transmission between high-order neighborhoods of text paragraph nodes was realized with the help of GCN, so that the model's ability to extract multi-granularity topic feature representations of text paragraphs was enhanced. The proposed model was compared with the representative model CATS (Coherence-Aware Text Segmentation) and its basic model TLT-TS (Two-Level Transformer model for Text Segmentation), which are commonly used as benchmarks for the text segmentation task. Experimental results show that the evaluation index Pk of TS-GCN is 0.08 percentage points lower than that of TLT-TS without any auxiliary module on the Wikicities dataset, and the proposed model has the Pk value decreased by 0.38 percentage points and 2.30 percentage points respectively on the Wikielements dataset compared with CATS and TLT-TS. It can be seen that TS-GCN achieves good segmentation effect.

    Aspect-level cross-domain sentiment analysis based on capsule network
    Jiana MENG, Pin LYU, Yuhai YU, Shichang SUN, Hongfei LIN
    2022, 42(12):  3700-3707.  DOI: 10.11772/j.issn.1001-9081.2021101779

    In cross-domain sentiment analysis, the labeled samples in the target domain are seriously insufficient, the distributions of features in different domains differ greatly, and the emotional polarities expressed by features in one domain differ a lot from those in another domain; all of these problems lead to low classification accuracy. To deal with the above problems, an aspect-level cross-domain sentiment analysis method based on capsule network was proposed. Firstly, the feature representations of text were obtained by the BERT (Bidirectional Encoder Representation from Transformers) pre-training model. Secondly, for the fine-grained aspect-level sentiment features, Recurrent Neural Network (RNN) was used to fuse the context features and aspect features. Thirdly, capsule network and dynamic routing were used to distinguish overlapping features, and the sentiment classification model was constructed on the basis of the capsule network. Finally, a small amount of data in the target domain was used to fine-tune the model to realize cross-domain transfer learning. The optimal F1 score of the proposed method is 95.7% on the Chinese dataset and 91.8% on the English dataset, which effectively alleviates the low accuracy caused by insufficient training samples.

    Face anti-spoofing method based on regional blocking and lightweight network
    Dan HE, Xiping HE, Yue LI, Rui YUAN, Yuanyuan NIU
    2022, 42(12):  3708-3714.  DOI: 10.11772/j.issn.1001-9081.2021101723

    How to effectively identify all kinds of attacked faces is an urgent problem to be solved in the process of face recognition. The face anti-spoofing methods based on deep learning have high performance, but also bring a large number of parameters and calculation, so they cannot be deployed in mobile or embedded devices. To solve the above problems, a face anti-spoofing method based on regional blocking and lightweight network was proposed. Firstly, the training samples were randomly blocked. Then, a lightweight network based on attention mechanism was designed for feature extraction and image classification. Finally, in order to improve the detection accuracy, data augmentation was conducted on the test samples based on regional blocking. Experimental results show that the proposed model reaches 100% accuracy on REPLAY-ATTACK and CASIA-FASD datasets. At the same time, the proposed model obtains 99.49% accuracy and 0.4580% Average Classification Error Rate (ACER) on the Depth modal of CASIA-SURF dataset, which are much better than those obtained by convolutional neural networks such as ResNet and ShuffleNet. And the parameter amount of the model is only 0.2582 MB. In practical applications, the end-to-end lightweight network structure makes the proposed model easier to be deployed on mobile devices for real-time face anti-spoofing detection.
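
    A minimal sketch of the random regional blocking used for training (and for test-time augmentation) might look like the following; the block-size range and the zero filling are assumptions.

        import numpy as np

        def random_block(img, min_frac=0.1, max_frac=0.3, rng=np.random.default_rng()):
            h, w = img.shape[:2]
            bh = int(h * rng.uniform(min_frac, max_frac))
            bw = int(w * rng.uniform(min_frac, max_frac))
            top = rng.integers(0, h - bh + 1)
            left = rng.integers(0, w - bw + 1)
            out = img.copy()
            out[top:top + bh, left:left + bw] = 0    # zero out the chosen rectangular block
            return out

        face = np.random.randint(0, 256, (112, 112, 3), dtype=np.uint8)
        blocked = random_block(face)
        print(blocked.shape)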

    6D pose estimation incorporating attentional features for occluded objects
    Kangzhe MA, Jiatian PI, Zhoubing XIONG, Jia LYU
    2022, 42(12):  3715-3722.  DOI: 10.11772/j.issn.1001-9081.2021101840

    In the process of robotic vision grasping, it is difficult for existing algorithms to perform real-time, accurate and robust pose estimation of the target object under complex background, insufficient illumination, occlusion, etc. Aiming at the above problems, a 6D pose estimation network with fused attention features based on the key point method was proposed. Firstly, the Convolutional Block Attention Module (CBAM) was added in the skip connection stage to focus the spatial and channel information, so that the shallow features in the encoding stage were effectively fused with the deep features in the decoding stage, and the spatial domain information and accurate position channel information of the feature map were enhanced. Secondly, the attention map of every key point was regressed in a weakly supervised way using a normalized loss function, and the attention map was used as the weight of the key point offset at the corresponding pixel position. Finally, the coordinates of the key points were obtained by accumulation and summation. Experimental results demonstrate that the proposed network reaches 91.3% and 46.3% in the ADD(-S) metric on the LINEMOD and Occlusion LINEMOD datasets respectively, improvements of 5.0 percentage points and 5.5 percentage points over Pixel-wise Voting Network (PVNet), which verifies that the proposed network improves robustness for objects in occlusion scenes.

    Remote sensing image small target detection based on improved YOLOv3
    Hao FENG, Chaobing HUANG, Yuanqiao WEN
    2022, 42(12):  3723-3732.  DOI: 10.11772/j.issn.1001-9081.2021101802

    The YOLOv3 (You Only Look Once version 3) algorithm is widely used in target detection tasks. Although some improved algorithms based on YOLOv3 have achieved certain results, there are still problems of insufficient representation ability and low detection accuracy, especially for the detection of small targets. In order to solve the above problems, a small target detection algorithm for remote sensing images based on YOLOv3 was proposed. Firstly, the K-means Transformation (K-means-T) algorithm was used to optimize the sizes of anchor boxes, so that the matching degree between the prior boxes and ground truth boxes was improved. Secondly, the confidence loss function was optimized to solve the problem of uneven distribution of hard and easy samples. Finally, attention mechanism was introduced to improve the algorithm's ability to perceive detailed information. Results of the experiments carried out on the RSOD dataset show that compared with the original YOLOv3 algorithm and the YOLOv4 algorithm, the proposed algorithm has the detection Average Precision (AP) on the small target class "aircraft" increased by 7.3 percentage points and 5.9 percentage points respectively, illustrating that the proposed improved algorithm can detect small targets in remote sensing images effectively and with higher accuracy.
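
    For the anchor optimization step, the usual YOLO-style recipe is k-means over ground-truth box sizes with 1 - IoU as the distance; the sketch below illustrates that general recipe under assumed toy data and is not the K-means-T variant itself.

        import numpy as np

        def iou_wh(boxes, anchors):
            # boxes: (N, 2), anchors: (K, 2); IoU of boxes and anchors aligned at the origin
            inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
                    np.minimum(boxes[:, None, 1], anchors[None, :, 1])
            union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[None, :, 0] * anchors[None, :, 1] - inter
            return inter / union

        def kmeans_anchors(boxes, k=9, iters=100, seed=0):
            rng = np.random.default_rng(seed)
            anchors = boxes[rng.choice(len(boxes), k, replace=False)]
            for _ in range(iters):
                assign = np.argmax(iou_wh(boxes, anchors), axis=1)   # nearest anchor by IoU
                anchors = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                                    else anchors[i] for i in range(k)])
            return anchors

        wh = np.abs(np.random.randn(500, 2)) * 50 + 10   # toy ground-truth box widths/heights
        print(kmeans_anchors(wh, k=9))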

    Dust accumulation degree recognition of photovoltaic panel based on improved deep residual network
    Pengxiang SUN, Li BI, Junjie WANG
    2022, 42(12):  3733-3739.  DOI: 10.11772/j.issn.1001-9081.2021101715

    The dust accumulated on photovoltaic panels reduces the conversion efficiency of photovoltaic power generation and easily causes damage to the panels at the same time. Therefore, it is of great significance to recognize the dust accumulation of photovoltaic panels intelligently. Aiming at the above problems, a dust accumulation degree recognition model of photovoltaic panel based on improved deep residual network was proposed. Firstly, the NeXt Residual Network 50 (ResNeXt50) was improved by decomposing convolution and fine-tuning down-sampling. Then, the Coordinate Attention (CA) mechanism was fused to embed the location information into channel attention, the channel relationship and long-term dependence were encoded by using the accurate location information, and the feature map was decomposed into two one-dimensional codes by using the two-dimensional global pooling operation, thereby enhancing the representation of the objects of attention. Finally, the cross-entropy loss function was replaced by the Supervised Contrastive (SupCon) learning loss function to effectively improve the recognition accuracy. Experimental results show that in the recognition of four levels of dust accumulation on photovoltaic panels of real photovoltaic power stations, the improved ResNeXt50 model has a recognition accuracy of 90.7%, which is 7.2 percentage points higher than that of the original ResNeXt50. The proposed model can meet the basic requirements of intelligent operation and maintenance of photovoltaic power stations.

    Data science and technology
    Multi-view clustering via subspace merging on Grassmann manifold
    Jiaojiao GUAN, Xuezhong QIAN, Shibing ZHOU, Kaibin JIANG, Wei SONG
    2022, 42(12):  3740-3749.  DOI: 10.11772/j.issn.1001-9081.2021101756

    Most of the existing multi-view clustering algorithms assume that there is a linear relationship between multi-view data points, and fail to maintain the locality of the original feature space during the learning process. At the same time, merging subspaces in Euclidean space is too rigid to align the learned subspace representations. To solve the above problems, a multi-view clustering algorithm via subspace merging on Grassmann manifold was proposed. Firstly, the kernel trick and the learning of local manifold structure were combined to obtain the subspace representations of different views. Then, the subspace representations were merged on the Grassmann manifold to obtain the consensus affinity matrix. Finally, spectral clustering was performed on the consensus affinity matrix to obtain the final clustering result, and the Alternating Direction Method of Multipliers (ADMM) was used to optimize the proposed model. Compared with the Kernel Multi-view Low-Rank Sparse Subspace Clustering (KMLRSSC) algorithm, the proposed algorithm has the clustering accuracy improved by 20.83 percentage points, 9.47 percentage points and 7.33 percentage points on the MSRCV1, Prokaryotic and Not-Hill datasets respectively. Experimental results verify the effectiveness and good performance of the proposed algorithm.

    Imbalanced classification algorithm based on improved semi-supervised clustering
    Yu LU, Lingyun ZHAO, Binwen BAI, Zhen JIANG
    2022, 42(12):  3750-3755.  DOI: 10.11772/j.issn.1001-9081.2021101837

    Imbalanced classification is one of the research hotspots in the field of machine learning, where oversampling increases minority samples through repeated extraction or artificial synthesis to rebalance the dataset. However, most of the existing oversampling methods are based on the original data distribution and can hardly reveal more characteristics of the dataset distribution. To address the above problem, firstly, an improved semi-supervised clustering algorithm was proposed to mine the data distribution characteristics. Secondly, based on the results of semi-supervised clustering, highly-confident unlabeled data (pseudo-labeled samples) were selected from minority-class clusters to join the original training set. In this way, in addition to rebalancing the dataset, the distribution characteristics obtained by semi-supervised clustering were able to be used to assist the imbalanced classification. Finally, the results of semi-supervised clustering and classification were fused to predict the final labels, which further improved the model performance of imbalanced classification. With G-mean and Area Under Curve (AUC) selected as evaluation indicators, the proposed algorithm was compared with seven oversampling-/undersampling-based imbalanced classification algorithms, such as TU (Trainable Undersampling) and CDSMOTE (Class Decomposition Synthetic Minority Oversampling TEchnique), on 10 public datasets. Experimental results show that compared with TU and CDSMOTE, the proposed algorithm has the average AUC increased by 6.7% and 3.9% respectively, and the average G-mean improved by 7.6% and 2.1% respectively. At the same time, the proposed algorithm achieves the highest average results on both evaluation indicators among all the compared algorithms. It can be seen that the proposed algorithm can effectively improve the imbalanced classification performance.
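
    The pseudo-labelling step can be illustrated with the hedged sketch below: unlabeled points falling into clusters dominated by the labeled minority class are added to the training set with the minority label. The clustering algorithm (plain k-means here), the purity threshold and the toy data are assumptions, not the paper's improved semi-supervised algorithm.

        import numpy as np
        from sklearn.cluster import KMeans

        X_lab = np.random.randn(100, 4); y_lab = np.r_[np.zeros(90), np.ones(10)]   # imbalanced labels
        X_unl = np.random.randn(200, 4)                                             # unlabeled pool

        km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(np.vstack([X_lab, X_unl]))
        lab_clusters, unl_clusters = km.labels_[:100], km.labels_[100:]

        pseudo_idx = []
        for c in range(8):
            members = y_lab[lab_clusters == c]
            if len(members) and members.mean() > 0.8:       # cluster dominated by the minority class
                pseudo_idx.extend(np.where(unl_clusters == c)[0])

        X_aug = np.vstack([X_lab, X_unl[pseudo_idx]])        # rebalanced training set
        y_aug = np.r_[y_lab, np.ones(len(pseudo_idx))]
        print(len(pseudo_idx), X_aug.shape)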

    Sequential behavior recommendation based on user’s latent state and dependency learning
    Wen WEN, Fangyu LIANG
    2022, 42(12):  3756-3762.  DOI: 10.11772/j.issn.1001-9081.2021101765

    At present, how to capture the dynamic changes and dependencies of user behaviors is an important problem in the field of sequential recommendation, which mainly faces challenges such as the large behavior event space and the complex sequential dependencies of behaviors. To address the above challenges, a sequential recommendation algorithm based on the learning of latent states of behavioral sequences and their dependency relationships was proposed. Firstly, the low-dimensional representation of the latent states of behavioral sequences was obtained by using a max-pooling hierarchical structure. Then, the dependencies between the latent states were captured and described by a graph neural network in order to learn user behavior change patterns, leading to a more accurate sequential recommendation effect. Experimental results show that compared with the recent Hierarchical Gating Network (HGN) baseline algorithm on the IPTV, New York City (NYC) and Tokyo (TKY) datasets, the proposed algorithm improves the performance evaluation metric recall by 30.03%, 29.48% and 33.75% respectively, and obtains 37.20%, 43.47% and 40.34% relative improvements on the Normalized Discounted Cumulative Gain (NDCG) metric, respectively. The ablation experimental results demonstrate the effectiveness of dependency learning of sequential states. Therefore, the proposed algorithm is especially suitable for sequential recommendation problems with sparse behaviors in a single time slice and complex behavioral dependencies.

    Materialized view asynchronous incremental maintenance task generation under hybrid transaction/analytical processing for single record
    Yangyang SUN, Junping YAO, Xiaojun LI, Shouxiang FAN, Ziwei WANG
    2022, 42(12):  3763-3768.  DOI: 10.11772/j.issn.1001-9081.2021101725

    Existing materialized view asynchronous incremental maintenance task generation algorithms under Hybrid Transaction/Analytical Processing (HTAP) are mainly designed for multiple records and unable to generate materialized view asynchronous incremental maintenance tasks under HTAP for a single record, which increases the disk IO overhead and degrades the performance of materialized view asynchronous incremental maintenance under HTAP. Therefore, a materialized view asynchronous incremental maintenance task generation method under HTAP for single record was proposed. Firstly, the benefit model of materialized view asynchronous incremental maintenance task generation under HTAP for single record was established. Then, an algorithm for materialized view asynchronous incremental maintenance task generation under HTAP for single record was designed on the basis of Q-learning. Experimental results show that the proposed algorithm realizes materialized view asynchronous incremental maintenance task generation under HTAP for single record, and decreases the average IOPS (Input/output Operations Per Second), average CPU utilization (2-core) and average CPU utilization (4-core) by at least 8.49 times, 1.85 percentage points and 0.97 percentage points respectively.
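
    Since the task generation algorithm is built on Q-learning, a generic tabular Q-learning skeleton is sketched below for reference; the states, actions and reward are placeholders standing in for the benefit model, not the paper's actual formulation.

        import random

        ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2          # learning rate, discount, exploration rate (assumed)
        states, actions = range(5), range(3)
        Q = {(s, a): 0.0 for s in states for a in actions}

        def step(state, action):
            # placeholder environment: in practice the reward would come from the benefit model
            return random.choice(list(states)), random.random()

        state = 0
        for _ in range(1000):
            action = random.choice(list(actions)) if random.random() < EPS \
                else max(actions, key=lambda a: Q[(state, a)])
            nxt, reward = step(state, action)
            best_next = max(Q[(nxt, a)] for a in actions)
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
            state = nxt

        print(max(Q, key=Q.get))                   # best (state, action) pair found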

    Cyber security
    Trusted integrity verification scheme of cloud data without bilinear pairings
    Wenyong YUAN, Xiuguang LI, Ruifeng LI, Zhengge YI, Xiaoyuan YANG
    2022, 42(12):  3769-3774.  DOI: 10.11772/j.issn.1001-9081.2021101780

    Focusing on the malicious cheating behaviors of the Third Party Auditor (TPA) in cloud auditing, a trusted cloud auditing scheme without bilinear pairings was proposed to support the correct judgment of the behaviors of TPA. Firstly, a pseudo-random bit generator was used to generate the random challenge information, which ensured the reliability of the challenge information generated by TPA. Secondly, a hash value was added in the process of evidence generation to protect the privacy of user data effectively. Thirdly, in the process of evidence verification, an interaction between the users and the TPA results was added: the data integrity was checked, and whether TPA had truthfully completed the audit request was judged according to the results. Finally, the scheme was extended to realize batch auditing of multiple pieces of data. Security analysis shows that the proposed scheme can resist substitution attack and forgery attack, and can protect data privacy. Compared with the Merkle-Hash-Tree based Without Bilinear PAiring (MHT-WiBPA) audit scheme, the proposed scheme has a similar time for verifying evidence, and reduces the time for generating labels by about 49.96%. Efficiency analysis shows that the proposed scheme can achieve lower computational cost and communication cost on the premise of ensuring the credibility of audit results.

    Multi-type application-layer DDoS attack detection method based on ensemble learning
    Yingzhi LI, Man LI, Ping DONG, Huachun ZHOU
    2022, 42(12):  3775-3784.  DOI: 10.11772/j.issn.1001-9081.2021091653

    Aiming at the problem that multiple types of application-layer Distributed Denial of Service (DDoS) attacks are difficult to detect simultaneously, an application-layer DDoS attack detection method based on ensemble learning was proposed to detect multiple types of application-layer DDoS attacks. Firstly, by using the dataset generation module, the normal and attack traffic was simulated, the corresponding feature information was filtered and extracted, and 47-dimensional feature information characterizing Challenge Collapsar (CC), HTTP Flood, HTTP Post and HTTP Get attacks was generated. Secondly, by using the offline training module, the effective features were processed and input into the integrated Stacking detection model for training, thereby obtaining a detection model that can detect multiple types of application-layer DDoS attacks. Finally, by using the online detection module, the specific traffic type of the traffic to be detected was judged by deploying the detection model online. Experimental results show that compared with the classification models constructed by Bagging, Adaboost and XGBoost, the Stacking ensemble model improves the accuracy by 0.18 percentage points, 0.21 percentage points and 0.19 percentage points respectively, and has the malicious traffic detection rate reached 98% under the optimal time window. It can be seen that the proposed method has good performance in detecting multi-type application-layer DDoS attacks.
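
    A minimal sketch of a Stacking ensemble for multi-class traffic classification with scikit-learn is given below; the base learners, the meta-learner and the synthetic 47-dimensional features are illustrative assumptions rather than the configuration used in the paper.

        import numpy as np
        from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        X = np.random.rand(1000, 47)               # toy 47-dimensional flow features
        y = np.random.randint(0, 5, 1000)          # toy labels: normal + 4 attack types
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        stack = StackingClassifier(
            estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                        ("gb", GradientBoostingClassifier(random_state=0))],
            final_estimator=LogisticRegression(max_iter=1000))
        stack.fit(X_tr, y_tr)
        print("accuracy:", stack.score(X_te, y_te))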

    Parallel chain consensus algorithm optimization scheme based on Boneh-Lynn-Shacham aggregate signature technology
    Qi LIU, Rongxin GUO, Wenxian JIANG, Dengji MA
    2022, 42(12):  3785-3791.  DOI: 10.11772/j.issn.1001-9081.2021101711

    At present, each consensus node of a parallel chain needs to send its own consensus transaction to the main chain to participate in the consensus. As a result, a large number of consensus transactions seriously occupy the block capacity of the main chain and waste transaction fees. In order to solve the above problems, an optimization scheme of the parallel chain consensus algorithm based on BLS (Boneh-Lynn-Shacham) aggregate signature technology was proposed by combining bilinear map technology with the fact that consensus transactions on a parallel chain share the same consensus data but carry different signatures. Firstly, the transaction data was signed by the consensus nodes. Then, the consensus transaction was broadcasted by each node of the parallel chain and the message was synchronized internally through the P2P (Peer-to-Peer) network. Finally, the consensus transactions were counted by the Leader node; when the number of consensus transactions was greater than 2/3, the corresponding BLS signature data was aggregated and the aggregate signature of the transactions was sent to the main chain for verification. Experimental results show that compared with the original parallel chain consensus algorithm, the proposed scheme can effectively solve the problem of consensus nodes on the parallel chain repeatedly sending consensus transactions to the main chain, and save transaction fees while reducing the occupancy of the main chain storage space: it only occupies 4 KB of the main chain storage space and only generates a transaction fee of 0.01 BiT Yuan (BTY).

    Efficient homomorphic neural network supporting privacy-preserving training
    Yang ZHONG, Renwan BI, Xishan YAN, Zuobin YING, Jinbo XIONG
    2022, 42(12):  3792-3800.  DOI: 10.11772/j.issn.1001-9081.2021101775

    Aiming at the problems of low computational efficiency and insufficient accuracy in privacy-preserving neural networks based on homomorphic encryption, an efficient Homomorphic Neural Network (HNN) under three-party collaboration supporting privacy-preserving training was proposed. Firstly, in order to reduce the computational cost of ciphertext-ciphertext multiplication in homomorphic encryption, the idea of secret sharing was combined to design a secure fast multiplication protocol that converts ciphertext-ciphertext multiplication into plaintext-ciphertext multiplication with low complexity. Then, in order to avoid the multiple iterations of ciphertext polynomials generated during the construction of HNN and improve the nonlinear calculation accuracy, a secure nonlinear calculation method was studied, which executed the corresponding nonlinear operator on the plaintext message confused with a random mask. Finally, the security, correctness and efficiency of the proposed protocols were analyzed theoretically, and the effectiveness and superiority of HNN were verified by experiments. Experimental results show that compared with the dual-server scheme PPML (Privacy Protection Machine Learning), HNN has the training efficiency improved by 18.9 times and the model accuracy improved by 1.4 percentage points.
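
    The general secret-sharing trick behind such secure fast multiplication protocols can be illustrated with a Beaver-triple toy example: two additively shared values are multiplied using only local operations plus the opening of masked differences. The modulus and the two-party setting below are assumptions, not the paper's protocol.

        import random

        P = 2**61 - 1                                # working modulus (assumed)

        def share(v):                                # additive 2-out-of-2 secret sharing
            r = random.randrange(P)
            return r, (v - r) % P

        x, y = 123456, 654321
        a, b = random.randrange(P), random.randrange(P)
        c = a * b % P                                # pre-distributed multiplication triple (a, b, c = ab)

        x0, x1 = share(x); y0, y1 = share(y)
        a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

        # the parties open only the masked differences e = x - a and f = y - b
        e = (x0 - a0 + x1 - a1) % P
        f = (y0 - b0 + y1 - b1) % P

        # each party computes its share of x*y locally; one party adds the e*f term
        z0 = (c0 + e * b0 + f * a0 + e * f) % P
        z1 = (c1 + e * b1 + f * a1) % P
        assert (z0 + z1) % P == x * y % P
        print("shared product reconstructed correctly")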

    Multi-party privacy preserving k-means clustering scheme based on blockchain
    Le ZHAO, En ZHANG, Leiyong QIN, Gongli LI
    2022, 42(12):  3801-3812.  DOI: 10.11772/j.issn.1001-9081.2021091640

    In order to solve the problems that the iterative efficiency of the existing privacy-preserving k-means clustering schemes is low, the server in the centralized differential privacy preserving k-means clustering scheme may be attacked, and the server in the localized differential privacy preserving k-means clustering scheme may return wrong clustering results, a Multi-party Privacy Protection k-means Clustering Scheme based on Blockchain (M-PPkCS/B) was proposed. Taking advantage of localized differential privacy technology and the characteristics of the blockchain such as being open, transparent, and non-tamperable, firstly, a Multi-party k-means Clustering Center Initialization Algorithm (M-kCCIA) was designed to improve the iterative efficiency of clustering while protecting user privacy, and to ensure the correctness of the initial clustering centers jointly generated by the users. Then, a Blockchain-based Privacy Protection k-means Clustering Algorithm (Bc-PPkCA) was designed, and a smart contract of the clustering center updating algorithm was constructed. The clustering centers were updated iteratively by this smart contract on the blockchain to ensure that each user was able to obtain the correct clustering results. Experimental results on the datasets HTRU2 and Abalone show that, while ensuring that each user obtains the correct clustering results, the accuracy can reach 97.53% and 96.19% respectively, and the average number of iterations of M-kCCIA is 5.68 and 2.75 fewer respectively than that of the Random Selection (RS) algorithm that generates initial clustering centers randomly.

    K-Prototypes clustering method for local differential privacy
    Guopeng ZHANG, Xuebin CHEN, Haoshi WANG, Ran ZHAI, Zheng MA
    2022, 42(12):  3813-3821.  DOI: 10.11772/j.issn.1001-9081.2021101724

    In order to protect data privacy while ensuring data availability in clustering analysis, a privacy protection clustering scheme based on the Local Differential Privacy (LDP) technique, called LDPK-Prototypes (LDP K-Prototypes), was proposed. Firstly, the hybrid dataset was encoded by the users. Then, a randomized response mechanism was used to disturb the sensitive data, and after collecting the users' disturbed data, the original dataset was recovered by the third party to the maximum extent. After that, the K-Prototypes clustering algorithm was performed. In the clustering process, the initial clustering centers were determined by the dissimilarity measure method, and a new distance calculation formula was redefined by the entropy weight method. Theoretical analysis and experimental results show that compared with the ODPC (Optimizing and Differentially Private Clustering) algorithm based on the Centralized Differential Privacy (CDP) technique, the proposed scheme has the average accuracy on the Adult and Heart datasets improved by 2.95% and 12.41% respectively, effectively improving the clustering usability. Meanwhile, LDPK-Prototypes expands the differences between data, effectively avoids local optima, and improves the stability of the clustering algorithm.
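
    The randomized response mechanism mentioned above can be sketched for one categorical attribute as follows; the privacy budget, the attribute domain and the frequency-correction step are illustrative assumptions.

        import math, random
        from collections import Counter

        def randomized_response(value, domain, epsilon):
            # k-ary randomized response: tell the truth with probability p,
            # otherwise report one of the other values uniformly at random
            k = len(domain)
            p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
            if random.random() < p:
                return value
            return random.choice([v for v in domain if v != value])

        domain = ["A", "B", "C", "D"]
        epsilon = 1.0
        true_data = [random.choice(domain) for _ in range(20000)]
        reported = [randomized_response(v, domain, epsilon) for v in true_data]

        # the collector corrects the perturbed counts to estimate the true frequencies
        k, n = len(domain), len(reported)
        p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
        q = (1 - p) / (k - 1)
        counts = Counter(reported)
        estimates = {v: (counts[v] - n * q) / (p - q) for v in domain}
        print({v: round(estimates[v]) for v in domain})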

    New permissioned public blockchain based on main-sub chain architecture
    Jiagui XIE, Zhiping LI, Jian JIN, Bo ZHANG, Jian GUO, Fanjie NIE
    2022, 42(12):  3822-3830.  DOI: 10.11772/j.issn.1001-9081.2021101790

    Focused on the issue that different blockchains are independent of each other and difficult to interconnect, a new type of permissioned public blockchain architecture of "main chain + sub chain" was proposed. Firstly, based on existing algorithms such as Delegated Proof Of Stake (DPOS), Verifiable Random Function (VRF) and Practical Byzantine Fault Tolerance (PBFT), an innovative two-layer consensus algorithm was designed, and a trusted permission mechanism was added to give the blockchain both permissioned and public characteristics. Secondly, the design process of the main and sub chains was described in detail: the management of the chain group and public services was provided by the main chain, while the sub chains were designed independently for different business scenarios, and cross-chain data communication was realized by connecting to the main chain relay, thereby realizing secure data isolation. Finally, an experimental environment was built for testing to verify the feasibility of the permissioned public blockchain design. Experimental results show that compared with some existing blockchains such as Hyperledger Fabric, the proposed permissioned public blockchain has significant advantages, including a throughput of up to 25 000 transactions per second and an average delay time of about 8 s. It can be seen that this permissioned public blockchain provides technical support for further research on cross-chain data interconnection of different types of blockchains.

    PDF document detection model based on system calls and data provenance
    Jingwei LEI, Peng YI, Xiang CHEN, Liang WANG, Ming MAO
    2022, 42(12):  3831-3840.  DOI: 10.11772/j.issn.1001-9081.2021101730

    Focused on the issue that the traditional static detection and dynamic detection methods cannot cope with malicious PDF document attacks using a lot of obfuscation and unknown technologies, a new detection model based on system calls and data provenance, called NtProvenancer, was proposed. Firstly, the system call records during execution of the document were collected by the system call tracing tool. Then, the data provenance technology was used to establish a data provenance graph based on the system calls. After that, the feature segments of system calls were extracted for detection by using the key point algorithm of the graph. The experimental dataset consists of 528 benign PDF documents and 320 malicious ones. The test was carried out on Adobe Reader, and the Term Frequency-Inverse Document Frequency (TF-IDF) and the rarity algorithm in PROVDETECTOR were used to replace the key point algorithm of the graph to conduct the comparative study. The results show that NtProvenancer has better performance on precision and F1 score. Under the optimal parameter setting, the proposed model has the average time of document training and detection stages of 251.51 ms and 60.55 ms respectively, the false alarm rate lower than 5.22%, and the F1 score reached 0.989, showing that NtProvenancer is an efficient and practical model for PDF document detection.

    Advanced computing
    Low density parity check code decoding acceleration technology based on GPU
    Qidi XU, Zhenghong LIU, Lin ZHENG
    2022, 42(12):  3841-3846.  DOI: 10.11772/j.issn.1001-9081.2021101726

    With the development of communication technology, communication terminals gradually adopt software to be compatible with multiple communication modes and protocols. Since the traditional software radio architecture, which uses the Central Processing Unit (CPU) of a computer as the arithmetic unit, cannot satisfy the wideband data throughput requirements of high-speed wireless communication systems such as Multiple-Input Multiple-Output (MIMO) systems, an acceleration method of Low Density Parity Check (LDPC) code decoder based on Graphics Processing Unit (GPU) was proposed. Firstly, according to the theoretical analysis of the acceleration performance of GPU-based parallel heterogeneous computing in the GNU Radio 4G/5G physical layer signal processing module, the more parallel-friendly Layered Normalized Min-Sum (LNMS) algorithm was adopted. Then, the decoding delay of the decoder was reduced by methods such as a global synchronization strategy, reasonable allocation of GPU memory space and a stream parallelism mechanism; at the same time, the LDPC decoding process was parallelized with the multi-threaded parallel technology of GPU. Finally, the GPU-accelerated decoder was implemented and verified on the software radio platform, and the bit error rate performance and the acceleration performance bottlenecks of the parallel decoder were analyzed. Experimental results show that compared with the traditional CPU serial processing method, the CPU+GPU heterogeneous platform has the decoding rate for LDPC codes increased to about 200 times, and the throughput of the decoder can reach more than 1 Gb/s; especially in the case of large-scale data, the decoding performance is greatly improved compared with the traditional decoder.

    Improved firefly algorithm based on multi-strategy fusion
    Xin YONG, Yuelin GAO, Yahua HE, Huimin WANG
    2022, 42(12):  3847-3855.  DOI: 10.11772/j.issn.1001-9081.2021101830

    In order to solve the problems that the traditional Firefly Algorithm (FA) easily falls into local optima and has a low convergence speed, an improved FA based on multi-strategy fusion, named LEEFA (Levy flight-Elite participated crossover-Elite opposition-based learning Firefly Algorithm), was proposed by integrating Levy flight, an elite-participated crossover operator and an elite opposition-based learning mechanism into the firefly optimization algorithm. Firstly, Levy flight was introduced on the basis of the traditional FA to improve the global search ability of the algorithm. Secondly, an elite-participated crossover operator was proposed to improve the convergence speed and accuracy of the algorithm, as well as to enhance the diversity and quality of solutions in the iterative process. Finally, the elite opposition-based learning mechanism was combined to search for the optimal solution, which improved the ability of FA to jump out of local optima and its convergence performance, and realized rapid exploration of the solution search space. In order to verify the effectiveness of the proposed algorithm, simulation experiments were carried out on benchmark functions. The results show that compared with algorithms such as the Particle Swarm Optimization (PSO) algorithm, the traditional FA, the Levy Flight Firefly Algorithm (LFFA), the Levy flight and Mutation operator based Firefly Algorithm (LMFA) and the ADaptive logarithmic spiral-Levy Improved Firefly Algorithm (ADIFA), the proposed algorithm performs better in both convergence speed and accuracy.
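
    The Levy flight component is commonly implemented with Mantegna's algorithm; the sketch below shows such a heavy-tailed step generator under assumed parameter choices (beta = 1.5, step scale 0.01), without claiming it matches LEEFA's exact update rule.

        import numpy as np
        from math import gamma, sin, pi

        def levy_step(dim, beta=1.5, rng=np.random.default_rng()):
            # Mantegna's algorithm: ratio of Gaussians yields a heavy-tailed step
            sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
                       (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
            u = rng.normal(0, sigma_u, dim)
            v = rng.normal(0, 1, dim)
            return u / np.abs(v) ** (1 / beta)

        position = np.zeros(5)
        position = position + 0.01 * levy_step(5)   # small scale keeps most moves bounded
        print(position)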

    Network and communications
    Joint optimization of user association and resource allocation in cognitive radio ultra-dense networks based on improved genetic algorithm
    Junjie ZHANG, Runhe QIU
    2022, 42(12):  3856-3862.  DOI: 10.11772/j.issn.1001-9081.2021101777

    Aiming at the multi-dimensional resource allocation problem in downlink heterogeneous cognitive radio Ultra-Dense Networks (UDNs), an improved genetic algorithm was proposed to jointly optimize user association and resource allocation with the objective of maximizing the throughput of femtocell users. Firstly, preprocessing was performed before running the algorithm to initialize the matrices of each user's reachable base stations and available channels. Secondly, symbol coding was used to encode the matching relationships between users and base stations as well as between users and channels into a two-dimensional chromosome. Thirdly, a "dynamic choosing best for replication + roulette" strategy was used as the selection algorithm to speed up the convergence of the population. Finally, in order to avoid the algorithm falling into local optima, a mutation operator with premature convergence judgment was added in the mutation stage, so that the connection strategy of base stations, users and channels was obtained within a limited number of iterations. Experimental results show that when the numbers of base stations and channels are fixed, the proposed algorithm improves the total user throughput by 7.2% and improves the cognitive user throughput by 1.2% compared with the genetic algorithm of three-dimensional matching, with lower computational complexity. The proposed algorithm reduces the search space of feasible solutions, and can effectively improve the total throughput of cognitive radio UDNs with lower complexity.

    Data center flow scheduling mechanism based on differential evolution and ant colony optimization algorithm
    Rongrong DAI, Honghui LI, Xueliang FU
    2022, 42(12):  3863-3869.  DOI: 10.11772/j.issn.1001-9081.2021101766

    Since the traditional flow scheduling method for data center networks easily causes network congestion and link load imbalance, a dynamic flow scheduling mechanism based on Differential Evolution (DE) and Ant Colony Optimization (ACO) algorithm (DE-ACO) was proposed to optimize elephant flow scheduling in data center networks. Firstly, Software Defined Network (SDN) technology was used to capture real-time network status information and to set the optimization objectives of flow scheduling. Then, the DE algorithm was redefined by the optimization objectives, and several available candidate paths were calculated and used as the initial global pheromone of the ACO algorithm. Finally, the global optimal path was obtained by combining the global network status, and the elephant flows on congested links were rerouted. Experimental results show that compared with the Equal-Cost Multi-Path routing (ECMP) algorithm and the network flow scheduling algorithm of SDN data center based on ACO algorithm (ACO-SDN), the proposed algorithm increases the average bisection bandwidth by 29.42% to 36.26% and 5% to 11.51% respectively in random communication mode, reducing the Maximum Link Utilization (MLU) of the network and achieving better network load balancing.

    Channel estimation based on compressive sensing in RIS-assisted millimeter wave system
    Yi WANG, Liu YANG, Tongkuai ZHANG
    2022, 42(12):  3870-3875.  DOI: 10.11772/j.issn.1001-9081.2021101808

    Since the pilot overhead of traditional channel estimation methods in Reconfigurable Intelligent Surface (RIS)-assisted wireless communication systems is excessively high, a block-sparsity based Orthogonal Matching Pursuit (OMP) channel estimation scheme was proposed. Firstly, according to the millimeter Wave (mmWave) channel model, the cascaded channel matrix was derived and transformed into the Virtual Angular Domain (VAD) to obtain the sparse representation of the cascaded channel. Secondly, by utilizing the sparse characteristics of the cascaded channel, the channel estimation problem was transformed into a sparse matrix recovery problem, and a reconstruction algorithm based on compressive sensing was adopted to recover the sparse matrix. Finally, the special row-block sparse structure was analyzed, and the traditional OMP scheme was optimized to further reduce the pilot overhead and improve the estimation performance. Simulation results show that the Normalized Mean Squared Error (NMSE) of the proposed optimized OMP scheme based on the row-block sparse structure decreases by about 1 dB compared with that of the conventional OMP scheme. Therefore, the proposed channel estimation scheme can effectively reduce the pilot overhead and obtain better estimation performance.
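
    For reference, the plain OMP recovery loop that the optimized scheme builds on can be sketched as follows with toy dimensions; the row-block extension described in the abstract is not reproduced here.

        import numpy as np

        def omp(A, y, sparsity):
            residual, support = y.copy(), []
            x_hat = np.zeros(A.shape[1], dtype=A.dtype)
            for _ in range(sparsity):
                idx = int(np.argmax(np.abs(A.conj().T @ residual)))   # most correlated atom
                support.append(idx)
                coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
                residual = y - A[:, support] @ coef                   # update residual
            x_hat[support] = coef
            return x_hat

        rng = np.random.default_rng(0)
        A = rng.standard_normal((64, 256)) / 8.0                      # toy measurement matrix
        x = np.zeros(256); x[rng.choice(256, 5, replace=False)] = rng.standard_normal(5)
        y = A @ x
        print(np.allclose(omp(A, y, 5), x, atol=1e-6))                # noise-free recovery check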

    Multimedia computing and computer simulation
    Real-time water wave simulation method based on wave annulus particles
    Haojie GU, Jun ZHANG
    2022, 42(12):  3876-3883.  DOI: 10.11772/j.issn.1001-9081.2021091700

    A real-time two-dimensional water wave simulation method based on wave annulus particle packets was proposed to reduce the computational cost of water wave simulation and improve the fidelity of the diffusion phenomenon. In this method, the wave annulus particle was used as the primary calculation unit, the concept of "wave packet" was inherited inside the particles, and the visual effect of water waves was reproduced by superposing water waves in multiple frequency bands. Collision calculation was reduced by adding a mirror wave source to avoid complex geometric judgment when calculating the water wave reflection process. Additional calculation accuracy parameters were provided so that the algorithm could adjust the calculation complexity of water wave reflection according to different hardware computing capabilities. Experimental results show that the proposed method can use fewer particles to simulate natural water wave motion and avoid the problem of water wave fracture after collision reflection. The performance test on the same hardware platform shows that the rendering frame rate of the proposed wave annulus simulation algorithm is at least 60% higher than that of the traditional wave packet algorithm, and even achieves an acceleration of more than 400% in some cases with particularly complex water wave states.

    Semi-supervised video object segmentation via deep and shallow representations fusion
    Xiao LYU, Huihui SONG, Jiaqing FAN
    2022, 42(12):  3884-3890.  DOI: 10.11772/j.issn.1001-9081.2021091636

    In order to solve the problems that the segmentation accuracy and speed are difficult to balance and the algorithm cannot effectively distinguish similar foreground and background objects in the task of semi-supervised video object segmentation, a semi-supervised video object segmentation algorithm was proposed on the basis of deep and shallow feature fusion. Firstly, a pre-generated rough mask was used to process the image features, thereby achieving more robust features. Secondly, deep semantic information was extracted by the attention model. Finally, deep semantic information and shallow position information were fused to obtain more accurate segmentation results. Experiments were conducted on multiple popular datasets. Experimental results demonstrate that the proposed algorithm improves the Jaccard (J) index by 1.8 percentage points and improves the comprehensive evaluation index J&F, the mean of J and F-score, by 2.3 percentage points compared with the Learning Fast and Robust Target Models for Video Object Segmentation (FRTM) algorithm on the DAVIS 2016 dataset. Meanwhile, on the DAVIS 2017 dataset, the proposed algorithm improves the J index by 1.2 percentage points and improves the comprehensive evaluation index J&F by 1.1 percentage points compared with the FRTM algorithm. The above results fully prove that the proposed algorithm can achieve higher segmentation accuracy with fast speed, and effectively distinguish background and foreground objects with strong robustness, showing superior performance in balancing speed and accuracy and in effectively distinguishing foreground and background.

    Multi-attention fusion network for medical image segmentation
    Hong LI, Junying ZOU, Xicheng TAN, Guiyang LI
    2022, 42(12):  3891-3899.  DOI: 10.11772/j.issn.1001-9081.2021101737

    In the field of deep medical image segmentation, TransUNet (which merits both Transformers and U-Net) is one of the current advanced segmentation models. However, the local connections between adjacent blocks in its encoder are not considered, and the inter-channel information does not interact during the upsampling process of the decoder. To address the above problems, a Multi-attention FUsion Network (MFUNet) model was proposed. Firstly, a Feature Fusion Module (FFM) was introduced in the encoder part to enhance the local connections between adjacent blocks in the Transformer and maintain the spatial location relationships of the images themselves. Then, a Double Channel Attention (DCA) module was introduced in the decoder part to fuse the channel information of multi-level features, which enhanced the sensitivity of the model to the key information between channels. Finally, the constraints of the model on the segmentation results were strengthened by combining the cross-entropy loss and the Dice loss. Experiments on the Synapse and ACDC public datasets show that MFUNet achieves Dice Similarity Coefficient (DSC) of 81.06% and 90.91% respectively. Compared with the baseline model TransUNet, MFUNet achieves an 11.5% reduction in Hausdorff Distance (HD) on the Synapse dataset, and improves the segmentation accuracy of the right ventricle and myocardium on the ACDC dataset by 1.43 and 3.48 percentage points respectively. The experimental results show that MFUNet can achieve better segmentation results in both internal filling and edge prediction of medical images, which can help improve the diagnostic efficiency of doctors in clinical practice.
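
    The combined objective can be sketched as below: a weighted sum of cross-entropy loss and a soft multi-class Dice loss; the 0.5/0.5 weights and tensor sizes are assumptions.

        import torch
        import torch.nn.functional as F

        def dice_loss(logits, target, eps=1e-6):
            num_classes = logits.shape[1]
            probs = F.softmax(logits, dim=1)
            one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
            dims = (0, 2, 3)
            inter = (probs * one_hot).sum(dims)
            union = probs.sum(dims) + one_hot.sum(dims)
            return 1 - ((2 * inter + eps) / (union + eps)).mean()   # 1 - mean soft Dice over classes

        def combined_loss(logits, target, w_ce=0.5, w_dice=0.5):
            return w_ce * F.cross_entropy(logits, target) + w_dice * dice_loss(logits, target)

        logits = torch.randn(2, 9, 64, 64)                  # e.g. 9 organ classes (toy)
        target = torch.randint(0, 9, (2, 64, 64))
        print(combined_loss(logits, target).item())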

    Image caption generation model with adaptive commonsense gate
    You YANG, Lizhi CHEN, Xiaolong FANG, Longyue PAN
    2022, 42(12):  3900-3905.  DOI: 10.11772/j.issn.1001-9081.2021101743

    Focusing on the issues that traditional image caption models cannot make full use of image information and have only a single method of fusing features, an image caption generation model with an Adaptive Commonsense Gate (ACG) was proposed. Firstly, VC R-CNN (Visual Commonsense Region-based Convolutional Neural Network) was used to extract visual commonsense features, which were input into the Transformer encoder as a commonsense feature layer. Then, an ACG was designed in each layer of the encoder to perform adaptive fusion of the visual commonsense features and the encoding features. Finally, the encoding features fused with commonsense information were fed into the Transformer decoder to complete the training. Training and testing were carried out on the MSCOCO dataset. The results show that the proposed model reaches 39.2, 129.6 and 22.7 respectively on the evaluation indicators BLEU (BiLingual Evaluation Understudy)-4, CIDEr (Consensus-based Image Description Evaluation) and SPICE (Semantic Propositional Image Caption Evaluation), which are improvements of 3.2%, 2.9% and 2.3% respectively compared with those of the POS-SCAN (Part-Of-Speech Stacked Cross Attention Network) model. It can be seen that the proposed model significantly outperforms Transformer models that use a single salient region feature and can describe the image content accurately.
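    A minimal sketch of the gating idea, assuming a sigmoid gate over the concatenated encoder and commonsense features; the paper's exact ACG formulation may differ.

```python
# Illustrative gated fusion of encoder features with visual commonsense features.
import torch
import torch.nn as nn

class CommonsenseGate(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_feat, commonsense_feat):
        # enc_feat, commonsense_feat: (batch, num_regions, d_model)
        g = torch.sigmoid(self.gate(torch.cat([enc_feat, commonsense_feat], dim=-1)))
        # per-dimension gate decides how much commonsense information to mix in
        return g * enc_feat + (1 - g) * commonsense_feat

enc = torch.randn(2, 36, 512)        # Transformer encoder features of 36 regions
vc = torch.randn(2, 36, 512)         # visual commonsense features (e.g. from VC R-CNN)
fused = CommonsenseGate()(enc, vc)
print(fused.shape)                   # torch.Size([2, 36, 512])
```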

    Influence of channel on formant of vowel in Chinese mandarin
    Yijie LIU, Jiangchun LI, Weina CHEN, Qihan HUANG
    2022, 42(12):  3906-3912.  DOI: 10.11772/j.issn.1001-9081.2021101816
    Asbtract ( )   HTML ( )   PDF (2395KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Aiming at the problem of the influence of the channel on the characteristics of vowel formants, a systematic experiment was carried out. Firstly, standard recordings of 8 volunteers were collected. Then, the standard recordings were played through a mouth simulator, and 104 channel recordings were made using 13 different channels. Finally, the characteristic voice segments were extracted; chi-square tests were used for the qualitative analysis of the spectral characteristics, and one-sample t-tests were used for the quantitative analysis of the acoustic parameters. The statistical results show that about 69% of the channels have a significant influence on the overall pattern of the high-order formants, and about 85% of the channels show significant differences in the relative intensity of the formants. The one-sample t-test results show that there is no significant difference between the standard recordings and the channel recordings in the center frequencies of the formants. Experimental results indicate that more attention should be paid to the frequency characteristics of formants when identifying voices recorded through different channels.
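    The two statistical tests used in the analysis can be reproduced with SciPy; the numbers below are synthetic and serve only to show the test calls, not the study's data.

```python
# Illustrative analysis sketch with synthetic numbers (not the study's data):
# a one-sample t-test on formant center-frequency shifts and a chi-square test
# on counts of channels whose high-order formant pattern changed.
import numpy as np
from scipy import stats

# difference (Hz) between channel-recording and standard-recording F2 center frequency
f2_shift = np.array([12.0, -8.5, 5.1, -3.2, 9.8, -6.4, 2.2, -1.7])
t_stat, p_val = stats.ttest_1samp(f2_shift, popmean=0.0)
print(f"t = {t_stat:.3f}, p = {p_val:.3f}")    # p > 0.05 -> no significant frequency shift

# counts of channels with / without a visible change in the high-order formant pattern
observed = np.array([9, 4])                     # changed vs. unchanged (hypothetical counts)
chi2, p = stats.chisquare(observed)             # test against a uniform expectation
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```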

    Frontier and comprehensive applications
    Review of peer grading technologies for online education
    Jia XU, Jing LIU, Ge YU, Pin LYU, Panyuan YANG
    2022, 42(12):  3913-3923.  DOI: 10.11772/j.issn.1001-9081.2021101709
    Asbtract ( )   HTML ( )   PDF (1682KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    With the rapid development of online education platforms represented by Massive Open Online Courses (MOOC), how to evaluate the large-scale subjective question assignments submitted by platform learners is a big challenge. Peer grading is the mainstream scheme for addressing this challenge, and it has received wide attention from both academia and industry in recent years. Therefore, peer grading technologies for online education were surveyed and analyzed. Firstly, the general process of peer grading was summarized. Secondly, the main research results of important peer grading activities, such as grader allocation, comment analysis, detection and processing of abnormal peer grading information, and true grade estimation of subjective question assignments, were explained. Thirdly, the peer grading functions of representative online education platforms and published teaching systems were compared. Finally, the future development trends of peer grading were summarized and prospected, thereby providing reference for people who are engaged in or intend to engage in peer grading research.
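    As a toy illustration of the true-grade-estimation activity mentioned above, the following sketch iteratively re-weights graders by their agreement with the current consensus; the update rule and the data are illustrative assumptions, not any specific method from the surveyed literature.

```python
# Toy sketch: estimate "true" grades by reliability-weighted consensus.
import numpy as np

# grades[g][a] = score grader g gave to assignment a (NaN = not graded)
grades = np.array([
    [80, 90, np.nan, 70],
    [85, 95, 60,     np.nan],
    [60, np.nan, 65, 75],
], dtype=float)

weights = np.ones(grades.shape[0])             # start with equal grader reliability
mask = ~np.isnan(grades)
for _ in range(10):
    # consensus: reliability-weighted mean over the graders who scored each item
    w = weights[:, None] * mask
    consensus = (np.where(mask, grades, 0) * weights[:, None]).sum(axis=0) / w.sum(axis=0)
    # update reliability: graders far from the consensus get smaller weights
    err = np.nanmean((grades - consensus) ** 2, axis=1)
    weights = 1.0 / (err + 1e-6)

print(np.round(consensus, 1))   # estimated "true" grades per assignment
```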

    UWB-VIO integrated indoor positioning algorithm for mobile robots
    Bingqi SHEN, Zhiming ZHANG, Shaolong SHU
    2022, 42(12):  3924-3930.  DOI: 10.11772/j.issn.1001-9081.2021101778
    Asbtract ( )   HTML ( )   PDF (2499KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    For the positioning task of mobile robots in indoor environments, the emerging auxiliary positioning technology based on Visual Inertial Odometry (VIO) is heavily limited by lighting conditions and cannot work in dark environments, while Ultra-Wide Band (UWB)-based positioning methods are easily affected by Non-Line Of Sight (NLOS) errors. To solve the above problems, an indoor mobile robot positioning algorithm based on the combination of UWB and VIO was proposed. Firstly, the Stereo-Multi-State Constraint Kalman Filter (S-MSCKF) algorithm was used to obtain the position information output by VIO, and the Double Side-Two Way Ranging (DS-TWR) algorithm together with the trilateration method was used to resolve the UWB positioning information. Then, the motion equation and observation equation of the position measurement system were established. Finally, the optimal position estimate of the robot was obtained by data fusion using the Error State-Extended Kalman Filter (ES-EKF) algorithm. The built mobile positioning platform was used to verify the combined positioning method in different indoor environments. Experimental results show that in an indoor environment with obstacles, the proposed algorithm reduces the maximum error of overall positioning by about 4.4% and the mean square error of overall positioning by about 6.3% compared with the positioning method using only UWB, and reduces the maximum error by about 31.5% and the mean square error by about 60.3% compared with the positioning method using only VIO. It can be seen that the proposed algorithm can provide real-time, accurate and robust positioning results for mobile robots in indoor environments.
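    A much-simplified sketch of the fusion step, assuming a linear Kalman filter on a 1-D constant-velocity model that sequentially fuses a VIO position and a noisier UWB position; the paper's actual method is an error-state EKF, and all matrices and noise values below are illustrative assumptions.

```python
# Simplified 1-D Kalman-filter fusion of VIO and UWB position measurements.
import numpy as np

dt = 0.05
F = np.array([[1.0, dt], [0.0, 1.0]])      # state: [position, velocity]
Q = np.diag([1e-4, 1e-3])                  # process noise (assumed)
H = np.array([[1.0, 0.0]])                 # both sensors observe position only
R_vio = np.array([[0.01]])                 # VIO position noise variance (assumed)
R_uwb = np.array([[0.09]])                 # UWB position noise variance (assumed)

x, P = np.zeros((2, 1)), np.eye(2)

def update(x, P, z, R):
    """Standard Kalman measurement update."""
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ y, (np.eye(2) - K @ H) @ P

rng = np.random.default_rng(0)
for k in range(200):
    x, P = F @ x, F @ P @ F.T + Q                        # predict
    true_pos = 0.1 * k * dt                              # robot moving at 0.1 m/s
    z_vio = np.array([[true_pos + rng.normal(0, 0.1)]])
    z_uwb = np.array([[true_pos + rng.normal(0, 0.3)]])
    x, P = update(x, P, z_vio, R_vio)                    # fuse VIO measurement
    x, P = update(x, P, z_uwb, R_uwb)                    # fuse UWB measurement

print("position %.3f m, velocity %.3f m/s" % (x[0, 0], x[1, 0]))
```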

    Air passenger demand forecasting based on dual decomposition and reconstruction strategy
    Huilin LI, Hongtao LI, Zhi LI
    2022, 42(12):  3931-3940.  DOI: 10.11772/j.issn.1001-9081.2021101716
    Asbtract ( )   HTML ( )   PDF (2466KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    Considering the seasonal, nonlinear and non-stationary characteristics of air passenger demand series, an air passenger demand forecasting model based on a dual decomposition and reconstruction strategy was proposed. Firstly, the air passenger demand series was decomposed twice by the Seasonal and Trend decomposition using Loess (STL) and Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) methods, and the components were reconstructed based on the feature analysis of their complexity and correlation. Then, Seasonal AutoRegressive Integrated Moving Average (SARIMA), AutoRegressive Integrated Moving Average (ARIMA), Kernel based Extreme Learning Machine (KELM) and Bidirectional Long Short-Term Memory (BiLSTM) network models were selected by a model matching strategy to predict each reconstructed component respectively, in which the hyperparameters of the KELM and BiLSTM models were determined by the Adaptive Tree of Parzen Estimators (ATPE) algorithm. Finally, the prediction results of the reconstructed components were linearly integrated. The air passenger demand data collected from Beijing Capital International Airport, Shenzhen Bao'an International Airport and Haikou Meilan International Airport were taken as research subjects for one-step and multi-step ahead prediction experiments. Experimental results show that compared with the single decomposition ensemble model STL-SAAB, the proposed model has the Root Mean Square Error (RMSE) improved by 14.98% to 60.72%. It can be seen that, guided by the idea of "divide and conquer", the proposed model combines model matching and reconstruction strategies to extract the inherent development patterns of the data, providing a new way of thinking for scientifically predicting changes in air passenger demand.
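    A sketch of the first decomposition stage and the linear integration of component forecasts, assuming statsmodels' STL and a SARIMA model for the trend component; the CEEMDAN stage, the KELM/BiLSTM components and the ATPE tuning are omitted, and the monthly series below is synthetic.

```python
# First-stage STL decomposition of a synthetic monthly passenger series,
# per-component forecasting, and linear integration of the forecasts.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.statespace.sarimax import SARIMAX

idx = pd.date_range("2015-01", periods=96, freq="MS")
y = pd.Series(1000 + 5 * np.arange(96)
              + 200 * np.sin(2 * np.pi * np.arange(96) / 12)
              + np.random.normal(0, 30, 96), index=idx)

parts = STL(y, period=12).fit()    # stage-1 decomposition (the CEEMDAN stage is omitted here)

# forecast each component with a simple matched model, then integrate linearly
h = 12
trend_fc = SARIMAX(parts.trend, order=(1, 1, 1)).fit(disp=False).forecast(h)
seas_fc = pd.Series(parts.seasonal.iloc[-12:].values, index=trend_fc.index)  # repeat last cycle
resid_fc = pd.Series(parts.resid.mean(), index=trend_fc.index)               # naive residual forecast
print((trend_fc + seas_fc + resid_fc).round(1))
```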

    Spatio-temporal heat prediction of online car‑hailing demand based on deep aggregated neural network
    Yuhan GUO, Ning TIAN
    2022, 42(12):  3941-3949.  DOI: 10.11772/j.issn.1001-9081.2021101718
    Asbtract ( )   HTML ( )   PDF (3749KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    To solve the supply-demand imbalance between service vehicles and passengers, improve the operational efficiency and profit of service vehicles, reduce passengers' waiting time, and improve their satisfaction with the service platform, a Deep Aggregation Neural Network (DANN) model was proposed to predict online car-hailing demand from multi-dimensional spatio-temporal data with differentiated structures. Firstly, a period-based spatio-temporal variable classification method and a spatial variable classification method based on image point values were proposed by comprehensively considering multi-dimensional influencing factors such as time, space and external environment. Secondly, different sub-neural-network structures were constructed according to the data characteristics to fit the nonlinear relationships between the temporal, spatial and environmental variables and the demand, respectively. Thirdly, an aggregation method for multiple heterogeneous sub-networks was proposed to simultaneously capture the implicit features of spatio-temporal data with different structures. Finally, a method of setting the aggregation weights was analyzed to obtain the optimal performance of the network model. Experimental results show that the proposed model has an average R2 error of 9.36% on three real-world datasets; compared with the Fusion Convolutional Long Short-Term Memory Network (FCL-Net) and Hybrid Deep Learning Neural Network (HDLN-Net) models, the proposed model improves R2 by 4.6% and 5.22% on average respectively, and reduces the Mean Square Error (MSE) by 27.01% and 26.6% on average respectively. Therefore, DANN can greatly improve the accuracy of demand prediction in practical applications and can serve as an effective means of demand prediction for online car-hailing.
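    A minimal sketch of aggregating heterogeneous sub-networks with learnable weights, assuming simple temporal, spatial and environmental sub-networks and a softmax-weighted sum; the input shapes and the aggregation form are assumptions, not the paper's architecture.

```python
# Illustrative aggregation of heterogeneous sub-networks for demand prediction.
import torch
import torch.nn as nn

class DemandNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.temporal = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 32))
        self.spatial = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 32))
        self.env = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 32))
        self.agg_w = nn.Parameter(torch.ones(3))     # learnable aggregation weights
        self.head = nn.Linear(32, 1)

    def forward(self, t_x, s_x, e_x):
        feats = torch.stack([self.temporal(t_x), self.spatial(s_x), self.env(e_x)], dim=0)
        w = torch.softmax(self.agg_w, dim=0).view(3, 1, 1)
        return self.head((w * feats).sum(dim=0))     # weighted aggregation of sub-networks

model = DemandNet()
pred = model(torch.randn(4, 24),            # hourly demand history of a region
             torch.randn(4, 1, 16, 16),     # spatial demand grid
             torch.randn(4, 5))             # external environment variables
print(pred.shape)                           # torch.Size([4, 1])
```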

    Trajectory control of quadrotor based on reinforcement learning-iterative learning
    Xuguang LIU, Changping DU, Yao ZHENG
    2022, 42(12):  3950-3956.  DOI: 10.11772/j.issn.1001-9081.2021101814
    Asbtract ( )   HTML ( )   PDF (1647KB) ( )  
    Figures and Tables | References | Related Articles | Metrics

    In order to further improve the trajectory tracking accuracy of quadrotors in unknown environments, a control method that adds an iterative learning feedforward controller to the traditional feedback control architecture was proposed. To deal with the difficulty of tuning the learning parameters in Iterative Learning Control (ILC), a method of tuning and optimizing the learning parameters of the iterative learning controller using Reinforcement Learning (RL) was proposed. Firstly, RL was used to optimize the learning parameters of the iterative learning controller, and the optimal learning parameters under the current environment and task were selected to ensure the optimal control effect of the iterative learning controller. Then, with the learning ability of the iterative learning controller, the feedforward input was optimized iteratively until perfect tracking was achieved. Finally, in a simulation environment with random noise, experiments were carried out to compare the proposed Reinforcement Learning-Iterative Learning Control (RL-ILC) algorithm with the ILC method without parameter optimization, the Sliding Mode Control (SMC) method and the Proportional-Integral-Derivative (PID) control method. Experimental results show that after two iterations, the proposed algorithm reduces the total error to 0.2% of the initial error, achieving rapid convergence. Compared with the SMC and PID control methods, the RL-ILC algorithm is not affected by noise and does not produce trajectory fluctuations after convergence. These results illustrate that the proposed algorithm can effectively improve the accuracy and robustness of trajectory tracking.
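    The iterative learning feedforward update can be sketched on a toy first-order plant as below, assuming a PD-type update law u_{j+1}(t) = u_j(t) + kp*e_j(t) + kd*de_j(t)/dt whose gains would, in the paper, be tuned by RL; the plant, gains and trajectory here are illustrative only.

```python
# Toy ILC loop: refine the feedforward input over iterations on a first-order plant.
import numpy as np

dt, N = 0.01, 500
t = np.arange(N) * dt
ref = np.sin(t)                               # reference trajectory

def plant(u):
    """Toy first-order system x' = -x + u, simulated with forward Euler."""
    x = np.zeros(N)
    for k in range(N - 1):
        x[k + 1] = x[k] + dt * (-x[k] + u[k])
    return x

kp, kd = 0.8, 0.5                             # learning gains (in the paper, tuned by RL)
u = np.zeros(N)                               # feedforward input, refined over iterations
for j in range(20):
    e = ref - plant(u)                        # tracking error of iteration j
    u = u + kp * e + kd * np.gradient(e, dt)  # PD-type ILC update of the feedforward input
    print(f"iteration {j}: max |error| = {np.abs(e).max():.4f}")
```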
