Journal of Computer Applications

Feature selection method for graph neural network based on network architecture design

Dapeng XU, Xinmin HOU

2024, 44(3): 663-670. DOI: 10.11772/j.issn.1001-9081.2023030353

In recent years, researchers have proposed many improved model architectures for Graph Neural Networks (GNNs), driving performance gains in various prediction tasks. However, most GNN variants assume that all node features are equally important, which is not the case. To solve this problem, a feature selection method was proposed to improve existing models and select important feature subsets for a dataset. The proposed method consists of two components: a feature selection layer and a separate label-feature mapping. In the feature selection layer, a Softmax normalizer and a feature "soft selector" were used for feature selection. The model structure was designed following the idea of separate label-feature mapping, so that a subset of related features was selected for each label, and the union of these subsets was taken as the important feature subset of the dataset. Graph ATtention network (GAT) and GATv2 were selected as the benchmark models, and the method was applied to them to obtain new models. Experimental results show that, on node classification tasks over six datasets, the proposed models improve accuracy by 0.83% to 8.79% compared with the baseline models. The new models also select important feature subsets for the six datasets, containing 3.94% to 12.86% of the total number of features in the respective datasets. When the important feature subset is used as the new input of the benchmark model, more than 95% of the accuracy obtained with all features is still achieved; that is, the scale of the model is reduced while accuracy is largely preserved. The proposed method can therefore improve node classification accuracy and effectively select an important feature subset for a dataset.
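The selection step described above can be illustrated with a minimal sketch. It assumes that each label owns one learnable score per feature, that the scores are normalized with Softmax, and that a feature is kept for a label when its weight exceeds a cut-off; the `threshold` parameter, the function names and the example scores are hypothetical, since the abstract does not state how the subsets are cut.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_features(label_scores, threshold):
    """Pick, per label, the feature indices whose softmax weight exceeds
    `threshold`, then take the union over all labels.

    label_scores: dict mapping label -> list of scores, one per feature.
    threshold: hypothetical cut-off (an assumption, not from the paper).
    """
    per_label = {}
    for label, scores in label_scores.items():
        weights = softmax(scores)
        per_label[label] = {i for i, w in enumerate(weights) if w > threshold}
    union = set().union(*per_label.values()) if per_label else set()
    return per_label, union

# Toy scores for two labels over four features.
scores = {
    "class_a": [2.0, 0.1, 0.1, 1.9],
    "class_b": [0.1, 2.5, 0.1, 0.1],
}
per_label, important = select_features(scores, threshold=0.3)
print(per_label, sorted(important))
```

In a trained model the scores would be parameters learned jointly with the GNN; here they are fixed numbers so that the union operation is easy to follow.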

Hyperparameter optimization for neural network based on improved real coding genetic algorithm

Wei SHE, Yang LI, Lihong ZHONG, Defeng KONG, Zhao TIAN

2024, 44(3): 671-676. DOI: 10.11772/j.issn.1001-9081.2023040441

To address the problems of poor results, a tendency to fall into suboptimal solutions, and inefficiency in neural network hyperparameter optimization, a hyperparameter optimization algorithm for neural networks based on an Improved Real Coding Genetic Algorithm (IRCGA) was proposed, named IRCGA-DNN (IRCGA for Deep Neural Network). Firstly, a real-coded form was used to represent hyperparameter values, which made the hyperparameter search space more flexible. Then, a hierarchical proportional selection operator was introduced to enhance the diversity of the solution set. Finally, improved single-point crossover and mutation operators were designed to explore the hyperparameter space more thoroughly and to improve the efficiency and quality of the optimization algorithm, respectively. Two simulation datasets were used to evaluate IRCGA's performance in damage effectiveness prediction and convergence efficiency. Experimental results on the two datasets indicate that, compared with GA-DNN (Genetic Algorithm for Deep Neural Network), the proposed algorithm reduces the number of convergence iterations by 8.7% and 13.6% respectively, with little difference in Mean Square Error (MSE); compared with IGA-DNN (Improved Genetic Algorithm for Deep Neural Network), IRCGA-DNN reduces the number of convergence iterations by 22.2% and 13.6% respectively. These results show that the proposed algorithm performs better in both convergence speed and prediction performance, and is suitable for hyperparameter optimization of neural networks.
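As a rough illustration of real-coded hyperparameter search, the sketch below uses plain fitness-proportional selection, ordinary single-point crossover and Gaussian mutation. These are textbook operators standing in for the hierarchical selection and improved operators of IRCGA, which the abstract does not specify. The search bounds and `toy_loss` (a stand-in for validation MSE) are hypothetical.

```python
import random

# Hypothetical search space: (learning rate, number of hidden units).
BOUNDS = [(1e-4, 1e-1), (16.0, 256.0)]

def random_individual(rng):
    """A chromosome is simply a list of real-valued hyperparameters."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]

def single_point_crossover(a, b, rng):
    """Swap the tails of two real-coded parents at one random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(ind, rng, rate=0.2):
    """Perturb each gene with probability `rate`, clipping to its bounds."""
    out = []
    for gene, (lo, hi) in zip(ind, BOUNDS):
        if rng.random() < rate:
            gene += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(min(hi, max(lo, gene)))
    return out

def toy_loss(ind):
    """Stand-in for validation MSE; a real run would train a network here."""
    lr, units = ind
    return (lr - 0.01) ** 2 * 1e4 + ((units - 128.0) / 128.0) ** 2

def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best = min(pop, key=toy_loss)
    for _ in range(generations):
        # Fitness-proportional selection: lower loss gets a larger weight.
        weights = [1.0 / (1e-9 + toy_loss(ind)) for ind in pop]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            c1, c2 = single_point_crossover(p1, p2, rng)
            children += [mutate(c1, rng), mutate(c2, rng)]
        pop = children[:pop_size]
        best = min(pop + [best], key=toy_loss)  # keep the best seen so far
    return best

best = evolve()
print(best, toy_loss(best))
```

Because genes are stored as floats rather than bit strings, crossover and mutation act directly on hyperparameter values, which is the flexibility that real coding provides.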

Strategy of invalid clause elimination in first-order logic theorem prover

Shipan JIANG, Shuwei CHEN, Guoyan ZENG

2024, 44(3): 677-682. DOI: 10.11772/j.issn.1001-9081.2023030284

In a first-order logic theorem prover, clause preprocessing is an essential step, and clause elimination rules are an extremely important part of preprocessing. The traditional clause elimination method based on the pure literal rule has a drawback: in theory many clauses should be deleted, whereas in implementation fewer clauses are actually deleted. To make clause elimination more accurate, the clauses were classified based on the pure literal rule. The first category was called the invalid clause, which could not form a complementary pair with any clause in the clause set through equivalence substitution and should be deleted outright. The second category was called the relatively invalid clause, which was not complementary to any clause in the current clause set, but could be replaced by another clause after equivalence substitution and should be deleted after certain deduction steps. Clause elimination should in fact be a dynamic process, in which each elimination affects the invalidity of the clauses already examined. Therefore, a recursive traversal algorithm for determining clause invalidity was presented and implemented in the prover CSE1.5 (Contradiction Separation Extension 1.5). The problems in the first-order logic division of the CADE (Conference on Automated DEduction) Automated Theorem Proving (ATP) system competition from 2019 to 2022 were used as test problems. CSE1.5_IC, the version with the invalid clause elimination algorithm, proved 27 more problems than the original CSE1.5 within 300 s. Among all the FNE (FOF theorems without Equality) test cases proved by both versions of the prover, CSE1.5_IC eliminated 28 more invalid clauses per problem on average than CSE1.5, and the average solution time was reduced by 7.07 s. The experimental results show that the proposed invalid clause elimination algorithm is an effective preprocessing method, which increases the reduction accuracy on first-order logic clause sets, improves the proving ability, and shortens the proof time of the automated theorem prover.
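The dynamic nature of clause elimination can be seen in a simplified propositional sketch. This is not the CSE1.5 algorithm: real first-order clauses need unification to decide whether two literals are complementary, whereas the sketch assumes ground literals encoded as signed integers (`-x` is the negation of `x`). The function name is hypothetical.

```python
def eliminate_pure_literal_clauses(clauses):
    """Repeatedly drop clauses containing a literal whose complement
    appears in no remaining clause, until nothing changes.

    clauses: iterable of frozensets of non-zero ints (-x is NOT x).
    Returns (remaining_clauses, number_removed).
    """
    remaining = list(clauses)
    removed = 0
    changed = True
    while changed:
        changed = False
        literals = {lit for clause in remaining for lit in clause}
        keep = []
        for clause in remaining:
            # The clause cannot take part in any complementary pair
            # through a literal whose negation occurs nowhere.
            if any(-lit not in literals for lit in clause):
                removed += 1
                changed = True
            else:
                keep.append(clause)
        remaining = keep
    return remaining, removed

clauses = [frozenset(c) for c in ([1, 2], [-2, 3], [-3, 4], [5, -6], [-5, 6])]
remaining, removed = eliminate_pure_literal_clauses(clauses)
print(remaining, removed)
```

In this example the clause `{-2, 3}` survives the first pass but becomes removable once `{1, 2}` and `{-3, 4}` are gone, which is why a single pass deletes fewer clauses than the rule allows.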

Knowledge-guided visual relationship detection model

Yuanlong WANG, Wenbo HU, Hu ZHANG

2024, 44(3): 683-689. DOI: 10.11772/j.issn.1001-9081.2023040413

The task of Visual Relationship Detection (VRD) is to detect the relationships between target objects on the basis of object recognition, and it is a key technology for visual understanding and reasoning. Because objects can interact and combine freely, the number of possible relationships between objects grows combinatorially, producing many weakly correlated entity pairs, which in turn lowers the recall of subsequent relationship detection. To solve these problems, a knowledge-guided visual relationship detection model was proposed. Firstly, visual knowledge was constructed: the entity labels and relationship labels in common visual relationship detection datasets were analyzed statistically, and the interaction co-occurrence frequencies between entities and relationships were obtained as visual knowledge. Then, the constructed visual knowledge was used to optimize the combination of entity pairs: the scores of weakly correlated entity pairs were decreased, the scores of strongly correlated entity pairs were increased, the entity pairs were ranked by score, and the pairs with lower scores were deleted. The relationship scores between entities were also optimized in a knowledge-guided way to improve the recall of the model. The proposed model was evaluated on the public datasets VG (Visual Genome) and VRD. In predicate classification tasks on the VG dataset, compared with the existing model PE-Net (Prototype-based Embedding Network), Recall@50 and Recall@100 improved by 1.84 and 1.14 percentage points respectively. On the VRD dataset, compared with Coacher, Recall@20, Recall@50 and Recall@100 increased by 0.22, 0.32 and 0.31 percentage points respectively.
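A minimal sketch of how co-occurrence statistics could be used to filter entity pairs is given below. It assumes the prior is the relative frequency with which a (subject, object) label pair is annotated in the training triples and that the detector score is simply multiplied by that prior; the actual scoring rule is not given in the abstract, and all names and toy data are hypothetical.

```python
from collections import Counter

def build_pair_prior(triples):
    """Relative frequency of each (subject, object) label pair among
    annotated (subject, predicate, object) triples."""
    counts = Counter((s, o) for s, _, o in triples)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def rank_pairs(candidates, prior, keep):
    """Rescore candidate (subject, object, detector_score) tuples with the
    prior and keep the `keep` highest-scoring pairs."""
    rescored = [(s, o, score * prior.get((s, o), 0.0))
                for s, o, score in candidates]
    rescored.sort(key=lambda t: t[2], reverse=True)
    return rescored[:keep]

# Toy annotations standing in for dataset statistics.
triples = [
    ("person", "ride", "horse"),
    ("person", "wear", "hat"),
    ("person", "ride", "horse"),
    ("cup", "on", "table"),
]
prior = build_pair_prior(triples)
candidates = [("person", "horse", 0.8), ("horse", "hat", 0.9), ("person", "hat", 0.6)]
kept = rank_pairs(candidates, prior, keep=2)
print(kept)
```

The pair `("horse", "hat")` has the highest detector score but never co-occurs in the annotations, so it is pushed to the bottom and dropped.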

Joint approach of intent detection and slot filling based on multi-task learning

Aiguo SHANG, Xinjuan ZHU

2024, 44(3): 690-695. DOI: 10.11772/j.issn.1001-9081.2023040443

With the application of pre-trained language models to Natural Language Processing (NLP) tasks, joint modeling of Intent Detection (ID) and Slot Filling (SF) has improved the performance of Spoken Language Understanding (SLU). Existing methods mostly focus on the interaction between intents and slots, neglecting the influence of modeling differential text sequences on SLU tasks. To address this, a joint method for Intent Detection and Slot Filling based on Multi-task Learning (IDSFML) was proposed. Firstly, differential texts were constructed using a random mask strategy, and a neural network structure combining AutoEncoder and Attention mechanism (AEA) was designed to incorporate the features of differential text sequences into the SLU task. Secondly, a similarity distribution task was designed to make the representations of differential texts and original texts similar. Finally, the three tasks of ID, SF and differential text sequence similarity distribution were jointly trained. Experimental results on the Airline Travel Information Systems (ATIS) and SNIPS datasets show that, compared with the second-best baseline method SASGBC (Self-Attention and Slot-Gated on top of BERT with CRF), IDSFML improves the F1 score of slot filling by 1.9 and 1.6 percentage points respectively, and improves the accuracy of intent detection by 0.2 and 0.4 percentage points respectively, enhancing the accuracy of spoken language understanding tasks.
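The random mask strategy for building a differential text can be sketched as follows. The masking probability, the `[MASK]` placeholder and the function name are assumptions for illustration; the abstract does not give the actual settings.

```python
import random

def random_mask(tokens, mask_prob, rng, mask_token="[MASK]"):
    """Replace each token with `mask_token` independently with
    probability `mask_prob`, leaving sequence length unchanged."""
    return [mask_token if rng.random() < mask_prob else tok for tok in tokens]

rng = random.Random(42)
original = "show me flights from boston to denver".split()
differential = random_mask(original, mask_prob=0.3, rng=rng)
print(original)
print(differential)
```

Keeping the length unchanged means slot labels of the original utterance still align position by position with the differential text.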

Relational and interactive graph attention network for aspect-level sentiment analysis

Lei GUO, Zhen JIA, Tianrui LI

2024, 44(3): 696-701. DOI: 10.11772/j.issn.1001-9081.2023030288

Neural network models based on the attention mechanism are widely used in aspect-level sentiment analysis. However, such models ignore the dependencies between aspect words and opinion words, as well as the distances between aspect words and context words, which leads to inaccurate sentiment classification. To solve these problems, a Relational and Interactive Graph ATtention network (RI-GAT) model was established. Firstly, the semantic features of sentences were learned by a Long Short-Term Memory (LSTM) network. Then, the learned semantic features were combined with the position information of sentences to generate new features. Finally, the dependencies between aspect words and opinion words were extracted from the new features, making efficient and comprehensive use of syntactic dependency information and position information. Experimental results on the Laptop, Restaurant, and Twitter datasets show that, compared with the second-best model, Dynamic Multi-channel Graph Convolutional Network (DM-GCN), the RI-GAT model improves classification Accuracy (Acc) by 0.67, 1.65, and 1.36 percentage points respectively, indicating that RI-GAT better captures the relationship between aspect words and opinion words and makes sentiment classification more accurate.

Chinese named entity recognition combining prior knowledge and glyph features

Yongfeng DONG, Jiaming BAI, Liqin WANG, Xu WANG

2024, 44(3): 702-708. DOI: 10.11772/j.issn.1001-9081.2023030361

To address the problem that existing models typically model only characters and related vocabulary, without fully utilizing the unique glyph structure information of Chinese characters or entity type information, a model that integrates prior knowledge and glyph features for the Named Entity Recognition (NER) task was proposed. Firstly, the input sequence was encoded using a Transformer combined with a Gaussian attention mechanism, and the Chinese definitions of entity types were obtained from Chinese Wikipedia. A Bidirectional Gated Recurrent Unit (BiGRU) was used to encode the entity type information as prior knowledge, which was combined with the character representation using an attention mechanism. Secondly, a Bidirectional Long Short-Term Memory (BiLSTM) network was used to encode the long-distance dependencies of the input sequence, and a glyph encoding table was used to obtain the Cangjie codes of traditional Chinese characters and the modern Wubi codes of simplified Chinese characters. Then, a Convolutional Neural Network (CNN) was used to extract glyph feature representations, and the traditional and simplified glyph feature representations were combined with different weights and then fused with the BiLSTM-encoded character representation through a gating mechanism. Finally, decoding was performed using a Conditional Random Field (CRF) to obtain the sequence of named entity annotations. Experimental results on the colloquial dataset Weibo, the small dataset Boson, and the large dataset PeopleDaily show that, compared with the baseline model MECT (Multi-metadata Embedding based Cross-Transformer), the proposed model increases the F1 value by 2.47, 1.20, and 0.98 percentage points respectively, demonstrating its effectiveness.

Text classification based on pre-training model and label fusion

Hang YU, Yanling ZHOU, Mengxin ZHAI, Han LIU

2024, 44(3): 709-714. DOI: 10.11772/j.issn.1001-9081.2023030340

Accurate classification of massive user text comment data has important economic and social benefits. In most current text classification methods, a text encoding method is applied directly before various classifiers, while the prompt information contained in the label text is ignored. To address this issue, a Text and Label Information Fusion Classification model based on the pre-trained model RoBERTa (Robustly optimized BERT pretraining approach) was proposed, namely TLIFC-RoBERTa. Firstly, a RoBERTa pre-trained model was used to obtain the word vectors. Then, a Siamese network structure was used to train the text and label vectors separately, and the label information was mapped to the text through interactive attention, so as to integrate the label information into the model. Finally, an adaptive fusion layer was set to closely fuse the text representation with the label representation for classification. Experimental results on the Today Headlines and THUCNews datasets show that, compared with mainstream deep learning models such as RA-Labelatt (replacing static word vectors in Label-based attention improved model with word vectors trained by RoBERTa-wwm) and LEMC-RoBERTa (RoBERTa combined with Label-Embedding-based Multi-scale Convolution for text classification), TLIFC-RoBERTa achieves the highest accuracy and the best classification performance on user comment datasets.

Multi-feature fusion attention-based hierarchical classification method for dialogue act

Zongze JIA, Pengfei GAO, Yinglong MA, Xiaofeng LIU, Haixin XIA

2024, 44(3): 715-721. DOI: 10.11772/j.issn.1001-9081.2023030358

Deep learning models have been widely applied to dialogue act recognition, where they improve classification performance by mining various features of dialogue acts. However, existing methods neglect the latent association and interaction between different features of dialogue acts and seldom consider the semantic relevance between dialogue act labels during classification, which limits further improvement in recognition performance. To solve these problems, an MFA-HC (Multi-feature Fusion Attention-based Hierarchical Classification) method for recognizing dialogue acts was proposed. Firstly, a hierarchical dialogue act classification framework based on learning without forgetting was proposed, which combined various fine-grained features such as words, parts of speech and relevant linguistic statistics to train the dialogue act classification model. Secondly, a universality-individuality model based on the attention mechanism was proposed to capture the universality and individuality features among different features. Experimental results on the two benchmark datasets SwDA (Switchboard Dialogue Act corpus) and MRDA (ICSI Meeting Recorder Dialogue Act corpus) show that, compared with DARER (Dual-tAsk temporal Relational rEcurrent Reasoning network), which has the best overall performance among existing methods, the MFA-HC method improves classification accuracy by 0.6% and 0.1% respectively by capturing the universality and individuality features hidden in the utterance.

Remote sensing image recommendation method based on content interpretation

Yuqiu LI, Liping HOU, Jian XUE, Ke LYU, Yong WANG

2024, 44(3): 722-731. DOI: 10.11772/j.issn.1001-9081.2023030313

With the continuous development of remote sensing technology, the volume of remote sensing data has increased significantly, and providing accurate and timely remote sensing information recommendation services has become an urgent problem. Existing remote sensing image recommendation algorithms mainly focus on the user portrait, overlooking the influence of image content semantics on recommendation results. To address these issues, a remote sensing image recommendation method based on content interpretation was proposed. Firstly, an object extraction module based on YOLOv3 was used to extract objects from remote sensing images. Then, the location distribution vectors of key objects were integrated as image content information. Additionally, a multi-element user interest portrait was constructed and dynamically adjusted based on the user's active search history to make recommendation results more personalized. Finally, the image content information was matched with the inherent attribute information of the image and the user portrait model to achieve accurate and intelligent recommendation of remote sensing data. Comparative experiments were conducted on real order data to compare the proposed method with a recent recommendation method based solely on image attribute information. Experimental results show that the proposed method achieves a 70% improvement in the discrimination of positive and negative samples on the experimental data compared with the recommendation method considering the user portrait. When 10% of the training data is used, with similar time consumption, the recommendation error rate decreases by 4.0 to 5.6 percentage points compared with the comparison method. When 100% of the training data is used, the recommendation error rate decreases by 0.6 to 1.0 percentage points. These results validate the feasibility and effectiveness of the proposed method.

Remote sensing image classification based on sample incremental learning

Xue LI, Guangle YAO, Honghui WANG, Jun LI, Haoran ZHOU, Shaoze YE

2024, 44(3): 732-736. DOI: 10.11772/j.issn.1001-9081.2023030366

Deep learning models have achieved remarkable results in remote sensing image classification. As new remote sensing images are continuously collected, however, when a deep learning based classification model is trained on new data to learn new knowledge, its recognition performance on old data declines; that is, old knowledge is forgotten. To help remote sensing image classification models consolidate old knowledge while learning new knowledge, a remote sensing image classification model based on sample incremental learning, namely ICLKM (Incremental Collaborative Learning Knowledge Model), was proposed. The model consisted of two knowledge networks. The first network mitigated knowledge forgetting by retaining the output of the old model through knowledge distillation. The second network took the output on new data as the learning objective of the first network and learned new knowledge effectively by maintaining the consistency of the dual network models. Finally, the two networks learned together through a knowledge collaboration strategy to generate a more accurate model. Experimental results on the two remote sensing datasets NWPU-RESISC45 and AID show that ICLKM improves accuracy by 3.53 and 6.70 percentage points respectively compared with the FT (Fine-Tuning) method. ICLKM can therefore effectively alleviate the knowledge forgetting problem in remote sensing image classification and continuously improve the recognition accuracy on known remote sensing images.
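Retaining the old model's output through knowledge distillation is commonly expressed as a divergence between temperature-softened class distributions. The sketch below shows that generic formulation, not the ICLKM loss itself; the temperature value and the function names are assumptions.

```python
import math

def softmax_t(logits, temperature):
    """Softmax of logits divided by a temperature (higher = softer)."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL divergence KL(p_old || p_new) between softened outputs.
    It is zero when the new model reproduces the old model's output."""
    p = softmax_t(old_logits, temperature)
    q = softmax_t(new_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

old = [2.0, 0.5, -1.0]      # old model's logits for one image
drifted = [0.2, 1.5, -1.0]  # new model after training on new data
print(distillation_loss(old, old), distillation_loss(old, drifted))
```

Adding such a term to the training objective penalizes the new model for drifting away from what the old model predicted on the same input.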

Semantic segmentation method for remote sensing images based on multi-scale feature fusion

Ning WU, Yangyang LUO, Huajie XU

2024, 44(3): 737-744. DOI: 10.11772/j.issn.1001-9081.2023040439

To improve the accuracy of semantic segmentation for remote sensing images and address the loss of small-target information during feature extraction by a Deep Convolutional Neural Network (DCNN), a semantic segmentation method based on multi-scale feature fusion, named FuseSwin, was proposed. Firstly, an Attention Enhancement Module (AEM) was introduced into the Swin Transformer to highlight the target area and suppress background noise. Secondly, a Feature Pyramid Network (FPN) was used to fuse the detailed information and high-level semantic information of the multi-scale features to complement the features of the target. Finally, the Atrous Spatial Pyramid Pooling (ASPP) module was used to capture the contextual information of the target from the fused feature map and further improve segmentation accuracy. Experimental results demonstrate that the proposed method outperforms current mainstream segmentation methods. On the Potsdam remote sensing dataset, the mean Pixel Accuracy (mPA) and mean Intersection over Union (mIoU) of the proposed method are 2.34 and 3.23 percentage points higher than those of DeepLabV3, and 1.28 and 1.75 percentage points higher than those of SegFormer. Additionally, the proposed method was applied to identify and segment oyster rafts in high-resolution remote sensing images of the Maowei Sea in Qinzhou, Guangxi, achieving a Pixel Accuracy (PA) of 96.21% and an Intersection over Union (IoU) of 91.70%.
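For reference, mPA and mIoU can be computed from a class confusion matrix as in the sketch below. It uses the standard definitions (per-class pixel accuracy and intersection over union, averaged over classes); the confusion matrix is a made-up example, and scoring a class with no pixels as 0 is a choice made here, not something stated in the abstract.

```python
def segmentation_metrics(conf):
    """Return (mPA, mIoU) from a square confusion matrix where
    conf[i][j] is the number of pixels of true class i predicted as j."""
    n = len(conf)
    accuracies, ious = [], []
    for c in range(n):
        tp = conf[c][c]
        true_total = sum(conf[c])                       # pixels of class c
        pred_total = sum(conf[r][c] for r in range(n))  # pixels predicted c
        union = true_total + pred_total - tp
        accuracies.append(tp / true_total if true_total else 0.0)
        ious.append(tp / union if union else 0.0)
    return sum(accuracies) / n, sum(ious) / n

conf = [[8, 2],
        [1, 9]]
mpa, miou = segmentation_metrics(conf)
print(round(mpa, 4), round(miou, 4))
```

IoU penalizes false positives as well as false negatives, which is why mIoU is lower than mPA on the same predictions.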

Few-shot object detection combining feature fusion and enhanced attention

Xinye LI, Yening HOU, Yinghui KONG, Zhiqi YAN

2024, 44(3): 745-751. DOI: 10.11772/j.issn.1001-9081.2023030315

To fully utilize the key information in support features and query features, a few-shot object detection method based on feature fusion and enhanced attention was proposed, namely FFA-FSOD (Feature Fusion and enhanced Attention Few-Shot Object Detection). Firstly, the iterative Attention Feature Fusion (iAFF) module was introduced to effectively fuse the key features of the support image and the query image. Secondly, a feature enhancement operation was added after the iAFF module, which made full use of the support feature information to enhance the object features in the query image. To avoid losing details of the query image after these two operations, the Multi-Scale Channel Attention Module (MS-CAM) in the iAFF module was improved to capture more context information. Experimental results on the MS COCO dataset under the 2-way 10-shot condition show that, compared with the FSOD (Few-Shot Object Detection) method, FFA-FSOD, with the iAFF module, the feature enhancement operation and the improved MS-CAM, increases mean Average Precision (mAP) by 8.0%. The results show that the proposed feature fusion and enhancement method attends closely to feature details and thus achieves better few-shot object detection.

Motif detection algorithm in multiplex networks

Shuhong XUE, Biao FENG, Hailong YU, Li WANG, Yunyun YANG

2024, 44(3): 752-759. DOI: 10.11772/j.issn.1001-9081.2023030300

Multiplex networks vividly describe the interactions between entities in complex systems, and motifs appear frequently in networks as a higher-order structure. Compared with single-layer motifs, multiplex motifs are more numerous, more diverse in type, and more complicated in structure. Given the current lack of a complete detection algorithm for multiplex motifs, a Fast Algorithm for Multiplex Motif Detection (FAMMD) suitable for multiplex networks was proposed. Firstly, an improved ESU (Enumerate SUbgraphs) algorithm was used to enumerate multiplex subgraphs. Then, a method combining layer markers and binary strings was used to accelerate isomorphism detection, and a null model that preserved degree sequences and inter-layer dependencies was constructed for multiplex subgraph testing. Finally, motif detection was performed on two-layer real networks. Multiplex motifs exhibited a closely connected triple mode, and they were more homogeneous in social networks while more complementary in transportation networks. Experimental results show that the proposed method can accurately and quickly detect multiplex motifs that reflect the structural characteristics of the network and conform to the actual situation.

Social event recommendation method based on unexpectedness metric

Tao SUN, Zhangtian DUAN, Haonan ZHU, Peihao GUO, Heli SUN

2024, 44(3): 760-766. DOI: 10.11772/j.issn.1001-9081.2023030362

In an Event-Based Social Network (EBSN), recommendation typically models user preferences from the user's historical preferences, which limits the scope and ways in which users can encounter new things. To address this problem, an unexpectedness metric-based social event recommendation model was proposed, namely UER (Unexpectedness-based Event Recommendation). The UER model included two sub-models, Base and Unexpected. Firstly, based on the interaction sequence characteristics of users, events, and user historical events, the Base sub-model used the attention mechanism to measure the weights of events in the user's historical preferences and then predicted the probabilities of users participating in events. Secondly, the Unexpected sub-model extracted multiple interest representations of the user through the self-attention mechanism, and from these representations calculated the unexpectedness of the user and the unexpectedness value of the candidate event to the user, so as to measure the unexpectedness of the recommended event. Experimental results on the Meetup-California dataset show that, compared with Deep Interest Network (DIN) and Personalized Unexpected Recommender System (PURS), the UER model increases the recommendation Hit Ratio (HR) by 22.9% and 30.3%, the Normalized Discounted Cumulative Gain (NDCG) by 27.5% and 42.3%, and the unexpectedness of recommended events by 54.5% and 21.4% respectively. On the Meetup-NewYork dataset, the UER model increases HR by 18.2% and 21.8%, NDCG by 26.9% and 32.0%, and the unexpectedness of recommended events by 52.6% and 20.8% respectively.
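The two ranking metrics reported above can be sketched for the common evaluation setting in which each test user has exactly one held-out event. That setting, the cut-off `k` and the event identifiers are assumptions; the abstract does not describe the evaluation protocol.

```python
import math

def hit_ratio_at_k(ranked, target, k):
    """1.0 if the held-out event appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """With a single relevant item the ideal DCG is 1, so NDCG reduces
    to 1 / log2(rank + 1), where rank is the 1-based position."""
    for pos, item in enumerate(ranked[:k]):
        if item == target:
            return 1.0 / math.log2(pos + 2)
    return 0.0

ranked = ["e3", "e1", "e7"]  # hypothetical ranked event ids for one user
print(hit_ratio_at_k(ranked, "e1", 2), ndcg_at_k(ranked, "e1", 2))
print(hit_ratio_at_k(ranked, "e7", 2), ndcg_at_k(ranked, "e7", 2))
```

Averaging these values over all test users gives the dataset-level HR and NDCG; HR only checks presence in the list, while NDCG also rewards placing the event higher.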

Feature selection algorithm for high-dimensional data with maximum correlation and maximum difference

Shengjie MENG, Wanjun YU, Ying CHEN

2024, 44(3): 767-771. DOI: 10.11772/j.issn.1001-9081.2023030365

To address the problems of redundant information and excessive dimensionality in high-dimensional data, a Maximum Correlation maximum Difference feature selection algorithm (MCD) based on the maximum correlation of information quantity was proposed. Firstly, drawing on information theory, Mutual Information (MI) was used to measure the correlation between features and labels, and the features were sorted so that those with the largest mutual information were selected into the feature subset. Then, the information distance was introduced to measure the information redundancy and difference between two features, and an evaluation criterion was designed to evaluate each feature, so that both the correlation between features and labels and the difference between features were maximized. Finally, a forward search strategy combined with the evaluation criterion was used to reduce the attributes and optimize the feature subset. Using 2 different classifiers, comparative experiments were carried out on 6 datasets against 5 classical algorithms such as mRMR (minimal-Redundancy-Maximal-Relevance criterion) and RReliefF, and the validity of MCD was verified by classification accuracy. Under the Support Vector Machine (SVM) classifier, the average classification accuracy increased by 5.67 to 23.80 percentage points; under the K-Nearest Neighbor (KNN) classifier, the average classification accuracy increased by 2.69 to 25.18 percentage points. In the vast majority of cases, MCD can effectively remove redundant features and significantly improve classification accuracy.
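The first step, ranking features by their mutual information with the label, can be sketched for discrete features as below. Only this relevance step is shown: the information-distance criterion and the forward search of MCD are not specified in the abstract. The toy data and names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values,
    estimated from empirical frequencies."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

labels = [0, 0, 1, 1]
features = {
    "f_a": [0, 0, 1, 1],  # determines the label exactly
    "f_b": [0, 1, 0, 1],  # independent of the label
}
ranking = sorted(features,
                 key=lambda f: mutual_information(features[f], labels),
                 reverse=True)
print(ranking)
```

A feature that fully determines a balanced binary label scores 1 bit, while an independent feature scores 0, so the ranking places `f_a` first.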

High-capacity robust image steganography scheme based on encoding-decoding network

Weina DONG, Jia LIU, Xiaozhong PAN, Lifeng CHEN, Wenquan SUN

2024, 44(3): 772-779. DOI: 10.11772/j.issn.1001-9081.2023040477

To address the problems that high-capacity steganography models based on encoding-decoding networks have weak robustness and cannot resist noise attacks and channel compression, a high-capacity robust image steganography scheme based on an encoding-decoding network was proposed. In the proposed scheme, an encoder, a decoder and a discriminator based on Densely connected convolutional Network (DenseNet) were designed. The secret information and the carrier image were jointly encoded into a steganographic image by the encoder, the secret information was extracted by the decoder, and the discriminator was used to distinguish between carrier images and steganographic images. A noise layer was added between the encoder and the decoder; Dropout, JPEG compression, Gaussian blur, Gaussian noise and salt-and-pepper noise were used to simulate a real environment with various kinds of noise attacks. The steganographic image output by the encoder was processed by the different kinds of noise and then decoded by the decoder. Through training, the decoder learned to extract the secret information from the noise-processed steganographic image, so that noise attacks could be resisted. Experimental results show that the steganographic capacity of the proposed scheme reaches 0.45 to 0.95 bpp on 360×360 pixel images, and the relative embedding capacity is improved by 2.04 times compared with the second-best robust steganographic scheme; the decoding accuracy reaches 0.72 to 0.97, and compared with steganography without the noise layer, the average decoding accuracy is improved by 44 percentage points. The proposed scheme not only guarantees high embedding capacity and high quality of the encoded image, but also has stronger anti-noise capability.

Reversible information hiding based on pixel prediction and secret image sharing

Qingyu YUAN, Tiegang GAO

2024, 44(3): 780-787. DOI: 10.11772/j.issn.1001-9081.2023030321

In order to improve the security of image encryption and increase the information hiding capacity of encrypted images， a reversible information hiding algorithm based on pixel prediction and secret image sharing was proposed. Firstly， the shared matrix was used to process an image line-by-line and save it into four shared images. Secondly， random key-encrypted shared images were generated using a 2D chaotic mapping. Thirdly， the Median Edge Detector （MED） was used to predict the pixel value of the embedded position in the shared image. The predicted value was aligned to the same bits as the original pixel from the high bit， and the label value was recorded according to the rule. The high three bits of the reference pixel were stored into the embeddable bits. Finally， the label value was stored in the high bits of the reference pixel， and the remaining embeddable bits were the embedding capacity of the proposed algorithm. Experimental results show that the proposed algorithm can provide large-capacity embedding space for information hiding， and can realize reversible data hiding and lossless recovery of encrypted images according to the （k，n） threshold strategy.
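The MED prediction step can be sketched as follows. The predictor is the standard Median Edge Detector; the second function, which counts how many high bits of the prediction agree with the original pixel, is only an assumed reading of the label rule, since the abstract does not state that rule in detail.

```python
def med_predict(left, above, upper_left):
    """Standard MED predictor from the three causal neighbours of a pixel."""
    if upper_left >= max(left, above):
        return min(left, above)
    if upper_left <= min(left, above):
        return max(left, above)
    return left + above - upper_left

def matching_high_bits(pixel, predicted, width=8):
    """Count identical bits from the most significant bit downwards.

    Hypothetical label rule: the count is the number of leading bits that
    the prediction already reproduces and that could therefore be reused.
    """
    for k in range(width):
        shift = width - 1 - k
        if (pixel >> shift) & 1 != (predicted >> shift) & 1:
            return k
    return width
```

For example, with neighbours `left=100`, `above=120`, `upper_left=110` the prediction is `100 + 120 - 110 = 110`.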

Analysis of consistency between sensitive behavior and privacy policy of Android applications

Baoshan YANG, Zhi YANG, Xingyuan CHEN, Bing HAN, Xuehui DU

2024, 44(3): 788-796. DOI: 10.11772/j.issn.1001-9081.2023030290

The privacy policy document declares the privacy information that an application needs to obtain， but it cannot guarantee that it clearly and fully discloses the types of privacy information that the application obtains. Currently， there are still deficiencies in the analysis of the consistency between actual sensitive behaviors of applications and privacy policies. To address the above issues， a method for analyzing the consistency between sensitive behaviors and privacy policies of Android applications was proposed. In the privacy policy analysis stage， a Bi-GRU-CRF （Bi-directional Gated Recurrent Unit Conditional Random Field） neural network was used and the model was incrementally trained by adding a custom annotation library to extract key information from the privacy policy declaration. In the sensitive behavior analysis stage， IFDS （Interprocedural， Finite， Distributive， Subset） algorithm was optimized by classifying sensitive API （Application Programming Interface） calls， deleting already analyzed sensitive API calls from the input sensitive source list， and marking already extracted sensitive paths. It ensured that the analysis results of sensitive behaviors matched the language granularity of the privacy policy description， reduced the redundancy of the analysis results and improved the efficiency of analysis. In the consistency analysis stage， the semantic relationships between ontologies were classified into equivalence， subordination， and approximation relationships， and a formal model for consistency between sensitive behaviors and privacy policies was defined based on these relationships. The consistency situations between sensitive behaviors and privacy policies were classified into clear expression and ambiguous expression， and inconsistency situations were classified into omitted expression， incorrect expression， and ambiguous expression. Finally， based on the proposed semantic similarity-based consistency analysis algorithm， the consistency between sensitive behaviors and privacy policies was analyzed. Experimental results show that， by analyzing 928 applications， with the privacy policy analysis accuracy of 97.34%， 51.4% of Android applications are found to have inconsistencies between the actual sensitive behaviors and the privacy policy declaration.

SAT-based impossible differential cryptanalysis of GRANULE cipher

Xiaonian WU, Jing KUANG, Runlian ZHANG, Lingchen LI

2024, 44(3): 797-804. DOI: 10.11772/j.issn.1001-9081.2023040435

The Boolean SATisfiability problem （SAT）-based automated search methods can directly describe logical operations such as AND， OR， NOT and XOR， and establish more efficient search models. In order to efficiently evaluate the ability of GRANULE cipher to resist impossible differential attacks， firstly， the SAT model described by the S-box differential property was optimized based on the S-box differential distribution table property. Then， the SAT model of bit-oriented impossible differential distinguisher was established for GRANULE cipher， and multiple 10-round impossible differential distinguishers of GRANULE cipher were obtained by solving the SAT model. Furthermore， an improved SAT automated verification method was given， by which the impossible differential distinguishers were verified. Finally， a 16-round impossible differential attack was performed on GRANULE-64/80 cipher， where the impossible differential distinguisher was extended forward by 3 rounds and backward by 3 rounds. As a result， the 80-bit master key was recovered with the time complexity of 2^51.8 16-round encryptions and the data complexity of 2^41.8 chosen-plaintexts. Compared with the suboptimal results for impossible differential cryptanalysis of the GRANULE cipher， the number of distinguisher rounds and key recovery attack rounds obtained are improved by 3 rounds， and the time complexity and data complexity are further reduced.
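To make concrete how a SAT model describes XOR directly, the sketch below gives the standard four-clause CNF encoding of c = a XOR b and checks it by enumeration. This is a generic textbook encoding, not the authors' GRANULE model; the variable numbering (a=1, b=2, c=3) and helper names are assumptions for illustration.

```python
from itertools import product

# CNF clauses for c = a XOR b, with literals as signed integers
# (positive = variable, negative = its negation): a=1, b=2, c=3.
XOR_CLAUSES = [[-1, -2, -3], [1, 2, -3], [1, -2, 3], [-1, 2, 3]]

def satisfies(clauses, assignment):
    """True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def models(clauses, n_vars):
    """Enumerate all satisfying assignments by brute force."""
    found = []
    for bits in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(clauses, assignment):
            found.append(bits)
    return found
```

Each clause excludes exactly one of the four assignments in which c differs from a XOR b, so the satisfying assignments are exactly the XOR truth table. A bit-oriented cipher model chains many such constraints, one per XOR in the round function.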

Two-round three-party password-authenticated key exchange protocol over lattices without non-interactive zero-knowledge proof

Xinyuan YIN, Xiaojian ZHENG, Jinbo XIONG

2024, 44(3): 805-810. DOI: 10.11772/j.issn.1001-9081.2023040417

Focused on the issues of many communication rounds and low execution efficiency in existing lattice-based three-party Password-Authenticated Key Exchange （PAKE） protocols， a two-round three-party PAKE protocol over lattices without Non-Interactive Zero-Knowledge （NIZK） proof was proposed. First， the advantage of non-adaptive approximate smooth projective hash function was taken to achieve key exchange and reduce the number of communication rounds without NIZK proof. Second， session keys were constructed by using hash values and projection hash values without random oracles， thus avoiding potential password guessing attacks. Finally， formal security proof of the proposed protocol was given in the standard model. Simulation results show that compared with existing lattice-based three-party PAKE protocols， the proposed protocol has the execution time reduced by 89.2% - 98.6% on the client side and 19.0% - 91.6% on the server side. It is verified that the proposed protocol can resist quantum attacks with high execution efficiency and few communication rounds.

Privacy protection scheme for crowdsourced testing tasks based on blockchain and CP-ABE policy hiding

Gaimei GAO, Jin ZHANG, Chunxia LIU, Weichao DANG, Shangwang BAI

2024, 44(3): 811-818. DOI: 10.11772/j.issn.1001-9081.2023040430

In order to improve the crowdsourced testing data sharing system in the cloud environment and solve the problems of data security and privacy protection in the field of crowdsourced testing， a Crowdsourced Testing Task Privacy Protection （CTTPP） scheme based on blockchain and CP-ABE （Ciphertext-Policy Attribute-Based Encryption） policy hiding was proposed. Blockchain technology and attribute based encryption were combined to improve the privacy of crowdsourced testing data sharing by the proposed scheme. Firstly， the terminal internal nodes were used to construct an access tree to express the access policy， and the exponentiation operation and bilinear pairing operation in CP-ABE were used to realize policy hiding， so as to improve the privacy protection ability of data sharing in the crowdsourced testing scenarios. Secondly， the blockchain smart contract was called to automatically verify the legitimacy of data visitors， and to complete the verification of task ciphertext access rights together with the cloud server， further improving the security of crowdsourced testing tasks. The performance test results show that the average encryption and decryption time is shorter， and the calculation overhead of encryption and decryption is lower， than those of access tree policy hiding algorithms of the same type. In addition， when the frequency of decryption requests reaches 1 000 transactions per second， the processing capacity of blockchain is saturated gradually， and the maximum processing delay for data uplinking and data querying is 0.80 s and 0.12 s， so the proposed scheme is suitable for lightweight commercial crowdsourced testing application scenarios.

Survey of subgroup optimization strategies for intelligent algorithms

Xiaoxin DU, Wei ZHOU, Hao WANG, Tianru HAO, Zhenfei WANG, Mei JIN, Jianfei ZHANG

2024, 44(3): 819-830. DOI: 10.11772/j.issn.1001-9081.2023030380

The optimization of swarm intelligence algorithms is a main way to improve swarm intelligence algorithms. As the swarm intelligence algorithms are more and more widely used in all kinds of model optimization， production scheduling， path planning and other problems， the demand for performance of intelligent algorithms is also getting higher and higher. As an important means to optimize swarm intelligence algorithms， subgroup strategies can balance the global exploration ability and local exploitation ability flexibly， and have become one of the research hotspots of swarm intelligence algorithms. In order to promote the development and application of subgroup strategies， the dynamic subgroup strategy， the subgroup strategy based on master-slave paradigm， and the subgroup strategy based on network structure were investigated in detail. The structural characteristics， improvement methods and application scenarios of various subgroup strategies were expounded. Finally， the current problems and the future research trends and development directions of the subgroup strategies were summarized.

K-means clustering based on adaptive cuckoo optimization feature selection

Lin SUN, Menghan LIU

2024, 44(3): 831-841. DOI: 10.11772/j.issn.1001-9081.2023030351

In the K-means clustering algorithm， the initial number of clusters is randomly determined and a large number of redundant features are contained in the original datasets， which leads to a decrease in clustering accuracy； in addition， the Cuckoo Search （CS） algorithm has the disadvantages of low convergence speed and weak local search. To address these issues， a K-means clustering algorithm combined with Dynamic CS Feature Selection （DCFSK） was proposed. Firstly， an adaptive step size factor was designed during the Levy flight phase to improve the search speed and accuracy of the CS algorithm. Then， to adjust the balance between global search and local search， and accelerate the convergence of the CS algorithm， the discovery probability was dynamically adjusted. An Improved Dynamic CS algorithm （IDCS） was constructed， and then a Dynamic CS-based Feature Selection algorithm （DCFS） was built. Secondly， to improve the calculation accuracy of the traditional Euclidean distance， a weighted Euclidean distance was designed to simultaneously consider the contribution of samples and features to distance calculation. To determine the selection scheme of the optimal number of clusters， the weighted intra-cluster and inter-cluster distances were constructed based on the improved weighted Euclidean distance. Finally， to overcome the defect that the objective function of the traditional K-means clustering only considers the distance within the clusters and does not consider the distance between the clusters， an objective function based on the contour coefficient of median was proposed. Thus， a K-means clustering algorithm based on the adaptive cuckoo optimization feature selection was designed. Experimental results show that， on ten benchmark test functions， IDCS achieves the best metrics. Compared to algorithms such as K-means and DBSCAN （Density-Based Spatial Clustering of Applications with Noise）， DCFSK achieves the best clustering effects on six synthetic datasets and six UCI datasets.
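A weighted Euclidean distance of the general kind mentioned in this abstract can be sketched as below. The abstract states that both sample and feature contributions are weighted but gives no formula, so this sketch assumes per-feature weights only; the function name and the weight vector `w` are hypothetical.

```python
import numpy as np

def weighted_euclidean(x, y, w):
    """Distance sqrt(sum_j w_j * (x_j - y_j)^2) with per-feature weights w."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))
```

Setting a feature's weight to zero removes it from the distance, which is how a feature selection result could feed directly into the clustering step.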

Stochastic local search algorithm for solving exact satisfiability problem

Xingyu ZHAO, Xiaofeng WANG, Yi YANG, Lichao PANG, Lan YANG

2024, 44(3): 842-848. DOI: 10.11772/j.issn.1001-9081.2023030364

SATisfiability problem （SAT） is an NP-complete problem， which is widely used in artificial intelligence and machine learning. Exact SATisfiability problem （XSAT） is an important subproblem of SAT. Most of the current research on XSAT is at the theoretical level， and few efficient solution algorithms have been studied， especially stochastic local search algorithms with efficient verifiability. To address the above problems and analyze some properties of both basic and equivalent coding formulas， a stochastic local search algorithm WalkXSAT was proposed for solving XSAT directly. Firstly， the random local search framework was used for basic search and condition determination. Secondly， the appropriate unsatisfiable scoring value of the text to which the variables belonged was added， and the variables that were not easily and appropriately satisfied were prioritized. Thirdly， the search space was reduced using the heuristic strategy of anti-repeat selection of flipped variables. Finally， multiple sources and multiple formats of examples were used for comparison experiments. Compared with ProbSAT algorithm， the number of variable flips and the solving time of WalkXSAT are significantly reduced when directly solving the XSAT. In the example after solving the basic encoding transformation， when the variable size of the example is greater than 100， the ProbSAT algorithm is no longer effective， while WalkXSAT can still solve the XSAT in a short time. Experimental results show that the proposed algorithm WalkXSAT has high accuracy， strong stability， and fast convergence speed.
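For readers unfamiliar with XSAT, the sketch below shows the exactness condition (every clause must contain exactly one true literal) and a plain random-walk local search over it. It is not WalkXSAT: the scoring values and the anti-repeat flip strategy described in the abstract are not reproduced, and all names here are hypothetical.

```python
import random

def true_count(clause, assign):
    """Number of true literals in a clause (literals are signed integers)."""
    return sum(assign[abs(lit)] == (lit > 0) for lit in clause)

def is_exact_model(clauses, assign):
    """XSAT condition: exactly one true literal in every clause."""
    return all(true_count(c, assign) == 1 for c in clauses)

def random_walk_xsat(clauses, n_vars, max_flips=10_000, seed=0):
    """Flip a random variable of a random violated clause until exact."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for _ in range(max_flips):
        violated = [c for c in clauses if true_count(c, assign) != 1]
        if not violated:
            return assign
        var = abs(rng.choice(rng.choice(violated)))
        assign[var] = not assign[var]
    return None  # no exact model found within the flip budget
```

A heuristic solver would replace the uniform choice of `var` with a score-guided choice; that choice is where the methods differ.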

Application of quantum approximate optimization algorithm in exact cover problems

Lingling GUO, Zhiqiang LI, Menghuan DUAN

2024, 44(3): 849-854. DOI: 10.11772/j.issn.1001-9081.2023030332

Exact cover problems are NP-complete problems in combinatorial optimization， and it is difficult to solve them in polynomial time by using classical algorithms. In order to solve this problem， on the open source quantum computing framework Qiskit， a quantum circuit solution based on Quantum Approximate Optimization Algorithm （QAOA） was proposed， and Constrained Optimization BY Linear Approximation （COBYLA） algorithm based on the simplex method was used to optimize the parameters in the quantum logic gates. Firstly， the classical Ising model was established through the mathematical model of the exact cover problem. Secondly， the classical Ising model was quantized by using the rotation variable in quantum theory， and then the Pauli rotation operator was used to replace the rotation variable to obtain the quantum Ising model and the problem Hamiltonian， which improved the speed of QAOA in finding the optimal solution. Finally， the expected expression of the problem Hamiltonian was obtained by the accumulation of the product of the unitary transformation with the mixed Hamiltonian as the generator and the unitary transformation with the problem Hamiltonian as the generator， and the generative quantum circuit was designed accordingly. In addition， the classical processor was used to optimize the parameters in the two unitary transformations to adjust the expected value of the problem Hamiltonian， thereby increasing the probability of solution. The circuit was simulated on Qiskit， IBM’s open source quantum computing framework. Experimental results show that the proposed scheme can obtain the solution of the problem in polynomial time with a probability of 95.6%， which proves that the proposed quantum circuit can find a solution to the exact cover problem with a higher probability.
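The first step, turning exact cover into a classical energy function, can be sketched without any quantum library. The cost below is the commonly used penalty form, which is zero exactly when every element is covered once; a substitution such as x_i = (1 - s_i)/2 with spins s_i in {-1, +1} then yields an Ising model. Whether the authors use this exact form is an assumption, and the function names are hypothetical.

```python
from itertools import product

def exact_cover_cost(subsets, universe, x):
    """Sum over elements of (1 - number of chosen subsets containing it)^2."""
    return sum(
        (1 - sum(xi for xi, s in zip(x, subsets) if e in s)) ** 2
        for e in universe
    )

def brute_force_min(subsets, universe):
    """Classical reference: try every 0/1 selection and keep the cheapest."""
    return min(
        product([0, 1], repeat=len(subsets)),
        key=lambda x: exact_cover_cost(subsets, universe, x),
    )
```

QAOA searches for a low-energy bit string of the same cost function; the brute-force minimum is useful only as a check on small instances.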

Image denoising-based cell-level RSRP estimation method for urban areas

Yi ZHENG, Cunyi LIAO, Tianqian ZHANG, Ji WANG, Shouyin LIU

2024, 44(3): 855-862. DOI: 10.11772/j.issn.1001-9081.2023030292

The planning， deployment and optimization of mobile communication system networks all depend to varying degrees on the accuracy of the Reference Signal Receiving Power （RSRP） estimation. Traditionally， the RSRP of a signal receiver in a cell covered by a base station can be estimated by the corresponding wireless propagation model. In an urban environment， the wireless propagation models for different cells need to be calibrated using a large number of RSRP measurements. Due to the environment differences of different cells， the calibrated model is only applicable to the corresponding cell， and has low accuracy of RSRP estimation within the cell. To address these issues， the RSRP estimation problem was transformed into an image denoising problem and a cell-level wireless propagation model was obtained through image processing and deep learning techniques， which not only enabled RSRP estimation for the cell as a whole， but also was suitable to cells in similar environments. Firstly， the RSRP estimation map of the whole cell was obtained by predicting the RSRP of each receiver point by point through a random forest regressor. Then， the loss between the RSRP estimation map and the measured RSRP distribution map was regarded as the RSRP noise map， and an image denoising RSRP estimation method based on Conditional Generative Adversarial Network （CGAN） was proposed to reflect the environmental information of the cell through an electronic environmental map， which effectively reduced the RSRP of different cells. Experimental results show that the root mean square error of the proposed method is 6.77 dBm in predicting RSRP in a new cross-cell RSRP scenario without measured data， which is 2.55 dBm lower than that of the convolutional neural network-based RSRP estimation method EFsNet； in the same-cell RSRP prediction scenario， the number of model parameters is reduced by 80.3% compared with EFsNet.

Channel access and resource allocation algorithm for adaptive p-persistent mobile ad hoc network

Xintong QIN, Zhengyu SONG, Tianwei HOU, Feiyue WANG, Xin SUN, Wei LI

2024, 44(3): 863-868. DOI: 10.11772/j.issn.1001-9081.2023030322

For the channel access and resource allocation problem in the p-persistent Mobile Ad hoc NETwork （MANET）， an adaptive channel access and resource allocation algorithm with low complexity was proposed. Firstly， considering the characteristics of MANET， the optimization problem was formulated to maximize the channel utility of each node. Secondly， the formulated problem was then transformed into a Markov decision process and the state， action， as well as the reward function were defined. Finally， the network parameters were trained based on policy gradient to optimize the competition probability， priority growth factor， and the number of communication nodes. Simulation experiment results indicate that the proposed algorithm can significantly improve the performance of p-persistent CSMA （Carrier Sense Multiple Access） protocol. Compared with the scheme with fixed competition probability and predefined p-value， the proposed algorithm can improve the channel utility by 45% and 17%， respectively. The proposed algorithm can also achieve higher channel utility compared to the scheme with fixed number of communication nodes when the total number of nodes is less than 35. Most importantly， with the increase of packet arrival rate， the proposed algorithm can fully utilize the channel resource to reduce the idle period of time slot.
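Why the competition probability p matters can be seen from the textbook slotted model of p-persistent access, sketched below. It assumes n saturated nodes that each transmit in a slot independently with probability p; this is a simplification for illustration and not the channel utility defined in the paper. The function names are hypothetical.

```python
def success_probability(n, p):
    """Probability that exactly one of n nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

def idle_probability(n, p):
    """Probability that no node transmits in a slot."""
    return (1 - p) ** n

def collision_probability(n, p):
    """Probability that two or more nodes transmit in the same slot."""
    return 1 - success_probability(n, p) - idle_probability(n, p)
```

Under this model the success probability peaks at p = 1/n, so a fixed p wastes slots as the number of contending nodes changes, which is the motivation for adapting p.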

Efficient clustered routing protocol for intelligent road cone ad-hoc networks based on non-random clustering

Long CHEN, Xuanlin YU, Wen CHEN, Yi YAO, Wenjing ZHU, Ying JIA, Denghong LI, Zhi REN

2024, 44(3): 869-875. DOI: 10.11772/j.issn.1001-9081.2023040483

Existing multi-hop clustered routing protocols for Intelligent Road Cone Ad-hoc Network （IRCAN） suffer from redundancy in network control overhead and the average number of hops for data packet transmission is not guaranteed to be minimal. To solve the above problems， combined with the link characteristics of the network topology， an efficient clustered routing protocol based on non-random retroverted clustering， called Retroverted-Clustering-based Hierarchy Routing （RCHR）， was proposed. Firstly， the retroverted clustering mechanism based on central extension and the cluster head selection algorithm based on overhearing， cross-layer sharing， and extending the adjacency matrix were proposed. Then， the proposed mechanism and the proposed algorithm were used to generate clusters with retroverted characteristics around sink nodes in sequence， and to select the optimal cluster heads for sink nodes in different directions without additional conditions. Thus， networking control overhead and time were decreased， and the formed network topology was favorable for reducing the average number of hops for data packet transmission. Theoretical analysis validated the effectiveness of the proposed protocol. The simulation experiment results show that compared with Ring-Based Multi-hop Clustering （RBMC） routing protocol and MODified Low Energy Adaptive Clustering Hierarchy （MOD-LEACH） protocol， the networking control overhead and the average number of hops for data packet transmission of the proposed protocol are reduced by 32.7% and 2.6% at least， respectively.

UAV detection and recognition based on improved convolutional neural network and radio frequency fingerprint

Jingxian ZHOU, Xina LI

2024, 44(3): 876-882. DOI: 10.11772/j.issn.1001-9081.2023030299

In order to solve the problems that the UAV （Unmanned Aerial Vehicle） is vulnerable to environmental interference in image recognition， and the traditional signal recognition is difficult to accurately extract features and has poor real-time performance， a UAV detection and recognition method based on improved CNN （Convolutional Neural Network） and RF （Radio Frequency） fingerprint was proposed. Firstly， a USRP （Universal Software Radio Peripheral） was used for capturing radio signals in an environment， and a deviation value was obtained through multi-resolution analysis to detect whether the radio signal was an unmanned aerial vehicle radio frequency signal or not. Secondly， the detected unmanned aerial vehicle radio frequency signal was subjected to wavelet transformation and PCA （Principal Component Analysis） to obtain a radio frequency signal spectrum which was used as an input of a neural network. Finally， an LRCNN （Lightweight Residual Convolutional Neural Network） was constructed， and the RF spectrum was input to train the network for UAV classification and recognition. Experimental results show that LRCNN can effectively detect and recognize UAV signals， and the average recognition accuracy reaches 84%. When the SNR （Signal-to-Noise Ratio） is greater than 20 dB， the recognition accuracy of LRCNN reaches 88%， which is 31 and 7 percentage points higher than those of SVM （Support Vector Machine） and the original OracleCNN， respectively. Compared with these two methods， LRCNN has improved recognition accuracy and robustness.

UMCS tree based hybrid similarity measure of UML class diagram

Zhongchen YUAN, Zongmin MA

2024, 44(3): 883-889. DOI: 10.11772/j.issn.1001-9081.2022111702

Software reuse is to retrieve previously developed software artifacts from a repository according to given conditions. The retrieval is based on similarity measure. UML （Unified Modeling Language） class diagram is widely applied to software design， and its reuse as a core of software design reuse has attracted much attention. Therefore， the research on the similarity of UML class diagrams was carried out. UML class diagram contains semantic and structural contents. At present， the similarity research of UML class diagrams mainly focuses on semantics， and there are also some discussions on structural similarity， but the combination of semantics and structure has not been considered. Therefore， a hybrid similarity measure combining semantics and structure was proposed. Due to the non-formal nature of UML class diagram， the UML class diagram was transformed into a graph model for similarity measure， the Maximum Common Subgraph List （MCSL） was searched， a Maximum Common Subgraph （MCS） tree was created based on MCSL， and a hybrid similarity measure method was proposed based on MCS sequence. The semantic matching and structural matching were defined corresponding to concept and structure common subgraphs， respectively. The similarity comparison and similarity based classification quality comparison experiments were carried out， and the experimental results validate the advantages of the proposed method.

Research status and prospect of CT image ring artifact removal methods

Yaoyao TANG, Yechen ZHU, Yangchuan LIU, Xin GAO

2024, 44(3): 890-900. DOI: 10.11772/j.issn.1001-9081.2023030305

Ring artifact is one of the most common artifacts in various types of CT （Computed Tomography） images， which is usually caused by the inconsistent response of detector pixels to X-rays. Effective removal of ring artifacts， which is a necessary step in CT image reconstruction， will greatly improve the quality of CT images and enhance the accuracy of later diagnosis and analysis. Therefore， the methods of ring artifact removal （also known as ring artifact correction） were systematically reviewed. Firstly， the manifestations and causes of ring artifacts were introduced， and commonly used datasets and algorithm libraries were given. Secondly， ring artifact removal methods were introduced in three categories. The first category was based on detector calibration. The second category was based on analytical and iterative solution， including projection data preprocessing， CT image reconstruction and CT image post-processing. The last category was based on deep learning methods such as convolutional neural network and generative adversarial network. The principle， development process， advantages and limitations of each method were analyzed. Finally， the technical bottlenecks of existing ring artifact removal methods in terms of robustness， dataset diversity and model construction were summarized， and possible solutions were discussed.

Cross-view matching model based on attention mechanism and multi-granularity feature fusion

Meiyu CAI, Runzhe ZHU, Fei WU, Kaiyu ZHANG, Jiale LI

2024, 44(3): 901-908. DOI: 10.11772/j.issn.1001-9081.2023040412

Cross-view scene matching refers to the discovery of images of the same geographical target from different platforms （such as drones and satellites）. However， different image platforms lead to low accuracy of UAV （Unmanned Aerial Vehicle） positioning and navigation tasks， and the existing methods usually focus only on a single dimension of the image and ignore the multi-dimensional features of the image. To solve the above problems， GAMF （Global Attention and Multi-granularity feature Fusion） deep neural network was proposed to improve feature representation and feature distinguishability. Firstly， the images from the UAV perspective and the satellite perspective were combined， and three branches were extended under a unified network architecture to extract the spatial location， channel and local features of the images from three dimensions. Then， by establishing the SGAM （Spatial Global relationship Attention Module） and CGAM （Channel Global Attention Module）， the spatial global relationship mechanism and channel attention mechanism were introduced to capture global information， so as to better carry out attention learning. Next， in order to fuse local perception features， a local division strategy was introduced to better improve the model’s ability to extract fine-grained features. Finally， the features of the three dimensions were combined as the final features to train the model. The test results on the public dataset University-1652 show that the AP （Average Precision） of the GAMF model on UAV visual positioning tasks reaches 87.41%， and the Recall （R@1） in UAV visual navigation tasks reaches 90.30%， which verifies that the GAMF model can effectively aggregate the multi-dimensional features of the image and improve the accuracy of UAV positioning and navigation tasks.

3D vehicle detection with adaptive horizon line constraints

Wei WANG, Chunhui ZHAO, Xinyao TANG, Liugang XI

2024, 44(3): 909-915. DOI: 10.11772/j.issn.1001-9081.2023040416

The commonly used monocular vision-based vehicle 3D detection method at present combines object detection with geometric constraint. However， the position of the vanishing point in the geometric constraint has a significant impact on the results. To obtain more accurate constraint conditions， a 3D vehicle detection algorithm based on horizon line detection was proposed. First， the relative position of the vanishing point was obtained using the vehicle image， and the vehicle image was preprocessed to an appropriate size. Then， the preprocessed vehicle image was fed into a vanishing point detection network to obtain a set of heatmaps indicating the vanishing point information. The vanishing point information was regressed， and the horizon information was calculated. Finally， geometric constraint was constructed based on the horizon line information， and the initial dimensions of the vehicle were iteratively optimized within the constrained space to calculate the precise 3D information of the vehicle. The experimental results demonstrate that the proposed horizon line solving algorithm obtains more accurate horizon lines. Compared to the random forest method， there is an AUC （Area Under Curve） improvement of 1.730 percentage points. Simultaneously， the introduced horizon line constraint effectively restricts the 3D vehicle information， resulting in an average precision improvement of 2.201 percentage points compared to the algorithm using diagonal and vanishing point constraint. It can be observed that the horizon line serves as a geometric constraint for solving vehicle 3D information in the context of roadside monocular camera perspectives.
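The step from vanishing points to a horizon line can be illustrated with homogeneous coordinates. This sketch assumes that two vanishing points of ground-plane directions are available in pixel coordinates; the abstract does not say how the horizon is computed from the regressed vanishing point information, so the construction and the function name are assumptions.

```python
import numpy as np

def horizon_from_vanishing_points(vp1, vp2):
    """Return (a, b, c) with a*x + b*y + c = 0 through both points.

    The line through two image points is the cross product of their
    homogeneous coordinates. The result is scaled so that (a, b) has unit
    length; the two points must be distinct, otherwise the norm is zero.
    """
    p1 = np.array([vp1[0], vp1[1], 1.0])
    p2 = np.array([vp2[0], vp2[1], 1.0])
    line = np.cross(p1, p2)
    return line / np.linalg.norm(line[:2])
```

For vanishing points at (0, 100) and (200, 100) the result is the horizontal line y = 100, and with the unit-length scaling, `a*x + b*y + c` gives the signed pixel distance of any point from the horizon.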

Iterative denoising network based on total variation regular term expansion

Ruifeng HOU, Pengcheng ZHANG, Liyuan ZHANG, Zhiguo GUI, Yi LIU, Haowen ZHANG, Shubin WANG

2024, 44(3): 916-921. DOI: 10.11772/j.issn.1001-9081.2023030376

To address the poor interpretability and training instability of neural networks, a Chambolle-Pock (CP) algorithm optimized denoising network based on Total Variation (TV) regularization, CPTV-Net, was proposed to solve the denoising problem of Low-Dose Computed Tomography (LDCT) images. Firstly, the TV constraint term was introduced into the L1 regularization term model to preserve the structural information of the image. Secondly, the CP algorithm was used to solve the denoising model and obtain specific iterative steps, ensuring the convergence of the algorithm. Finally, a shallow CNN (Convolutional Neural Network) was used to learn the iterative formula of the primal-dual variables of the linear operation. The neural network was used to calculate the solution of the model, and the network parameters were collected to optimize the combined data. Experimental results on simulated and real LDCT datasets show that, compared with five advanced denoising methods such as REDCNN (Residual Encoder-Decoder Convolutional Neural Network) and TED-Net (Transformer Encoder-decoder Dilation Network), CPTV-Net achieves the best Peak Signal-to-Noise Ratio (PSNR), Structural SIMilarity (SSIM), and Visual Information Fidelity (VIF) values, and generates LDCT images with a significant denoising effect and the most details preserved.
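
To make the role of the CP iterations concrete, the following is a minimal sketch of the Chambolle-Pock primal-dual scheme applied to a simpler, well-known model: 1D TV denoising with a quadratic data term, min_x TV(x) + (lam/2)·||x − f||². This is an assumption chosen for illustration, not the model or network of the paper; the function names, the step sizes `tau` and `sigma`, and the weight `lam` are all hypothetical. In an unrolled network, fixed steps like these would be replaced by learned operators.

```python
def grad(u):
    """Forward differences; the last entry is 0 (Neumann boundary)."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)] + [0.0]


def div(p):
    """Negative adjoint of grad, so that <grad(u), p> == -<u, div(p)>."""
    n = len(p)
    out = [0.0] * n
    out[0] = p[0]
    for i in range(1, n - 1):
        out[i] = p[i] - p[i - 1]
    out[n - 1] = -p[n - 2]
    return out


def total_variation(u):
    """Sum of absolute forward differences."""
    return sum(abs(g) for g in grad(u))


def cp_tv_denoise(f, lam=2.0, n_iter=300, tau=0.45, sigma=0.45):
    """Chambolle-Pock iterations for min_x TV(x) + (lam / 2) * ||x - f||^2.

    Illustrative 1D model only; tau * sigma * 4 < 1 keeps the steps stable
    because the squared norm of the 1D difference operator is at most 4.
    """
    x = list(f)
    x_bar = list(f)
    p = [0.0] * len(f)
    for _ in range(n_iter):
        # Dual step: gradient ascent, then projection onto [-1, 1].
        g = grad(x_bar)
        p = [max(-1.0, min(1.0, pi + sigma * gi)) for pi, gi in zip(p, g)]
        # Primal step: proximal map of the quadratic data term.
        d = div(p)
        x_new = [(xi + tau * di + tau * lam * fi) / (1.0 + tau * lam)
                 for xi, di, fi in zip(x, d, f)]
        # Over-relaxation of the primal variable.
        x_bar = [2.0 * a - b for a, b in zip(x_new, x)]
        x = x_new
    return x
```

Calling `cp_tv_denoise(noisy_signal)` on a noisy piecewise-constant list returns a smoothed list of the same length; a smaller `lam` gives stronger smoothing.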

Asymmetric unsupervised end-to-end image deraining network

Rui JIANG, Wei LIU, Cheng CHEN, Tao LU

2024, 44(3): 922-930. DOI: 10.11772/j.issn.1001-9081.2023030367

Most existing learning-based single-image deraining networks focus on the effect of rain streaks on visual imaging in rainy images, while ignoring the effect of fog caused by increased air humidity in rainy environments, which leads to problems such as low generation quality and blurred texture details in the derained images. To address these problems, an asymmetric unsupervised end-to-end image deraining network model was proposed. It mainly consists of a rain and fog removal network, a rain and fog feature extraction network and a rain and fog generation network, which form two different data domain mapping conversion modules: Rain-Clean-Rain and Clean-Rain-Clean. The three sub-networks constitute two parallel transformation paths: the rain removal path and the rain-fog feature extraction path. In the rain-fog feature extraction path, a rain-fog-aware extraction network based on global and local attention mechanisms was proposed to learn rain-fog related features by using the global self-similarity and local discrepancy existing in rain-fog features. In the rain removal path, a rainy image degradation model and the extracted rain-fog related features were introduced as prior knowledge to enhance the ability of rain-fog image generation, so as to constrain the rain-fog removal network and improve its mapping conversion capability from the rain data domain to the rain-free data domain. Extensive experiments on different rain image datasets show that, compared with the state-of-the-art deraining method CycleDerain, the Peak Signal-to-Noise Ratio (PSNR) is improved by 31.55% on the synthetic rain-fog dataset HeavyRain. The proposed model can adapt to different rainy scenarios, generalizes better, and better recovers the details and texture information of images.

Vehicle target detection by fusing event data and image frames

Yuliang ZHENG, Yunhua CHEN, Weijie BAI, Pinghua CHEN

2024, 44(3): 931-937. DOI: 10.11772/j.issn.1001-9081.2023040420

Combining event cameras with traditional cameras for vehicle target detection can not only solve the problems of overexposure, underexposure, and motion blur that traditional cameras suffer in high dynamic range scenes, but also solve the problem of low detection accuracy caused by the missing texture information of event cameras. Existing fusion algorithms often have problems such as high computational complexity, loss of feature information, and poor fusion results. To solve these problems, a vehicle target detection algorithm that effectively fuses event cameras and conventional cameras was proposed. Firstly, a spatio-temporal event representation based on Event Frequency (EF) and Time Surface (TS) was proposed, which encoded event data into event frames. Then, a Feature fusion module based on Channel and Spatial Attention mechanism (FCSA) was proposed to perform feature-level fusion of image frames and event frames. Finally, the prior box was optimized by using the differential evolution search algorithm to further improve the vehicle detection performance. In addition, due to the lack of public datasets containing both image frames and event data, a vehicle detection dataset MVSEC-CAR was established. Experimental results show that, on the public PKU-DDD17-CAR dataset, the mean Average Precision (mAP) of the proposed algorithm is 2.6 percentage points higher than that of the second-best method, ADF (Attention fusion Detection Framework), and it achieves a higher frame rate, effectively improving the accuracy of vehicle target detection and the robustness to lighting, which validates the effectiveness of the proposed event representation, feature fusion, and prior box optimization algorithms.
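
As a rough illustration of how event data can be encoded into frames, the sketch below builds an event-frequency frame (per-pixel event counts, normalized by the largest count) and an exponentially decaying time surface. The event tuple layout `(x, y, t, polarity)`, the normalization, and the decay constant `tau` are assumptions for illustration; the abstract does not specify the exact EF and TS definitions used by the authors.

```python
import math


def event_frequency(events, width, height):
    """Per-pixel event counts scaled to [0, 1] by the largest count."""
    counts = [[0] * width for _ in range(height)]
    for x, y, _t, _polarity in events:
        counts[y][x] += 1
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0.0] * width for _ in range(height)]
    return [[c / peak for c in row] for row in counts]


def time_surface(events, width, height, t_ref, tau):
    """exp(-(t_ref - t_last) / tau) per pixel; 0 where no event occurred."""
    last = [[None] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        # Keep only the most recent event at or before the reference time.
        if t <= t_ref and (last[y][x] is None or t > last[y][x]):
            last[y][x] = t
    return [[0.0 if t is None else math.exp(-(t_ref - t) / tau) for t in row]
            for row in last]
```

The two frames could then be stacked as channels of an event frame before being fused with the image frame; how the channels are combined is left open here.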

Faster-RCNN water-floating garbage recognition based on multi-scale feature and polarized self-attention

Zhanjun JIANG, Baijing WU, Long MA, Jing LIAN

2024, 44(3): 938-944. DOI: 10.11772/j.issn.1001-9081.2023030368

To address the unsatisfactory detection results caused by the variable morphology, low resolution and limited information of small-target water-floating garbage, an improved Faster-RCNN (Faster Regions with Convolutional Neural Network) water-floating garbage detection algorithm was proposed, namely MP-Faster-RCNN (Faster-RCNN with Multi-scale feature and Polarized self-attention). Firstly, a small-target water-floating garbage dataset was established for the Lanzhou section of the Yellow River, and a combination of atrous convolution and ResNet-50 was used as the backbone feature extraction network instead of the original VGG-16 (Visual Geometry Group 16) to expand the receptive field and extract more small-target features. Secondly, two convolution layers of 3×3 and 1×1 were set in the Region Proposal Network (RPN) by using multi-scale features to compensate for the feature loss caused by a single sliding window. Finally, polarized self-attention was added before the RPN to further utilize multi-scale and channel features, extracting finer-grained multi-scale spatial information and inter-channel dependencies to generate a feature map with global features and achieve more accurate target box localization. Experimental results show that, compared with the original Faster-RCNN, MP-Faster-RCNN effectively improves the detection accuracy of water-floating garbage, with a mean Average Precision (mAP) improvement of 6.37 percentage points; the model size is reduced from 521 MB to 108 MB, and convergence is faster for the same number of training epochs.

Real-time pulmonary nodule detection algorithm combining attention and multipath fusion

Kui ZHAO, Huiqi QIU, Xu LI, Zhifei XU

2024, 44(3): 945-952. DOI: 10.11772/j.issn.1001-9081.2023040424

Existing single-stage target detection algorithms are insensitive to nodules in lung nodule detection, multiple up-samplings during feature extraction by Convolutional Neural Network (CNN) make feature extraction difficult and detection poor, and existing pulmonary nodule detection models are complex and not conducive to practical deployment. To address these problems, a real-time pulmonary nodule detection algorithm combining an attention mechanism and multipath fusion was proposed, in which the up-sampling algorithm was also improved, effectively increasing the detection accuracy of lung nodules and the speed of model inference while keeping the model small and easy to deploy. Firstly, a hybrid channel and spatial attention mechanism was fused into the backbone network for feature extraction. Secondly, the sampling algorithm was improved to enhance the quality of the generated feature maps. Finally, channels were established between different paths in the enhanced feature extraction network to fuse deep and shallow features, so that semantic and location information at different scales was fused. Experimental results on the LUNA16 dataset show that, compared with the original YOLOv5s algorithm, the proposed algorithm improves precision, recall, and average precision by 9.5, 6.9, and 8.7 percentage points, respectively, with a frame rate of 131.6 frames/s and a model weight file of only 14.2 MB, demonstrating that the proposed algorithm can detect lung nodules in real time with much higher accuracy than existing single-stage detection algorithms such as YOLOv3 and YOLOv8.

Reliability evaluation of multi-component system based on time-varying Copula function

Lei WANG, Shijuan CHENG, Yu HAN

2024, 44(3): 953-959. DOI: 10.11772/j.issn.1001-9081.2023040459

For mechanical systems with correlated multi-component failures, a reliability evaluation method of multi-component systems based on a time-varying Copula function was proposed. Firstly, the nonlinear Wiener process was introduced to characterize the performance degradation process, and the Copula function was used to characterize the correlation between multiple component failures. Secondly, the evolution equation of the Copula function was approximated by a Fourier series, and the fitting effects of the Fourier series on common time-varying forms were verified by Monte Carlo (MC) simulation. In addition, the likelihood ratio statistic was used to test for the existence of time-varying correlation, indicating the necessity of studying time-varying correlation. The example analysis shows that, compared with the static correlation model, the time-varying correlation model increases the log-likelihood function value by 4.36% and decreases the Akaike Information Criterion (AIC) by 3.81%, achieving more accurate reliability evaluation results.

Fixed-time consensus of dynamic event-triggered multi-agent systems

Zhaojun TANG, Meiyan XIA, Hua ZHANG, Ting XIE

2024, 44(3): 960-965. DOI: 10.11772/j.issn.1001-9081.2023030320

The fixed-time consensus problem based on event triggering was studied for multi-agent systems with unknown disturbances and nonlinear dynamics. Building on the traditional static event-triggered strategy, a fixed-time consensus protocol based on a dynamic event-triggered strategy was proposed by introducing an adjustable dynamic variable. A dynamic event-triggered function based on state information and dynamic variables was given for each agent, and an event was triggered only when the measurement error of the agent satisfied the given triggering function. The introduced dynamic variables were adjustable threshold parameters that could further reduce the number of event triggers and use the limited resources of the system more efficiently. By using graph theory, fixed-time consensus theory and Lyapunov stability theory, the conditions that the parameters in the consensus protocol and trigger functions need to satisfy for the system to reach fixed-time consensus were obtained, and Zeno behavior was shown not to exist. Finally, numerical simulations verified the correctness and validity of the theoretical analysis.

Full coverage path planning of bridge inspection wall-climbing robot based on improved grey wolf optimization

Haixin HUANG, Guangwei YU, Shoushan CHENG, Chunming LI

2024, 44(3): 966-971. DOI: 10.11772/j.issn.1001-9081.2023030334

Automatic inspection of concrete bridge health based on wall-climbing robots is an effective way to promote intelligent bridge management and maintenance, and reasonable path planning is particularly important for the robot to obtain comprehensive detection data. To address the practical engineering problems of the weight limit on the wall-climbing robot's power supply and the difficulty of replenishing energy during inspection, the inspection scenarios of bridge components such as main beams and high piers were fully considered, the energy consumption index was taken as the objective function of performance evaluation optimization, the corresponding constraint conditions were established, and a full coverage path planning evaluation model was proposed. An Improved Grey Wolf Optimization (IGWO) algorithm was proposed to solve the problem that the traditional Grey Wolf Optimization (GWO) algorithm is prone to falling into local optima. In IGWO, K-Means clustering was used to address the difficulty of keeping the initial grey wolf population relatively uniformly distributed in the search space. A nonlinear convergence factor was used to improve the local exploitation ability and global search performance of the algorithm. Drawing on the idea of individual superiority in particle swarm optimization, the position updating formula was improved to enhance the model solving ability of the algorithm. Algorithm simulation and comparison experiment results show that IGWO has better stability than GWO, Differential Evolution (DE) and Genetic Algorithm (GA): IGWO reduces energy consumption by 10.2% - 16.7%, decreases iterations by 19.3% - 36.9% and solving time by 12.8% - 32.3%, reduces path repetition rate by 0.23 - 1.91 percentage points, and reduces path length by 1.6% - 11.0%.
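
The sketch below shows where a nonlinear convergence factor enters the standard GWO position update. The schedule `a = 2·(1 − (t/T)²)` is a hypothetical choice for illustration; the abstract does not give the authors' formula, and the K-Means initialization and the particle-swarm-style update of IGWO are deliberately left out. All function and parameter names are assumptions.

```python
import random


def convergence_factor(t, n_iter):
    """Hypothetical nonlinear schedule: falls from 2 to 0, slowly at first."""
    return 2.0 * (1.0 - (t / n_iter) ** 2)


def gwo_minimize(objective, dim, bounds, n_wolves=20, n_iter=100, seed=0):
    """Plain GWO update driven by the nonlinear factor above."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_val, history = None, float("inf"), []
    for t in range(n_iter):
        # The three best wolves (alpha, beta, delta) lead the pack.
        leaders = [list(w) for w in sorted(wolves, key=objective)[:3]]
        if objective(leaders[0]) < best_val:
            best_pos, best_val = leaders[0], objective(leaders[0])
        history.append(best_val)
        a = convergence_factor(t, n_iter)
        for w in wolves:
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    big_a = 2.0 * a * rng.random() - a  # |A| > 1 favors exploration
                    big_c = 2.0 * rng.random()
                    estimate += leader[d] - big_a * abs(big_c * leader[d] - w[d])
                # Average the three leader-guided estimates and clip to bounds.
                w[d] = min(hi, max(lo, estimate / 3.0))
    return best_pos, best_val, history
```

For example, `gwo_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-10.0, 10.0))` returns the best position found, its objective value, and the best-so-far value at each iteration. Compared with the usual linear decay, this schedule keeps `a` above 1 for roughly the first 70% of the iterations, so the search stays exploratory for longer.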

Prediction of landslide displacement based on improved grey wolf optimizer and support vector regression

Shuai REN, Yuanfa JI, Xiyan SUN, Zhaochuan WEI, Zian LIN

2024, 44(3): 972-982. DOI: 10.11772/j.issn.1001-9081.2023030331

To address the difficulty of predicting landslide displacement and of selecting influencing factors, a model combining Double Moving Average (DMA), Variational Mode Decomposition (VMD), the Improved Grey Wolf Optimizer (IGWO) algorithm and Support Vector Regression (SVR) was proposed for landslide displacement prediction. Firstly, DMA was used to extract the trend and periodic terms of landslide displacement, and polynomial fitting was used to predict the trend term. Secondly, the influencing factors of the landslide periodic term were classified, and VMD was used to decompose the original factor sequence to obtain the optimal sequence. Then, an SVR model tuned by a grey wolf optimizer improved with Circle-based multi-tactics, called CTGWO-SVR (Circle Tactics Grey Wolf Optimizer with SVR), was proposed to predict the landslide periodic term. Finally, the cumulative displacement prediction sequence was obtained using a time series additive model, and the model was evaluated using the posterior variance test and small error probability from grey prediction. Experimental results show that, compared with the GA (Genetic Algorithm)-SVR and GWO-SVR models, CTGWO-SVR has higher prediction accuracy with a fitting degree of 0.979, and the Root Mean Square Error (RMSE) is reduced by 51.47% and 59.25%, respectively. The model's evaluated accuracy grade is level one, which can meet the real-time and accuracy requirements of landslide prediction.
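
The trend/periodic split under an additive model can be sketched as follows. The sketch assumes the classical double moving average estimate `trend = 2·M1 − M2`, where `M1` is a trailing moving average of the series and `M2` is a moving average of `M1`; whether the authors use this exact form is not stated in the abstract, and the function names and `window` parameter are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average; output has len(series) - window + 1 points."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def dma_decompose(displacement, window):
    """Split a displacement series into trend and periodic terms.

    Returns (trend, periodic), both aligned with displacement[2*(window-1):],
    so that trend[i] + periodic[i] equals the aligned displacement value.
    """
    m1 = moving_average(displacement, window)
    m2 = moving_average(m1, window)
    offset = window - 1
    # Classical lag-corrected level estimate of the double moving average.
    trend = [2.0 * a - b for a, b in zip(m1[offset:], m2)]
    aligned = displacement[2 * offset:]
    periodic = [d - t for d, t in zip(aligned, trend)]
    return trend, periodic
```

For a purely linear series the lag correction makes the trend reproduce the series and the periodic term vanish, which is a convenient sanity check before feeding the periodic term to a regressor.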

Lightweight deep learning algorithm for weld seam surface quality detection of traction seat

Zijie HUANG, Yang OU, Degang JIANG, Cailing GUO, Bailin LI

2024, 44(3): 983-988. DOI: 10.11772/j.issn.1001-9081.2023030349

To address the low accuracy and speed of manual and traditional automated inspection of the weld seam surface of traction seats, a lightweight weld seam quality detection algorithm, YOLOv5s-G2CW, was proposed. Firstly, the GhostBottleneckV2 module was applied as a replacement for the C3 module in YOLOv5s to reduce the number of model parameters. Then, the CBAM (Convolutional Block Attention Module) was introduced into the Neck of the YOLOv5s model to integrate weld features in two dimensions: channel and space. In addition, the localization loss function of the YOLOv5s model was replaced with Wise-IoU, which focuses on the prediction regression of ordinary-quality anchor boxes. Finally, the 13×13 feature layer used for detecting large objects in the YOLOv5s model was removed to further reduce the number of model parameters. Experimental results show that, compared with the YOLOv5s model, the size of the YOLOv5s-G2CW model is reduced by 53.9%, the frame rate (frames per second) increases by 8.0%, and the mAP (mean Average Precision) increases by 0.8 percentage points. It can be seen that the model meets the requirements for real-time and accurate detection of the weld seam surface of traction seats.

Self-adaptive spherical evolution for prediction of drug target interaction

Yidi LIU, Zihao WEN, Fuxiang REN, Shiyin LI, Deyu TANG

2024, 44(3): 989-994. DOI: 10.11772/j.issn.1001-9081.2023070929

Compared with traditional drug discovery, drug-target prediction methods can effectively reduce costs and accelerate the research process. However, practical applications face challenges such as imbalanced datasets and low prediction precision. Therefore, a drug-target interaction prediction method based on self-adaptive spherical evolution was proposed, namely ASE-KELM (self-Adaptive Spherical Evolution based on Kernel Extreme Learning Machine). In this method, negative samples with high confidence were selected based on the principle that drugs with similar structures are likely to interact with targets. To solve the problem that the spherical evolution algorithm tends to fall into local optima, a feedback mechanism based on the historical memory of search factors and Linear Population Size Reduction (LPSR) were used to balance global and local search, which improved the optimization ability of the algorithm. Then, the parameters of the Kernel Extreme Learning Machine (KELM) were optimized by the self-adaptive spherical evolution algorithm. ASE-KELM was compared with algorithms such as NetLapRLS (Network Laplacian Regularized Least Square) and BLM-NII (Bipartite Local Model with Neighbor-based Interaction profile Inferring) on gold-standard datasets to verify its performance. Experimental results show that ASE-KELM outperforms the comparison algorithms in AUC (Area Under the receiver operating Characteristic curve) and AUPR (Area Under the Precision-Recall curve) on the Enzyme (E), G-Protein-Coupled Receptor (GPCR), Ion Channel (IC), and Nuclear Receptor (NR) datasets. The effectiveness of ASE-KELM in predicting new drug-target pairs was also validated on databases such as DrugBank.
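
The LPSR component can be illustrated with the linear schedule popularized by L-SHADE, in which the population size shrinks from an initial size to a minimum size in proportion to the evaluation budget consumed. The sketch assumes the authors use this common form; the function names and parameters are hypothetical.

```python
def lpsr_population_size(evals_used, max_evals, n_init, n_min):
    """Linear schedule from n_init (no budget used) down to n_min (budget spent)."""
    size = round(n_init + (n_min - n_init) * evals_used / max_evals)
    return max(n_min, min(n_init, size))


def shrink_population(population, fitness, target_size):
    """Keep the target_size best individuals (lower fitness is better)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    keep = order[:target_size]
    return [population[i] for i in keep], [fitness[i] for i in keep]
```

A large early population supports global search, while the smaller late population concentrates the remaining evaluations on local refinement, which matches the balance between global and local search described above.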
