Journal of Computer Applications

Multi-view clustering network with deep fusion

Ziyi HE, Yan YANG, Yiling ZHANG

2023, 43(9): 2651-2656. DOI: 10.11772/j.issn.1001-9081.2022091394

Asbtract ( )

HTML ( )

PDF (1074KB) ( )

Figures and Tables | References | Related Articles | Metrics

Current deep multi-view clustering methods have the following shortcomings： 1） When feature extraction is carried out for a single view， only attribute information or structural information of the samples is considered， and these two types of information are not integrated. Thus， the extracted features cannot fully represent latent structure of the original data. 2） Feature extraction and clustering were divided into two separated processes， without establishing the relationship between them， so that the feature extraction process cannot be optimized by the clustering process. To solve these problems， a Deep Fusion based Multi-view Clustering Network （DFMCN） was proposed. Firstly， the embedding space of each view was obtained by combining autoencoder and graph convolution autoencoder to fuse attribute information and structure information of samples. Then， the embedding space of the fusion view was obtained through weighted fusion， and clustering was carried out in this space. And in the process of clustering， the feature extraction process was optimized by a two-layer self-supervision mechanism. Experimental results on FM （Fashion-MNIST）， HW （HandWritten numerals）， and YTF （YouTube Face） datasets show that the accuracy of DFMCN is higher than those of all comparison methods； and DFMCN has the accuracy increased by 1.80 percentage points compared with the suboptimal CMSC-DCCA （Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis） method on FM dataset， the Normalized Mutual Information （NMI） of DFMCN is increased by 1.26 to 14.84 percentage points compared to all methods except for CMSC-DCCA and DMSC （Deep Multimodal Subspace Clustering networks）. Experimental results verify the effectiveness of the proposed method.

Adaptive learning-based multi-view unsupervised feature selection method

Tian HE, Zongxin SHEN, Qianqian HUANG, Yanyong HUANG

2023, 43(9): 2657-2664. DOI: 10.11772/j.issn.1001-9081.2022091404

Asbtract ( )

HTML ( )

PDF (1956KB) ( )

Figures and Tables | References | Related Articles | Metrics

Most of the existing multi-view unsupervised feature selection methods have the following problem： the similarity matrix of samples， the weight matrix of different views， and the feature weight matrix are usually predefined， and cannot effectively describe the real intrinsic structure of data and reflect the importance of different views and features， which results in the failure of selection of useful features. In order to address the above issue， firstly， adaptive learning of view weight and feature weight was performed on the basis of multi-view fuzzy C-means clustering， thereby achieving feature selection and guaranteeing the clustering performance simultaneously. Then， under the constraint of Laplacian rank， the similarity matrix of samples was learned adaptively， and an Adaptive Learning-based Multi-view Unsupervised Feature Selection （ALMUFS） method was constructed. Finally， an alternate iterative optimization algorithm was designed to solve the objective function， and the proposed method was compared with six unsupervised feature selection baseline methods on eight real datasets. Experimental results show that ALMUFS is superior to other methods in terms of clustering accuracy and F-measure. In specific， ALMUFS method improves the clustering accuracy and F-measure by 8.99 and 11.87 percentage points compared to Adaptive Collaborative Similarity Learning （ACSL） averagely and respectively and by 11.09 and 13.21 percentage points compared to Adaptive Similarity and View Weight （ASVM） averagely and respectively， which demonstrates the feasibility and effectiveness of the proposed method.

Clustering ensemble algorithm with high-order consistency learning

Jianwen GAN, Yan CHEN, Peng ZHOU, Liang DU

2023, 43(9): 2665-2672. DOI: 10.11772/j.issn.1001-9081.2022091406

Asbtract ( )

HTML ( )

PDF (2069KB) ( )

Figures and Tables | References | Related Articles | Metrics

Most of the research on clustering ensemble focuses on designing practical consistency learning algorithms. To solve the problems that the quality of base clusters varies and the low-quality base clusters have an impact on the performance of the clustering ensemble， from the perspective of data mining， the intrinsic connections of data were mined based on the base clusters， and a high-order information fusion algorithm was proposed to represent the connections between data from different dimensions， namely Clustering Ensemble with High-order Consensus learning （HCLCE）. Firstly， each high-order information was fused into a new structured consistency matrix. Then， the obtained multiple consistency matrices were fused together. Finally， multiple information was fused into a consistent result. Experimental results show that LCLCE algorithm has the clustering accuracy improved by an average of 7.22%， and the Normalized Mutual Information （NMI） improved by an average of 9.19% compared with the suboptimal Locally Weighted Evidence Accumulation （LWEA） algorithm. It can be seen that the proposed algorithm can obtain better clustering results compared with clustering ensemble algorithms and using one information alone.

Haoyang CUI, Hui ZHANG, Lei ZHOU, Chunming YANG, Bo LI, Xujian ZHAO

2023, 43(9): 2673-2678. DOI: 10.11772/j.issn.1001-9081.2022091376

Asbtract ( )

HTML ( )

PDF (1618KB) ( )

Figures and Tables | References | Related Articles | Metrics

For the problems that the performance of the nearest neighbor classification algorithm is greatly affected by the adopted similarity or distance measuring method， and it is difficult to select the optimal similarity or distance measuring method， with multi-similarity method adopted， a K-Nearest Neighbor algorithm with Ordered Pairs of Normalized real numbers （OPNs-KNN） was proposed. Firstly， the new mathematical theory of Ordered Pair of Normalized real numbers （OPN） was introduced in machine learning. And all the samples in the training and test sets were converted into OPNs by multiple similarity or distance measuring methods， so that different similarity information was included in each OPN. Then， the improved nearest neighbor algorithm was used to classify the OPNs， so that different similarity or distance measuring methods were able to be mixed and complemented to improve the classification performance. Experimental results show that compared with 6 improved nearest neighbor classification algorithms， such as distance-Weighted K-Nearest-Neighbor rule （WKNN） rule on Iris， seeds， and other datasets， OPNs-KNN has the classification accuracy improved by 0.29 to 15.28 percentage points， which proves that the performance of classification can be improved greatly by the proposed algorithm.

Improved grey wolf optimizer algorithm based on dual convergence factor strategy

Yun OU, Kaiqing ZHOU, Pengfei YIN, Xuewei LIU

2023, 43(9): 2679-2685. DOI: 10.11772/j.issn.1001-9081.2022091389

Asbtract ( )

HTML ( )

PDF (988KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aiming at the drawbacks of standard Grey Wolf Optimizer （GWO） algorithm， such as slow convergence and being easy to fall into local optimum， an improved Grey Wolf Optimizer with Two Headed Wolves guide （GWO-THW） algorithm was proposed by utilizing a dual nonlinear convergence factor strategy. Firstly， the chaotic Cubic mapping was used to initialize the population for improving the uniformity and diversity of the population distribution. And the wolves were divided into hunter wolves and scout wolves through the average fitness values. The different convergence factors were used to two types of wolves to seek after and round up their prey under the leadership of their respective leader wolf. Secondly， an adaptive weight factor of position updating was designed to improve the search speed and accuracy. Meanwhile， a Levy flight strategy was employed to randomly update the positions of wolves for jumping out of local optimum， when no prey was found in a certain period of time. Ten benchmark functions were selected to test the performance and effectiveness of GWO-THW. Experimental results show that compared with standard GWO and related variants， GWO-THW achieves higher optimization accuracy and faster convergence on eight benchmark functions， especially on the multi-peak functions， the algorithm can converge to the ideal optimal value within 200 iterations， indicating that GWO-THW has better optimization performance.

Deep neural network compression algorithm based on hybrid mechanism

Xujian ZHAO, Hanglin LI

2023, 43(9): 2686-2691. DOI: 10.11772/j.issn.1001-9081.2022091392

Asbtract ( )

HTML ( )

PDF (2917KB) ( )

Figures and Tables | References | Related Articles | Metrics

With the rapid development of Artificial Intelligence （AI） in recent years， the demand for Deep Neural Network （DNN） from devices with limited resources such as embedded devices and mobile devices has increased sharply. The problem of how to compress neural networks without affecting the effect of DNNs has great theoretical and practical significance， and is a hot research topic in deep learning now. Firstly， aiming at the problem that DNN is difficult to be ported to resource-limited devices such as mobile devices due to their large models and large computational cost， the experimental performance of existing DNN compression algorithms in terms of memory usage， running speed， and compression effect was deeply analyzed， so that the influence factors of the DNN compression algorithm were explored. Then， the knowledge transfer structure composed of student network and teacher network was designed， the knowledge distillation， structural design， network pruning， and parameter quantization mechanisms were fused together， and a DNN optimization and compression model based on hybrid mechanism was proposed. Experimental comparison and analysis were conducted on mini-ImageNet dataset using AlexNet as the Benchmark. Experimental results show that the capacity of compressed AlexNet is reduced by 98.5% with 6.3% loss of accuracy， which verify the effectiveness of the proposed algorithm.

Lightweight image tamper localization algorithm based on large kernel attention convolution

Hong WANG, Qing QIAN, Huan WANG, Yong LONG

2023, 43(9): 2692-2699. DOI: 10.11772/j.issn.1001-9081.2022091405

Asbtract ( )

HTML ( )

PDF (2288KB) ( )

Figures and Tables | References | Related Articles | Metrics

Convolutional Neural Networks （CNN） are used for image forensics because of their high recognizable property， easy understanding， and strong learnability. However， their inherent disadvantages of the receptive field increasing slowly and neglecting long-range dependencies， and high computational cost cause the unsatisfactory accuracy and lightweight deployment of deep learning algorithms. To solve the above problems， a lightweight network-based image copy-paste tamper detection algorithm namely LKA-EfficientNet （Large Kernel Attention EfficientNet） was proposed. The characteristics of long-range dependencies and global receptive field were contained in LKA-EfficientNet， and the number of EfficientNetV2 parameters was optimized. As a result， the localization speed and detection accuracy of image tamper were improved. Firstly， the image was inputted into and processed in the backbone network based on Large Kernel Attention （LKA） to obtain the candidate feature maps. Then， the feature maps of different scales were used to construct the feature pyramid for feature matching. Finally， the candidate feature maps after feature matching were fused to locate the tampered area of the image. In addition， the triple cross entropy loss function was used by LKA-EfficientNet to further improve the accuracy of the algorithm in image tamper localization. Experimental results show that LKA-EfficientNet can not only reduce the floating-point operations by 29.54% but also increase the F1 by 4.88% compared to the same type algorithm — Dense-InceptionNet. The above verifies that LKA-EfficientNet can reduce computational cost and maintain high detection performance at the same time.

Forest-based entity-relation joint extraction model

Xuanli WANG, Xiaolong JIN, Zhongni HOU, Huaming LIAO, Jin ZHANG

2023, 43(9): 2700-2706. DOI: 10.11772/j.issn.1001-9081.2022091419

Asbtract ( )

HTML ( )

PDF (1117KB) ( )

Figures and Tables | References | Related Articles | Metrics

Nested entities pose a challenge to the task of entity-relation joint extraction. The existing joint extraction models have the problems of generating a large number of negative examples and high complexity when dealing with nested entities. In addition， the interference of nested entities on triplet prediction is not considered by these models. To solve these problems， a forest-based entity-relation joint extraction method was proposed， named EF2LTF （Entity Forest to Layering Triple Forest）. In EF2LTF， a two-stage joint training framework was adopted. Firstly， through the generation of an entity forest， different entities within specific nested entities were identified flexibly. Then， the identified nested entities and their hierarchical structures were combined to generate a hierarchical triplet forest. Experimental results on four benchmark datasets show that EF2LTF outperforms methods such as joint entity and relation extraction with Set Prediction Network （SPN） model， joint extraction model for entities and relations based on Span — SpERT （Span-based Entity and Relation Transformer） and Dynamic Graph Information Extraction ++ （DyGIE++）^{on F1 score. It is verified that the proposed method not only enhances the recognition ability of nested entities， but also enhances the ability to distinguish nested entities when constructing triples， thereby improving the joint extraction performance of entities and relations.}

Scholar fine-grained information extraction method fused with local semantic features

Yuelin TIAN, Ruizhang HUANG, Lina REN

2023, 43(9): 2707-2714. DOI: 10.11772/j.issn.1001-9081.2022091407

Asbtract ( )

HTML ( )

PDF (1296KB) ( )

Figures and Tables | References | Related Articles | Metrics

It is importantly used in the fields such as creation of large-scale professional talent pools to extract scholar fine-grained information such as scholar’s research directions， education experience from scholar homepages. To address the problem that the existing scholar fine-grained information extraction methods cannot use contextual semantic associations effectively， a scholar fine-grained information extraction method incorporating local semantic features was proposed to extract fine-grained information from scholar homepages by using semantic associations in the local text. Firstly， general semantic representation was learned by the full-word mask Chinese pre-trained model RoBERTa-wwm-ext. Subsequently， the representation vector of the target sentence， as well as its locally adjacent text representation vector from the general semantic embeddings， were jointly fed into a CNN （Convolutional Neural Network） to accomplish local semantic fusion， thereby obtaining a higher-dimensional representation vector for the target sentence. Finally， the representation vector of the target sentence was mapped from the high-dimensional space to the low-dimensional labeling space to extract the fine-grained information from the scholar homepage. Experimental results show that the micro-average F1 score of the scholar fine-grained information extraction method fusing local semantic features reaches 93.43%， which is higher than that of RoBERTa-wwm-ext-TextCNN method without fusing local semantic by 8.60 percentage points， which verifies the effectiveness of the proposed method on the scholar fine-grained information extraction task.

Chinese homophonic neologism discovery method based on Pinyin similarity

Hanchen LI, Shunxiang ZHANG, Guangli ZHU, Tengke WANG

2023, 43(9): 2715-2720. DOI: 10.11772/j.issn.1001-9081.2022091390

Asbtract ( )

HTML ( )

PDF (927KB) ( )

Figures and Tables | References | Related Articles | Metrics

As one of the basic tasks of natural language processing， new word identification provides theoretical support for the establishment of Chinese dictionary and analysis of word sentiment tendency. However， the current new word identification methods do not consider the homophonic neologism identification， resulting in low precision of homophonic neologism identification. To solve this problem， a Chinese homophonic neologism discovery method based on Pinyin similarity was proposed， and the precision of homophonic neologism identification was improved by introducing the phonetic comparison of new and old words in this method. Firstly， the text was preprocessed， the Average Mutual Information （AMI） was calculated to determine the degree of internal cohesion of candidate words， and the improved branch entropy was used to determine the boundaries of candidate new words. Then， the retained words were transformed into Chinese Pinyin with similar pronunciations and compared to the Chinese Pinyin of the old words in the Chinese dictionary， and the most similar results of comparisons would be retained. Finally， if a comparison result exceeded the threshold， the new word in the result was taken as the homophonic neologism， and its corresponding word was taken as the original word. Experimental results on self built Weibo datasets show that compared with BNshCNs （Blended Numeric and symbolic homophony Chinese Neologisms） and DSSCNN （similarity computing model based on Dependency Syntax and Semantics）， the proposed method has the precision， recall and F1 score improved by 0.51 and 5.27 percentage points， 2.91 and 6.31 percentage points， 1.75 and 5.81 percentage points respectively， indicating that the proposed method has better Chinese homophonic neologism identification effect.

Automatic international classification of diseases coding model based on meta-network

Xiaomin ZHOU, Fei TENG, Yi ZHANG

2023, 43(9): 2721-2726. DOI: 10.11772/j.issn.1001-9081.2022091388

Asbtract ( )

HTML ( )

PDF (1032KB) ( )

Figures and Tables | References | Related Articles | Metrics

The frequency distribution of International Classification of Diseases （ICD） codes is long tail， resulting in it is challenging to perform multi-label text classification for few-shot code. An MNIC （Meta Network-based automatic ICD Coding model） was proposed to solve the problem of insufficient training data in few-shot code classification. Firstly， instances in the feature space and features in the semantic space were fitted to the same space for mapping， and the feature representations of many-shot codes were mapped to their classifier weights， thus learning meta-knowledge through meta-network. Secondly， the learned meta-knowledge was transferred from data-abundant many-shot codes to data-poor few-shot codes. Finally， a reasonable explanation was provided for the transferability and generality of meta-knowledge. Experimental results on MIMIC-Ⅲ dataset show that MNIC improves the Micro-F1 and Micro Area Under Curve （Micro-AUC） of few-shot codes by 3.77 and 3.82 percentage points respectively compared to the suboptimal AGM-HT （Adversarial Generative Model conditioned on code descriptions with Hierarchical Tree structure） model， indicating that the proposed model improves the performance of few-shot code classification significantly.

Feature pyramid network algorithm based on context information and multi-scale fusion importance awareness

Hao YANG, Yi ZHANG

2023, 43(9): 2727-2734. DOI: 10.11772/j.issn.1001-9081.2022081249

Asbtract ( )

HTML ( )

PDF (2864KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aiming at the problem that the classification and localization sub-tasks in object detection require large receptive field and high resolution respectively， and it is difficult to achieve a balance between these two contradictory requirements， a feature pyramid network algorithm based on attention mechanism for object detection was proposed. In the algorithm， multiple different receptive fields were integrated to obtain richer semantic information， multi-scale feature maps were fused in the way of paying more attention to the importance of different feature maps， and the fused feature maps were further refined under the guidance of the attention mechanism. Firstly， multi-scale receptive fields were obtained through multiple atrous convolutions with different dilation rates， which enhanced the semantic information with the preservation of the resolution. Secondly， through the Multi-Level Fusion （MLF）， multiple feature maps of different scales were fused after changing to the same resolution through upsampling or pooling operations. Finally， the proposed Attention-guided Feature Refinement Module （AFRM） was used to refine the fused feature maps to enhance semantic information and eliminate the aliasing effect caused by fusion. After replacing the Feature Pyramid Network （FPN） in Faster R-CNN with the proposed feature pyramid， experiments were performed on MS COCO 2017 dataset. The results show that when the backbone network is ResNet （Residual Network） with a depth of 50 and 101， with the use of the proposed algorithm， the Average Precision （AP） of the model reaches 39.2% and 41.0% respectively， which is 1.4 and 1.0 percentage points higher than that of Faster R-CNN using the original FPN， respectively. It can be seen that the proposed feature pyramid network algorithm can replace the original feature pyramid to be better applied in the object detection scenarios.

Few-shot text classification method based on prompt learning

Bihui YU, Xingye CAI, Jingxuan WEI

2023, 43(9): 2735-2740. DOI: 10.11772/j.issn.1001-9081.2022081295

Asbtract ( )

HTML ( )

PDF (884KB) ( )

Figures and Tables | References | Related Articles | Metrics

Text classification tasks usually rely on sufficient labeled data. Concerning the over-fitting problem of classification models on samples with small size in low resource scenarios， a few-shot text classification method based on prompt learning called BERT-P-Tuning was proposed. Firstly， the pre-trained model BERT （Bidirectional Encoder Representations from Transformers） was used to learn the optimal prompt template from labeled samples. Then， the prompt template and vacancy were filled in each sample， and the text classification task was transformed into the cloze test task. Finally， the final labels were obtained by predicting the word with the highest probability of the vacant positions and combining the mapping relationship between it and labels. Experimental results on the short text classification tasks of public dataset FewCLUE show that the proposed method have significantly improved the evaluation indicators compared to the BERT fine-tuning based method. In specific， the proposed method has the accuracy and F1 score increased by 25.2 and 26.7 percentage points respectively on the binary classification task， and the proposed method has the accuracy and F1 score increased by 6.6 and 8.0 percentage points respectively on the multi-class classification task. Compared with the PET （Pattern Exploiting Training） method of constructing templates manually， the proposed method has the accuracy increased by 2.9 and 2.8 percentage points respectively on two tasks， and the F1 score increased by 4.4 and 4.2 percentage points respectively on two tasks. The above verifies the effectiveness of applying pre-trained model on few-shot tasks.

Collaborative recommendation algorithm based on deep graph neural network

Runchao PAN, Qishan YU, Hongfei XIONG, Zhihui LIU

2023, 43(9): 2741-2746. DOI: 10.11772/j.issn.1001-9081.2022091361

Asbtract ( )

HTML ( )

PDF (1539KB) ( )

Figures and Tables | References | Related Articles | Metrics

For the problem of over-smoothing in the existing recommendation algorithms based on Graph Neural Network （GNN）， a collaborative filtering recommendation algorithm based on deep GCN was proposed， namely Deep NGCF （Deep Neural Graph Collaborative Filtering）. In the algorithm， the initial residual connection and identity mapping were introduced into GNN， which avoided GNN from falling into over-smoothing after multiple graph convolution operations. Firstly， the initial embeddings of users and items were obtained through their interaction history. Next， in aggregation and propagation layer， collaborative signals of users and items in different stages were obtained with the use of initial residual connection and identity mapping. Finally， score prediction was performed according to the linear representation of all collaborative signals. In addition， to further improve the flexibility and recommendation performance of the model， the weights were set in the initial residual connection and identity mapping for adjustment. In order to verify the feasibility and effectiveness of Deep NGCF algorithm， experiments were conducted on datasets Gowalla， Yelp-2018 and Amazon-book. The results show that compared with the existing GNN recommendation algorithm such as Graph Convolutional Matrix Completion （GCMC） and Neural Graph Collaborate Filtering （NGCF）， Deep NGCF algorithm achieves the best results on recall and Normalized Discounted Cumulative Gain （NDCG）， thereby verifying the effectiveness of the proposed algorithm.

Knowledge tracing model based on graph neural network blending with forgetting factors and memory gate

Haodong ZHENG, Hua MA, Yingchao XIE, Wensheng TANG

2023, 43(9): 2747-2752. DOI: 10.11772/j.issn.1001-9081.2022081184

Asbtract ( )

HTML ( )

PDF (1266KB) ( )

Figures and Tables | References | Related Articles | Metrics

The knowledge tracing task diagnoses a student’s cognitive state in real time based on historical learning data， and predicts the future performance of the student in answering questions. In order to accurately model the forgetting behaviors and the time-series characteristics of the answering sequence in knowledge tracing， a Graph neural network-based Knowledge Tracing blending with Forgetting factors and Memory gate （GKT-FM） model was proposed. Firstly， through the answering record， the correlations of knowledge points were calculated and a knowledge graph was constructed. Then， Graph Neural Network （GNN） was used to model the cognitive state of the student， and seven characteristics that affect forgetting behaviors were considered comprehensively. After that， the memory gate structure was used to model the time-series characteristics in the student’s answering sequence， and the update process of GNN-based knowledge tracing was reconstructed. Finally， the prediction results were obtained by integrating the forgetting factors and the time-series characteristics. Experimental results on public datasets ASSISTments2009 and KDDCup2010 show that compared with GKT （Graph-based Knowledge Tracing） model， GKT-FM model improves the average AUC （Area Under Curve） by 6.9% and 9.5% respectively， and the average ACC （ACCuarcy） by 5.3% and 6.7% respectively， indicating that GKT-FM model can better model students’ forgetting behaviors and trace their cognitive states.

Aspect-based sentiment analysis method with integrating prompt knowledge

Xinyue ZHANG, Rong LIU, Chiyu WEI, Ke FANG

2023, 43(9): 2753-2759. DOI: 10.11772/j.issn.1001-9081.2022091347

Asbtract ( )

HTML ( )

PDF (1699KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aspect-based sentiment analysis based on pre-trained models generally uses end-to-end frameworks， has the problems of inconsistency between the upstream and downstream tasks， and is difficult to model the relationships between aspect words and context effectively. To address these problems， an aspect-based sentiment analysis method integrating prompt knowledge was proposed. First， in order to capture the semantic relation between aspect words and context effectively and enhance the model’s perception ability for sentiment analysis tasks， based on the Prompt mechanism， a prompt text was constructed and spliced with the original sentence and aspect words， and the obtained results were used as the input of the pre-trained model Bidirectional Encoder Representations from Transformers （BERT）. Then， a sentimental label vocabulary was built and integrated into the sentimental verbalizer layer， so as to reduce search space of the model， make the pre-trained model obtain rich semantic knowledge in the label vocabulary， and improve the learning ability of the model. Experimental results on Restaurant and Laptop field datasets of SemEval2014 Task4 dataset as well as ChnSentiCorp dataset show that the F1-score of the proposed method reaches 77.42%， 75.20% and 94.89% respectively， which is increased by 0.65 to 10.71， 1.02 to 9.58 and 0.83 to 6.40 percentage points compared with the mainstream aspect-based sentiment analysis methods such as Glove-TextCNN and P-tuning. The above verifies the effectiveness of the proposed method.

Spatial-temporal traffic flow prediction model based on gated convolution

Li XU, Xiangyuan FU, Haoran LI

2023, 43(9): 2760-2765. DOI: 10.11772/j.issn.1001-9081.2022081146

Asbtract ( )

HTML ( )

PDF (2271KB) ( )

Figures and Tables | References | Related Articles | Metrics

Concerning the problems that the existing traffic flow prediction models cannot accurately capture the spatio-temporal features of traffic data， and most models show good prediction performance in single-step prediction， and the prediction performance of models in multi-step prediction is not ideal， a Spatio-Temporal Traffic Flow Prediction Model based on Gated Convolution （GC-STTFPM） was proposed. Firstly， the Graph Convolution Network （GCN） combining with Gated Recurrent Unit （GRU） was used to capture the spatio-temporal features of traffic flow data. Then， a method of splicing and filtering the original data and spatio-temporal feature data by using gated convolution unit was proposed to verify the validity of spatio-temporal feature data. Finally， GRU was used as the decoder to make accurate and reliable prediction of future traffic flow. Experimental results on traffic dataset of Los Angeles Highway show that compared with Attention based Spatial-Temporal Graph Neural Network （ASTGNN） and Diffusion Convolutional Recurrent Neural Network （DCRNN） under single step prediction （5 min）， GC-STGCN model has the Mean Absolute Error （MAE） reduced by 5.9% and 9.9% respectively， and the Root Mean Square Error （RMSE） reduced by 1.7% and 5.8% respectively. At the same time， it is found that the prediction accuracy of this model is better than those of most existing benchmark models under three multi-step scales of 15， 30 and 60 min， demonstrating strong adaptability and robustness.

Double fault tolerant array code with low compilation complexity

Zheng XIE, Zihao WANG, Dan TANG, Hang ZHANG, Hongliang CAI

2023, 43(9): 2766-2774. DOI: 10.11772/j.issn.1001-9081.2022091344

Asbtract ( )

HTML ( )

PDF (2691KB) ( )

Figures and Tables | References | Related Articles | Metrics

Erasure coding is the underlying implementation technology of double fault tolerance for Redundant Array of Independent Disks-6 （RAID-6）， and the performance of erasure code is one of the important factors affecting the performance of RAID-6. Aiming at the problems of I/O imbalance and slow data recovery of array erasure codes commonly used in RAID-6， an Exclusive OR （XOR） based hybrid array code was proposed， namely J-code. A new parity check generation rule was adopted by J-code. Firstly， two-dimensional array constructed from the original data was used to calculate the diagonal parity bits and construct a new array. Then， the positional relationship between the data blocks in the new array was used to calculate the anti-diagonal parity bits. Besides， the original data and part of the parity bits were stored by J-code on the same disk， which reduced the number of XOR operations in the process of encoding and decoding as well as the number of data blocks read in the recovery process of a single disk， thereby reducing the complexity of encoding and decoding as well as the I/O cost for repairing a single disk， and alleviating the phenomenon of disk hotspot concentration. Simulation results show that compared with array codes such as RDP （Row-Diagonal Parity） and EaR （Endurance-aware RAID-6）， J-code has the encoding time reduced by 0.30% to 28.70%， the single disk failure repair time reduced by 2.23% to 31.62%， and the double disk failure repair time reduced by 0.39% to 36.00%.

Scientific collaboration potential prediction based on dynamic heterogeneous information fusion

Guoshuai MA, Yuhua QIAN, Yayu ZHANG, Junxia LI, Guoqing LIU

2023, 43(9): 2775-2783. DOI: 10.11772/j.issn.1001-9081.2022081266

Asbtract ( )

HTML ( )

PDF (1968KB) ( )

Figures and Tables | References | Related Articles | Metrics

In the existing scientific collaboration potential prediction methods， feature engineering is used to extract the shallow and static attributes of authors in scientific collaboration networks manually. At the same time， the relationships among heterogeneous entities in the scientific collaboration networks are ignored. To address this shortcoming， a dynamic Collaboration Potential Prediction （CPP） model was proposed to incorporate the potential attribute information of multiple entities in scientific collaboration networks. In this model， the structural features of scholar-scholar collaboration relationships were considered while extracting attributes of heterogeneous entities， and the model was optimized by the collaborative optimization method to realize the prediction of scientific collaboration potential while recommending scientific collaborators for scholars. To verify the effectiveness of the proposed model， the information of more than 500 000 papers published in the China Computer Federation （CCF）-recommended journals and the complete attribute information of related entities were collected and collated. And the temporal collaborative heterogeneous networks of different periods were constructed by the sliding window method to extract the dynamic attribute information of each entity during the evolution of the scientific collaborative network. In addition， to improve the generalization and practicality of the proposed model， the data from different periods were input to train the model randomly. Experimental results show that compared with the suboptimal model — Graph Sample and aggregate network （GraphSAGE）， CPP model improves the classification accuracy on collaborator recommendation task by 1.47 percentage points； for the cooperation potential prediction task， the test error of CPP is 1.23% lower than that of GraphSAGE. In conclusion， CPP model can recommend high-quality collaborators for scholars more accurately.

Impossible differential cryptanalysis of reduced-round ultra-lightweight block cipher PFP

Guangyao ZHAO, Xuan SHEN, Bo YU, Chenhui YI, Zhen LI

2023, 43(9): 2784-2788. DOI: 10.11772/j.issn.1001-9081.2022091395

Asbtract ( )

HTML ( )

PDF (1455KB) ( )

Figures and Tables | References | Related Articles | Metrics

The ultra-lightweight block cipher PFP based on Feistel structure is suitable for extremely resource-constrained environments such as internet of things terminal devices. Up to now， the best impossible differential cryptanalysis of PFP is to use 7-round impossible differential distinguishers to attack the 9-round PFP， which can recover 36-bit master key. The structure of PFP was studied in order to evaluate the ability for resisting impossible differential cryptanalysis more accurately. Firstly， by analyzing the differential distribution characteristics of S-box in the round function， two groups of differences with probability 1 were found. Secondly， combined with the characteristics of the permutation layer， a set of 7-round impossible differential distinguishers containing 16 impossible differences was constructed. Finally， based on the constructed 7-round impossible differential distinguishers， 40-bit master key was recovered by performing impossible differential cryptanalysis on the 9-round PFP， and an impossible differential cryptanalysis method for 10-round PFP was proposed to recover 52-bit master key. The results show that the proposed method has great improvement in terms of the number of distinguishers， the number of cryptanalysis rounds， and the number of bits of the recovered key.

Blockchain-based decentralized attribute-based encryption scheme for revocable attributes

Haiying MA, Jinzhou LI, Jikun YANG

2023, 43(9): 2789-2797. DOI: 10.11772/j.issn.1001-9081.2023020138

Asbtract ( )

HTML ( )

PDF (952KB) ( )

Figures and Tables | References | Related Articles | Metrics

For the problems of existing Attribute-Based Encryption （ABE） schemes， such as low efficiency of attribute revocation and difficulty in coordinating the distribution and revocation of user attribute keys， a Blockchain-based Decentralized Attribute-Based Encryption for Revocable attributes （BRDABE） scheme was proposed. Firstly， the consensus-driven blockchain architecture was used to map the trust issue of key distribution from the attribute authority to the distributed ledger， and smart contracts were used to record the status of user attributes and data sharing and assist the attribute authority to realize the user attribute revocation. When revoking a user’s attribute， the smart contracts were used by the attribute authority to automatically screen out the involved data owners and non-revoked authorized users and computed the ciphertext update key and key update key related to the revoked attribute， and the off-chain ciphertext and key update was realized. Then， the version key and the user’s global identity were embedded in the attribute private key， so that the identities in the session key ciphertext and the user’s attribute private key were able to cancel each other out when the user decrypted. Based on reasonable assumptions， BRDABE scheme was proved to resist the collusion attack of users and satisfy the forward and backward security of user attribute revocation. Experimental results show that with the increase of the number of user attributes， the time of user key generation， encryption and decryption and attribute revocation increase linearly. In the case of the same number of attributes， compared with DABE （Decentralizing Attribute-Based Encryption） scheme BRDABE scheme has the decryption time reduced by 94.06% to 94.75%， and compared with EDAC-MCSS （Effective Data Access Control for Multiauthority Cloud Storage Systems） scheme， BRDABE scheme has the attribute revocation time reduced by 92.19% to 92.27%. Therefore， BRDABE scheme not only improves the efficiency of attribute revocation， but also guarantees the forward and backward security of shared data.

Identity-based ring signature scheme on number theory research unit lattice

Jinbo LI, Ping ZHANG, Ji ZHANG, Muhua LIU

2023, 43(9): 2798-2805. DOI: 10.11772/j.issn.1001-9081.2022081268

Asbtract ( )

HTML ( )

PDF (962KB) ( )

Figures and Tables | References | Related Articles | Metrics

Concerning the problems that the size of the trapdoor base is too large and the public key of ring members needs digital certificate authentication in the lattice-based ring signature schemes， an NTRU （Number Theory Research Unit） lattice-based Identity-Based Ring Signature scheme （NTRU-IBRS） was proposed. Firstly， the trapdoor generation algorithm on NTRU lattice was used to generate the system master public-private key pairs. Secondly， the master private key was taken as the trapdoor information and the one-way function was reversely operated to obtain the private key of every ring member. Finally， based on the Small Integer Solution （SIS） problem， the ring signature was generated by using the rejection sampling technology. Security analysis shows that NTRU-IBRS is anonymous and existentially unforgeable under adaptive chosen message and chosen identity attacks. Performance analysis and experimental simulation show that compared with the ring signature scheme on ideal lattice and the identity-based linkable ring signature scheme on NTRU lattice： in storage overhead， NTRU-IBRS has the system private key length decreased by 0 to 99.6% and the signature private key length decreased by 50.0% to 98.4%； and in time overhead， NTRU-IBRS has the total time overhead reduced by 15.3% to 21.8%. Simulation results of applying NTRU-IBRS to the dynamic Internet of Vehicles （IoV） scenario show that NTRU-IBRS can ensure privacy security and improve communication efficiency during vehicle interaction at the same time.

Fair multi-party private set intersection protocol based on cloud server

Jing ZHANG, He TIAN, Kun XIONG, Yongli TANG, Li YANG

2023, 43(9): 2806-2811. DOI: 10.11772/j.issn.1001-9081.2022081229

Asbtract ( )

HTML ( )

PDF (1675KB) ( )

Figures and Tables | References | Related Articles | Metrics

Private Set Intersection （PSI） is an important solution for privacy information sharing. A fair multi-party PSI protocol based on cloud server was proposed for the unfairness caused by the existing protocols in which the parties involved do not have simultaneous access to the calculation results. Firstly， the storage of a sub-share of the private information in Garbled Bloom Filter （GBF） was accomplished by using hash mapping. Secondly， in order to avoid the leakage of the index value of each party’s set element during the interaction， combined with Oblivious Transfer （OT） technique， the share replacement of the stored information was realized. Finally， the bit-by-bit calculation was performed by the cloud server， and the results were returned to each party at the same time to ensure the fairness of each party’s access to the results. The correctness and security analysis of the protocol shows that the proposed protocol can achieve the fairness of the parties in obtaining the intersection results， and can resist the collusion of parties with the cloud server. The performance analysis shows that both of the computational complexity and the communication complexity of the proposed protocol are independent of the total number of elements contained in the set of participants. Under the same conditions， compared with Multi-party PSI protocol （MPSI）， practical multiparty maliciously-secure PSI protocol （PSImple） and Private Intersection Sum algorithm （PI-Sum）， the proposed protocol has less storage overhead， communication overhead and running time.

CHAIN： edge computing node placement algorithm based on overlapping domination

Xuyan ZHAO, Yunhe CUI, Chaohui JIANG, Qing QIAN, Guowei SHEN, Chun GUO, Xianchao LI

2023, 43(9): 2812-2818. DOI: 10.11772/j.issn.1001-9081.2022081250

Asbtract ( )

HTML ( )

PDF (1484KB) ( )

Figures and Tables | References | Related Articles | Metrics

In edge computing， computing resources are deployed at edge computing nodes closer to end users， and selecting the appropriate edge computing node deployment location from the candidate locations can enhance the node capacity and user Quality of Service （QoS） of edge computing services. However， there is less research on how to place edge computing nodes to reduce the cost of edge computing. In addition， there is no edge computing node deployment algorithm that can maximize the robustness of edge services while minimizing the deployment cost of edge computing nodes under the constraints of QoS factors such as the delay of edge services. To address the above issues， firstly， the edge computing node placement problem was transformed into a minimum dominating set problem with constraints by building a model about computing nodes， user transmission delay， and robustness. Then， the concept of overlapping domination was proposed， so that the network robustness was measured on the basis of overlapping domination， and an edge computing node placement algorithm based on overlapping domination was designed， namely CHAIN （edge server plaCement algoritHm based on overlAp domINation）. Simulation results show that CHAIN can reduce the system latency by 50.54% and 50.13% compared to the coverage oriented approximate algorithm and base station oriented random algorithm， respectively.

Electric vehicle charging station siting method based on spatial semantics and individual activities

Maozu GUO, Yazhe ZHANG, Lingling ZHAO

2023, 43(9): 2819-2827. DOI: 10.11772/j.issn.1001-9081.2022091421

Asbtract ( )

HTML ( )

PDF (6390KB) ( )

Figures and Tables | References | Related Articles | Metrics

To address the issue of siting for Electric EVCS （Vehicle Charging Station）， an urban charging station siting method based on spatial semantics and individual activities was proposed. First， according to the urban planning， unsupervised learning was used to cluster the Point Of Interests （POIs） out of the service radius to determine the number of new charging stations. Then， Constrained Two-Archive Evolutionary Algorithm （CTAEA） was used to solve the objective function to optimize the electric vehicle siting scheme under the constraints of maximizing the distance between stations and covering the most POIs with new charging stations. The trajectory data and POIs of taxis in the second-ring road of Chengdu were used as the experimental samples， and siting scheme with 15 charging stations was planned. Experimental results show that compared with NSGA2 （Non-dominated Sorting Genetic Algorithm 2） and SPEA2 （Strength Pareto Evolutionary Algorithm 2）， CTAEA improves 22.9 and 20.6 percentage points on POI coverage， and reduces 18.9% and 25.5% on driver’s average selected distance， which illustrates the convenience and rationality of the method in electric vehicle charging station siting.

Adaptive partitioning and scheduling method of convolutional neural network inference model on heterogeneous platforms

Shaofa SHANG, Lin JIANG, Yuancheng LI, Yun ZHU

2023, 43(9): 2828-2835. DOI: 10.11772/j.issn.1001-9081.2022081177

Asbtract ( )

HTML ( )

PDF (3025KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aiming at the problems of low hardware resource utilization and high latency of Convolutional Neural Network （CNN） when performing inference on heterogeneous platforms， a self-adaptive partitioning and scheduling method of CNN inference model was proposed. Firstly， the key operators of CNN were extracted by traversing the computational graph to complete the adaptive partition of the model， so as to enhance the flexibility of the scheduling strategy. Then， based on the performance measurement and the critical path-greedy search algorithm， according to the sub-model running characteristics on the CPU-GPU heterogeneous platform， the optimal running load was selected to improve the sub-model inference speed. Finally， the cross-device scheduling mechanism in TVM （Tensor Virtual Machine） was used to configure the dependencies and running loads of sub-models in order to achieve adaptive scheduling of model inference， and reduce the communication delay between devices. Experimental results show that on GPU and CPU， compared to the method optimized by TVM operator， the proposed method improves the inference speed by 5.88% to 19.05% and 45.45% to 311.46% with no loss of model inference accuracy.

Deep neural network model acceleration method based on tensor virtual machine

Yunfei SHEN, Fei SHEN, Fang LI, Jun ZHANG

2023, 43(9): 2836-2844. DOI: 10.11772/j.issn.1001-9081.2022081259

Asbtract ( )

HTML ( )

PDF (3331KB) ( )

Figures and Tables | References | Related Articles | Metrics

With the development of Artificial Intelligence （AI） technology， the Deep Neural Network （DNN） models have been applied to various mobile and edge devices widely. However， the model deployment becomes challenging and the popularization and application of the models are limited due to the facts that the computing power of edge devices is low， the memory capacity of edge devices is small， and the realization of model acceleration requires in-depth knowledge of edge device hardware. Therefore， a DNN acceleration and deployment method based on Tensor Virtual Machine （TVM） was presented to accelerate the Convolutional Neural Network （CNN） model on Field-Programmable Gate Array （FPGA）， and the feasibility of this method was verified in the application scenarios of distracted driving classification. Specifically， in the proposed method， the computational graph optimization method was utilized to reduce the memory access and computational overhead of the model， the model quantization method was used to reduce the model size， and the computational graph packing method was adopted to offload the convolution calculation to the FPGA in order to speed up the model inference. Compared with MPU （MicroProcessor Unit）， the proposed method can reduce the inference time of ResNet50 and ResNet18 on MPU+FPGA by 88.63% and 77.53% respectively. On AUC （American University in Cairo） dataset， compared to MPU， the top1 inference accuracies of the two models on MPU+FPGA are only reduced by 0.26 and 0.16 percentage points respectively. It can be seen that the proposed method can reduce the deployment difficulty of different models on FPGA.

Enhanced sparrow search algorithm based on multiple improvement strategies

Dahai LI, Meixin ZHAN, Zhendong WANG

2023, 43(9): 2845-2854. DOI: 10.11772/j.issn.1001-9081.2022081270

Asbtract ( )

HTML ( )

PDF (4003KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aiming at the drawbacks that Sparrow Search Algorithm （SSA） has relatively low search accuracy and is easy to fall into the local optimum， an Enhanced Sparrow Search Algorithm based on Multiple Improvement strategies （EMISSA） was proposed. Firstly， in order to balance the global search and local search abilities of the algorithm， fuzzy logic was introduced to adjust the scale of sparrow finders dynamically. Secondly， the mixed differential mutation operation was performed on sparrow followers to generate mutation subgroups， thereby enhancing the ability of EMISSA to jump out of the local optimum. Finally， Topological Opposition-Based Learning （TOBL） was used to obtain topological opposition solutions of sparrow finders， thereby fully mining high-quality position information in the search space. EMISSA， standard SSA and Chaotic Sparrow Search Optimization Algorithm （CSSOA） were evaluated by 12 test functions in 2013 Congress on Evolutionary Computation （CEC2013）. Experimental results show that EMISSA achieves 11 first places on 12 test functions in the 30-dimensional case； in the 80-dimensional case， the proposed algorithm has the optimal results on all the test functions. In the Friedman test， EMISSA ranks first on all the test functions. Experimental results of applying EMISSA to the Wireless Sensor Network （WSN） node deployment in obstacle environment show that compared with other algorithms， EMISSA achieves the highest wireless node coverage with more uniform node distribution and less coverage redundancy.

Multiple level binary imperialist competitive algorithm for solving heterogeneous multiple knapsack problem

Bin LI, Zhibin TANG

2023, 43(9): 2855-2867. DOI: 10.11772/j.issn.1001-9081.2022081189

Asbtract ( )

HTML ( )

PDF (2507KB) ( )

Figures and Tables | References | Related Articles | Metrics

On the basis of the classical multiple knapsack problem， the Heterogeneous Multiple Knapsack Problem （HMKP） was proposed， which was abstracted from the commonalities of typical logistics service scenarios. And， an Imperialist Competitive Algorithm （ICA） was designed and customized to solve HMKP. As the origin ICA is easy to fall into the local optimum and the optimal solution of the 0-1 knapsack problem is usually near the constraint boundary， Two-Point Automutation Strategy （TPAS） and Jump out of Local Optimum Algorithm （JLOA） were designed to improve ICA， and a Binary Imperialist Competitive Algorithm （BICA） for 0-1 knapsack problem was presented. BICA showed comprehensive and efficient optimization ability in solving 35 numerical examples of 0-1 knapsack problem. BICA based on Best-Matched Value （BMV） was able to find the ideal optimal solutions of 19 out of 20 examples with 100% success rate in the first test set， and the ideal optimal solutions of 12 out of 15 examples were found by the above algorithm with 100% success rate in the second test set， achieving the best performance of all the comparison algorithms. The numerical analysis results show that BICA maintains the multipolar development strategy in the optimization evolution and relies on the unique population evolution method to search the ideal solution in the solution space efficiently. Subsequently， aiming at the strong constraint and high complexity of HMKP， a Multiple Level Binary Imperialist Competitive Algorithm （MLB-ICA） for solving HMKP was put forward based on BICA. Finally， the numerical experiments and performance evaluation of MLB-ICA were carried out on a high dimensional HMKP test set constructed by combining multiple typical numerical examples of 0-1 knapsack problems. The results showed that the solving time of MLB-ICA is longer than that of Gurobi solver， but the solving accuracy of MLB-ICA is 28% higher than that of Gurobi solver. It can be seen that MLB-ICA can solve high-dimensional complicated HMKP efficiently with low computational cost within acceptable time， and provides a feasible algorithm design scheme for ICA to solve super-large scale combinatorial optimization problems.

Hybrid dragonfly algorithm based on subpopulation and differential evolution

Bo WANG, Hao WANG, Xiaoxin DU, Xiaodong ZHENG, Wei ZHOU

2023, 43(9): 2868-2876. DOI: 10.11772/j.issn.1001-9081.2022060813

Asbtract ( )

HTML ( )

PDF (2338KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aiming at the problems such as weak development ability， low population diversity， and premature convergence to local optimum in Dragonfly Algorithm （DA）， an HDASDE （Hybrid Dragonfly Algorithm based on Subpopulation and Differential Evolution） was proposed. Firstly， the basic dragonfly algorithm was improved： the chaotic factor and purposeful Levy flight were integrated to improve the optimization ability of the dragonfly algorithm， and a chaotic transition mechanism was proposed to enhance the exploration ability of the basic dragonfly algorithm. Secondly， opposition-based learning was introduced on the basis of DE （Differential Evolution） algorithm to strengthen the development ability of DE algorithm. Thirdly， a dynamic double subpopulation strategy was designed to divide the entire population into two dynamically changing subpopulations according to the ability that the subpopulation can improve the algorithm’s ability to jump out of the local optimum. Fourthly， the dynamic subgroup structure was used to fuse the improved dragonfly algorithm and the improved DE algorithm. The fused algorithm had good global exploration ability and strong local development ability. Finally， HDASDE was applied to 13 typical complex function optimization problems and three-bar truss design optimization problem， and was compared with the original DA， DE and other meta-heuristic optimization algorithms. Experimental results show that， HDASDE outperforms DA， DE and ABC （Artificial Bee Colony） algorithms in all 13 test functions， outperforms Particle Swarm Optimization （PSO） algorithm in 12 test functions， and outperforms Grey Wolf Optimizer （GWO） algorithm in 10 test functions. And it performs well in the design optimization problem of three-bar truss.

β-QoM target-barrier coverage construction algorithm for wireless visual sensor network

Xinming GUO, Rui LIU, Fei XIE, Deyu LIN

2023, 43(9): 2877-2884. DOI: 10.11772/j.issn.1001-9081.2023010084

Asbtract ( )

HTML ( )

PDF (4482KB) ( )

Figures and Tables | References | Related Articles | Metrics

Focusing on the failure of intrusion detection resulted from low captured image width of traditional Wireless Visual Sensor Network （WVSN） target-barrier， a Wireless visual sensor network β Quality of Monitoring （β-QoM） Target-Barrier coverage Construction （WβTBC） algorithm was proposed to ensure that the captured image width is not less than β. Firstly， the geometric model of the visual sensor β-QoM region was established， and it was proven that the width of intruder image captured by the target-barrier of intersection of all adjacent visual sensor β-QoM regions must be greater than or equal to β. Then， based on the linear programming modeling for optimal β-QoM target-barrier coverage of WVSN， it was proven that this coverage problem is NP-hard. Finally， in order to obtain suboptimal solution of the problem， a heuristic algorithm WβTBC was proposed. In this algorithm， the directed graph of WVSN was constructed according to the counterclockwise β neighbor relationship between sensors， and Dijkstra algorithm was used to search β-QoM target-barriers in WVSN. Experimental results show that WβTBC algorithm can construct β-QoM target-barriers effectively， and save about 23.3%， 10.8% and 14.8% sensor nodes compared with Spiral Periphery Outer Coverage （SPOC）， Spiral Periphery Inner Coverage （SPIC） and Target-Barrier Construction （TBC） algorithms， respectively. In addition， under the condition of meeting the requirements of intrusion detection， with the use of WβTBC algorithm， the smaller β is， the higher success rate of building β-QoM target-barrier will be， the fewer nodes will be needed in forming the barrier， and the longer working period of WVSN for β-QoM intrusion detection will be.

Impact of non-persistent carrier sense multiple access mechanism on scalability of LoRa networks

Yicheng WAN, Guangxiang YANG, Qingda ZHANG, Chenyang GAN, Lin YI

2023, 43(9): 2885-2896. DOI: 10.11772/j.issn.1001-9081.2022081237

Asbtract ( )

HTML ( )

PDF (3616KB) ( )

Figures and Tables | References | Related Articles | Metrics

LoRaWAN， as a wireless communication standard in Low Power Wide Area Network （LPWAN）， provides the support for the development of IoT （Internet of Things）. However， limited by the characteristics of incomplete orthogonality among Spreading Factor （SF） and the fact that LoRaWAN does not have a Listen-Before-Transmit （LBT） mechanism， the ALOHA-based transmission scheduling method will trigger serious channel conflicts， which reduces the scalability of LoRa （Long Range Radio） networks greatly. Therefore， in order to improve the scalability of LoRa network， Non-Persistent Carrier Sense Multiple Access （NP-CSMA） mechanism was proposed to replace the medium access control mechanism of ALOHA in LoRaWAN. The time of accessing the channel for each node with the same SF in LoRa network was coordinated by LBT， and multiple SF signals were transmitted in parallel for the transmission between different SFs， thus reducing the interference of same SF and avoiding inter-SF interference in the common channel. To analyze the impact of NP-CSMA on the scalability of LoRa networks， LoRa networks constructed by Lo RaWAN and NP-CSMA were compared by theoretical analysis and NS3 simulation. Experimental results show that NP-CSMA has 58.09% higher theoretical Packet Delivery Rate （PDR） performance than LoRaWAN under the same conditions， at a network communication load rate of 1. In terms of channel utilization， NP-CSMA increases the saturated channel utilization by 214.9% and accommodates 60.0% more nodes compared to LoRaWAN. In addition， the average latency of NP-CSMA is also shorter than that of the confirmed LoRaWAN at a network traffic load rate of less than 1.7， and the additional energy consumption to maintain the CAD （Channel Activity Detection） mode is 1.0 mJ to 1.3 mJ and 2.5 mJ to 5.1 mJ lower than the additional energy consumption required by LoRaWAN to receive confirmation messages from the gateway when spreading factor is 7 and 10. The above fully reflects that NP-CSMA can improve LoRa network scalability effectively.

Focal stack depth estimation method based on defocus blur

Meng ZHOU, Zhangjin HUANG

2023, 43(9): 2897-2903. DOI: 10.11772/j.issn.1001-9081.2022091342

Asbtract ( )

HTML ( )

PDF (3089KB) ( )

Figures and Tables | References | Related Articles | Metrics

The existing monocular depth estimation methods often use image semantic information to obtain depth， and ignore another important cue — defocus blur. At the same time， the defocus blur based depth estimation methods usually take the focal stack or gradient information as input， and do not consider the characteristics of the small variation of blur between image layers of the focal stack and the blur ambiguity on both sides of the focal plane. Aiming at the deficiencies of the existing focal stack depth estimation methods， a lightweight network based on three-dimensional convolution was proposed. Firstly， a Three-Dimensional perception module was designed to roughly extract the blur information of the focal stack. Secondly， the extracted information was concatenated with the difference features of the focal stack RGB channels output by a channel difference module to construct a focus volume that was able to identify the blur ambiguity patterns. Finally， a multi-scale three-dimensional convolution was used to predict the depth. Experimental results show that compared with methods such as All in Focus Depth Network （AiFDepthNet）， the proposed method achieves the best on seven indicators such as Mean Absolute Error （MAE） on DefocusNet dataset， and the best on four indicators as well as the suboptimal on three indicators on NYU Depth V2 dataset； at the same time， the lightweight design reduces the inference time of the proposed method by 43.92% to 70.20% and 47.91% to 77.01% on two datasets respectively. The above verifies that the proposed method can effectively improve the accuracy and inference speed of focal stack depth estimation.

Rain detection algorithm based on event camera

Junyu YANG, Yan DONG, Zhennan LONG, Xin YANG, Bin HAN

2023, 43(9): 2904-2909. DOI: 10.11772/j.issn.1001-9081.2022091360

Asbtract ( )

HTML ( )

PDF (1427KB) ( )

Figures and Tables | References | Related Articles | Metrics

To reduce the harmful effects of rain for visual tasks， rain removal algorithms are commonly utilized on single frame images or video streams to remove rain. However， since rain falls extremely fast， frame-based cameras cannot capture the temporal continuity of rain， and the fixed exposure time and motion blur further reduce the sharpness of the rain on images， as a result， the traditional image rain removal algorithms cannot detect rain coverage areas accurately. In order to explore the new idea of image rain removal， a rain event generation model was constructed and a rain detection algorithm for event camera based on spatial-temporal relevance was proposed by using the characteristics of event camera： extremely high sampling rate and no motion blur. In this algorithm， the probability of each event generated by rain movement was calculated by analyzing the spatial-temporal relationship between each event recorded by the event camera and adjacent events， so as to achieve rain detection. Experimental results on three rainfall scenes show that when the camera is static， the proposed algorithm can reach more than 95% rain detection true positive rate， and the false positive rate less than 5%， and when the camera moves， the proposed algorithm can still reach more than 95% true positive rate and no more than 20% false positive rate. The above shows that the rain can be detected effectively by the proposed algorithm.

Triplet deep hashing method for speech retrieval

Qiuyu ZHANG, Yongwang WEN

2023, 43(9): 2910-2918. DOI: 10.11772/j.issn.1001-9081.2022081149

Asbtract ( )

HTML ( )

PDF (2003KB) ( )

Figures and Tables | References | Related Articles | Metrics

The existing deep hashing methods of content-based speech retrieval do not make enough use of supervised information and have the suboptimal generated hash codes， low retrieval precision and low retrieval efficiency. To address the above problems， a triplet deep hashing method for speech retrieval was proposed. Firstly， the spectrogram image features were used as the input of the model in triplet manner to extract the effective information of the speech feature. Then， an Attentional mechanism-Residual Network （ARN） model was proposed， that is， the spatial attention mechanism was embedded on the basis of the ResNet （Residual Network）， and the salient region representation was improved by aggregating the energy salient region information in the whole spectrogram. Finally， a novel triplet cross-entropy loss was introduced to map the classification information and similarity between spectrogram image features into the learned hash codes， thereby achieving the maximum class separability and maximal hash code discriminability during model training. Experimental results show that the efficient and compact binary hash codes generated by the proposed method has the recall， precision and F1 score of over 98.5% in speech retrieval. Compared with methods such as single-label retrieval method， the average running time of the proposed method using Log-Mel spectra as features is shorted by 19.0% to 55.5%. Therefore， this method can improve the retrieval efficiency and retrieval precision significantly while reducing the amount of computation.

Image copy-move forgery detection based on multi-scale feature extraction and fusion

Juntao CHEN, Ziqi ZHU

2023, 43(9): 2919-2924. DOI: 10.11772/j.issn.1001-9081.2022081288

Asbtract ( )

HTML ( )

PDF (1303KB) ( )

Figures and Tables | References | Related Articles | Metrics

In the field of image copy-move forgery detection， it is very challenging to locate the boundaries of tampered small objects accurately. Current deep learning-based methods locate forged regions by detecting similar content in images. However， these methods usually just transmit the final features extracted by the encoder to the decoder to generate the mask， and ignore more spatial information of forged regions contained in the high-resolution encoding features， resulting in inaccurate model output prediction results for the boundary identification of small objects. To address this problem， a detection network based on multi-scale feature extraction and fusion called SimiNet was proposed. Firstly， abundant features were extracted by the multi-scale feature extraction module. Secondly， a skip connection was added between the feature extraction module and the decoding module to bridge the gap between the encoding and decoding features， so as to identify the boundaries of small objects accurately. Finally， Log-Cosh Dice Loss function was used to take the place of cross entropy loss to reduce the impact of class-imbalance problem on detection results. Experimental results show that the F1 score of SimiNet on USCISI dataset reaches 72.54%， which is 3.39 percentage points higher than that of the suboptimal method CMSDNet （Copy-Move Similarity Detection Network）. It can be seen that SimiNet is more accurate for boundary identification of small objects and has better visualization.

Image tampering forensics network based on residual feedback and self-attention

Guolong YUAN, Yujin ZHANG, Yang LIU

2023, 43(9): 2925-2931. DOI: 10.11772/j.issn.1001-9081.2022081283

Asbtract ( )

HTML ( )

PDF (1998KB) ( )

Figures and Tables | References | Related Articles | Metrics

The existing multi-tampering type image forgery detection algorithms using noise features often can not effectively detect the feature difference between tampered areas and non-tampered areas， especially for copy-move tampering type. To this end， a dual-stream image tampering forensics network fusing residual feedback and self-attention mechanism was proposed to detect tampering artifacts such as unnatural edges of RGB pixels and local noise inconsistence respectively through two streams. Firstly， in the encoder stage， multiple dual residual units integrating residual feedback were used to extract relevant tampering features to obtain coarse feature maps. Secondly， further feature reinforcement was performed on the coarse feature maps by the improved self-attention mechanism. Thirdly， the mutual corresponding shallow features of encoder and deep features of decoder were fused. Finally， the final features of tempering extracted by the two streams were fused in series， and then the pixel-level localization of the tampered area was realized through a special convolution operation. Experimental results show that the F1 score and Area Under Curve （AUC） value of the proposed network on COVERAGE dataset are better than those of the comparison networks. The F1 score of the proposed network is 9.8 and 7.7 percentage points higher than that of TED-Net （Two-stream Encoder-Decoder Network） on NIST16 and Columbia datasets， and the AUC increases by 1.1 and 6.5 percentage points， respectively. The proposed network achieves good results in copy-move tampering type detection， and is also suitable for other tampering type detection. At the same time， the proposed network can locate the tampered area at pixel level accurately， and its detection performance is superior to the comparison networks.

Semi-supervised fake job advertisement detection model based on consistency training

Ruiqi WANG, Shujuan JI, Ning CAO, Yajie GUO

2023, 43(9): 2932-2939. DOI: 10.11772/j.issn.1001-9081.2022081163

Asbtract ( )

HTML ( )

PDF (2191KB) ( )

Figures and Tables | References | Related Articles | Metrics

The flood of fake job advertisements will not only damage the legitimate rights and interests of job seekers but also disrupt the normal employment order， which results in a poor user experience for job seekers. To effectively detect fake job advertisements， an SSC （Semi-Supervised fake job advertisements detection model based on Consistency training） was proposed. Firstly， the consistency regularization term was applied on all the data to improve the performance of the model. Then， supervised loss and unsupervised loss were integrated through joint training to obtain the semi-supervised loss. Finally， the semi-supervised loss was used to optimize the model. Experimental results on two real datasets EMSCAD （EMployment SCam Aegean Dataset） and IMDB （Internet Movie DataBase） show that SSC achieves the best detection performance when the labeled data are only 20， and the accuracy is increased by 2.2 and 2.8 percentage points compared with the existing advanced semi-supervised learning model UDA （Unsupervised Data Augmentation）， and is increased by 3.4 and 11.7 percentage points compared with the deep learning model BERT （Bidirectional Encoder Representations from Transformers）. At the same time， SSC has good scalability.

Super-resolution reconstruction algorithm of medical images based on dilated convolution

Zhong LI, Yajing WANG, Qiaomei MA

2023, 43(9): 2940-2947. DOI: 10.11772/j.issn.1001-9081.2023030381

Asbtract ( )

HTML ( )

PDF (3298KB) ( )

Figures and Tables | References | Related Articles | Metrics

To solve the problems such as blurred image details and insufficient utilization of global information in existing medical image super-resolution reconstruction， a medical image super-resolution reconstruction algorithm based on dilated convolution and improved hybrid attention mechanism was proposed. Firstly， depthwise separable convolution was combined with dilated convolution to extract image features at different scales by using perceptive fields with different sizes， thereby enhancing feature representation ability. Secondly， edge channel attention mechanism was introduced to fuse the edge information while extracting the high-frequency image features， thereby improving the reconstruction accuracy of the model. Thirdly， L1 loss and perceptual loss were mixed as the overall loss function in order to make the reconstructed image effect more consistent with human visual perception. Experimental results show that when the amplification factor is 3， compared with Super-Resolution Convolutional Neural Network （SRCNN） and VDSR （Very Deep convolutional networks Super-Resolution）， the proposed algorithm has the PSNR （Peak Signal-to-Noise Ratio） improved by 11.29% and 7.85% averagely and respectively， and the SSIM （Structural Similarity Index Measure） improved by 5.25% and 2.44% averagely and respectively. It can be seen that the proposed algorithm enhances the effect and texture features of the medical images， and provides more complete restoration of overall image structure.

Sparse reconstruction of CT images based on Uformer with fused channel attention

Mengmeng CHEN, Zhiwei QIAO

2023, 43(9): 2948-2954. DOI: 10.11772/j.issn.1001-9081.2022081242

Asbtract ( )

HTML ( )

PDF (5664KB) ( )

Figures and Tables | References | Related Articles | Metrics

Concerning the problem of streak artifacts generated in the sparse reconstruction of analytic method， a Channel Attention U-shaped Transformer （CA-Uformer） was proposed to achieve high-precision Computed Tomography （CT） sparse reconstruction. In CA-Uformer， channel attention and spatial attention in Transformer were fused， and with the dual-attention mechanism， image detail information was easier learnt by the network； an excellent U-shaped architecture was adopted to fuse multi-scale image information； a forward feedback network design was implemented by using convolutional operations， which further coupled the local information association ability of Convolutional Neural Network （CNN） and the global information capturing ability of Transformer. Experimental results show that CA-Uformer has the Peak Signal-to-Noise Ratio （PSNR）， Structural Similarity （SSIM） 3.27 dB and 3.14% higher， and Root Mean Square Error （RMSE） 35.29% lower than the classical U-Net， which is a significant improvement. It can be seen that CA-Uformer has sparse reconstruction with higher precision and better ability to suppress artifacts.

Stomach cancer image segmentation method based on EfficientNetV2 and object-contextual representation

Di ZHOU, Zili ZHANG, Jia CHEN, Xinrong HU, Ruhan HE, Jun ZHANG

2023, 43(9): 2955-2962. DOI: 10.11772/j.issn.1001-9081.2022081159

Asbtract ( )

HTML ( )

PDF (4902KB) ( )

Figures and Tables | References | Related Articles | Metrics

In view of the problems that the upsampling process of U-Net is easy to lose details， and the datasets of stomach cancer pathological image are generally small， which tends to lead to over-fitting， an automatic segmentation model for pathological images of stomach cancer based on improved U-Net was proposed， namely EOU-Net. In EOU-Net， based on the existing U-Net model， EfficientNetV2 was used as the backbone， thereby enhancing the feature extraction ability of the network encoder. In the decoding stage， the relations between cell pixels were explored on the basis of Object-Contextual Representation （OCR）， and the improved OCR module was used to solve the loss problem of the upsampled image details. Then， the post-processing of Test Time Augmentation （TTA） was used to predict the images obtained by rollover and rotations at different angles of the input image respectively， and then the prediction results of these images were combined by feature fusion to further optimize the output results of the network， thereby solving the problem of small medical datasets effectively. Experimental results on datasets SEED， BOT and PASCAL VOC 2012 show that the Mean Intersection over Union （MIoU） of EOU-Net is improved by 1.8， 0.6 and 4.5 percentage points respectively compared with that of OCRNet. It can be seen that EOU-Net can obtain more accurate segmentation results of stomach cancer images.

Remote sensing image pansharpening by convolutional neural network

Kunting LU, Rongrong FEI, Xuande ZHANG

2023, 43(9): 2963-2969. DOI: 10.11772/j.issn.1001-9081.2022091458

Asbtract ( )

HTML ( )

PDF (3323KB) ( )

Figures and Tables | References | Related Articles | Metrics

In the linear injection models of traditional Component Substitution （CS） and Multi-Resolution Analysis （MRA） in remote sensing image pansharpening， the relative spectral response of the sensor used for pansharpening is not considered. At the same time， the insufficient feature extraction of the original images caused by the deep learning-based methods results in the loss of spectral and spatial information in the fusion results. Aiming at the above problems， a pansharpening method combining traditional and deep learning methods was proposed， namely CMRNet. Firstly， CS and MRA were combined with Convolutional Neural Network （CNN） to achieve nonlinearity and improve the performance of pansharpening method. Secondly， the Residual Channel （RC） blocks were designed to realize the fusion and extraction of multi-scale feature information， and the feature maps of different channels were assigned different weights by using Channel Attention （CA） adaptively to learn more effective information. Experiments were conducted on QuickBird and GF1 satellite datasets. Experimental results show that on downscale QuickBird and GF1 datasets， compared with the classic method PanNet， CMRNet has the Peak Signal-to-Noise Ratio （PSNR） increased by 5.48% and 9.62% respectively， and other indexes also improved significantly， which verifies that CMRNet can achieve a better pansharpening effect.

Review of research on aquaculture counting based on machine vision

Hanyu ZHANG, Zhenbo LI, Weiran LI, Pu YANG

2023, 43(9): 2970-2982. DOI: 10.11772/j.issn.1001-9081.2022081261

Asbtract ( )

HTML ( )

PDF (1320KB) ( )

Figures and Tables | References | Related Articles | Metrics

Aquaculture counting is an important part of the aquaculture process， and the counting results provide an important basis for feeding， breeding density adjustment， and economic efficiency estimation of aquatic animals. In response to the traditional manual counting methods， which are time-consuming， labor-intensive， and prone to large errors， a large number of methods and applications based on machine vision have been proposed， thereby greatly promoting the development of non-destructive counting of aquatic products. In order to deeply understand the research on aquaculture counting based on machine vision， the relevant domestic and international literature in the past 30 years was collated and analyzed. Firstly， a review of aquaculture counting was presented in the perspective of data acquisition， and the methods for acquiring the data required for machine vision were summed up. Secondly， the aquaculture counting methods were analyzed and summarized in terms of traditional machine vision and deep learning. Thirdly， the practical applications of counting methods in different farming environments were compared and analyzed. Finally， the difficulties in the development of aquaculture counting research were summarized in terms of data， methods， and applications， and corresponding views were presented for the future trends of aquaculture counting research and equipment applications.

Table of Content