Search Result

Image super-resolution network based on global dependency Transformer

Zihan LIU, Dengwen ZHOU, Yukai LIU

Journal of Computer Applications 2024, 44 (5): 1588-1596. DOI: 10.11772/j.issn.1001-9081.2023050636

At present, image super-resolution networks based on deep learning are mainly implemented with convolution. Compared with the traditional Convolutional Neural Network (CNN), the main advantage of Transformer in the image super-resolution task is its ability to model long-distance dependencies. However, most Transformer-based image super-resolution models cannot establish global dependencies with few parameters and few network layers, which limits model performance. To establish global dependencies in a super-resolution network, an image Super-Resolution network based on Global Dependency Transformer (GDTSR) was proposed. Its main component was the Residual Square Axial Window Block (RSAWB); in the Transformer residual layer, axial windows and self-attention were used to make each pixel globally dependent on the entire feature map. In addition, the super-resolution image reconstruction modules of most current image super-resolution models are composed of convolutions. To dynamically integrate the extracted feature information, Transformer and convolution were combined to jointly reconstruct super-resolution images. Experimental results show that the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) of GDTSR on five standard test sets, Set5, Set14, B100, Urban100 and Manga109, are optimal at three scale factors ($\times 2$, $\times 3$, $\times 4$), and the performance improvement is especially obvious on the large-scale datasets Urban100 and Manga109.
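The PSNR metric named in the abstract can be illustrated with a minimal sketch. This is the textbook definition applied to small grayscale grids, not the authors' evaluation code; the function name `psnr` and the `max_value` parameter are assumptions introduced here, and benchmark protocols often add steps such as luminance conversion and border cropping that are not covered.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized grayscale images.

    Both images are lists of rows of pixel intensities in [0, max_value].
    """
    if len(reference) != len(restored) or any(
        len(a) != len(b) for a, b in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, s in zip(ref_row, res_row):
            squared_error += (r - s) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference, which is why the abstract reports it alongside SSIM.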

Fish image classification based on positional overlapping patch embedding and multi-scale channel interactive attention

Wen ZHOU, Yuzhang CHEN, Zhiyuan WEN, Shiqi WANG

Journal of Computer Applications 2024, 44 (10): 3209-3216. DOI: 10.11772/j.issn.1001-9081.2023101466

Underwater fish image classification is a highly challenging task. The traditional Vision Transformer (ViT) backbone is limited in its ability to process local continuous features, and it does not perform well on fish classification with low-quality images. To solve this problem, a Transformer-based image classification network using Overlapping Patch Embedding (OPE) and Multi-scale Channel Interactive Attention (MCIA), called PIFormer (Positional overlapping and Interactive attention transFormer), was proposed. PIFormer was built as a multi-layer structure, with each layer stacked a different number of times to facilitate the extraction of features at different depths. Firstly, the deep Positional Overlapping Patch Embedding (POPE) module was introduced to slice the feature map and edge information into overlapping patches, so as to retain the local continuous features of the fish body. At the same time, position information was added for ordering, thereby helping PIFormer integrate the detailed features and build the global map. Then, the MCIA module was proposed to process the local and global features in parallel and establish long-distance dependencies between different parts of the fish body. Finally, the high-level features were processed by a Group Multi-Layer Perceptron (GMLP) to improve the efficiency of the network and produce the final fish classification. To verify the effectiveness of PIFormer, a self-built dataset of freshwater fishes in East Lake was constructed, and the public datasets Fish4Knowledge and NCFM (Nature Conservancy Fisheries Monitoring) were used to ensure experimental fairness. Experimental results demonstrate that the Top-1 classification accuracy of the proposed network on these datasets reaches 97.99%, 99.71% and 90.45% respectively. Compared with ViT, Swin Transformer and PVT (Pyramid Vision Transformer) of the same depth, the proposed network reduces the number of parameters by 72.62×10⁶, 14.34×10⁶ and 11.30×10⁶ respectively, and the FLoating-point OPerations (FLOPs) by 14.52×10⁹, 2.02×10⁹ and 1.48×10⁹ respectively. It can be seen that PIFormer achieves strong fish image classification performance with a reduced computational burden.
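The idea of slicing a feature map into overlapping patches can be sketched as follows. This illustrates only the general technique (a stride smaller than the patch size), not the POPE module itself; the function name `overlapping_patches` and its parameters are hypothetical, and the learned embedding and positional encoding are omitted.

```python
def overlapping_patches(image, patch_size, stride):
    """Slice a 2D grid into square patches.

    When stride < patch_size, neighbouring patches share pixels, so local
    structure that straddles a patch border is kept in at least one patch.
    Each result is ((top, left), patch); the position lets patches be ordered.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


# A 4x4 grid holding the values 0..15, row by row.
grid = [[4 * r + c for c in range(4)] for r in range(4)]
overlapping = overlapping_patches(grid, patch_size=2, stride=1)
disjoint = overlapping_patches(grid, patch_size=2, stride=2)
```

With `stride=1` the 4x4 grid yields nine 2x2 patches that share edges, whereas `stride=2` yields four disjoint patches, which is the non-overlapping slicing used by a plain ViT patch embedding.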

Leukocyte detection method based on twice-fusion-feature CenterNet

Huan LIU, Lianghong WU, Lyu ZHANG, Liang CHEN, Bowen ZHOU, Hongqiang ZHANG

Journal of Computer Applications 2023, 43 (8): 2602-2610. DOI: 10.11772/j.issn.1001-9081.2022071009

Leukocyte detection is difficult in complex real-world scenarios because leukocytes vary in shape and degree of staining. To solve this problem, a leukocyte detection method based on CenterNet with two-stage feature fusion, TFF-CenterNet (Twice-Fusion-Feature CenterNet), was proposed. Firstly, the features of the backbone network were fused with the features of the deconvolution layers through a Feature Pyramid Network (FPN). In this way, the feature extraction ability of the method was improved to address the individual differences and different degrees of staining of leukocytes. Then, to address the severe imbalance between the image area of leukocytes and the background image area, the heatmap loss function was improved to enhance the focus on positive leukocyte samples and improve the detection mean Average Precision (mAP). Finally, because leukocytes in images are tiny, randomly located and prone to cell adhesion, coordinate attention and coordinate convolution were introduced to improve the attention and sensitivity to leukocyte location information. For leukocytes in complex scenarios, TFF-CenterNet achieves an mAP of 97.01% and a detection speed of 167 frame/s, which are 3.24 percentage points higher and 42 frame/s faster than those of CenterNet respectively. Experimental results show that the proposed method improves the mAP of leukocyte detection in complex situations while meeting real-time requirements, and improves robustness, so that it can provide technical support for rapid automatic leukocyte detection in complementary medical diagnosis.
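For context on the heatmap loss mentioned above, the following is a minimal sketch of the penalty-reduced focal loss used by the original CenterNet, i.e. the baseline that the abstract says was improved. The improved loss itself is not described in the abstract and is not reproduced here. The function name `heatmap_focal_loss` is an assumption; `alpha=2` and `beta=4` are the defaults commonly used with CenterNet.

```python
import math

def heatmap_focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-12):
    """Baseline CenterNet heatmap loss on 2D grids of values in [0, 1].

    target == 1 marks an object centre (positive sample); other target
    values come from a Gaussian around centres and down-weight the
    penalty for near-centre background pixels.
    """
    positive_loss = 0.0
    negative_loss = 0.0
    num_positive = 0
    for pred_row, target_row in zip(pred, target):
        for p, y in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            if y == 1.0:
                positive_loss += (1.0 - p) ** alpha * math.log(p)
                num_positive += 1
            else:
                negative_loss += (1.0 - y) ** beta * p ** alpha * math.log(1.0 - p)
    # Normalise by the number of centres (at least 1 to avoid division by zero).
    return -(positive_loss + negative_loss) / max(num_positive, 1)


target = [[1.0, 0.0]]
loss_rough = heatmap_focal_loss([[0.9, 0.1]], target)
loss_sharp = heatmap_focal_loss([[0.99, 0.01]], target)
```

Because the sum is normalised only by the number of centre pixels, the many background pixels can still dominate the total when objects are small, which is the kind of imbalance the abstract refers to.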

Unsupervised time series anomaly detection model based on re-encoding

Chunyong YIN, Liwen ZHOU

Journal of Computer Applications 2023, 43 (3): 804-811. DOI: 10.11772/j.issn.1001-9081.2022010006

To deal with the low accuracy of anomaly detection caused by data imbalance and the highly complex temporal correlation of time series, a re-encoding-based unsupervised time series anomaly detection model built on Generative Adversarial Network (GAN), named RTGAN (Re-encoding Time series based on GAN), was proposed. Firstly, multiple generators with cycle consistency were used to ensure the diversity of generated samples and thereby learn different anomaly patterns. Secondly, the stacked Long Short-Term Memory-dropout Recurrent Neural Network (LSTM-dropout RNN) was used to capture temporal correlation. Thirdly, the differences between the generated samples and the real samples were compared in the latent space by improved re-encoding. These differences, as the re-encoding errors, served as a part of the anomaly score to improve the accuracy of anomaly detection. Finally, the new anomaly score was used to detect anomalies on univariate and multivariate time series datasets. The proposed model was compared with seven baseline anomaly detection models on univariate and multivariate time series. Experimental results show that the proposed model obtains the highest average F1-score (0.815) on all datasets. The overall performance of the proposed model is 36.29% and 8.52% higher than those of the original AutoEncoder (AE) model Dense-AE (Dense-AutoEncoder) and the latest benchmark model USAD (UnSupervised Anomaly Detection on multivariate time series) respectively. The robustness of the model was tested under different Signal-to-Noise Ratios (SNRs). The results show that the proposed model consistently outperforms LSTM-VAE (Variational Autoencoder based on LSTM), USAD and OmniAnomaly; in particular, at 30% SNR, the F1-score of RTGAN is 13.53% and 10.97% higher than those of USAD and OmniAnomaly respectively. It can be seen that RTGAN can effectively improve the accuracy and robustness of anomaly detection.
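How an anomaly score with a re-encoding term feeds into the F1-score can be sketched roughly as below. The abstract does not give the scoring formula, so the weighted sum in `anomaly_score`, its `weight` parameter, the fixed threshold and all sample numbers are hypothetical; only the F1 computation follows the standard definition.

```python
def anomaly_score(reconstruction_error, reencoding_error, weight=0.5):
    """Hypothetical combination: a weighted sum of the two error terms."""
    return (1.0 - weight) * reconstruction_error + weight * reencoding_error

def detect(scores, threshold):
    """Flag a time step as anomalous (1) when its score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]

def f1_score(true_labels, predicted_labels):
    """Standard F1: harmonic mean of precision and recall for the label 1."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up per-step errors and ground-truth labels, for illustration only.
reconstruction = [0.1, 0.9, 0.9, 0.2, 0.6]
reencoding = [0.1, 0.9, 0.7, 0.2, 0.8]
truth = [0, 1, 0, 0, 1]

scores = [anomaly_score(r, e) for r, e in zip(reconstruction, reencoding)]
predicted = detect(scores, threshold=0.5)
f1 = f1_score(truth, predicted)
```

In this toy run one normal step is flagged by mistake, so precision is 2/3 and recall is 1, giving an F1-score of 0.8.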

Formal analysis approaches of train control system based on Petri nets

LIU Jiankun, SONG Wen, ZHOU Tao

Journal of Computer Applications 2013, 33 (04): 1132-1135. DOI: 10.3724/SP.J.1087.2013.01132

Formal approaches are construction methods with precise mathematical semantics, based on strict mathematical proofs. Petri nets are generally regarded as a class of computation models for concurrent behavior, and they make it convenient to develop formal specifications and analyses of a system. However, it is difficult to model a train control system with basic Petri nets. This difficulty can be overcome by extended Petri nets with inhibitor arcs. Hence, some key problems of train control systems were modeled and analyzed with extended Petri nets in this paper. Two control sub-systems, a station management sub-system and an interval operation sub-system, were proposed. The former handled the entering and leaving of trains at stations by cooperative control. The latter executed the safety control of block regions in stations, the safety recovery from emergency situations such as lightning strikes and loss of signals, and the management of railway crossings. Finally, the liveness, reachability, and boundedness of the proposed models were analyzed by S-invariants.
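The firing rule of a Petri net with inhibitor arcs can be sketched as follows. This is a generic illustration of the formalism, not the paper's model; the dictionary layout, the place names and the `enter_block` transition are hypothetical.

```python
def is_enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens
    and every place connected by an inhibitor arc is empty."""
    inputs_ok = all(marking.get(place, 0) >= weight
                    for place, weight in transition["inputs"].items())
    inhibitors_ok = all(marking.get(place, 0) == 0
                        for place in transition["inhibitors"])
    return inputs_ok and inhibitors_ok

def fire(marking, transition):
    """Return the new marking after firing; the old marking is not modified."""
    if not is_enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Hypothetical example: a train may enter a block only while the block is empty.
enter_block = {
    "inputs": {"train_waiting": 1},
    "outputs": {"block_occupied": 1},
    "inhibitors": ["block_occupied"],
}
initial = {"train_waiting": 2, "block_occupied": 0}
after_entry = fire(initial, enter_block)
```

After one train enters, the inhibitor arc disables `enter_block` even though another train is still waiting. An ordinary Petri net cannot test a place for emptiness in this way, which is why the extension helps with mutual-exclusion constraints such as block occupancy.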
