Multi-scale sparse graph guided vision graph neural networks
Zimo ZHANG, Xuezhuan ZHAO
Journal of Computer Applications, 2025, 45(7): 2188-2194. DOI: 10.11772/j.issn.1001-9081.2024070910

Vision Graph neural networks (ViG) have recently attracted considerable attention in computer vision, and graph construction is a key modeling step in ViG. The widely used K-Nearest Neighbor (KNN) graph construction is limited by its fixed scale and quadratic computational complexity, which makes it difficult to model both local and multi-scale information in an image. To address this issue, a Multi-Scale Sparse Graph (MSSG) construction method was proposed. In this method, the KNN graph was decomposed along the channel dimension into three sparse subgraphs of different scales, achieving linear computational complexity while effectively modeling both local and multi-scale information in the image. To enhance the model's global modeling capability, a fusion strategy for global and local multi-scale information was proposed. Based on these methods, a vision architecture, the Multi-Scale Vision Graph neural network (MSViG), was proposed. Image classification experiments on the ImageNet-1K dataset show that MSViG outperforms the existing ViG. For example, MSViG-T achieves a Top-1 classification accuracy 2.1 percentage points higher than that of ViG-T, and MSViG also improves significantly over ViG on downstream vision tasks such as object detection and instance segmentation.
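
As a rough illustration of the contrast drawn in the abstract, the NumPy sketch below first builds a dense KNN graph, whose N x N distance matrix is the source of the quadratic cost. It then shows a hypothetical sparse alternative in which the channels are split into three groups and each group aggregates grid neighbours at its own scale, so the work grows linearly with the number of nodes. The second function is an assumption made purely for illustration: the abstract does not say how MSSG selects neighbours, so the function names, the scales (1, 2, 4), the mean aggregation, and the wrap-around borders are not taken from the paper.

```python
import numpy as np


def knn_graph(features: np.ndarray, k: int) -> np.ndarray:
    """Dense KNN baseline: indices of the k nearest neighbours of each node.

    features: (N, C) array, one row per image patch (node).
    The full (N, N) distance matrix is what makes this quadratic in N.
    """
    sq = np.sum(features ** 2, axis=1)
    # Squared Euclidean distance between every pair of nodes.
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    return np.argsort(dist, axis=1)[:, :k]  # shape (N, k)


def multi_scale_aggregate(x: np.ndarray, scales=(1, 2, 4)) -> np.ndarray:
    """Hypothetical sparse multi-scale aggregation (NOT the paper's MSSG).

    x: (H, W, C) feature map. Channels are split into len(scales) groups;
    group g averages the four grid neighbours at distance scales[g].
    np.roll wraps around the image border, a simplification for brevity.
    Each node touches a constant number of neighbours, so cost is O(H * W * C).
    """
    groups = np.array_split(x, len(scales), axis=2)
    out = []
    for g, s in zip(groups, scales):
        neighbours = (
            np.roll(g, s, axis=0) + np.roll(g, -s, axis=0)
            + np.roll(g, s, axis=1) + np.roll(g, -s, axis=1)
        ) / 4.0
        out.append(neighbours)
    return np.concatenate(out, axis=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((14, 14, 48))       # 196 nodes, 48 channels
    nodes = fmap.reshape(-1, fmap.shape[2])        # (N, C) view for KNN
    print(knn_graph(nodes, k=9).shape)             # one row of 9 neighbours per node
    print(multi_scale_aggregate(fmap).shape)       # same shape as the input
```

The dense baseline needs memory and time proportional to N squared for the distance matrix alone, whereas the fixed-offset variant never materialises pairwise distances; that difference, rather than the specific neighbour pattern chosen here, is the point of the sketch.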
