Action recognition algorithm based on attention mechanism and energy function
Lifang WANG, Jingshuang WU, Pengliang YIN, Lihua HU
Journal of Computer Applications 2025, 45(1): 234-239. DOI: 10.11772/j.issn.1001-9081.2024010004

To address the lack of structural guidance in the framework design of Zero-Shot Action Recognition (ZSAR) algorithms, an Action Recognition Algorithm based on Attention mechanism and Energy function (ARAAE) was proposed, with its framework design guided by the Energy-Based Model (EBM). Firstly, to obtain the input for the EBM, a combination of optical flow and the Convolutional 3D (C3D) architecture was designed to extract visual features, achieving spatial non-redundancy. Secondly, Vision Transformer (ViT) was utilized for visual feature extraction to reduce temporal redundancy, and ViT working together with the optical flow and C3D combination was used to reduce spatial redundancy, resulting in a non-redundant visual space. Finally, to measure the correlation between the visual space and the semantic space, an energy score evaluation mechanism was realized, and a joint loss function was designed for optimization. Experimental results on the HMDB51 and UCF101 datasets, compared against six classical ZSAR algorithms and algorithms from recent literature, show that on HMDB51 with average grouping, the average recognition accuracy of ARAAE is (22.1±1.8)%, which is better than those of CAGE (Coupling Adversarial Graph Embedding), Bi-dir GAN (Bi-directional Generative Adversarial Network) and ETSAN (Energy-based Temporal Summarized Attentive Network). On UCF101 with average grouping, the average recognition accuracy of ARAAE is (22.4±1.6)%, which is slightly better than those of all comparison algorithms. With the 81/20 split of UCF101, the average recognition accuracy of ARAAE is (40.2±2.6)%, which is higher than those of the comparison algorithms. These results show that ARAAE effectively improves recognition performance in ZSAR.
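
To make the energy-score idea concrete, the following sketch (not the published ARAAE code) shows one plausible form of an energy function over a shared visual-semantic space together with a simple joint loss. The projection layers, feature dimensions, margin term and loss weighting are illustrative assumptions; the optical flow/C3D and ViT extractors described in the abstract are assumed to supply the visual features upstream.

    # Minimal sketch, assuming precomputed visual features and class word embeddings.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EnergyScorer(nn.Module):
        def __init__(self, visual_dim=768, semantic_dim=300, hidden_dim=512):
            super().__init__()
            # Project both modalities into a shared space before scoring.
            self.visual_proj = nn.Linear(visual_dim, hidden_dim)
            self.semantic_proj = nn.Linear(semantic_dim, hidden_dim)

        def energy(self, visual, semantic):
            # Lower energy means a better visual-semantic match (EBM convention).
            v = F.normalize(self.visual_proj(visual), dim=-1)      # (B, H)
            s = F.normalize(self.semantic_proj(semantic), dim=-1)  # (C, H)
            return -v @ s.t()                                      # (B, C) energies

    def joint_loss(energies, labels, margin=0.5):
        # Joint objective: cross-entropy on negated energies plus a margin
        # ranking term pushing the true class's energy below every wrong class's.
        logits = -energies
        ce = F.cross_entropy(logits, labels)
        pos = energies.gather(1, labels.unsqueeze(1))              # (B, 1)
        mask = F.one_hot(labels, energies.size(1)).bool()
        rank = F.relu(margin + pos - energies)[~mask].mean()
        return ce + rank

    # Usage with random tensors standing in for extracted features:
    scorer = EnergyScorer()
    vis = torch.randn(8, 768)             # pooled clip features (assumed from ViT/C3D)
    sem = torch.randn(51, 300)            # class word embeddings (e.g. 51 HMDB51 classes)
    labels = torch.randint(0, 51, (8,))
    loss = joint_loss(scorer.energy(vis, sem), labels)

At test time, a zero-shot prediction under this sketch would simply pick the unseen class whose semantic embedding yields the lowest energy for the clip.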

Multi-view stereo method based on quadtree prior assistance
Lihua HU, Xiaoping LI, Jianhua HU, Sulan ZHANG
Journal of Computer Applications 2024, 44(11): 3556-3564. DOI: 10.11772/j.issn.1001-9081.2023111661

PatchMatch-based Multi-View Stereo (MVS) methods estimate scene depth from multiple input images and are widely applied in large-scale 3D scene reconstruction. However, existing methods suffer from low accuracy and completeness of depth estimation in low-texture regions, because feature matching is unstable there and relying on photometric consistency alone is unreliable. To address these problems, an MVS method based on quadtree prior assistance was proposed. Firstly, the image pixel values were used to obtain local texture information. Secondly, a coarse depth map was obtained by Adaptive Checkerboard sampling and Multi-Hypothesis joint view selection (ACMH), and the structural information of low-texture regions was combined with quadtree segmentation to generate a prior plane hypothesis. Thirdly, integrating the above information, a new multi-view matching cost function was designed to guide low-texture regions toward the best depth hypothesis, thereby improving the accuracy of stereo matching. Finally, comparison experiments were conducted against many existing traditional MVS methods on the ETH3D, Tanks and Temples, and Chinese Academy of Sciences ancient architecture datasets. The results demonstrate that the proposed method performs better: on the ETH3D test dataset at an error threshold of 2 cm, its F1 score and completeness are improved by 1.29 and 2.38 percentage points, respectively, compared with the current state-of-the-art multi-scale geometric consistency guided and planar prior assisted multi-view stereo method (ACMMP).
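
As a rough illustration of the quadtree prior idea (not the paper's implementation), the sketch below flags low-texture cells with a variance-driven quadtree split and blends the photometric matching cost with a planar depth prior inside those cells. The variance threshold, minimum cell size, weighting and Gaussian prior term are illustrative assumptions.

    # Minimal sketch, assuming a grayscale image and plane-induced prior depths.
    import numpy as np

    def quadtree_cells(gray, x, y, size, var_thresh=25.0, min_size=16):
        # Recursively split a square block; return leaves as (x, y, size, is_low_texture).
        # Assumes the block stays inside the image and size is min_size times a power of two.
        block = gray[y:y + size, x:x + size]
        if size <= min_size or block.var() <= var_thresh:
            return [(x, y, size, block.var() <= var_thresh)]
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_cells(gray, x + dx, y + dy, half, var_thresh, min_size)
        return leaves

    def prior_assisted_cost(photo_cost, depth_hyp, prior_depth, low_texture,
                            lam=0.2, sigma=0.1):
        # Blend the photometric cost (e.g. 1 - NCC) with a planar-prior term
        # in low-texture cells; elsewhere keep the photometric cost alone.
        if not low_texture or prior_depth is None:
            return photo_cost
        prior_term = 1.0 - np.exp(-((depth_hyp - prior_depth) ** 2) / (2.0 * sigma ** 2))
        return (1.0 - lam) * photo_cost + lam * prior_term

In a full PatchMatch loop, the plane for each flagged cell would be fitted (for example by least squares or RANSAC) to the reliable ACMH depths inside that cell, and prior_depth evaluated per pixel from the plane before the blended cost ranks competing depth hypotheses.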
