Camouflaged object detection by boundary mining and background guidance
Zhonghua LI, Gengxin ZHONG, Ping FAN, Hengliang ZHU
Journal of Computer Applications    2025, 45 (10): 3328-3335.   DOI: 10.11772/j.issn.1001-9081.2024091324

Since a camouflaged object is highly similar to its background, its features are easily confounded with background features, making boundary information hard to distinguish and object features hard to extract. Current mainstream Camouflaged Object Detection (COD) algorithms mainly study the camouflaged object itself and its boundaries while ignoring the relationship between the image background and the object, so their detection results are unsatisfactory in complex scenes. To explore the potential connection between background and object, a camouflaged object detection algorithm that mines boundaries and background, called I2DNet (Indirect to Direct Network), was proposed. The algorithm consists of five parts: in the encoder, the initial raw data was processed; in the Boundary-guided feature Extracting and Mining Framework (BEMF), more refined boundary features were extracted through feature processing and feature mining; in the Latent-feature Exploring Framework based on Background guidance (LEFB), more salient features were explored through multi-scale convolution, and an attention-based Hybrid Attention Module (HAM) was designed to enhance the selection of background features; in the Information Supplement Module (ISM), the detailed information lost during feature processing was compensated; in the Multi-task Co-segmentation Decoder (MCD), the features extracted by different tasks and modules were fused efficiently and the final prediction results were output. Experimental results show that the proposed algorithm outperforms 15 other state-of-the-art models on three widely used datasets; in particular, on the CAMO dataset, its mean absolute error drops to 0.042.
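For readers unfamiliar with hybrid attention, the sketch below shows one common way a channel-plus-spatial attention module can be composed in PyTorch. It is a minimal illustration only: the serial composition, reduction ratio, and kernel size are assumptions, not the HAM design from the paper.

import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Illustrative hybrid attention: channel re-weighting followed by
    spatial re-weighting (not the paper's HAM implementation)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel branch: squeeze spatial dims, score each channel
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial branch: compress channels, score each location
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_fc(x)                 # channel attention
        avg_map = x.mean(dim=1, keepdim=True)      # per-pixel channel mean
        max_map = x.amax(dim=1, keepdim=True)      # per-pixel channel max
        attn = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x * attn                            # spatial attention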

Video dynamic scene graph generation model based on multi-scale spatial-temporal Transformer
Jia WANG-ZHU, Zhou YU, Jun YU, Jianping FAN
Journal of Computer Applications    2024, 44 (1): 47-57.   DOI: 10.11772/j.issn.1001-9081.2023060861

To address the challenge of object relationships changing dynamically over time in videos, a video dynamic scene graph generation model based on a multi-scale spatial-temporal Transformer was proposed. The multi-scale modeling idea was introduced into the classic Transformer architecture to precisely model dynamic fine-grained semantics in videos. First, in the spatial dimension, attention was paid both to the global spatial correlations of objects, as in traditional models, and to the local spatial correlations of objects' relative positions, which facilitated a better understanding of the interactive dynamics between people and objects and led to more accurate semantic analysis results. Then, in the temporal dimension, not only the traditional short-term temporal correlations of objects in videos were modeled, but also the long-term temporal correlations of the same object pairs throughout entire videos were emphasized. Comprehensive modeling of long-term relationships between objects helped generate more accurate and coherent scene graphs, mitigating issues caused by occlusions, overlaps, etc. during scene graph generation. Finally, through the collaboration of the spatial encoder and the temporal encoder, dynamic fine-grained semantics in videos were captured more accurately by the model, avoiding the limitations inherent in traditional single-scale approaches. The experimental results show that, compared with the baseline model STTran, the proposed model achieves increases of 5.0, 2.8, and 2.9 percentage points in Recall@10 on the predicate classification, scene graph classification, and scene graph detection tasks, respectively, on the Action Genome benchmark dataset. This demonstrates that the multi-scale modeling concept can enhance precision and effectively boost performance in dynamic video scene graph generation tasks.
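To illustrate the two temporal scales, the sketch below builds a banded attention mask that restricts attention to a short-term window of neighboring frames; the long-term scale would simply use full (unmasked) attention over the whole video. The window size and mask convention are illustrative assumptions, not the paper's configuration.

import torch

def banded_attention_mask(num_frames: int, window: int) -> torch.Tensor:
    """Boolean mask allowing each frame to attend only to frames within
    +/- window (short-term scale); True means the position is masked out,
    matching the convention of torch.nn.MultiheadAttention."""
    idx = torch.arange(num_frames)
    dist = (idx[:, None] - idx[None, :]).abs()
    return dist > window

# Short-term scale: narrow band; long-term scale: no mask at all.
short_mask = banded_attention_mask(num_frames=8, window=2)
# e.g. attn(query, key, value, attn_mask=short_mask)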

Integrated scheduling optimization of multiple data centers based on deep reinforcement learning
Heping FANG, Shuguang LIU, Yongyi RAN, Kunhua ZHONG
Journal of Computer Applications    2023, 43 (6): 1884-1892.   DOI: 10.11772/j.issn.1001-9081.2022050722

The purpose of a task scheduling strategy for multiple data centers is to allocate computing tasks to different servers in each data center so as to improve resource utilization and energy efficiency. Therefore, a deep reinforcement learning-based integrated scheduling strategy for multiple data centers was proposed, divided into two stages: data center selection and task allocation within the selected data center. In the data center selection stage, computing resources were integrated to improve overall resource utilization. Firstly, a Deep Q Network with Prioritized Experience Replay (PER-DQN) was used to obtain the communication path to each data center in a network with data centers as nodes. Then, the resource usage cost and the network communication cost were calculated, and the optimal data center was selected by minimizing the sum of the two costs. In the task allocation stage, firstly, the computing tasks in the selected data center were divided and added to the scheduling queue according to the First-Come First-Served (FCFS) principle. Then, combining computing device status and ambient temperature, a task allocation algorithm based on Double Deep Q Network (Double DQN) was used to obtain the optimal allocation strategy, selecting the server to execute each computing task so as to avoid hot spots and reduce the energy consumption of cooling equipment. Experimental results show that the average total cost of the PER-DQN-based data center selection algorithm is reduced by 3.6% and 10.0% compared with the Computing Resource First (CRF) and Shortest Path First (SPF) path selection methods, respectively. Compared with the Round Robin (RR) and Greedy scheduling algorithms, the Double DQN-based task deployment algorithm reduces the average Power Usage Effectiveness (PUE) by 2.5% and 1.7%, respectively. It can be seen that the proposed strategy reduces the total cost and data center energy consumption effectively, and realizes efficient operation of multiple data centers.
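The Double DQN update rule itself is standard, so a minimal sketch of the target computation is given below; the reward design and the state encoding (device status, ambient temperature, costs) are specific to the paper and are left abstract here.

import torch

def double_dqn_target(online_net, target_net, reward, next_state, gamma, done):
    """Standard Double DQN target: the online network picks the argmax
    action, the target network evaluates it, reducing the Q-value
    overestimation of vanilla DQN. In this paper's setting, the reward
    would hypothetically encode negative scheduling cost or PUE."""
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
        return reward + gamma * next_q * (1.0 - done)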

Automated Fugl-Meyer assessment based on genetic algorithm and extreme learning machine
WANG Jingli, LI Liang, YU Lei, WANG Jiping, FANG Qiang
Journal of Computer Applications    2014, 34 (3): 907-910.   DOI: 10.11772/j.issn.1001-9081.2014.03.0907

To realize automatic and quantitative assessment in home-based upper extremity rehabilitation after stroke, an Extreme Learning Machine (ELM) based prediction model was proposed to automatically estimate the Fugl-Meyer Assessment (FMA) scale score of the shoulder-elbow section. Two accelerometers were used to record data while 24 patients performed 4 tasks selected from the shoulder-elbow FMA. The estimation was obtained by preprocessing the raw sensor data, extracting data features, and selecting features with a Genetic Algorithm (GA); then 4 single-task models and one comprehensive model were built individually with ELM on the selected features. Results show that accurate estimation of the shoulder-elbow FMA score can be achieved from accelerometer data, with a root mean squared prediction error of 2.1849 points. This approach overcomes the subjectivity and time cost of traditional outcome measures, which rely on trained clinicians, and can be easily used in home settings.
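A basic ELM regressor takes only a few lines, which illustrates why training is fast: the hidden layer is random and fixed, and the output weights have a closed-form least-squares solution. The hidden-layer size and activation below are illustrative assumptions, and the GA-based feature selection step is not shown.

import numpy as np

def train_elm(X, y, hidden=50, seed=0):
    """Basic ELM regression: random hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))  # random input weights (fixed)
    b = rng.standard_normal(hidden)                # random biases (fixed)
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                   # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Predict continuous scores (e.g. an FMA estimate) for feature rows X."""
    return np.tanh(X @ W + b) @ beta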

Learning Naive Bayes Parameters Gradually on a Series of Contracting Spaces
OUYANG Ze-hua, GUO Hua-ping, FAN Ming
Journal of Computer Applications    2012, 32 (01): 223-227.   DOI: 10.3724/SP.J.1087.2012.00223
Locally Weighted Naive Bayes (LWNB) is a good improvement of Naive Bayes (NB), and Discriminative Frequency Estimate (DFE) remarkably improves the generalization accuracy of NB. Inspired by LWNB and DFE, this paper proposed the Gradually Contracting Spaces (GCS) algorithm to learn the parameters of NB. Given a test instance, GCS found a series of subspaces of the global space, which contained all training instances: every subspace contained the test instance, and each subspace was contained in every larger one. GCS then used the training instances in those subspaces to gradually learn the NB parameters with a Modified version of DFE (MDFE), and used NB to classify test instances. GCS trained NB with all training data and thus obtained an eager version of the classifier, which is the essential difference between GCS and LWNB. A decision tree version of GCS, named GCS-T, was implemented in this paper. The experimental results show that GCS-T achieves higher generalization accuracy than C4.5 and several Bayesian classification algorithms such as NB, BayesNet, NBTree, Hidden Naive Bayes (HNB) and LWNB, and that the classification speed of GCS-T is remarkably faster than that of LWNB.
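For context, the core idea of DFE is to replace the +1 count increment of the classical frequency estimate with a loss-weighted increment, so that hard-to-classify instances influence the counts more. The sketch below illustrates this for integer-encoded categorical features; it is a generic DFE sketch under those assumptions, not the MDFE variant proposed in the paper.

import numpy as np

def dfe_counts(X, y, n_classes, n_values, iters=5):
    """Sketch of Discriminative Frequency Estimate for NB parameters.
    X: (n, f) integer-encoded features; n_values: number of values per feature.
    Each pass adds 1 - P(true class | x) to the counts instead of 1."""
    n_feats = X.shape[1]
    class_counts = np.ones(n_classes)                        # Laplace-smoothed
    cond_counts = [np.ones((n_classes, v)) for v in n_values]
    for _ in range(iters):
        for x, c in zip(X, y):
            # posterior under the current counts
            log_p = np.log(class_counts / class_counts.sum())
            for f in range(n_feats):
                t = cond_counts[f]
                log_p += np.log(t[:, x[f]] / t.sum(axis=1))
            p = np.exp(log_p - log_p.max())
            p /= p.sum()
            loss = 1.0 - p[c]                                # discriminative step
            class_counts[c] += loss
            for f in range(n_feats):
                cond_counts[f][c, x[f]] += loss
    return class_counts, cond_counts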