Search Result

SAM Meibomian gland unified dense segmentation method with introduction of automatic prompt encoder

Ying JING, Ran LI, Zhuo JIANG, Ziyang FU, Jingyi DU, Qi LIU, Jihang LIU

Journal of Computer Applications 2026, 46 (5): 1667-1676. DOI: 10.11772/j.issn.1001-9081.2025050613

The traditional Segment Anything Model (SAM) relies on manual prompts when segmenting Meibomian gland images, which makes it difficult to handle dense glands, irregular shapes, and blurred boundaries. To address this, an improved model, ResSAM, was proposed. ResSAM eliminates the reliance on manual intervention by introducing an automatic prompt encoder. The backbone network was pruned and optimized to improve segmentation efficiency. Focal Loss and Smooth IoU Loss were used for training, and Squeeze-and-Excitation (SE) and cross-attention mechanisms were integrated to reduce the impact of individual differences and blurred boundaries, thereby improving segmentation accuracy. Experimental results on two self-built datasets, Lower Lid and Upper Lid, showed that ResSAM achieved the best performance in terms of the number of parameters and Giga FLoating-point OPerations (GFLOPs); it also obtained the highest Dice scores (88.69% and 87.75%, respectively) and the highest Intersection-over-Union (IoU) values (79.69% and 78.58%, respectively). The results indicate that ResSAM improves both efficiency and accuracy, supporting early prevention and clinical diagnosis of Meibomian Gland Dysfunction (MGD).
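The Dice and IoU scores quoted in this abstract are standard overlap metrics for binary masks. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code; the function names and the toy masks are illustrative assumptions.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def dice_from_iou(iou):
    """For a single pair of masks, Dice = 2 * IoU / (1 + IoU)."""
    return 2 * iou / (1 + iou)


# Toy masks (hypothetical data): 2 overlapping pixels, 3 foreground pixels each.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.4f}, IoU={iou:.4f}")
```

The per-mask identity Dice = 2·IoU / (1 + IoU) maps an IoU of 0.7969 to a Dice of about 0.887, close to the reported 88.69%. Averages over a dataset need not satisfy the identity exactly, so this is only a plausibility check.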

Segmentation network of coronary artery structure from CT angiography images based on multi-scale spatial features

Yingtao CHEN, Kangkang FANG, Jin’ao ZHANG, Haoran LIANG, Huanbin GUO, Zhaowen QIU

Journal of Computer Applications 2025, 45 (6): 2007-2015. DOI: 10.11772/j.issn.1001-9081.2024060853

Owing to the complex morphological structure of the coronary arteries and variations in the acquisition conditions of Computed Tomography (CT) Angiography (CTA) images, image quality issues such as uneven gray-scale distribution, motion artifacts, and noise lead to missed judgments and misjudgments in coronary artery structure segmentation. Therefore, a segmentation network for coronary artery structure in CTA images based on multi-scale spatial features, the Three-Dimensional (3D) Multi-Scale Parallel Net (MSP-Net), was proposed. Firstly, in view of the large spatial span and small local proportion of the coronary artery, a multi-scale parallel fusion network was used to extract global and local features from coronary artery CTA images separately and then fuse them, ensuring complete extraction of coronary artery structure features. Secondly, a coarse-to-fine idea was adopted in coronary artery reconstruction to enhance the redundancy of image features and thereby ensure clear coronary artery boundaries; the coronary artery structure was then reconstructed by fusing features at different scales to improve the accuracy of the segmentation results and reduce missed judgments and misjudgments. Finally, to accelerate network training, a deep supervision strategy was adopted, applying supervision signals at different network depths to improve training efficiency. Experimental results show that, in the automatic coronary artery segmentation task, the average Dice Similarity Coefficient (DSC) of the proposed network reaches 87.16%, which is 4.04 and 2.31 percentage points higher than those of nnU-Net and Swin UNETR (Swin UNEt TRansformers), respectively, and the average 95% Hausdorff Distance (HD95) of the proposed network reaches 3.69 mm, which is 14.43 mm and 13.75 mm lower than those of nnU-Net and Swin UNETR, respectively. The proposed network can therefore effectively improve the segmentation accuracy of coronary artery structure and help clinicians understand patients' coronary artery structure more accurately, so as to evaluate disease more effectively.

Multi-focus image fusion network with cascade fusion and enhanced reconstruction

Benchen YANG, Haoran LI, Haibo JIN

Journal of Computer Applications 2025, 45 (2): 594-600. DOI: 10.11772/j.issn.1001-9081.2024030302

To address the problem of partially focused images caused by improper focusing of far and near fields of view during digital image shooting, a multi-focus image fusion Network with Cascade fusion and enhanced reconstruction (CasNet) was proposed. Firstly, a cascade sampling module was constructed to calculate and merge the residuals of feature maps sampled at different depths, making efficient use of focused features at different scales. Secondly, a lightweight multi-head self-attention mechanism was improved to perform dimensional residual calculation on feature maps, enhancing image features and giving the feature maps a better distribution across dimensions. Thirdly, convolution channel attention stacking was used to complete feature reconstruction. Finally, interval convolution was used for up- and down-sampling during the sampling process, so as to retain more original image features. Experimental results demonstrate that CasNet achieves better results in metrics such as Average Gradient (AG) and Gray-Level Difference (GLD) on the multi-focus image benchmark test sets Lytro, MFFW, grayscale, and MFI-WHU compared with popular methods such as SESF-Fuse (Spatially Enhanced Spatial Frequency-based Fusion) and U2Fusion (Unified Unsupervised Fusion network).

Lightweight human pose estimation based on merge state space model

Zhuoran LI, Hua LI, Tong WANG, Chaozhe JIANG

Journal of Computer Applications 2025, 45 (10): 3179-3186. DOI: 10.11772/j.issn.1001-9081.2024091351

In the field of Human Pose Estimation (HPE), heatmap-based methods suffer from large quantization error, high computational complexity, and the need to post-process the heatmap. To address these issues, with the coordinate regression method SimCC as the baseline, a lightweight HPE model based on a Merge State Space Model (MSSM), named Lite-SimCC, was proposed. Firstly, ShuffleNet V2 was adopted as the backbone network to replace the original HRNet (High-Resolution Net), which simplified the structure to a single branch and made the model lightweight. Secondly, to reduce the loss of precision, a large-kernel convolution was introduced to extract global feature information. Thirdly, an MSSM was designed to handle both local and full long-sequence features, so as to enhance the representational ability of the key points. Finally, a soft-label based loss function was proposed to replace the traditional one-hot loss calculation. Experimental results show that, compared with the baseline method SimCC, Lite-SimCC reduces the number of parameters by 87.1% and improves the Average Precision (AP) by 1.4% on the COCO2017 test set; results on the MPII dataset further show that Lite-SimCC reduces the model's parameters effectively while maintaining detection precision.

Review of research on aquaculture counting based on machine vision

Hanyu ZHANG, Zhenbo LI, Weiran LI, Pu YANG

Journal of Computer Applications 2023, 43 (9): 2970-2982. DOI: 10.11772/j.issn.1001-9081.2022081261

Aquaculture counting is an important part of the aquaculture process, and the counting results provide an important basis for feeding, breeding density adjustment, and economic efficiency estimation of aquatic animals. Because traditional manual counting methods are time-consuming, labor-intensive, and prone to large errors, a large number of methods and applications based on machine vision have been proposed, greatly promoting the development of non-destructive counting of aquatic products. To gain a deeper understanding of the research on aquaculture counting based on machine vision, the relevant domestic and international literature of the past 30 years was collated and analyzed. Firstly, aquaculture counting was reviewed from the perspective of data acquisition, and the methods for acquiring the data required for machine vision were summarized. Secondly, the aquaculture counting methods were analyzed and summarized in terms of traditional machine vision and deep learning. Thirdly, the practical applications of counting methods in different farming environments were compared and analyzed. Finally, the difficulties in the development of aquaculture counting research were summarized in terms of data, methods, and applications, and corresponding views were presented on the future trends of aquaculture counting research and equipment applications.

Spatial-temporal traffic flow prediction model based on gated convolution

Li XU, Xiangyuan FU, Haoran LI

Journal of Computer Applications 2023, 43 (9): 2760-2765. DOI: 10.11772/j.issn.1001-9081.2022081146

Existing traffic flow prediction models cannot accurately capture the spatio-temporal features of traffic data, and most of them perform well in single-step prediction but poorly in multi-step prediction. To address these problems, a Spatio-Temporal Traffic Flow Prediction Model based on Gated Convolution (GC-STTFPM) was proposed. Firstly, a Graph Convolution Network (GCN) combined with a Gated Recurrent Unit (GRU) was used to capture the spatio-temporal features of traffic flow data. Then, a method of splicing and filtering the original data and the spatio-temporal feature data with a gated convolution unit was proposed to verify the validity of the spatio-temporal feature data. Finally, a GRU was used as the decoder to make accurate and reliable predictions of future traffic flow. Experimental results on the Los Angeles highway traffic dataset show that, compared with the Attention based Spatial-Temporal Graph Neural Network (ASTGNN) and the Diffusion Convolutional Recurrent Neural Network (DCRNN) under single-step prediction (5 min), the GC-STTFPM model reduces the Mean Absolute Error (MAE) by 5.9% and 9.9%, respectively, and the Root Mean Square Error (RMSE) by 1.7% and 5.8%, respectively. The prediction accuracy of the model is also better than that of most existing benchmark models at the three multi-step scales of 15, 30, and 60 min, demonstrating strong adaptability and robustness.
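The MAE and RMSE reductions in this abstract are relative improvements over a baseline's error. A minimal sketch of that arithmetic follows; the flow values, predictions, and function names are made up for illustration and are not the paper's data.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return (baseline_error - model_error) / baseline_error * 100


# Hypothetical speed readings and two sets of predictions.
observed = [60.0, 62.0, 58.0, 55.0]
baseline_pred = [58.0, 65.0, 60.0, 51.0]
model_pred = [59.0, 64.0, 59.0, 53.0]

for name, metric in (("MAE", mae), ("RMSE", rmse)):
    base = metric(observed, baseline_pred)
    ours = metric(observed, model_pred)
    print(f"{name}: baseline={base:.3f}, model={ours:.3f}, "
          f"reduction={relative_reduction(base, ours):.1f}%")
```

Because RMSE squares each error, it weights large deviations more heavily than MAE, which is one reason the two metrics can improve by different percentages for the same model.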

Image instance segmentation model based on fractional-order network and reinforcement learning

Xueming LI, Guohao WU, Shangbo ZHOU, Xiaoran LIN, Hongbin XIE

Journal of Computer Applications 2022, 42 (2): 574-583. DOI: 10.11772/j.issn.1001-9081.2021020324

To address the low segmentation precision caused by the limited image feature extraction ability of existing fractional-order nonlinear models, an instance segmentation model based on a fractional-order network and Reinforcement Learning (RL) was proposed to generate high-quality contour curves of target instances in the image. The model consists of two layers of modules: 1) the first layer is a two-dimensional fractional-order nonlinear network in which the chaotic synchronization method is mainly used to obtain the basic characteristics of the pixels in the image, and the preliminary segmentation result is obtained through coupling and connection according to the similarity among the pixels; 2) the second layer formulates instance segmentation as a Markov Decision Process (MDP) based on the idea of RL, with the action-state pairs, reward functions, and strategies of the modeling process designed to extract the region structure and category information of the image. Finally, the pixel features and preliminary segmentation result obtained from the first layer were combined with the region structure and category information obtained from the second layer for instance segmentation. Experimental results on the Pascal VOC2007 and Pascal VOC2012 datasets show that, compared with existing fractional-order nonlinear models, the proposed model improves the Average Precision (AP) by at least 15 percentage points, verifying that the sequential decision-based instance segmentation model can not only obtain the class information of the target objects in the image, but also further enhance the ability to extract contour details and fine-grained information.

Graph convolutional network method based on hybrid feature modeling

Zhuoran LI, Zhonglin YE, Haixing ZHAO, Jingjing LIN

Journal of Computer Applications 2022, 42 (11): 3354-3363. DOI: 10.11772/j.issn.1001-9081.2021111981

Networks contain complex information, and more ways are needed to extract useful information from them; however, the relevant characteristics of a network cannot be completely described by the existing single-feature Graph Neural Networks (GNNs). To resolve these problems, a Hybrid feature-based Dual Graph Convolutional Network (HDGCN) was proposed. Firstly, the structure feature vectors and semantic feature vectors of nodes were obtained by a Graph Convolutional Network (GCN). Secondly, node features were aggregated selectively by an aggregation function based on an attention mechanism or a gating mechanism, which enhanced the feature expression ability of the nodes. Finally, the hybrid feature vectors of nodes were obtained by a fusion mechanism based on a feasible dual-channel GCN, and the structure features and semantic features of nodes were modeled jointly so that the features complement each other and improve the method's performance on subsequent machine learning tasks. Verification was performed on the datasets CiteSeer, DBLP (DataBase systems and Logic Programming), and SDBLP (Simplified DataBase systems and Logic Programming). Experimental results show that, compared with the graph convolutional network model based on structure feature training, the dual-channel graph convolutional network model based on hybrid feature training increases the average Micro-F1 by 2.43, 2.14, 1.86, and 2.13 percentage points, and the average Macro-F1 by 1.38, 0.33, 1.06, and 0.86 percentage points, when the training set proportion is 20%, 40%, 60%, and 80%, respectively. The difference in accuracy is no more than 0.5 percentage points between concat and mean as the fusion strategy, which shows that either can be used. HDGCN has higher accuracy on node classification and clustering tasks than models trained on the structure or semantic network alone, and achieves the best results when the output dimension is 64, the learning rate is 0.001, the number of graph convolutional layers is 2, and the attention vector dimension is 128.
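Micro-F1 and Macro-F1, the two metrics reported for node classification here, can diverge when classes are imbalanced. The sketch below shows the standard definitions on a hypothetical label set; it is not the authors' evaluation script, and the function names are illustrative.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (Micro-F1, Macro-F1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        counts.append((tp, fp, fn))
    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(*c) for c in counts) / len(labels)
    # Micro: pool the counts over all classes, so every sample counts equally.
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return micro, macro


# Hypothetical node labels: class 0 is the majority class.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"Micro-F1={micro:.4f}, Macro-F1={macro:.4f}")
```

In the single-label setting Micro-F1 equals plain accuracy (5 of 7 correct in this toy case), while Macro-F1 is pulled down by the weaker minority classes.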

Simultaneous localization and mapping for mobile robots based on WiFi fingerprint sequence matching

Zhenghong QIN, Ran LIU, Yufeng XIAO, Kaixiang CHEN, Zhongyuan DENG, Tianrui DENG

Journal of Computer Applications 2022, 42 (10): 3268-3274. DOI: 10.11772/j.issn.1001-9081.2021081522

Simultaneous Localization And Mapping (SLAM) is a research hotspot in robot localization and navigation. Reliable loop closure detection is critical for graph-based SLAM. However, loop closure detection by vision or Lidar is computationally expensive and has low reliability in large and complex environments. To solve this problem, a graph-based SLAM algorithm based on WiFi fingerprint sequence matching was proposed. In this algorithm, fingerprint sequences were used for loop closure detection. A fingerprint sequence contains the data of multiple fingerprints and is therefore considered to carry more information than a single fingerprint pair. Accordingly, the traditional method based on single fingerprint pair matching was extended to fingerprint sequence matching, which greatly reduced the probability of false loop closures, thus ensuring high accuracy of loop closure detection and satisfying the high precision requirement of SLAM in large and complex environments. Two sets of experimental data (with robots starting from different starting points) were used to verify the proposed algorithm. The results show that the proposed algorithm is more accurate than the Gaussian similarity method, with accuracy on the first and second sets of data increased by 22.94% and 39.18%, respectively. The experimental results verify the superiority of the proposed algorithm in improving positioning accuracy and ensuring the reliability of loop closure detection.
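The idea of matching a sequence of fingerprints rather than a single pair can be illustrated with a small sketch. Everything below is an assumption made for illustration: the fingerprint representation (a dict from access-point ID to RSSI in dBm), the Gaussian kernel over RSSI differences, the averaging over aligned positions, and the threshold are hypothetical and are not taken from the paper.

```python
import math


def fingerprint_similarity(fp_a, fp_b, sigma=10.0):
    """Hypothetical similarity in [0, 1] between two WiFi fingerprints.

    Each fingerprint maps an access-point ID to an RSSI value in dBm.
    Access points seen in only one fingerprint contribute zero.
    """
    all_aps = set(fp_a) | set(fp_b)
    if not all_aps:
        return 0.0
    shared = set(fp_a) & set(fp_b)
    score = sum(math.exp(-((fp_a[ap] - fp_b[ap]) ** 2) / (2 * sigma ** 2))
                for ap in shared)
    return score / len(all_aps)


def sequence_similarity(seq_a, seq_b):
    """Mean pairwise similarity of two aligned, equally long fingerprint sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(fingerprint_similarity(a, b) for a, b in zip(seq_a, seq_b)) / len(seq_a)


def is_loop_closure(seq_a, seq_b, threshold=0.8):
    """Accept a loop-closure candidate only if the whole sequence matches."""
    return sequence_similarity(seq_a, seq_b) >= threshold


# A revisited path versus a path that only coincidentally shares one fingerprint.
visit = [{"ap1": -50, "ap2": -70}, {"ap1": -55, "ap3": -60}]
revisit = [{"ap1": -50, "ap2": -70}, {"ap1": -55, "ap3": -60}]
elsewhere = [{"ap1": -50, "ap2": -70}, {"ap4": -40}]

print(is_loop_closure(visit, revisit))    # whole sequence agrees
print(is_loop_closure(visit, elsewhere))  # single-pair match alone is not enough
```

In this toy case the first fingerprint of `elsewhere` matches `visit` perfectly, so a single-pair test would report a loop closure, whereas the sequence score drops to 0.5 and the candidate is rejected.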

Multiple kernel clustering algorithm based on capped simplex projection graph tensor learning

Haoyun LEI, Zenwen REN, Yanlong WANG, Shuang XUE, Haoran LI

Journal of Computer Applications 2021, 41 (12): 3468-3474. DOI: 10.11772/j.issn.1001-9081.2021061393

Because multiple kernel learning can effectively avoid the selection of kernel functions and parameters, and graph clustering can fully mine the complex structural information between samples, Multiple Kernel Graph Clustering (MKGC) has received widespread attention in recent years. However, the existing MKGC methods suffer from the following problems: the graph learning technique complicates the model; the high rank of the graph Laplacian matrix cannot ensure that the learned affinity graph contains exactly c connected components (the block diagonal property); and most of the methods ignore the high-order structural information among the candidate affinity graphs, making it difficult to fully utilize the multiple kernel information. To tackle these problems, a novel MKGC method was proposed. First, a new graph learning method based on capped simplex projection was proposed to project the kernel matrices directly onto the graph simplex, which reduced the computational complexity. Meanwhile, a new block diagonal constraint was introduced to preserve the exact block diagonal property of the learned affinity graphs. Moreover, low-rank tensor learning was introduced in the capped simplex projection space to fully mine the high-order structural information of the multiple candidate affinity graphs. Compared with the existing MKGC methods on multiple datasets, the proposed method has lower computational cost and high stability, and shows clear advantages in Accuracy (ACC) and Normalized Mutual Information (NMI).

Zero-watermarking algorithm based on cellular automata and singular value decomposition

WU Weimin, DING Ran, LIN Zhiyi, ZOU Qinhui

Journal of Computer Applications 2014, 34 (6): 1689-1693. DOI: 10.11772/j.issn.1001-9081.2014.06.1689

Concerning the low robustness of general watermarking algorithms against JPEG compression and geometric transform attacks, a zero-watermarking algorithm based on Cellular Automata (CA) and Singular Value Decomposition (SVD) was proposed. Firstly, an image was transformed by a 2-dimensional cellular automata transform and the low-frequency subband approximation image was isolated; the CA parameters were saved as the key. After that, the approximation image was divided into blocks, the blocks were decomposed by SVD, and the zero-watermark was constructed by applying a CA rule to the SVD matrix. In image authentication, the image could be authenticated by comparing the similarity of two watermarks with a threshold value. The experimental results show that this algorithm has good invisibility and perfect robustness in resisting JPEG compression and geometric transform attacks.

Collaborative filtering algorithm based on real-time user feedback

He-gang FU, Ran LI

Journal of Computer Applications 2011, 31 (07): 1744-1747. DOI: 10.3724/SP.J.1087.2011.01744

Traditional memory-based collaborative filtering algorithms scale poorly, while model-based collaborative filtering algorithms give poor recommendations because their models are updated with a lag. To solve these problems, a collaborative filtering algorithm based on real-time user feedback was proposed, which allows the recommender system to update the model data in real time whenever an active user submits a new rating. Hence, the recommender system can accurately reflect changes in user interest. The experimental results indicate that the algorithm can effectively improve recommendation accuracy and significantly reduce recommendation time.

Stereo pairs creation

Jun YANG, JiCheng WANG, Ran LIU

Journal of Computer Applications

Acquiring a stereo pair of a scene is the key to binocular stereo imaging. This paper presents a method for creating stereo pairs from constructed 3D models. Using camera objects in 3DS MAX, the method starts with a coordinate transformation of the objects in the scene based on the principle of binocular stereo vision. It then carries out a perspective transformation to create the left image and the right image respectively. The experimental results indicate that the positions of the two target cameras and the 3D model, together with the length of the baseline, are the key factors that affect the stereo effect. Changing the positions of the target cameras and the 3D model may result in positive-disparity or negative-disparity stereo pairs. When the ratio of AB to CO equals 0.05, the stereo effect of the created stereo pairs is better.