Search Result

Visually guided word segmentation and part of speech tagging

Haiyan TIAN, Saihao HUANG, Dong ZHANG, Shoushan LI

Journal of Computer Applications 2025, 45 (5): 1488-1495. DOI: 10.11772/j.issn.1001-9081.2024050627

Chinese Word Segmentation (WS) and Part-Of-Speech (POS) tagging effectively support downstream tasks such as knowledge graph construction and sentiment analysis. Existing work typically uses only textual information for WS and POS tagging. However, the Web also contains a large amount of associated image and video information. Therefore, efforts were made to mine clues from this visual information to aid Chinese WS and POS tagging. Firstly, a set of detailed annotation standards was established, and a multimodal dataset, VG-Weibo, was annotated with WS and POS labels using the text and image content of Weibo posts. Then, two multimodal information fusion methods with different decoding mechanisms, VGTD (Visually Guided Two-stage Decoding model) and VGCD (Visually Guided Collapsed Decoding model), were proposed to accomplish the joint task of WS and POS tagging. In the VGTD method, a cross-attention mechanism was adopted to fuse textual and visual information, and a two-stage decoding strategy was employed to first predict possible word spans and then predict the corresponding tags; in the VGCD method, a cross-attention mechanism was also used to fuse textual and visual information, together with a more appropriate Chinese representation and a collapsed decoding strategy. Experimental results on the VG-Weibo test set show that, on the WS and POS tagging tasks, the F1 scores of the VGTD method are 0.18 and 0.22 percentage points higher, respectively, than those of the pure-text Two-stage Decoding model (TD), and the F1 scores of the VGCD method are 0.25 and 0.55 percentage points higher, respectively, than those of the pure-text Collapsed Decoding model (CD). Both VGTD and VGCD can therefore use visual information effectively to improve WS and POS tagging performance.
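The abstract reports F1 scores for WS and POS tagging but does not say how they are computed. The following is a minimal sketch of the span-based scoring commonly used for joint segmentation and tagging, assuming that a word counts as correct for WS when its character span matches the gold span, and for POS when the tag matches as well. The function names, the example sentence, and its tags are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

TaggedWord = Tuple[str, str]        # (word, POS tag)
TaggedSpan = Tuple[int, int, str]   # (start, end, tag); end is exclusive


def to_spans(words: List[TaggedWord]) -> List[TaggedSpan]:
    """Convert a (word, tag) sequence into character-offset spans."""
    spans, start = [], 0
    for word, tag in words:
        end = start + len(word)
        spans.append((start, end, tag))
        start = end
    return spans


def f1(gold: Set, pred: Set) -> float:
    """Harmonic mean of precision and recall over two sets of items."""
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def ws_pos_f1(gold: List[TaggedWord], pred: List[TaggedWord]) -> Tuple[float, float]:
    """Return (WS F1, POS F1) for one gold/predicted sentence pair."""
    g, p = to_spans(gold), to_spans(pred)
    # WS: only the span boundaries have to match.
    ws = f1({(s, e) for s, e, _ in g}, {(s, e) for s, e, _ in p})
    # POS: both the span boundaries and the tag have to match.
    pos = f1(set(g), set(p))
    return ws, pos


# Hypothetical example: the prediction splits the last word and mistags the verb.
gold = [("我", "PN"), ("喜欢", "VV"), ("北京", "NR")]
pred = [("我", "PN"), ("喜欢", "NN"), ("北", "NR"), ("京", "NR")]
ws_score, pos_score = ws_pos_f1(gold, pred)
```

Under this scoring, a segmentation error always costs POS credit as well, which is one reason the two tasks are usually evaluated jointly.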

Vision foundation model-driven pixel-level image anomaly detection method

Zhenhua XUE, Qiang LI, Chao HUANG

Journal of Computer Applications 2025, 45 (3): 823-831. DOI: 10.11772/j.issn.1001-9081.2024091398

Although previous anomaly detection methods have achieved high-precision detection in specific scenarios, their applicability is constrained by a lack of generalizability and automation. Therefore, a Vision Foundation Model (VFM)-driven pixel-level image anomaly detection method, namely SSMOD-Net (State Space Model driven-Omni Dimensional Net), was proposed to achieve more accurate industrial defect detection. Unlike existing methods, SSMOD-Net prompts SAM (Segment Anything Model) automatically without fine-tuning it, which makes the method particularly suitable for scenarios that require processing large-scale industrial visual data. The core of SSMOD-Net is a novel prompt encoder driven by a state space model, which generates prompts dynamically from the image input to SAM. This design allows additional guidance information to be introduced through the prompt encoder while preserving SAM's architecture, thereby improving detection accuracy. A residual multi-scale module, also built on the state space model, was integrated into the prompt encoder to make comprehensive use of multi-scale and global information. Through iterative search, the module finds optimal prompts in the prompt space and provides them to SAM as high-dimensional tensors, strengthening the model's ability to recognize industrial anomalies. Moreover, because the proposed method requires no modification to SAM, complex fine-tuning training schedules are avoided. Experimental results on several datasets show that the proposed method achieves better results in mean E-measure (mE), Mean Absolute Error (MAE), Dice, and Intersection over Union (IoU) than methods such as AutoSAM and SAM-EG (SAM with Edge Guidance framework for efficient polyp segmentation).
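Three of the metrics named above (MAE, Dice, and IoU) have standard definitions for pixel-level masks, sketched below in plain Python. The 0.5 binarization threshold, the convention that two empty masks score 1.0, and all function names are assumptions made for illustration; the abstract does not state the paper's evaluation protocol.

```python
from typing import List

Mask = List[List[float]]  # rows of per-pixel values in [0, 1]


def _flatten(mask: Mask) -> List[float]:
    return [value for row in mask for value in row]


def mae(pred: Mask, gold: Mask) -> float:
    """Mean absolute error between a predicted probability map and the gold mask."""
    p, g = _flatten(pred), _flatten(gold)
    return sum(abs(a - b) for a, b in zip(p, g)) / len(g)


def _overlap(pred: Mask, gold: Mask, threshold: float):
    """Binarize both masks and count intersection, predicted, and gold pixels."""
    p = [v >= threshold for v in _flatten(pred)]
    g = [v >= threshold for v in _flatten(gold)]
    inter = sum(a and b for a, b in zip(p, g))
    return inter, sum(p), sum(g)


def dice(pred: Mask, gold: Mask, threshold: float = 0.5) -> float:
    """Dice coefficient: 2|P ∩ G| / (|P| + |G|)."""
    inter, n_pred, n_gold = _overlap(pred, gold, threshold)
    total = n_pred + n_gold
    return 1.0 if total == 0 else 2 * inter / total


def iou(pred: Mask, gold: Mask, threshold: float = 0.5) -> float:
    """Intersection over Union: |P ∩ G| / |P ∪ G|."""
    inter, n_pred, n_gold = _overlap(pred, gold, threshold)
    union = n_pred + n_gold - inter
    return 1.0 if union == 0 else inter / union


# Hypothetical 2x2 example: one true anomaly pixel, two predicted.
pred_map = [[0.9, 0.8], [0.2, 0.1]]
gold_mask = [[1.0, 0.0], [0.0, 0.0]]
scores = (mae(pred_map, gold_mask), dice(pred_map, gold_mask), iou(pred_map, gold_mask))
```

Lower is better for MAE, while higher is better for Dice and IoU, so "better results" in the abstract implies opposite directions for these metrics.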

Single image super-resolution method based on residual shrinkage network in real complex scenes

Ying LI, Chao HUANG, Chengdong SUN, Yong XU

Journal of Computer Applications 2023, 43 (12): 3903-3910. DOI: 10.11772/j.issn.1001-9081.2022111697

Paired high- and low-resolution images are scarce in the real world. Traditional single image Super-Resolution (SR) methods typically train models on pairs of high-resolution and low-resolution images, but they obtain the training set by synthesizing a dataset and consider only bilinear downsampling as the image degradation process. However, image degradation in the real world is complex and diverse, so traditional methods reconstruct poorly when facing real images with unknown degradation. To address these problems, a single image super-resolution method was proposed for real complex scenes. Firstly, high- and low-resolution images were captured by a camera at different focal lengths and registered as image pairs to form CSR (Camera Super-Resolution dataset), a dataset covering various scenes. Secondly, to simulate real-world image degradation as closely as possible, the image degradation model was improved through parameter randomization of the degradation factors and nonlinear combination of degradations. In addition, the dataset of high- and low-resolution image pairs was combined with the image degradation model to synthesize the training set. Finally, because degradation factors were considered in the dataset, a residual shrinkage network and U-Net were embedded into the benchmark model to reduce, as far as possible, the redundant information that degradation factors introduce into the feature space. Experimental results indicate that, compared with the BSRGAN (Blind Super-Resolution Generative Adversarial Network) method under complex degradation conditions, the proposed method improves PSNR by 0.7 dB and 0.14 dB and SSIM by 0.001 and 0.031 on the RealSR and CSR test sets, respectively. The proposed method achieves better objective indicators and visual quality than existing methods on complex degradation datasets.
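To make the reported PSNR gains easier to interpret, here is a minimal sketch of the standard PSNR definition for 8-bit grayscale images stored as nested lists. The peak value of 255 and the function name are assumptions for illustration; the abstract does not specify the color space or channel on which the paper measures PSNR.

```python
import math
from typing import List

Image = List[List[float]]  # rows of pixel intensities


def psnr(reference: Image, restored: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    ref = [v for row in reference for v in row]
    out = [v for row in restored for v in row]
    if len(ref) != len(out):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical example: every pixel is off by one intensity level, so MSE = 1.
reference_img = [[10, 20], [30, 40]]
restored_img = [[11, 21], [31, 41]]
value_db = psnr(reference_img, restored_img)
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.7 dB corresponds to an MSE that is about 15% lower (a factor of 10^(-0.07) ≈ 0.85), whereas a gain of 0.14 dB corresponds to only about 3%.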

Multi-robot task allocation algorithm combining genetic algorithm and rolling scheduling

Fuqin DENG, Huanzhao HUANG, Chaoen TAN, Lanhui FU, Jianmin ZHANG, Tinlun LAM

Journal of Computer Applications 2023, 43 (12): 3833-3839. DOI: 10.11772/j.issn.1001-9081.2022121916

Research on Multi-Robot Task Allocation (MRTA) aims to improve the task completion efficiency of robots in smart factories. To address the deficiencies of existing algorithms in dealing with large-scale, multi-constrained MRTA, an MRTA Algorithm Combining Genetic Algorithm and Rolling Scheduling (ACGARS) was proposed. Firstly, a coding method based on the Directed Acyclic Graph (DAG) was adopted in the genetic algorithm to handle the priority constraints among tasks efficiently. Then, prior knowledge was added to the initial population of the genetic algorithm to improve its search efficiency. Finally, a rolling scheduling strategy based on task groups was designed to reduce the scale of the problem to be solved, so that large-scale problems can be solved efficiently. Experimental results on large-scale problem instances show that, compared with the schemes generated by the Constructive Heuristic Algorithm (CHA), MinInterfere Algorithm (MIA), and Genetic Algorithm with Penalty Strategy (GAPS), the scheme generated by the proposed algorithm shortens the average order completion time by 30.02%, 16.86%, and 75.65%, respectively, when the number of task groups is 20. This verifies that the proposed algorithm can effectively shorten the average waiting time of orders and improve the efficiency of multi-robot task allocation.
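The abstract does not describe the DAG-based coding in detail. One common way to make a genetic algorithm respect precedence constraints is to restrict chromosomes to topological orders of the task DAG, so that every individual is feasible by construction. The sketch below illustrates that general idea only; it is not the paper's encoding, and the task names, function names, and the requirement that every task appear as a key in `preds` are assumptions.

```python
import random
from typing import Dict, List


def random_topological_order(preds: Dict[str, List[str]], rng: random.Random) -> List[str]:
    """Sample a task order in which every task follows all of its predecessors.

    `preds` maps each task to the tasks that must finish before it;
    every task must appear as a key.
    """
    remaining = {task: set(p) for task, p in preds.items()}
    ready = sorted(task for task, p in remaining.items() if not p)
    order: List[str] = []
    while ready:
        # Pick any currently schedulable task at random (randomized Kahn's algorithm).
        task = ready.pop(rng.randrange(len(ready)))
        order.append(task)
        del remaining[task]
        for other, p in remaining.items():
            if task in p:
                p.discard(task)
                if not p:
                    ready.append(other)
    if remaining:
        raise ValueError("precedence graph has a cycle or an unknown predecessor")
    return order


def is_feasible(order: List[str], preds: Dict[str, List[str]]) -> bool:
    """Check that `order` is a permutation of the tasks that respects `preds`."""
    if len(order) != len(preds) or set(order) != set(preds):
        return False
    position = {task: i for i, task in enumerate(order)}
    return all(position[p] < position[task] for task in order for p in preds[task])


# Hypothetical task group: b and c both need a; d needs b and c.
task_preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
population = [random_topological_order(task_preds, random.Random(seed)) for seed in range(20)]
```

Generating the initial population this way means no penalty term is needed for precedence violations, in contrast to a penalty-based strategy such as the GAPS baseline mentioned above.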

Clustering-based hyperlink prediction

Pengfei QI, Lihua ZHOU, Guowang DU, Hao HUANG, Tong HUANG

Journal of Computer Applications 2020, 40 (2): 434-440. DOI: 10.11772/j.issn.1001-9081.2019101730

Hyperlink prediction aims to use the inherent properties of an observed network to recover the missing links in the network. Existing hyperlink prediction algorithms often make predictions based on the entire network, so some link types with insufficient training samples may be missed, and the set of detected link types is therefore incomplete. To address this problem, a clustering-based hyperlink prediction algorithm named C-CMM was proposed. Firstly, the dataset was divided into clusters, and then a model was constructed for each cluster to perform hyperlink prediction. The proposed algorithm can make full use of the information contained in the observed samples of each cluster and widen the coverage of the prediction results. Experimental results on three real-world datasets show that the proposed algorithm outperforms a large number of state-of-the-art link prediction algorithms in prediction accuracy and efficiency, and achieves more comprehensive prediction coverage.

News topic mining method based on weighted latent Dirichlet allocation model

LI Xiangdong, BA Zhichao, HUANG Li

Journal of Computer Applications 2014, 34 (5): 1354-1359. DOI: 10.11772/j.issn.1001-9081.2014.05.1354

To solve problems such as the low accuracy and poor interpretability of traditional news topic mining, a new method was proposed based on weighted Latent Dirichlet Allocation (LDA) combined with the information structure characteristics of news. Firstly, the vocabulary weights were improved from different angles and composite weights were built, and more expressive words were obtained by extending the process by which the LDA model generates feature items. Secondly, the Category Distinguish Word (CDW) method was used to optimize the word order of the generated result, which reduced the noise and ambiguity of the topics and improved their interpretability. Finally, according to the mathematical characteristics of the probability distribution model of the topics, the topics were quantified in terms of the contribution degree of the documents to the topics and the topic weight probability to identify the hot topics. The simulation results show that the false negative rate and false positive rate of the weighted LDA model drop by an average of 1.43% and 0.16%, respectively, compared with the traditional LDA model, and the minimum standard price drops by an average of 2.68%. These results confirm the feasibility and effectiveness of the method.

Fine-grained protection domain model in a process and its implementation

Hao HUANG

Journal of Computer Applications

A fine-grained protection domain method was proposed to address the problem of dynamically changing a process's capabilities. Based on how a process accesses its address space and system resources during its different execution phases, the model partitions the process into multiple protection domains. It then sets the address space access mode for each domain, which makes it feasible to resist code injection attacks. Meanwhile, a Mandatory Access Control (MAC) framework is integrated into the model to control access to system resources, which meets the security requirements of the system.