Traditional age estimation methods based on ranking and regression cannot effectively exploit the evolutionary characteristics of human faces or model the correlation between different ranking labels. Moreover, age estimation based on binary classification may suffer from inconsistent rankings. To solve the above problems, an age estimation method based on integrated ranking matrix encoding and consistency preserving was proposed, which fully exploits the correlation between age and ranking value and suppresses ranking inconsistency. A new indicator, the proportion of samples with inconsistent rankings, was proposed to evaluate the ranking inconsistency of binary-classification-based ranking methods. First, age categories were converted into ranking matrix form through a designed encoding method. Then, facial features were extracted by a ResNet34 (Residual Network) feature extraction network and learned through the proposed encoding learning module. Finally, the network predictions were decoded into the predicted age of the image by a ranking decoder based on a metric method. Experimental results show that the proposed method achieves a Mean Absolute Error (MAE) of 2.18 on the MORPH II dataset and outperforms other ranking- and ordinal-regression-based methods, such as OR-CNN (Ordinal Regression with CNN) and CORAL (COnsistent RAnk Logits), on other publicly available datasets; at the same time, it decreases the proportion of samples with inconsistent rankings, lowering the ranking inconsistency metric by about 65% compared with OR-CNN.
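As a rough illustration of the encode/decode pipeline described above, the following sketch uses the standard "age > k" binary rank coding shared by OR-CNN and CORAL together with a nearest-codeword decoder and the proposed inconsistency indicator; the paper's actual ranking matrix coding may differ in detail, and the age range here is an assumption.

```python
import numpy as np

AGE_MIN, AGE_MAX = 16, 77                  # assumed age range (hypothetical)
K = AGE_MAX - AGE_MIN                      # number of binary ranking tasks

def encode_rank(age: int) -> np.ndarray:
    """Encode an age as a binary rank vector: bit k is 1 iff age > AGE_MIN + k."""
    return (age > AGE_MIN + np.arange(K)).astype(np.float32)

def is_inconsistent(bits: np.ndarray) -> bool:
    """A rank vector is consistent iff it is non-increasing (1...10...0);
    any 0 -> 1 jump signals an inconsistent ranking."""
    return bool(np.any(np.diff(bits) > 0))

def decode_rank(probs: np.ndarray) -> int:
    """Metric-based decoding: return the age whose codeword is nearest
    (in L1 distance) to the predicted per-task probabilities."""
    codebook = np.stack([encode_rank(a) for a in range(AGE_MIN, AGE_MAX + 1)])
    return AGE_MIN + int(np.abs(codebook - probs).sum(axis=1).argmin())

probs = encode_rank(30)
probs[5], probs[20] = probs[20], probs[5]  # inject a rank inconsistency
print(is_inconsistent(probs))              # True
print(decode_rank(probs))                  # 30: decoding absorbs the flipped bits
```

Because decoding always returns the nearest valid codeword, isolated inconsistent bits do not change the predicted age, which is the intuition behind consistency preserving.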
In the era of big data and cloud computing, querying and analyzing temporal big data faces many important challenges. Focusing on issues such as poor query performance and ineffective use of indexes in temporal aggregation range queries, a Distributed Temporal Index (DTI) for temporal aggregation range queries was proposed. Firstly, a random or round-robin strategy was used to partition the temporal data. Secondly, an intra-partition index was built by a construction algorithm based on timestamp bit-array prefixes, and partition statistics including time span were recorded. Thirdly, the data partitions whose time spans overlapped with the query time interval were selected by a predicate pushdown operation and pre-aggregated by index scan. Finally, the pre-aggregated values obtained from all partitions were merged and aggregated by time. Experimental results show that the execution time of the intra-partition index construction algorithm on data with a density of 2 400 entries per unit of time is similar to that on data with a density of 0.001 entries per unit of time. Compared with ParTime, the index-based temporal aggregation range query algorithm takes at least 22% less time per step when querying the data in the first 75% of the timeline and at least 11% less time per step when executing selective aggregation. Therefore, the index-based algorithm is faster in most temporal aggregation range query tasks, and its intra-partition index construction algorithm solves the data sparsity problem efficiently.
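A simplified, single-process sketch of this query flow is given below: round-robin partitioning with recorded time spans, span-based partition pruning standing in for predicate pushdown, partition-local pre-aggregation, and a final merge by timestamp. The bit-array prefix index and the distributed execution of DTI are omitted, and per-timestamp SUM is an assumed example aggregate.

```python
from collections import defaultdict

def partition_round_robin(records, n_parts):
    """records: iterable of (timestamp, value) pairs; returns n_parts
    partitions plus (min_ts, max_ts) time-span statistics per partition."""
    parts = [[] for _ in range(n_parts)]
    for i, rec in enumerate(records):
        parts[i % n_parts].append(rec)
    spans = [(min(t for t, _ in p), max(t for t, _ in p)) if p else None
             for p in parts]
    return parts, spans

def query_sum(parts, spans, t_lo, t_hi):
    """Temporal aggregation range query: SUM(value) per timestamp in [t_lo, t_hi]."""
    merged = defaultdict(float)
    for part, span in zip(parts, spans):
        if span is None or span[1] < t_lo or span[0] > t_hi:
            continue                       # predicate pushdown: skip partition
        pre = defaultdict(float)           # partition-local pre-aggregation
        for t, v in part:
            if t_lo <= t <= t_hi:
                pre[t] += v
        for t, v in pre.items():           # merge pre-aggregated values by time
            merged[t] += v
    return dict(sorted(merged.items()))

data = [(t, 1.0) for t in range(100)]
parts, spans = partition_round_robin(data, n_parts=4)
print(query_sum(parts, spans, 10, 15))     # {10: 1.0, ..., 15: 1.0}
```

Note that under pure round-robin partitioning each partition tends to span most of the timeline, so span-based pruning mainly helps when partitions cover disjoint time ranges.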
To solve the problems of low accuracy and poor generalization in traditional sensitive information detection methods, such as keyword character matching and phrase-level sentiment analysis, a sensitive information detection method based on Attention mechanism-based Embedding from Language Model (A-ELMo) was proposed. Firstly, quick matching based on a trie tree was performed to significantly reduce comparisons against useless words, thereby greatly improving query efficiency. Secondly, an Embedding from Language Model (ELMo) was constructed for context analysis, with dynamic word vectors fully representing contextual characteristics to achieve high scalability. Finally, an attention mechanism was incorporated to enhance the model's ability to identify sensitive features and further improve the detection rate of sensitive information. Experiments were carried out on real datasets composed of multiple network data sources. The results show that the accuracy of the proposed method is 13.3 percentage points higher than that of the phrase-level sentiment analysis-based method and 43.5 percentage points higher than that of the keyword matching-based method, verifying the advantages of the proposed method in enhancing the identification of sensitive features and improving the detection rate of sensitive information.
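A minimal sketch of the first stage, trie-based quick matching, follows; abandoning a scan as soon as a prefix falls out of the trie is what cuts comparisons against useless words. The word list is illustrative only, and the ELMo and attention stages are not reproduced here.

```python
class Trie:
    def __init__(self, words):
        self.root = {}
        for w in words:
            node = self.root
            for ch in w:
                node = node.setdefault(ch, {})
            node["$"] = True               # end-of-word marker

    def scan(self, text):
        """Return (start, word) pairs for every dictionary word found in text,
        abandoning each scan as soon as the prefix falls out of the trie."""
        hits = []
        for i in range(len(text)):
            node = self.root
            for j in range(i, len(text)):
                node = node.get(text[j])
                if node is None:
                    break                  # prefix miss: no useless comparisons
                if "$" in node:
                    hits.append((i, text[i:j + 1]))
        return hits

trie = Trie(["leak", "secret", "secrets"])
print(trie.scan("the secret leaked"))      # [(4, 'secret'), (11, 'leak')]
```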
The main task of text segmentation is to divide text into several relatively independent blocks according to topic relevance. To address the shortcomings of existing text segmentation models in extracting fine-grained features such as the structural information of text paragraphs, semantic correlation, and context interaction, a text segmentation model based on Graph Convolutional Network (GCN), TS-GCN (Text Segmentation-Graph Convolutional Network), was proposed. Firstly, a text graph was constructed based on the structural information and semantic logic of text paragraphs. Then, semantic similarity attention was introduced to capture fine-grained correlations between paragraph nodes, and information transmission between high-order neighborhoods of paragraph nodes was realized with the help of GCN, enhancing the model's ability to extract topic feature representations of text paragraphs at multiple granularities. The proposed model was compared with the representative model CATS (Coherence-Aware Text Segmentation) and its base model TLT-TS (Two-Level Transformer model for Text Segmentation), both commonly used as benchmarks for the text segmentation task. Experimental results show that, without any auxiliary module, TS-GCN achieves a Pk value 0.08 percentage points lower than that of TLT-TS on the Wikicities dataset, and Pk values 0.38 and 2.30 percentage points lower than those of CATS and TLT-TS respectively on the Wikielements dataset. It can be seen that TS-GCN achieves a good segmentation effect.
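The propagation step can be sketched as follows: paragraph nodes carry embedding features, edge weights between adjacent paragraphs come from cosine similarity (a simple stand-in for the paper's semantic similarity attention), and one symmetrically normalized GCN layer updates each node from its neighborhood. The dimensions and random features are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 32))                      # 6 paragraph nodes, 32-d features

# Adjacency: adjacent paragraphs are linked; edge weight = cosine similarity.
A = np.zeros((6, 6))
for i in range(5):
    sim = X[i] @ X[i + 1] / (np.linalg.norm(X[i]) * np.linalg.norm(X[i + 1]))
    A[i, i + 1] = A[i + 1, i] = max(sim, 0.0)
A += np.eye(6)                                    # self-loops

# Symmetric normalization D^{-1/2} A D^{-1/2}, then H = ReLU(A_hat X W).
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt
W = rng.normal(size=(32, 16))
H = np.maximum(A_hat @ X @ W, 0.0)                # updated paragraph representations
print(H.shape)                                    # (6, 16)
```

Stacking such layers lets information from higher-order neighborhoods reach each paragraph node, which is what supports multi-granularity topic features.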
Carrier-based aircraft on the carrier deck are dense and occluded, making carrier-based aircraft targets difficult to detect, and the detection effect is easily affected by lighting conditions and target size. Therefore, an improved Faster R-CNN (Faster Region-based Convolutional Neural Network) carrier-based aircraft target detection method was proposed. In this method, a loss function with a repulsion loss strategy was designed and combined with multi-scale training, and images collected under laboratory conditions were used to train and test the deep convolutional neural network. Test experiments show that, compared with the original Faster R-CNN detection model, the improved model has a better detection effect on occluded aircraft targets, with recall increased by 7 percentage points and precision increased by 6 percentage points. The experimental results show that the proposed method can automatically and comprehensively extract the characteristics of carrier-based aircraft targets, solves the detection problem of occluded carrier-based aircraft targets, achieves detection accuracy and speed that meet practical needs, and has strong adaptability and robustness under different lighting conditions and target sizes.
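For reference, the repulsion loss strategy that the improved loss function builds on (Wang et al., 2018) includes a RepGT term that pushes a predicted box away from non-target ground truths through a Smooth-ln penalty over the Intersection over Ground-truth (IoG); the sketch below shows that term only, with the box format and sigma as assumptions and batching over proposals omitted.

```python
import numpy as np

def iog(pred, gt):
    """Intersection over the ground-truth box area; boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / gt_area

def smooth_ln(x, sigma=0.5):
    """Smooth-ln penalty: -ln(1-x) up to sigma, linear continuation above it."""
    if x <= sigma:
        return -np.log(1 - x)
    return (x - sigma) / (1 - sigma) - np.log(1 - sigma)

# A prediction overlapping a neighboring (non-target) aircraft is penalized,
# discouraging boxes from drifting onto occluding neighbors in dense scenes.
pred = (10, 10, 50, 50)
neighbor_gt = (40, 10, 80, 50)
print(smooth_ln(iog(pred, neighbor_gt)))   # -ln(0.75), the repulsion penalty
```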