Search Result

Pareto distribution based processing approach of deceptive behaviors of crowdsourcing workers

PAN Qingxian, JIANG Shan, DONG Hongbin, WANG Yingjie, PAN Tingwei, YIN Zengxuan

Journal of Computer Applications 2019, 39 (11): 3191-3197. DOI: 10.11772/j.issn.1001-9081.2019051067

Because crowdsourcing is loosely organized, crowdsourcing workers may behave deceptively while completing tasks. Identifying such deceptive behaviors and reducing their impact, and thereby ensuring the quality of completed crowdsourcing tasks, has become a research hotspot in the field. Based on evaluation and analysis of task results, a Weight Setting Algorithm Based on Generalized Pareto Distribution (GPD) (WSABG) was proposed for unified-type deceptive behaviors of crowdsourcing workers. In the algorithm, maximum likelihood estimation was performed for the GPD, and the bisection method was used to approximate the zero point of the likelihood function in order to calculate the scale parameter σ and the shape parameter ε. A new weight formula was defined, and each worker was given an absolute influence weight according to the feedback data that the worker produced while completing the current task; finally, a GPD-based weight setting framework for crowdsourcing workers was designed. The proposed algorithm addresses the problem that task result data differ only slightly and tend to cluster at the two extremes. With Yantai University students' teaching evaluation data as the experimental dataset, and with the concept of an interval transfer matrix introduced, the effectiveness and superiority of the WSABG algorithm were demonstrated.
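The estimation step in this abstract (maximum likelihood for the GPD, with bisection used to locate a zero) can be illustrated with a minimal Python sketch. It is not the WSABG implementation. It assumes the standard reduction of the two GPD score equations to a single equation in theta = shape / sigma, and it assumes heavy-tailed data (shape > 0) so that a sign change can be bracketed. The names fit_gpd_bisection and sample_gpd are hypothetical.

```python
import math
import random


def sample_gpd(n, sigma, shape, rng):
    """Draw n GPD samples (threshold 0) by inverting the CDF; for demo data only."""
    return [sigma / shape * ((1.0 - rng.random()) ** (-shape) - 1.0)
            for _ in range(n)]


def fit_gpd_bisection(xs, tol=1e-10, max_iter=200):
    """Estimate (sigma, shape) of a GPD by maximum likelihood.

    Assumption: the score equations are reduced to one equation in
    theta = shape / sigma, whose positive zero is found by bisection.
    """
    n = len(xs)
    mean = sum(xs) / n

    def shape_of(theta):
        # For a fixed theta, the likelihood-maximizing shape has a closed form.
        return sum(math.log1p(theta * x) for x in xs) / n

    def score(theta):
        # Same sign as the derivative of the profile log-likelihood (theta > 0).
        u = sum(1.0 / (1.0 + theta * x) for x in xs) / n
        return (1.0 + shape_of(theta)) * u - 1.0

    lo = 1e-3 / mean
    if score(lo) <= 0.0:
        raise ValueError("no sign change found; data may not be heavy-tailed")
    hi = lo
    for _ in range(200):          # expand until the score turns negative
        if score(hi) <= 0.0:
            break
        hi *= 2.0
    else:
        raise ValueError("could not bracket a zero of the score")

    for _ in range(max_iter):     # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    theta = 0.5 * (lo + hi)
    shape = shape_of(theta)
    return shape / theta, shape   # (sigma, shape)


rng = random.Random(0)
data = sample_gpd(5000, sigma=2.0, shape=0.3, rng=rng)
sigma_hat, shape_hat = fit_gpd_bisection(data)
print(sigma_hat, shape_hat)
```

Bisection only needs the sign of the score at each step, which keeps the sketch free of derivative code; the price is that the bracket must exclude the trivial zero at theta = 0, hence the small positive lower bound.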

Text sentiment analysis based on feature fusion of convolution neural network and bidirectional long short-term memory network

LI Yang, DONG Hongbin

Journal of Computer Applications 2018, 38 (11): 3075-3080. DOI: 10.11772/j.issn.1001-9081.2018041289

Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) are widely used in natural language processing. However, natural language has structural dependencies: relying only on CNN for text classification ignores the contextual meaning of words, while the traditional RNN suffers from gradient vanishing or gradient dispersion, which limits the accuracy of text classification. A feature fusion model combining CNN and Bidirectional Long Short-Term Memory (BiLSTM) was presented. Local features of the text were extracted by the CNN, and global, context-related features were extracted by the BiLSTM network. The features extracted by the two complementary models were merged to solve the problem that a single CNN model ignores the contextual semantic and grammatical information of words, and the fusion model also effectively avoided the gradient vanishing or gradient dispersion problem of the traditional RNN. The experimental results on two kinds of datasets show that the proposed feature fusion model can effectively improve the accuracy of text classification.

Super pixel segmentation algorithm based on Hadoop

WANG Chunbo, DONG Hongbin, YIN Guisheng, LIU Wenjie

Journal of Computer Applications 2016, 36 (11): 2985-2992. DOI: 10.11772/j.issn.1001-9081.2016.11.2985

In view of the high time complexity of pixel-level segmentation, a super pixel segmentation algorithm was proposed for high-resolution images. Super pixels, instead of the original pixels, were used as the processing elements of segmentation, and the characteristics of Hadoop and of super pixels were combined. Firstly, a static and dynamic adaptive algorithm for multiple tasks was proposed, which could reduce the coupling between the blocks in HDFS (Hadoop Distributed File System) and task arrangement. Secondly, based on the distance and gradient constraints on the super pixels formed at the boundaries of super pixel blocks, a parallel watershed segmentation algorithm was proposed for each Map node task. Meanwhile, two merging strategies for super pixel blocks in the Shuffle process were proposed and compared. Finally, the combination of super pixels was optimized to complete the final segmentation in the Reduce node task. The experimental results show that the proposed algorithm is superior to the Simple Linear Iterative Clustering (SLIC) algorithm and the Normalized cut (Ncut) algorithm in Boundary Recall ratio (BR) and Under-segmentation Error (UE), and that the segmentation time for high-resolution images is remarkably decreased.

Testing data generation method based on fireworks explosion optimization algorithm

DING Rui, DONG Hongbin, FENG Xianbin, ZHAO Jiahua

Journal of Computer Applications 2016, 36 (10): 2816-2821. DOI: 10.11772/j.issn.1001-9081.2016.10.2816

Aiming at the problem of path coverage test data generation, a new test data generation method based on an improved Fireworks Explosion Optimization (FEO) algorithm was proposed. First, the key-point path method was used to represent the program paths, and the hard-covered paths were defined by the theoretical paths, easy-covered paths and infeasible paths; the easy-covered paths adjacent to the hard-covered paths and their testing data were recorded and used as part of the initial fireworks to improve convergence speed, and the remaining initial fireworks were created randomly. Then, according to the individuals' fitness values, an adaptive blast radius was designed to improve the convergence rate, and the idea of boundary value testing was introduced to modify the border-crossing sparks. Compared with seven other optimization algorithms that generate testing data, including fireworks explosion optimization with adaptive radius and heuristic information (NFEO), FEO, F-method, NF-method, etc., the simulation results show that the proposed algorithm has lower time complexity at the computational level and better convergence performance.
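The two mechanisms named in this abstract, a fitness-dependent blast radius and the repair of border-crossing sparks, can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's formulas: the radius rule is the amplitude rule of the classic fireworks algorithm (lower fitness is better, and better fireworks get a smaller radius), and the repair simply moves an out-of-range coordinate to the nearest boundary value. All function names are hypothetical.

```python
import random


def adaptive_radii(fitness, max_radius, eps=1e-12):
    """Split max_radius across fireworks; better (lower) fitness -> smaller radius.

    Assumption: classic fireworks-algorithm amplitude rule, used here only
    as a stand-in for the paper's adaptive blast radius.
    """
    best = min(fitness)
    total = sum(f - best for f in fitness) + eps
    return [max_radius * (f - best + eps) / total for f in fitness]


def clamp_to_bounds(point, lower, upper):
    """Move each out-of-range coordinate onto the nearest boundary value."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(point, lower, upper)]


def explode(firework, radius, n_sparks, lower, upper, rng):
    """Generate sparks around one firework and repair border-crossing ones."""
    sparks = []
    for _ in range(n_sparks):
        raw = [v + rng.uniform(-radius, radius) for v in firework]
        sparks.append(clamp_to_bounds(raw, lower, upper))
    return sparks


rng = random.Random(1)
lower, upper = [0, 0], [100, 100]
radii = adaptive_radii([0.0, 1.0, 3.0], max_radius=8.0)
sparks = explode([99.0, 1.0], radii[2], 5, lower, upper, rng)
print(radii)
print(sparks)
```

Clamping places repaired sparks exactly on the input-domain boundary, which is where boundary value testing expects faults to concentrate; a random re-draw inside the domain would be the usual alternative.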
