Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (11): 3191-3197. DOI: 10.11772/j.issn.1001-9081.2019051067

• The 2019 CCF Conference on Artificial Intelligence (CCFAI2019)

Pareto distribution based processing approach of deceptive behaviors of crowdsourcing workers

PAN Qingxian1,2, JIANG Shan2, DONG Hongbin1, WANG Yingjie2, PAN Tingwei2, YIN Zengxuan2   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin Heilongjiang 150001, China;
    2. College of Computer and Control Engineering, Yantai University, Yantai Shandong 264005, China
  • Received: 2019-05-24 Revised: 2019-07-24 Online: 2019-09-11 Published: 2019-11-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (60903098, 61502140, 61572418).

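  • Corresponding author: JIANG Shan
  • About the authors: PAN Qingxian (born 1979), male, from Wucheng, Shandong; associate professor, Ph.D. candidate, CCF member; research interests: artificial intelligence, crowd sensing, crowdsourcing. JIANG Shan (born 1994), female, from Weifang, Shandong; M.S. candidate; research interest: crowdsourcing. DONG Hongbin (born 1963), male, from Harbin, Heilongjiang; professor, Ph.D., CCF member; research interests: artificial intelligence, machine learning, multi-agent systems. WANG Yingjie (born 1986), female, from Dehui, Jilin; associate professor, Ph.D., CCF member; research interest: spatiotemporal crowdsourcing. PAN Tingwei (born 1992), male, from Linyi, Shandong; M.S. candidate; research interest: crowdsourcing. YIN Zengxuan (born 1995), male, from Qingdao, Shandong; M.S. candidate; research interest: crowdsourcing.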

Abstract: Because crowdsourcing is loosely organized, crowdsourcing workers may behave deceptively while completing tasks. Identifying such deceptive behaviors and reducing their impact, so as to ensure the quality of completed crowdsourcing tasks, has become a research hotspot in the crowdsourcing field. Based on evaluation and analysis of task results, a Weight Setting Algorithm Based on Generalized Pareto Distribution (GPD) (WSABG) was proposed to address the unified type deceptive behaviors of crowdsourcing workers. The algorithm performs maximum likelihood estimation of the GPD and uses the bisection method to approximate the zero of the likelihood function, thereby computing the scale parameter σ and the shape parameter ε. A new weight formula was defined, each worker was assigned an absolute influence weight according to the feedback data that the worker produced in completing the current task, and finally a GPD-based framework for setting crowdsourcing worker weights was designed. The proposed algorithm can handle task result data that differ little from one another and tend to concentrate at the two extremes. With Yantai University students' teaching evaluation data as the experimental dataset, and with the concept of the interval transfer matrix introduced, the effectiveness and superiority of the WSABG algorithm were demonstrated.
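
The parameter estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common GPD parameterization F(x) = 1 - (1 + εx/σ)^(-1/ε) with ε > 0, and it uses a standard reduction of the two likelihood equations to a single equation in θ = ε/σ, whose zero is then located by bisection. The function names `gpd_score`, `bisect_root` and `fit_gpd`, and the bracket chosen in `fit_gpd`, are hypothetical choices made for this illustration. The weight formula itself is not given in the abstract and is therefore not sketched.

```python
import math


def gpd_score(theta, xs):
    """Profile-likelihood score in theta = shape / scale; it is zero at the MLE.

    Assumes the parameterization F(x) = 1 - (1 + shape * x / scale) ** (-1 / shape).
    """
    n = len(xs)
    shape = sum(math.log1p(theta * x) for x in xs) / n   # shape implied by theta
    u = sum(1.0 / (1.0 + theta * x) for x in xs) / n
    return (1.0 + shape) * u - 1.0


def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection: f(lo) and f(hi) must not have the same sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)


def fit_gpd(xs):
    """Return (scale, shape) for positive, heavy-tailed data.

    The bracket below is an assumption of this sketch: it excludes theta = 0,
    which is always a trivial zero of the score, and it only covers shape > 0.
    """
    mean_x = sum(xs) / len(xs)
    lo, hi = 1e-3 / mean_x, 1e3 / mean_x
    theta = bisect_root(lambda t: gpd_score(t, xs), lo, hi)
    shape = sum(math.log1p(theta * x) for x in xs) / len(xs)
    scale = shape / theta
    return scale, shape
```

Calling `fit_gpd(xs)` on a list of positive excesses returns the pair (σ, ε). If the score does not change sign over the assumed bracket, for example when the data are not heavy-tailed, `bisect_root` raises `ValueError` instead of returning a misleading estimate.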

Key words: crowdsourcing, quality control, generalized Pareto distribution, unified type deception, weight

CLC Number: