| -rw-r--r-- | ecai-2016-0657.txt | 78 |
1 files changed, 78 insertions, 0 deletions
diff --git a/ecai-2016-0657.txt b/ecai-2016-0657.txt
new file mode 100644
index 0000000..55fd273
--- /dev/null
+++ b/ecai-2016-0657.txt
@@ -0,0 +1,78 @@
+Significance: 5
+Originality: 4
+Relevance: 7
+Technical quality: 3
+Presentation quality: 7
+Scholarship:
+Confidence: 6
+Overall Recommendation: 5
+
+
+Comment to the authors:
+
+Summary of the paper
+====================
+
+This paper studies the problem of aggregating distributions from workers. For
+example, an experimenter wants to use crowdsourcing to categorize a set of
+documents: each document's topic is represented as a distribution over
+J topics which the experimenter wishes to learn.
+
+The authors note that previous approaches:
+
+* either do not take into account the inherent quality and/or bias of the
+different workers (for example LinOp, LogOp)
+
+* or do not aggregate distributions over topics but only elicit a single category
+from the workers for the documents they are assigned to.
+
+The model proposed by the authors aims to achieve the best of both worlds: the
+bias/quality of each worker is captured through a confusion matrix which is
+a JxJ matrix whose column i is a distribution over the J categories expressing
+how the worker might miscategorise category i. The distribution elicited from
+the worker is then simply the vector-matrix product between the ground truth
+distribution of the document and the confusion matrix of the worker.
+
+A standard Bayesian inference approach is then applied: the posterior
+predictive distribution of the joint distribution of (document topics, worker
+confusion matrices) is estimated via the EP algorithm.
+
+Experimental results confirm the validity of the approach. Workers' biases are
+simulated by introducing synthetic "spammers" in the dataset. Experiments show
+that this approach compares favorably to prior approaches.
+
+Comments
+========
+
+The problem studied is very interesting and arguably one of the central
+problems in crowdsourcing: even though the paper is presented through the lens
+of document categorization, the abstract task and model used could capture many
+other applications. It is also clear that trying to model the bias of workers
+is crucial for most, if not all, crowdsourcing applications.
+
+The proposed model addresses the shortcomings of previous models that are
+clearly identified by the authors. The proposed model can be seen as a natural
+extension of them. A possible concern is the one of practicality: the success
+of crowdsourcing platforms relies on being able to offer simple tasks to
+workers. In that respect, asking the user to report a single category for each
+document seems much simpler than asking them to report a distribution over
+topics. This problem is mentioned in the conclusion of the paper: it seems that
+the gain in expressivity could be outweighed by the additional noise introduced
+by asking the workers to perform a more complex task.
+
+The paper is well and very clearly written overall. I only found one thing
+confusing: it is not clear from the formal description of the Bayesian model,
+and in particular Figure 1, that what is elicited from the workers is
+a distribution over topics: the way it is written c_{i,n} is a categorical
+variable. The distribution of c_{i,n} corresponds to the product \Lambda_i \Pi
+introduced in the previous page, but then it is not clear if what is being
+reported by the workers is c_{i,n} itself or its distribution. I think this
+should be clarified in future versions of this paper.
+
+The experiments are sound and establish a clear (favorable) comparison between
+the proposed approach and prior work. One reason for concern here is the
+problem of over-fitting: it seems that the model proposed in this paper has more
+parameters than any of the previously suggested models. Given the experimental
+setting, it is not clear to me how this effect should be quantified, but it
+would be interesting to see an experiment discussing the trade-off between
+model complexity and generalization error.
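
Illustration (not part of the commit or of the reviewed paper): the review's
summary describes each worker through a JxJ confusion matrix whose column i is
a distribution over categories, with the reported distribution obtained as the
product of that matrix and the document's true topic distribution. The sketch
below shows only that forward model in plain Python. The function name
`report`, the three example workers, and all numbers are assumptions made up
for illustration; the inference step (EP) is not shown.

```python
def report(confusion, truth):
    """Distribution a worker would report under the confusion-matrix model.

    confusion[j][i] is the probability that the worker labels true
    category i as category j, so every column of the matrix sums to 1.
    truth is the document's true distribution over the J categories.
    """
    J = len(truth)
    return [sum(confusion[j][i] * truth[i] for i in range(J)) for j in range(J)]


# Hypothetical ground-truth topic distribution for one document (J = 3).
truth = [0.7, 0.2, 0.1]

# Hypothetical workers (illustrative values only).
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]             # perfectly reliable worker
spammer = [[1 / 3] * 3 for _ in range(3)]  # ignores the document entirely
biased = [[0.5, 0.0, 0.0],
          [0.5, 1.0, 0.0],
          [0.0, 0.0, 1.0]]               # sends half of category 0 to category 1

reliable_report = report(identity, truth)
spammer_report = report(spammer, truth)
biased_report = report(biased, truth)

for name, r in [("reliable", reliable_report),
                ("spammer", spammer_report),
                ("biased", biased_report)]:
    print(name, [round(x, 3) for x in r])
```

Under these assumptions the reliable worker reproduces the true distribution,
the spammer reports a uniform distribution whatever the document, and the
biased worker shifts mass from category 0 to category 1. A plain average of
reports would not correct for the last two cases, which is the shortcoming the
review attributes to approaches that ignore worker quality and bias.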
