CrowdForge: Crowdsourcing Complex Work

Aniket Kittur, Boris Smus, Susheel Khamkar, and Robert E. Kraut. CrowdForge: Crowdsourcing complex work. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology (UIST 2011), October 16-19, 2011, Santa Barbara, California, USA.

Discussion Leader: Shiwani Dewal

Summary

CrowdForge is a framework that enables the creation and completion of complex, interdependent tasks using crowd workers. At the time the paper was written (and even today), platforms like Amazon Mechanical Turk gave requesters access to micro-workers who complete simple, independent tasks requiring little or no cognitive effort. Complex tasks, by contrast, traditionally demand more coordination, time, and cognitive effort, especially from the person managing or overseeing the work. These challenges become even more acute when crowd workers are involved.

To address this issue, the authors present their framework, CrowdForge, along with case studies carried out through a web-based prototype. The CrowdForge framework is modeled on MapReduce from distributed computing and consists of three steps: partition, map, and reduce. The partitioning step breaks a higher-level task into single units of work. The mapping step assigns those units of work to workers; the same unit may be assigned to several workers to allow for improvement and quality control. The final step is reduction, in which the units of work are combined into a single output, which is essentially the solution to the higher-level task.
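To make the flow concrete, here is a minimal sketch of the partition/map/reduce pattern applied to crowdsourcing an article. The post_hit() helper is hypothetical, standing in for whatever call a requester would use to post a HIT to a crowd market and collect worker responses; it is not part of CrowdForge's actual implementation.

```python
def post_hit(prompt, n_workers=1):
    """Hypothetical helper: post a HIT and return a list of worker responses."""
    raise NotImplementedError("replace with calls to a real crowd platform")

def partition(topic):
    # Partition: ask workers to break the high-level task into units of work,
    # here an article outline whose sections can be written independently.
    outline = post_hit(f"Propose an outline of sections for an article on {topic}")
    return outline[0].splitlines()

def map_step(topic, sections, n_workers=3):
    # Map: assign each unit of work to workers; the same unit may go to
    # several workers to allow for improvement and quality control.
    return {s: post_hit(f"Write a paragraph on '{s}' for an article on {topic}",
                        n_workers=n_workers)
            for s in sections}

def reduce_step(topic, drafts):
    # Reduce: combine the units of work into a single output, e.g. by asking
    # a worker to merge the alternative paragraphs for each section.
    article = []
    for section, candidates in drafts.items():
        merged = post_hit(f"Merge these paragraphs about '{section}' into one:\n"
                          + "\n---\n".join(candidates))
        article.append(merged[0])
    return "\n\n".join(article)

def crowdforge_article(topic):
    sections = partition(topic)
    drafts = map_step(topic, sections)
    return reduce_step(topic, drafts)
```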

The framework was tested through several case studies. The first involved writing a Wikipedia article about New York City. Surprisingly, the articles produced by groups of workers across HITs were rated, on average, as highly as the Simple English Wikipedia article on New York City and higher than full articles written by individuals as part of a higher-paying HIT. Quality control was tested both through voting and through further map and reduce steps that merged alternative results; merging proved more effective. The second case study involved collating information for researching purchase decisions, though the authors do not report the quality of the resulting information. The last case study dealt with the more complex flow of turning an academic paper into a newspaper article for the general public. The paper discusses the steps used to generate news leads (the hook for the article) and a summary of the researchers' work, as well as the quality of the resulting output.
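The two quality-control strategies compared in the first case study can be sketched as alternative reduce steps. The functions below reuse the hypothetical post_hit() helper from the earlier sketch and are only an illustration of the idea, not the paper's implementation.

```python
from collections import Counter

def reduce_by_voting(section, candidates, n_voters=5):
    # Voting: workers pick the best candidate; the plurality winner is kept.
    prompt = (f"Which paragraph about '{section}' is best? Reply with its number.\n"
              + "\n".join(f"{i}: {c}" for i, c in enumerate(candidates)))
    votes = post_hit(prompt, n_workers=n_voters)
    winner = Counter(votes).most_common(1)[0][0]
    return candidates[int(winner)]

def reduce_by_merging(section, candidates):
    # Merging: a further map/reduce step asks a worker to combine the
    # candidates into a single, improved paragraph -- the strategy the
    # Wikipedia-article case study found more effective than voting.
    merged = post_hit(f"Combine these paragraphs about '{section}' into one "
                      f"coherent paragraph:\n" + "\n---\n".join(candidates))
    return merged[0]
```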

The CrowdForge approach looked very promising, as the case studies demonstrate. It also had a few limitations: it did not support iterative flows, it assumed that a task can in fact be broken down into single units of work, and results could overlap because workers do not communicate with one another. The authors concluded by encouraging researchers and task designers to consider crowdsourcing for complex tasks and to push the limits of what they could accomplish through this market.

Reflections

The authors have identified an interesting gap in the crowdsourcing market: the ability to get complex tasks completed. Although requesters may well have broken their tasks down into HITs in the past and handled the combining of results on their own, CrowdForge's partition-map-reduce framework seems like it could alleviate that burden and streamline the process, at least to some extent.

I like the way the partition-map-reduce framework is conceptualized. It seems fairly intuitive and appears to have worked well for the case studies. I am a little surprised (and maybe skeptical?) that the authors did not include the results of the second case study or more details for the rest of the third case study.

The other aspect I really liked about the paper was the effort to identify and test alternative or creative ways to solve common crowdsourcing problems. For example, the authors proposed using further map and reduce steps, in the form of merging, as an alternative to voting on solutions. They also introduced the consolidate and exemplar patterns for the academic paper case study, to cope with the paper's complexity and the amount of effort workers were expected to put in.

The paper mentions in its section on limitations that some tasks cannot be decomposed, and that for these a different market with skilled or motivated workers should be considered. This brings me back to the notion that perhaps crowdsourcing in the future will look more like sourcing a particular skill set, a kind of skill-based consulting.

In conclusion, I think that the work presented in the paper looks very promising, and it would be quite interesting to see the framework being applied to other use-cases.

Discussion

1. The paper mentions that using further map and reduce steps to increase the quality of the output, as opposed to voting, generated better results. Why do you think that happened?

2. There may be tasks which are too complex to be decomposed, or decomposed tasks which require a particular skill set. Some crowdsourcing platforms address this by maintaining an "Elite Taskforce". Do you think this goes against the principles of crowdsourcing, namely that a task should ideally be available to every crowd worker, or is skill-based crowdsourcing essential?

3. CrowdForge breaks tasks up, whereas TurKit allowed iterative workflows, and the authors talk about their vision of merging the two approaches. What do you think would be some key benefits of such a merged approach?

4. The authors advocate for pushing the envelope when it comes to the kind of tasks which can be crowd sourced. Thoughts?
