01/29/20 – Vikram Mohanty – Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms.

Paper Authors: Donna Vakharia and Matthew Lease.

Summary

This paper gives a general overview of different crowdsourcing platforms and their key feature offerings, while centering on the limitations of Amazon Mechanical Turk (AMT), the most popular among them. The factors that make requesters resort to AMT are briefly discussed, but the paper points out that these factors are not exclusive to AMT. Other platforms offer most of these advantages while offsetting some of AMT's limitations, such as weak quality control, the lack of automated task routing, and missing worker analytics. The authors qualitatively assess these platforms by comparing and contrasting them against key criteria categories. By providing exposure to lesser-known crowdsourcing platforms, the paper hopes to mitigate one plausible consequence of researchers' over-reliance on AMT: the platform's limitations can subconsciously shape research questions and directions.

Reflection

  1. Having designed and posted a lot of tasks (or HITs) on AMT, I concur with the paper's assessment of AMT's limitations, especially the lack of built-in gold-standard tests and the missing support for complex tasks, task routing, and real-time work. These limitations are essentially offloaded onto the researcher, whose time, effort, and creativity are consumed by working around them instead of more pressing work. (A minimal sketch of the kind of gold-standard check requesters end up scripting themselves follows this list.)
  2. This paper provides nice exposure to platforms that offer specialized and complex task support (e.g., CrowdSource supporting writing and text-creation tasks). As platforms expand their support for different complex tasks, this would (a) reduce the workload on requesters for designing tasks, and (b) reduce the quality-control tensions arising from poor task design.
  3. Real-time crowd work, despite being an essential research commodity, still remains a challenge for crowdsourcing platforms. This gap has produced toolkits like LegionTools [1], which facilitate real-time recruiting and routing of crowd workers on AMT, but such toolkits are not the final solution. Many real-time crowd-powered systems have been built with this toolkit, yet they remain bottlenecked by its limitations, which may arise from the lack of resources for maintaining and updating software that originated as a student-developed research project. Crowd platforms adopting such research toolkits into their own workflows may solve some of these problems.
  4. Sometimes, projects or new interfaces may require testing the learning curve of their users. That does not seem straightforward to achieve on AMT, since it lacks support for maintaining a trusted worker pool. However, it seems possible on other platforms such as ClickWorker and oDesk, which support worker profiles and identities.
  5. A newer platform, Prolific, launched publicly in 2019 and alleviates some of AMT's shortcomings, such as fair-pay assurance (a minimum of $6.50/hour), worker task recommendation based on experience, initial filters, and quality-control assurance. The platform also provides functionality for longitudinal/multi-part studies, which are difficult to achieve with what AMT offers out of the box. Support for longitudinal studies was not discussed for the other platforms, either.
  6. The paper was published in 2015 and highlighted the lack of automated tools. Since then, numerous services offering human-in-the-loop functionality have emerged, including from Amazon and Figure Eight (formerly CrowdFlower).
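
To make the point in Reflection 1 concrete, here is a minimal Python sketch of the kind of gold-standard filter a requester typically has to script on their own, since AMT offers no built-in support for it. The data structures, question IDs, and threshold are all illustrative assumptions, not part of any platform's API: the idea is simply to plant a few questions with known answers and flag workers whose accuracy on them falls below a cutoff.

    # Hypothetical gold-standard filter: flag workers whose accuracy on
    # planted gold questions falls below a threshold. Data structures are
    # illustrative; nothing here is an AMT API call.
    from collections import defaultdict

    GOLD_ANSWERS = {            # question_id -> known correct label
        "q_gold_01": "cat",
        "q_gold_02": "dog",
    }

    def score_workers(responses, threshold=0.8):
        """responses: list of dicts like
        {"worker_id": "W1", "question_id": "q_gold_01", "answer": "cat"}."""
        correct = defaultdict(int)
        attempted = defaultdict(int)
        for r in responses:
            qid = r["question_id"]
            if qid not in GOLD_ANSWERS:
                continue                      # ignore non-gold questions
            attempted[r["worker_id"]] += 1
            if r["answer"] == GOLD_ANSWERS[qid]:
                correct[r["worker_id"]] += 1
        return {w for w in attempted
                if correct[w] / attempted[w] < threshold}

    # Example: worker W2 misses both gold questions and gets flagged.
    flagged = score_workers([
        {"worker_id": "W1", "question_id": "q_gold_01", "answer": "cat"},
        {"worker_id": "W2", "question_id": "q_gold_01", "answer": "dog"},
        {"worker_id": "W2", "question_id": "q_gold_02", "answer": "cat"},
    ])
    print(flagged)  # {'W2'}

Even a check this simple takes requester time to write, test, and re-run for every study, which is exactly the overhead a platform with built-in gold-standard tests would absorb.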

Questions

  1. The authors raise an important point that the most popularly used platform's limitations can shape research questions and directions. If you were to use AMT for your research, can you think of how its shortcomings would affect your RQs and research directions? What would the ideal platform feature be for you?
  2. The paper advocates algorithms for task recommendation and routing, as has been pointed out in other papers [2]. What are some other deficiencies that could be addressed with algorithms (reputation, quality control, maybe)? A toy routing sketch follows this list.
  3. If you had a magic tool to build a crowdsourcing platform to support your research, along with bringing in a crowd workforce, what would your platform look like (the minimum viable product)? And who is your ideal crowd? Why would these features help your research?
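
As a toy illustration of the task-routing idea in Question 2, the Python sketch below greedily routes each incoming task to the available worker with the highest observed accuracy on that task type. Everything here (worker records, accuracy scores, the default prior) is a made-up assumption, not a feature of any existing platform; a real recommender would use richer signals such as skills, work history, and fairness constraints.

    # Hypothetical greedy task router: send each task to the available
    # worker with the best observed accuracy on that task's type.
    # Purely illustrative; not an API of any real crowdsourcing platform.

    def route_task(task_type, workers):
        """workers: dict mapping worker_id -> {"available": bool,
        "accuracy": {task_type: float}}. Returns the chosen worker_id,
        or None if nobody is available."""
        candidates = [
            (info["accuracy"].get(task_type, 0.5), wid)  # 0.5 = unknown prior
            for wid, info in workers.items()
            if info["available"]
        ]
        if not candidates:
            return None
        return max(candidates)[1]

    workers = {
        "W1": {"available": True,  "accuracy": {"image_labeling": 0.92}},
        "W2": {"available": True,  "accuracy": {"transcription": 0.97}},
        "W3": {"available": False, "accuracy": {"image_labeling": 0.99}},
    }
    print(route_task("image_labeling", workers))  # W1 (W3 is unavailable)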

Vikram Mohanty

I am a 3rd year PhD student in the Department of Computer Science at Virginia Tech. I work at the Crowd Intelligence Lab, where I am advised by Dr. Kurt Luther. My research focuses on developing novel tools that leverage the complementary strengths of Artificial Intelligence (AI) and collective human intelligence for solving complex, open-ended problems.
