Paper: Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research
Author: Jennifer Wortman Vaughan
Summary:
This is a survey paper that provides an overview of crowdsourcing research as it applies to the machine learning community. It first introduces the major crowdsourcing platforms, then analyzes how crowdsourcing has been used in ML research: generating data, evaluating and debugging models, building hybrid intelligent systems, and running behavioral experiments. The paper then reviews crowdsourcing literature that studies the behavior of the crowd, their motivations, and ways to improve work quality; in particular, it examines dishonest worker behavior, ethical payment for crowd work, and the communication and collaboration patterns of crowd workers. The paper concludes with a set of best practices for making effective use of crowdsourcing in machine learning research.
Reflection:
Overall, the paper provides a thorough and analytic overview of the applications of crowdsourcing in machine learning research, along with useful best practices that help machine learning researchers make better use of crowdsourcing.
The paper largely focuses on ways crowdsourcing has been used to advance machine learning research, but it also quietly shows how machine learning can advance crowdsourcing research. This is interesting because it points to how interrelated and co-dependent the two fields are. For example, in the Galaxy Zoo project, researchers attempted to optimize crowd effort, so that fewer judgements were necessary per image and more images could be annotated overall. Other interesting uses of crowdsourcing include evaluating unsupervised models and assessing model interpretability.
On the other hand, I wonder what a paper that was more focused on HCI research would look like. In this paper, humans are placed “in the loop,” while in HCI (and the real world) it’s often the machine that is in the loop of a human’s workflow. For example, the paper states that hybrid intelligent systems “leverage the complementary strengths of humans and machines to expand the capabilities of AI.” A more human-centered version would be “[…] to expand the capabilities of humans.”
Another interesting point is that all the hybrid intelligent systems mentioned in Section 4 had their own metrics to assess human, machine, and human+machine performance. This speaks to the need for a common language for understanding and assessing human-computer collaboration, which is described in more detail in [1]. Perhaps it is the unique, highly contextual nature of the work that prevents more standard comparisons across hybrid intelligent systems. Indeed, the paper hints at this when discussing the evaluation and debugging of models, noting that “there is no objective notion of ground truth.”
Lastly, the paper discusses two topics relevant to this semester. The first is algorithmic aversion: participants who were given more control in algorithmic decision-making systems were more accurate, not because the human judgements were more accurate, but because the humans were more willing to follow the algorithm’s recommendations. I wonder whether this holds in all contexts, and how best to incorporate this finding into mixed-initiative user interfaces. The second is that the quality of crowd work naturally varied with payment; however, very high wages increased the quantity of work but not always its quality. Combined with the varied motivations that workers have, it is not always clear how much to pay for a given task, which makes pilot studies necessary (something this paper also strongly recommends). However, even if it is not stated explicitly, one thing is certain: we must pay fair wages for fair work [2].
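As an aside on the payment question: although the paper does not give a formula, the usual back-of-the-envelope approach is to time a few pilot workers and set the per-task payment so that the typical completion time works out to a fair hourly wage. The sketch below is my own illustration of that arithmetic; the $15/hour target and the pilot times are hypothetical placeholders, not values from the paper or from [2].

```python
import math
from statistics import median

def per_task_payment(pilot_minutes, target_hourly_wage=15.00):
    """Estimate a per-task payment (in dollars) from pilot completion times.

    Uses the median pilot completion time so a few unusually slow (or fast)
    pilot workers don't skew the estimate, then rounds up to the next cent.
    """
    typical_minutes = median(pilot_minutes)
    payment = target_hourly_wage * typical_minutes / 60.0  # pro-rate the hourly wage
    return math.ceil(payment * 100) / 100                  # never round a wage down

# Hypothetical pilot: five workers took between 3.5 and 6 minutes per task.
print(per_task_payment([3.5, 4.0, 4.5, 5.0, 6.0]))  # 1.13, i.e. ~$1.13 per task at $15/hour
```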
Questions:
- What are some new best practices that you learned about crowdsourcing? How do you plan to apply them in your project?
- How might you use crowdsourcing to advance your own research, even if it isn’t in machine learning?
- Plenty of jobs are seemingly menial, e.g., assembly work in factories, staffing a call center, or delivering mail, yet there has been little effort to make these jobs more “meaningful” and motivating so that people are more willing to do them. Why do you think there is such a large body of work on making crowd work, in particular, more intrinsically motivating?
- Imagine you are doing crowd work for a living: would you prefer to be paid more for a boring task, or paid less for a task masquerading as a fun game?
- How much do you plan to pay crowd workers for your project? Additional reference: [2].
- ML systems abstract away the human labor that goes into making them work, especially in how they are portrayed in the popular press. How might we highlight the invaluable role humans play in ML systems? By “humans,” I mean the developers, the crowd workers, the end users, etc.
References:
[1] R. Jordon Crouser and Remco Chang. 2012. An Affordance-Based Framework for Human Computation and Human-Computer Collaboration. IEEE Transactions on Visualization and Computer Graphics 18, 12: 2859–2868. https://doi.org/10.1109/TVCG.2012.195
[2] Mark E. Whiting, Grant Hugh, and Michael S. Bernstein. 2019. Fair Work: Crowd Work Minimum Wage with One Line of Code. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7, 1: 197–206.