Summary
The survey provides an overview of machine learning projects that use crowdsourcing. The author focuses on four application areas where crowdsourcing can be used in machine learning research: data generation, model evaluation and debugging, hybrid intelligence systems, and behavioral studies to inform ML research. She argues that crowdsourced studies of human behavior can be valuable for understanding how end users interact with machine learning systems, and that these studies are also useful for understanding crowdworkers themselves. She explains that understanding crowdworkers helps in defining best-practice recommendations for working with the crowd. The case studies that she presents show how to run a crowdwork study effectively and how to provide additional sources of motivation for workers. They also address how common dishonesty is on crowdsourcing platforms and how to mitigate it when encountered. Finally, they reveal the hidden social network of crowdworkers and dispel the misconception that crowdworkers are independent and isolated. The author concludes with new best practices and tips for projects that use crowdsourcing. She also emphasizes the importance of pilots to a project’s success.
Reflection
This paper focuses on the question of how crowdsourcing can advance machine learning research. It asks readers to consider how machine learning researchers think about crowdsourcing, and it analyzes multiple ways in which crowdsourcing can benefit, and sometimes benefit from, machine learning research. The author focuses her attention on four categories:
- Data generation:
She analyzes case studies that aim to improve the quality of crowdsourced labels.
- Evaluating and debugging models:
She discusses some papers that used crowdsourcing in evaluating unsupervised machine learning models.
- Hybrid intelligence systems:
She shows examples of keeping the “human in the loop” and how these systems can achieve more than would be possible with state-of-the-art machine learning or AI systems alone, because they make use of people’s skills and knowledge.
- Behavioral studies to inform machine learning research:
This category covers the design of interpretable machine learning models, the impact of algorithmic decisions on people’s lives, and questions that are interdisciplinary in nature and require a better understanding of how humans interact with machine learning systems and AI.
The remainder of her survey provides best practices for crowdsourcing by analyzing multiple case studies. She addresses dishonest and spam-like behavior, how to set payments for tasks, what incentives matter to crowdworkers, how crowdworkers can motivate each other, and how crowdworkers communicate and collaborate.
I found the discussion of the crowdworker community the most interesting part to read. We have always thought of crowdworkers as isolated and independent. Finding out about the forums, how workers promote good jobs, and how they encourage one another was surprising.
I also find the suggested tips and best practices beneficial for people who post crowdsourcing tasks, especially if they’re new to the environment.
Discussion
- What was something unexpected that you learned from this reading?
- What are your tips for new users of crowdsourcing platforms?
- What from this reading would you apply to your own project planning or work?