02/05/20 – Ziyao Wang – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

February 5, 2020 Ziyao Wang

The author surveyed how crowdsourcing has been applied in machine learning research. First, previous research was reviewed to derive categories for the application of crowdsourcing in machine learning. The applications were broken into four categories: data generation, evaluating and debugging models, hybrid intelligence systems, and behavioral studies to inform machine learning research. Within each category, the author discussed several specific areas, and for each area summarized several related lines of research and introduced each one. The author then analyzed work on understanding crowd workers. Though crowdsourcing can greatly help machine learning research, the author did not ignore the problems with this approach, such as dishonesty among workers. Finally, the survey offered advice to machine learning researchers who use crowdsourcing: maintain a good relationship with crowd workers, care about good task design, and use pilots.

Reflection:

From this survey, readers can get a thorough view of the applications of crowdsourcing in machine learning research. It summarizes much of the state of the art in the areas of machine learning related to crowdsourcing. Traditional machine learning often faces problems such as a lack of data, models that cannot be judged, a lack of user feedback, or systems that cannot be trusted. With crowdsourcing, these problems can be addressed with the help of crowd workers. Though this is only a survey of previous research, it gives readers a comprehensive view of this combination of technologies.

This survey reminds us of the importance of reviewing previous work. When we want to do research on a topic, there may be thousands of studies that could help, but it is impossible to read all of them. If there is a survey that summarizes previous work and sorts it into more specific categories, we can easily get a comprehensive view of the topic, and new ideas may occur to us. In this paper, after examining the four categories of application of crowdsourcing in machine learning, the author goes on to study research on understanding the crowd and finally makes suggestions for future researchers. Similarly, if we survey the area we want to work in for our projects, we may find out what is needed and what is novel in the field, which will help the projects succeed and the field develop.

It is also important to think critically. In this survey, though the author summarized numerous contributions of crowdsourcing to machine learning research, she still discussed the potential risks of this application, for example, dishonesty among workers. This is important for future research and should not be ignored. In our projects, we should also think critically so that the drawbacks of the ideas we propose can be judged fairly and the project can be practical and valuable.

Problems:

Which factors can contribute to a good task design?

Is there any solution that can solve the problem of dishonesty among workers instead of mitigating it?

In experiments that aim to find out how users react to something, can the reactions of paid workers be considered similar to the reactions of real users?

02/05/2020 – Sukrit Venkatagiri – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

February 5, 2020 Sukrit Venkatagiri

Paper: Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research
Author: Jennifer Wortman Vaughan

Summary:
This is a survey paper that provides an overview of crowdsourcing research as it applies to the machine learning community. It first provides an overview of crowdsourcing platforms, followed by an analysis of how crowdsourcing has been used in ML research. Specifically, it covers the use of crowdsourcing in generating data, in evaluating and debugging models, in hybrid intelligent systems, and in behavioral experiments. The paper then reviews crowdsourcing literature that studies the behavior of the crowd, their motivations, and ways to improve work quality. In particular, the paper focuses on dishonest worker behavior, ethical payment for crowd work, and the communication and collaboration patterns of crowd workers. Finally, the paper concludes with a set of best practices to be followed for optimal use of crowdsourcing in machine learning research.

Reflection:
Overall, the paper provides a thorough and analytic overview of the applications of crowdsourcing in machine learning research, as well as useful best practices for machine learning researchers to better make use of crowdsourcing.

The paper largely focuses on ways crowdsourcing has been used to advance machine learning research, but also subtly talks about how machine learning can advance crowdsourcing research. This is interesting because it points to how these two fields are highly interrelated and co-dependent. For example, with the GalaxyZoo project, researchers attempted to optimize crowd effort, which meant that fewer judgements were necessary per image, allowing more images to be annotated overall. Other interesting uses of crowdsourcing were in evaluating unsupervised models and model interpretability.

On the other hand, I wonder what a paper that was more focused on HCI research would look like. In this paper, humans are placed “in the loop,” while in HCI (and the real world) it’s often the machine that is in the loop of a human’s workflow. For example, the paper states that hybrid intelligent systems “leverage the complementary strengths of humans and machines to expand the capabilities of AI.” A more human-centered version would be “[…] to expand the capabilities of humans.”

Another interesting point is that all the hybrid intelligent systems mentioned in Section 4 had their own metrics to assess human, machine, and human+machine performance. This speaks to the need for a common language for understanding and then assessing human-computer collaboration, which is described in more detail in [1]. Perhaps it is the unique, highly contextual nature of the work that prevents more standard comparisons across hybrid intelligent systems. Indeed, the paper notes this with regard to evaluating and debugging models, stating that “there is no objective notion of ground truth.”

Lastly, the paper talks about two topics relevant to this semester. The first is algorithm aversion: participants who were given more control in algorithmic decision-making systems were more accurate, not because the human judgements were more accurate, but because the humans were more willing to listen to the algorithm’s recommendations. I wonder if this is true in all contexts, and how best to incorporate this work into mixed-initiative user interfaces. The second is that the quality of crowd work naturally varied with payment. However, very high wages increased the quantity of work but not always the quality. Combined with the various motivations that workers have, it is not always clear how much to pay for a given task, which makes pilot studies necessary, something this paper also strongly insists on. However, even if it was not explicitly mentioned, one thing is certain: we must pay fair wages for fair work [2].
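
As a rough illustration of how a pilot can inform payment, here is a minimal sketch that turns pilot completion times into a per-task reward for a chosen hourly wage. The function name, the timings, and the wage are all hypothetical; this is not the method of the survey or of [2], just the underlying arithmetic.

```python
from statistics import median


def per_task_reward(pilot_seconds, target_hourly_wage):
    """Estimate a per-task reward from pilot timings (hypothetical helper).

    pilot_seconds: completion times, in seconds, observed in a pilot run.
    target_hourly_wage: the hourly wage the requester wants workers to earn.
    """
    if not pilot_seconds:
        raise ValueError("need at least one pilot timing")
    # The median is less sensitive than the mean to a few very slow sessions.
    typical_seconds = median(pilot_seconds)
    # Scale the hourly wage down to the typical time spent on one task.
    return round(target_hourly_wage * typical_seconds / 3600, 2)


# Hypothetical pilot data: five workers, times in seconds.
pilot_seconds = [95, 120, 110, 240, 105]
reward = per_task_reward(pilot_seconds, target_hourly_wage=15.00)
```

Using the median is only one possible choice; a requester who wants slower workers to reach the target wage as well might pick a higher percentile instead.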

Questions:

What are some new best practices that you learned about crowdsourcing? How do you plan to apply them in your project?
How might you use crowdsourcing to advance your own research, even if it isn’t in machine learning?
Plenty of jobs are seemingly menial, e.g., assembly jobs in factories, working in a call center, delivering mail, yet no one has tried to make these jobs more “meaningful” and motivating to increase people’s willingness to do the task.
1. Why do you think there is such a large body of work around making crowd work more intrinsically motivating?
2. Imagine you are doing crowd work for a living, would you prefer to be paid more for a boring task, or paid less for a task masquerading as a fun game?
How much do you plan to pay crowd workers for your project? Additional reference: [2].
ML systems abstract away the human labor that goes into making them work, especially as seen in the popular press. How might we highlight the invaluable role played by humans in ML systems? By “humans,” I mean the developers, the crowd workers, the end-users, etc.

References:
[1] R. Jordan Crouser and Remco Chang. 2012. An Affordance-Based Framework for Human Computation and Human-Computer Collaboration. IEEE Transactions on Visualization and Computer Graphics 18, 12: 2859–2868. https://doi.org/10.1109/TVCG.2012.195
[2] Mark E. Whiting, Grant Hugh, and Michael S. Bernstein. 2019. Fair Work: Crowd Work Minimum Wage with One Line of Code. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7, 1: 197–206.

02/05/20 – Myles Frantz – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

February 5, 2020 Miles Frantz

In this paper, a solo Microsoft researcher compiled a seemingly comprehensive (and almost exhaustive) survey of the ways in which crowdsourcing can be used to enhance and improve various aspects of machine learning. The study is not limited to one of the most common crowdsourcing platforms; a multitude of other platforms are included as well, including but not limited to CrowdFlower, ClickWorker, and Prolific Academic. Through reading and summarizing around 200 papers, the author grouped the key areas of affordance into four categories: data generation (the accuracy and quality of the data being generated), evaluating and debugging models (the accuracy of the predictions), hybrid intelligence systems (the collaboration between humans and AI), and behavioral studies to inform machine learning research (realistic human interactions with and responses to AI systems). Each of these categories has several examples beneath it that further describe various aspects, their benefits, and their disadvantages. These subcategories include topics such as speech recognition, determining human behavior (and general attitude) towards specific types of ads, and communication among crowd workers. With these factors laid out, the author insists that platforms and requesters maintain a good relationship with their crowd workers, design tasks well, and test them thoroughly.

I agree that this survey is vast and comprehensive. Its points seem to cover most of the research area. Furthermore, it doesn’t seem that this work could easily be condensed into a more compact form.

I wholeheartedly agree with one of the closing points: ensuring that the users of the platform (requesters and crowd workers) are in a good working relationship. Many platforms overcorrect or undercorrect their issues, upsetting their target audience and clientele, and therefore creating negative press and temporarily dipping their stock. A leading example is YouTube and its children’s content, where ads were being illegally targeted at children. YouTube in turn overcorrected and still ended up with negative press, since it hurt several of its creators.

Though not a fault of the survey, I disagree with the methods of Hybrid Forecasting (“producing forecasts about geopolitical events”) and Understanding Reactions to Ads. These seem to be an unfortunate but inevitable outcome of how companies and potentially governments are attempting to predict and get ahead of incidents. Advertisements are relatively less bad; in general, however, the practice of finding the perfect balance of targeting the user and creating the perfect environment for viewing an advertisement seems malicious and not for the betterment of humanity.

While practically impossible, I would like to see what the industry has created in the area of Hybrid Forecasting. Not knowing how far this kind of technology has spread invites imaginings reminiscent of a few Black Mirror episodes.
From the author, I would like to see which platforms host each of the subcategories of features. This could be done on the reader’s side, though it might amount to a study in and of itself.
My final question would be a request for a subjective comparison of the “morality” of each platform. This could be done by comparing the quality of the workers’ discussions or how strong the gamification is across platforms.

02/04/2020 – Akshita Jha – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

February 4, 2020 / February 5, 2020 Akshita Jha

Summary:
“Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research” by Vaughan is a survey paper that provides an informative overview of crowdsourcing research for the machine learning community. There are four main application areas:
(i) Data Generation: This is made up of two types of work. The first type is data aggregation, where several crowdworkers are assigned the same data point and asked to annotate it. The machine learning algorithm then aggregates these annotations and finalizes the response. The second type of research involves modifying the system to get quality responses from crowdworkers.
(ii) Evaluation and debugging of the model: Crowdworkers can help debug and evaluate unsupervised machine learning algorithms such as topic modelling, LDA, generative models, etc.
(iii) Hybrid systems that utilize both machines and humans to expand their capabilities: Humans and machines have complementary strengths which, if put to proper use, can result in effective systems that help humans as well as improve the machine’s understanding.
(iv) Crowdsourced behavioral experiments that gather data and improve our understanding of how humans would like to interact with machine learning systems: Behavioral experiments can help us understand how humans would like to interact with the system and the changes that can be made to improve end-user satisfaction.
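
To make the data aggregation step in (i) concrete, here is a minimal sketch of one simple way redundant annotations could be combined: a plain majority vote. This is an assumed baseline chosen for illustration, not necessarily the aggregation method discussed in the survey, and the function name and example labels are hypothetical.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate redundant crowd labels into one label per item.

    annotations: dict mapping an item id to the list of labels that
    different workers gave to that item.
    Returns a dict mapping item id to (winning label, agreement fraction).
    """
    aggregated = {}
    for item_id, labels in annotations.items():
        if not labels:
            continue  # no worker labeled this item, so there is nothing to aggregate
        counts = Counter(labels)
        # most_common sorts by count; tied labels keep their first-seen order.
        label, votes = counts.most_common(1)[0]
        aggregated[item_id] = (label, votes / len(labels))
    return aggregated


# Hypothetical example: three workers label each of two images.
annotations = {
    "img_1": ["galaxy", "galaxy", "star"],
    "img_2": ["star", "star", "star"],
}
result = majority_vote(annotations)
```

The agreement fraction is kept alongside the label so that low-agreement items can be sent back for more annotations or reviewed by hand.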

Reflections:
With my limited knowledge of crowdworkers, I was aware of their importance for data aggregation. The author does a good job of highlighting other areas where machine learning researchers might benefit from utilizing the power of crowdworkers. What I found particularly interesting were the case studies that use crowdworkers to debug models and evaluate their interpretability. When we think of “debugging” models and finding flaws in the system, we mostly view things from the developer’s point of view and rely on developers completely to debug and evaluate the model’s performance. Using crowdworkers for the task seems like a useful application area that more machine learning researchers should be aware of. These tasks might also be of greater interest to the crowdworkers because they are not repetitive and involve the crowdworkers’ active participation. “Human debugging” can help the system by taking the crowdworkers’ feedback into account to uncover bottlenecks in machine learning models. Hybrid techniques that use human feedback also seem like a promising application area, where the system relies extensively on human judgement to make the right decisions. This also puts more responsibility on machine learning researchers to be creative and come up with unique ways to involve humans. Setting up pilot studies can help on this front. Pilot studies can prove useful because they demonstrate how a layperson interacts with a system and reveal the gaps that researchers should fill in order to ensure a cohesive experience for the end user. However, care should be taken to ensure that the effort crowdworkers put into building these systems does not go unappreciated.

Questions:
1. Did you agree with the applications of crowdworkers presented in this survey?
2. What steps can be taken to make machine learning researchers aware of these potential applications?
3. Apart from fairly compensating the workers, what steps can be taken to value their contributions?

02/05/20 – Lulwah AlKulaib – Making Better Use of the Crowd

February 4, 2020 / February 5, 2020 Lulwah AlKulaib

Summary

The survey provides an overview of machine learning projects that utilize crowdsourcing research. The author focuses on four application areas where crowdsourcing can be used in machine learning research: data generation, model evaluation and debugging, hybrid intelligence systems, and behavioral studies to inform ML research. She argues that crowdsourced studies of human behavior can be valuable for understanding how end users interact with machine learning systems. Then, she argues that these studies are also useful for understanding crowdworkers themselves. She explains that it is important to understand crowdworkers and how that understanding helps in defining best-practice recommendations for working with the crowd. The case studies that she presents show how to run a crowdwork study effectively and provide additional sources of motivation for workers. The case studies also address how common dishonesty is on crowdsourcing platforms and how to mitigate it when encountered. They also reveal the hidden social network of crowdworkers and dispel the misconception that crowdworkers are independent and isolated. The author concludes with new best practices and tips for projects that use crowdsourcing. She also emphasizes the importance of pilots to a project’s success.

Reflection

This paper focuses on answering the question: how can crowdsourcing advance machine learning research? It asks readers to consider how machine learning researchers think about crowdsourcing, suggesting an analysis of multiple ways in which crowdsourcing can benefit, and sometimes benefit from, machine learning research. The author focuses her attention on 4 categories:

Data generation:

She analyzes case studies that aim to improve the quality of crowdsourced labels.

Evaluating and debugging models:

She discusses some papers that used crowdsourcing in evaluating unsupervised machine learning models.

Hybrid intelligence systems:

She shows examples of utilizing the “human in the loop” and how these systems are able to achieve more than would be possible with state of the art machine learning or AI systems alone because they make use of people’s skills and knowledge.

Behavioral studies to inform machine learning research:

This category discusses the design of interpretable machine learning models, the impact of algorithmic decisions on people’s lives, and questions that are interdisciplinary in nature and require a better understanding of how humans interact with machine learning systems and AI.

The remainder of her survey provides best practices for crowdsourcing by analyzing multiple case studies. She addresses dishonest and spam-like behavior, how to set payments for tasks, what the incentives for crowdworkers are, how crowdworkers can motivate each other, and the communication and collaboration between crowdworkers.

I found the section on the community of crowdworkers the most interesting to read. We have always thought of them as isolated and independent workers. Finding out about the forums, how they promote good jobs, and how they encourage one another was surprising.

I also find the suggested tips and best practices beneficial for people who post crowdsourcing tasks, especially if they’re new to the environment.

Discussion

What was something unexpected that you learned from this reading?
What are your tips for new crowdsource platform users?
What would you apply from this reading to your project planning/work?