In this paper, a solo Microsoft researcher compiled a seemingly comprehensive (and nearly exhaustive) set of methods by which crowdsourcing can be used to enhance and improve various aspects of machine learning. The study was not limited to one of the most common crowdsourcing platforms; a multitude of other platforms were included as well, including but not limited to CrowdFlower, ClickWorker, and Prolific Academic. Through reading and summarizing around 200 papers, the author grouped the key areas of affordance into four categories: data generation (the accuracy and quality of the data being generated), evaluating and debugging models (the accuracy of their predictions), hybrid intelligence systems (collaboration between humans and AI), and behavioral studies to inform machine learning research (realistic human interactions with and responses to AI systems). Each of these categories contains several examples that further describe particular applications along with their benefits and disadvantages. Among these subcategories are topics such as speech recognition, gauging human behavior (and general attitudes) toward specific types of ads, and communication among crowd workers. With these factors laid out, the author insists that platforms and requesters maintain a good relationship with their crowd workers, design tasks well, and test those tasks thoroughly.
I agree with the breadth and comprehensiveness of this survey. Its points appear to cover most of the research area, and it does not seem the work could easily be condensed into a more compact form.
I wholeheartedly agree with one of the survey's lasting points: ensuring that the consumers of the platform (requesters and crowd workers) maintain a good working relationship. A multitude of platforms have overcorrected or undercorrected their issues, upsetting their target audience and clientele, thereby generating negative press and temporarily dipping their stock. A leading example of this is YouTube and its handling of child-directed content, where ads were being illegally targeted at children. YouTube in turn overcorrected and still ended up with negative press, since the response hurt several of its creators.
Though it is not a fault of the survey itself, I disagree with the methods of Hybrid Forecasting ("producing forecasts about geopolitical events") and Understanding Reactions to Ads. These seem to be an unfortunate but inevitable outcome of how companies, and potentially governments, attempt to predict and get ahead of incidents. Advertisements are comparatively less harmful, yet the practice of perfectly targeting the user and engineering the ideal environment for viewing an advertisement still seems malicious rather than for the betterment of humanity.
- While likely impractical, I would like to see what the industry has created in the area of Hybrid Forecasting. Not knowing how far this kind of technology has spread conjures scenarios reminiscent of a few Black Mirror episodes.
- From the author, I would like to see which platforms support each of the subcategories of features. This could be done on the reader's side, though it might amount to a study in and of itself.
- My final request would be a subjective comparison of the "morality" of each platform. This could be done by comparing the quality of the workers' discussions or how heavily each platform relies on gamification.