Who are the crowdworkers?: shifting demographics in Mechanical Turk

Ross, Joel, et al. “Who are the crowdworkers?: shifting demographics in mechanical turk.” CHI’10 Extended Abstracts on Human Factors in Computing Systems. ACM, 2010.

Discussion Leader: Ananya Choudhury

Summary

This paper focuses on how the demographics of turkers have been gradually shifting over time. While previous research by Ipeirotis (2008) suggested a worker population based primarily in the United States, the study conducted in this paper reveals that the AMT marketplace has become significantly international, with Indians making up more than one-third of the turking population. Using criteria such as age, annual income, gender, and education, the paper compares Indian and US turkers. The study shows a growing number of young, highly educated, male Indian turkers compared to the US. The results also show that for most turkers (both in India and the US) turking is an extra source of income, while for a significant number of Indian turkers it is sometimes or always necessary to meet basic needs. Finally, based on these results, the paper raises a few open-ended questions about the ethics of crowd work and the authenticity of the data collected.

Reflection

This paper paints a fairly clear picture of who these turkers are. As mentioned in the paper, knowing who turkers are will help researchers analyze survey results better. However, I feel that when data is collected in exchange for monetary benefits, a respondent's goal may shift from providing honest opinions to providing responses that will maximize their payoff. The cultural background of turkers also plays an important role. As mentioned in the paper, Indians are culturally more reluctant to present themselves as unemployed. If there is a tendency among Indians not to reveal their actual employment status, then the statistics presented in Figures 5 and 6 might not be accurate. This calls into question the credibility of turkers, of data collected through platforms like AMT, and of analyses performed on these datasets. So knowing turkers may help us analyze data better, but if that knowledge itself is not accurate, how will it help the analysis?

Questions

  • Do you think knowing the backgrounds of workers will help researchers analyze data better?
  • Do you think collecting data in exchange for monetary benefits is the best approach? What other ways can we devise to gather more genuine responses?
  • Are AMT and similar crowdsourcing platforms the right channel for collecting survey data that cannot be validated? Could micro-volunteering be a better way to collect such information?
  • Why does the worker population consist mostly of Indians and Americans? Why is the rest of the world still under-represented?

Read More

Human computation: a survey and taxonomy of a growing field

Alexander J. Quinn and Benjamin B. Bederson. 2011. Human computation: a survey and taxonomy of a growing field. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’11). ACM, New York, NY, USA, 1403-1412. DOI=10.1145/1978942.1979148 http://doi.acm.org/10.1145/1978942.1979148

Summary

This paper presents a classification of “human computation” and related concepts, including crowdsourcing, social computing, collective intelligence, and data mining. This work is motivated by a growing body of research that uses many of these terms interchangeably, resulting in potential confusion. By surveying a large body of literature, it proposes definitions for each of these areas and describes how they overlap and differ from one another. A Venn diagram helps illustrate these relationships. The authors suggest that human computation was popularized by Luis von Ahn’s 2005 dissertation and that its defining trait is a computational process that uses human effort to solve problems that computers can’t. Crowdsourcing was coined by Jeff Howe and is defined as taking a job originally meant for one person and opening it up to a large group of people. The authors identify six key dimensions of human computation, citing examples for each. Three of these (motivation, human skill, and aggregation) are based on analysis of clusters of related projects, each with a defining attribute. The others (quality control, process order, and task-request cardinality) cut across project clusters. Finally, the authors suggest some opportunities for future work revealed by underexplored areas in their taxonomy. These include exploring new pairings of dimensions, new values for existing dimensions, and new kinds of work.

Reflection

This strikes me as a great overview of crowdsourcing and human computation research as it stood in 2011, when this paper was published. The definitions of crowdsourcing, human computation, social computing, and the rest were not necessarily what I intuitively expected, but they seem reasonable and helpful. With these terms used and abused to mean so many different things, I appreciated a serious attempt to provide some clarity and common ground to researchers working in these areas. In my mind, social computing is the broadest category and encompasses most of these other concepts, but the authors’ more limited scope (humans interacting naturally, mediated through technology) was interesting. I thought the point that human computation need not be collaborative was interesting; I hadn’t considered it before. I also note that the authors don’t cite Daren Brabham, who is well known for his writings on crowdsourcing and what it means (and doesn’t mean). This would have been an interesting point of contrast, but the authors may have skipped him, purposefully or accidentally, because he doesn’t often publish in CS/HCI venues. There have also been a ton of new human computation and crowdsourcing projects in the four years since this was published; I wonder how well they fit into this taxonomy and whether they require revisions to it. Certainly, the values for the “human skill” dimension have expanded to include more complex and creative tasks like visual design, writing, and even scientific research. I also note that “learning” and “feedback” are quality control mechanisms that aren’t listed but have become increasingly important.

Questions

  • Do we agree or disagree with the definitions provided here? Have they become more focused or broader in recent years?
  • What are some other crowdsourcing or human computation examples you can think of that aren’t listed here, and where would they fall in each of the dimensions? Can you think of some that don’t fit the existing values or dimensions?
  • What other fields might be included in the Venn diagram and how are they related?
  • Taxonomies like this are supposed to help researchers, in the authors’ own words, identify opportunities in unexplored or underexplored areas. What opportunities do you see here? Any the authors didn’t mention?
  • Amy Bruckman writes about the usefulness of category theory in describing different kinds of online communities as being more or less like a set of “prototypes”. How might this argument extend to human computation?
  • How does a paper like this help you (or not help you) understand the fields of crowdsourcing and human computation research?

Read More

The Future of Crowd Work

Aniket Kittur, Jeffrey V. Nickerson, Michael Bernstein, Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matt Lease, and John Horton. 2013. The future of crowd work. In Proceedings of the 2013 conference on Computer supported cooperative work (CSCW ’13). ACM, New York, NY, USA, 1301-1318. DOI=10.1145/2441776.2441923 http://doi.acm.org/10.1145/2441776.2441923

Summary

In this paper, the authors ask the provocative question, “can we foresee a future crowd workplace in which we would want our children to participate?” To address this question, they review a large body of literature on crowdsourcing (over 100 papers) and incorporate commentary from a survey of 104 US and Indian crowd workers. The authors start by considering the tradeoffs of crowd work vs. traditional work, and then synthesize 12 foci, or challenges, that are especially important to the future of crowd work. For each focus, the authors describe the goals, review some related work, and offer a proposal for what the future of crowd work should entail. The foci are crowd processes (workflow, task assignment, hierarchy, realtime work, synchronous collaboration, and quality control); crowd computation (crowds guiding AIs, AIs guiding crowds, and platforms); and crowd workers (job design, reputation and credentials, and motivations and rewards). They conclude with three design goals that span multiple foci to provide clear steps forward. The first is to create career ladders that allow workers to advance to more complex and rewarding roles. The second is to help requesters design better tasks that workers will understand and enjoy more. The third is to facilitate learning, which offers the dual benefits of providing workers with new skills and requesters with the talent needed to complete their tasks. The authors close by emphasizing that both system design and careful study of its effects are needed, but that crowd work provides an exciting new opportunity: to explore radically new kinds of organizations in a controlled experimental setting.

Reflection

This is an ambitious paper that covers a lot of ground (over quite a few pages), but it’s all valuable, important stuff. The challenge to imagine a future where we’d want ourselves or our children to be crowd workers is a wonderful provocation. At first it seems hard to imagine, maybe because I’ve seen so many unpleasant crowd tasks. But on further reflection, it’s an exciting vision of the future: one where people can, from the comfort of their homes, find any kind of work they want to do and engage with it in a way that is financially rewarding and personally satisfying. I also appreciated the authors’ efforts at synthesizing such a large and diverse range of crowdsourcing papers. Just summarizing what’s been done is helpful in itself, but the authors go much further by pointing out the drawbacks and the opportunities to do better. I’m simultaneously amazed at the amount of crowdsourcing research that’s been conducted in just a few short years and surprised at how much is still left to do. For example, the authors note that we have almost no idea whether “algorithmic management” is better than traditional management techniques; what an interesting question. I find this inspiring as a researcher in this area. Finally, I appreciated that the authors raised some of the ethical concerns of working in this area, such as fair compensation, privacy, and power dynamics. I think they could have gone even further here. While I agree that all the foci are important, I think the ethical concerns may supersede all of them, or at least need to be embedded in each of them. Everything from quality control to hierarchies to AI-guided crowds raises serious ethical and moral questions that need to be considered from day one.

Questions

  • Would you want to be a crowd worker? Your children? Why or why not? What do you think is needed to make that vision a reality?
  • What are some of the potential advantages of crowd work over traditional work? Disadvantages?
  • What was a focus/challenge that you found particularly exciting or interesting? One that seemed especially difficult or hard to realize?
  • How could we make crowd labor more appealing than current traditional jobs?
  • Some of the proposals in the paper make crowd work look more like traditional work. If we follow this line of thinking, will we end up with something that looks just like today’s traditional jobs? Why or why not?
  • Did any of the proposals in the paper strike you as being particularly ethically worrisome? Why?
  • As you begin thinking about your project idea, which of these foci do you think you might contribute to?

Read More