01/29/20 – Vikram Mohanty – An Affordance-Based Framework for Human Computation and Human-Computer Collaboration.

Paper Authors: R. Jordon Crouser and Remco Chang

Summary

This paper provides an overview of some of the popular systems (as of 2012) that were built around human-computer collaboration. Based on this analysis, the authors identify key patterns in human and machine affordances, and propose an affordance-based framework to help researchers think and strategize better about problems that can benefit from collaboration. Such an affordance-based framework, according to the authors, would enable easy comparison between systems via common metrics (discussed in the paper). In the age of intelligent user interfaces, the paper gives researchers a foundational direction, or lens, to break down problems and map the solution space in a meaningful manner.

Reflection

  1. This paper is a great reference resource for setting some foundational questions on human-computer collaboration – How do we tell if a problem would benefit from a collaborative solution? How do we decide which tasks to delegate to which party, and when? How do we compare different systems solving the same problem? At the same time, it also sets some foundational goals and objectives for a system rooted in human-computer collaboration. The paper illustrates all the concepts through different successful examples of systems, making it easy to visualize the bin in which your (anticipated) research would fit. 
  2. This paper makes a great motivating argument for developing systems from the problem space, rather than jumping directly to solutions, which often leads to significant time and energy being invested in developing inefficient collaboration.
  3. The paper makes the case for evolving from a prior established framework (i.e. function allocation) for human-machine systems into the proposed affordance-based one. Even though they proposed this framework in 2012, which is also when deep learning techniques started becoming popular, I feel that this framework is dynamic and broad enough to accommodate the ubiquity of current AI and intelligent user interfaces.
  4. Following the paper’s direction of updating theories as technology evolves, I would argue for a “sequel” paper discussing AI affordances as an extension of the machine affordances. This would require an in-depth discussion of the capacities and limitations of state-of-the-art AIs designed for different tasks, some of which currently fall under human affordances, such as visual perception (computer vision), creativity (language models), etc. While AIs may be far from perfect at these tasks, they still provide imperfect affordances. Inevitably, this also means re-focusing some of the human affordances described in the paper, and may be part of a bigger question, i.e., “what is the role of humans in the age of AI?”. This also pushes the boundaries of what can be achieved with such hybrid interaction, e.g., AI’s last-mile problems [1].
  5. Currently, many different algorithms interact with human users via intelligent user interfaces (IUIs) and form a big part of decision-making processes. Over the years, researchers from different communities have pointed out how different algorithms can result in different forms of bias [2, 3] and have pushed for more fairness, accountability, transparency, and interpretability of these algorithms in an effort to mitigate these biases. The paper, written in 2012, did not account for such algorithms within machine affordances, and thus considered bias-free analysis a machine affordance. Eight years later, detecting biases still remains more of a human affordance.

Questions

  1. Now, in 2020, how would you expand upon the machine affordances discussed in the paper?
  2. Does AI fit under machine affordances, or does it deserve a separate section – AI affordances? What kind of affordances does AI provide humans, and vice versa? In other words, how do you envision this paper in current times? 
  3. For the folks working on AI or ML systems, is it possible for you to present the inaccuracies of the algorithms you are working on in descriptive, qualitative terms? Do you see human cognition, be it through novice or expert workers, as competent enough to fill in the gaps?
  4. Does this paper change the way you view your proposed project? If so, how does it change from before? Is it more in terms of how you present your paper?


01/29/20 – Vikram Mohanty – Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms.

Paper Authors: Donna Vakharia and Matthew Lease.

Summary

This paper gives a general overview of different crowdsourcing platforms and their key feature offerings, while centering on the limitations of Amazon Mechanical Turk (AMT), the most popular among the platforms. The factors that make requesters resort to AMT are briefly discussed, but the paper points out that these factors are not exclusive to AMT. Other platforms also offer most of these advantages while offsetting some of AMT’s limitations in areas such as quality control, automated task routing, and worker analytics. The authors qualitatively assess these platforms by comparing and contrasting them on the basis of key criteria categories. The paper, by providing exposure to lesser-known crowdsourcing platforms, hopes to mitigate one plausible consequence of researchers’ over-reliance on AMT, i.e., that the platform’s limitations can subconsciously shape research questions and directions.

Reflection

  1. Having designed and posted a lot of tasks (or HITs) on AMT, I concur with the paper’s assessment of AMT’s limitations, especially the lack of built-in gold standard tests and of support for complex tasks, task routing, and real-time work. The platform’s limitations are essentially offloaded onto the researcher’s time, effort, and creativity, which are then consumed working around these limitations instead of more pressing work (see the sketch after this list for the kind of quality-control workaround a requester ends up building).
  2. This paper provides nice exposure to platforms that offer specialized and complex task support (e.g., CrowdSource supporting writing and text-creation tasks). As platforms expand support for different complex tasks, this would a) reduce the workload on requesters for designing tasks, and b) reduce the quality-control tensions arising from poor task design.
  3. Real-time crowd work, despite being an essential research commodity, still remains a challenge for crowdsourcing platforms. This gap has led to toolkits like LegionTools [1], which facilitate real-time recruiting and routing of crowd workers on AMT, but such toolkits are not the final solution. Even though many real-time crowd-powered systems have been built using this toolkit, they remain prone to being bottlenecked by the toolkit’s limitations. These limitations may arise from a lack of resources for maintaining and updating software that originated as a student-developed research project. Crowd platforms adopting such research toolkits into their workflows may solve some of these problems. 
  4. Sometimes, projects or new interfaces may require testing the learning curve of their users. It does not seem straightforward to achieve that on AMT, since it lacks support for maintaining a trusted worker pool. However, it seems possible on other platforms like ClickWorker and oDesk, which allow worker profiles and identities.  
  5. A new platform called Prolific was launched publicly in 2019 and alleviates some of the shortcomings of AMT, such as fair pay assurance (with a minimum of $6.50/hour), worker task recommendation based on experience, initial filters, and quality control assurance. The platform also provides functionalities for longitudinal/multi-part studies, which seem difficult to achieve using the functionalities offered by AMT. The paper did not address support for longitudinal studies on the other platforms, either. 
  6. The paper was published in 2015 and highlighted the lack of automated tools. Since then, numerous services offering human-in-the-loop functionalities have come up, including offerings from Amazon and Figure Eight (formerly CrowdFlower).
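
Since AMT offers no built-in gold standard tests, requesters typically have to script such checks themselves. Below is a minimal sketch, in Python, of what one such workaround might look like; the task IDs, answers, and the 0.8 accuracy cutoff are hypothetical illustrations, not something prescribed by the paper or by AMT.

```python
from typing import Dict, Optional

# Hypothetical gold-standard (known-answer) tasks mixed into a HIT batch.
GOLD_ANSWERS: Dict[str, str] = {"task_07": "cat", "task_19": "dog"}
ACCURACY_THRESHOLD = 0.8  # assumed cutoff for trusting a worker's remaining answers


def score_worker(responses: Dict[str, str]) -> Optional[float]:
    """Return a worker's accuracy on the gold-standard tasks they answered."""
    gold_tasks = [t for t in responses if t in GOLD_ANSWERS]
    if not gold_tasks:
        return None  # no gold questions answered; cannot judge this worker
    correct = sum(responses[t] == GOLD_ANSWERS[t] for t in gold_tasks)
    return correct / len(gold_tasks)


def filter_trusted(all_responses: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Keep only submissions from workers who pass the gold-standard check."""
    trusted = {}
    for worker_id, responses in all_responses.items():
        accuracy = score_worker(responses)
        if accuracy is not None and accuracy >= ACCURACY_THRESHOLD:
            trusted[worker_id] = responses
    return trusted


# Example: one worker passes the check, the other fails it.
submissions = {
    "worker_A": {"task_07": "cat", "task_19": "dog", "task_42": "bird"},
    "worker_B": {"task_07": "dog", "task_19": "dog", "task_42": "fish"},
}
print(list(filter_trusted(submissions)))  # ['worker_A']
```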

Questions

  1. The authors raise an important point that the most popularly used platform’s limitations can shape research questions and directions. If you were to use AMT for your research, can you think of how its shortcomings would affect your RQs and research directions? What would be the most ideal platform feature for you?
  2. The paper advocates algorithms for task recommendation and routing, as has been pointed out in other papers [2]. What are some other deficiencies that can be supported by algorithms? (reputations, quality control, maybe?)
  3. If you had a magic tool to build a crowdsourcing platform to support your research, along with bringing a crowd workforce, what would your platform look like (the minimum viable product)? And who’s your ideal crowd? Why would these features help your research?


1/29/2020 – Jooyoung Whang – Human Computation: A Survey and Taxonomy of a Growing Field

This paper attempts to define the region where human computation belongs, including its definition and similar ideas. According to the paper’s quote from Von Ahn’s dissertation on human computation, it is defined as a way of solving a computational problem that a machine cannot yet handle. The paper compares human computation with crowdsourcing, social computing, and data mining, and explains how they are similar but different. The paper then studies the dimensions related to human computation, starting with motivation, which includes factors such as pay, altruism, and joy. The next dimension the paper discusses is quality control, the method of ensuring an above-threshold accuracy of human computation results; examples include multi-response agreement, expert review, and automatic checks. Then, the paper introduces how the computations gathered from many humans can be aggregated to solve the ultimate problem, through collection, statistical processing, improvement, and search. Finally, the paper discusses a few smaller dimensions such as process order and task-request cardinality.
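
To make the survey’s quality-control dimension a bit more concrete, here is a minimal sketch, in Python, of the “multi-response agreement” idea: several workers answer the same task, and an answer is accepted only if enough of them agree. The data, threshold, and function name are hypothetical illustrations rather than anything specified by the paper.

```python
from collections import Counter
from typing import Dict, List

AGREEMENT_THRESHOLD = 0.6  # assumed fraction of workers that must agree on an answer


def aggregate_by_agreement(responses_per_task: Dict[str, List[str]]) -> Dict[str, str]:
    """Accept the majority answer for each task if agreement is high enough."""
    accepted = {}
    for task_id, answers in responses_per_task.items():
        top_answer, votes = Counter(answers).most_common(1)[0]
        if votes / len(answers) >= AGREEMENT_THRESHOLD:
            accepted[task_id] = top_answer
        # otherwise the task could be re-posted or routed to expert review
    return accepted


# Example: three workers label two images.
example = {
    "img_01": ["cat", "cat", "dog"],   # 2/3 agree, so "cat" is accepted
    "img_02": ["cat", "dog", "bird"],  # no answer clears the threshold, so it is dropped
}
print(aggregate_by_agreement(example))  # {'img_01': 'cat'}
```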

I enjoyed the paper’s attempt to generate a taxonomy for human computation, which can easily be ill-defined. I think the paper did a good job at it by starting with the definition and breaking it down into major components. In the paper’s discussion of aggregation, it was interesting to me that they included “none”, which means the individual human computations by themselves are what the requester wants, and there is no need to aggregate all the results. Another thing I found fascinating about the paper was its discussion of motivation for the humans performing the computation. Even though it is natural that people will not perform the tasks for nothing, it had not occurred to me that this would be a major factor to consider when utilizing human computation. Of the list of possible motivations, I found altruism to be a humorous and unexpected category.

I was also reminded of a project that used human computation, called “Place”, hosted on the online community Reddit, where a user could place a colored pixel on a shared canvas once every few minutes. The aggregation of human computation in “Place” would probably be considered iterative improvement.

These are the questions that I could come up with while reading the paper:

1. The aggregation category “none” is very interesting, but I cannot come up with an immediate example. What would be a good case of utilizing human computation that doesn’t require aggregation of the results?

2. In the Venn diagram figure of the paper showing relationships between human computation, crowdsourcing, and social computing, what kind of problems would go into the region where all three overlap? This would be a problem where many people on the Internet with no explicit relation to each other socially interact and cooperate to perform computation that machines cannot yet do. The collected results may be aggregated to solve a larger problem.

3. Data mining was not considered human computation because it involves an algorithm trying to discover information from data collected from humans. If humans sat together trying to discover information from data generated by a computer, would this be considered human computation?


1/29/2020 – Jooyoung Whang – An Affordance-Based Framework for Human Computation and Human-Computer Collaboration

In this paper, the authors review more than 1200 papers to identify how to best utilize human-machine collaboration. Their field of study is visual analytics, but the paper is well-generalized to fit many other research areas. The paper discusses two foundational factors to consider when designing a human-machine collaborative system: allocation and affordance. Many of the papers the authors reviewed studied systematic methods for appropriately allocating work between the human and the computer in a collaborative setting. A good rule was introduced by Fitts, but it was later found to be outdated due to the increasing computational power of machines. The paper concludes that examining affordances, rather than allocation, is a better way to design human-machine collaborative systems. An affordance can best be understood as something an agent is better at providing than others. For example, humans provide excellent visual processing skills while computers excel at large-scale data processing. The paper also introduces some case studies where multiple affordances from each party were utilized.

I greatly enjoyed reading about the affordances that humans and machines can each provide. The list of affordances that the paper provides will serve as a good resource to come back to when designing a human-machine collaborative system. One machine affordance that I do not agree with is bias-free analysis. In machine learning scenarios, a learning model is very often easily biased. Both humans and machines can be biased in analyzing something based on previous experience or data. Of course, it is the responsibility of the system’s designer to ensure unbiased models, but as the designer is a human, it is often impossible to avoid bias of some kind. The case study regarding the reCAPTCHA system was an interesting read. I always thought that CAPTCHAs were only used for security purposes, and not machine learning. After learning how it is actually used, I was impressed by how efficient and effective the system is at both securing Internet access and digitizing physical books.

The following are the questions that I came up with while reading the paper:

1. The paper does a great job of summarizing what humans and machines are each relatively good at. The designer, therefore, simply needs to select appropriate tasks from the system to assign to each. Is there a good way to identify which affordance a given task in the system needs?

2. There’s another thing that humans are really good at compared to machines: adapting. Machines, beyond their initial programming, do not change their response to an event according to the time and era, while humans very much do. Is there a human-machine collaborative system with a task that would require the affordance of “adaptation” from a human collaborator?

3. Many human-machine collaborative systems register the tasks that need to be processed using an automated machine. For example, the reCAPTCHA system (the machine) samples a question and asks the human user to process it. What if it were the other way around, where a human registers a task and assigns it to either a machine or a human collaborator? Would there be any benefits to doing that?


01/22/20 – Vikram Mohanty – Ghost Work

Summary

In the opening chapters of this book, the authors introduce the world of ghost workers, or gig workers, who have been thriving, almost unnoticed, in the shadows of modern-day software. Here, the authors give us insight into how the gig economy has grown over the years, in terms of the number of workers, where they come from, the different kinds of on-demand platforms and the features they offer, and the different kinds of jobs on these platforms. The highlight of these chapters is how gig workers sit at the core of Artificial Intelligence (AI), either by providing training labels to build AI models, or by filling in when AI falls short of the job. The authors break down the hype of the rise of robots, and present the last-mile paradox of AI, i.e., that the race towards automation will never converge, nor take away all our jobs, but will instead shift jobs from full-time employment towards gig work. They also draw historical analogies to factory work, piece work, and outsourcing. Posting tasks for gig workers, instead of hiring people full-time, seems more appealing to companies and requesters for multiple reasons, including automated hiring, evaluation, and payment, low costs and overhead, and higher-quality work in a short time. The authors discuss the future of employment, which would look more like a world of ghost workers than of robots, and therefore make the case for focusing our attention on improving this world of on-demand work.

Reflection

  1. The hype about AI has been largely propelled by the media [3]. Companies, especially the rising number of AI start-ups, are partly to blame, as they often have to portray themselves as AI start-ups to get investors on board. As they explore complex, open-ended problems, these standard AI systems are almost certainly bound to fall short, leaving room for the “last mile” to be filled in by on-demand online workers [1].
    1. Essentially, the efforts of these ghost workers are passed off as the genius of AI, without any credit, which propels the hype further. This adds to the “algorithmic cruelty” of the ghost work platform APIs. 
    2. Sometimes, companies acknowledge the contributions of the experts who are responsible for training and shaping their AI models, but fall short of acknowledging the inevitable presence of these human experts in a predominantly AI workflow [2]. This, once again, can be attributed towards a need to portray themselves as an AI company. 
  2. Instead of the usual rhetoric about AIs taking over jobs, this book paints a realistic portrait of the future of employment, which involves a shift towards on-demand crowd work. This necessitates redesigning on-demand infrastructure with the goal of enhancing the quality of workers’ lives (illustrated in great detail in the later chapters). The chapters briefly discuss organizational hierarchy and worker collaboration, two important factors contributing to LeadGenius’s success. These factors are also echoed by Kittur et al. [4] as crucial for enhancing the value and meaning of crowd work. 
    1. It’s good to see politicians addressing issues related to the gig economy [6, 7] and AI disrupting the traditional workplace [5]. 
  3. Gig workers need to be hypervigilant in order to get more work and earn more money. Employers no longer need to actively assign jobs to employees; the cost is instead borne by workers’ time and effort in seeking and eventually getting jobs. This is one area where algorithms could help by connecting workers to jobs on these on-demand platforms [9]. Going one step further, ML models can be used to automatically assign workers to tasks where they can contribute [4].  
  4. The AI hype will inevitably blow up, resulting in AI-infused systems invariably relying on human intelligence to complement (and complete) the shortcomings of AI. Human intelligence, employed either through on-demand labor or any other source, is an invaluable resource, and therefore shouldn’t suffer the brunt of algorithmic cruelty. To address this, we need human-centered design of tools and platforms. Jeff Bigham, in his “The Coming AI Autumn” post [8], points out some areas where HCI research and practice could help in designing intelligent systems and making them useful for people. 
  5. These chapters illustrate many real-world examples where humans are needed to “backfill decision-making with their broad knowledge of the world” to make up for the AI’s limitations e.g. spike in search terms during a disaster, identifying hate speech, finding a great springtime wedding venue, face verification. Partially to blame is the predominant use of artificial/synthetic datasets for training AIs, which calls for designing problems rooted in the real world. 

Questions

  1. Let’s say we have an organization with a human workforce using an AI system to solve a certain problem. The whole workflow involves a feedback channel from the human experts that is also used to train certain aspects of the AI system, which, going forward, may reduce the role of these human experts. Some of these experts may be AI critics as well. Should this organization be collecting data from these experts? If so, how should the workflow be designed? What are some of the trade-offs?
  2. The book chapters raise some interesting points about global labor arbitrage and the localization of data. Most of the AI systems being built are almost always deficient in data coming from the places/regions/countries with lower labor costs, and may therefore be biased when handling non-US data (face recognition, speech translation, etc.). Why is that the case? How should this be addressed?
  3. “The API isn’t designed to listen to Ayesha.” (How does Ghost Work Work?) Has anyone been on the receiving end of algorithmic cruelty? What kind of systems or intelligent user interfaces did you wish for, if any? 
  4. How should journalists cover AI? How should AI claims be fact-checked? 

References

  1. The rise of ‘pseudo-AI’: how tech firms quietly use humans to do bots’ work https://www.theguardian.com/technology/2018/jul/06/artificial-intelligence-ai-humans-bots-tech-companies
  2. AI at the Speed of Real Time: Applying Deep Learning to Real-time Event Summarization https://www.dataminr.com/blog/ai-at-the-speed-of-real-time-applying-deep-learning-to-real-time-event-summarization
  3. An Epidemic of AI Misinformation https://thegradient.pub/an-epidemic-of-ai-misinformation/
  4. Kittur, A., Nickerson, J. V., Bernstein, M., Gerber, E., Shaw, A., Zimmerman, J., … & Horton, J. (2013, February). The future of crowd work. In Proceedings of the 2013 conference on Computer supported cooperative work (pp. 1301-1318).
  5. AI and automation will disrupt our world — but only Andrew Yang is warning about it https://thehill.com/opinion/technology/469750-ai-and-automation-will-disrupt-our-world-but-only-andrew-yang-is-warning
  6. Elizabeth Warren Takes On the ‘Gig Economy’ https://www.thenation.com/article/elizabeth-warren-takes-on-the-gig-economy/
  7. Pete Buttigieg just called out Uber and McDonald’s for their treatment of workers — and said beefing up unions is the best way to protect them https://www.businessinsider.com/pete-buttigieg-plan-to-overhaul-the-gig-economy-2019-7
  8. The Coming AI Autumn https://jeffreybigham.com/blog/2019/the-coming-ai-autumnn.html
  9. Prolific. https://www.prolific.co/
