01/29/2020 – Bipasha Banerjee – An Affordance-Based Framework for Human Computation and Human-Computer Collaboration

Summary

The paper elaborates on an affordance-based framework for human computation and human-computer collaboration. It was published in 2012 in IEEE Transactions on Visualization and Computer Graphics. Affordance is defined as "opportunities provided to an organism by an object or environment". The authors reviewed 1271 papers in the area and curated a collection of 49 documents representing state-of-the-art research, which they grouped into machine- and human-based affordances.

Under human affordances, the authors discuss the skills humans have to offer, namely visual perception, visuospatial thinking, audiolinguistic ability, sociocultural awareness, creativity, and domain knowledge. Under machine affordances, they discuss large-scale data manipulation, collecting and storing large amounts of data, efficient data movement, and bias-free analysis. There is also a separate case where a system makes use of multiple affordances, as in the reCAPTCHA and PatViz projects. The authors include possible extensions, such as human adaptability and machine sensing. The paper also describes the challenges of measuring the complexity of visual analytics and the best way to measure work.
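The multiple-affordance case can be made concrete with a sketch of the idea behind reCAPTCHA as described in the summary: each challenge pairs a word the system already knows (a control) with one a machine failed to read, so human perception fills the machine's gap. The word values, IDs, and quorum below are hypothetical illustrations, not the actual reCAPTCHA protocol.

```python
from collections import Counter

# Ground truth for control words the system already knows.
KNOWN_WORDS = {"c1": "tavern"}
votes = {}  # unknown-word id -> list of trusted human answers

def submit(control_id, control_answer, unknown_id, unknown_answer):
    """Record a human's transcription only if they pass the control check."""
    if control_answer.strip().lower() != KNOWN_WORDS[control_id]:
        return False  # failed the known word: discard the whole answer
    votes.setdefault(unknown_id, []).append(unknown_answer.strip().lower())
    return True

def resolve(unknown_id, quorum=3):
    """Accept the majority transcription once enough trusted votes arrive."""
    answers = votes.get(unknown_id, [])
    if len(answers) < quorum:
        return None
    word, count = Counter(answers).most_common(1)[0]
    return word if count > len(answers) / 2 else None

# Simulated workers: two agree, one typos, one fails the control entirely.
submit("c1", "tavern", "u1", "morning")
submit("c1", "tavern", "u1", "morning")
submit("c1", "tavern", "u1", "moming")
submit("c1", "wrong", "u1", "garbage")  # rejected, never counted
print(resolve("u1"))  # -> morning
```

The design point is that the machine affordance (storing and comparing answers at scale) validates the human affordance (reading distorted text) without the system ever knowing the unknown word in advance.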

Reflection

Affordance was a new concept to me. It was interesting how the authors defined human versus machine affordance-based systems, along with systems that make use of both. Humans have special abilities that outperform machines, namely creativity and comprehension. Machines nowadays have the capability to classify data, but this requires a lot of training samples; recent neural-network architectures are "data hungry," and using such systems is extremely challenging when properly labeled data is lacking. Humans also have strong perceptual abilities: distinguishing audio, images, and video is easy for them. Platforms like Amara take advantage of this and employ crowd workers to caption videos. Humans are also effective when domain knowledge is required. Jargon specific to a community, e.g., chemical, legal, or medical terminology, is difficult for machines to comprehend. Named entity recognizers help machines in this respect, but error rates are still high.

The paper does succeed in highlighting the strengths of both kinds of systems. Humans are good in the various aspects mentioned above but are often prone to error. This is where machines outperform humans and can be used effectively. Machines are good at dealing with large quantities of data, and machine-learning algorithms are useful for classifying, clustering, and other services as necessary. Additionally, the lack of subjective perception can be a plus, since humans tend to be influenced by opinion; if a task has a political angle, it would be extremely difficult for a human to remain unbiased. Hence, humans and machines each have a unique advantage over the other, and it is the task of the researcher to utilize them effectively.
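One way to act on this division of labor is to route each item by confidence: the machine keeps what it is sure about and forwards the rest to a human queue. The keyword "classifier," threshold, and example items below are toy stand-ins I made up for illustration, not anything from the paper.

```python
# Toy sketch of dividing work by affordance: a machine classifier handles
# items it is confident about; low-confidence items go to a human queue.

CONFIDENCE_THRESHOLD = 0.8

def machine_classify(text):
    """Stand-in sentiment classifier: returns (label, confidence)."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "awful", "terrible"}
    words = set(text.lower().split())
    pos, neg = len(words & positive), len(words & negative)
    if pos + neg == 0:
        return "unknown", 0.0  # no signal: the machine cannot decide
    label = "positive" if pos >= neg else "negative"
    return label, max(pos, neg) / (pos + neg)

def route(items):
    """Split items into machine-labeled results and a human review queue."""
    machine_done, human_queue = [], []
    for text in items:
        label, conf = machine_classify(text)
        if conf >= CONFIDENCE_THRESHOLD:
            machine_done.append((text, label))
        else:
            human_queue.append(text)  # needs human perception or judgment
    return machine_done, human_queue

done, queue = route(["great excellent movie",
                     "awful terrible acting",
                     "bad but great plot"])
print(len(done), len(queue))  # -> 2 1
```

The mixed item ("bad but great plot") lands in the human queue: exactly the kind of ambiguous comprehension task where the human affordance outperforms the machine.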

Questions

  1. How do we decide which affordance, human or machine, is best for the task at hand?
  2. How do we evaluate the effectiveness of the system? Is there any global evaluation metric that can be applied?
  3. When using both humans and machines, how do we divide tasks effectively?


01/29/2020 – Bipasha Banerjee – The Future of Crowd Work

Summary

The paper discusses crowd work and was presented at CSCW (Computer-Supported Cooperative Work) in 2013. It proposes a framework that draws on organizational behavior and distributed computing, along with workers' feedback. The authors treat the crowdsourcing platform as a distributed system in which each worker is analogous to a node; this allows tasks to be partitioned as in parallel computing. The paper also discusses how shared resources can be managed and allocated, and provides a deep analysis of the kinds of work crowd workers end up doing, along with the positives and negatives of such work.
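The workers-as-nodes analogy can be sketched in a few lines: a large job is partitioned into roughly equal chunks, each "worker node" processes one independently, and the results are reassembled in order. The threads, the captioning task, and the segment names below are simulated stand-ins; on a real platform each chunk would become a posted micro-task.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_workers):
    """Split a job into roughly equal chunks, one per worker node."""
    size = -(-len(items) // n_workers)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def worker(chunk):
    """Simulated crowd worker: 'captions' each segment in its chunk."""
    return [segment.upper() for segment in chunk]

segments = ["intro", "scene one", "scene two", "credits"]
chunks = partition(segments, n_workers=2)

# map/gather step: workers run in parallel, results come back in chunk order.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(worker, chunks))

captioned = [line for chunk in results for line in chunk]
print(captioned)  # -> ['INTRO', 'SCENE ONE', 'SCENE TWO', 'CREDITS']
```

Because the chunks are independent, doubling the workers roughly halves the wall-clock time, which is the efficiency argument the paper's analogy rests on.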

The paper outlines and identifies 12 research areas that form its model, broadly covering the future of crowd work processes, crowd computation, and crowd workers. Each broad topic addresses various subtopics, from quality control to collaboration between workers. The paper also discusses how to create leaders in such systems, the importance of better communication, and the need for learning and assessment to be integral parts of these systems.

Reflection

This was an interesting read on the future of crowd work. The approach of defining the system as a distributed system was fascinating and a novel way to look at the problem. Workers can act as "parallel processors," which makes the system more efficient and enables intensive tasks (like application development) to be done effectively. Implementing theories from organizational behavior is interesting in that it allows the system to better manage and allocate resources. The authors address various subtopics in depth, and it was a very informative read that incorporated background work on each of the research areas. I will discuss some of the topics and problems that stood out to me.

Firstly, they spoke about processes. Assigning and managing work turns out to be challenging. In my opinion, a universal structure or hierarchy is not the way to go, but for certain kinds of tasks a hierarchy would prove useful. Work like software development would benefit from a structure where the code is reviewed and the quality is assessed by a separate person. Such work also needs to be synchronous, as people might have tasks that depend on each other.

Secondly, the paper discussed the future of crowd computation, including how AIs might be used in the future to guide crowd work. AI has proved to be an important tool in recent years. Automatic text summarization could be used to help create "gold standards." Similarly, other NLP techniques could be used to extract information, annotate, summarize, and provide other automated services that integrate with the current human framework, creating a human-in-the-loop system.
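The gold-standard idea can be sketched as a loop where the machine drafts, a human corrects, and the accepted correction is stored for future training. The extractive "summarizer" below (take the first sentence) is a toy stand-in for a real NLP model, and the document and edit are invented examples.

```python
# Hypothetical human-in-the-loop pipeline: machine drafts a summary, a crowd
# worker edits it, and the approved edit becomes a "gold standard" example.

gold_standards = []  # accumulated (document, human-approved summary) pairs

def machine_draft(document):
    """Toy extractive summary: take the first sentence as the draft."""
    return document.split(". ")[0].rstrip(".") + "."

def human_review(draft, edited):
    """Simulated worker step: an empty edit means the draft was accepted."""
    return edited if edited else draft

def process(document, worker_edit=""):
    draft = machine_draft(document)
    final = human_review(draft, worker_edit)
    gold_standards.append((document, final))  # grows the training corpus
    return final

doc = "Crowd work is growing. It spans many platforms. Pay varies widely."
print(process(doc, worker_edit="Crowd work is growing across platforms."))
# -> Crowd work is growing across platforms.
```

Each pass through the loop both delivers a summary now and cheapens the next one, since the stored pairs could retrain the drafting model, which is the human-in-the-loop payoff.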

Lastly, the future of crowd workers is also an important topic to ponder. Crowd workers are often not compensated well; similarly, requesters are often delivered sub-par work. The paper mentions that background verification is not always done as thoroughly for such "on-demand workers" as it is for full-time employees, from transcripts to interviews. This is a challenge. However, on-demand workers could be validated the way Coursera validates students: they could be asked to upload documents for tasks that require specialization. Verification is itself a task that could be carried out by contractors, or posted as a Turk job.

Overall, this was an interesting read, and research should be conducted in each of these areas to see how the system and the work improve. Crowd work has the potential to create more jobs in the future, with recruiters able to hire people instantaneously.

Questions

  1. The authors considered only AMT and oDesk when defining the framework. Would other platforms (like Amara or LeadGenius) face greater or lesser issues that differ from the needs identified here?
  2. They mention the "oDesk Worker Diary," which takes snapshots of workers' computer screens. How are privacy and security addressed?
  3. Can't credentials be verified digitally for specialized tasks?
