Soylent and the Introduced Perspectives to Influence Time and Cost, Ownership, and Knowledge (Blog 7)

Bernstein, Michael S., et al. “Soylent: A Word Processor with a Crowd Inside.” Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology. ACM, 2010.

Summary:

This paper introduces Soylent, a crowd-powered word processing interface with three main components: Shortn, Crowdproof, and The Human Macro. Shortn shortens written passages by handing the task to crowd workers, with the intent of making better use of the author’s time. Crowdproof provides a distributed, human-aided spelling, grammar, and style check. Lastly, The Human Macro lets the author hand off arbitrary natural-language tasks, such as finding figures or formatting citations, to the crowd. Soylent follows a Find-Fix-Verify pattern that reduces poor results from crowd workers by splitting writing tasks into chunks that are both manageable and resistant to low-quality work. The discussion of this work raises complex issues: wait time for the resulting work, the cost of work on the document, legal ownership of the writing, privacy through document exposure, and domain knowledge of the topic presented.
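To make the staging concrete, here is a minimal sketch of how a Find-Fix-Verify loop might be wired together. The crowd-posting functions are hypothetical stubs standing in for real Mechanical Turk requests; this is my own illustration under those assumptions, not Soylent’s actual implementation.

```python
# Minimal sketch of the Find-Fix-Verify pattern. The post_* functions are
# hypothetical placeholders for crowd (e.g., Mechanical Turk) requests;
# they are stubbed here so the script runs on its own.
from collections import Counter

def post_find_task(paragraph, workers=10):
    """Stage 1 (Find): each worker marks a patch that needs shortening.
    Stubbed: pretend every worker flags the same overly long span."""
    return [(0, min(len(paragraph), 40)) for _ in range(workers)]

def post_fix_task(patch_text, workers=5):
    """Stage 2 (Fix): workers independently propose rewrites of a patch.
    Stubbed: return trivially trimmed candidates."""
    return [patch_text.rstrip(" ,;") for _ in range(workers)]

def post_verify_task(candidates, workers=5):
    """Stage 3 (Verify): workers vote out poor or damaging rewrites.
    Stubbed: every candidate gets one vote per worker."""
    votes = Counter()
    for cand in candidates:
        votes[cand] += workers
    return votes

def find_fix_verify(paragraph, agreement=0.2):
    # Keep only patches that enough independent Find workers agreed on,
    # which is what filters out lazy or malicious single answers.
    spans = post_find_task(paragraph)
    needed = max(1, int(agreement * len(spans)))
    results = []
    for span, count in Counter(spans).items():
        if count < needed:
            continue
        start, end = span
        candidates = post_fix_task(paragraph[start:end])
        votes = post_verify_task(set(candidates))
        best, _ = votes.most_common(1)[0]
        results.append((span, best))
    return results

if __name__ == "__main__":
    text = "This sentence is quite a bit longer than it really needs to be, honestly."
    print(find_fix_verify(text))
```

The key design choice the pattern illustrates is requiring independent agreement at both the Find and Verify stages, so no single worker’s poor output makes it into the document.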

Reflection:

In this reflection, I am going to expand on the paper’s discussion. The issues listed below are what the authors believe to be future concerns for technologies like Soylent:

  • Wait Time
  • Cost
  • Legal ownership
  • Privacy
  • Domain Knowledge

Wait time and cost are natural consequences of asking someone to do work, but I want to expand on this a little. Regarding wait time, the authors found it acceptable in their evaluation. However, I wonder whether that acceptance changes over time. Poor results afflicted Soylent at a fairly high rate, around 30%, which means an unlucky author has roughly a one-in-three chance of getting a poor result for which the wait time and cost are not justifiable. If that happens several times in a row, frustration follows. Additionally, the cost of the tasks seems cheap compared to a single salaried employee (the example used in the paper), but I see a tradeoff in consistency. This raises the question: is a single person better or worse than the crowd in terms of wait time and cost? It would be interesting to look at businesses that employ writers for shortening, grammar checking, and interpreting the intentions of the author/boss.
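As a quick back-of-the-envelope check (my own arithmetic, assuming the roughly 30% poor-result rate holds and that outcomes are independent across requests), the chance of hitting at least one poor result grows quickly with repeated use:

```python
# Probability of at least one poor result over n requests, assuming a
# ~30% poor-result rate and independence between requests (my assumption).
for n in (1, 3, 5, 10):
    print(n, round(1 - 0.7 ** n, 2))
# -> 1: 0.3, 3: 0.66, 5: 0.83, 10: 0.97
```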

Legal ownership is tricky. It is addressed in the context of Mechanical Turk: work done by a worker is owned by the requester of the task. However, crowd work systems for creative output seem like a legally murky situation. I believe current approaches look at questions like “who produced the [most] content?” or “who originated the idea?” as indicators of ownership. Writing is, after all, a non-verbal form of communication for expressing thoughts and ideas, yet whoever does the writing is credited as the main author. For works like graphic art, programs, or other digital artifacts, there are even more layers to what ownership means. Maybe this is an opportunity to study how collaboration could be used to legally frame crowdsourced work, or perhaps a new licensing scheme is needed.

I will leave privacy for another discussion, as it did not interest me enough for this topic. But domain knowledge in crowd systems seems to have a lot of potential for future work. One kind of mistake that arises when involving the crowd is a lack of the knowledge needed to produce a good result (or to produce anything at all). In the context of citizen science, having the knowledge to correctly do something is everything; otherwise you run into issues like poor data. Work on this issue has looked at pre-training sessions and post-hoc data validation methods to help mitigate the problem. Interestingly, the Find-Fix-Verify pattern is not especially relevant here, since collecting data is very different from editing written work. But the pattern did give me an idea: think about how to break down the tasks citizens perform in citizen science so that applications can better support them. Maybe a collect-post-verify pattern is what citizen science needs; a rough sketch of what that might look like follows.
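As a thought experiment, here is a minimal sketch of how a collect-post-verify pipeline might look for citizen-science observations. The class, field names, and voting threshold are all my own invention for illustration; nothing here comes from the Soylent paper.

```python
# Hypothetical collect-post-verify pipeline for citizen-science data.
# Names and thresholds are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Observation:
    species: str
    location: str
    photo_url: str
    verify_votes: int = 0
    reject_votes: int = 0

class CollectPostVerify:
    def __init__(self, approval_threshold=3):
        self.posted = []
        self.approval_threshold = approval_threshold

    def collect(self, species, location, photo_url):
        """Collect: a citizen records a raw observation in the field."""
        return Observation(species, location, photo_url)

    def post(self, obs):
        """Post: the observation is shared for the community to review."""
        self.posted.append(obs)

    def verify(self, obs, looks_correct):
        """Verify: other participants or experts vote on data quality."""
        if looks_correct:
            obs.verify_votes += 1
        else:
            obs.reject_votes += 1

    def accepted(self):
        """Only observations with enough confirming votes enter the dataset."""
        return [o for o in self.posted
                if o.verify_votes >= self.approval_threshold
                and o.verify_votes > o.reject_votes]
```

The point of the sketch is the same as in Find-Fix-Verify: the quality check is a separate, redundant stage rather than trust in any single contributor, which is exactly where poor data in citizen science tends to come from.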