Improving Crowd Innovation with Expert Facilitation

Chan, Joel, Steven Dang, and Steven P. Dow. “Improving Crowd Innovation with Expert Facilitation.” CSCW ’16.

Discussion Leader: Shiwani

Summary:

As the title suggests, this paper studies whether crowd innovation can be improved through expert facilitation. The authors created a new system that builds on strategies used during face-to-face brainstorming and uses them to provide high-level “inspirations” to crowd workers to improve their ideation process.

The first study compared the creativity of ideas generated by a “guided” crowd with ideas generated without any facilitation. It showed that ideators in the expert-facilitated condition generated more ideas, generated more creative ideas, and exhibited more divergent thinking. The second study focused on the facilitator’s skill level and used novice facilitators, keeping all other conditions the same. Surprisingly, this facilitation seemed to negatively influence the ideation process.

Reflections:

This paper touches on and builds on many things we have been talking about this semester. One of the key ideas behind <SystemName> is feedback and its role in improving creativity.

I really liked the paper as a whole. Their approach of adapting expert facilitation strategies from face-to-face brainstorming to a crowd-sourcing application was quite novel and interesting. They took special efforts to make the feedback synchronous in order to “guide” the ideation, as with real-time brainstorming.

They make a strong case for the need for something such as <SystemName>. Their first point centers on the fact that crowd workers may be hampered by inadequate feedback (as we have discussed before in the “feedback” papers). The second point is that existing systems were not built to scale. With <SystemName>, the authors created a system that provides expert feedback while also ensuring it could scale by keeping the feedback at a higher level rather than individualized.

The authors mention that a good system requires divergent thinking as well as convergence of ideas. Divergence prevents local minima in the ideation, and convergence allows promising solutions to grow into better ideas. This was an interesting way of looking at the creative process, and it motivates their choice of a skilled facilitator as a tool.

The study was quite well-designed with clear use cases. On one hand, they wished to study the effect of having a facilitator guide the ideation, and a second study captured the effect of the facilitator’s skill level. The interface design was simple, both for the ideators and the facilitators. I liked the word-cloud idea for the facilitators; it is a neat way to present an overview at such a scale. I also liked the “pull” model for inspiration, where the ideators were empowered to ask for inspiration whenever they felt the need for it, as opposed to predetermined checkpoints. This deviates somewhat from traditional brainstorming, where experts choose when to intervene, but given the scale of the system and the fact that the feedback was not individualized, it makes sense.

The authors do mention that their chosen use case may limit the generalization of their findings, but the situational, social case was a good choice for an exploratory study.

As with a previous paper we read, creativity was operationalized by the authors due to its subjective nature. Using novelty and value as evaluative aspects of creativity seems like a good approach, and I liked that the creativity score was the product of the two, to reflect their interactive effect.
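To make that scoring idea concrete, here is a minimal sketch (my own illustration, not the authors’ code), assuming each idea receives numeric novelty and value ratings from judges:

```python
# Toy illustration (not the authors' code) of a multiplicative creativity score:
# an idea has to rate well on BOTH novelty and value to score highly.
def creativity_score(novelty: float, value: float) -> float:
    return novelty * value

# Hypothetical (novelty, value) judge ratings on an assumed numeric scale.
ratings = [(6, 2), (4, 4), (2, 6)]
scores = [creativity_score(n, v) for n, v in ratings]

print(scores)                      # [12, 16, 12] -> the balanced idea scores best
print(max(scores))                 # "max creativity" of the idea pool
print(sum(scores) / len(scores))   # mean creativity
```

The multiplicative form captures the interactive effect: an idea that is highly novel but useless, or useful but mundane, cannot score well.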

Questions:

  1. A previous paper defined creativity in terms of novelty and practicality (being of use to the user and being practically feasible to manufacture today), whereas this paper focused only on the “of value” aspect in addition to novelty. Do you think either definition is better than the other?
  2. The paper brings forth the interesting notion that “just being watched” is not sufficient to improve worker output. Do you think this is specific to creativity, and the nature of the feedback the workers received?
  3. For the purpose of scale, the authors gave “inspirations” as opposed to individualized feedback (like Shepherd did). Do you think the more granular, personalized feedback would be helpful in addition to this? In fact, would you consider the inspirations as “feedback”?

Read More

Improving Crowd Innovation with Expert Facilitation

Chan et al., “Improving Crowd Innovation with Expert Facilitation” CSCW’16

Discussion Leader (con): Nai-Ching

Summary

Although crowdsourcing has been shown to be useful for creativity tasks, the quality of the creative output is still an issue. This paper demonstrates that the quality of crowdsourced creativity tasks can be improved by introducing experienced facilitators in a real-time work setting. The facilitators produce inspirations that are expected to facilitate the ideation. To measure quality, the authors use divergence (fluency and breadth of search), convergence (depth of search), and creative outcomes (rated creativity of ideas). The result of the first experiment shows that with the help of experienced facilitators, both the number of generated ideas and the max creativity of the output increase. The result of the second experiment reveals that with novice/inexperienced facilitators, the creativity of the output is reduced. To further analyze the causes of the difference, the authors code the strategies used to generate the inspirations into categories including “Examples”, “Simulations”, and “Inquiries”. While “Examples” and “Inquiries” do not have significant effects on the output, “Simulations” are highly associated with higher max creativity of ideas. The authors also point out that the different intentions of experienced and novice facilitators might account for the different results of facilitation: the experienced facilitators tend to actually do the facilitating job, while the inexperienced facilitators are more inclined to do the ideating job.
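To make those measures concrete, here is a rough sketch of how they might be computed (my own reading of the definitions above, not the authors’ exact operationalization), assuming each idea has been assigned a solution category and a creativity rating by judges:

```python
# Rough illustration of the quality measures named above. Each idea is assumed
# to be a (solution_category, creativity_rating) pair; in the study, categories
# and ratings come from human judges. The numbers below are made up.
ideas = [
    ("food",    3), ("food",    5), ("food",    6),
    ("seating", 4), ("signage", 2),
]

categories = {cat for cat, _ in ideas}

fluency = len(ideas)                          # divergence: how many ideas in total
breadth = len(categories)                     # divergence: how many distinct categories touched
depth = max(sum(1 for c, _ in ideas if c == cat)
            for cat in categories)            # convergence: most ideas within any one category
max_creativity = max(r for _, r in ideas)     # creative outcome: rating of the best idea

print(fluency, breadth, depth, max_creativity)  # 5 3 3 6
```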

 

Reflections

It seems contradictory that the paper first mentions that popularity and “rich get richer” effects might not reflect actual innovative potential, yet later, on the facilitation dashboard, the keywords are sized by frequency, which seems to be just another form of popularity.

The interaction between ideators and the “inspire me” function before the facilitator enters any inspiration is unclear. If there is no inspiration available, is the button disabled? How do ideators know if there is a new inspiration? Also, do facilitators know when ideators request inspiration? I think the “inspire me” function should help retain workers and lower the attrition rate, but based on the results, there is no significant difference between the facilitated and unfacilitated conditions.

In addition, the increased creativity only appears in max creativity, not mean creativity. On the one hand, it makes sense: the authors argue that what innovators really care about is increasing the number of exceptional ideas, and since proper facilitation makes higher creativity more likely, it is a good technique. On the other hand, it also shows the technique might not be reliable enough to avoid the manual effort of going through all the generated ideas to pick out the good ones (max creativity). This paper also reminds me of an earlier paper we discussed, “Distributed analogical idea generation: inventing with crowds”, which mainly increases mean creativity and does not report the change in max creativity. It might be possible to combine both techniques to increase both the mean and max creativity of ideas.

It also seems to me that, in addition to soliciting more ideas, keeping a good balance between divergence and convergence is very important. However, I did not see the future work section suggest showing the facilitator information about the current breadth and depth of the idea/solution space to help him/her steer the direction of inspirations.

It is interesting that one of the themes in ideators’ comments was that inspirations provoked new frames of thinking about the problem, yet there is actually no significant difference in breadth between the facilitated and unfacilitated conditions. So I wonder how general the theme is.

Questions

  • What reasons do you think cause the discrepancy between user perception and the actual measurement of breadth of search in the solution space?
  • What is the analogy between the technique from this paper and the technique from “Distributed analogical idea generation: inventing with crowds”?
  • Can most people appreciate creativity? That is, if a lot of people say something is creative, is it creative? Or if something is creative, do most people think it creative as well?

Read More

Combining crowdsourcing and learning to improve engagement and performance.

Dontcheva, Mira, et al. “Combining crowdsourcing and learning to improve engagement and performance.” Proceedings of the 32nd annual ACM conference on Human factors in computing systems. ACM, 2014.

Discussion Leader (Pro): Sanchit

Summary

This paper presented a crowdsourcing platform called LevelUp For Photoshop. This tool helps workers learn Photoshop skills and tools through a series of tutorials and then allows them to apply these skills to real-world image examples from several non-profit organizations that need images touched up before they can be used.

This sort of crowdsourcing platform is different in that it aims to complete creative tasks through the crowd while also allowing the crowd to learn a valuable skill they can apply to other fields and scenarios outside the platform. The platform starts every user off with a series of very interactive, step-by-step tutorials. These tutorials are implemented as an extension for Adobe Photoshop, which allows the extension to monitor what tools and actions the users have taken. This creates a very easy-to-use and easy-to-learn-from tutorial system because every action has some sort of feedback associated with it. The only thing this tool can’t do is judge the quality of the image transformations. That task, however, is delegated to Amazon MTurk workers who look at a before/after set of images to determine the quality and usefulness of the editing job done by a crowd worker in LevelUp.

This paper presented a very thorough and detailed evaluation and study of this project. It involved three deployments, each adding one contribution of the approach onto the plugin for user testing. The first deployment included only the interactive tutorial. The authors measured the number of levels the players completed and collected helpful feedback and opinions about the tutorial system. The second deployment added the challenge mode and evaluated the results with logs, MTurk worker quality checks, and expert quality examination. The photo edits were scored on a 1-3 point scale for usefulness and novelty. The last deployment added real images from non-profit organizations; the test was to determine whether different organizations have a different effect on a user’s editing motivation and skills. The results weren’t as spectacular, but they were still positive in that the skills learned by the users were helpful.

Reflection

Usually crowdsourcing involves menial tasks that have little to no value outside of the platform, but the authors designed a unique and impressive methodology for users to learn a new and useful skill like photo editing and then apply that skill to complete real-world photo editing tasks. They took advantage of people’s need to learn Photoshop and, while teaching them, were also able to accomplish real-life photo editing tasks, thus killing two birds with one stone. Crowdsourcing doesn’t necessarily have to involve monotonous tasks, nor do crowd workers have to be paid monetarily. This is a creative approach where the incentive is the teaching and the skills developed for photo editing, along with achievements and badges for completing specific tasks. They may not be as valuable as money, but it is enough incentive to garner interest and leverage the newly learned skills to accomplish an existing task.

The authors conducted extremely detailed surveys and collected feedback from a pool of approximately ten thousand workers over a period of three years. This dedication to evaluation, and the associated results of the study, demonstrates the usefulness of this type of crowdsourcing platform. It shows that not all crowd work has to be menial and that users can actually learn a new skill and apply it outside of crowd platforms. However, I do admit that the results were presented in a way that was hard to follow. The inclusion of graphs, charts, or tables would have made it easier to follow along instead of interpreting the numerous percentages within the paragraphs.

By having both MTurk workers and experts judge the photo edits, they bring in the perspective of an average user and their perception of quality or usefulness, as well as the perspective of a professional and how quality or usefulness is judged through their eyes. That, in my opinion, is a pretty strong group of evaluators for such a project, especially considering the massive scale at which people volunteered and completed these evaluation tasks on MTurk.

Lastly, I was really impressed by the Photoshop extension that the authors developed. It looked very clean, sleek, and easy to learn from because it doesn’t intimidate users like the palette of tools that Photoshop presents. This sleekness can help workers retain the skills learned and apply them to future projects. I think photo editing is a fabulous skill for anyone to have. You can edit your photos to focus on or highlight different areas of the picture, or to remove unwanted noise or extraneous subjects from an image. By having a straightforward, step-by-step, interactive tool such as LevelUp, one can really increase their Photoshop skillset by a huge margin.

Questions

  • How many of you have edited pictures on Photoshop and are “decent” at it? How many would like to increase your skills and try out a learning tool like this?
  • Having a great tutorial is a necessity for such a concept to work where people can both learn and apply those skills elsewhere without their hands being held. What features do you think such tutorials should have to make them successful?

Read More

Combining crowdsourcing and learning to improve engagement and performance.

Dontcheva, Mira, et al. “Combining crowdsourcing and learning to improve engagement and performance.” Proceedings of the 32nd annual ACM conference on Human factors in computing systems. ACM, 2014.

Discussion Leader (con): Ananya

Useful Link:  http://tv.adobe.com/watch/prototype/creative-technologies-lab-photoshop-games/

Summary

This paper discusses how crowdsourcing and learning can be combined to create an environment that benefits both worker and requester. The learning should be such that the skill developed not only helps workers in a crowdsourcing context but is also marketable in other contexts.

The authors developed a learning interface, “LevelUp”, on top of Adobe Photoshop. This interface presents an interactive, step-by-step photo-editing tutorial as “missions”, ordered by increasing difficulty. The tutorial provides sample images for users to work on, or users can use their own images. Users are presented with one step at a time and have to finish that step to go to the next one. Each mission is associated with points, and the number of points increases with the difficulty of the mission. Users can also earn badges on successfully completing a mission, which they can share on social networking sites. The system also gives users instant feedback on their progress. It has 12 tutorials, divided into three levels. At the end of each level, users test their skill in a challenge round. Images in the challenge round are supplied by requester organizations.

The interface has two parts. The first part is just the interactive tutorial; the challenge round comes in the second part. The challenge round was created to support crowdsourcing. Unlike the interactive part, which presents a set of steps for improving an image, the challenge part just suggests improvements and also lets users improvise.

The implementation and results are divided across three deployments. Deployment 1 consisted only of the interactive tutorial. For evaluation, the authors measured the number of missions completed by players, collected players’ feedback monthly, interviewed 6 players, and finally compared user behavior before and after the game. Overall, this deployment received positive feedback, with more than 60% of users completing at least level 1.

Deployment 2 tested whether the skills learnt in deployment 1 could be used in real-world tasks. This included both the interactive tutorial and the challenge rounds. The authors performed three types of evaluations: 1. behavioral logs (the number of images edited and the types of edits performed), 2. MTurk workers comparing the original image to the edited image, and 3. experts examining the quality of the edited images and rating them from 1 to 3 on the basis of usefulness and novelty. The results were mixed. However, images in challenge 3 received a higher “more useful” rating than those in challenge 1. The authors conclude that the users who went on to level 3 were more motivated to learn and do a good job.

In deployment 3, the authors added real images from requesters at 4 different organizations to the challenge round to find out whether certain types of institutions would receive better results than others. They deployed this in two versions: one that included detailed information about the requester organization and one that just listed the name of the organization. They analyzed behavioral data that included details about the images edited, qualitative assessments from MTurkers, experts, and two requesters, and survey data on user experience. One of the requesters assessed 76 images and rated 60% of the edited images as better than the original; the other assessed 470 images and rated 20% of them as better than the original.

 

Reflection

When I read this paper, the first thing that came to my mind was “Where is the crowdsourcing aspect?”. The only crowdsourcing part was the assessment done by MTurkers, which needed no specific skill. Even that part was redundant, since the authors were also getting the assessment done by experts. I think the title of the paper and the claim of combining crowdsourcing and learning are misleading.

The participants were not crowd workers who were paid to do the job but rather people who wanted to prettify their images. Photoshop on its own being a bit overwhelming, LevelUp seemed to be an easy way to learn the basics of photo editing. This is an anecdotal view. However, this raises the same question that Dr. Luther (sorry Dr. Luther, I might not have quoted you accurately) raised yesterday: “Would the results differ if we randomly selected people to either play the game or do the task?”. Does the cause of any action influence the results? It would have been interesting (and more relevant to the title of the paper) to see MTurkers (who may or may not have an interest in photo editing) chosen as participants and asked to do the learning and take the challenges. If they were not paid for the learning part, would they first sit through the tutorials because the skills developed might help them somewhere else, or would they jump directly to the challenge rounds because that’s where the money is? Even the authors mention this point in ‘Limitations’.

The results presented were convoluted: too many correlations were made with no clear explanation. I wish they had presented their analysis in some sort of visual format. Keeping track of so many percentages was hard, at least for me.

It is normally interesting and easy to learn basic editing techniques such as adjusting brightness, contrast, saturation, etc. But making novices interested in learning advanced techniques is the real test of a learning platform. The stats provided did not answer the question “what percentage of novices actually learnt advanced techniques?” One of the results in deployment 2 says only 74% of users completed challenge 1, 57% challenge 2, and 39% challenge 3, with no explanation of why so few people continued to challenge 3 and what percentage of novices completed each challenge.

I am also not convinced by their measure of “usefulness”. Any image, even with basic editing, usually looks better than the original, and per the definition such work will get the highest “usefulness” rating. I wish they had a fourth level in their scale, say 4, which depended on what kinds of techniques were used. The definition of “novelty” looked flawed too. I mean, it works well in this scenario, but on a crowdsourcing platform like Amazon Mechanical Turk, where workers are used to being paid for following instructions as closely as possible, it may not elicit much novelty.

With all the issues that are there, there were still a few things I liked. I liked the idea of letting students practice their skills not through sample use cases but through real scenarios where their effort may benefit someone or something. I also like the LevelUp interface; it is slick. And as I said earlier, Photoshop can be overwhelming, so an interactive, step-by-step tutorial definitely helps.

Finally, I thought that the skills gained through such tutorials are good only for limited use or, as we have seen in previous discussions, for the task at hand. Without further knowledge or recognized credentials, I doubt how marketable these skills would be elsewhere.

 

Questions

  • Do you also think there was no crowdsourcing aspect in the paper apart from a few guidelines mentioned in ‘Future work’?
  • Do you think the skills developed on similar platforms can be marketed as advanced skills? How would you change the platform so that the learning here can be used as a professional skill?
  • Do you think the results would have been different if the users were not interested participants but rather MTurkers who were paid participants?

Read More

‘This Is Not a Game’: Immersive Aesthetics and Collective Play – Pro

McGonigal, ‘This Is Not a Game’: Immersive Aesthetics and Collective Play

Discussion leader (positive discussant): Anamary

Summary:

This paper discusses one instance of how an immersive game can help many people take collective action, and provides an analysis of immersive games and how such games can help map strategies in-game to challenges in the real world.

 

The first portion of the paper discusses Cloudmaker, an online gaming forum for the fictional puzzle game “The Beast”. The Beast is an “immersive game”: media such as movie trailers dropped digital clues, like an unusual credit attribution, leading players into a rich, complex puzzle game with no clear goal or reward. It has 3 core mysteries and 150 characters, with digital and in-person clues, like clues randomly broadcast to players’ TVs.

 

Gamers play timed puzzles with a massive number of clues, and solving the clues may involve anything from programming to the arts, lending itself to a crowdsourced endeavor. Puzzles meant to be solved in three months were solved in a day. Players were playing all the time, mixing game elements into the real world, with the game declaring that “this is not a game” (TING).

 

What is curious about the community’s 7132 members is their initial reaction to the 9/11 attacks. At the day’s end, the members felt empowered to help solve the mysteries surrounding the attacks, posting threads like “The Darkest Puzzle” that asked the crowd to help solve the attacks. Many gamers in the crowd mentioned that the virtual game had shaped their perception of the attacks and had given them skills to investigate them. But the moderators noted that it is dangerous to connect real life to a game, and stopped the initial activity.

 

This example raises two key questions for the piece:

  • What about “the Beast” helped encourage gamers to be confident that they could solve the attacks?
  • What qualities of the Cloudmaker forum helped gamers forget the reality of the situation and debate whether the game was virtual or real?

The second part answers these two questions. One key aspect of these TING games is that gamers are unsure which parts of real life are part of the game and which are not, and this effect was so prevalent that gamers’ relationships, careers, and social lives were hampered by The Beast. Another similar TING game, Push, had a final solution, but many gamers were not satisfied and believed it was still going. Acting is believing, and these players kept on acting and believing in the game.

 

These gamers also developed strategies in these detective TING games that may apply to crime-solving as well, such as researching sources, vetting the sources themselves, doing analysis, and seeing whether secondary information connects to hypotheses. Additionally, gamers felt like they were part of a collective intelligence, mentioning a sense of belonging to a giant think tank.

These key features (immersion, uncertainty about whether one is in or out of the game, training in related strategies, and a sense of belonging) helped motivate and move a crowd towards problem-solving, which carries both promising and negative consequences. This paper shows the promise of crowds in problem-solving using game design, and how to design games to motivate and retain these crowds so they continue puzzle-solving for free.

 

Reflection:

McGonigal’s core message, that games can bring crowds together to help solve real-world problems, seems incredibly influential. This paper was published in 2003, and to my memory, games back then were seen as a child’s toy that maybe trained kids to be violent; in the public’s mind, games were just a useless escapist hobby. But this paper’s core message can be seen in crowdsourcing endeavors that promote public good and awareness, like FoldIt, various other examples seen in class, and other fields focused on solving complex problems, like visual analytics.

Features that helped motivate TING games may be applied to other crowdsourcing endeavors as well. I remember one of our oldest papers, “Online Economies”, discussed how a sense of belonging helped nurture these communities, and this feature can be seen in Cloudmaker. It would be very interesting to see the more immersive features of TING games applied to crowdsourcing.

The gamers in Cloudmaker did not solve the crime, and there are many good reasons for this (protection of the gamers on an unprotected site, blurring of fiction and reality, false accusations afoot). This reminds me of the Boston Bomber Reddit incident, where redditors made their own subreddit and collectively tried to determine the identity of the bombers. I hope the anti-paper presenter talks about this, but even the author can’t help but discuss the negatives associated with such a crowd solving crimes.

 

Initially, the subreddit was praised for publicizing key evidence, but it ended up accusing and defaming many innocent people. I wonder if there are ways for law enforcement to collaborate better with crowds (which I’m sure is being researched now!). The capabilities of these crowds are still fascinating: a huge collection of puzzles meant to span three months was solved in a single day.

 

I loved the philosophical and psychological aspects employed in these games to summon crowds. The sunk cost fallacy is one where you keep sinking money into a failing project because of what you have already put in. Similarly, these players were obsessed with the game for so long that they still saw everything as a game.

 

Could we map-reduce this kind of massive puzzle? Maybe some parts of these crimes could be broken down into smaller ones, but I imagine many aspects are interrelated.

 

I also wonder whether augmented or pervasive games might better help gamers distinguish between reality and games and use their power for good. What if FoldIt were combined with an always-on game? Many of the games discussed in the paper were geared towards detective-style games, but I wonder if this style could be employed to solve historical puzzles, public awareness challenges, or even puzzles posed by the crowd, like “how much of the $X I paid in taxes went to which parts of my government?”.

 

Questions:

 

  1. In the cases of both 9/11 and the Boston bomber, the police usually are selective in what evidence is publicized since some evidence is uncertain. What are your thoughts on designing systems that help crowds collaborate better with law enforcement to harness the crowd’s problem-solving skills, in ways to help protect the crowd and prevent false accusations?
  2. Are there strategies for breaking down this problem-solving task into smaller ones that crowds can do? Or does the entire crowd need to see the whole task at once to tackle it effectively, as with The Beast?
  3. Are there real-life puzzles that are important, but are not as life-threatening as crimes? I’m casting puzzles broadly in these questions. There are many complex challenges and issues that have multiple solutions, like wicked problems, that may be framed as a huge puzzle.
    1. How can these crowd-based games help or not help solve such puzzles?
    2. Can these puzzles be both given to the crowds and solved by the crowds? That is, can the crowd both supply and solve the puzzle?

 

Read More

“This Is Not a Game”: Immersive Aesthetics and Collective Play

Jane McGonigal

Summary

This paper describes the concept of immersive gaming. To convey this concept, the author gives the example of an online group of gamers known as the Cloudmakers, people who enjoy games that involve solving puzzles. As described in the paper, this group proudly adopted the identity of a collective detective, employing all of the resources at their disposal to solve any mystery or puzzle presented to them, no matter how obscure. It was around this time that a massive immersive game was created that catered to groups of people with similar interests: The Beast. This game created an effective means of virtual immersion. The entire point of the game was to make it as close to reality as possible. The creators went as far as denying the existence of the game itself in order to promote its underlying theme of a conspiracy. The game’s popularity stemmed from the fact that it reached beyond strictly online gaming into the offline lives of its players in order to promote its augmented reality. It also necessitated collaboration among all of its players because it created such a complex network of puzzles that one person alone could not possibly solve all the problems.

Next, the paper gives a brief section on the difference between immersive and pervasive gaming. Although they have many similar characteristics, the two differ in one fundamental manner: immersive games attempt to disguise the existence of the game to create a more realistic sense of a conspiracy, whereas a pervasive game is promoted and openly marketed to gain attention. In addition, immersive games encourage collaboration, whereas the Nokia Game provided incentives to individual solvers (which implicitly limited collaboration). The Beast was a very complex network of puzzles, whereas the Nokia Game was simple enough that a single player could solve it.

The paper then states some of the side effects of creating such immersive games. It briefly notes that these games can become too addictive and could potentially harm people’s lives. However, it also emphasizes the players’ burning desire to keep the game play going, consistently trying to find a conspiracy when one simply does not exist. It makes the case that if these players have such a burning desire to solve complex puzzles, why not utilize their expertise and intelligence on real-world problems? Instead of fabricating conspiracies, why not apply them to the problems that governments currently face in order to come up with solutions?

Reflections

This paper was very interesting because I didn’t know that communities such as this existed. I have heard of clans and groups forming in MMORPGs such as World of Warcraft, but never a game whose sole purpose was to be disguised so much as to make players question whether it was “not just a game”. I appreciate that the author pointed out some of the downfalls of this type of gaming. These games can become highly addictive and cause massive amounts of personal damage to gamers’ lives. In addition, they can help create a sense of paranoia in the already flustered society we currently live in. The fact that players are so willing to jump into the flames to solve any problem thrown at them means that they may be manipulated at any point into solving real-world problems without ever knowing it. However, this seems to be a double-edged sword. If a community such as the Cloudmakers were put to a real-world task, they might stumble upon something that was not meant for the public, causing mass hysteria and/or, as Rheingold stated, creating a mob mentality. I know this example is a bit of a stretch, but one could almost consider Anonymous roughly similar to Cloudmaker: a group of hackers/activists actively working on solving a problem and/or uncovering some truth that is meant to stay hidden. I believe that if we were to employ these types of games, they could quickly turn into a form of attack. For example, if there was a task published to hack into company X’s website (as part of the game), and the players succeeded, this could cause much harm to the company. But who would be blamed? The person who got tricked into hacking the website in the first place, or the pseudo game designer who left a vague clue that may or may not be interpreted correctly? The paper stated that the online community is very intelligent and greatly surpassed the game-maker’s expectations; if this intelligence were put to malicious use, it could have potentially disastrous results. A great example of this could be the users of Reddit who falsely accused someone of being behind the Boston bombing.

Questions

  1. How do you draw a line to distinguish the game from reality?
  2. Should such an addictive type of game be banned?
  3. Is it wise to employ online intelligence to solve sensitive problems?
  4. Wouldn’t this create a constant sense of paranoia and eventually lose faith in the government?

 

Read More

To Play or not to Play: Interactions between Response Quality and Task Complexity in Games and Paid Crowdsourcing

Krause, M., Kizilcec, R.: To Play or not to Play: Interactions between Response Quality and Task Complexity in Games and Paid Crowdsourcing. Conference on Human Computation and Crowdsourcing, San Diego, USA (2015)

Pro Discussion Leader: Adam

Summary

This paper looks at how the quality of paid crowdsourcing compares to the quality of crowdsourcing games. There has been a lot of research comparing expert work to non-expert crowdsourcing quality. The choice is a trade-off between price and quality, with expert work costing more but having higher quality. You may need to pay more non-expert crowdworkers to get a level of quality comparable to a single paid expert worker. The same trade-off may exist for paid versus game-based crowd work. There has been research on the cost of making a crowdsourcing task gamified, but nothing comparing the quality of game work to paid crowd work.

The authors achieve this by creating two tasks, a simple one and a complex one, and creating a game version and a paid version of each for the crowd to complete. The simple task is an image labeling task. Images were found by searching Google image search for 160 common nouns. Then, for each image, the nouns on the webpage where the image was found are tallied; more frequently occurring nouns are considered more relevant to the image. For the game version, they modeled the task after Family Feud, so that more relevant labels earn more points. The paid task simply asked workers to provide a keyword for an image.
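Here is a rough sketch of that relevance-scoring idea (my interpretation, not the authors’ pipeline); the helper name and the toy page text are hypothetical:

```python
# Rough sketch (my interpretation, not the authors' pipeline) of the relevance
# scoring described above: tally how often each candidate noun appears on the
# page the image came from; more frequent nouns are treated as more relevant.
from collections import Counter

def label_relevance(page_text: str, candidate_nouns: list[str]) -> dict[str, int]:
    words = [w.strip(".,!?").lower() for w in page_text.split()]
    counts = Counter(words)
    return {noun: counts[noun.lower()] for noun in candidate_nouns}

# Hypothetical page text and candidate labels.
page = "The cat sat on the mat. A cat and a dog played near the house."
print(label_relevance(page, ["cat", "dog", "house", "car"]))
# {'cat': 2, 'dog': 1, 'house': 1, 'car': 0} -> "cat" is the most relevant label
```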

In addition to the simple task, the authors wanted to see how quality compared on a more complicated task. To do this, they created a second task that had participants look at a website and provide a question that could be answered by the content of the website. The game version is modeled after Jeopardy, with higher point values assigned to more complex websites. The paid version had a predetermined order in which the tasks were completed, but in both cases participants completed all of the same tasks.

Overall, the quality of the game tasks was significantly higher than the paid tasks. However, when broken down between simple and complex tasks, only the complex task had significantly higher quality for the game version. The quality for the complex task was about 18% higher for the game task, as rated by the selected judges. The authors suggest one reason for this is the selectiveness of game players. Paid workers will do any task as long as it pays them well, but game players will only play the games that actually appeal to them. So only people really interested in the complex task played the game, leading to higher engagement and quality.

Reflections

Gamification has the potential to generate a lot of useful data from crowd work. While one of the benefits of gamification is that you don’t have to pay workers, it still has a cost. Creating an interesting enough game is not an easy or cheap process, and the authors take that into consideration when framing game crowdsourcing. They are essentially comparing game participants to expert crowd work. It has the potential to generate higher quality work, but at a higher cost than paid non-expert crowd work. However, the difference is that with a game, it’s more of a fixed cost. So if you need to collect massive amounts of data, the cost of creating the game may be amortized to the point where it’s cheaper than paid non-expert work, with at least as high, if not higher, quality.

I really like how the authors try to account for any possible confounding variables between game and paid crowdsourcing tasks. We’ve seen from some of our previous papers and discussions that feedback can play a large role in the quality of crowd work. It was very important that the authors considered this by incorporating feedback into the paid tasks. This provides much more legitimacy to the results found in the paper.

There is also a simplicity to this paper that makes it very easy to understand and buy into. The authors don’t create too many different variables to study. They use a between-subjects design to avoid any cross-over effects. And their analysis is definitive. There were enough participants to give them statistically significant results and meaningful findings. The paper wasn’t weighed down with statistical calculations like some papers are. They keep the most complicated statistical discussion to two short paragraphs to appease any statisticians who might question their results, but their calculations for the comparisons of quality between the two conditions are very straightforward.

Questions

  • Games have a fixed cost for creation, but are there any other costs that should be considered when deciding whether to go the route of game crowdsourcing versus paid crowdsourcing?
  • Other than instantaneous feedback, are there any other variables that could affect the quality between paid and game crowd work?
  • Was there any other analysis the authors should have performed or any other variables that should have been considered, or were the results convincing enough?

Read More

A Critique of: “To Play or not to Play: Interactions between Response Quality and Task Complexity in Games and Paid Crowdsourcing”

Krause, M., Kizilcec, R.: “To Play or not to Play: Interactions between Response Quality and Task Complexity in Games and Paid Crowdsourcing,” 2015.

Devil’s advocate: Will Ellis

Summary

In this paper, Krause and Kizilcec ask the research questions, “Is the response quality higher in games or in paid crowdsourcing?” and “How does task complexity influence the difference in response quality between games and paid crowdsourcing?” To answer these questions, the authors devise and carry out an experiment that tests four experimental treatments among 1,262 study participants. Each experimental group has either a simple or complex task set to perform and either performs the task set as a web browser game or as paid crowdwork. Because participants self-selected into each treatment and were sourced from online resources (Newgrounds and Kongregate in the case of players, and Crowdflower in the case of workers) rather than recruited from a general population and assigned an experimental treatment, the number of participants in each group varies widely. However, for each group, 50 participants were selected at random for analysis.

The authors employed human judges to analyze the quality of responses of the selected participants and used this data to form their conclusions. The simple task consisted of labeling images. The authors employed the ESP game as the gamified version of this task, having participants earn points by guessing the most-submitted labels for a particular image. Paid crowdworkers were simply given instructions to label each image and were given feedback on their performance. The complex task consisted of participants generating “questions” for given text excerpts, which was meant to mimic the game show Jeopardy. In fact, the authors employed a Jeopardy-like interface in the gamified version of the task. Players selected text excerpts with a particular category and difficulty from a table and attempted to generate questions, which were automatically graded for quality (though not against “ground truth”). On the other hand, paid crowdworkers were given each text in turn and asked to create a question for each. Answers were evaluated in the same automated way as in the gamified task, and workers were given feedback with the opportunity to revise their answers.

In their analysis of the data, the authors found that while there was no statistically significant difference in quality between players and workers for the simple task, there was a statistically significant 18% increase in response quality for players over workers for the complex task. The authors posit that the reason for this difference is that, since players choose to play the game, they are interested in the task itself for its entertainment quality. Workers, on the other hand, choose to do the task for monetary reward and are less interested in the quality of their answers. While it is easy to produce quality work for simple tasks with little engagement in the work, higher quality work for complex tasks can be achieved by gamifying such tasks and recruiting interested players.

Critique

The authors’ conclusions rest in large part on data gathered from the two complex task experiments, which ask participants to form Jeopardy-style “questions” as “answers” to small article excerpts. This is supposed to contrast with the simple task experiments using the ESP game, which was developed as a method for doing the work of labeling pictures. However, the authors do not justify that the Jeopardy game, serving as the complex task experimental condition, is an appropriate contrast to the ESP game.

The ESP game employs as its central mechanic an adaptation of Family Feud-style word guessing. It is a tried-and-true game mechanic with the benefit that it can be harnessed for the useful work of labeling images with keywords, as discussed in [von Ahn and Dabbish, 2004]. On the surface, the authors’ use of the Jeopardy game mechanic seems similar, but I believe they’ve failed to use it appropriately in two ways that ultimately weaken their conclusions. Firstly, the mechanic itself seems poorly adapted to the work. A text excerpt from an article is not a Jeopardy-style “answer”, and one need only read the examples in the paper to see that the “questions” participants produce based on those answers make no sense in the Jeopardy context. Such gameplay did induce engagement in self-selected players, producing quality answers in the process, but it should not be surprising that, in the absence of the game, this tortured game mechanic failed to induce engagement in workers and, thus, failed to produce answers of quality equal to that of the entertainment-incentive experimental condition.

This leads into what I believe is the second shortcoming of the experiment, which is that the complex task, as paid work, is unclear and produces nothing of clear value, both of which likely erode worker engagement. Put yourself in the position of someone playing the game version of this task, and assume that, after a few questions, you find it fun enough to keep playing. You figure out the strategies that allow you to achieve higher scores, you perform better, and your engagement is reinforced. Now put yourself in the position of a worker. You’re asked to, in the style of Jeopardy, “Please write down a question for which you think the shown article is a good response for.” From the paper, it’s clear you’re not then presented with a “Jeopardy”-style answer but instead the first sentence of a news article. This is not analogous to answering a Jeopardy question, and what you may write has no clear or even deducible purpose. It is little wonder that, in an effort to complete the task, bewildered workers would try to only do what is necessary to get their work approved. Compare this to coming up with a keyword for an image, as in the simple paid experimental condition. In that task, what is expected is much clearer, and even a modestly computer-literate worker could suppose the benefit of their work is improving the labeling of images. In short, while it may indeed be the simplicity of a task that induces paid workers to produce higher quality work and the difficulty of a task that causes them to produce lower quality work, this experiment may only show that workers produce lower quality work for confusing and seemingly pointless tasks. A better approach may be to, as with the ESP game, turn complex work into a game instead of trying to turn a game into complex work.

Read More

Show Me the Money! An Analysis of Project Updates during Crowdfunding Campaigns

Xu, A., Yang, X., Rao, H., Huang, S.W., Fu, W.-T., Bailey, B.P.: Show me the Money! An Analysis of Project Updates during Crowdfunding Campaigns. In: Accepted: The Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Toronto, Canada (2014)

Discussion Leader: Mauricio

Summary

This paper presents an analysis of project updates for crowdfunding campaigns and the role they play in the outcome of a campaign. Project updates were originally intended as a form of communication from project creators to keep funders aware of the progress of the campaign. The authors analyzed the content and usage patterns of updates on Kickstarter campaigns and elaborated a taxonomy of seven types of updates. Furthermore, they found that specific uses of updates had stronger associations with campaign success than the actual project description. They conclude the paper by discussing design implications for designers of crowdfunding systems in order to better support the use of updates.

The authors sampled 8,529 campaigns and found that the chance of success of a project without updates was 32.6% vs. 58.7% when the project had updates. By analyzing how creators use updates, they identified the following themes in updates: Progress Report, Social Promotion, New Content, New Reward, Answer Questions, Reminders, and Appreciation. In their study, they collected 21,234 publicly available updates, and then proceeded to assign themes to these updates.

They also divided campaign duration into three phases (initial, middle, and final), and each update was assigned to one of them. Taking into account the theme of the update and when it was posted, they arrived at very interesting findings. Reminder updates had the most significant influence on campaign success, and Answer Questions updates had the least. New Reward updates were more likely to increase the chance of success than New Content updates. These two kinds of updates indicate that the project creators have revised the project in some way, so this shows that offering new rewards is more effective than changing the project itself. Looking into the representation of the project, they found that the update representation is more predictive of success than the representation of the project page. In terms of timing, they found that a high number of Social Promotion updates in the initial phase, a high number of Progress Report updates in the middle phase, and a high number of New Reward updates in the final phase are all positively correlated with success.
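As an illustration of the kind of association being described, here is a toy sketch (my own, not the authors’ analysis, which uses proper statistical modeling rather than this naive comparison); the campaign records and field names are hypothetical:

```python
# Toy sketch: compare the success rates of campaigns that did vs. did not post
# a given update theme in a given phase (e.g., Reminders in the final phase).
def success_rate(campaigns):
    return sum(c["succeeded"] for c in campaigns) / len(campaigns) if campaigns else 0.0

def compare(campaigns, theme, phase):
    with_update = [c for c in campaigns if (theme, phase) in c["updates"]]
    without_update = [c for c in campaigns if (theme, phase) not in c["updates"]]
    return success_rate(with_update), success_rate(without_update)

# Made-up campaign records for illustration only.
campaigns = [
    {"succeeded": True,  "updates": {("Reminder", "final"), ("Progress Report", "middle")}},
    {"succeeded": False, "updates": set()},
    {"succeeded": True,  "updates": {("Reminder", "final")}},
    {"succeeded": False, "updates": {("Answer Questions", "initial")}},
]

print(compare(campaigns, "Reminder", "final"))  # (1.0, 0.0) on this toy data
```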

Finally, the authors discuss design implications for crowdfunding systems to better support campaigns. They suggest that these systems should provide templates for each of the types of updates available. They also mention that these platforms should offer guidance to project creators so that they better elaborate their updates, e.g., provide update guidelines, allow creators to learn from prior successful examples, help creators develop strategies for advertising their campaigns, and guide creators as to when to post what type of update.

Reflections

This paper offers a very interesting take on crowdfunding campaigns. Though prior work shows that project representation is important, the authors argue that more emphasis should be put on the representation, themes, and timing of updates for campaign success. I think this is very interesting because crowdfunding platforms don’t put much emphasis on updates. For example, Kickstarter’s top rule for success is to create a project video for the project page. Though the authors found that the number of updates was higher in successful campaigns than in unsuccessful ones, I do wonder if there can be “too many” updates and if this can lead to campaign failure. It would be interesting to see if a very high number of updates can become annoying to funders to the point of causing a negative correlation with campaign success, and if so, what types of updates annoy people the most. I imagine it can be very difficult to design an experiment around this, since researchers would have to complete the proposed project if they get the funding.

One of the most interesting findings the authors arrived at is the difference between posting New Reward and New Content updates. New Content updates are for changes to the project itself; this can be viewed as improving the product to attract customers. New Reward updates are for new rewards to attract funders; this can be viewed as offering discounts to attract customers. When the authors first posed the question of which one would be more effective (before arriving at their results), I thought that New Content updates would be more effective, as I saw New Reward updates as a sign of desperation by the project creators trying to reach their funding goal, which would suggest that the project is not going well. But I was proven wrong, as New Reward updates were shown to be more likely to increase the chance of success. This seems to indicate that, since people have already pledged to the content of the project, they are not really interested in more new content, but in new rewards. However, according to the findings, there were more New Content than New Reward updates. Project creators, therefore, would need to focus more on revising reward levels to improve their chances of success.

In addition, for New Reward updates, a high number of updates in the final phase was positively correlated with campaign success. One reason could be that the initial reward offered served as a reference point, and additional rewards change funders’ perceptions and affect their pledge decisions. I think this is related to the “anchoring effect”, which refers to the human tendency to rely heavily on the first piece of information when making subsequent judgments.

I also like the design implications that they laid out, but I wonder if they would become too much of a burden for crowdfunding platforms to implement. In addition, they could also become an annoyance for project creators, as being prompted about when to post what kind of update, being given guidelines as to what and when to say things on social media, etc., can become too intrusive.

Questions

  • Do you think that a crowdfunding campaign can provide “too many” updates? If so, what type of updates should creators avoid posting in high numbers and high frequency?
  • If you were to start a crowdfunding campaign to fund the project related to your research or your project for this class, what types of updates and rewards would you give your funders and potential funders?
  • From the perspective of crowdfunding platforms such as Kickstarter, do you think it is worth it to implement all the design implications mentioned in this paper?
  • If you have contributed to a crowdfunding campaign in the past, what were the reasons that you contributed? And did the updates the creators provided influence you one way or the other?

Read More

Understanding the Role of Community in Crowdfunding Work

Hui, Greenberg, Gerber. “Understanding the Role of Community in Crowdfunding Work” Proceedings of the 17th ACM Conference on Computer supported cooperative work & social computing. ACM, 2014.

Discussion Leader: Sanchit

Crowdsourcing example: Ushahidi – Website

Summary:

This paper discusses several popular crowdfunding platforms and the common interactions that project designers have with the crowd in order to get properly funded and supported. The authors describe crowdfunding as a practice designed to solicit financial support from a distributed network of several hundred to thousands of supporters on the internet. The practice is a type of entrepreneurial work in that both require “discovery, evaluation, and exploitation of opportunities to introduce novel products, services, and organizations”. With crowdfunding, a niche has to be discovered and evaluated so the target crowd can be convinced to provide financial support in return for a novel product or service that would benefit both the supporters and the project initiator.

Crowdfunding in recent times is almost entirely dependent on online communities like Facebook, Twitter, and Reddit. The authors talk about the importance of having a large online presence because word of mouth travels much faster through the internet than through any other medium. By personally reaching out to people on social media, project creators allow a trustworthy relationship to develop between themselves and the crowd, which can lead to more people funding the project.

The authors conducted a survey of 47 crowdfunding project creators spanning a variety of different project ideas and backgrounds. Some creators ended up with successful crowdfunding projects and made a good enough margin to continue developing and distributing their proposed product. Others weren’t as lucky, often because they lacked a strong online presence, which turns out to be one of the most important aspects of a successful crowdfunding project.

According to the authors, a crowdfunding project involves five tasks over its lifespan: (1) preparing the initial campaign design and ideas, (2) testing the campaign material, (3) publicizing the project to the public through social media, (4) following through with project promises and goals, and (5) giving back to the crowdfunding community. It turns out that coming up with a novel idea or product is a very small portion of the entire story of crowdfunding. The process of designing an appealing campaign was very daunting for several creators because they had never worked with video editing or design software before. Ideas for design and promotion mostly came from inspiration blogs and even paid mentors. Testing campaign ideas was done through an internal network of supporters, and some creators even skipped this step, instead gathering feedback once they eventually got supporters. Publicity depended largely on whether or not the product got picked up by a popular news source or social media account. If creators got lucky, they would have enough funding to support their project and be able to deliver the product to the supporters. However, even this task was difficult for the majority of creators, who were working alone on the project and didn’t have enough resources to bring on additional people for assistance. Lastly, almost all creators wished to give back to the crowdfunding community by funding projects that their supporters create in the future or by providing advice to future crowdfunding creators.

 

Reflection:

Overall, I thought the paper was a fairly straightforward summary and overview of what happens behind the scenes in a crowdfunding project. I have personally seen several Kickstarter campaigns for cool and nifty gadgets, primarily through Reddit or Facebook. This shows that unless someone actively looks for crowdfunding projects, a majority of these projects are stumbled upon through social media websites. Popularity plays a huge part in the success of a crowdfunding project, and it makes perfect sense that it does. A product that is popular among a majority of people will get funded quicker, so creating a convincing campaign around the product is equally important. These social engineering tasks aren’t everyone’s cup of tea, though. I can totally relate to the authors’ comments on artistic people having a better fundraising background than scientific researchers, which allows them to create a much more convincing campaign and take a much more forward approach in trying to recruit support using social media platforms. These skills aren’t really drilled into researchers, since their work is expected to speak for itself when convincing peers that their research is important.

While reading through the paper, I also noticed how much additional baggage and responsibility one has to take on in order to get a project funded. Creating videos, posters, graphics, t-shirts, and gifts, and eventually (hopefully) delivering the final product to customers, is a very demanding process. It’s no wonder that some of these people spend part-time-job hours just maintaining their online presence. I personally don’t see this being used as a primary source of income because there is way too much overhead and risk involved to expect any sort of reasonable payback. This is especially true when most of the funded money is used for creating and delivering the product and then eventually giving money back to other community projects. With crowdsourcing platforms such as Amazon MTurk, there is at least a guarantee that some amount of money will be made, no matter how small. If you play the game smart, then at the very least it’s easy beer money. With crowdfunding, a project gaining enormous traction, let alone reaching its goal, is a big gamble dependent on many variable factors beyond pure work skill.

The tools and websites designed to aid crowdfunding campaigns are definitely helpful and are honestly expected to exist at this point. Whenever there is a crowd-based technology, I feel like Reddit immediately forms a subreddit dedicated to it, with constant chatter, suggestions, and ideas for success. Similarly, people who want to help themselves and others develop tools to make project development easier and stress-free. These tools and forums are great places for general advice, but I agree with the authors that the advice is not personal. The idea of having an MTurk-based feedback system for crowdfunding campaigns is a brilliant and easy-to-implement one. Just linking the project page and asking for feedback at a higher-than-average pay rate would provide a lot of detailed suggestions to help convince future supporters to fund a project.

Overall, the idea of crowdfunding is great, but I wish the paper had touched on the fees that Kickstarter and some other crowdfunding platforms charge to provide this service. That is a cost people should consider when deciding whether to start a crowdfunding project, no matter how big or small.

Discussion:

  1. Have you guys contributed to crowdfunding projects? Or ever created a project? Any interesting project ideas that you found?
  2. Do you agree with the occupational gap the authors hinted at, i.e., that artistic project creators have an easier time with crowdfunding than scientific project creators?
  3. Thoughts on having incentives for donating or funding a larger amount than other people? Good idea or will people be skeptical of the success of the project regardless and still donate the minimum amount?
  4. Would you use Kickstarter to donate to poverty- or disaster-stricken areas rather than donate to a reputable charity? There are several donation-based projects, and I wonder why people would trust those more than charities.

Read More