Improving Crowd Innovation with Expert Facilitation

Joel Chan, Steven Dang, and Steven P. Dow. 2016. Improving Crowd Innovation with Expert Facilitation. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing (CSCW ’16). ACM, New York, NY, USA.

Discussion Leader (con): Nai-Ching

Summary

Although crowdsourcing has been shown to be useful for creativity tasks, the quality of the resulting ideas is still an issue. This paper demonstrates that the quality of crowdsourced creativity tasks can be improved by introducing experienced facilitators in a real-time work setting. The facilitators produce inspirations that are expected to guide the ideation. To measure quality, the authors use divergence (fluency and breadth of search), convergence (depth of search), and creative outcomes (rated creativity of ideas). The results of the first experiment show that with the help of experienced facilitators, both the number of generated ideas and the max creativity of the output increase. The results of the second experiment reveal that with novice/inexperienced facilitators, the creativity of the output is reduced. To further analyze the causes of this difference, the authors code the strategies used to generate the inspirations into categories including “Examples,” “Simulations,” and “Inquiries.” While “Examples” and “Inquiries” do not have significant effects on the output, “Simulations” are highly associated with higher max creativity of ideas. The authors also point out that the different intentions of experienced and novice facilitators might account for the different facilitation results: the experienced facilitators tend to actually do the facilitating job, while the inexperienced facilitators are more inclined to do the ideating themselves.
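To make these outcome measures concrete, below is a minimal sketch (my own illustration in Python, not the authors' analysis code) that computes fluency, breadth, depth, and mean/max creativity from per-idea ratings; the input format, the rating scale, and the depth definition (ideas per explored category) are my assumptions, not the paper's.

    from statistics import mean

    def outcome_measures(ideas):
        """ideas: list of dicts like {"rating": 5, "category": "signage"} (hypothetical format)."""
        ratings = [idea["rating"] for idea in ideas]
        categories = {idea["category"] for idea in ideas}
        return {
            "fluency": len(ideas),                          # divergence: how many ideas were generated
            "breadth": len(categories),                     # divergence: distinct regions of the solution space
            "depth": len(ideas) / max(len(categories), 1),  # convergence: ideas per region (one possible proxy)
            "mean_creativity": mean(ratings),
            "max_creativity": max(ratings),                 # where the facilitation effect shows up
        }

    # One exceptional idea raises max creativity without moving the mean much.
    session = [
        {"rating": 3, "category": "signage"},
        {"rating": 4, "category": "signage"},
        {"rating": 7, "category": "incentives"},
    ]
    print(outcome_measures(session))

Under these definitions, a single standout idea lifts max creativity while barely moving the mean, which is why an effect on max (rather than mean) creativity can still matter to innovators.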

 

Reflections

It seems contradictory that the paper first mentions that popularity and “rich get richer” effects might not reflect actual innovative potential, yet later, on the facilitation dashboard, the keywords are sized by frequency, which seems to be just another form of popularity.

It is not clear how ideators interact with the “inspire me” function before the facilitator enters any inspiration. If there is no inspiration available, is the button disabled? How do ideators know when new inspiration arrives? Also, do facilitators know when ideators request inspiration? I would expect the “inspire me” function to help retain workers and lower the attrition rate, but based on the results, there is no significant difference in attrition between the facilitated and unfacilitated conditions.

In addition, the increased creativity only shows up in max creativity, not mean creativity. On the one hand, this makes sense: as the authors argue, what innovators really care about is increasing the number of exceptional ideas, and since proper facilitation makes higher creativity more likely (or, put another way, increases the potential of getting higher creativity), it is a worthwhile technique. On the other hand, it also shows the technique might not be reliable enough to avoid the manual effort of going through all the generated ideas to pick out the good ones (the max-creativity ones). This paper also reminds me of an earlier paper we discussed, “Distributed analogical idea generation: inventing with crowds,” which mainly increases mean creativity; the change in max creativity is not reported there. It might be possible to combine both techniques to increase both the mean and the max creativity of ideas.

It also seems to me that, in addition to soliciting more ideas, keeping a good balance between divergence and convergence is very important, yet the future work section does not mention that showing the facilitator information about the current breadth and depth of the idea/solution space could help him/her steer the direction of the inspirations.

It is interesting that one of the themes in ideators’ comments is that inspirations provoke new frames of thinking about the problem, yet there is actually no significant difference in breadth between the facilitated and unfacilitated conditions. So I wonder how general that theme is.

Questions

  • What reasons do you think cause the discrepancy between user perception and the actual measurement of breadth of search in the solution space?
  • What is the analogy between the technique from this paper and the technique from “Distributed analogical idea generation: inventing with crowds”?
  • Can most people appreciate creativity? That is, if a lot of people say something is creative, is it creative? And if something is creative, do most people think it is creative as well?


Bringing semantics into focus using visual abstraction

Zitnick, C. Lawrence, and Devi Parikh. “Bringing semantics into focus using visual abstraction.” Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. IEEE, 2013.

Discussion Leader: Nai-Ching Wang

Summary

To address the problem of relating visual information to the linguistic semantics of an image, the paper proposes studying abstract images instead of real images, which avoids the complexity and low-level noise of real images. Abstract images make it possible to generate and reproduce the same or similar images as a study requires, which is nearly impossible with real images. The paper demonstrates this strength by recruiting crowd workers on Amazon Mechanical Turk to 1) create 1,002 abstract images, 2) describe the created abstract images, and 3) generate 10 images (from different crowd workers) for each description. With this process, images with similar linguistic semantic meaning are produced because they are created from the same description. Because the parameters used to create the abstract images are known (or can be detected easily), the paper is able to measure the semantic importance of visual features derived from occurrence, person attributes, co-occurrence, spatial location, and depth ordering of the objects in the images. The results also show that the proposed semantic features yield better recall than low-level image features such as GIST and SPM. The paper further shows that visual features are highly related to the text used to describe the images.
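To show why abstract scenes make semantic features easy to extract, here is a minimal sketch (with a hypothetical scene encoding of my own, not the paper's code or data format): occurrence and co-occurrence can be read directly from the known clip-art parameters, and a crude overlap score can then compare two scenes generated from the same description.

    from itertools import combinations

    def occurrence_features(scene):
        """scene: list of dicts like {"name": "dog", "x": 120, "y": 300, "depth": 1} (hypothetical)."""
        present = {obj["name"] for obj in scene}
        pairs = {tuple(sorted(p)) for p in combinations(present, 2)}
        return present, pairs  # occurrence and co-occurrence, with no vision pipeline needed

    scene_a = [{"name": "boy", "x": 100, "y": 250, "depth": 0},
               {"name": "dog", "x": 180, "y": 260, "depth": 0},
               {"name": "ball", "x": 220, "y": 240, "depth": 1}]
    scene_b = [{"name": "boy", "x": 300, "y": 240, "depth": 0},
               {"name": "dog", "x": 120, "y": 255, "depth": 1}]

    occ_a, co_a = occurrence_features(scene_a)
    occ_b, co_b = occurrence_features(scene_b)

    # A crude Jaccard overlap on co-occurring pairs as one stand-in for semantic similarity.
    overlap = len(co_a & co_b) / len(co_a | co_b)
    print(occ_a & occ_b, overlap)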

Reflections

Even though crowdsourcing is not the main focus of the paper, it is very interesting to see how crowdsourcing can be used and be helpful in other research fields. I really like the idea of generating different images with similar linguistic semantic meaning in order to find the important features that determine semantic similarity. It might be interesting to study the opposite direction as well, that is, generating different descriptions for similar or identical images.

For the crowdsourcing part, quality control is not discussed in the paper, probably because of its focus, but it would be surprising if there were no quality control of the crowd workers’ results during the study. As we discussed in class, maximizing compensation within a certain amount of time is an important goal for workers in crowdsourcing markets such as Amazon Mechanical Turk, and one can imagine pursuing that goal by submitting very short descriptions or randomly placed clip art. In addition, if multiple descriptions are collected for one image, how is the final description selected?

I can also see other crowdsourcing topics related to the study in this paper. It would be interesting to see how different workflows might affect the results: for example, asking the same crowd worker to do all three stages vs. assigning different crowd workers to different stages vs. having different crowd workers work collaboratively. With such settings, we might be able to uncover individual differences and/or social consensus in linguistic semantic meaning. Section 6 also seems somewhat similar to the ESP game, except that the words might be constrained to certain types based on the needs of the research.

Overall, I think this paper is a very good example of how we can leverage human computation along with algorithmic computation to understand human cognition.

Questions

  • Do you think that in real images the viewer will be distracted by other complex features, such that the importance of some features will decrease?
  • As for the workflow, what are the strengths and drawbacks of using the same crowd workers for all three stages vs. different crowd workers for different stages?
  • How would you do quality control of the produced images and descriptions? For example, how do you make sure a description is legitimate for the given image?
  • If we wanted to turn the crowdsourcing part into a game, how would you do it?


Distributed Analogical Idea Generation: Inventing with Crowds

Lixiu Yu, Aniket Kittur, and Robert E. Kraut. 2014. Distributed analogical idea generation: inventing with crowds. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). ACM, New York, NY, USA, 1245-1254.

 

Discussion Leader: Nai-Ching Wang

Summary

This paper introduces a four-step process, distributed analogical idea generation (identify examples, generate schemas, identify new domains, and generate new ideas), to increase the likelihood of producing creative ideas by introducing analogical transfer. The authors identify two issues with current approaches to producing new and good ideas, which trade quantity for quality. The first issue is that rewards are usually given only to the best ideas, ignoring the contributions made by other participants. The other is that trading quantity for quality is usually unstable and inefficient, because we do not know how many ideas are enough. The paper uses three experiments to test the effectiveness of the proposed process. The results of the first experiment show that the quality of the generated ideas (a composite of practicality, usefulness, and novelty) is better with expert-produced schemas. The results of the second show that the number of similar examples increases the quality of schemas induced from the crowd, while contrasting examples are not as useful. The results of the third show that the different qualities of schemas produced in Experiment 2 affect the last step, idea generation. Together, the three experiments confirm that the proposed process leads to better ideas than example-based methods.
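As a way to visualize how the four steps chain together, here is a minimal sketch (my own simplification; post_task is a hypothetical stand-in for publishing a crowd task and collecting responses, not a real API): each step's output becomes the next step's input, so the steps can be distributed across independent workers.

    def post_task(instructions, inputs):
        # Stand-in for posting a microtask per input and collecting worker responses.
        return [f"worker response to '{instructions}' given {item!r}" for item in inputs]

    def distributed_analogical_idea_generation(problem):
        examples = post_task("Identify existing solutions to this problem", [problem])
        schemas = post_task("Abstract the underlying principle (schema) of this example", examples)
        domains = post_task("Identify a new domain where this schema could apply", schemas)
        ideas = post_task("Generate a new idea by applying the schema in the new domain",
                          list(zip(schemas, domains)))
        return ideas

    print(distributed_analogical_idea_generation("reduce food waste in cafeterias"))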

Reflections

This paper starts to address the “winner takes all” issue we have been discussing in class, especially in the design/creativity domain. It seems that we now have a better way to evaluate and appreciate each person’s contribution and to reduce unnecessary or inefficient effort. In general, I like the design of the three experiments, each of which deals with a specific aspect of the overall study. Experiment 3 shows that good schemas help produce better ideas. It would be interesting to see how good the experimenter-generated schemas are, especially if their quality scores could be compared to the results of Experiment 2; unfortunately, this information is not available in the paper. The distributed process presented in the paper is very impressive because it decomposes a larger process into several smaller components that can be run separately. It would be interesting to see a comparison of idea quality between the traditional approach and the approach used in the paper. It would also be interesting to compare the quality of assembly-line and artisan processes, because the latter might provide learning opportunities and thus produce higher-quality results, even though it is not as flexible or distributable as an assembly line.

Questions

  • What are the benefits/shortcomings of having the raters discuss with each other or be trained before judging?
  • Do you think design guidelines/heuristics are similar to schemas mentioned in the paper? How similar/different?
  • In Experiment 3, why do you think each example is paired with either a good or a bad schema? Why not just use good or bad schemas alone?
  • This paper mostly focuses on average quality. For creativity tasks, do you think that is a reasonable measure?


Crowd synthesis: extracting categories and clusters from complex data

Paul André, Aniket Kittur, and Steven P. Dow. 2014. Crowd synthesis: extracting categories and clusters from complex data. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing (CSCW ’14). ACM, New York, NY, USA, 989-998.

Discussion Leader: Nai-Ching Wang

Summary

This paper proposes a two-stage approach to guide crowd workers to produce accurate and useful categories from unorganized and ill-structured text data. Although automatic techniques are already available to group text data by topic, manual labeling is still required for analysts to infer meaningful concepts. Assuming crowd workers are transient, inexpert, and sometimes conflicting, this work is motivated by two major challenges of harnessing crowds to synthesize complex data. One is producing expert-quality work without requiring domain knowledge. The other is enforcing global constraints when crowd workers have only local views. The proposed approach addresses the former challenge with a re-representation stage, which consists of different combinations of classification, context, and comparison, including raw text, classification (Label 1), classification + context (Label 10), and comparison/grouping. The latter challenge is addressed with an iterative clustering stage, which shows the existing categories to subsequent crowd workers to enforce global constraints. The results show that the classification-with-context (Label 10) approach produces the most accurate categories at the most useful level of abstraction.
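To make the iterative clustering stage more concrete, here is a minimal sketch (my own simplification, not the authors' system): each worker sees only a small batch of items plus the category list built so far, so the growing global structure constrains local decisions. The ask_worker function is a hypothetical stand-in for a real crowd task and here simply fakes a worker by keying on each item's first word.

    def ask_worker(items, existing_categories):
        # Stand-in: a real worker would assign each item to an existing or new category,
        # guided by the categories created so far; here we fake it with the first word.
        return {item: item.split()[0].lower() for item in items}

    def iterative_clustering(labeled_items, batch_size=3):
        categories = {}  # category name -> items, i.e., the shared global structure
        for start in range(0, len(labeled_items), batch_size):
            batch = labeled_items[start:start + batch_size]
            assignments = ask_worker(batch, list(categories))  # worker sees prior categories
            for item, category in assignments.items():
                categories.setdefault(category, []).append(item)
        return categories

    items = ["privacy concerns about location tracking", "privacy of shared photos",
             "battery life complaints", "battery drains overnight",
             "screen cracks too easily"]
    print(iterative_clustering(items))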

Reflections

This paper resonates well with our discussion about human and algorithmic computation because it points out why humans are required in the synthesis process, and algorithmic computation was used precisely to demonstrate this point. The paper also mentions potential conflicts among crowd workers, but as we can see, there are conflicts between the professionals (the two raters) as well, which makes me wonder whether there really are right answers. Unfortunately, the paper does not compare the crowd workers’ work against each other to show how conflicting their answers are. It would also be interesting to compare the consistency of experts and crowd workers. Another interesting result is that the raw-text condition is almost as good as the classification-plus-context condition except for the quality of abstraction. It seems that by combining previously discussed person-centric strategies, the raw-text condition might perform as well as the classification-plus-context condition, or even outperform it. In addition, the choice of 10 items for context and grouping in Stage A seems arbitrary. Based on the results, more context seems to lead to better results, but is that true? Or is there an optimal or minimal amount of context? Also, for grouping, the paper mentions that the selection of groups might (greatly) affect the results, so it would be interesting to see how different selections affect the outcome. As for the coarse-grained recall results, it seems strange that the paper does not report the original values even though the authors consider coarse-grained recall valuable.

Questions

  • The global constraints are enforced by showing existing categories to subsequent workers. What do you think about this idea? What issues might this approach have? What would your solution be?
  • The paper seems to hint that the number of characters in a label can be used to measure the level of a concept. Do you agree? Why? What other measures would you suggest for defining levels of concepts?
  • How would you expect quality control to be conducted?


We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers

Niloufar Salehi, Lilly C. Irani, Michael S. Bernstein, Ali Alkhatib, Eva Ogbe, Kristy Milland, and Clickhappier. 2015. We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15). ACM, New York, NY, USA, 1621-1630. DOI=10.1145/2702123.2702508 http://doi.acm.org/10.1145/2702123.2702508

Discussion Leader: Nai-Ching Wang

Summary

Based on a year-long ethnographic field study, this paper proposes structured labor to overcome two obstacles that impede collective action among crowd workers. By participating deeply in the ecology of human computation rather than observing from the outside, the authors find that although forums already exist to help improve working conditions, several challenges still prevent further collective action, such as trust, privacy, the risk of losing jobs, and workers’ diverse purposes for working. To support collective action in the Amazon Mechanical Turk community, Dynamo is designed to provide three affordances: trust and privacy, assembling a public, and mobilizing. Through several campaigns on the Dynamo platform, the authors identify two intertwined threats to collective action: stalling and friction. To overcome these threats, the paper proposes structured labor, including “debates with deadlines,” “act and undo,” “produce hope,” and “reflect and propose,” and demonstrates how this structured labor can be used in real cases.

Reflections

I see several connections between this paper and our previous topics and in-class discussions. Based on the definitions from Quinn and Bederson’s paper, this paper falls into the categories of social computing and crowdsourcing, and thus collective intelligence, in the sense that crowd workers on Amazon Mechanical Turk form an online community and (ideas for) social movements are crowdsourced to the community instead of to designated leaders or consultants. Regarding the question “Can we foresee a future crowd workplace in which we would want our children to participate?”, this paper offers some possibility for crowd workers to take collective action and strive for a fair labor marketplace. The paper also seems to be a good example of human-computer collaboration. Interestingly, Dynamo (the computer) is designed around the concept of affordances, naturally in line with the affordance-based framework discussed previously, while Turkers and admins provide human affordances such as sociocultural awareness, creativity, and domain knowledge. The examples in this paper also demonstrate that crowd workers can complete not only micro-tasks, as seen in last week’s discussion, but also brainstorming and quality writing. In addition, the paper suggests that, at least for now, traditional management works better than algorithmic management for this kind of effort.

Questions

  • Do you think the campaigns in the paper were successful? Why or why not? What else do you think is important for the success of collective action?
  • For the labor of action (debates with deadlines, act and undo, the production of hope, reflect and propose),
    • Which parts do you think might be handled automatically by computer algorithms? In other words, which parts are the ones that really need human computation (at least for now)?
    • Can these tasks be further divided into smaller tasks?
    • How can that possibly be done?
  • How do we find a trustworthy person to be the “moderator”? Or rather, how do we decide whether a person is trustworthy enough to be the “moderator”?
  • Can we delegate some of the moderation to computers? Which parts? And how?
