Standing on the Schemas of Giants: Socially Augmented Information Foraging

Paper:

Kittur, A., Peters, A. M., Diriye, A., & Bove, M. (2014). Standing on the Schemas of Giants: Socially Augmented Information Foraging. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 999–1010). New York, NY, USA: ACM.

Leader: Emma

Summary

In this article, Aniket Kittur, Andrew M. Peters, Abdigani Diriye and Michael R. Bove describe new methods for usefully collating the “mental schemata” developed by Internet users as they work to make sense of information they gather online. They suggest that it may be useful to integrate these sensemaking facilities to the extent that they can be meaningfully articulated and shared. Toward this end, they provide a number of related hypotheses that endorse a social dynamic in the production of frameworks that assist individuals in understanding web content. The authors start from the presumption that individuals “acquire” and “develop” frameworks (which they usually refer to as “mental schemas”) as they surf the ‘net. They ask: “how can schema acquisition for novices be augmented?” and, to some degree, the rest of the article is a response to this question.

Much of this article is a technical whitepaper of sorts: the authors propose a supplement to the web tool Clipper (several variations of which I found through a Google search — this one seems exemplary: https://chrome.google.com/webstore/detail/clipper/offehabbjkpgdgkfgcmhabkepmoaednl?hl=en ) that incorporates their suspicions about the benefits of the social integration of mental schemas. As they explain, Clipper is a web add-on (specifically, I believe, a browser extension) that appears as an addition to the browser interface. Displayed as a text-input box, Clipper encourages users to share their mental schemas by asking for specific types of information about the content users encounter: “item,” “valence,” “dimension” (p. 1000). Here, “item” refers to the object users are researching — the authors use the example of a Canon camera — “dimension” is a feature of the item — the example is picture quality — and “valence” is a sentiment that describes the user’s experience with or opinion of the dimension (like “good” or “bad”). So the phrase “the Canon T2i [item] was good [valence] in terms of picture quality [dimension]” would be a typical Clipper input.
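To make the clip structure concrete for myself, here is a rough sketch of how a single item/dimension/valence record might be represented. This is my own illustration; the class, field names, and example are mine, not code or data from the paper.

    # My own sketch of the item/dimension/valence structure described above;
    # the names here are illustrative, not taken from the paper.
    from dataclasses import dataclass

    @dataclass
    class Clip:
        item: str       # the object being researched, e.g. a camera model
        dimension: str  # a feature of the item, e.g. picture quality
        valence: str    # the user's sentiment about that dimension

    example = Clip(item="Canon T2i", dimension="picture quality", valence="good")
    print(f"The {example.item} was {example.valence} in terms of {example.dimension}.")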

As the authors point out, Clipper initially worked on a strictly individual basis: each user built their own framework in isolation. “Users foraged for information completely independently from others,” they note (p. 1000). Their addition to Clipper is “asynchronous social aggregation,” a feature that incorporates dimensions from other users to bolster the tool’s usefulness. With social aggregation, dimensions can be auto-suggested, and users gain access to a pool of knowledge about the “mental schemas” of many others who have had similar experiences online. The authors suggest that dimensions entered more frequently are generally more valuable for sensemaking, and the augmentation to Clipper that they propose would collate and display dimensions according to their popularity.
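As I understand it, the core of the social aggregation feature is simply pooling everyone’s dimensions and favoring the frequent ones when making suggestions. Here is a toy version of that logic; the pooling and ranking shown are my guess at the mechanics, not the authors’ implementation.

    # Toy sketch of frequency-ranked dimension suggestion. This is my own
    # guess at how "asynchronous social aggregation" could work, not the
    # paper's actual system.
    from collections import Counter

    def suggest_dimensions(all_users_clips, prefix="", top_n=5):
        """Rank dimensions entered by previous users, most popular first."""
        counts = Counter(clip["dimension"].lower()
                         for clips in all_users_clips
                         for clip in clips)
        matches = [(dim, n) for dim, n in counts.most_common()
                   if dim.startswith(prefix.lower())]
        return matches[:top_n]

    pool = [
        [{"dimension": "picture quality"}, {"dimension": "battery life"}],
        [{"dimension": "picture quality"}, {"dimension": "price"}],
        [{"dimension": "price"}, {"dimension": "picture quality"}],
    ]
    print(suggest_dimensions(pool, prefix="p"))
    # -> [('picture quality', 3), ('price', 2)]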

After this, the authors give contextual background to their perspectives on socially augmented online sensemaking. They review relevant contemporary research on information seeking, social data, and social and collaborative sensemaking (p. 1001) to support their hypotheses about the usefulness of socially augmenting Clipper. Then, the article moves to a discussion of the interface design and features, which include autocomplete, dimension hints, a workspace pane that hovers over web pages, and a review table giving users a final view of the clips they have produced during their web-searching activities.

The next part of the article fully describes the multiple hypotheses that underpin the rationale of socially augmenting Clipper. The hypotheses fall into three basic categories: the first is about how the social aggregation of dimensions should lead to overlaps; the second is about the social use and virality of overlapping dimensions; the third is about the objective usefulness and timeliness of this information. The authors then describe the conditions of their experiments with the tool (p. 1004), and provide an assessment of their hypotheses based on this experiment. Overall, their hypotheses were largely supported, while leaving some room for further research: “our results indicated that the dimensions generated by users showed significant overlap, and that dimensions with more overlap across users were rated as more useful,” they tell us (p. 1008), a prelude to this self-judgment: “our results provide an important step towards a future of distributed sensemaking.” At the end, they acknowledge a number of potential drawbacks, most of which emanate from conditions of variability and subjectivity among users.
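Since “overlap” is the load-bearing concept in these results, here is one simple way the overlap between two users could be quantified, as Jaccard similarity of their dimension sets. The measure is my own illustration of the idea, not necessarily the one the authors report using.

    # Illustrative only (not from the paper): quantify the overlap between
    # two users' dimension sets as Jaccard similarity.
    def jaccard(dims_a, dims_b):
        a, b = set(dims_a), set(dims_b)
        return len(a & b) / len(a | b) if (a or b) else 0.0

    user1 = ["picture quality", "battery life", "price"]
    user2 = ["picture quality", "price", "weight"]
    print(jaccard(user1, user2))  # 0.5: two shared dimensions out of four total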

(This is a good place for me to begin my reflection…)

Reflection

This article is very rote and straightforward. (As I mentioned, parts of it read like a technical whitepaper). With that in mind, it’s not the kind of piece that lends itself to strong opinion. If I have any, it’s a mildly negative feeling that is based not so much on the authors’ intentions or the tool’s efficacy as on the presumptions at the core of their method. The notion of a “mental schema” in particular is an under-investigated concept. I’m not sure with what authority they make statements like “users build up mental models and rich knowledge representations that capture the structure of a domain in ways that serve their goals” (p. 999). Obviously they provide citations, but they’re now squarely in the field of psychology, where falsifiable knowledge is elusive and (I’d argue) it is unethical to present this information as fact, at least without further commentary. How a “rich knowledge representation” is different from that which simply goes by the name “knowledge” escapes me — honestly, I think it’s just a convenient conflation. That type of unusual language (and a lot of vaguely explained jargon) pervades their writing. I dislike it because 1) it offers an air of scientific dignity to some of their claims about the way humans make sense of information, whereas what’s really needed is further exploration of the psychological literature on which it’s based, and 2) it’s bad writing. It sounds unnatural and confusing.

Moving away from a basic critique of writing style and language choice — I would have appreciated this more if the authors had gone into further detail about the types of information for which this is useful. I immediately took umbrage at the idea that social data necessarily means an improved user experience when making sense of online content. The ethos of “social” and “sharing” underpins the business model of the web, which encourages people to constantly give their (highly profitable) data over to platforms that hold monopolies and that function largely on network effects. Facebook and Google are as profitable as they are because they emphasize a social dynamic in user interaction and the feeling that the internet is always a community; to not use these tools would mean being left out of the web experience. So I’m immediately suspicious of tools that simply reproduce this mindset rather than articulating and commenting on it (although I understand that social web use is now so naturalized that my take on it may be too erudite to be useful in a broad critique). Having said this, on a less penetrating level, I understand where this could be useful. For instance, I appreciate sites like Yelp and user product ratings when shopping online. It’s just that not everything users do online can be analogized with wanting to make a purchase.

Questions

  1. Based on the part on p. 1003 where they discuss motivational factors in “noticing and using social data”: why would users want to contribute to this project? Is it the same reason people work on websleuthing projects, Wikipedia, and free/open source software? If not, what are the key differences among all these tools that rely on crowdsourced knowledge?
  2. For what types of items would this be most appropriate? The authors make frequent reference to a camera, but what about less concrete objects? Are there items that challenge hypotheses such as “dimensions that are shared across more people will be more useful,” and can we theorize why that might be?
  3. What if this leads to a winnowing effect where majority rule effectively pushes people away from domains that they may have been interested in?
  4. What is the relationship between socially augmented information foraging via the Clipper add-on and a) upvoting (à la Reddit and Metafilter, if anyone remembers what that is!) and b) algorithmic social media timeline prioritization (à la Twitter and Facebook)?
  5. Hypothesis 3.2 (p. 1006) states that “The social condition will generate more prototypical and more useful dimensions earlier than the non-social condition.” But what if this usefulness is partially a function of user suggestibility? As an appendage to this point, and a more general meta-comment on this paper — the authors are clearly addressing psychological matters when they discuss “mental schema.” What assumptions are they making about the way “mental schemas” are created and used, and does this embed a priori bias into the tool?