An Update on NLTK — Web Based, GUI NLTK via WebNLP!

Tool: WebNLP at http://dh.mi.ur.de/

My demonstration tomorrow is essentially a re-do of my earlier NLTK demo. For this round, I’ve been looking at the tool WebNLP. Unfortunately, the website on which it is hosted hasn’t been loading all evening. The white paper that accompanies this tool is very interesting, and reflects some of my experiences trying to work with NLTK as a humanities/social sciences researcher. Hopefully the site will be up again soon, but for everyone’s edification, I thought that a blog post about the WebNLP white paper would be productive.

Here’s the paper: https://www.researchgate.net/publication/266394311_WebNLP_-_An_Integrated_Web-Interface_for_Python_NLTK_and_Voyant

This includes a description of WebNLP’s functionalities, but is also a rationale for the development of a web-based GUI NLTK program. The authors write:

 Most of these [NLP] tools can be characterized as having a fairly high entry barrier, confronting non-linguists or non-computer scientists with a steep learning curve, due to the fact that available tools are far from offering a smooth user experience (UX)…

The goal of this work [the development of WebNLP] is to provide an easy-to-use interface for the import and processing of natural language data that, at the same time, allows the user to visualize the results in different ways. We suggest that NLP and data analysis should be combined in a single interface, as this enables the user to experiment with different NLP parameters while being able to preview the outcome directly in the visualization component of the tool. (235-236)

They go on to describe how WebNLP works, visualized in this graphic:

Visualization of WebNLP functionality

As we can see, WebNLP joins Python NLTK with the program Voyant to create a user-friendly (i.e. no coding or command line interface requirements) tool for NLP that is sophisticated enough for scholarly research. The fact that it’s web-based seems to be a benefit, too; I’d imagine that a local application would require the user to install Python, which could be problematic.

WebNLP is based on JavaScript and the front-end framework Bootstrap. I don't know whether it's open source: I couldn't find it on GitHub, and the paper doesn't say. As far as I can tell, the only place it is hosted is at the link shared above. It doesn't seem extremely difficult to implement, and given its potential usefulness, I (of course!) think that it should be hosted at a stable site, or that the code should be opened up so that others can host WebNLP instances and iterate on it. Right now, the NLP side is limited to sentence tokenization, part-of-speech tagging, stop-word filtering, and lemmatization; NLTK itself can do considerably more. The visualization output, meanwhile, can produce word clouds, bubblelines, type frequency lists, scatter plots, relationships, and type frequency charts. Even just as an experiment or learning tool, it might be useful to think about how else these data might be visualized.
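
For anyone curious what those four operations look like in raw NLTK (the part WebNLP hides behind its GUI), here is a minimal sketch. It assumes NLTK is installed and that the relevant data packages can be downloaded; the sample sentence is invented.

    import nltk
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    # One-time downloads for the models and wordlists used below
    for pkg in ["punkt", "stopwords", "wordnet", "averaged_perceptron_tagger"]:
        nltk.download(pkg)

    text = "The quick brown foxes were jumping over the lazy dogs. They did it twice."

    # 1. Sentence and word tokenization
    sentences = nltk.sent_tokenize(text)
    tokens = [tok for sent in sentences for tok in nltk.word_tokenize(sent)]

    # 2. Part-of-speech tagging
    tagged = nltk.pos_tag(tokens)

    # 3. Stop-word filtering
    stops = set(stopwords.words("english"))
    content_words = [w for w in tokens if w.isalpha() and w.lower() not in stops]

    # 4. Lemmatization (treats everything as a noun by default)
    lemmatizer = WordNetLemmatizer()
    lemmas = [lemmatizer.lemmatize(w.lower()) for w in content_words]

    print(tagged)
    print(lemmas)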

All this being said, I really do hope http://dh.mi.ur.de/ returns soon. In any case, it’s encouraging to see that something I thought should already exist… already does. And the paper has been useful for my semester-long project.

 

 


CREDBANK: A Large-Scale Social Media Corpus with Associated Credibility Annotations

Article: CREDBANK: A Large-Scale Social Media Corpus with Associated Credibility Annotations: https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10582/10509

 

Summary

This is the whitepaper for Credbank, a system (which the authors refer to as a “corpus”) for systematically studying the phenomenon of social media as a news source. Credbank specifically investigates Twitter: it relies on real-time tracking and “intelligent routing” of tweets to crowdsourced human annotators to determine tweet credibility. The trial of Credbank assessed in the paper took place over three months, and comprised “more than 60M tweets grouped into 1049 real-world events, each annotated by 30 Amazon Mechanical Turk workers for credibility (along with their rationales for choosing their annotations)” (258). Tanushree Mitra and Eric Gilbert correctly note that credibility assessment has received a great deal of attention in recent years, and the paper pays due respect to the other work done in this arena. Toward the end of their “related work” section they note that their contribution is unique in its mobilization of real-time analysis.

The bulk of the paper describes their method of collecting and analyzing Twitter data in real time, beginning with a pre-processing schema that screens tweets through tokenization, stop-word removal, and spam removal. This is key to their use of LDA (latent Dirichlet allocation), which finds similarities between word strings (in this case, tweets) and inductively generates topic models from them. Humans intervene in this process soon thereafter: MTurk workers confirm whether the tweets gathered actually relate to a newsworthy event; the authors note that purely computational approaches often lead to false positives (261). The authors explain what they take to count as measurably “credible” (262) before disclosing that, in the process of running these trials, they also discovered the number of MTurkers necessary to approximate an expert’s judgment: 30 per event (263). Through their statistical analyses of events annotated by 1,736 Turkers, they arrive at the conclusion that, put simply, events discussed on Twitter have an alarmingly low rate of credibility: the highest percentage of agreement on the “certain accuracy” of tweets stood at 50% (for 95% of tweets), and the percentage-of-tweets to percentage-of-agreement ratio followed the same pattern (only 55% of tweets had 80% agreement on certain accuracy) (264).
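
Since the preprocessing-plus-LDA pipeline is the technical heart of the paper, a compact sketch may help make it concrete. This is not Mitra and Gilbert's code; it is an illustration, using NLTK for tokenization and stop-word removal and gensim for LDA, with made-up tweets and arbitrary parameters.

    import nltk
    from nltk.corpus import stopwords
    from gensim import corpora, models

    nltk.download("punkt")
    nltk.download("stopwords")

    # Toy stand-ins for streamed tweets (the real pipeline also removes spam)
    tweets = [
        "Breaking: major earthquake reported near the coast",
        "Earthquake shakes coastal towns, residents evacuating",
        "New phone announced today with a much bigger screen",
    ]

    stops = set(stopwords.words("english"))

    # Tokenize and strip stop words, mirroring the paper's preprocessing step
    docs = [
        [w for w in nltk.word_tokenize(t.lower()) if w.isalpha() and w not in stops]
        for t in tweets
    ]

    # Build a bag-of-words corpus and fit a small topic model
    dictionary = corpora.Dictionary(docs)
    bow = [dictionary.doc2bow(doc) for doc in docs]
    lda = models.LdaModel(bow, num_topics=2, id2word=dictionary, passes=10)

    for topic_id, words in lda.print_topics():
        print(topic_id, words)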

The authors conclude with a macroscopic assessment of factors implicated in current and future research on this topic. These include temporal dynamics (recurring events, such as sports games, had lower overall credibility), the role of social networks and mass media in shaping credibility ratings, the viability of a distribution-based (normal curve) model of credibility, other strategies used to confirm credibility, and the role that supplementary data may play.

 

Analysis

The authors ostensibly use the term “corpus” because they believe that the major contribution of Credbank is the dataset. Although the dataset is perhaps the more obviously practical offering, their methodology (the combination of theory and practice, well explicated in the steps they take to arrive at their data) seems the most instructive for those interested in advancing knowledge about crowdsourcing and social-media-as-news credibility in a more general sense. To me, Credbank is not so much a dataset as an example of theory in practice. Its shortcomings suggest that assumptions about human expression on social media may require more careful consideration before being operationalized in systems like the one seen here.

Their use of topic modeling/LDA seems notable to me, and a place where we can use the outcomes (evidently, tweets aren’t very credible) to tweak the theoretical assumptions. I think they may want to revisit their use of tokenization and stop words in order to account for “the nuances associated with finding a single unique credibility label for an item,” a problem that they believe impacts the viability of credibility to be modeled along a normal curve.

 

 

Questions

  1. Given our prior discussions about social media credibility as a news source, what is different about Credbank? Is there anything specific to its functionality that makes you think it more or less trustworthy?
  2. What do we think about the functions used in the data preprocessing (their methods of spam removal, tokenization, and stop-word filtering)? Can we identify any ways in which these choices might affect the system deleteriously?
  3. To return to a prior issue, since it comes up in this paper: what do we think about the use of financial incentives here? Could this taint the annotations?
  4. They frequently discuss the use of “experts” here, but do not identify who they are. Do we see this as a weakness of the paper — and perhaps more interestingly, are there any real experts in this arena?
  5. Is there a way to crowdsource credibility annotation of tweets that does not rely on inductive preprocessing? I would suggest that tokenization, stop-word filtering, and other filters distort the assessment of tweets to the point where this system can never be functionally practical for the purposes of real social science research.


Journalists as Crowdsourcerers: Responding to Crisis by Reporting with a Crowd

Article: “Journalists as Crowdsourcerers: Responding to Crisis by Reporting with a Crowd:” https://link.springer.com/article/10.1007%2Fs10606-014-9208-z

Summary

This article is about professional journalists in a community deeply affected by Hurricane Irene, and how their role evolved from typical reporting to leading a crowdsourced communication and self-advocacy movement among a population in crisis. This process was shaped by the rural surroundings (the Catskill Mountains of upstate New York) of the journalists and the community they served. In this case, the compounding problems of uneven IT infrastructure were deeply implicated in how they produced ad-hoc, on-the-fly content, as well as in the development of provisional ethical guidelines about journalism and reporting. A major innovation from this team is their conceptualization of the “human-powered mesh network,” which introduces humans as “nodes” in a peer-to-peer mesh IT infrastructure.

Social media became central to the emphasis on humans in the network. In the authors’ explanation of the role of social media in emergency situations, it becomes clear that the concept of “human infrastructure,” of which the human-powered mesh network is a subcategory, could not exist without social media. Platforms that connect individuals across the Internet and gain power through network effects create the conditions for this “human infrastructure.”

The authors give a detailed account of how Hurricane Irene affected the Catskills region before turning to an exploration of how local journalists used the Internet during this time. They describe the use of live-blogging: real-time reporting and data-sharing by journalists who had started an online local news site called The Watershed Post. The platform that developed in the wake of Hurricane Irene out of a real-time news feed on the Watershed Post, powered by software called CoverItLive and known simply as “the Liveblog,” became essential to the dissemination of emergency-response information. In their methodology section, which follows the situational description, the authors explain that they settled on a blend of qualitative analysis of journalist interviews and the digital record of the Liveblog.

In both the Liveblog and in conversation with the journalists, the importance of social media, and of amplifying the message via popular platforms like Twitter, is evident. The Watershed Post editors established a presence for their Liveblog on Twitter. At one point, the following message was posted to the Liveblog:

 If we lose power during the storm, this live feed will continue to automatically pull in Twitter updates from other local newspapers, but we’ll be unable to post (for obvious reasons). Cross your fingers.

And since they did indeed lose power, this became critical. The redundant, always-on chain of communication supported by social media infrastructure allowed the Liveblog to balloon out. Guest comments on the blog also rose to prominence as a major source of information. It got to the point that moderating these comments became a task in and of itself, and moderators began to assume the role of public-facing authorities and gatekeepers in a situation where accuracy is of the essence.

At the end, the authors note a discrepancy between the presumptions of HCI researchers and the way information flowed organically in this case (and, by extrapolation, how it might flow in similar situations; they cite the Virginia Tech tragedy in 2007 as an analogous example). Resisting strong fidelity to one side or the other led them toward the hybrid concept of the human-powered mesh network: insights from both can inform our thinking in our roles as journalists, citizens, technologists, and (more often than not these days) some blend of the three.

Analysis

Prefatory note: this hit pretty close to home for me, because I was living in the Catskill Mountains when Hurricane Irene happened, and my roommate was working on building mesh networks. He was constantly frustrated by the geographical barriers to peer-to-peer networking (i.e., mountains!). Also, his work was disrupted when we had to evacuate our apartment…

This isn’t just a gratuitous anecdote. The human/”meatspace” factor in IT innovations (including geographical, topographical, and legacy-infrastructure concerns) often gets left out of the conversation when we look at their revolutionary potential. Yet in this case (as is often true of non-IT inventions), where crisis was the mother of invention, the need to design for complex and worst-case scenarios gave traction to the development of a concept: human-powered mesh networks. This concept has a wide span of applications, and it will likely only grow in importance as mobile technology proliferates and as disaster management and response become more pressing (if we believe the science on climate change).

That’s why I think crowdsourcing is a rich topic: it is, basically, a technological concept that needs human actors en masse in order to work. With that will always come considerations that emphasize multiplicity. Although the developers behind some major software don’t have to design their tech to be as accessible as possible, we can’t talk about (for example) MTurk without discussing ethics and accessibility, since it needs to work for many different kinds of people. Likewise, human-powered mesh networks place unique requirements on the way we think about human and physical-world factors. In emphasizing the crowd, the multiple, the conversation necessarily becomes a more just one.

In a case where accuracy could mean life or death, trust is absolutely essential. The authors indicate this toward the end of their discussion. Although we have, in this class, looked toward automated means of fact-checking and verification, I’d like to propose something a bit different. The writers make this observation:

“Human infrastructuring in this case also included developing, adapting and communicating shared practices, and establishing a shared sense of ownership within the collaboration.”

With a shared sense of ownership comes (at least in theory) a shared responsibility to be accountable for the impact of the information you spread. The human-powered mesh network, and those who adopt roles of relative power within it (as comment moderators, contributors to maps, frequent tweeters and bloggers, and so on), runs on the assumption of ethics and good faith among those who participate. Automating fact-checking and information accuracy is one thing, but in focusing on how we can give this role to computers, perhaps we forget that networks of humans, both before and after the digital turn, have a decent track record of spreading reliable information when it really matters.

Questions

  • How does infrastructure topography change the way we think about the spread of information? Does the fact that access to Internet necessities (like electricity, working computers, and of course Internet itself) varies globally make it impossible to have a global conversation about concepts like the ones proposed in this article?
  •  In times of crisis, would you rather automate fact-checking or rely on humans?
  • Since they discuss how useful the crowdsourced map was: are some forms of data representation better suited to crowdsourcing than others? For example, is it better (broadly construing “better”; it can mean easier, more efficient, more accurate, and so on) to crowdsource a map, or other imagistic data representations, than textual information? Does this change based on context? For instance, might it be more effective to crowdsource textual information outside a time of emergency?
  • What do we think of “infrastructure as a verb,” (“to infrastructure”), the notion that infrastructure is a constant, active process rather than a static object? What implications does this reframing have for HCI?


“So You’ve Been Publicly Shamed,” Chapter 4

Summary

Chapter Four of Jon Ronson’s So You’ve Been Publicly Shamed begins with the story of Justine Sacco. Sacco was a public relations specialist working in New York City. In December 2013, on a flight to South Africa, she sent out the following message to her Twitter audience (roughly 170 people):

“Going to Africa. Hope I don’t get AIDS. Just kidding. I’m white!” (68)

Ronson correctly observed that this tweet was offensive and badly worded, but, as he tells the story of what happened to Sacco afterwards, he makes it clear that he does not think it was hate speech. Put simply, that message destroyed Sacco’s life. Not long after she posted it, she became subject to a moral trial by hundreds of thousands of people across the world. She was immediately charged with racism and insensitivity to the AIDS crisis, and her Twitter profile, along with other information about her available online, was ransacked for further evidence of her moral shortcomings. At some point, Gawker Media journalist Sam Biddle retweeted the tweet to his 15,000 followers.

This tweet cost Sacco her career (at the time, she was employed at what she identified as her “dream job”) and led to personal invasions to the nth degree. For example, she had boarded a plane immediately after tweeting; by the time she touched down in South Africa, there was already a stranger waiting at the airport to snap her photo. Google searches for her name jumped from about 30 a month to over a million (71). She had to take refuge in her apartment, essentially going into hiding.

Ronson emphasized that at the time of the book’s writing (in 2015), Sacco was reluctant to speak with journalists. She was afraid of being further misunderstood. It seems that his explicit sympathy toward her, and his willingness to paint her as a human being who was not racist or sociopathic but merely, perhaps, someone with bad taste in humor, is the only reason he was able to interview her for the book.

The author’s interest in the Sacco case led him to sit down with a man named Ted Poe. After telling Sacco’s story, Ronson recounts his time with Poe, a judge notorious for handing down absurd and arguably over-the-top punishments to defendants. These sentences were specifically designed to shame people. For example, one of Poe’s punishments stipulated that a young man who had killed two people in a drunk-driving incident walk around with a sign declaring his crime in front of high schools and bars once a month for two years (82).

Intriguingly, while Ronson had expected Poe to be an absolute monster, he found the judge’s explanation of these unusual punishments to be “annoyingly convincing” (86). Perhaps even more interesting is the fact that the young man who was sentenced to shame himself for manslaughter later came to be grateful for the punishment. By his assessment, Poe had saved him from a lifetime of incarceration and given his life purpose by allowing him to serve as a warning to others (87). The chapter concludes with Poe and Ronson discussing how shaming by trial-by-Internet is far harsher than legally authorized shaming.

Analysis

This book couldn’t have come sooner. Two personal contacts of mine who work in news media have effectively been kicked off Twitter for badly worded commentary, charged with moral indiscretion, even though they’re both a far cry from the hate-speech mongers who pervade the Internet and somehow never get policed. I mention this because I myself am not that close to the media; I’m not uniquely disposed to having personal familiarity with this scenario. Many of us probably have first- or secondhand experience with online shaming to varying degrees.

While Justine Sacco should have been a bit wiser, she certainly didn’t deserve what happened to her. The fact is, on some level, her shamers probably knew that. At one point Ronson notes that some among the angry mob must have known that the decisive tweet did not emanate from xenophobia but was a shoddy attempt at poking fun at white privilege. He writes that “people [must have chosen] to willfully misunderstand it for some reason” (74). We can speculate as to why so many people would indulge this vindictive mentality: they get to feel like they’re part of something, and perhaps as though they’re on the moral high ground. It’s the pleasure of righteousness, perhaps. What is key to me here is that the Internet uniquely empowers this impulse, which makes me wonder if it reveals a secret about human nature, a tendency toward mob mentality that most of us would prefer not to think about.

Along these lines, Ronson observes that many are willing to get on board with appearance rather than reality: “It didn’t matter if she was a privileged racist, as long as she sort of seemed like she was.” By now, popular discourse on “truthiness” (thank you, Stephen Colbert) has brought this dynamic to light. But simply acknowledging the Internet falsehood-machine doesn’t mean it’s being dismantled. If anything, it’s becoming more powerful. Our current President has used the lack of falsifiability to his advantage: charges of “fake news!” meant to discredit mainstream media have a tangible impact on the public and on our understanding of politics. Ronson points out that through Twitter and other platforms, “every day a new person emerges as a magnificent hero or a sickening villain” (78-79). Under these conditions, nuance is often lost, and communication about subtle and complex issues breaks down. I’m not sure that anything even remotely important can be declared on Twitter unless the speaker is willing to fight a war.

Questions:

  1. Is shame-based punishment ever acceptable? In what contexts — online, in a court of law, among close friends and family? Is it more acceptable for certain types of offense than others?
  2. Should popular web platforms, especially Twitter, take a more active role in policing hate speech? If they did, would this help to stymie misplaced outrage?
  3. Do you think there’s a way to automate detection of hate speech online, and would this be desirable?
  4. Is the “justice system” of the Internet really as lawless as Ted Poe says it is, or are there discernible “rules” (patterns) we can follow to mitigate the potential of being shamed or harassed online?


NLTK: The Natural Language Toolkit

The Natural Language Toolkit (NLTK) is a suite of function libraries for the Python programming language. Each of these is designed to work with natural language corpora: bodies of text generated from human speech or writing. NLTK assists natural language processing (NLP), which the NLTK developers define broadly as “computer manipulation of natural language.”

Uses for the various libraries contained within the NLTK suite include, but are not limited to, data mining, data modeling, building natural-language-based algorithms, and exploring large natural language corpora. The NLTK site notes the following: “[NLTK] provides basic classes for representing data relevant to natural language processing; standard interfaces for performing tasks such as part-of-speech tagging, syntactic parsing, and text classification; and standard implementations for each task which can be combined to solve complex problems.” Moreover, there is extensive documentation that “covers every module, class and function in the toolkit, specifying parameters and giving examples of usage.”
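
As a quick illustration of the “text classification” interface mentioned above, here is a tiny and entirely artificial example using NLTK’s built-in naive Bayes classifier; real work would use far larger feature sets and corpora.

    import nltk

    # Tiny hand-labeled training set: feature dictionaries paired with labels
    train = [
        ({"contains_free": True,  "contains_meeting": False}, "spam"),
        ({"contains_free": False, "contains_meeting": True},  "ham"),
        ({"contains_free": True,  "contains_meeting": False}, "spam"),
        ({"contains_free": False, "contains_meeting": False}, "ham"),
    ]

    classifier = nltk.NaiveBayesClassifier.train(train)

    # Classify a new "document" described by the same features
    print(classifier.classify({"contains_free": True, "contains_meeting": False}))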

I discovered NLTK because I am interested in automated sentiment and valence analysis. Having had a very brief exposure to Python, I was aware that it has a gentle learning curve, and I was confident that I could learn enough Python to use some of NLTK’s simpler sentiment analysis functions. However, I’m rusty with programming, and I keep running into roadblocks. Ultimately my goal is to use one or more of NLTK’s sentiment analysis libraries to explore certain natural language datasets, although I’m not there yet.

However, I can give some instructions and screenshots from what I’ve found along the way.

Getting started with NLTK:

  1. Download and install Python. The latest Python installation packages can be found at python.org, which includes OS-specific instructions.
  2. Download and install NLTK from NLTK.org, which also includes OS-specific instructions.
  3. For reference, there is a free introductory and practice-oriented book on NLTK here: http://www.nltk.org/book/
  4. The final step I can advise on is to import data and select the library from within the NLTK suite to work on it.

I myself haven’t gotten this far yet. It was difficult for me to install NLTK on my Mac and Linux machines using the NLTK instructions given for Unix-based systems. I think I have the program properly installed, but I’m not sure.
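
For anyone in the same boat, a quick sanity check like the following should confirm whether the installation actually works (the sample sentence is arbitrary):

    import nltk
    print(nltk.__version__)   # confirms the package is importable

    nltk.download("punkt")    # tokenizer models used by word_tokenize

    # If this prints a list of tokens, NLTK is installed and working
    print(nltk.word_tokenize("NLTK appears to be installed correctly."))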

For further demonstrative purposes, however, here are screenshots of the documentation for two NLTK sentiment analysis tools. The first is from “Sentiment Analyzer,” which was developed to be broadly applicable in NLP, and the second two are from “Vader,” which is designed to work on text from social media.

 

Documentation from NLTK Sentiment Analyzer
Documentation from NLTK library “Vader” (1)
Documentation from NLTK library “Vader” (2)
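
To complement the screenshots, here is a minimal sketch of what calling Vader looks like in code. The example sentences are invented, and the import path assumes a reasonably recent NLTK release:

    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")   # the lexicon Vader scores against

    sia = SentimentIntensityAnalyzer()

    for text in [
        "I love this class, the readings are great!",
        "Ugh, the website has been down all evening :(",
    ]:
        # polarity_scores returns neg/neu/pos proportions plus a compound score in [-1, 1]
        print(text, sia.polarity_scores(text))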

Okay, I think that’s all I have for now. I intend to keep working with NLTK throughout the semester — so if anybody is skilled with Python, or interested in valence/sentiment analysis, it would be great to talk with you about this!


Shodan (tech demo)

Tool: Shodan. www.shodan.io

Shodan is arguably the most invasive tool we’ve encountered so far. In essence, it is a search engine for Internet-connected devices. Its sources are HTTP/HTTPS, FTP (port 21), SSH (port 22), Telnet (port 23), SNMP (port 161), SIP (port 5060), and Real Time Streaming Protocol (which is where things get unambiguously creepy). To my knowledge, the ports listed are all the defaults associated with those protocols.

The data it gathers includes information that the device sends back to the client, such as the IP address, the type of server, and code documents associated with the device (I personally found a lot of HTML text documents). Shodan finds this by scanning the Internet for publicly open or unsecured devices, and then provides a search engine interface to access this information. Users without accounts can see up to ten search results; those with free accounts get up to fifty. For further access, you need to pay a fee and provide a reason for use.

The “reason for use” is pretty key. From the vast array of online articles that have been published about Shodan since its launch in 2009, one gets two distinct pictures of the tool: in the first, it assists law enforcement officials, researchers (broadly construed), and business professionals interested in learning how their products are being used. In the second, it’s a way to get unauthorized access to all sorts of information, including live webcam streams and other obviously invasive flows of data. It was very, very easy for me to use Shodan to access what I believe to be security cameras inside personal residences. Shodan also offers an open API that lets other tools access its entire database.
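
For what it’s worth, that API has an official Python client (installed with pip install shodan). The sketch below assumes you already have an account and an API key with search privileges; it simply runs the same kind of “webcam” query you can run in the web interface.

    import shodan

    API_KEY = "YOUR_API_KEY"   # placeholder; requires a Shodan account and key
    api = shodan.Shodan(API_KEY)

    try:
        results = api.search("webcam")   # same query as in the web interface
        print("Results found:", results["total"])
        for match in results["matches"][:5]:
            # Each match bundles the service banner with basic host metadata
            print(match["ip_str"], match.get("port"), match.get("org"))
    except shodan.APIError as e:
        print("Error:", e)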

Here’s how to get started:

  1. Sign up for an account at shodan.io. (All you need is an email address).
  2. Use the search bar at the top of the screen to input a query. Anything can go here, although for those just curious to see what Shodan can do, a geographical location or a type of device seems to make sense. Searching for “webcam” will indeed pull up live webcam streams, as well as information about the camera.
  3. (Well, 2.5). If you’re out of search query ideas, the “Explore” feature will pull up popular search terms.

That’s pretty much it!

In the space of a few minutes, I was able to spy through a Norwegian weather camera, into a hospital in Taiwan, into what appeared to be an office in Russia (where I watched two bored-looking employees have a conversation), into a few houses, and into an MIT dorm room. As it was, I only got video, not audio, although Real Time Streaming Protocol appears to support audio as well; that may just be how those particular cameras are configured.

The legality of this is questionable. But in the words of a tech-savvy friend I talked to about this, “if you’re not in the blackmail business, you probably won’t arouse any suspicion.”

I will reserve further commentary for now.

 


Standing on the Schemas of Giants: Socially Augmented Information Foraging

Paper:

Kittur, A., Peters, A. M., Diriye, A., & Bove, M. (2014). Standing on the Schemas of Giants: Socially Augmented Information Foraging. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 999–1010). New York, NY, USA: ACM.

Leader: Emma

Summary

In this article, Aniket Kittur, Andrew M. Peters, Abdigani Diriye and Michael R. Bove describe new methods for usefully collating the “mental schemas” that Internet users develop as they work to make sense of information they gather online. They suggest that it may be useful to integrate these sense-making facilities to the extent that they can be meaningfully articulated and shared. Toward this end, they offer a number of related hypotheses that endorse a social dynamic in the production of frameworks that help individuals understand web content. The authors start from the presumption that individuals “acquire” and “develop” frameworks (which they usually refer to as “mental schemas”) as they surf the ‘net. They ask: “how can schema acquisition for novices be augmented?” To some degree, the rest of the article is a response to this question.

Much of this article is a technical whitepaper of sorts: the authors propose a supplement to the web tool Clipper (several variations of which I found through a Google search; this one seems exemplary: https://chrome.google.com/webstore/detail/clipper/offehabbjkpgdgkfgcmhabkepmoaednl?hl=en ) that incorporates their hypotheses about the benefits of socially integrating mental schemas. As they explain, Clipper is a web add-on (specifically, I think, a browser extension) that appears as an addition to the browser interface. Displayed as a text-input box, Clipper encourages users to share their mental schemas by asking for specific types of information about the content they encounter: “item,” “valence,” and “dimension” (p. 1000). Here, “item” refers to the object users are researching (the authors use the example of a Canon camera), “dimension” is a feature of the item (for example, picture quality), and “valence” is a sentiment that describes the user’s experience with or opinion of the dimension (like “good” or “bad”). So the phrase “the Canon T2i [item] was good [valence] in terms of picture quality [dimension]” would be a typical Clipper input.

As the authors point out, Clipper initially worked only on an individual user → framework basis. “Users foraged for information completely independently from others,” they note (p. 1000). Their addition to Clipper is “asynchronous social aggregation,” a feature that incorporates dimensions from other users to bolster the usefulness of the tool. With social aggregation, dimensions can be auto-suggested, and users gain access to a pool of knowledge about the “mental schemas” of many others as they have similar experiences online. The authors propose that more frequently input dimensions are generally more valuable in terms of sensemaking, and the augmentation to Clipper they describe would display and collate dimensions according to their popularity.
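
A toy sketch of what this might look like as data (my own illustration, not the authors’ implementation): each clip is an (item, dimension, valence) record tied to a user, and “asynchronous social aggregation” amounts to ranking dimensions by how many distinct users mention them.

    from collections import Counter
    from dataclasses import dataclass

    @dataclass
    class Clip:
        user: str
        item: str        # the object being researched, e.g. a camera model
        dimension: str   # the feature being judged
        valence: str     # the user's judgment of that feature

    # Invented clips in the spirit of the paper's Canon example
    clips = [
        Clip("u1", "Canon T2i", "picture quality", "good"),
        Clip("u2", "Canon T2i", "picture quality", "good"),
        Clip("u2", "Canon T2i", "battery life", "bad"),
        Clip("u3", "Canon T2i", "picture quality", "great"),
        Clip("u3", "Canon T2i", "price", "good"),
    ]

    # Count how many distinct users mention each dimension, then
    # suggest the most common dimensions first
    users_per_dimension = Counter(dim for _, dim in {(c.user, c.dimension) for c in clips})
    suggestions = [d for d, _ in users_per_dimension.most_common()]
    print(suggestions)   # e.g. ['picture quality', 'battery life', 'price']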

After this, the authors give contextual background for their perspectives on socially augmented online sensemaking. They review relevant contemporary research on information seeking, social data, and social and collaborative sensemaking (p. 1001) to support their hypotheses about the usefulness of socially augmenting Clipper. The article then moves to a discussion of the interface design and features, which include autocomplete, dimension hints, a workspace pane that hovers over web pages, and a review table where users can see a final view of the clips they have produced during their web searching activities.

The next part of the article fully describes the multiple hypotheses that underscore the rationale of socially augmenting Clipper. The hypotheses fall into three basic categories: the first is about how the social aggregation of dimensions should lead to overlaps; the second is about the social use and virality of overlapping dimensions; the third is about the objective usefulness and timeliness of this information. The authors then describe the conditions of their experiments with the tool (p. 1004), and provide an assessment of their hypotheses based on this experiment. Overall, their hypotheses proved to be accurate while leaving some room for further research: “our results indicated that the dimensions generated by users showed significant overlap, and that dimensions with more overlap across users were rated as more useful,” they tell us (p. 1008), a prelude to this self-judgment: “our results provide an important step towards a future of distributed sensemaking.” At the end, they acknowledge a number of potential drawbacks, most of which emanate from conditions of variability and subjectivity among users. 

(This is a good place for me to begin my reflection…)

Reflection

This article is very rote and straightforward. (As I mentioned, parts of it read like a technical whitepaper.) With that in mind, it’s not the kind of piece that lends itself to strong opinion. If I have one, it’s a mildly negative feeling based not so much on the authors’ intentions or the tool’s efficacy as on the presumptions at the core of their method. The notion of a “mental schema” in particular is an under-investigated concept. I’m not sure with what authority they make statements like “users build up mental models and rich knowledge representations that capture the structure of a domain in ways that serve their goals” (p. 999). Obviously they provide citations, but they’re now squarely in the field of psychology, where falsifiable knowledge is elusive and (I’d argue) it is unethical to present this information as fact, at least without further commentary. How a “rich knowledge representation” differs from that which simply goes by the name “knowledge” escapes me; honestly, I think it’s just a convenient conflation. That type of unusual language (and a lot of vaguely explained jargon) pervades their writing. I dislike it because 1) it lends an air of scientific dignity to some of their claims about the way humans make sense of information, whereas what’s really needed is further engagement with the psychological literature on which those claims are based, and 2) it’s bad writing: it sounds unnatural and confusing.

Moving away from a basic critique of writing style and language choice: I would have appreciated this more if the authors had gone into further detail about the types of information for which this is useful. I immediately took umbrage at the idea that social data necessarily means an improved user experience when making sense of online content. The ethos of “social” and “sharing” underscores the business model of the web, which encourages people to constantly hand their (highly profitable) data over to platforms that hold monopolies and function largely on network effects. Facebook and Google are as profitable as they are because they emphasize a social dynamic to user interaction, the feeling that the Internet is always a community and that not using these tools would mean being left out of the web experience. So I’m immediately suspicious of tools that simply reproduce this mindset rather than articulating and commenting on it (although I understand that social web use is now so naturalized that my take may be too erudite to be useful as a broad critique). Having said this, on a less penetrating level, I understand where this could be useful. For instance, I appreciate sites like Yelp and user product ratings when shopping online. It’s just that not everything users do online can be analogized to wanting to make a purchase.

Questions

  1. Based on the part on p. 1003 where they discuss motivational factors in “noticing and using social data”: why would users want to contribute to this project? Is it for the same reasons people work on websleuthing projects, Wikipedia, and free/open-source software? If not, what are the key differences between all these tools that rely on crowdsourced knowledge?
  2. For what types of items would this be most appropriate? The authors make frequent reference to a camera, but what about less concrete objects? Are there items that challenge hypotheses such as “dimensions that are shared across more people will be more useful,” and can we theorize why that might be?
  3. What if this leads to a winnowing effect where majority rule effectively pushes people away from domains that they may have been interested in?
  4. What is the relationship between socially augmented information foraging via the Clipper add-on and a) upvoting (à la Reddit and Metafilter, if anyone remembers what that is!) and b) algorithmic social media timeline prioritization (à la Twitter and Facebook)?
  5. Hypothesis 3.2 (p. 1006) states that “The social condition will generate more prototypical and more useful dimensions earlier than the non-social condition.” But what if this usefulness is partially a function of user suggestibility? As an appendage to this point, and as a more general meta-comment on this paper: the authors are clearly addressing psychological matters when they discuss “mental schemas.” What assumptions are they making about the way “mental schemas” are created and used, and do these embed a priori bias into the tool?
