Standing on the Schemas of Giants: Socially Augmented Information Foraging

Paper:

Kittur, A., Peters, A. M., Diriye, A., & Bove, M. (2014). Standing on the Schemas of Giants: Socially Augmented Information Foraging. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 999–1010). New York, NY, USA: ACM.

Leader: Emma

Summary

In this article, Aniket Kittur, Andrew M. Peters, Abdigani Diriye and Michael R. Bove describe new methods for usefully collating the “mental schemata” developed by Internet users as they work to make sense of information they gather online. They suggest that it may be useful to integrate these sense-making facilities to the extent that they can be meaningfully articulated and shared. Toward this end, they provide a number of related hypotheses that endorse a social dynamic in the production of frameworks that assist individuals in understanding web content. The authors start from the presumption that individuals “acquire” and “develop” frameworks (which they usually refer to as “mental schemas”) as they surf the ‘net. They ask, “how can schema acquisition for novices be augmented?” and to some degree, the rest of the article is a response to this question.

Much of this article is a technical whitepaper of sorts: the authors propose a supplement to the web tool Clipper (several variations of which I found through a Google search — this one seems exemplary: https://chrome.google.com/webstore/detail/clipper/offehabbjkpgdgkfgcmhabkepmoaednl?hl=en ) that incorporates their suspicions about the benefits of the social integration of mental schemas. As they explain, Clipper is a web add-on (specifically, I think it’s a browser add-on) that appears as an addition to the browser interface. Displayed as a text-input box, Clipper encourages users to share their mental schemas by asking for specific types of information about the content users encounter: “item,” “valence,” “dimension” (p. 1000). Here, “item” refers to the object users are researching — the authors use the example of a Canon camera — “dimension” is a feature of the item — the example is picture quality — and “valence” is a sentiment that describes the user’s experience with or opinion of the dimension (like “good” or “bad”). So the phrase “the Canon T2i [item] was good [valence] in terms of picture quality [dimension]” would be a typical Clipper input.

As the authors point out, Clipper initially worked only on an individual user → framework basis. “Users foraged for information completely independently from others,” they note (p. 1000). Their addition to Clipper is “asynchronous social aggregation,” a feature that incorporates dimensions from other users to bolster the usefulness of the tool. With social aggregation, dimensions can be auto-suggested, and users gain access to a pool of knowledge about the “mental schemas” of others who have had similar experiences online. The authors suggest that more frequently input dimensions are generally more valuable for sensemaking, and the augmentation to Clipper that they propose would display and collate dimensions according to their popularity.
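
Clipper’s actual implementation isn’t given in this post, so the following is only a minimal sketch, in Python, of what pooling clips and ranking dimensions by popularity for auto-suggestion could look like. The Clip fields mirror the item/dimension/valence structure described above; the data and function names are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Clip:
    user: str       # who made the clip
    item: str       # e.g., "Canon T2i"
    dimension: str  # e.g., "picture quality"
    valence: str    # e.g., "good"

def suggest_dimensions(clips, item, prefix="", top_n=5):
    """Rank dimensions other users have entered for this item by frequency,
    optionally filtered by a typed prefix (for autocomplete)."""
    counts = Counter(
        c.dimension for c in clips
        if c.item == item and c.dimension.lower().startswith(prefix.lower())
    )
    return [dim for dim, _ in counts.most_common(top_n)]

# Example: a pool of clips aggregated across several users
clips = [
    Clip("u1", "Canon T2i", "picture quality", "good"),
    Clip("u2", "Canon T2i", "picture quality", "good"),
    Clip("u3", "Canon T2i", "battery life", "bad"),
]
print(suggest_dimensions(clips, "Canon T2i"))                # ['picture quality', 'battery life']
print(suggest_dimensions(clips, "Canon T2i", prefix="bat"))  # ['battery life']
```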

After this, the authors give contextual background to their perspectives on socially augmented online sensemaking. They review relevant contemporary research on information seeking, social data, and social and collaborative sensemaking (p. 1001) to support their hypotheses about the usefulness of socially augmenting Clipper. Then, the article moves to a discussion of the interface design and features, which include autocomplete, dimension hints, a workspace pane that hovers over web pages, and a review table where users can see a final view of the clips they have produced during their web searching activities.

The next part of the article fully describes the multiple hypotheses that underpin the rationale for socially augmenting Clipper. The hypotheses fall into three basic categories: the first is about how the social aggregation of dimensions should lead to overlaps; the second is about the social use and virality of overlapping dimensions; the third is about the objective usefulness and timeliness of this information. The authors then describe the conditions of their experiments with the tool (p. 1004), and provide an assessment of their hypotheses based on this experiment. Overall, their hypotheses proved to be accurate while leaving some room for further research: “our results indicated that the dimensions generated by users showed significant overlap, and that dimensions with more overlap across users were rated as more useful,” they tell us (p. 1008), a prelude to this self-judgment: “our results provide an important step towards a future of distributed sensemaking.” At the end, they acknowledge a number of potential drawbacks, most of which emanate from conditions of variability and subjectivity among users.

(This is a good place for me to begin my reflection…)

Reflection

This article is very rote and straightforward. (As I mentioned, parts of it read like a technical whitepaper). With that in mind, it’s not the kind of piece that lends itself to strong opinion. If I have any, it’s a mildly negative feeling that is not so much based on the authors’ intentions or the tool’s efficacy as on the presumptions at the core of their method. The notion of a “mental schema” in particular is an under-investigated concept. I’m not sure with what authority they make statements like “users build up mental models and rich knowledge representations that capture the structure of a domain in ways that serve their goals” (p. 999). Obviously they provide citations, but they’re now squarely in the field of psychology, where falsifiable knowledge is elusive and (I’d argue) it is unethical to present this information as fact, at least without further commentary on this. How a “rich knowledge representation” is different from that which simply goes by the name “knowledge” escapes me — honestly, I think it’s just a convenient conflation. That type of unusual language (and a lot of vaguely-explained jargon) pervades their writing. I dislike it because 1) it offers an air of scientific dignity to some of their claims about the way humans make sense of information, whereas what’s really needed is further exploration of the psychological literature on which it’s based and 2) it’s bad writing. It sounds unnatural and confusing.

Moving away from a basic critique of writing style and language choice — I would have appreciated this more if the authors had gone into further detail about the types of information for which this is useful. I immediately took umbrage at the idea that social data necessarily means improved user experience when making sense of online content. The ethos of “social” and “sharing” underscores the business model of the web, which encourages people to constantly give their (highly profitable) data over to platforms that have a monopoly, and which function largely on network effects. Facebook and Google are as profitable as they are because they emphasize a social dynamic to user interaction, the feeling that the internet is always a community, and to not use these tools would mean being left out of the web experience. So I’m immediately suspicious of tools that simply reproduce this mindset rather than articulating and commenting on it (although I understand that social web use is now so naturalized that my take on it may be too erudite to be useful in a broad critique). Having said this, on a less penetrating level, I understand where this could be useful. For instance, I appreciate sites like Yelp and user product ratings when shopping online. It’s just that not everything that users do online can be analogized with wanting to make a purchase.

Questions

  1. Based on the part on p. 1003 where they discuss motivational factors in “noticing and using social data”: why would users want to contribute to this project? Is it for the same reasons people work on websleuthing projects, Wikipedia, and free/open source software? If not, what are the key differences between all these tools that rely on crowdsourced knowledge?
  2. For what types of items would this be most appropriate? The authors make frequent reference to a camera, but what about less concrete objects? Are there items that challenge hypotheses such as “dimensions that are shared across more people will be more useful,” and can we theorize why that might be?
  3. What if this leads to a winnowing effect where majority rule effectively pushes people away from domains that they may have been interested in?
  4. What is the relationship between socially augmented information foraging via the Clipper add-on and a) upvoting (à la Reddit and Metafilter, if anyone remembers what that is!) and b) algorithmic social media timeline prioritization (à la Twitter and Facebook)?
  5. Hypothesis 3.2 (p. 1006) states that “The social condition will generate more prototypical and more useful dimensions earlier than the non-social condition.” But what if this usefulness is partially a function of user suggestibility? As an appendage to this point, and a more general meta-comment on this paper — the authors are clearly addressing psychological matters when they discuss “mental schema.” What assumptions are they making about the way “mental schemas” are created and used, and does this embed a priori bias into the tool?

Read More

Mapillary Summary

Mapillary is a startup founded in Sweden in 2013 by Jan Erik Solem, the founder of the facial recognition company Polar Rose, which was eventually acquired by Apple (Lunden, 2016). The vision behind Mapillary was to create a more open and fluid version of Google’s Street View. To remedy the limitations of a single team with a camera rigged to a vehicle, Mapillary created an open platform that uses crowdsourcing to build a better, more accurate, and more personal map. In addition to creating a better map, the company analyzes the photos to collect geospatial data. Mapillary uses a technology called Structure from Motion to reconstruct places in 3D. By running semantic segmentation on the images, the company seeks to understand what is in each image, such as buildings, pedestrians, and cars, and to build that into AI systems. As of May 2017, their database held over 130 million crowdsourced images, which are also being used to train automotive AI systems (Lunden, 2017).
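
Mapillary’s own computer-vision pipeline is proprietary, but the semantic-segmentation idea mentioned above (labeling each pixel as building, car, pedestrian, and so on) can be illustrated with an off-the-shelf pretrained model. This is only a sketch under assumptions: it uses torchvision’s generic DeepLabV3 model (trained on a COCO/VOC-style label set, not Mapillary’s street-level taxonomy), and “street_photo.jpg” is a placeholder filename.

```python
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

# Generic pretrained segmentation model (VOC-style classes such as "car",
# "bus", "person"); not Mapillary's own model or label set.
model = deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("street_photo.jpg").convert("RGB")  # placeholder input image
batch = preprocess(img).unsqueeze(0)

with torch.no_grad():
    out = model(batch)["out"][0]   # [num_classes, H, W] per-pixel logits
labels = out.argmax(0)             # per-pixel class index
print(labels.unique())             # which classes appear in the image
```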


Mapillary is determined not to use advertising, but instead focuses on a B2B platform providing information for governments, businesses, and researchers. Though researchers may use the data at no charge, commercial entities can purchase Mapillary’s services for $200 to $1,000 a month, depending on the amount of data used. The site lists several institutions that have successfully used Mapillary. The World Bank Transportation and ICT group utilized Mapillary to capture images for a rural accessibility project, evaluating the environment and road conditions remotely. Westchester County in New York state has used the service to capture its trails and create interactive hikes within its park system.

To date, Mapillary has mapped over 3 million kilometers with over 170 million images across all seven continents.


To explore Mapillary:


  1. Go to mapillary.com
  2. Create an account by clicking on the “Create Account” button on the lower left side of the page.
  3. Choose to create a Mapillary login or use either Google, Facebook, or OpenStreetMap login.
  4. Once signed in you may explore maps or create maps.
    1. To explore maps:
      1. Zoom in on an area on the map; a green line indicates that the area has been mapped.
      2. Go to the magnifying glass on the upper left side of the screen and enter a location.
      3. When you have located your area, place your cursor on the line; a photo will pop up in the lower left of the screen. Click the forward or back arrows to move through the images, or press the play arrow.


To contribute to Mapillary:

You may upload an image to the webpage:

  1. Click on the menu arrow by your login name on the upper right screen
  2. Click on Uploads
  3. Click on “Upload Images”
  4. Upload your image according to the options and instructions
  5. Click “Review”
  6. You may click on the dot to see the image.
  7. Zoom into the location and place the dot on the map

Or use your smartphone:

  1. Download the Mapillary app on your smartphone from either Google Play or the App Store.
  2. Sign-in or create an account.
  3. Tap the camera icon
  4. Position the camera so that it is level with the horizon and nothing is obstructing the view
  5. Choose your capture option: the automatic capture option will capture images every 5 meters as you move, OR use the manual option to capture panoramas, objects, and intersections
  6. Tap the Red record button and move either by walking, driving, biking, or whichever means of movement and transportation you prefer.
  7. When done, tap on the exit arrow
  8. Tap the upload icon (the cloud icon)
  9. Upload your images; once uploaded, they will be deleted from your device
  10. Images are then processed by Mapillary
  11. You will receive notifications when images have been uploaded or edits accepted, as well as for comments and mentions
  12. You’re done!!

 

https://techcrunch.com/2016/03/03/mapillary-raises-8m-to-take-on-googles-street-view-with-crowsourced-photos/

https://techcrunch.com/2017/05/03/mapillary-open-sources-25k-street-level-images-to-train-automotive-ai-systems/


Read More

Digital Vigilantism as Weaponisation of Visibility

Paper:

Trottier, D. (2017). Digital vigilantism as weaponisation of visibility. Philosophy & Technology, 30(1), 55-72.


Discussion leader: Lee Lisle

Summary:

This paper explores a new era in vigilantism, where “criminals” are shamed and harassed through digital platforms. These offenders may have parked poorly or planted a bomb, but there is no real verification process. They are harassed through the process known as “doxing,” whereby their personal information is shared publicly. The author terms this “weaponised visibility,” and it can lead other users on the Internet to harass or threaten the accused in person.

The author defines digital vigilantism and compares it to the more traditional vigilantism of the era before the Internet lowered participation thresholds. In particular, he uses Les Johnston’s six elements of vigilantism and shows how digital vigilantism embodies each element. These elements, and how they are enacted, appear in the paper’s Table 1.

With the link to more traditional vigilantism established, the author then argues that the lowered thresholds of the Internet amplify the response to the offender’s acts. Once an idea or movement is released on the Internet, the person who started it is no longer in full control. This lack of a singular leader means the response to the offense is uncontrolled, which further means that the digital campaign can vastly exceed boundaries and produce a disproportionate response to the offense. As a corollary, the author points out that the people who start these campaigns may not be aware of how far the response will go. In the early stages of the Internet, it was considered a separate place from the real world; as time has gone on, the barriers between the digital and real worlds have decreased in scope and context. The author draws parallels between cyber-bullying and digital vigilantism, but makes the distinction that digital vigilantism occurs when citizens are collectively offended by other citizens.

The author then points out the differences between state actors and these digital vigilantes. He states that lowered confidence in state actors such as police is responsible for these coordinated efforts online, which then, in turn, results in less cooperation with state actors. Cyber-bullying and revenge porn are used as examples where the vigilantes are taking action because law-enforcement agencies aren’t.

Next, the author compares how state actors and these vigilantes perform surveillance. Digital tools have made surveillance significantly easier, and the public has been shown various results of this, such as the Snowden revelations about government actions. Furthermore, digital vigilantism can increase surveillance of private citizens when state actors look at the citizens and see that there is a DV campaign against them. Also, users can over-share their daily lives on social media, such as detailing their exercise routines or other forms of life-logging; the author makes the point that this visibility can be used against them in a DV campaign, since it can lead to more doxing. The author also writes about the concept of “sousveillance,” where a less powerful actor or citizen monitors more powerful actors, such as the state. This can be seen in recordings of police responses. Lastly, the author points out that pop culture is likely encouraging occurrences of DV. Reality-TV shows often encourage contestants to try to catch each other engaging in “dishonest or immoral behavior.” This form of entertainment normalizes the concept of surveillance and leads to further efforts in digital vigilantism.


Reflections:

This article makes some interesting points about how digital vigilantism is an extension of traditional vigilante efforts. Since the Internet lowers the bar for the creation of what is essentially a mob armed with either facts or pseudo-facts, retaliation happens more easily and is less controlled. However, as this kind of reaction happens more and more frequently, the creators of these mobs should understand their actions more. The statement that DV participants “may not be aware of the actual impact of their actions” seems like less of an excuse as more of these examples come out.

Digital vigilantism doesn’t always create poor outcomes. In some of the paper’s examples, the people targeted by the vigilantes were performing actions that arguably should be illegal. There are now cases where cyber-bullying is a criminal act. Revenge porn is now illegal in 26 states. The digital vigilantism against these actions may have helped create the laws to make them illegal.

Questions:

  • This article, written in 2015, makes the point that white nationalism and the KKK are linked to digital vigilantism. Considering recent events, do you agree that DV has caused (or helped cause) the resurgence of these groups?
  • How do you think reality-TV shows influence the public? Do you agree with the author’s statement that they encourage digital vigilantism?
  • In this class, we have gone over several cases where DV’s response has been extremely disproportionate. Are there examples where DV has helped society?
  • The authors point out that law-enforcement can easily see DV campaigns against individuals. Should state actors ignore DV campaigns?  Should they try to contain them?
  • The authors point out the concept of “sousveillance,” where less powerful actors monitor more powerful actors. This can explicitly be seen in the movements to monitor police officers and their interactions with people. What do you think about this kind of DV?

Read More

Digilantism: An analysis of crowdsourcing and the Boston marathon bombings

Paper:

Nhan, J., Huey, L., & Broll, R. (2017). Digilantism: An analysis of crowdsourcing and the Boston marathon bombings. The British Journal of Criminology, 57(2), 341-361.

Discussion leader: Leanna

Summary:

The article explores digilantism, or crowdsourced web-sleuthing, in the wake of the Boston Marathon bombing. The authors focus on police-citizen collaboration, highlighting various crowdsourcing efforts by the police and some successes and “failures” of online vigilantism.

The authors theoretically frame their paper around nodal governance, a theoretical marriage between security studies and social network analysis. In this framing, the authors combine various works. Following the logic of network analysis, the theory treats organizations or security actors as nodes in a decentralized structure. The nodes (actors) may have associations and work with each other (edges) within the network, such as police corresponding with private security forces or a Reddit community sharing information with the police. Each node can hold varying degrees (weights) of capital (economic, political, social, cultural, or symbolic) that can be shared between nodes.
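
The paper describes nodal governance conceptually rather than as a dataset, but the node/edge/capital structure is easy to picture as a small graph. Here is a toy sketch using the networkx library; the actors, capital values, and edge weights are invented purely for illustration.

```python
import networkx as nx

# Toy nodal-governance network: security actors as nodes carrying "capital"
# attributes, with weighted edges for the strength of their working ties.
G = nx.Graph()
G.add_node("Boston PD", capital={"political": 0.9, "symbolic": 0.8})
G.add_node("Reddit web-sleuths", capital={"social": 0.7, "cultural": 0.6})
G.add_node("Private security", capital={"economic": 0.8})

G.add_edge("Boston PD", "Private security", weight=0.6)    # formal cooperation
G.add_edge("Reddit web-sleuths", "Boston PD", weight=0.2)  # weak, one-way tips

# Degree centrality as a crude proxy for how connected each actor is
print(nx.degree_centrality(G))
```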

The authors use threaded discussion analysis as well as thematic analysis to examine various Reddit threads about the Boston Marathon bombing, arriving at 20 thematic categories. For this paper, the authors are mainly interested in their theme “investigation-related information” (pg. 346). In the results, the authors note that most comments were general in nature. Sub-themes within the investigation category included 1) public security assets, 2) civilian investigations online, 3) mishandling of clues, and 4) police work online.

The first subcategory—public security assets—discusses the varied professional backgrounds of Reddit users and their ability to contribute based on this experience and knowledge (e.g., military forensics). In this section, the authors raise the point about the occurrence of parallel investigations and a general lack of communication between the police and web-sleuths (mainly on the part of the police). They speculate this disconnection could stem from police subculture or from legal concerns about incorporating web-sleuths into investigations.

In the next sub-theme—civilian investigations—the authors take note of the unofficial role that Reddit users had in the investigation of the Boston Marathon bombing. This included identifying photographs of suspects and blast areas, as well as conducting background checks on suspects. Nhan and colleagues refer to this as “virtual crime scene investigation” (pg. 350). In this section, the authors expand upon the silo effect of parallel investigations. They note that the relationship between the police and web-sleuths was unidirectional, with users encouraging each other to report information to the police.

In the third sub-theme—mishandling of clues—the authors focus on two consequences of web-sleuthing: 1) being suspicious of innocent acts; and 2) misidentifying potential suspects. In particular, the authors highlight the fixation of users on people carrying backpacks and the misidentification of Sunil Tripathi as a potential suspect in the bombing.

In the final sub-theme—police work online—the authors highlight police efforts to harness web-sleuths either by providing correct information or by asking people to provide police with videos from the event. The authors noted that this integration of police into the Reddit community was a way to regain control of the situation and the information being spread.

The authors conclude with various policy recommendations, such as assigning police officers to be moderators on sites such as Reddit or 4chan. In addition, they acknowledge the geographical and potential cultural differences between their two examples of police crowdsourcing (Boston vs. Vancouver). Lastly, the authors again note that the police have not used the expertise of the crowd.

Reflection:

When reading the paper, numerous things came to my mind. Below is a list of some of them:

  1. In the background section, the authors mention an article by Ericson and Haggerty (1997) that classifies the four eras of policing: political, reform, community and information. Other authors have defined this fourth era as the national security era (Willard, 2006) or militarized era (Hawdon, 2016). Hawdon (2016) argues in an ASC conference presentation, for example, that a pattern is occurring among the eras (see the first five rows of the table below). In particular, the organizational structure, approach to citizenry, functional purpose and tactical approach of law enforcement flip-flops between each era. Thinking forward, I foresee a coming era of crowdsourced policing as a continuation of the pattern Hawdon identifies (see the last row). This style would be decentralized (dispersed among the various actors), clearly integrated into the community, focused on more than law enforcement, and would intervene informally in community members’ lives (via open communication online), fitting neatly into the cyclical pattern we see in policing (Hawdon, 2016).


Era | Organizational structure | Approach to Citizenry | Functional Purpose | Tactical Approach
Political (1860-1940) | Decentralized | Integrated into community | Broad | Service
Reform (1940-1980) | Centralized | Distant from community | Narrow | Legalistic
Community (1980-2000) | Decentralized | Integrated into community | Broad | Service
Militarized (1990-today) | Centralized | Distant from community | Narrow | Legalistic
Crowdsourced (??-??) | Decentralized | Integrated into community | Broad | Service

Note: functional purpose refers to “a narrow function (i.e., law enforcement) or serving the community by fulfilling numerous and broad functions” (Hawdon, 2016, pg. 5); tactical approach: legalistic = “stresses the law-enforcement function of policing” (pg. 5), service = “intervenes frequently in the lives of residents, but officers do so informally” (pg. 5).

  2. Nhan and colleagues highlight various “police-citizen collaborations” (pg. 344) with regard to social media, such as crowdsourcing facial identification after the 2011 Stanley Cup riots and disseminating information via Twitter. But, in many ways, this police engagement with social media appears to lack innovation. The former is like posting wanted photos on telephone poles, the latter like disseminating info via a popular newspaper. Yes, the media has changed and therefore the scale of the impact has shifted, but the traditional structure hasn’t changed. The other “police-citizen collaboration” (pg. 344) that was mentioned was collecting information. This is not collaboration. In the example of Facebook, this is simply using the largest repository of available biometric data that people are willing to give away for free. It’s becoming the new and improved governmental surveillance dataset, but there is nothing formally collaborative about citizen use of Facebook (even if Facebook might collaborate with law enforcement at times).
  3. The paper is missing crucial details needed to fully understand the authors’ numerical figures. For example, the authors noted that only a small number of individuals (n=16) appear to be experts. It would have been great to put this figure into context: how many distinct users posted within the set of posts that were analyzed? Without a larger sense of the total n that the authors are dealing with, assessments of the external validity (generalizability) of the findings become difficult.
  4. The authors frame their analysis around nodal governance and the large-scale nodal security network. The guiding theory itself needs to be expanded on. The authors allude to this need but do not make the full connection. In the paper, the police and Reddit are simply treated as the nodes. Instead, the network needs to acknowledge the organizations (e.g., police or Reddit) and also the individual users. This, if my memory serves me correctly, is called a multilevel network. In this model, users (nodes in one group) are connected to organizations (nodes in another group), and relationships (edges) exist both between and within groups. The authors allude to this need when mentioning the wide breadth of knowledge and expertise that posters bring when web-sleuthing on Reddit, but stop there. Reddit users can be connected to the military (as mentioned) and have access to the capital that that institution brings. These individual users are then connected to two organizational structures within the security network.
  5. Lastly, it was not surprising that the authors noted a “mislabelling of innocent actions as suspicious activities” (pg. 353); however, it was surprising that this fell under the label of “the mishandling of clues” (pg. 353). In addition, the mislabelling of activities is not unique to web sleuths. I was expecting a conversation about mislabelling and its connection to a fearful/risk society. This mislabelling is all around us. It happens in schools, for example, when nursery staff think a 4-year-old boy’s drawing of a cucumber is a “cooker bomb,” when the police think tourists taking photos are terrorists, or when police arrest a man thinking his kitty litter is meth.

Questions

  1. Is crowdsourcing the next policing era?
  2. What drives police hesitation for police-citizen collaboration?
  3. Is police reluctance to engage in crowdsourcing harming future innovative methods of crime-fighting or police-community engagement?
  4. What are some ways police can better integrate into the community for investigation?
  5. Does the nodal governance theory fit with the crowdsourcing analysis?

Read More

Galaxy Zoo – A Citizen Science Application

Technology:

Citizen Science Application “Galaxy Zoo”

Demo leader: Lee Lisle

Summary:

Galaxy Zoo is a citizen science application in the Zooniverse collection of projects. Citizen science is a special category of applications that uses the power of crowds to solve complex science problems that cannot be easily solved by algorithms or computers. There are many other citizen science apps that you can try out on Zooniverse if you want to learn more about this field.

Galaxy Zoo asks its users to classify pictures of galaxies from the Sloan Digital Sky Survey, the Cerro Tololo Inter-American Observatory, and the VLT Survey Telescope. Starting in 2007, this project has been so successful that it actually spurred the creation of the entire Zooniverse site. In fact, the Galaxy Zoo team has written 55 different papers from the data gathered through the project.

As an example of what they have discovered using crowd-generated data, the team created a new classification of galaxy based on the observations of the citizen scientists. After the volunteers found a pattern of pea-like entities in many galaxy pictures, the team looked closer at those formations. They found that the formations were essentially young “star factory” galaxies that create new stars much more quickly than older, more established galaxies.

Also, it’s interesting to note that the project started because a professor assigned a grad student to classify 1 million pictures of galaxies. After a grueling 50,000 classifications by one person, the student and professor came up with a solution: leverage the crowd to get the data set organized.

You can also create your own project on Zooniverse to take advantage of its base of over 1 million “zooite” users. This is best suited to massive datasets that need to be worked on manually. The site also taps both intrinsic and extrinsic motivations, through the benefit to science and by giving each user a “score” for how many classifications they have performed.
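
This post doesn’t describe Galaxy Zoo’s actual aggregation pipeline (which uses a more sophisticated weighted consensus), but the basic idea of combining many volunteers’ classifications while tracking per-user scores can be sketched as a simple majority vote. All data below are invented for illustration.

```python
from collections import Counter, defaultdict

# (galaxy_id, volunteer, label) triples, as a crowd might produce them
classifications = [
    ("galaxy_001", "alice", "spiral"),
    ("galaxy_001", "bob",   "spiral"),
    ("galaxy_001", "carol", "elliptical"),
    ("galaxy_002", "alice", "merger"),
]

votes = defaultdict(Counter)   # per-galaxy label counts
scores = Counter()             # per-volunteer classification count ("score")
for galaxy, user, label in classifications:
    votes[galaxy][label] += 1
    scores[user] += 1

consensus = {g: c.most_common(1)[0][0] for g, c in votes.items()}
print(consensus)  # {'galaxy_001': 'spiral', 'galaxy_002': 'merger'}
print(scores)     # Counter({'alice': 2, 'bob': 1, 'carol': 1})
```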

Demo:

  1. Go to the Zooniverse website.
  2. Register a new account.
  3. Click on “Projects” on the top menu bar to see all of the citizen science apps available. Note that you can also search by category, which is useful if you want to work on a particular field.
  4. To work specifically on Galaxy Zoo, start typing “galaxy zoo” in the name input box on the right side of the screen (under the categories scroll bar).
  5. Click on “Galaxy Zoo” in the auto-complete drop down.
  6. Click on “Begin Classifying.”
  7. Perform classifications! This involves answering the question about the galaxy in the box next to the picture. It may also be helpful at this step to click on “Examples” to get more information about these galaxies.

Read More

Motivation Factors in Crowdsourced Journalism: Social Impact, Social Change, and Peer Learning

Paper:

Aitamurto, T. (2015). Motivation Factors in Crowdsourced Journalism: Social Impact, Social Change, and Peer Learning. International Journal of Communication, 9(0), 21. http://ijoc.org/index.php/ijoc/article/view/3481/1502

Discussion Leader:  Rifat Sabbir Mansur


Summary:

Crowdsourced journalism has recently become a more common knowledge-search method among professional journalists, in which participants contribute to the journalistic process by sharing their individual knowledge. Crowdsourcing can be used for both participatory journalism, where the crowd contributes raw material to a process run by a journalist, and citizen journalism, where ordinary people adopt the role of journalist. Unlike outsourcing, where the task relies on a few known experts or sources, crowdsourcing opens up the task for anybody to participate, voluntarily or for monetary gain. This research paper tries to explain the crowd’s motivations for contributing to crowdsourced journalism, based on social psychological theories, by asking the following two questions:

  • Why do people contribute to crowdsourced journalism?
  • What manifestations of knowledge do the motivation factors depict?

The author examines the motivation factors in crowdsourcing, commons-based peer production, and citizen journalism using self-determination theory from social psychology. According to the theory, human motivations are either intrinsic (done for enjoyment or out of community-based obligation) or extrinsic (done for direct rewards such as money). The author reviews various papers in which she found both forms of motivation for crowdsourcing, commons-based peer production, and citizen journalism. In the latter part of the paper, the author introduces four journalistic processes that used crowdsourcing, conducts in-depth interviews with many of the crowdsourcing participants, and presents her findings based on those interviews.

The cases the author uses are as follows:

  1. Inaccuracies in physics schoolbooks
  2. Quality problems in Finnish products and services
  3. Gender inequalities in Math and Science Education
  4. Home Loan Interest Map

The first three stories were published in various magazines, and the last was conducted at Sweden’s leading daily newspaper. The first three stories were further categorized as Case A, since the same journalists worked on all three. The fourth story used a crowdmap in which the crowd submitted information about their mortgages and interest rates online. The information was then geographically located and visualized. It became very popular, breaking online traffic records for the newspaper.

The author conducted semi-structured interviews with 22 participants in Case A and 5 online participants in Case B. The interview data were then analyzed using Strauss and Corbin’s analytical coding system. From these analyzed data, the author presents her findings.

The author posits that based on her findings the main motivation factors for participating in crowdsourced journalism are as follows:

  • Possibility of having an impact
  • Ensuring accuracy and adding diversity for a balanced view
  • Decreasing knowledge and power asymmetries
  • Peer learning
  • Deliberation

The author’s findings show that the motivations above are mostly intrinsic. Only peer learning, where participants expressed a desire to learn from others’ knowledge and to practice their skills, has an extrinsic element alongside its intrinsic one (a better understanding of others). None of the participants expected any financial compensation for their participation. They rather found themselves rewarded when their voices were heard. The participants also believed monetary compensation could lead to false information. Participation in crowdsourced journalism is thus mainly voluntary and altruistically motivated. The intrinsic factors in these motivations mostly derived from the participants’ ideology, values, and social and self-enhancement drives.

Crowdsourced journalism differs to some extent from commons-based peer production, citizen journalism, and other crowdsourcing contexts, as it offers neither career advancement nor reputation enhancement. Rather, participants perceive their contribution as being part of social change.

The author brings up theories of digital labor abuse and refutes them, showing results that suggest the argument does not fit the empirical reality of online participation.

The author finally discusses several limitations of the research and the scope for future research using larger samples, more cases, and empirical contexts in several countries, including both active and passive participants in the crowd.


Reflection:

The study offers a thorough social psychological analysis of the motivations of participants in crowdsourcing. Unlike prior research, the paper concerns itself with motivation factors in voluntary crowdsourcing, i.e., crowdsourcing without pecuniary rewards. The author also tried to address the motivations in crowdsourcing, commons-based peer production, and citizen journalism separately. This allowed her to dig deeper into the intrinsic and extrinsic drives behind the motivations. The author further classifies intrinsic motivations into two factors: enjoyment-based and obligation/community-based.

The study revealed some very interesting points. As the author mentions, having an impact drives participation; this is one of the main motivations of the crowd participants. One specific comment:

“I hope my comment will end up in the story, because we have to change the conditions. Maybe I should go back to the site, to make sure my voice is heard, and comment again?”

I find this comment very interesting because it shows the nature of the intrinsic motivation and the unwritten moral obligation the participants feel towards their topic of interest. Here, the participant’s motivation is clearly to bring about social change.

Another interesting factor, in my opinion, is that volunteering involves sharing one’s fortune (e.g., products, knowledge, and skills) to protect oneself from feeling guilty about being fortunate. The author mentions this as the protective function that drives volunteer work.

In my opinion, one of the clear benefits of crowd participation is developing a more accurate picture of the topic and offering multiple perspectives. Filling gaps in the journalists’ understanding of a particular topic helps build a more accurate and complete picture. It also provides a check against yellow journalism. Crowd participation further allows participants to contribute multiple perspectives, creating diverse views on controversial topics.

What I found interesting is that the participants did not expect financial compensation for their participation in crowdsourcing. On the contrary, they believed that if this effort were monetarily compensated, it could actually be dangerous and skew participation. However, pecuniary rewards might draw a different crowd, one more aware of its responsibilities; this might actually encourage people to be more attentive participants and more careful about their comments and remarks.

Another interesting notion in the paper is that the participants in this study did not expect reciprocity in the form of knowledge exchange. This characteristic, in my opinion, could give rise to people firmly holding onto false beliefs. Because participants want to be part of a social change, they can be disheartened if their volunteer efforts are not appropriately acknowledged in the journalistic process.

I liked the author’s endeavor to address the differences and similarities between motivations in crowdsourced journalism and in other forms of crowd participation. In crowdsourced journalism, the crowd contributes only small pieces of raw material for a journalist to consider in the story process. Cumulatively, this can produce a bigger picture of an issue. In this way, participants in crowdsourcing can be a contributing part of a social change through their respective atomic inputs.

The limitations of the study, however, have great significance. The author mentions that it is possible that only those participants who had a positive experience with crowdsourcing accepted the interview request for the study. This might have caused the motivations in the study to appear more intrinsic and altruistic in nature. With a different and more widespread sample, the study might reveal other interesting factors of human psychology.


Questions:

  1. What do you think about the differences between voluntary and reward-based crowdsourcing in terms of social impact?
  2. What do you think about the effects of citizen journalism on professional media journalism?
  3. Given the limitations, do you think the case studies had adequate data to back up their findings?
  4. What do you think the future holds for the moderation of crowdsourcing?
  5. The study suggests a wide variety of crowd contribution, like crowd-mapping, citizen journalism, commons-based peer production, etc. How do you think we can develop systems to better handle the crowd’s contribution?

Read More

Emerging Journalistic Verification Practices Concerning Social Media

Paper:
Brandtzaeg, P. B., Lüders, M., Spangenberg, J., Rath-Wiggins, L., & Følstad, A. (2016). Emerging Journalistic Verification Practices Concerning Social Media. Journalism Practice, 10(3), 323–342.
https://doi.org/10.1080/17512786.2015.1020331

Discussion Leader: Md Momen Bhuiyan

Summary:
Social media content has recently been used widely as a primary source of news. In the United States, 49 percent of people get breaking news from social media. One study found that 96 percent of UK journalists use social media every day. This paper tries to characterize journalistic values, needs and practices concerning the verification of social media content and sources. The major contribution of this paper is a requirements analysis, from a user perspective, for the verification of social media content.

The authors use a qualitative approach to answer several questions, such as how journalists identify contributors, how they verify content, and what the obstacles to verification are. By interviewing 24 journalists working with social media in major news organizations in Europe, they divide verification practices into five categories. Firstly, if content is published by a trusted source (a popular news organization, police or fire department, politician, celebrity, etc.), it is usually considered reliable. Secondly, journalists use social media to get in touch with eyewitnesses. The reliability of an eyewitness is verified by checking whether a trusted organization follows them and by their previous record. Journalists also have to check whether there are conflicting stories; even so, they prefer traditional methods like direct contact with people. Furthermore, for multimodal content like text, pictures, audio, and video, they use different tools such as Google, NameChecker, Google Reverse Image Search, TinEye, Google Maps, and Street View, though they have large gaps in knowledge about these tools. Finally, if they cannot verify content, they use workarounds like disclaimers.
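
The reverse-image-search tools named above are closed services, but the underlying idea of checking whether a circulating image matches a known original can be illustrated with perceptual hashing. A minimal sketch, assuming the third-party ImageHash package and two placeholder filenames (neither comes from the paper):

```python
from PIL import Image
import imagehash  # third-party package: pip install ImageHash

def likely_same_photo(path_a, path_b, threshold=8):
    """Compare two images by perceptual hash; a small Hamming distance suggests
    one is a crop/resize/recompression of the other (a clue, not proof of provenance)."""
    h_a = imagehash.phash(Image.open(path_a))
    h_b = imagehash.phash(Image.open(path_b))
    distance = h_a - h_b
    return distance <= threshold, distance

# Hypothetical files: a photo circulating on social media vs. an archived original
same, distance = likely_same_photo("viral_upload.jpg", "archive_original.jpg")
print(same, distance)
```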

By looking into the user-group characteristics of the journalists and their context, the authors find several potential user requirements for verification tools. Journalists need efficient, easy-to-use tools to verify content. Such a tool has to organize huge amounts of data and make sense of it, and it needs to integrate into their current workflow. The tool needs to offer high-speed verification and publication, and accessibility from different types of devices. Another requirement is that journalists need to understand how the verification takes place. Furthermore, it needs to support verification of multimodal content.

Finally, the authors discuss limitations of both the study sample and the findings. In spite of these limitations, the study provides a valuable basis for the requirements of a verification process for social media content.

Reflection:
Although the study makes a good contribution regarding requirements for verification tools in news organizations, it has several shortcomings. The study sample was taken from several countries and several organizations, but it does not include the largest international organizations, which raises the question: how do organizations like the BBC, CNN, AP, and Reuters verify social media content? How do they define trusted sources? How do they follow private citizens?

The study also doesn’t make much comparison between younger and older journalists and how their verification processes differ. It was noted that young and female journalists have better experience with technologies, but the study doesn’t examine whether their respective verification processes differ. All in all, further research is necessary to address these questions.

Questions:
1. Can verification tools help gain public trust in news media?
2. What are the limitations of verification tools for multimodal content?
3. Can AI automate the verification process?
4. Can journalism be replaced by AI?

Read More

The Verification Handbook

Paper:

Chapters 1, 3, 6, and 7 of:
Silverman, C. (Ed.). (2014). The Verification Handbook: A Definitive Guide to Verifying Digital Content for Emergency Coverage. Retrieved from The Verification Handbook
Discussion Leader: Lawrence Warren
 
Summary:
“This book is a guide to help everyone gain the skills and knowledge necessary to work together during critical events to separate news from noise, and ultimately to improve the quality of information available in our society, when it matters most.”

Chapter 1: When Emergency News Breaks

This section of the book deals with the perpetuation of rumors whenever a disaster strikes. According to the 8 1/2 Laws of Rumor Spread, it is easy to get a good rumor going when we are already anxious about a situation. This problem existed long before the current world of high-speed networks and social media, and it has become a serious thorn in the side of those who verify information. People at times intentionally spread false rumors to be a part of the hot topic and to draw attention to a social media account or cause, which adds yet another layer of problems for information verification. This epidemic is intensified during actual times of crisis, when lives hang in the balance of having the correct information. One would think the easiest way to verify data is for professionals to be the ones to disseminate information, but the problem is that many times an eyewitness will observe a situation long before an actual journalist, and at times a journalist may not have access to the things which are seen firsthand. People rely on official sources to provide accurate information in a timely fashion, while simultaneously those agencies rely on ordinary people to help source information as well as bring it into context.

Chapter 3: Verifying User Generated Content

The art of gathering news has been transformed by two significant developments: mobile technology and the ever-developing social network. In 2013 it was reported that over half of phones sold were smartphones, which meant many ordinary people had the capability of recording incidents and taking them to any number of media outlets to be shared with the world. People normally send things to social media because many do not understand the process of handing something off to a news station, and they feel more comfortable within their own network of chosen friends. It is for this same feeling of security that people normally turn to social media during breaking news, which is where some people are fed fake news reports by malicious users intentionally creating fake pages and sites to create a buzz around false facts. Then there are people who find content and claim it as their own, which makes it harder to find the original sources at times of inspection. Verification is a skill which all professionals must have in order to help prevent fake news from circulating, and it involves four items to check and confirm (a quick metadata check for two of these is sketched after the list):

  1. Provenance: Is this the original piece of content?
  2. Source: Who uploaded the content?
  3. Date: When was the content created?
  4. Location: Where was the content created?
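
As a rough illustration of checking items 3 and 4 (date and location), a photo’s EXIF metadata can be inspected programmatically. This is a sketch under assumptions (Pillow installed, a placeholder filename), and EXIF data can itself be stripped or edited, so it is only one verification signal among many.

```python
from PIL import Image, ExifTags

def basic_exif_check(path):
    """Pull rough 'date' and 'location' fields from a photo's EXIF metadata.
    Missing or inconsistent fields are a red flag, but EXIF is editable,
    so this is a clue rather than proof."""
    exif = Image.open(path).getexif()
    named = {ExifTags.TAGS.get(tag, tag): value for tag, value in exif.items()}
    gps = exif.get_ifd(0x8825)  # GPS IFD, if present
    gps_named = {ExifTags.GPSTAGS.get(tag, tag): value for tag, value in gps.items()}
    return named.get("DateTime"), gps_named or None

# Hypothetical user-submitted photo
date_taken, gps_info = basic_exif_check("eyewitness_photo.jpg")
print(date_taken, gps_info)
```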

Chapter 6: Putting the Human Crowd to Work

Crowdsourcing is by no means a new concept and has always been a part of information gathering, but with the rise of social media dynamos, we can now do this on a much larger scale than before. This section of the book lists a few of the best practices for crowdsourced verification.

Chapter 7: Adding the Computer Crowd to the Human Crowd

This section of the book is about the possibility of automating the verification of information. Advanced computing (human computing and machine computing) is on the rise as machine learning becomes more capable. Human computing has not yet been used to verify social media information, but given the direction of technology, that is not far away. Machine computing could be used to create verification plug-ins that would help assess whether an entry is likely to be credible.
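
The book doesn’t specify how such a plug-in would work; as a toy illustration of machine computing assigning a rough credibility score to a post, here is a sketch using scikit-learn with a few invented training examples. A real system would need a large labeled corpus and much richer features than the post text alone.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set: 1 = likely credible, 0 = likely not
posts = [
    "Official statement from the fire department with incident number",
    "BREAKING!!! share before they delete this, media won't tell you",
    "Reuters confirms two injured, names withheld pending notification",
    "100% proof the whole thing was staged, wake up people",
]
labels = [1, 0, 1, 0]

# Bag-of-words features feeding a simple classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

new_post = "Police department posts verified casualty figures"
print(model.predict_proba([new_post])[0][1])  # rough 'credibility' score in [0, 1]
```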

Reflections:

The book does a good job of trying to be a centralized guideline for information verification in all aspects of the professional world. If all people and agencies used these guidelines, then I believe it would remove a great deal of misinformation and would save time for any emergency efforts trying to assist. Decreasing the number of fake reports would help increase the productivity of people who are actually trying to help.

This collection of ideals and practices runs under the assumption that professionals do not purposely spread false rumors because they are ethically not supposed to do so. Yet we have seen very extreme views given by several news anchors and show hosts, mostly built on personal opinion, with no backlash or repercussions for what they say. It is my belief that as long as there are people involved in information distribution, there is no real way to stop misinformation from being spread. Ultimately, as long as there is a person with an opinion behind information gathering or distribution, it will be impossible to eradicate fake news reports, or even embellished stories.

 Questions:
  • What can we do as individuals to prevent the spread of false reports within our social networks?
  • There is a debate on the effectiveness of algorithms and automated searches against the human element. Will machines ever completely replace humans?
  • Should there be a standard punishment for creating false reports or are the culprits protected by their 1st amendment rights? Are there any exceptions to your position on that idea?
  • Verification is a difficult job in which many people work together to get accurate information. Can you imagine a way (other than automation) to streamline how information is verified?

Read More

Integrating On-demand Fact-checking with Public Dialogue

Paper:

Kriplean, T., Bonnar, C., Borning, A., Kinney, B., & Gill, B. (2014). Integrating On-demand Fact-checking with Public Dialogue. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 1188–1199). New York, NY, USA: ACM. https://doi.org/10.1145/2531602.2531677

Discussion Leader: Sukrit V

Summary:

This article aims to understand the design space for inserting accurate information into public discourse in a non-confrontational manner. The authors integrate a request-based fact-checking service with an existing communication interface, ConsiderIt – a crowd-sourced voters guide. This integration involves the reintroduction of professionals and institutions – namely, librarians and public library systems – that have up to now been largely ignored, into crowdsourcing systems.

The authors note that existing communication interfaces for public discourse often fail to aid participants in identifying which claims are factual. The article first delves into different sources of factually correct information and the format in which it should be conveyed to participants. They then discuss who performs the work of identifying, collating and presenting this information: either professionals or the crowd. Lastly, where this information is presented is crucial: through single-function entities such as Snopes or PolitiFact, embedded responses, or overlays in chat interfaces.

Their system was deployed in the field during the course of a real election with voluntary users – the Living Voters Guide (LVG) – and utilized librarians from the Seattle Public Library (SPL) as the fact-checkers. Initial results indicated that participants were not opposed to the role played by these librarians. One key point to note is the labeling of points post verification: accurate, unverifiable and questionable. The term “questionable” was specifically chosen since it is more considerate of users’ feelings – as opposed to the negative connotation associated with “wrong” or a red X.

The rest of the article discusses balancing the task of informing LVG users of which pro/con points were factual but in a non-confrontational manner. The decision to prompt a fact-check was in the hands of the LVG participants and the fact-check was performed only on the factual component of claims and presented in an easy-to-assess manner. From the perspective of the SPL librarians, they played a crucial role in determining the underlying features of the fact-checking mechanism.

In the results, the authors determined that there was demand for a request-based fact-checking service, and that the SPL librarians were viewed and welcomed as trustworthy participants, which simultaneously helped improve the credibility of the LVG interface. Based on Monte Carlo simulations, the authors demonstrate that there was an observable decrease in commenting rates after fact-checking, once temporal effects were taken into account.
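
The paper’s exact simulation procedure isn’t detailed in this summary, but a generic Monte Carlo permutation test for the question “did commenting rates drop after fact-checking beyond what chance would produce?” might look like the following sketch (the comment counts are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily comment counts on a point, before and after a fact-check
before = np.array([12, 9, 14, 11, 10, 13])
after = np.array([7, 6, 9, 5, 8])

observed_drop = before.mean() - after.mean()

# Monte Carlo null: shuffle the before/after labels many times and see how
# often a drop at least this large appears by chance alone.
pooled = np.concatenate([before, after])
n_before = len(before)
sims = 10_000
count = 0
for _ in range(sims):
    rng.shuffle(pooled)
    if pooled[:n_before].mean() - pooled[n_before:].mean() >= observed_drop:
        count += 1

print(f"observed drop = {observed_drop:.2f}, p ~ {count / sims:.3f}")
```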

In conclusion, the authors note that the journalistic fact-checking framework did not interface well with librarian referencing methods. In their implementation, there was also no facility for enhanced communication between the librarians, the user whose point was being checked, and the requester. The method in which the fact-checks were displayed tended to dominate the discussion section and possibly caused a drop in comment rates. Some librarians were of the opinion that they were exceeding their professional boundaries when determining the authenticity of certain claims – especially those pertaining to legal matters.

Reflections:

The article made good headway in creating an interface to nudge people towards finding a common ground. This was done through the use of unbiased professionals/institution vis-à-vis librarians and the Seattle Public Library, in a communication interface.

The involvement of librarians – who are still highly trusted and respected by the public – is notable. These librarians help the LVG participants find verified information on claims, amid a deluge of conflicting information presented to them by other users and on the internet. One caveat – that can only be rectified through changes in existing laws – is that librarians cannot perform legal research. They are only allowed to provide links to related information.

On one hand, I commend the efforts of the authors to introduce a professional, unbiased fact-checker into a communication system filled with (possibly) misinformed and uninformed participants. On the other, I question the scalability of such efforts. The librarians set a 48-hour deadline on responding to requests, and in some cases it took up to two hours of research to verify a claim. Perhaps this system would benefit from a slightly tweaked learnersourcing approach utilizing response aggregation and subsequent expert evaluation.

Their Monte Carlo analysis was particularly useful in determining whether the fact-checking had any effect on comment frequency, versus temporal effects alone. I also appreciate the Value Sensitive Design approach the authors use to evaluate the fact-checking service from the viewpoint of the main and indirect stakeholders. The five-point Likert scale utilized by the authors also allows for some degree of flexibility in gauging stakeholder opinion, as opposed to binary responses.

Of particular mention was how ConsiderIt, their communication interface, utilized a PointRank algorithm which highlights points that were more highly scrutinized. Additionally, the system’s structure inherently disincentivizes gaming of the fact-checking service. The authors mention triaging requests to handle malicious users/pranksters. Perhaps this initial triage could be automated, instead of having to rely on human input.

I believe that this on-demand fact-checking system shows promise, but it will only truly function at a large scale if certain processes are automated and handled by software. Further, a messaging interface in which the librarian, the requester of the fact-check, and the original poster could converse directly with one another would be useful, though perhaps that would defeat the purpose of a transparent fact-checking system and undermine the point of a public dialogue system. Additionally, the authors note that there is little evidence that participants' short-term opinions changed, and I am unsure how to evaluate whether these opinions change in the long term.

Overall, ConsiderIt’s new fact-checking feature considerably improves the LVG user experience and integrates the work of professionals and institutions into a “commons-based peer production.”

Questions:

  • How, if possible, would one evaluate long-term change in opinion?
  • Would it be possible to introduce experts in the field of legal studies to aid librarians in the area of legal research? How would they interact with the librarians? What responsibilities do they have to the public to provide “accurate, unbiased, and courteous responses to all requests”?
  • How could this system be scaled to accommodate a much larger user base, while still allowing for accurate and timely fact-checking?
  • Are there certain types of public dialogues in which professionals and institutions should not, or cannot, lend a hand?


Amazon Mechanical Turk

Technology: Amazon Mechanical Turk (MTurk)

Demo Leader: Leanna Ireland

Summary:

Amazon Mechanical Turk (MTurk) is a crowdsourcing platform that connects requesters (researchers, etc.) to a human workforce who complete tasks in exchange for money. Both requesters and workers can be located all over the world.

Requesters provide the tasks and the compensation for workers. The tasks, or human intelligence tasks (HITs), can range from identifying photos and transcribing interviews to writing reviews and taking surveys. When creating a task, requesters can specify worker requirements, such as the number of HITs a worker has undertaken, the percentage of HITs approved, or a worker’s location. Other qualifications can be specified for an additional fee; these options include US political affiliation, education status, gender, and even left-handedness.

Requesters set the monetary reward. Many HITs on MTurk offer a relatively low reward (e.g., US $0.10). Some workers choose to pass over low-paying work; others will complete it anyway to increase their HIT approval rates. Requesters pay workers based on the quality of their work: they approve or reject each submission, and if a submission is rejected, the reward is not paid.

Overall, MTurk is an inexpensive and rapid form of data collection, often yielding participants who are more representative of the general population than other Internet and student samples (Buhrmester et al., 2011). However, MTurk participants can differ from the general population; Goodman and colleagues (2013) found, for example, that compared to community samples, MTurk participants pay less attention to experimental materials. In addition, MTurk raises ethical issues because of the often low rewards for workers: completing three twenty-minute tasks at $1.50 apiece, for example, does not allow workers to meet many mandated hourly minimum wages.
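
To spell out that arithmetic (a quick illustration; the $7.25 comparison figure is the current US federal minimum wage):

# Three twenty-minute tasks at $1.50 apiece
tasks, minutes_each, reward_each = 3, 20, 1.50

hours_worked = tasks * minutes_each / 60     # 1.0 hour of work
earnings = tasks * reward_each               # $4.50 earned
hourly_rate = earnings / hours_worked        # $4.50 per hour

US_FEDERAL_MINIMUM_WAGE = 7.25               # USD per hour
print(f"Effective rate: ${hourly_rate:.2f}/hr vs ${US_FEDERAL_MINIMUM_WAGE:.2f}/hr minimum")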

While MTurk is a great source for data collection, it can also be used in nefarious ways. This could include being asked to take a geotagged photo of the front counter of your local pharmacy: an innocent-enough task that could help determine local tobacco prices, or that could reveal a store’s location and front-counter security measures. In addition, requesters could crowdsource paid work at a lower rate, or even crowdsource class assignments to the US population, such as in the demo below…

Research about MTurk:

Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3–5.

Goodman, J. K., Cryder, C. E., & Cheema, A. (2013). Data collection in a flat world: The strengths and weaknesses of Mechanical Turk samples. Journal of Behavioral Decision Making, 26(3), 213–224.

Demo:

  1. Go to the MTurk website (https://www.mturk.com).
  2. You will be presented with two options: to either become a worker or a requester. Click ‘Get Started’ under the ‘Get Results from Mechanical Turk Workers’ to become a requester.
  3. For the purposes of the demo, click the ‘get started’ button under the ‘Work Distribution Made Easy’ option to become a requester.
  4. You will be asked to sign in to create a project on this page. Browse through the various options in the left column. Requesters can launch surveys, ask workers to choose an image they prefer, or ask for their sentiments about a tweet. Simply click ‘sign in to create project’.
  5. You will then be asked to sign in or set up an Amazon account.
  6. After signing into your account, you will be brought to a page with the following tabs: home, create, manage, developer, and help.
  7. To create a new project, click ‘create’ and then ‘new project’ directly below. You will now need to select the type of project you wish to design from the left column. I chose ‘survey link’, as I will be embedding a link from Qualtrics (a web-based survey tool) for the purposes of this demonstration, so the following instructions are for the survey link option. The survey asks a question from our previous week’s discussion: “What do you think is a bigger problem — incorrect analysis by amateurs, or amplifying of false information by professionals?”
  8. After you have indicated your choice of project, click the ‘create project’ button.
  9. Under the ‘Enter properties’ tab, provide a project name as well as a description of your HIT for the workers. You will also need to set up your HITs. This includes indicating the reward per assignment, the number of assignments per HIT (how many workers you want to complete your task), the time allotted per assignment, the HIT expiration, and the time window before payments are auto-approved. Lastly, you need to indicate worker requirements (e.g., location, HIT approval rate, number of HITs approved).
  10. Under the design layout step, you can use the HTML editor to lay out the HIT (e.g., write up the survey instructions and provide the survey link).
  11. You can then preview the instructions. After you have finished setting up your HIT, you will be taken back to the create tab, where your new HIT is listed. To publish the batch, simply click ‘Publish Batch’. You then need to confirm payment and publish.
  12. To view the results and to allocate payment, click ‘Manage’ and download the CSV. To approve payment, place an X in the ‘Approve’ column. To reject payment, place an X in the ‘Reject’ column. This CSV file is then uploaded to MTurk, where approvals and rejections are processed and payment is disbursed to the workers.
  13. To download results from MTurk, under the ‘manage’ tab, click ‘Results’ and download the CSV. Or, you can download the results from the platform you are using (e.g., Qualtrics). A scripted equivalent of this create-and-approve workflow is sketched after this list.
  14. Lastly, there is an entire community forum for MTurk workers entitled Turker Nation. Workers share tips and techniques and discuss all things MTurk and more (e.g., what HITs to complete but also which HITs or requesters to avoid). This can be a useful site to further advertise your HITs.
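
The steps above use the requester web interface. For completeness, here is a minimal sketch of the same create-and-approve workflow done programmatically with the boto3 MTurk client against the requester sandbox; the survey URL, reward, and timing values are placeholders, and valid AWS credentials would still be required.

import boto3

# Sandbox endpoint so no real money is spent while testing.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion pointing at a survey link (e.g., a Qualtrics survey).
survey_url = "https://example.qualtrics.com/your-survey"   # placeholder URL
question_xml = f"""<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>{survey_url}</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

# Roughly mirrors the 'Enter properties' step: title, description, reward,
# number of assignments, and timing. Worker requirements (location, approval
# rate, etc.) could be added via the QualificationRequirements parameter.
hit = mturk.create_hit(
    Title="Short opinion survey",
    Description="Answer one question about information quality online.",
    Keywords="survey, opinion",
    Reward="0.10",
    MaxAssignments=10,
    AssignmentDurationInSeconds=20 * 60,
    LifetimeInSeconds=3 * 24 * 60 * 60,
    AutoApprovalDelayInSeconds=3 * 24 * 60 * 60,
    Question=question_xml,
)
hit_id = hit["HIT"]["HITId"]
print("Created HIT:", hit_id)

# Later: review submitted work and approve it (mirrors the CSV approve/reject step).
assignments = mturk.list_assignments_for_hit(
    HITId=hit_id, AssignmentStatuses=["Submitted"]
)
for a in assignments["Assignments"]:
    mturk.approve_assignment(AssignmentId=a["AssignmentId"])

Switching from the sandbox to the live marketplace should only require changing the endpoint_url, which is one reason to test a HIT in the sandbox first.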
