This pair of papers examines the role of social media in healthcare, covering both attitudes toward vaccination and the prediction of depression. In Mitra et al., the authors use Twitter data to understand linguistic commonalities among users who are consistently pro-vaccine, consistently anti-vaccine, or who transition from pro- to anti-vaccine. They found that consistently anti-vaccine users are (my wording) conspiracy nutjobs who distrust the government and communicate very directly, whereas users who transition to anti-vaccine views seem to be actively seeking out that information, influenced more by specific concerns about vaccination than by general conspiracy-mindedness. The De Choudhury et al. paper also uses Twitter (along with mTurk) to measure social media behavioral attributes of depression sufferers. They analyze factors such as engagement, language, and posting-time distributions to understand which social media signals can be used to separate depressed and non-depressed populations. Following this analysis, they build a roughly 70%-accurate predictor of depression from those signals.
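To make the "separate depressed and non-depressed populations from behavioral features" idea concrete, here is a minimal stdlib-only sketch. It is emphatically not the authors' actual model, and the feature names and numbers are entirely made up for illustration: each user is a vector of hypothetical behavioral features (posts per day, fraction of late-night posts, first-person-pronoun rate), and a new user is assigned the label of the nearest per-class mean vector.

```python
from statistics import mean

def centroid(vectors):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return tuple(mean(col) for col in zip(*vectors))

def classify(user, centroids):
    """Label of the centroid nearest to `user` (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(user, centroids[label])),
    )

# Toy, invented training data: (posts/day, night-post fraction, pronoun rate).
depressed = [(2.0, 0.6, 0.08), (1.5, 0.7, 0.09)]
control = [(6.0, 0.2, 0.03), (8.0, 0.1, 0.02)]
centroids = {"depressed": centroid(depressed), "control": centroid(control)}

print(classify((2.2, 0.65, 0.07), centroids))  # lands nearer the "depressed" centroid
```

The real study works with far richer signals and a proper supervised classifier, but the skeleton is the same: summarize each user's social media behavior as a feature vector, then learn a boundary between the two populations.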
My biggest surprise with the Mitra et al. paper was the authors’ decision to exclude a cohort of users who transition from anti-vaccine to pro-vaccine. I understand the goals and motivations the authors have presented, but it feels to me that research focused on understanding how best to bring these misguided fools back to reality is just as important as the other way around. Understanding how to prevent others from diving into the anti-vaccine pit is also clearly useful research, but I’d be more interested in reading a study that gives recommendations for rehabilitation rather than prevention, as well as one that simply surveys which topics commonly appear in these users’ discussions around the time they return to sanity. I guess it’s a bit late to propose a new class project now, but this really interests me.
Going beyond the linguistic and topical analysis, I’d also be curious to run a network analysis study on this dataset. Twitter affords unidirectional relationships: an individual can follow another user with no guarantee of reciprocation. This leads to interesting research questions such as (1) if a prominent member of the anti-vaccine community follows me back, am I more likely to be influenced to join the community? (2) Is the interconnectedness of follow relationships within the anti-vaccine community stronger than in the general population? (3) How long does it take for an incoming member of the anti-vaccine group to become indistinguishable from a long-time member with respect to the number and strength of these follow relationships?
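Question (2) in particular reduces to standard graph metrics. As a sketch of what I have in mind (with a toy, invented edge list rather than real Twitter data), density captures how many of the possible follow edges exist within a cohort, and reciprocity captures how often a follow is returned:

```python
def follow_stats(edges):
    """Density and reciprocity of a directed follow graph.

    `edges` is a list of (follower, followee) pairs. Density is the fraction
    of possible directed edges (no self-loops) that are present; reciprocity
    is the fraction of edges whose reverse edge also exists.
    """
    edge_set = set(edges)
    nodes = {u for e in edge_set for u in e}
    possible = len(nodes) * (len(nodes) - 1)
    density = len(edge_set) / possible if possible else 0.0
    reciprocated = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    reciprocity = reciprocated / len(edge_set) if edge_set else 0.0
    return density, reciprocity

# Hypothetical follow edges within a small cohort: a and b follow each other,
# b follows c, and c follows a.
print(follow_stats([("a", "b"), ("b", "a"), ("b", "c"), ("c", "a")]))
```

Comparing these numbers for the anti-vaccine cohort against a same-size random sample of users would give a first-pass answer to whether that community is unusually tightly knit.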
As a depression sufferer myself, I found the De Choudhury et al. paper a very interesting read. I paused while reading to score myself on the cited revised version of the CES-D test, and the result was pretty much what I expected. So there’s one more data point in favor of the test’s accuracy.
I thought it was interesting that the authors recruited their participants via mTurk instead of going through more “traditional” routes like putting up flyers in psychiatrists’ offices. There’s certainly an advantage to recruiting a large number of participants easily through computational means, and the authors did work hard to restrict their study to quality participants, but I’m still a bit wary of using mTurkers for a study. This is especially true here, where the self-reported nature of mTurk work stacks with the self-reported nature of depression. Using public Twitter data from these users clearly helps firm up the analysis and conclusions, but my wariness about taking this route in a study of my own hasn’t faded since reading the paper.