Reflection #12 – [04/05] – [Meghendra Singh]

  1. Felbo, Bjarke, et al. “Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm.” arXiv preprint arXiv:1708.00524 (2017).
  2. Nguyen, Thin, et al. “Using linguistic and topic analysis to classify sub-groups of online depression communities.” Multimedia tools and applications 76.8 (2017): 10653-10676.

The first paper presents DeepMoji, an emoji-prediction deep neural network trained on 1.2 billion tweets. I found the paper and the DeepMoji demo available at https://deepmoji.mit.edu/ very compelling and fascinating. The key contribution of this work is to show how emoji occurrences in tweets can be used to learn richer emotional representations of text. To this end, the authors construct a deep neural network (DeepMoji) consisting of an embedding layer, two bidirectional LSTM layers, a Bahdanau attention mechanism, and a Softmax classifier at the end. The authors detail the pretraining process and three transfer learning approaches (full, last and chain-thaw), and evaluate the resulting models on 3 NLP tasks using 8 benchmark datasets across 5 domains. The results suggest that the model trained with the chain-thaw transfer learning procedure beats the state of the art on all the benchmark datasets.
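The chain-thaw procedure is easy to internalize as a layer-unfreezing schedule: train the new top layer alone, then fine-tune each layer individually from input to output, then train everything together. A minimal sketch of that schedule, using hypothetical layer names standing in for the DeepMoji stack (the real implementation of course trains actual weights with an optimizer at each stage):

```python
# Hypothetical layer names mirroring the DeepMoji stack:
# embedding -> two BiLSTMs -> attention + softmax classifier.
layers = ["embedding", "bilstm_1", "bilstm_2", "attention_softmax"]

def chain_thaw_schedule(layers):
    """Yield the list of trainable layers at each fine-tuning stage.

    Chain-thaw first trains the new top layer alone, then unfreezes
    and fine-tunes one layer at a time from the input side up, and
    finally fine-tunes the whole network at once.
    """
    yield [layers[-1]]        # stage 0: new classifier layer only
    for layer in layers:      # stages 1..n: one layer at a time
        yield [layer]
    yield list(layers)        # final stage: all layers together

stages = list(chain_thaw_schedule(layers))
```

Each stage corresponds to one fine-tuning pass over the target dataset with only the listed layers unfrozen; the staged approach is what lets the pretrained representations adapt to a new domain without being destroyed early on.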

I am not really sure how upsampling works here, and the authors do not discuss the upsampling technique used in the paper. It would also have been interesting to know whether the authors experimented with different network architectures before settling on the one presented here; how did they arrive at this particular architecture? Whether increasing the number of BiLSTM layers would improve performance on the benchmark datasets, and whether such a change would yield gains comparable to the chain-thaw transfer learning technique, are questions that could be explored. Moreover, since tweet length is limited to 280 characters, it may not be possible to analyze longer texts with high confidence using this technique, unless the study is repeated on a dataset of longer texts mapped to specific emojis. It might also be difficult to replicate this study for languages other than English and Mandarin, because large Twitter/Weibo-like data sources containing distant supervision labels in the form of emojis may not exist for other languages. Therefore, it will be interesting to see what other distant supervision techniques can be used to predict emotional labels for social media texts in other languages.

In Table 2, we see that most of the frequent emojis on Twitter are positive (laughter, sarcasm, love) while negative emojis (sad face, crying face, heartbreak) are fewer; I wonder if the same trend would be observed on other social media websites. Nevertheless, given the proliferation of emojis in computer-mediated communication, it would be interesting to repeat this study with data such as Facebook posts and comments, or posts and comments from other social websites. Additionally, since this approach can determine the various emotions associated with a text at a very granular level, it could be used to filter content or news for a user. For example, if a user only wants to read content that is optimistic and cheerful, this approach can filter out all the content that does not fall in that bucket. One can also imagine using this approach to detect the psychological state of an author: if the emotional content of an author’s posts remains consistently pessimistic, does that predict clinical conditions like depression, anxiety, or self-harm events?

This brings us to the second paper, which analyzes 38K posts from 24 Live Journal communities to discover the psycholinguistic features and content topics present in online communities discussing Depression, Bipolar Disorder, Self-Harm, Grief/Bereavement and Suicide. The study generated 68 psycholinguistic features using LIWC and 50 topics using LDA from the text (title and content) of 5K posts. The authors then use these topics and LIWC features as predictors in LASSO regression for the 5 subgroups of communities corresponding to the 5 disorders/conditions. They find that latent topics had greater predictive power than linguistic features for the bipolar disorder, grief/bereavement, and self-harm subgroups. The most interesting finding for me was that help-seeking was not a topic in any of the subgroups, and only the Bipolar Disorder subgroup discussed treatments. This seems very strange for communities dedicated to discussing psychological illness.
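Since the paper’s core modeling tool is LASSO, a toy implementation helps illustrate why it suits this setting: the L1 penalty snaps the weights of uninformative predictors to exactly zero, leaving a sparse, interpretable model over the remaining LIWC features and LDA topics. The following is a minimal NumPy sketch using cyclic coordinate descent on synthetic data (not the authors’ code or data), where only two of six stand-in “features” carry signal:

```python
import numpy as np

def lasso_coordinate_descent(X, y, alpha=0.1, n_iters=200):
    """Minimize (1/2n)||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        for j in range(d):
            # Partial residual with feature j's current contribution removed.
            r_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r_j / n
            z = (X[:, j] ** 2).sum() / n
            # Soft-thresholding: correlations below alpha give a weight of exactly 0.
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z
    return w

# Synthetic data: only the first two of six predictors matter, mimicking
# a handful of predictive topics among many LIWC/LDA candidate features.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
w = lasso_coordinate_descent(X, y)
```

After fitting, the four uninformative coefficients are driven to (near) zero while the two true predictors survive, which is exactly the behavior that makes the surviving topics and LIWC categories readable as the “signature” of each community subgroup.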

It would be interesting to repeat this experiment and see if the results remain consistent. I say this because the authors randomly sampled 5K posts, and this sampling may have missed certain topics or LIWC categories. It would also be interesting to know the statistics on the lengths of these posts, and whether length was taken into consideration when sampling them. Another point worth raising is that the Bipolar Disorder subgroup comprised a larger number of communities (7 of the 24); did this somehow affect the diversity of topics extracted? Perhaps it might be better to use all the posts from the 24 communities. We also see that LASSO outperformed the other three classifiers, and it would be interesting to see whether ensemble classifiers would outperform it. Overall, the second paper was an excellent read and presented some very interesting results.
