4/22/20 – Lee Lisle – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Summary

Chan et al.’s paper presents a way to find analogies between research papers through mixed-initiative analysis. It combines human annotators, who mark up sections of abstracts, with machine learning algorithms that identify key words in those sections in order to distill the research down into a base analogy. The system then compares across abstracts to find papers with the same or similar characteristics. This enables researchers to find related work as well as potentially apply existing methods to new problems. The authors evaluated these techniques through three studies. The first study used grad students reading and annotating abstracts from their own domain as a “best-case” scenario; the tool worked very well with the annotated data compared to an all-words baseline. The second study looked at helping researchers find analogies for similar problems, using out-of-domain experts to annotate abstracts; the tool surfaced more possible new research directions than the all-words baseline. Lastly, the third study sought to scale up annotation through crowdsourcing. While the annotations from MTurk workers were of lower quality, they still outperformed the all-words baseline.
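To make the matching idea concrete, here is a minimal sketch (not the authors' code) of the core comparison step, assuming each abstract has already been annotated into purpose and mechanism spans as in the paper's scheme. The tiny embedding table and example abstracts below are hypothetical placeholders standing in for real GloVe vectors and real annotations.

```python
import numpy as np

# Toy stand-in for a real word-embedding lookup table (values are made up).
EMBEDDINGS = {
    "reduce":    np.array([0.9, 0.1, 0.0]),
    "noise":     np.array([0.8, 0.2, 0.1]),
    "vibration": np.array([0.7, 0.2, 0.2]),
    "filter":    np.array([0.1, 0.9, 0.2]),
    "damping":   np.array([0.2, 0.3, 0.9]),
}

def embed(span: str) -> np.ndarray:
    """Average the vectors of the in-vocabulary words in an annotated span."""
    vecs = [EMBEDDINGS[w] for w in span.lower().split() if w in EMBEDDINGS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Two hypothetical annotated abstracts.
paper_a = {"purpose": "reduce noise", "mechanism": "filter"}
paper_b = {"purpose": "reduce vibration", "mechanism": "damping"}

# Papers with a similar purpose but a different mechanism are analogy candidates.
print(f"purpose similarity:   {cosine(embed(paper_a['purpose']), embed(paper_b['purpose'])):.2f}")
print(f"mechanism similarity: {cosine(embed(paper_a['mechanism']), embed(paper_b['mechanism'])):.2f}")
```

In the paper's terms, ranking candidate papers by purpose similarity while allowing mechanism to differ is what surfaces papers whose methods might transfer to a new problem.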

Personal Reflection

I liked this tool quite a bit, as it seems a good way to get “unstuck” in the research black hole and find new ways of solving problems. I also appreciated that the annotations didn’t necessarily require domain-specific or even researcher-specific knowledge, despite the jargon involved. Furthermore, though it confused me initially, I liked how they used their own abstract as an extra figure of sorts: annotating their own abstract with their own approach was a clever way to show how the method works without requiring the reader to finish the entire paper.

I did find a few things confusing about their paper, however. They state in one section that the GloVe model doesn’t work very well, but then use it in another. Why go back to using it if it had already disappointed them in an earlier phase? Another issue I noticed was that they didn’t define the dataset for the third study. Where did the papers come from? I can glean from reading that they came from one of the prior two studies, but I think it’s relevant to ask whether it was the domain-specific dataset, the domain-agnostic dataset, or both.

I was also curious about total deployment time for this kind of thing. Did the crowd annotate all of the papers in 10 minutes? 60 minutes? A day? Given how parallelizable the task is, I imagine the analysis could be completed very quickly. While this task doesn’t need to be fast, speed could be an excellent bonus of the approach.

Questions

  1. This tool seems extremely useful. When would you use it? What would you hope to find using this tool?
  2. Is the annotation of 10,000 research papers worth $4000? Why or why not?
  3. Based on their future work, what do you think is the best direction to go with this approach? Considering the cost of the crowdworkers, would you pay for a tool like this, and how much would be reasonable?

One thought on “4/22/20 – Lee Lisle – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers”

  1. I agree. The tool certainly seems useful. I would use it at multiple stages of a project, from initial conceptualization to writing the Related Work section of the paper. Early on, I would hope to draw inspiration from distant domains, and over time to see how similar or different my work is from other papers in my domain.
