04/22/20 – Lulwah AlKulaib-Solvent

Summary

The paper argues that scientific discoveries are based on analogies in distant domains. Researchers now struggle to find such analogies because the number of papers in each discipline is growing rapidly and useful analogies in unfamiliar domains are hard to spot. To address this, the authors propose Solvent, a mixed-initiative system for finding analogies between research papers. Human annotators structure the abstracts of academic papers into different aspects, and a model then constructs semantic representations from those annotations. The resulting representations are used to find analogies among research papers within a domain and across different domains. In their studies, the authors show that the system finds more analogies than existing baseline approaches from the information retrieval field. They report that it outperforms the state of the art, that the annotations generalize beyond the domain, and that experts judge the analogies found by the semantic model to be useful. Their system is a step in a new direction toward computationally augmented knowledge sharing between different fields.
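To make the pipeline above more concrete, here is a minimal sketch of aspect-based matching. It is not the authors' implementation: the aspect names ("purpose", "mechanism"), the toy token lists, and the bag-of-words vectors are all assumptions chosen for illustration, and the function names are hypothetical. The idea is only that once annotators have split each abstract into aspects, papers can be compared on one aspect at a time.

```python
from collections import Counter
from math import sqrt


def aspect_vector(tokens):
    """Bag-of-words vector for the tokens annotated with one aspect (an assumed representation)."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_aspect(query, papers, aspect):
    """Return paper ids sorted from most to least similar to the query on one aspect."""
    q = aspect_vector(query[aspect])
    scored = [(cosine(q, aspect_vector(p[aspect])), p["id"]) for p in papers]
    # Sort by descending score, breaking ties by id so the order is deterministic.
    return [pid for _, pid in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical annotated abstracts: each aspect holds the tokens an annotator tagged.
query = {"purpose": ["reduce", "drag", "surface"], "mechanism": ["riblet", "texture"]}
papers = [
    {"id": "A", "purpose": ["reduce", "drag", "hull"], "mechanism": ["polymer", "coating"]},
    {"id": "B", "purpose": ["classify", "images"], "mechanism": ["riblet", "texture"]},
]

same_purpose = rank_by_aspect(query, papers, "purpose")
same_mechanism = rank_by_aspect(query, papers, "mechanism")
```

In this toy data, paper A shares the query's purpose but uses a different mechanism, which is the kind of match an analogy search would want to surface. Matching on whole abstracts would blur that distinction.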

Reflection

This was a very interesting paper to read. The authors' use of scientific ontologies and scholarly discourse, like those in the Core Information about Scientific Papers (CISP) ontology, makes me think of how relevant their work is, even though their goal differs from that of the corpora paper. I found the section where they explain how they adapted the annotation methods for research papers very useful for a different project.

One thing I had in mind while reading the paper was how well this scales to larger datasets. As the studies show, the datasets are relatively small. The authors explain in the limitations that part of the bottleneck is having a good set of gold standard matches that they can use to evaluate their approach. I think that’s a valid reason, but it still doesn’t answer the questions of what scaling up would require and how well it would work.

When going over their results and seeing how they outperformed existing state-of-the-art models and approaches, I also thought about real-world applications and how useful this model is. I had never thought of using analogies to drive discovery across scientific domains. I always thought it would be more reliable to have a co-author from that domain weigh in. Especially nowadays, with the vast communities of academics and researchers on social media, it’s no longer that hard to find a collaborator in a domain that isn’t yours. Also, looking at their results, the high precision held only when recommending the top k% of most similar pairs as analogies. I wonder whether automating this has a greater impact than relying on the knowledge of a domain expert.
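For readers unfamiliar with the "top k%" framing, the sketch below shows one generic way to compute precision over only the highest-scoring pairs. It is an assumption about the general shape of such a metric, not the paper's exact evaluation protocol; the function name, the pair ids, the scores, and the gold set are all made up for illustration.

```python
from math import ceil


def precision_at_top_percent(scored_pairs, gold_pairs, k_percent):
    """Fraction of the top k% highest-scoring pairs that appear in the gold standard set."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    if not ranked:
        return 0.0
    # Always keep at least one pair so the ratio is defined.
    cutoff = max(1, ceil(len(ranked) * k_percent / 100))
    top = ranked[:cutoff]
    hits = sum(1 for pair, _ in top if pair in gold_pairs)
    return hits / len(top)


# Hypothetical similarity scores for candidate paper pairs.
scored = [
    (("p1", "p2"), 0.9),
    (("p1", "p3"), 0.8),
    (("p3", "p4"), 0.7),
    (("p2", "p4"), 0.2),
    (("p1", "p4"), 0.1),
]
# Hypothetical gold standard analogy matches.
gold = {("p1", "p2"), ("p3", "p4")}

p_top20 = precision_at_top_percent(scored, gold, 20)
p_top40 = precision_at_top_percent(scored, gold, 40)
p_all = precision_at_top_percent(scored, gold, 100)
```

Precision usually falls as k grows, as it does in this toy data, because lower-scoring pairs are less likely to be true matches. That is why a result that holds only for a small top k% says little about the rest of the ranking.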

Discussion

  • Would you use a similar model in your research?
  • What would be an application that you can think of where this would help you while working on a new domain?
  • Do you think that this system could be outperformed by using a domain expert instead of the automated process?
