04/29/2020 – Yuhang Liu – Accelerating Innovation Through Analogy Mining

Summary:

This article is similar to a previous paper, “SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers”, in that it focuses on how to search a vast corpus for analogies that can continually spark new ideas. Its research method builds on earlier work, with some changes. Discovering new ideas by analogy requires understanding the deep relational similarity between two entities that may be very different in terms of surface attributes, which makes finding analogies challenging for machines. Previous studies relied on similarity-based representations such as TF-IDF, LSA, LDA, and GloVe. The authors of this paper instead investigate a weaker structural representation; the goal is a representation that can be learned while still being expressive enough to allow analogical mining. The proposed method asks crowdsourced workers to annotate the purpose and mechanism of each product description. A model then learns these representations from the annotations, and the system retrieves ideas that serve a similar purpose through a different mechanism. In their tests, the authors found that this approach works better than the other methods.
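
To make the "similar purpose, different mechanism" search concrete, here is a minimal sketch in Python. It is not the paper's model: the two-dimensional vectors are made-up toy values standing in for learned purpose and mechanism representations, and the names `find_analogies`, `min_purpose_sim`, and `top_k` are hypothetical. The sketch keeps items whose purpose is close to the query and then ranks them so that the most different mechanism comes first.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_analogies(query, corpus, min_purpose_sim=0.8, top_k=3):
    """Return names of items with a similar purpose, most different mechanism first.

    The threshold and ranking rule are illustrative assumptions,
    not the procedure used in the paper.
    """
    candidates = []
    for item in corpus:
        purpose_sim = cosine(query["purpose"], item["purpose"])
        mechanism_sim = cosine(query["mechanism"], item["mechanism"])
        if purpose_sim >= min_purpose_sim:
            candidates.append((mechanism_sim, item["name"]))
    candidates.sort()  # lowest mechanism similarity first
    return [name for _, name in candidates[:top_k]]


# Toy vectors (assumed for illustration only).
query = {"name": "query", "purpose": [1.0, 0.0], "mechanism": [1.0, 0.0]}
corpus = [
    {"name": "near_duplicate", "purpose": [1.0, 0.1], "mechanism": [1.0, 0.1]},
    {"name": "analogy", "purpose": [0.9, 0.1], "mechanism": [0.0, 1.0]},
    {"name": "unrelated", "purpose": [0.0, 1.0], "mechanism": [0.0, 1.0]},
]

result = find_analogies(query, corpus)
print(result)
```

With these toy values, the item that shares the query's purpose but uses a different mechanism is ranked ahead of the near duplicate, and the item with an unrelated purpose is filtered out. A single surface-similarity score could not make that distinction, because it does not separate purpose from mechanism.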

Reflection:

First of all, having read another paper on a similar subject, I want to compare the two articles. In the other article, people annotate an article along four aspects: background, purpose, mechanism, and findings. The machine then finds analogies by comparing and learning from these four structural aspects. In this article, each idea is divided into only two parts, the purpose and the mechanism, which the authors call a relatively weak structured representation. Separating an idea into purpose and mechanism enables core analogical innovation processes such as repurposing, so the final experiment in this article is also based on finding the same purpose with a different mechanism. Because only two dimensions represent an idea, the representation is more abstract and broad, and it can be learned directly with a supervised method. The benefit is that these representations can be extracted automatically from product descriptions, which gives the approach potentially wide applicability. Identifying key components and functions can also improve the system’s search function and help it better understand users’ needs.

Secondly, I think the most important thing about a system that finds analogies between papers or products is its feasibility, which is the reason why I think the method in this paper is better. In terms of the feasibility of the system, asking a crowdsourced worker or a machine to identify the purpose and mechanism described in a text is far simpler than analyzing its full structure (background, purpose, mechanism, findings). Simpler tasks produce fewer errors, so a better data set is available for learning. At the same time, in terms of the feasibility of the ideas that are generated, the system brings in graduate students to judge comprehensively whether the ideas are feasible. From my perspective, this is particularly important: if an idea is unrealizable or cannot withstand testing, then no matter how novel it is, it is useless. So in my opinion this is the advantage of the system.

But at the same time, I have some doubts about the results of the system, because it seems more inclined to find different mechanisms for the same purpose. Ideas that apply the same mechanism to a different purpose seem difficult to obtain. I think the most difficult part of innovation is applying an idea to a new field; the examples of bionics that come to mind most readily are the fish’s swim bladder and the submarine, and the bat and radar. In my opinion, the inventions that have a significant impact on people’s lives are usually applications of mechanisms from other fields to achieve new goals. So I think unresearched fields and unrealized purposes could be a direction for future research on this method.
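
The reverse query suggested here (same mechanism, different purpose) can be sketched by flipping the roles of the two representations. This is only an illustration under stated assumptions: the vectors are made-up toy values, and `repurposing_score` is a hypothetical scoring rule, not something taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def repurposing_score(query, item):
    """Higher when the mechanism matches the query but the purpose does not.

    Hypothetical rule: mechanism similarity minus purpose similarity.
    """
    mechanism_sim = cosine(query["mechanism"], item["mechanism"])
    purpose_sim = cosine(query["purpose"], item["purpose"])
    return mechanism_sim - purpose_sim


# Toy vectors (assumed for illustration only).
query = {"purpose": [1.0, 0.0], "mechanism": [0.0, 1.0]}
corpus = [
    {"name": "same_purpose_new_mechanism", "purpose": [1.0, 0.0], "mechanism": [1.0, 0.0]},
    {"name": "same_mechanism_new_purpose", "purpose": [0.0, 1.0], "mechanism": [0.0, 1.0]},
    {"name": "near_duplicate", "purpose": [1.0, 0.0], "mechanism": [0.0, 1.0]},
]

ranked = sorted(corpus, key=lambda item: repurposing_score(query, item), reverse=True)
ranking = [item["name"] for item in ranked]
print(ranking)
```

Under this rule, the item that reuses the query's mechanism for a new purpose comes first. Whether learned representations would support this direction as well as the purpose-first direction remains the open question raised above.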

Question:

  1. What do you think are the limitations of the method proposed in this article?
  2. What methods do you think can be used to evaluate the usefulness of ideas generated by analogy?
  3. What do you think is more important for finding an analogy: surface similarity, structural similarity, or some other factor?

One thought on “04/29/2020 – Yuhang Liu – Accelerating Innovation Through Analogy Mining”

  1. Hi Yuhang, great reflection! To address your first question, the first limitation I can think of is data quality control. The researchers collect the purposes and mechanisms that crowd workers annotate in product descriptions and feed this dataset into deep learning models. As presented in other papers we discussed in class, high-quality data becomes much more difficult to obtain as the complexity of the task increases. Noise and bias in the training data may negatively affect model performance. The second limitation relates to the RNN model. Although this sequence-to-sequence model achieves good performance on a variety of NLP tasks, its usage is constrained by sequence length, so it may not be able to keep track of long-term dependencies.
