04/29/20 – Sukrit Venkatagiri – Accelerating Innovation Through Analogy Mining

Paper: Tom Hope, Joel Chan, Aniket Kittur, and Dafna Shahaf. 2017. Accelerating Innovation Through Analogy Mining. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), 235–243. https://doi.org/10.1145/3097983.3098038

Summary: This paper addresses the challenge of mining analogies from large, real-world repositories, such as patent databases. Such databases pose challenges because they are highly relational but sparse in nature. This is one reason why machine learning approaches do not fare well when applied to these types of databases: they cannot formulate a pattern of the underlying structure, which is important for analogy mining. The corpora are also expensive to build, store, and update, and automation cannot be easily applied. The authors overcome these limitations by leveraging the creativity of the crowd and the affordable computational capabilities of recurrent neural networks (RNNs). The approach uses a structured purpose-mechanism schema for identifying analogies between two product descriptions. Finally, the authors evaluate the ideas that crowd workers generate by asking graduate students to rate them on three criteria: quality, novelty, and feasibility. They find that their approach increased the feasibility of participants' ideas.

Reflection:
Overall, I really liked how the paper attempts to solve a hard problem with a scalable approach, crowds and RNNs, and tests it on a real-world dataset. I also liked how the paper defines similarity between different ideas (i.e., analogies) based on their purpose and the mechanisms through which products work. Further, the paper suggests more complex metrics for research papers. This raises the question: how much more difficult is it to mine analogies for complex or more abstract ideas than for simple ones? Perhaps structured labels could help in that regard.

The approach itself is commendable since it is a great example of a mixed-initiative user interface that combines the creativity of the crowd and the affordable computation of RNNs. Further, this approach does not needlessly waste human computation. The authors also completed a thorough evaluation of the machine intelligence portion.

Second, I appreciate the approach taken towards making something subjective—in this case, creativity—into something more objective, by breaking it down into different rate-able metrics.

Finally, I found the idea of using randomly generated analogies to “spark creativity” interesting, and the results suggest that creativity really does need diverse ideas. I wonder why this may be, and how to introduce such randomness into real-world work practice.

Questions:
1. How scalable do you think the system is? What other limitations does it have?
2. Can this approach be used to generate analogies in other fields? What would be different?
3. Do you think creativity is subjective? Can it be made into something objective?

                                                                          
