04/29/2020 – Dylan Finch – Accelerating Innovation Through Analogy Mining

Word count: 579

Summary of the Reading

This paper aims to make it easier to find analogies in large, unstructured datasets that span a variety of domains. The work has many real-world applications and targets real examples of such datasets, like data from the US patent office. The system works by collecting two pieces of information about each entry in the dataset: its purpose and the mechanism that achieves that purpose. Having both features for every entry makes it easier to find analogies.
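To make the purpose and mechanism idea a little more concrete, here is a minimal sketch in Python. It is not the paper's implementation. The toy vectors, the entry names, the function `find_analogies`, and the scoring rule (favor entries with a similar purpose but a different mechanism) are all my own assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_analogies(query, entries, top_k=2):
    """Rank entries by (purpose similarity - mechanism similarity).

    Assumption: a useful analogy shares the query's purpose but achieves it
    with a different mechanism. This scoring rule is hypothetical.
    """
    scored = []
    for entry in entries:
        if entry["name"] == query["name"]:
            continue  # skip the query itself
        purpose_sim = cosine(query["purpose"], entry["purpose"])
        mechanism_sim = cosine(query["mechanism"], entry["mechanism"])
        scored.append((purpose_sim - mechanism_sim, entry["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical hand-made vectors; a real system would learn these from text.
ENTRIES = [
    {"name": "phone case with battery",
     "purpose": [1.0, 0.0, 0.0], "mechanism": [1.0, 0.0, 0.0]},
    {"name": "solar backpack charger",
     "purpose": [0.9, 0.1, 0.0], "mechanism": [0.0, 1.0, 0.0]},
    {"name": "hand-crank radio charger",
     "purpose": [0.8, 0.2, 0.0], "mechanism": [0.0, 0.0, 1.0]},
    {"name": "battery-heated gloves",
     "purpose": [0.0, 0.0, 1.0], "mechanism": [1.0, 0.0, 0.0]},
]

if __name__ == "__main__":
    print(find_analogies(ENTRIES[0], ENTRIES))
```

In this toy data, the two chargers rank above the heated gloves because they share the query's purpose (charging a device) while using a different mechanism, whereas the gloves share only the mechanism.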

The system was evaluated on how well it helped with ideation by analogy. The evaluation asked crowd workers to come up with new product ideas and showed them example products to help. Some workers were shown products that the system described in this paper found to be similar. Others were shown products found to be similar by other means, and others were shown random products. The evaluation showed that this system helped the workers come up with better ideas than the other two methods.

Reflections and Connections

I think this problem is really one that spans generations. The paper brings up the troubles of trying to use a real-world dataset like the one from the US patent office. In a case like this, I think the data presents different challenges depending on which period you look at. For data from the past, there are probably inconsistencies in the formats, the data may be spread across multiple sources, and some of it may have been lost or changed over time. For data from the present, there is just so much of it. With so many more people, and with intellectual property more valuable than ever, the US patent office probably has more data about inventions than it can deal with. These are two very different challenges that a dataset like the one from the US patent office faces, and they are a great reason why research like this is so sorely needed.

The idea of this paper also reminds me of a paper from last week: SOLVENT. Both papers try to make it easier for researchers to find analogies in data sources. I think the existence of both papers helps to illustrate the need for technology like this. In fact, neither paper cites the other, even though they work on very similar research. Perhaps if there had been a widely available version of SOLVENT, the authors would have been able to find each other’s papers and build on each other’s work.

I think that as these datasets get larger and larger, easier ways to access what is in them will become more and more important. The number of patents and papers is growing faster than ever before, which means that it is easier than ever for valuable knowledge to be lost. We need to start implementing more ideas like this so that we don’t lose important knowledge. I hope that the existence of both of these papers helps to show others the real need for technology like this.

Questions

  1. Do you think this system is better or worse than SOLVENT?
  2. What is another real-world, unstructured data source that a system like this might work well on?
  3. What are some applications for this system outside of ideation?
