04/29/2020 – Mohannad Al Ameedi – Accelerating Innovation Through Analogy Mining

Summary

In this paper, the authors aim to improve the search and discovery of ideas through analogies in massive, unstructured datasets. Their approach combines crowd workers with a recurrent neural network to learn a weak structural representation of each product as vectors. The authors used a patent dataset to search for product descriptions, and used crowd workers to extract the purpose and mechanism of each product to help find ideas across different domains. They hired workers on Amazon Mechanical Turk to perform a dual annotation on each product description: one label marks the parts of the text related to the purpose of the product, and the other marks the parts related to the mechanism, or the way the product works. The authors then used a bidirectional recurrent neural network and information retrieval techniques to find a deeper and more accurate similarity between the searched idea and the existing innovation and research around it. The authors' approach achieves high precision and recall and can improve retrieval accuracy by 25%.

Reflection

I think the approach used by the authors is very interesting. Extracting the purpose and mechanism from a product description is like looking at the data from two different angles. Calculating the similarity based on two vectors is a nice implementation, and it can help find a close relationship between two subjects in different domains that share a common attribute.
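To make the two-vector idea concrete, here is a minimal sketch of how a purpose vector and a mechanism vector could be compared separately. It is only an illustration under my own assumptions, not the authors' implementation: the toy vectors, the product names, and the `compare` helper are all hypothetical, and in the paper the vectors are learned by the neural network rather than written by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical hand-made vectors; a real system would learn these.
products = {
    "query":          {"purpose": [1.0, 0.0, 0.2], "mechanism": [0.9, 0.1, 0.0]},
    "near_duplicate": {"purpose": [0.9, 0.1, 0.2], "mechanism": [1.0, 0.0, 0.1]},
    "analogy":        {"purpose": [1.0, 0.1, 0.1], "mechanism": [0.0, 1.0, 0.3]},
}

def compare(a, b):
    """Return (purpose similarity, mechanism similarity) for two products."""
    return (
        cosine(a["purpose"], b["purpose"]),
        cosine(a["mechanism"], b["mechanism"]),
    )

for name in ("near_duplicate", "analogy"):
    purpose_sim, mechanism_sim = compare(products["query"], products[name])
    print(f"{name}: purpose={purpose_sim:.2f}, mechanism={mechanism_sim:.2f}")
```

In this toy data, the "analogy" product shares the query's purpose but solves it with a different mechanism, which is the kind of match a single combined similarity score could hide.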

I also like the idea of using deep learning instead of TF-IDF to calculate the similarity between two products' descriptions, as it can improve the quality of the search.
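As a rough illustration of why a keyword-based baseline can struggle, the sketch below computes TF-IDF cosine similarity by hand on three made-up product descriptions. Everything here is my assumption for the example: the descriptions are invented, and the smoothed IDF formula is one common variant, not necessarily the one used as the baseline in the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts of term -> weight) for a list of strings."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

def cosine_sparse(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Invented descriptions: 0 and 1 share a purpose and mechanism but few words;
# 0 and 2 share many words but use a different mechanism.
docs = [
    "a case that charges the phone with solar panels",
    "a cover that powers a handset using sunlight",
    "a case that charges the phone with a hand crank",
]

vecs = tfidf_vectors(docs)
print("paraphrase pair:", round(cosine_sparse(vecs[0], vecs[1]), 3))
print("shared-wording pair:", round(cosine_sparse(vecs[0], vecs[2]), 3))
```

The paraphrased description scores lower than the one that merely shares wording, which is the gap a learned representation is meant to close.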

I personally use Google Scholar to search for similar ideas, but I had not used the websites mentioned in the paper, so learning about them was something I took away from reading it.

This approach could be used as a verification tool when reviewing a copyright application. An idea might be the same as an existing idea in a different domain, and the application built by the authors can help uncover this.

This approach is like mapping vocabulary to a concept space to improve information retrieval, performing latent space indexing rather than just computing similarity on keywords. Different words might have the same meaning, and one word might have different meanings. Searching based on keywords might retrieve incorrect results, while searching based on concepts might lead to much more accurate results.

Questions

  • The authors asked the crowd workers to extract two pieces of information, the purpose and mechanism, from the product description. Can we use this approach to solve a different problem?
  • Do you agree with the authors that the recurrent neural network is better than traditional TF-IDF in calculating the similarity for the two vectors? Why or why not?
  • Can you use a similar approach in your project, asking the crowd workers to annotate your data from two different perspectives or to look at your data from two different angles?
  • The authors mentioned more than two websites that store information about patents. Have you used these websites?

One thought on “04/29/2020 – Mohannad Al Ameedi – Accelerating Innovation Through Analogy Mining”

  1. Hello Mohannad. Regarding your first question, “The authors asked the crowd workers to extract two pieces of information, the purpose, and mechanism, from the product description. Can we use this approach to solve a different problem?”, I feel that this can certainly be applied to other domains. For instance, one potential extension could be integrating this system with recommender systems. Extracting metadata from a user’s browsing history, comparing it with other available content, and recommending new content could help users and potentially save them time.
