This paper proposes a novel mixed-initiative method called SOLVENT, in which crowd workers annotate the relevant parts of a document by purpose and mechanism, and the documents are then represented in a vector space. The authors identify several obstacles to representing technical documents with the purpose-mechanism concept using crowd workers: technical jargon, multiple sub-problems in one document, and the presence of understanding-oriented papers. Therefore, the authors extend the structure to hold background, purpose, mechanism, and findings instead. With each document represented by this structure, the authors were able to apply natural language processing techniques to perform analogical queries. The authors found better query results than with a baseline all-words representation. To scale the approach, the authors had workers from Upwork and Mturk annotate technical documents. They found that the workers struggled with the concepts of purpose and mechanism, but their annotations still improved analogy mining.
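To make the idea of facet-based analogical queries concrete, here is a minimal sketch of how I picture it. It is not the authors' implementation: it swaps in plain bag-of-words counts and cosine similarity for whatever vector representation the paper uses, and the names `FACETS`, `bow`, `cosine`, `facet_vectors`, and `rank_by_facet`, as well as the toy documents, are all hypothetical.

```python
import math
from collections import Counter

# The four annotation facets described in the summary above.
FACETS = ("background", "purpose", "mechanism", "findings")


def bow(text):
    """Lowercased bag-of-words counts for one annotated span (assumed stand-in vectorizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def facet_vectors(annotated_doc):
    """One vector per facet; a facet the annotators left out becomes an empty vector."""
    return {f: bow(annotated_doc.get(f, "")) for f in FACETS}


def rank_by_facet(query, docs, facet):
    """Rank docs by similarity to the query on a single facet, most similar first."""
    qv = facet_vectors(query)[facet]
    scored = [(cosine(qv, facet_vectors(d)[facet]), d["title"]) for d in docs]
    return sorted(scored, reverse=True)


# Hypothetical annotated documents, invented purely for illustration.
docs = [
    {"title": "A", "purpose": "reduce noise in audio signals",
     "mechanism": "spectral filter"},
    {"title": "B", "purpose": "classify images of cats",
     "mechanism": "convolutional network"},
]
query = {"title": "Q", "purpose": "reduce noise in sensor signals"}

ranking = rank_by_facet(query, docs, "purpose")
print(ranking)
```

Querying on `"purpose"` alone is what lets a paper with a different mechanism, or from a different domain, still surface as an analogy; an all-words baseline would mix every facet into a single vector.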
I think this study would go nicely with document summarization studies. It would especially help that the annotations are organized by specific categories. I remember one of our class’s projects involved ETDs and required summaries. I think this study could have benefited that project, given enough time.
This study could also have benefited my own project. One of the sample use cases that the paper introduced was improving creative collaboration between users. This is similar to my project, which is about providing creative references for a creative writer. However, if I wanted to apply this study to my project, I would need to additionally label each of the references provided by the Mturk workers by purpose and mechanism. This would add to the cost of providing each creative reference. This study would have been very useful if I had enough money and wanted higher-quality content rankings in terms of analogy.
It was interesting that the authors mentioned that papers from different domains could still have the same purpose-mechanism. It made me wonder if researchers would really want papers with a similar purpose-mechanism from a different domain. I understand multi-disciplinary work is being highlighted these days, but would each of the disciplines involved in a study try to address the same purpose and mechanism? Wouldn’t they address different components of the project?
The following are the questions that I had while reading the paper.
1. The paper notes that many technical documents are understanding-oriented papers that have no purpose-mechanism mappings. The authors resolved this problem by defining a larger mapping that is able to include these documents. Do you think the query results would have had higher quality if the mapping was kept compact instead of increasing the size? For example, would it have helped if the system separated purpose-mechanism and purpose-findings?
2. As mentioned in my reflection, do you think the disciplines involved in a multi-disciplinary project all have the same purpose and mechanism? If not, why?
3. Would you use this paper for your project? In other words, does your project require users or the system to locate analogies inside a text document? How would you use the system? What kind of queries would you need out of the possible combinations (background, purpose, mechanism, findings)?
I agree with your comment that this work might help with the summarization task. It would give an idea of the related documents on a topic. From that point, we would still need to use extractive or abstractive methods to actually form the summaries. Nonetheless, the method would be useful in such tasks. We are the team working on ETDs, mainly focusing on classification and citation parsing tasks.
I believe I can answer your second question, at least in reference to my own interdisciplinary work. IST is about creating a tool that helps different disciplines make sense of large document sets. We are specifically doing this by evaluating the tool from a historical analysis perspective in conjunction with the department of history at VT. I believe that our purpose-mechanisms are the same. The CS side of the project wants to create a tool that can help analysts perform their work, and the history side of the project wants the same thing. The purpose and manner of the mechanism are determined by the history side. I believe this will hold true for other projects as well, though finding a counter-example would be time-consuming.