04/22/2020 – Sushmethaa Muhundan – SOLVENT: A Mixed-Initiative System for Finding Analogies between Research Papers

This work explores a mixed-initiative approach to finding analogies between research papers. The study uses research experts as well as crowd workers to annotate research papers by marking each paper's purpose, mechanism, and findings. Using these annotations, a semantic vector representation is constructed that can be used to compare research papers and identify analogies both within and across domains. The paper aims to leverage the inherent causal relationship between purpose and mechanism to build “soft” relational schemas that can be compared to determine analogies. Three studies were conducted as part of this paper. The first study tested the system’s quality and feasibility by asking domain expert researchers to annotate 50 research papers. The second study explored whether the system would be beneficial to actual researchers looking for analogical inspiration. The third study used crowd workers as annotators to explore scalability. The results showed that annotating the purpose and mechanism aspects of research papers is scalable in terms of cost, not critically dependent on annotator expertise, and generalizable across domains.
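To make the core idea concrete, below is a minimal, hypothetical sketch (in Python, with scikit-learn) of how purpose and mechanism annotations could be vectorized and compared. This is not the paper's actual pipeline; the toy papers and the "similar purpose, different mechanism" heuristic are my own illustration.

```python
# Hedged sketch: compare papers on annotated purpose/mechanism spans.
# The papers, spans, and scoring heuristic below are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

papers = {
    "paper_a": {"purpose": "reduce energy use in buildings",
                "mechanism": "adaptive control of heating and cooling systems"},
    "paper_b": {"purpose": "lower energy consumption in data centers",
                "mechanism": "reinforcement learning for cooling schedules"},
    "paper_c": {"purpose": "classify bird species from audio recordings",
                "mechanism": "convolutional networks on spectrograms"},
}

ids = list(papers)
purpose_sim = cosine_similarity(
    TfidfVectorizer().fit_transform([papers[i]["purpose"] for i in ids]))
mechanism_sim = cosine_similarity(
    TfidfVectorizer().fit_transform([papers[i]["mechanism"] for i in ids]))

# Analogy heuristic: similar purpose but a different mechanism.
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        score = purpose_sim[i, j] * (1 - mechanism_sim[i, j])
        print(f"{ids[i]} vs {ids[j]}: analogy score {score:.3f}")
```

The actual system builds its representations from annotated abstract spans rather than toy strings; the sketch is only meant to show the shape of the comparison step.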

I feel that the problem this system is trying to solve is real. While working on research papers, there is often a need to find analogies for inspiration and/or competitive analysis. I have also faced difficulty finding relevant research papers while working on my thesis. If scaled and deployed properly, SOLVENT would definitely be helpful to researchers and could potentially save a lot of time that would otherwise be spent on searching for related papers.

The paper claims that the system's quality is not critically dependent on annotator expertise and that the system can be scaled using crowd workers as annotators. However, the results showed that the annotations of Upwork workers matched expert annotations 78% of the time and those of MTurk workers only 59% of the time. Agreement also varied considerably across papers: a few papers had 96% agreement while a few had only 4%. I am a little skeptical regarding these numbers and I am not convinced that expert annotations are dispensable. I feel that using crowd workers might help the system scale, but it might have a negative impact on quality.

I found one possible future extension extremely interesting: the possibility of authors annotating their own work. If each author spends a little extra effort to annotate their own work, a large corpus with high-quality annotations could easily be created, and SOLVENT could produce great results using this corpus.

  • What are your thoughts about the system proposed? Would you want to use this system to aid your research work?
  • The study indicated that the system needs to be vetted with large datasets and the usefulness of the system is yet to be truly tested in real-world settings. Given these limitations, do you think the usage of this system is feasible? Why or why not?
  • One potential extension mentioned in the paper is to combine the content-based approach with graph-based approaches like citation graphs. What are other possible extensions that would enhance the current system?


04/22/2020 – Sushmethaa Muhundan – Opportunities for Automating Email Processing: A Need-Finding Study

This work aims to reduce the effort of email senders and receivers by designing a useful, general-purpose automation system. It is a need-finding study that explores the potential scope for automation along with the information and computation required to support that automation. The study also examines existing email automation systems to determine which needs have already been addressed. It employs open-ended surveys to gather needs and categorize them. A richer data model for rules, more ways to manage attention, leveraging internal and external email context, complex processing such as response aggregation, and affordances for senders emerged as common themes. The study also developed a platform, YouPS, that enables programmers to write automation scripts in Python while abstracting away the complexity of IMAP API integration. Participants were asked to use YouPS to write scripts automating tasks that would make email management easier. The results showed that the platform let participants solve problems that were not straightforward to solve in existing email clients. The study concludes by listing limitations and highlighting prospective future work.
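To give a sense of what "abstracting the complexity of IMAP API integration" buys the user, here is a rough sketch of the raw plumbing a script writer would otherwise deal with. It uses Python's standard imaplib rather than the YouPS API, and the server, credentials, sender address, and folder name are placeholders.

```python
# A filing rule written against raw IMAP (the kind of boilerplate YouPS hides).
# Host, credentials, sender, and folder below are placeholders.
import email
import imaplib

HOST, USER, PASSWORD = "imap.example.com", "me@example.com", "app-password"

with imaplib.IMAP4_SSL(HOST) as conn:
    conn.login(USER, PASSWORD)
    conn.select("INBOX")

    # Rule: file unread newsletters from a particular sender into a folder.
    _, data = conn.search(None, '(UNSEEN FROM "newsletter@example.com")')
    for num in data[0].split():
        _, msg_data = conn.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        print("Filing:", msg["Subject"])
        conn.copy(num, "Newsletters")           # target folder must already exist
        conn.store(num, "+FLAGS", "\\Deleted")  # flag the original for removal
    conn.expunge()
```

Every rule written this way repeats the connection, search, fetch, and parsing steps, which is exactly the kind of boilerplate a platform like YouPS can hide so that users only express the rule logic itself.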

I found it really interesting that this study provided the platform, YouPS, to understand what automation scripts would be developed if it were easy to integrate with the existing APIs. After scraping public GitHub repositories for potential automation solutions, the study found that only a limited number of generally accessible solutions existed. I feel that providing a GUI that enables programmers as well as non-programmers to define rules to structure their inbox, and to schedule outgoing emails using context, would definitely be useful. Such a GUI would be an extension to YouPS that abstracts the API integration layer away so that end users can focus on fulfilling their needs and enhancing their productivity.

While it is intuitive that receivers of emails would want automation to help them organize incoming mail, it was interesting that senders also wanted to leverage context and reduce the load on recipients by scheduling their emails to be sent when the receiver is not busy. The study mentions leveraging internal and external context to process emails, and I feel that this would definitely be helpful. Filtering emails based on past interactions and creating “modes” to handle incoming emails would be practical. Another need I related to was the aggregation example the study discusses: when an invite is sent to a group of people, an individual email for each response is often unnecessary. Aggregating the responses and presenting a single email with all the details would definitely be more convenient.

  • The study covered areas where automation would help in the email management space. Which need did you identify with the most? 
  • Apart from the needs identified in the study, what are some other scenarios that you would personally prefer to be automated?
  • The study indicated that participants preferred to develop scripts using YouPS to help organize their emails as opposed to using the rule-authoring interfaces in their mail clients. Would you agree? Why or why not?


04/22/2020 – Bipasha Banerjee – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Summary 

The paper by Chan et al. is an interesting read on finding analogies between research papers, with scientific papers as the main domain considered. The annotation scheme is divided into four categories, namely background, purpose, mechanism, and findings. The paper's goal is to make it easier for researchers to find related work in their field. The authors conducted three studies to test their approach and its feasibility. The first study consisted of domain experts annotating abstracts in their own research areas. The second study focused mainly on how the model could tackle the real-world problem where a researcher needs to find relevant papers in their area to act as inspiration, related work, or even baselines for their experiments. The third study was very different from the first two, in which an experienced researcher annotated the data or used the system to solve their research problem; instead, it used crowdworkers to annotate abstracts. The platforms utilized by the authors were Upwork and Amazon Mechanical Turk.

Reflection

The mixed-initiative model developed by the authors is an excellent step in the right direction for finding analogies in scientific papers. There are traditional approaches in natural language processing that help find similarities in textual data, and the website [1] gives an excellent insight into the steps involved in finding similarities between texts. However, when it comes to scientific data, these steps alone are not enough. Most of the models involved are trained on generic web and news data (like CNN or DailyMail), so much of the scientific jargon is “out of vocabulary” (OOV) for such models. I therefore appreciate that the authors combined human annotations with traditional information retrieval methods (like TF-IDF) to tackle the problem at hand.
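As a toy illustration of the OOV point (my own example, not from the paper or the linked article): a general-purpose embedding vocabulary covers everyday words but misses domain jargon, so a naive embedding-based representation of an abstract silently drops exactly the terms that carry the scientific meaning.

```python
# Hypothetical illustration of out-of-vocabulary (OOV) scientific jargon.
# The tiny "general_vocab" below stands in for a pretrained embedding vocabulary.
general_vocab = {"the", "model", "uses", "a", "network", "to", "predict", "results"}

abstract = "the seq2seq model uses a transformer encoder to predict downstream NER results"
tokens = abstract.lower().split()

oov = [tok for tok in tokens if tok not in general_vocab]
coverage = 1 - len(oov) / len(tokens)
print("OOV tokens:", oov)                 # the domain-specific terms
print(f"Vocabulary coverage: {coverage:.0%}")
```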

Additionally, for the similarity metric they took multiple categories into account, like Purpose+Mechanism, which is definitely useful when finding similarities in text data. I also liked that the studies considered ordinary crowdworkers in addition to people with domain knowledge. I was intrigued to find that 75% of the time, the annotations of crowdworkers matched those of the researchers; hence the conclusion that “crowd annotations still improve analogy-mining” is valuable. Moreover, recruiting researchers in large numbers just to annotate data is difficult; sometimes there are very few people in a given domain of research. Rather than having to find researchers who are available to annotate such data, it is good that we can rely on more readily available annotators.

Lastly, I liked that the paper identified its limitations very well and clearly laid out the scope for future work.

Questions

  1. Would you agree that the level of expertise of the human annotators would not affect the results for your course project? If so, could you please elaborate?

(For my class project, I think I would agree with the paper’s findings. I work on reference string parsing, and I don’t think we need experts just to label the tokens.)

  2. Could we have more complex categories or sub-categories rather than just the four identified?
  3. How would this extend to longer pieces of text, like chapters or book-length documents?

References 

  1. https://medium.com/@Intellica.AI/comparison-of-different-word-embeddings-on-text-similarity-a-use-case-in-nlp-e83e08469c1c 


04/22/20 – Lulwah AlKulaib – Accelerator

Summary

Most crowdsourcing work in the real world is submitted to platforms as one big task due to the difficulty of decomposing tasks into small, independent units. The authors argue that by decomposing and distributing tasks we could make better use of the resources crowdsourcing platforms provide, at a lower cost than traditional methods. They propose a computational system that frames small, interdependent tasks which together build one big-picture artifact. Because this is difficult, the authors investigate its viability by prototyping the system, testing how the distributed contributions combine once all tasks are done, and evaluating the output across multiple topics. The system compared well to existing top information sources on the web, approaching or exceeding the quality ratings of highly curated, reputable sources. The authors also suggest design patterns that should help other researchers and systems when thinking about breaking big-picture projects into smaller pieces.

Reflection

This was an interesting paper. I hadn't thought about breaking a project down into smaller pieces to save on costs, or that doing so could produce better quality results. I agree that some existing tasks are too big, complex, and time consuming, and maybe those need to be broken down into smaller tasks. I still can't imagine how breaking tasks down so small that none costs more than $1 generalizes well to all the existing projects we have on Amazon MTurk.

The authors mention that their system, even though it performs strongly, was generated by non-expert workers who did not see the big picture, and that it should not be thought of as a replacement for expert creation and curation of content. I agree with that. No matter how good the crowd is, if they are non-experts and don't have access to the full picture, some information will be missing, which could lead to mistakes and imperfections. That shouldn't be compared to a domain expert who would do a better job even if it costs more. Cost should not be the reason we favor the results of this system.

The suggested design patterns were a useful touch, and the way they were explained helped in understanding the proposed system as well. I think we should adopt some of these design patterns as best we can in our projects. Learning about this paper this late in our experiment design makes it hard to break our tasks down into simpler pieces and test that theory on different topics. I would have loved to see how each of us would report on this, since we have an array of different experiments and simplifying some tasks could be impossible.

Discussion

  • What are your thoughts on breaking down tasks to such a small size?
  • Do you think that this could be applicable to all fields and generate similarly good quality? If not, where do you think this might not perform well?
  • Do you think that this system could replace domain experts? Why? Why not?
  • What applications is this system best suited for?


04/22/20 – Lulwah AlKulaib – SOLVENT

Summary

The paper argues that scientific discoveries are often based on analogies from distant domains. Nowadays, it is difficult for researchers to keep up with finding analogies due to the rapidly growing number of papers in each discipline and the difficulty of spotting useful analogies in unfamiliar domains. The authors propose SOLVENT, a mixed-initiative system for finding analogies between research papers, to address this issue. They hire human annotators who structure academic paper abstracts into different aspects, and a model then constructs semantic representations from the provided annotations. The resulting representations are used to find analogies among research papers both within a domain and across different domains. In their studies, the authors show that the proposed system finds more analogies than existing baseline approaches from the information retrieval field. They outperform the state of the art, show that annotations can generalize beyond the domain, and find that the analogies the semantic model surfaces are judged useful by experts. Their system is a step in a new direction toward computationally augmented knowledge sharing between different fields.

Reflection

This was a very interesting paper to read. The authors' use of scientific ontologies and scholarly discourse, like those in the Core Information about Scientific Papers (CISP) ontology, makes me think about how relevant their work is, even though their goal differs from that of the corpora paper. I found the section where they explain adapting the annotation methods for research papers very useful for a different project.

One thing I had in mind while reading the paper was how scalable this is to larger datasets. As the studies show, the datasets are relatively small. The authors explain in the limitations that part of the bottleneck is having a good set of gold-standard matches to evaluate their approach against. I think that's a valid reason, but it still doesn't eliminate the questions of what scaling up would require and how well it would work.

When going over their results and seeing how they outperformed existing state-of-the-art models and approaches, I also thought about real-world applications and how useful this model is. I had never thought of using analogies to drive discovery across scientific domains; I always assumed it would be more reliable to have a co-author from that domain who could weigh in. Especially nowadays, with the vast communities of academics and researchers on social media, it's no longer that hard to find a collaborator in a domain that isn't yours. Also, looking at their results, their high precision appeared only when recommending the top k% of most similar analogy pairs. I wonder if automating that has a greater impact than using the knowledge of a domain expert.

Discussion

  • Would you use a similar model in your research?
  • What would be an application that you can think of where this would help you while working on a new domain?
  • Do you think that this system could be outperformed by using a domain expert instead of the automated process?


04/22/2020 – Bipasha Banerjee – Opportunities for Automating Email Processing: A Need-Finding Study

Summary

The primary goal of the paper is to provide automation support to users in terms of email handling. The authors first tried to determine what automated features users want in their email service and what informational and computational needs arise when implementing such a system. They conducted three experiments. The first was to gauge what kinds of features users wanted automated; in this study there was no boundary as to what could or couldn't be implemented, so it effectively surfaced the full range of tasks and features users would “wish” their email interface provided. The second experiment aimed to find the automated implementations currently available, which involved sifting through GitHub repositories for projects aiming to close the current gap in email-processing automation. Finally, the last experiment let users code their “ideal” features using the YouPS interface. This study consisted mainly of students from engineering backgrounds familiar with Python programming.

Reflection

The paper provided an interesting perspective on how users want their email clients to behave. For this, it was important to understand people's needs, and the authors do this by conducting the first experiment on finding the ideal features users look for. I liked the way the task of discovering user needs was approached. However, I want to point out that the median age range of participants was 20-29, and all had a university affiliation. It would be interesting to see what older people from both university and industry backgrounds want in such email clients. Getting the perspective of a senior researcher or a senior manager is important; I feel these are the people who receive far more emails and would want and need automated email processing.

I resonated with most of the needs users pointed out and recognized some of the existing features my current email client provides. For example, Gmail generally offers a “follow up” nudge if a sent email didn't get a response, and a “reply” nudge if a received email hasn't been replied to in n days. I am particularly interested in the different modes that could be set up. These would be useful when the user wants to focus on work and only periodically check a particular label like “to-do” or “important.” Additionally, being notified only about important emails is, in my opinion, a priceless feature.

Having loved the features proposed in this paper, I would also like to point out some flaws, in my opinion. First, some of the applied rules might cause disruptions in the case of important emails. One of the features mentioned was to automatically mark an email as “read” when consecutive emails come from the same sender. This would work for a “social” or “promotions” email; however, it might end up making the user do more work, i.e., digging through emails marked as read to find the ones they actually never read. Additionally, I was curious to know how security is handled here. Email is already not known to be a secure medium of communication, and using this tool on top of it might make it even less secure, especially when research-related topics discussed over email might be prone to a breach.

Questions

  1. What are the features you look for when it comes to email management? I would want to be only notified about emails that are important. 
  2. What other systems could benefit from such studies other than email processing? Could this be used to improve recommender systems, other file organizing software? 
  3. Would it be useful to take the input of senior researchers and managers? They are people who receive a lot of emails, and knowing their needs would be useful.
  4. How was the security handled in the YouPS system? 


04/22/2020 – Akshita Jha – Opportunities for Automating Email Processing: A Need-Finding Study

Summary:
“Opportunities for Automating Email Processing: A Need-Finding Study” by Park et al. is an interesting paper, as it talks about the need to manage emails. Managing email is a time-consuming task that takes significant effort from both the sender and the recipient. The authors find that some of this work can be automated. They performed a mixed-methods need-finding study to understand and answer two important questions: (i) What kind of automatic email handling do users want? (ii) What kind of information and computation is needed to support that automation? The authors conducted an investigation including a design workshop and a survey to identify categories of needs and understand them thoroughly. They also surveyed existing automated email classification systems to understand which needs have been addressed and where the gaps are in fulfilling these needs. The work highlights the need for: “(i) a richer data model for rules, (ii) more ways to manage attention, (iii) leveraging internal and external email context, (iv) complex processing such as response aggregation, and affordances for senders.” The authors also ran a small authorized script over a user's inbox, which demonstrated that the above needs cannot be fulfilled by existing email clients. This can serve as motivation for new design interventions in email clients.

Reflections:
This is an interesting work that has the potential to pave the way for new design interventions in email processing and email management. However, there are certain limitations. Of the three studies the authors conducted, two were explicitly focused on programmers, and the third focused on an engineer. This brings into question the generalizability of the experiments: the needs of diverse users may vary, and the results might not hold. Also, the questions the authors asked in the survey were influenced by the design workshop they conducted, which in turn influenced the analysis of the needs, so the results might not hold true for all kinds of participants. The authors also could not quantify the percentage of needs that are not being met. Moreover, recruiting programmers for the study was not ideal, as they have the skills to write their own code and fulfill their own needs; the GUI needed by non-programmers might differ from what programmers need. The authors could also seek insight from prior tools to build on and improve their system.

Questions:
1. What are your thoughts on the paper? How do you plan to use the concepts present in the paper in your project?
2. Would you want an email client that manages your attention? Do you think that would be helpful?
3. How difficult is it for machine learning models to understand the internal and external context of an email?
4. Which email client do you use? What are the limitations of that email client?
5. Do you think writing simple rules for email management is too simplistic?


04/22/2020 – Akshita Jha – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary:
“The Knowledge Accelerator: Big Picture Thinking in Small Pieces” by Hahn et al. talks about accomplishing big-picture work through small, interdependent pieces of crowd work. Most big-picture efforts rely on a small number of people to complete all the tasks required; for example, most of the work on Wikipedia is done by a small number of highly invested editors. The authors bring up the idea of a computational system in which each individual sees only a small part of the whole. This is a difficult task, as much real-world work cannot be broken down into small, independent units, and hence Amazon Mechanical Turk (AMT) cannot be used for it as efficiently as it is for prototyping. It is also challenging to maintain the coherence of the overall output while breaking a big task into smaller, mutually independent chunks for crowd workers to work on. Moreover, the quality of the work depends on dividing the tasks into coherent chunks. The authors present an approach that mitigates the need for a big-picture view held by a small number of workers, by stitching together small contributions from individuals who each see only a small chunk of the whole.

Reflections:
This is an interesting work as it talks about the advantages and limitations of breaking big tasks down into small pieces. The authors built a prototype system called “Knowledge Accelerator,” constrained such that no single task would cost more than $1. Although they used this constraint for task division, I'm not sure it is a good enough metric to judge the independence and quality of the small tasks. Also, the authors mention that the system should not be seen as a replacement for expert creation and curation of content. I disagree; I feel that with some modifications the system has the potential to completely replace expert creation and curation for this kind of task in the future. As is, the system has some gaping issues. The absence of a nuanced structure in the digests is problematic. It might also help to include iterations in the system after workers have completed part of their tasks and require more information. Finally, the authors would benefit from taking into account the cost of producing these answers at a large scale. They could use a computational model to dynamically decide how many workers and products to use at each stage such that the overall cost is minimized. They could also check whether some answers could be reused across questions and across users. Incorporating contextual information could also improve the system significantly.

Questions:
1. What are your thoughts on the paper?
2. How do you plan to use the concepts present in the paper in your project?
3. Are you dividing your tasks into small chunks such that crowd workers only see a part of the whole?


04/22/2020 – Subil Abraham – Chan et al., “SOLVENT”

Academic research strives for innovation in every field. But in order to innovate, researchers need to scour the length and breadth of their field as well as adjacent fields to get new ideas, understand current work, and make sure they are not repeating someone else's work. An automated system that scours the published research landscape for similar existing work would be a huge boost to productivity. SOLVENT aims to solve this problem by having humans annotate research abstracts to identify the background, purpose, mechanism, and findings, and by using that schema to index papers and surface similar papers (which have also been annotated and indexed) that match closely on one or a combination of those categories. The authors conduct three case studies to validate their method: finding analogies within a single domain, finding analogies across different domains, and checking whether annotation by crowd workers would be useful. They find that their method consistently outperforms other baselines in finding analogous papers.

I think something like this would be incredibly helpful for researchers and would significantly streamline the research process. I would gladly use it in my own research because it would save so much time. Of course, as the authors point out, the issue is scale: for the system to be useful, a very large chunk (or ideally all) of currently published research needs to be annotated according to the presented scheme. This could be integrated as a gamification mechanism in Google Scholar, where the user is occasionally asked to annotate an abstract; that way it could be done at scale. I also find it interesting that purpose+mechanism produced better results than background+purpose+mechanism. I would have figured that the extra context the background provides would help find better matches. But given the goal of finding analogies even in different fields, perhaps background+purpose+mechanism rightly does not give great results because it gets too specific by providing too much information.
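A tiny numerical illustration (hypothetical vectors, not SOLVENT's actual representations) of why appending an extra field can dilute similarity: two papers whose purpose+mechanism vectors are nearly identical score much lower once very different background vectors are concatenated on.

```python
# Hypothetical vectors showing how a dissimilar "background" field dilutes
# cosine similarity when concatenated onto similar purpose+mechanism vectors.
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

purpose_mech_a = np.array([0.9, 0.1, 0.8])
purpose_mech_b = np.array([0.8, 0.2, 0.9])
background_a = np.array([1.0, 0.0, 0.0])   # e.g. a biology background
background_b = np.array([0.0, 0.0, 1.0])   # e.g. a robotics background

print("purpose+mechanism only:",
      round(cos(purpose_mech_a, purpose_mech_b), 3))                   # ~0.99
print("with background concatenated:",
      round(cos(np.concatenate([purpose_mech_a, background_a]),
                np.concatenate([purpose_mech_b, background_b])), 3))   # ~0.59
```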

  1. Would you find use for this? Or would you prefer seeking out papers on your own?
  2. Do you think that the annotation categories are appropriate? Would other categories work? Maybe more or fewer categories?
  3. Is it feasible to expand on this work to cover annotating the content of whole papers? Would that be asking too much?


04/22/2020 – Subil Abraham – Park et al., “Opportunities for automating email processing”

Despite many different innovations from many directions trying to revolutionize text communication, the humble email has lived on, and the use of email and email clients has adapted to current demands. The authors of this paper investigate the current needs of email users through a workshop and a survey, and they analyze open source code repositories that operate on email, in order to identify what users currently need, what is not being solved by existing clients, and which tasks people are taking the initiative to solve programmatically because their email clients don't. They identify several high-level categories of needs: additional metadata on emails, the ability to leverage internal and external context, managing attention, not overburdening the receiver, and automated content processing and aggregation. They create a Python tool called YouPS that provides an API with which a user can write scripts to perform email automation tasks. They study the users of their tool for a week and note the kind of work they automate with it, finding that about 40% of the rules the users created with YouPS could not have been implemented in their ordinary email clients.

It's fascinating that there is so much efficiency to be gained by allowing people to manage their email programmatically. I feel like something like this should have been a solved problem, but apparently there is still room for innovation. It's also possible that what YouPS provides couldn't really be done in an existing client, either because existing clients try to be as user friendly as possible to the widest variety of people (how many people actually know what IMAP does?), or because email clients have accumulated so much cruft that adding a programmable layer would be incredibly hard. I get why their survey participants skew towards computer science students, and why their solution gravitates towards solving the email problem in a way that is better suited to people in computer science. But I also think that, in the long term, keeping YouPS the way it is is the right way to go. With every additional layer of abstraction you add, you lose flexibility. GUIs are not the way to go; rather, people will adapt to using the programming API as programming becomes more prevalent in daily life. I also find the idea of modes really interesting and useful; it is definitely something I would like to have in my email client.
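As a thought experiment on the modes idea, here is a hypothetical sketch of what a mode-based dispatcher could look like. It is not the actual YouPS API; the message fields, rule functions, and actions are made up for illustration.

```python
# Hypothetical mode-based email handling (not the YouPS API).
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    subject: str
    important: bool = False

def research_mode(msg: Message) -> str:
    # Only advisor/collaborator mail or flagged messages interrupt; the rest waits.
    if msg.sender.endswith("@vt.edu") or msg.important:
        return "notify"
    return "queue_for_later"

def weekend_mode(msg: Message) -> str:
    # Nothing interrupts on weekends except messages flagged important.
    return "notify" if msg.important else "silence"

MODES = {"research": research_mode, "weekend": weekend_mode}

def handle(msg: Message, current_mode: str) -> str:
    return MODES[current_mode](msg)

print(handle(Message("advisor@vt.edu", "Draft comments"), "research"))  # notify
print(handle(Message("deals@store.com", "50% off!"), "weekend"))        # silence
```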

  • What kind of modes would you set up with different sets of rules (e.g., a research mode, a weekend mode)?
  2. Do you think that changing YouPS to be more GUI based would be beneficial because it would reach a wider audience? Or should it keep its flexibility at the cost of maybe not having wide spread adoption?
  3. How would you think about training an ML model that can capture internal and external context in your email?
