04/22/2020 – Ziyao Wang – Opportunities for Automating Email Processing: A Need-Finding Study

The authors conducted a series of studies on email automation. First, they held a workshop with 13 computer science students who could program, asking them to write email rules in natural language or pseudocode in order to identify categories of needed email automation. They then analyzed the source code of email-processing scripts on GitHub to see what programmers need and have already built. Finally, they deployed YouPS, a programmable system that lets users write custom email automation rules, and surveyed participants after they had used the system for one week. They found that current email automation cannot meet users’ requirements: about 40% of the rules could not be deployed with existing systems. They also catalogued these unmet requirements to guide future development.
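To give a flavor of the kind of rule participants wrote, here is a hypothetical sketch using only Python’s standard library (this is not YouPS’s actual API): a rule that routes messages to folders based on sender and subject.

```python
from email.message import EmailMessage

# Hypothetical routing rule: file mailing-list and newsletter traffic
# away from the inbox. The folder names and keywords are illustrative
# assumptions, not taken from the paper.
def route_message(msg: EmailMessage) -> str:
    """Return the folder a message should be moved to."""
    sender = msg.get("From", "").lower()
    subject = msg.get("Subject", "").lower()
    if "noreply" in sender or "newsletter" in subject:
        return "Newsletters"
    if "[seminar]" in subject:
        return "Seminars"
    return "INBOX"

msg = EmailMessage()
msg["From"] = "noreply@example.com"
msg["Subject"] = "Weekly newsletter"
print(route_message(msg))  # Newsletters
```

Even a rule this small goes beyond what some stock email clients expose, which is part of the gap the paper measures.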

Reflections

The topic of this paper is really interesting. We use email every day and are sometimes annoyed by some of the emails we receive. Although email platforms already provide some automation and allow users to write custom scripts, annoying emails can still reach users’ inboxes while important emails are classified as spam. Personally, I used to adapt myself to the automation: checking my spam folder and deleting advertisements from my inbox every day. It would be great if the automation were more user-friendly and offered more labels and rules for users to customize. This paper focuses on this problem and conducts a thorough series of studies to understand users’ requirements. All the example scripts shown in the results seem useful to me, and I really hope the system can be deployed in practice.

We can also learn from the methods the authors used. First, they recruited computer science students to find general requirements; these students served as pilots, giving the researchers an overview of what users need. They then did background research guided by the pilot findings. Finally, they combined the findings from both the pilots and the background research to implement a system, and tested it with crowd workers, who can stand in for the general public. This sequence of studies is a good model for our own projects, and we may follow a similar workflow in future work.

From my point of view, a significant limitation of the paper is that the system was tested on only a small group of people. Neither computer science students nor programmers who upload their code to GitHub can represent the public, and even crowd workers cannot fully do so. Most of the public knows little about programming and does not complete HITs on MTurk, so their requirements are not considered. If conditions allow, the studies should be repeated with a broader population.

Questions:

What are your preferences for email automation? Do you have any needs that current automation does not provide for?

Can crowd workers represent the public?

What should we do if we want to test systems with the public?


04/22/2020 – Dylan Finch – Opportunities for Automating Email Processing: A Need-Finding Study

Word count: 586

Summary of the Reading

This paper investigates automation with regard to email. A large portion of many people’s days is devoted to sifting through the hundreds of emails they receive, and many of the tasks involved might be automatable. This paper not only looks at how different email-handling tasks can be automated, but also investigates the opportunities for automation in popular email clients.

The paper found that many people wanted to automate tasks that required more data from emails. Users wanted access to things like an email’s status (pending, done, etc.), its deadline, its topic, its priority, and many other data points. The paper also noted that people would like to aggregate responses to emails, to more easily see things like group responses to an event. Access to these features would let users better manage their inboxes. Some solutions to these issues already exist, but automation is held back by limitations in email clients.
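As a sketch of how one of those richer data points might be obtained (the pattern and date format here are my own assumptions for illustration, not the paper’s method), a deadline field could start from a simple regular-expression match on the body text:

```python
import re
from typing import Optional

# Illustrative sketch: pull a deadline out of an email body with a
# simple pattern. Real extraction would need far more robust parsing;
# the "Month day" date format is assumed purely for demonstration.
DEADLINE_RE = re.compile(r"\b(?:due|deadline)[:\s]+(\w+ \d{1,2})", re.IGNORECASE)

def extract_deadline(body: str) -> Optional[str]:
    match = DEADLINE_RE.search(body)
    return match.group(1) if match else None

print(extract_deadline("Reminder: the report is due April 30 at noon."))  # April 30
```

A client that surfaced even a crude field like this would already support the deadline-based filters users asked for.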

Reflections and Connections

I love the idea of this paper. I know that ever since I got my email account, I have loved playing around with the automation features. When I was a kid it was more because it was just fun to do, but now that I’m an adult and receive many more emails than back then (and many more than I would like), I need automation to be able to deal with all of the emails that I get on a daily basis. 

I use Gmail and I think that it offers many good features for automating my inbox. Most importantly, Gmail will automatically sort mail into a few major categories, like Primary, Social, and Promotions. This by itself is extremely helpful. Most of the important emails get sent to the Primary tab so I can see them and deal with them more easily. The Promotions tab is also great at aggregating a lot of the emails I get from companies about products or sales or whatever that I don’t care about most of the time. Gmail also allows users to make filters that will automatically do some action based on certain criteria about the email. I think both of these features are great. But, it could be so much more useful.

As the paper mentions, many people want to be able to see more data about emails. I agree. The filter feature in Gmail is great, but you can only filter based on very simple things like the subject of the email, the date it was sent, or the sender. You can’t create filters for more useful things like tasks that are listed in the email, whether or not the email is an update to a project that you got other emails about, or the due date of tasks in the email. Like the paper says, these would be useful features. I would love a system that allowed me to create filters based on deeper data about my emails. Hopefully Gmail can take some notes from this paper and implement new ways to filter emails.

Questions

  1. What piece of data would you like to be able to sort emails by?
  2. What is your biggest problem with your current email client? Does it lack automation features? 
  3. What parts of email management can we not automate? Why? Could we see automatic replies to long emails in the future?


04/22/2020 – Dylan Finch – SOLVENT: A Mixed Initiative System for Finding Analogies Between Research Papers

Word count: 566

Summary of the Reading

This paper describes a system called SOLVENT, which uses humans to annotate parts of academic papers like the high-level problems being addressed in the paper, the specific lower-level problems being addressed in the paper, how the paper achieved its goal, and what was learned/achieved in the paper. Machines are then used to help detect similarities between papers so that it is easier for future researchers to find articles related to their work.

The researchers conducted three studies where they showed that their system greatly improves results over similar systems. They found that the system was able to detect near analogies between papers and that it was able to detect analogies across domains. One interesting finding was that even crowd workers without extensive knowledge about the paper they are annotating can produce helpful annotations. They also found that annotations could be created relatively quickly.

Reflections and Connections

I think that this paper addresses a real and growing problem in the scientific community. With more people doing research than ever, it is increasingly hard to find papers that you are looking for. I know that when I was writing my thesis, it took me a long time to find other papers relevant to my work. I think this is mainly because we have poor ways of indexing papers as of now. Really the only current ways that we can index papers are by the title of the paper and by the keywords embedded in the paper, if they exist. These methods can help find results, but they are terrible when they are the only way to find relevant papers. A title may be about 20 words long, with keywords being equally short. 40 words does not allow us to store enough information to fully represent a paper. We lose even more space for information when half of the title is a clever pun or phrase. These primitive ways of indexing papers also lose much of the nuance of papers. It is hard to explain results or even the specific problem that a paper is addressing in 40 words. So, we lose that information and we cannot index on it. 

A system like the one described in this paper would be a great help to researchers because it would allow them to find similar papers much more easily. This doesn’t even mention the fact that it lets researchers find papers outside of their disciplines. That opens up a whole new world of potential collaboration. This might help to eliminate the duplication of research in separate domains. Right now, it is possible that mathematicians and computer scientists, for example, try to experiment on the same algorithm, not knowing about the team from the other discipline. This wastes time, because we have two groups researching the same thing. A system like this could help mitigate that.

Questions

  1. How would a system like this affect your life as a researcher?
  2. Do you currently have trouble trying to find papers or similar ideas from outside your domain of research?
  3. What are some limitations of this system? Is there any way that we could produce even better annotations of research papers?
  4. Is there some way we could get the authors of each paper to produce data like this by themselves?


04/22/2020 – Mohannad Al Ameedi – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary

In this paper, the authors try to give crowd workers a big picture of the system that their small assigned tasks serve, which can help them execute those tasks more efficiently and contribute more effectively to the overall goal. The work also aims to help companies remove the bottleneck caused by the small number of people who normally hold the big picture, whose departure can pose serious risks. The authors designed and developed a system called the Knowledge Accelerator, which crowd workers can use to answer a given question, drawing on relevant resources within a big-picture context and without the need for a moderator. The system first asks workers to choose web pages related to the topic, then to extract the relevant information and cluster it into categories. It then integrates the information by drafting an article, allows editing of the article, and finally adds supporting images or videos. In this way the system helps crowd workers see the big picture and complete their tasks in a way that advances the overall goal.

Reflection

I found the study described in the paper very interesting. I agree with the authors that many of the tasks done by crowd workers are simple, and that it is hard to divide up complex tasks that require knowledge of the big picture. Knowing the big picture is very important and is often held by very few people, normally those in technical leadership positions, so losing them can cause serious issues.

I like the way the system is designed to provide high-level information about the overall system while workers handle small tasks. The pipeline of multi-stage operations used to generate a cohesive article can help workers achieve the goal and also learn more about the topic.

This approach could also be used when building large-scale systems, where many components need to be built by developers who often don’t know what the system is trying to accomplish or solve. Normally a developer works on a specific task, such as adding an employee table or building a web service endpoint that receives a request and sends back a response, without knowing who will use the system or what impact their task will have on it overall. A similar system could help developers understand the big picture, which would let them solve problems in a way that makes a greater impact on the larger problem the system is trying to solve.

Questions

  • The system developed by the authors can help generate articles about a specific topic. Could we use the system to solve different problems?
  • Can we use this approach in software development to help developers understand the big picture of the system that they are trying to build especially when building large systems?
  • Can you think of a way to use a similar approach in your course project?


04/22/2020 – Mohannad Al Ameedi – Opportunities for Automating Email Processing: A Need-Finding Study

Summary

In this paper, the authors aim to study users’ needs for email automation and the resources required to achieve it; their goal is to inform the design of a good email automation system. They led a workshop to group requirements into categories, and also conducted a survey using human computation to help understand users’ needs. After collecting the requirements, the authors performed another study, reviewing open-source code available on GitHub to see which requirements had already been met. After building and running the code, they asked users to interact with the system to find out what works well and what does not. They found that there are limitations in current implementations, especially for complex requirements, and that many requirements are not met. The authors hope their findings will help future research focus on the needs that are not yet satisfied.

Reflection

I found the method used by the authors very interesting. Conducting a survey and leading a workshop to find users’ requirements, then cross-referencing them with what current implementations do and do not provide, is a nice way to discover what has not been implemented yet.

I also like the idea of performing code analysis on open-source projects and linking the analysis to user requirements. Software companies could use this approach to search GitHub for existing implementations of certain requirements, rather than just searching for code that uses a specific library or tool.

I like the idea of email automation, and I have used rules before to automatically move certain emails to special folders. Nowadays most systems send automatic notifications; these notifications are necessary, but they sometimes make it hard to distinguish emails that need an immediate response from emails that can be reviewed later. I also like that Gmail automatically moves advertising emails to a different folder or view, letting the user focus on the important emails.
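A move-to-folder rule of this kind can be sketched over IMAP with Python’s standard imaplib; the host, credentials, keywords, and the "Notifications" folder name below are placeholders of my own, not from the paper.

```python
import imaplib

# Heuristic used by the rule below to spot automatic notifications.
# The keywords are illustrative assumptions.
def is_notification(subject: str) -> bool:
    subject = subject.lower()
    return "notification" in subject or "do not reply" in subject

# Sketch of a folder-filing rule: copy matching messages to a
# "Notifications" folder and delete the originals from the inbox.
def file_notifications(host: str, user: str, password: str) -> None:
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX")
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(BODY[HEADER.FIELDS (SUBJECT)])")
            subject = parts[0][1].decode(errors="replace")
            if is_notification(subject):
                conn.copy(num, "Notifications")         # file a copy
                conn.store(num, "+FLAGS", "\\Deleted")  # remove original
        conn.expunge()
```

Because IMAP is a standard, a rule like this runs against Gmail, Outlook, or a university mail server alike, which is the portability the paper’s approach builds on.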

I agree with the authors that there is much room for improvement in current email automation, but it would be interesting to know what the results would be if email systems like Outlook, Gmail, and Yahoo were investigated in depth, to learn what they already implement that was missing from the system the authors studied.

Questions

  • The authors studied the current implementation using one system over one week. Do you think using more than one system, or studying user interactions over multiple weeks or months, might lead to different results?
  • Do you think email automation could be used to send critical business emails that might accidentally include information that shouldn’t be sent? How can such systems overcome this issue?
  • Have you used rules to automate email operations? Were they useful?


04/22/20 – Myles Frantz – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary

Maintaining a public, open-source website can be difficult, since the website is supported by individuals who are not paid. This team investigated using a crowdsourcing platform not only to support such a platform but to create its articles. Articles were broken down into micro-tasks that were manageable and scalable for crowd workers. These tasks were integrated across HITs, and workers were given extra incentives to relieve any reluctance to edit other crowd workers’ work.

Reflection

I appreciate the competitive nature of comparing a supervised learning (SL) bot and a reinforcement learning (RL) bot in the same game scenario of helping the human succeed by aiding them as best the bot can. However, regarding one of their contributions, I take issue with the relative comparison between the SL and RL bots. They explicitly say they find “no significant difference in performance” between the models. While they describe the two methods as performing approximately equally, their self-reported data shows one model is better on most measurements. In Table 1 (the comparison of humans working with each model), SL is reported as slightly better on both Mean Rank and Mean Reciprocal Rank (lower and higher are better, respectively). In Table 2 (the comparison of the various teams), there was only one scenario where the RL model performed better than the SL model. Even in participants’ self-reported perceptions, the SL model decreased performance in only 1 of 6 categories. Though each difference may be small, their diction downplays part of the argument they are making. I admit that the SL model having a better Mean Rank by 0.3 (from the Table 1 MR difference or the Table 2 Human row) doesn’t appear to be a big difference, but I believe part of their contribution statement, “This suggests that while self-talk and RL are interesting directions to pursue for building better visual conversational agents…”, is not an accurate description, since their own data empirically undercuts it.

Questions

  • I admit I focus on the representation of the data and the delivery of their contributions, while they focus on the human-in-the-loop aspect. Still, within the machine learning community, I imagine a decrease in accuracy of 0.3 (approximately 5%) would not be described as insignificant. Do you think their wording truly reflects the machine learning relevance?
  • Do you think more Turk workers (they used data from at least 56 workers) or adding age requirements would change their data?
  • Though evaluating the quality of collaboration between humans and AI is imperative to ensure AIs are built adequately, there seems to be a common disparity between evaluating that collaboration and comparing AI with AI. Given this disconnect, their statement on progress between the two collaboration studies seems like a fundamental idea. Do you think this work is more idealistic or more fundamental in its contributions?


04/22/20 – Myles Frantz – Opportunities for Automating Email Processing: A Need-Finding Study

Summary

Email is a formalized standard used throughout companies, colleges, and schools. It is also commonly used as documentation within companies, keeping track of requirements. Since email is used for ever more purposes, people find ever more uses for it. The team therefore studied various uses of email and a more integrated way to automate email rules. Based on a thorough survey, the team created a domain-specific language, and by integrating it with the Internet Message Access Protocol (IMAP), users are able to create more explicit and dynamic rules.

Reflection

Working within a company, I can greatly appreciate the granularity of the provided framework. Within companies, email is used as a kind of “rolling documentation”. This rolling documentation is in line with Agile, as it captures new requirements added later in a story. Creating very specific rules for certain scrum masters may be necessary to manage reminders for the rest of the team. Extending the automation into tooling could also lead to a more streamlined deployment pipeline, for example letting an email from the release manager signal a release. Despite the wide acceptance of email, more direct tool integrations such as Mattermost are available, largely because of the open application programming interface that Mattermost provides. Despite the tools Google and Microsoft offer around email, the open-source community provides a faster platform for sharing this kind of information.

In addition to the rules provided through the interfaces, I believe the Python email interface is a great extension for automating email. The labeling systems provided by many email clients are limited to rudimentary rules. Richer rules could create better reminders in a school setting or in an advisor–advisee relationship; for example, a reminder rule could help issue reminders about grants or ETD deadlines. Since these rules are written in Python, they can be shared among lab groups to ensure that required emails are automatically managed. Instead of being limited to a single markdown-based rule language, users can write in Python, the most popular language according to the IEEE top programming languages survey.
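As a small illustration of such a shareable rule (the keywords and the three-day window are my own assumptions, purely for demonstration), a reminder check could be written as a plain Python function that labmates reuse:

```python
from datetime import datetime, timedelta

# Hypothetical shareable reminder rule: flag messages mentioning grant
# or ETD deadlines that have sat unanswered for more than three days.
REMINDER_KEYWORDS = ("grant", "etd")
REMINDER_AFTER = timedelta(days=3)

def needs_reminder(subject: str, received: datetime, replied: bool,
                   now: datetime) -> bool:
    subject = subject.lower()
    relevant = any(word in subject for word in REMINDER_KEYWORDS)
    overdue = now - received > REMINDER_AFTER
    return relevant and overdue and not replied

now = datetime(2020, 4, 22)
print(needs_reminder("ETD submission deadline", datetime(2020, 4, 15),
                     replied=False, now=now))  # True
```

Because the rule is just a function, a lab group can version it, tweak the keywords, and drop it into each member’s mail-processing script.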

Questions

  • Utilizing a common standard ensures there is a good interface for people to learn and get used to across different technologies and companies. Do you think Python scripting is a more approachable interface than other markdown-style rule languages for non-computer-science users?
  • The Python language can be used on various platforms thanks to its libraries, and many Python programs are extensible to other platforms through an application programming interface. Given this potential for integration, what other systems do you think this email system could be integrated with?
  • This system was created by adapting current technology: it builds on the common Internet Message Access Protocol, the fundamental mail-retrieval protocol, which is adaptable to current usage on various servers. What kinds of rules would you integrate with your university email?
