04/22/2020 – Sukrit Venkatagiri – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Paper: Joel Chan, Joseph Chee Chang, Tom Hope, Dafna Shahaf, and Aniket Kittur. 2018. SOLVENT: A Mixed Initiative System for Finding Analogies Between Research Papers. Proc. ACM Hum.-Comput. Interact. 2, CSCW: 31:1–31:21

Summary: In this paper, the authors aim to help researchers by mining analogies from other domains to support interdisciplinary research. The paper proposes an annotation scheme that extends prior work by Hope et al. and has four key facets: background, purpose, mechanism, and findings. The paper also presents three studies: first, collecting annotations from domain experts in research fields; second, using the SOLVENT system to generate analogies with real-world usefulness; and third, scaling up SOLVENT through crowdsourcing workflows. In each of the three studies, the annotations were turned into semantic vector representations. The first study used a dataset of papers from CSCW annotated by members of the research team, while the second study involved working with an interdisciplinary team in bioengineering and mechanical engineering to determine whether SOLVENT could help identify analogies not easily found with citation tree search. Finally, in the third study, the authors recruited crowd workers from Upwork and Amazon Mechanical Turk to generate annotations, and found that workers had difficulty with the purpose and mechanism annotations. On the whole, the SOLVENT system was found to help researchers find analogies effectively.
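
To make the facet idea concrete, here is a minimal sketch (not the authors' actual pipeline) of how facet-annotated abstracts could be matched facet by facet; the example texts and field names are made up, and TF-IDF cosine similarity stands in for the paper's semantic vector representations.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy annotated abstracts: each paper's abstract has been split into
# SOLVENT-style facets (only purpose and mechanism shown here).
papers = {
    "paper_a": {"purpose": "help researchers find analogous ideas in distant fields",
                "mechanism": "crowdsourced facet annotation plus vector similarity"},
    "paper_b": {"purpose": "support discovery of related ideas across disciplines",
                "mechanism": "traversal of citation networks"},
}

def facet_similarity(papers, facet):
    """Pairwise cosine similarity between papers on a single facet."""
    ids = list(papers)
    texts = [papers[p][facet] for p in ids]
    vectors = TfidfVectorizer().fit_transform(texts)
    return ids, cosine_similarity(vectors)

# A purpose-only match surfaces papers chasing a similar goal through
# possibly very different mechanisms -- the analogy case SOLVENT targets.
ids, sims = facet_similarity(papers, "purpose")
print(ids)
print(sims)
```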

Reflection: Overall, I think this paper is well-motivated, and the three studies that form the basis of the results are impressive. It was also interesting that there was substantial agreement between crowd workers' and researchers' annotations. This points to a broader finding: novices may be able to contribute to science not necessarily by doing science (especially as science becomes harder for “normal” people to do and is done in larger and larger teams), but by finding analogies between different disciplines’ literatures.

For their second study, the authors trained a word2vec model on a curated dataset of over 3,000 papers from three domains. This was good because they did not limit their work to a single domain and strove to generalize their findings. However, the three are still largely engineering disciplines, although CSCW has a social science component to it. I wonder how the approach would work in other disciplines, such as between the pure sciences? That might be an interesting follow-up study.
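
For readers unfamiliar with that modeling step, below is a rough sketch of training a domain-specific word2vec model with gensim; the toy corpus, preprocessing, and hyperparameters are placeholders rather than the authors' actual setup.

```python
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

# Placeholder corpus: in the paper this would be ~3,000 abstracts from
# the three domains under study, not two toy sentences.
abstracts = [
    "We present a crowdsourcing workflow for annotating research paper abstracts.",
    "A microfluidic device mixes reagents using passive channel geometry.",
]
sentences = [simple_preprocess(text) for text in abstracts]

# gensim 4.x API; with a real corpus, min_count would normally be higher.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)
print(model.wv.most_similar("crowdsourcing", topn=3))
```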

I wonder how such a system might be deployed more broadly, compared to the limited way it was deployed in this paper. I also wonder how long, in total, it would have taken crowd workers to work through the tasks and generate the findings.

Questions:

  1. What other domains do you think Solvent would be useful in? Would it easily generalize?
  2. Is majority vote an appropriate mechanism? What else could be used?
  3. What are the challenges to creating analogies?


04/22/2020 – Sukrit Venkatagiri – Opportunities for Automating Email Processing: A Need-Finding Study

Paper: Soya Park, Amy X. Zhang, Luke S. Murray, and David R. Karger. 2019. Opportunities for Automating Email Processing: A Need-Finding Study. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19), 1–12

Summary: This paper presents a need-finding study exploring the opportunities and challenges of automating email processing. The authors conducted a mixed-methods study to pinpoint users’ expectations and needs around automated email handling, as well as the informational and computational support required for it. The study was divided into three main parts: what types of email automation users want, what types of information and computation are needed to support it, and a field deployment of a simple inbox scripting tool. The first part was done in two steps: a formative design workshop in which 13 computer science students created email processing rules, and a survey in which 77 people, as well as 35 people without a technical background, answered questions to better understand categories of email automation and users' needs. The results show a need for richer data models for email, better attention-management features, use of internal and external context, and affordances for senders. Finally, the paper describes YouPS, a platform for writing small scripts over users’ inboxes. The authors enlisted 12 email users and found that users wanted more automation in their email management, especially richer data models and automatic processing of message content.
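
The paper's YouPS API isn't reproduced here, so as a rough stand-in the sketch below uses only Python's standard library to express one rule in the same spirit: moving newsletter-style mail out of the inbox. The host, account, and folder names are hypothetical.

```python
import imaplib
import email

# Hypothetical server and credentials; YouPS exposes a higher-level mail
# data model, so this imaplib version is only a rough analogue of a rule.
IMAP_HOST, USER, PASSWORD = "imap.example.com", "me@example.com", "app-password"

def file_newsletters(mailbox="INBOX", target="Newsletters"):
    """Move messages that carry a List-Unsubscribe header out of the inbox."""
    conn = imaplib.IMAP4_SSL(IMAP_HOST)
    conn.login(USER, PASSWORD)
    conn.select(mailbox)
    _, data = conn.search(None, "ALL")
    for num in data[0].split():
        _, msg_data = conn.fetch(num, "(RFC822.HEADER)")
        msg = email.message_from_bytes(msg_data[0][1])
        if msg.get("List-Unsubscribe"):
            conn.copy(num, target)                  # copy into the target folder
            conn.store(num, "+FLAGS", "\\Deleted")  # then mark the original deleted
    conn.expunge()
    conn.logout()

if __name__ == "__main__":
    file_newsletters()
```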

Reflection: I agree with the premise of the paper: we can and should help people better manage their email inboxes to reduce the energy they spend making sense of them. I wonder why email itself has become so overwhelming in the first place, and how it has affected workplace productivity.

I especially like the multi-pronged approach that the authors took in this paper, with a formative study, a survey, and a system-building effort. I believe this multi-stage approach is valuable and can provide multiple insights as well as opportunities for triangulating data.

With respect to their findings, I think the need for richer data models and rules, as well as ways to leverage internal and external email context, is very important. If we are able to understand, for example, the sender’s urgency level and the receiver’s commitments to that sender, then we could draft a rule prioritizing or deprioritizing those emails. I also think email templates and autofill options are useful; Google already does something similar, albeit in a more intelligent way, with Gmail’s autofill feature.

However, I wonder how many users actually make use of intelligent filters, and/or would make use of any new tools introduced in the future. It may be the case that only knowledge workers are bombarded with emails that require responses, while most other users simply receive spam (which, I think, accounts for about 90-95% of all email sent worldwide). It would also be interesting to see how this differs between people’s work and home emails. I myself maintain a separate email address for communications that I know will be spammy, such as insurance applications.

Questions:

  1. How do you manage your email? Do you use filters?
  2. Do you manage your different inboxes differently? How?
  3. What do you think of YouPS? Would you use it? Why or why not?


04/22/2020 – Yuhang Liu – Opportunities for Automating Email Processing: A Need-Finding Study

Summary:

This article discusses email management. Email occupies a very important position in our lives, so the authors conducted a need-finding study to understand how email handling could be automated and thereby make email services better. Their study asks: (i) what kinds of email automation do users want, and (ii) what kinds of information and computation are needed to support that automation. The authors used three probes: a design workshop to identify categories of needs, a survey to better understand these categories, and a classification of existing email automation software to determine which needs have already been addressed. The results highlight the need to strengthen the following aspects in order to better automate email management: richer data models, attention management, use of internal and external email context, complex processing (such as response summarization), and affordances for senders. Finally, the authors developed a platform for writing small scripts over users’ inboxes. Their research found that most popular mail services do not adequately support automated management, which motivates the development of new mail services.

Reflection:

First of all, from my personal experience, I agree that we need a system that can help people manage email and reduce the energy they spend on it. During my studies or work, if a new email arrives I usually stop to deal with it first, which seriously affects my efficiency, so a system that helps manage mail is very necessary. I also agree with the authors’ use of three probes to study the needs around email management. Among the findings, I think several requirements are especially urgent, such as a richer data model for rules and the leveraging of internal and external email context. A richer data model helps the system make sense of the content and format of an email. For example, in another article we read this week, having people mark up the structure of a paper helps the machine understand it; similarly, richer email formats and templates can help the machine understand mail. Studying internal and external context can likewise lead to a better understanding of an email’s content and of the relationship between sender and recipient. These aspects could be greatly improved.

However, I have doubts about the system described in the article and its future direction. Analyzing the content of emails and their users in such depth could intrude on users’ privacy, and at the same time the accuracy of automated management cannot be guaranteed, so wider deployment may bring more problems. As an example, I often find that mail I consider important is classified as spam and lands in the spam folder, causing me to miss many things. I do think the authors’ proposal to let people write their own customizations is a good solution; however, for those who are not familiar with computers, whether such customization is really beneficial to them is a question worth thinking about.

Question:

  1. What aspects of the current popular mail service systems need to be improved?
  2. If we want to build an automated email processing system, do you think we need a brand-new framework, or should we evolve gradually from the existing service systems?
  3. Will automated processing of email cause other problems?


04/22/2020 – Yuhang Liu – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Summary:

This paper presents a new paper-search system, though I think it can be thought of as an idea-sparking system rather than just a paper-search system. The article first notes that scientific discovery is often driven by finding analogies in distant fields, but as more and more papers are published it becomes difficult to find papers with relevant ideas within a field, let alone across fields. To address this, the authors introduce a mixed-initiative system. In this system, crowd workers are responsible for reading and understanding an article; they analyze it along four aspects: Background, Purpose, Mechanism, and Findings. The computer then analyzes the article based on these semantic annotations, for example through TF-IDF or combinations of different representations, and finds analogous research papers. Through evaluation, the authors show that these annotations are effective and can help experts.

Reflection:

First of all, I agree with the goal of the paper: helping more researchers obtain new ideas by analogy from outside their field, and then using these ideas to drive the development of science and technology. I also think analogy genuinely helps technology advance; the article cites quite a few examples of the effectiveness of analogy and of research breakthroughs that followed from it. In my reading over the past few weeks, I have often noticed articles using analogies, and since then I have been thinking about how to help people more effectively draw analogies and learn from other subject areas. Secondly, I think introducing people to assist in the task is a very effective method; this has also been the biggest lesson of this course for me: when a problem is hard, consider using human effort to help solve it. Letting crowd workers annotate an article along four facets greatly decomposes it, so that the computer can better understand it. It is still difficult for a computer to directly understand an article on its own, but given these annotations, finding connections between articles is a comparatively simple task.

At the same time, I have some doubts about the effectiveness of the proposed system; the article itself spends a considerable amount of space describing its limitations. My doubts mainly concern the third point, usefulness in the real world. Many factors will affect this practicality. For example, as the data grows there will be more similar analogies, and their quality will be hard to control; as we know, not all lessons are useful, and some ideas may bring other problems, such as reduced efficiency or wasted resources. Finally, I think it takes a lot to arrive at a good idea. We also need to control the quality of the workers’ output and ask whether the algorithm can really be that inspiring. In my opinion, a good innovation usually arrives in a flash of inspiration; although analogy may make this comparatively easier to achieve, it still requires good collaboration between human and machine.

Question:

  1. Do you think finding analogies by analyzing papers according to this framework is a useful approach?
  2. Are there other factors that might influence this system, such as an increase in similar articles or differing understandings between workers?
  3. Besides the methods mentioned in the article, what can we, as computer scientists, do to inspire people?


04/22/20 – Fanglan Chen – SOLVENT: A Mixed Initiative System for Finding Analogies Between Research Papers

Summary

Chan et al.’s paper “SOLVENT: A Mixed Initiative System for Finding Analogies Between Research Papers” explores the feasibility of leveraging a mixed-initiative system in which a collaborative human-AI team categorizes research papers into relational schemas, which can then be used to identify analogical research papers and potentially lead to innovative knowledge discoveries. The researchers are motivated by the boom in research papers over recent decades, which makes searching for relevant papers within one domain or across domains more and more difficult. To facilitate paper retrieval and the search for interdisciplinary analogies, the researchers develop a mixed-initiative system called SOLVENT in which humans mark the key aspects of research papers (their background, purpose, mechanism, and findings) and a computational model extracts semantic representations from these key aspects, which facilitates identifying analogies across different research domains.

Reflection

I think this paper presents an innovative study of how the proposed system can actually support knowledge sharing and discovery within one domain and across different research communities. In this era of research explosion, researchers would greatly benefit from using such a system for their own work and for exploring more interdisciplinary possibilities. That makes me think about why the system achieves good performance by annotating the content of abstracts in the domains where the experiments were conducted. As we know, abstracts usually summarize the most important points of a research paper at a high level, so it is intuitive and wise to use that part for annotation and downstream tasks. The researchers adopt pre-trained word embedding models to generate semantic vector representations for each component, which perform quite well in the tasks presented in the paper. I would imagine the framework would work especially well for experimentation-driven domains such as computer science, civil engineering, and biology, where research papers follow a specific writing structure. Can the proposed framework scale to less structured text, such as essays or novels, by extending it to full text instead of focusing on abstracts? I think that would be an interesting future direction to explore.
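
As a small illustration of that semantic-vector step (not the authors' exact embeddings or preprocessing), one facet's text can be reduced to a single vector by averaging pre-trained word embeddings and then compared with cosine similarity:

```python
import numpy as np
import gensim.downloader as api

# Pre-trained GloVe vectors as a stand-in for whatever embeddings the
# authors used (this downloads ~130 MB on first run).
wv = api.load("glove-wiki-gigaword-100")

def facet_vector(text):
    """Average the embeddings of in-vocabulary words in one facet's text."""
    words = [w for w in text.lower().split() if w in wv]
    return np.mean([wv[w] for w in words], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

purpose_a = facet_vector("find analogies between research papers")
purpose_b = facet_vector("discover related ideas across scientific fields")
print(cosine(purpose_a, purpose_b))
```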

In addition, one potential future work discussed in the paper is to extend the content-based approach with graph-based approaches like citation networks. I feel this is a novel idea and there is a lot of potential in this direction. Since the proposed system has the ability to find analogies across various research areas, I would be curious to see if it is possible to generate a knowledge graph based on the analogy pairs that can create something similar to a research road map, which indicates how the ideas from different papers in various research areas relate in a larger scope. I would imagine researchers would benefit from a systematized collection of research ideas. 

Discussion

I think the following questions are worthy of further discussion.

  • Would you use this system to support your own research? Why or why not?
  • Do you think that the annotation categories capture the majority of the research papers? Can you think about other categories the paper did not mention? 
  • What do you think of the researchers’ approach to annotating the abstracts? Would it be helpful to expand on this work to annotate the full content of the papers?
  • Do you think the domains involved in cross-domain research share the same purpose and mechanism? Can you think about some possible examples?


04/22/2020 – Palakh Mignonne Jude – Opportunities for Automating Email Processing: A Need-Finding Study

SUMMARY

In this paper, the authors conduct a mixed-methods investigation to identify users’ expectations around automated email handling as well as the information and computation required to support it. They divided their study into three probes – ‘Wishful Thinking’, ‘Existing Automation Software’, and ‘Field Deployment of Simple Inbox Scripting’. The first probe was conducted in two stages. The first stage was a formative design workshop in which the researchers enlisted 13 computer science students, all well-versed in programming, to create rules. The second stage was a survey of 77 participants from a private university, 48% of them without technical backgrounds. The authors identified a need for automated systems to have richer data models, use internal/external context, manage attention, and alter the presentation of the inbox. In the second probe, the authors mined GitHub repositories to identify needs that programmers had already implemented. Additional needs they identified included processing, organizing, and archiving content; altering the default presentation of email clients; and email analytics and productivity tools. As part of the third probe, the authors deployed their ‘YouPS’ system, which lets users write email-processing rules in Python, and enlisted 12 email users (all of whom could code in Python). Common themes across the rules generated include the creation of email modes, leveraging interaction history, and non-use of existing email client features. The authors found that users did indeed desire more automation in their email management, especially in terms of richer data models, internal and time-varying external context, and automated content processing.
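
Since the exact YouPS data model isn't shown here, the snippet below is a purely hypothetical, self-contained Python illustration of the "email modes" theme: the same rule treats messages differently depending on a user-set mode. All names and messages are made up.

```python
from datetime import datetime

# Purely illustrative message records; YouPS would supply real mail objects.
messages = [
    {"sender": "advisor@university.edu", "subject": "draft feedback"},
    {"sender": "deals@store.example", "subject": "50% off everything"},
]

MODE = "focus"  # e.g. "focus" during work hours, "normal" otherwise

def handle(msg, mode):
    """Toy rule: in focus mode, only advisor mail stays in the inbox."""
    if mode == "focus" and not msg["sender"].endswith("@university.edu"):
        return "defer-until-evening"
    return "inbox"

for msg in messages:
    print(datetime.now().isoformat(), msg["subject"], "->", handle(msg, MODE))
```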

REFLECTION

I liked the overall motivation of the study and especially resonated with the need for automated content processing, as I would definitely benefit from having mail attachments downloaded and stored appropriately. The participants who mentioned a reaction to signal that a message was viewed reminded me of Slack’s interface, which allows you to ‘Add reaction’. I also believe that a tagging feature would be good for ensuring that key respondents are alerted to the tasks they must perform (especially in the case of longer emails).

I liked the setup of Probe 3 and found it to be an interesting study. However, I wonder about the adoptability of such a system, and, as the authors mention in their future work, I would be very interested to know how non-programmers would make use of these rules through a drag-and-drop GUI.

The authors found that the subjects (10 out of 12) preferred to write rules in Python rather than use the mail client’s interface. This reminded me of prior discussions in class for the paper ‘Agency plus automation: Designing artificial intelligence into interactive systems’ wherein we discussed how humans prefer to be in control of the system and the level of automation that users desire (in a broader context).

QUESTIONS

  1. The studies conducted include participants whose average age was under 30 and most of whom were affiliated with a university. Would the needs of business professionals vary in any way compared to the ones identified in this study?
  2. Would business organizations be welcoming of a platform such as the YouPS system? Would this raise any security concerns, considering that the system is able to access the data stored in the emails?
  3. How would you rate the design of the YouPS interface? Do you see yourself using such a system to develop rules for your email?
  4. Are there any needs, in addition to the ones mentioned in this paper, that you feel should be added?
  5. The authors state that even though two of the three studies focused on programmers, the needs identified were similar between programmers and non-programmers. Do you agree with this justification? Was there any bias that could have crept in as part of this experimental setup?


04/22/2020 – Palakh Mignonne Jude – SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

SUMMARY

The authors attempt to help researchers find analogies in other domains in order to aid interdisciplinary research. They propose a modified annotation scheme that extends the work described by Hope et al. [1] and contains four elements – Background, Purpose, Mechanism, and Findings. The authors conduct three studies – the first involving the sourcing of annotations from domain-expert researchers, the second using SOLVENT to find analogies with real-world value, and the third scaling up SOLVENT through crowdsourcing. In each study, semantic vector representations were created from the annotations. In the first study, the dataset focused on papers from the CSCW conference and was annotated by members of the research team. In the second study, the researchers worked with an interdisciplinary team spanning bioengineering and mechanical engineering to determine whether SOLVENT could help identify analogies not easily found through keyword or citation tree searches. In the third study, the authors used crowd workers from Upwork and AMT to perform the annotations. The authors found that these crowd annotations had substantial agreement with researcher annotations, but the workers struggled with the purpose and mechanism annotations. Overall, the authors found that SOLVENT helped researchers find analogies more effectively.

REFLECTION

I liked the motivation for this paper – especially Study 3, which used crowdworkers for the annotations – and was glad to learn that the authors found substantial agreement between crowdworker and researcher annotations. This was an especially welcome finding, as the corpus I deal with also contains scientific work, and scaling its annotation has been a concern in the past.

As part of the second study, the authors mention that they trained a word2vec model on the 3,000 papers in the dataset curated from the three domains under consideration. This made me wonder about the generalizability of their approach. Would it be possible to generate more general scientific word vectors that span multiple domains? I think it would be interesting to see how the performance of such a system would compare against the existing one. In addition, word2vec is known to have issues with out-of-vocabulary words, which made me wonder whether the authors made any provisions to deal with them.
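
The paper does not say how out-of-vocabulary terms were handled; one common workaround (not necessarily what the authors did) is a subword model such as fastText, which composes vectors from character n-grams, sketched here with gensim:

```python
from gensim.models import FastText
from gensim.utils import simple_preprocess

# Toy corpus standing in for the curated set of abstracts.
abstracts = [
    "Crowdsourced annotation of research abstracts enables analogy mining.",
    "Soft robotic actuators are driven by pneumatic mechanisms.",
]
sentences = [simple_preprocess(text) for text in abstracts]

model = FastText(sentences, vector_size=100, window=5, min_count=1)

# Because fastText builds vectors from character n-grams, even a word the
# model never saw during training still gets an embedding.
print(model.wv["bioengineering"][:5])
```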

QUESTIONS

  1. In addition to the domains mentioned by the authors in the discussion section, what other domains can SOLVENT be applied to and how useful do you think it would be in those domains?
  2. The authors used majority vote as the quality control mechanism for Study 3. What more sophisticated measures could be used instead of majority vote? Would any of the methods proposed in the paper ‘CrowdScape: Interactively Visualizing User Behavior and Output’ be applicable in this setting?
  3. How well would SOLVENT extend to the abstracts of Electronic Theses and Dissertations, which contain a mix of STEM as well as non-STEM research? Would any modifications be required to the annotation scheme presented in this paper?

REFERENCES

  1. Tom Hope, Joel Chan, Aniket Kittur, and Dafna Shahaf. 2017. Accelerating Innovation Through Analogy Mining. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 235–243.


04/22/2020 – Bipasha Banerjee – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary 

The paper is about breaking larger tasks into smaller sub-tasks and evaluating the performance of such systems. The authors’ approach divides a large piece of work, mainly online work, into smaller chunks that crowdworkers then complete. The authors created a prototype system called “Knowledge Accelerator” whose main goal is to use crowdworkers to find answers to open-ended, complex questions. However, each worker sees only part of the entire problem and works on a small portion of the task. The paper mentions that the maximum payment for any one task was $1, which gives an idea of how granular and simple the authors wanted the crowdworkers’ tasks to be. The experiment was divided into two phases. In the first phase, the workers labeled categories that were later used in the classification task. The second phase required the workers to clean the output the classifier produced; this task involved looking at the existing clusters and tagging new clips into an existing or a new cluster.

Reflection

I liked the way the authors approach the problem by dividing a huge task into smaller, manageable parts, which in turn become easy for workers to annotate. For our course project, we initially wanted workers to read an entire chapter from an electronic thesis or dissertation and then label the department to which they think the document belongs. We were not considering the fact that such a task is huge and would take a person around 15-30 minutes to complete. Dr. Luther pointed us in the right direction and asked us to break the chapter into parts and then present those to the workers. The paper also mentions that too much context can be confusing for workers. We can use this to better decide how to divide the chapters so that we provide just the right amount of context.
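
As a concrete (and entirely illustrative) version of this kind of decomposition, the sketch below splits a long chapter into short, overlapping word windows so that each crowd task carries only a small amount of text plus a little surrounding context; the window sizes are arbitrary choices, not values from the paper.

```python
def chunk_text(text, words_per_chunk=150, overlap=25):
    """Split text into overlapping word windows suitable for short HITs."""
    words = text.split()
    step = words_per_chunk - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + words_per_chunk]
        if window:
            chunks.append(" ".join(window))
        if start + words_per_chunk >= len(words):
            break
    return chunks

chapter = "..."  # the full chapter text would go here
for i, chunk in enumerate(chunk_text(chapter)):
    print(f"Task {i}: {len(chunk.split())} words")
```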

I liked how the paper described the authors’ methods for finding sources and their filtering and clustering techniques. It was interesting to see the challenges they encountered while designing the task. This portion helps future researchers in the field understand the pitfalls and the decisions the authors made. I would view this paper as a guideline on how best to break a task into pieces that are simple yet detailed enough for Amazon Mechanical Turk workers.

Finally, I would like to point out that the paper mentions that only workers from the US were considered. The reason, given in a footnote, is that because of currency conversion the value of a dollar is relative. I thought this was a very thoughtful point to bring to light, and it helps maintain the quality of the work involved. However, I think a live currency-conversion API could have been incorporated to compensate workers accordingly. Since the paper deals with searching for relevant answers to complex questions, involving workers from other countries might help improve the final answer.

Questions

  1. How are you breaking a task into sub-tasks for the course project? (We had to modify our task design for our course project and divide a larger piece of text into smaller chunks)
  2. Do you think that including workers from other countries would help improve the answers? (After considering the currency difference factor and compensating the same based on the current exchange rate.)
  3. How can we improve the travel-related questions? Would utilizing workers who are “travel-enthusiasts or bloggers” improve the situation?

Note: This is an extra submission for this week’s reading.


4/22/20 – Lee Lisle – Opportunities for Automating Email Processing: A Need-Finding Study

Summary

Park et al.’s paper covers the thankless task of email management. They discuss how people spend too much time reading and responding to emails, and how it might be nice to get some sort of automation going for dealing with the deluge of electronic ASCII flooding our days. They first interviewed 13 people in a design-workshop setting, where participants came up with 42 different rules for dealing with email; from these, the authors identified five overarching categories. Using this data, the authors then sent out a survey and received 77 responses on how respondents would use a “smart robot” to handle their email, from which they identified 6 categories of possible automation. The authors then took to GitHub to find existing automation that coders had built for email, searching for codebases that worked with the IMAP standard; this yielded 8 further categories. Finally, they took all of the data gathered so far, created an email automation tool they called YouPS (cute), and identified how today’s email clients would need to change to fully support the desired automation.

Personal Reflection

I have to admit, when I first saw that they specified they gathered 13 “email users,” I laughed. Isn’t that just “people”? Furthermore, a “smart robot” is just a machine learning algorithm. And then there’s the premise of calling their mail handler “YouPS.” This paper was full of funny little expressions and puns that I aspire to create one day.

While I liked that they found that senders wanted recipients to have an easier time dealing with their email, I wasn’t terribly surprised by that. If I wanted a reply to an email, I’d rather the recipient get it and be able to deal with it immediately than risk them forgetting about my request altogether. That’s the best of both worlds, where all parties involved have the right amount of time to apply to pressing concerns.

I also appreciated that they were able to get responses from people not affiliated with a university, as research is often too narrowly focused on college students.

Lastly, I enjoyed the abstraction they created with their YouPS system. While it was essentially just an API that lets users use standard Python with an email library, it seemed genuinely useful for many different tasks.

Questions

  1. What is your biggest pet peeve about the way email is typically handled? How might automation solve that issue?
  2. Grounded Theory is a method that pulls a ton of data out of written or verbal responses, but requires a significant effort. Did the team here effectively use grounded theory, and was it appropriate for this format? Why or why not?
  3. How might you solve sender issues in email? Is it a worthwhile goal, or is dealing with those emails trivial?
  4. What puns can you create based on your own research? Would you use them in your papers? Would you go so far as to include them in the titles of your works?


04/22/20 – Fanglan Chen – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary

Hahn et al.’s paper “The Knowledge Accelerator: Big Picture Thinking in Small Pieces” uses a distributed information synthesis task as a probe to explore the opportunities and limitations of accomplishing big-picture thinking by breaking it down into small pieces. Most traditional crowdsourcing work targets simple, independent tasks, but real-world tasks are usually complex and interdependent, and may require big-picture thinking. A few existing crowdsourcing approaches support the breakdown of complex tasks by depending on a small group of people to manage the big-picture view and control the ultimate objective. This paper proposes that a computational system can support big-picture thinking entirely through small pieces of work conducted by individuals. The researchers implement distributed information synthesis in a prototype system and evaluate its output on different topics to validate the viability, strengths, and weaknesses of their proposed approach.

Reflection

I think this paper introduces an innovative approach to knowledge collection that can potentially replace a group of intermediate moderators/reviewers with an automated system. The example task explored in the paper is answering a given question by collecting information in a parallel way, which relates to the question of how the proposed system improves answer quality by compiling the collected pieces of information into a structured article. For similar question-answering tasks we already have a variety of online communities and platforms. Take Stack Overflow, for example: it is a site where enthusiast programmers learn and share programming knowledge. A large number of professional programmers answer questions on a voluntary basis, and a question usually receives several answers detailing different approaches, with the best solution at the top marked with a green check. You can check the other answers as well in case the first one does not work for you. I think the variety of answers from different people sometimes increases the likelihood that the problem will be solved, and the proposed system somewhat reduces that kind of diversity in the answers. Also, since a single article is the system’s final output for a given question, its quality matters a great deal, yet it seems hard for the vote-then-edit pattern to ensure the quality of the final answer without any reviewers.

In addition, we need to be aware that much real-world work can hardly be conducted via crowdsourcing because of the difficulty of decomposing tasks into small, independent units and, more importantly, because the objective goes beyond accelerating computation or collecting complete information. For creative work such as writing a song, editing a film, or designing a product, the goal is more about encouraging creativity and diversity. In those scenarios, even with a clear big picture in mind, it is very difficult to assemble the small pieces produced by a group of recruited crowd workers into a good final piece of work. As a result, I think the proposed approach is limited to comparatively less creative tasks in which each piece can be decomposed and processed independently.

Discussion

I think the following questions are worthy of further discussion.

  • Do you think the proposed system can completely replace the role of moderators/reviewers in that big picture? What are the advantages and disadvantages?
  • This paper discusses the proposed system for the question-answering task. In what other possible applications could the system be helpful?
  • Can you think of any way to improve the system so that it scales to other domains, or even non-AI domains?
  • Are you considering a breaking-down approach in your course project? If so, how would you like to approach it?
