02/05/2020 – Bipasha Banerjee – Power to the People: The Role of Humans in Interactive Machine Learning

Summary 

The article was an interesting read on interactive machine learning, published by Amershi et al. in AI Magazine in 2014. The authors pointed out the problems with traditional machine learning (ML), in particular the time and effort wasted to get a single job done. The traditional process involves slow, time-consuming interactions between machine learning practitioners and domain experts. To make this process more efficient, continuous, interactive approaches are needed. The authors noted that in the interactive setting, model updates are rapid and driven directly by user feedback. Another benefit they pointed out is that users with little or no ML experience can participate, since the interaction is input-output driven. They presented several case studies of such applications, such as the Crayons system, along with observations demonstrating how end users’ involvement affected the learning process. The article also proposed some novel interfaces for interactive machine learning, such as letting users assess model quality and timing queries to users, among others.

Reflection

I feel that interactive machine learning is a very useful and novel approach to machine learning. Having users involved in the learning process saves time and effort compared to the traditional approach, where the collaboration between the practitioner and the domain experts is not seamless. I enjoyed reading about interaction-based learning and how users are directly involved in the learning process. Case studies like learning for gesture-based music or image segmentation demonstrate how users provide feedback to the learner immediately after looking at the output. Traditional ML algorithms do involve a human component during training, mainly in the form of annotated training labels. However, whenever domain-specific work is involved (e.g., the clustering of low-level protein problems), labeling by crowd workers becomes tricky. Hence, this method of involving experts and end users directly in the learning process is productive. It is essentially a human-in-the-loop approach, as mentioned in the “Ghost Work” text, although the human involvement there differs from the interaction that occurs when humans are actively trying to steer the system. The article mentioned various observations when dealing with humans, and it was interesting to see how humans behave and exhibit biases. This echoes last week’s reading about affordances. We find that humans do tend to have a bias, in this case a positive bias, whereas machines appear to be unbiased (debatable, since machines are trained by humans, and that data is prone to bias).
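To make the feedback loop concrete, here is a minimal sketch of the kind of rapid interaction the paper describes, assuming scikit-learn's incremental SGDClassifier; the interface hooks (show_prediction, get_user_correction) are hypothetical placeholders, not anything from the paper.

```python
# Minimal sketch of an interactive machine learning loop (assumes scikit-learn).
# show_prediction and get_user_correction are hypothetical interface hooks.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()              # linear model that supports incremental updates
classes = np.array([0, 1])           # e.g., background vs. foreground

def interactive_session(examples, show_prediction, get_user_correction):
    fitted = False
    for x in examples:               # x: 1-D feature vector for one item
        x = np.asarray(x).reshape(1, -1)
        pred = model.predict(x)[0] if fitted else classes[0]
        show_prediction(x, pred)             # user inspects the current output...
        label = get_user_correction(pred)    # ...and immediately corrects it
        # Incremental update after every interaction, instead of a slow
        # retraining cycle mediated by an ML practitioner.
        model.partial_fit(x, np.array([label]), classes=classes)
        fitted = True
```

In a Crayons-style image segmenter, for instance, each x would be the features of a pixel region and the correction would come from the user painting over a misclassified area.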

Questions

  1. How do we deal with human bias effectively?
  2. How can we evaluate how well the system performs when the input data is not free from human errors? (E.g., humans tend to demonstrate how the learner should behave, which may or may not be the correct approach. They tend to have biases too.)
  3. Most of the case studies mentioned are interactive in nature (teaching concepts to robots, the interactive Crayons system, etc.). How does this extend to domains that are less interactive, like text analysis?


02/05/2020 – Nurendra Choudhary – Guidelines for Human-AI Interaction

Summary

In this paper, the authors propose a set of 18 guidelines for human-AI interaction design. The guidelines were distilled from more than 150 AI-related design recommendations collected from diverse sources. Additionally, they validate the guidelines from both the users’ and the experts’ perspectives.

For the user study, the guidelines were evaluated by 49 HCI practitioners, each testing them against a familiar AI-driven feature of a product. The goal was to count how many guidelines the feature followed or violated. The feedback form also had a “does not apply” option with a corresponding explanation field, and the review included a clarity rating to surface ambiguity in the guidelines. From this empirical study, the authors concluded that the guidelines were mostly clear and hence could be applied to human-AI interactions. The authors then revised the guidelines according to the feedback and conducted an expert review.
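As a rough illustration (not from the paper) of how such review data might be tallied, here is a small sketch with invented guideline IDs and judgments:

```python
# Hypothetical illustration: tallying practitioner judgments per guideline.
from collections import Counter, defaultdict

# Each tuple: (guideline_id, judgment) from one practitioner's review of a feature.
reviews = [
    ("G1", "applied"), ("G1", "violated"), ("G1", "applied"),
    ("G15", "does not apply"), ("G15", "does not apply"),
]

tallies = defaultdict(Counter)
for guideline, judgment in reviews:
    tallies[guideline][judgment] += 1

for guideline in sorted(tallies):
    print(guideline, dict(tallies[guideline]))
# Guidelines dominated by "does not apply" or frequent violations become
# candidates for revision in the next phase.
```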

The guidelines are especially useful when deploying ML systems in the real world. Generally, in the AI community, researchers do not see immediate, concrete benefits to developing user-friendly systems. However, when such systems are deployed for real-world users, the user experience, or human-AI interaction, becomes a crucial part of the overall mechanism.

For the expert review, both the old and the revised guidelines were presented, and the experts agreed with the revisions for all but one guideline (G15). From this, the authors conclude that the review process was effective.

Reflection

Studying the applicability of the guidelines is really important (as the authors did in the paper), because I do not feel all of them are necessary across the diverse range of applications. It is interesting to notice that for photo organizers, most of the guidelines are already being followed, and yet they also receive the highest number of “does not apply” ratings. E-commerce, on the other hand, seems to be plagued with issues. I think this is because of a gap in transparency: the AI systems in photo organizers need to be advertised to users and directly affect their decisions, whereas in e-commerce the AI systems work in the background to influence user choices.

AI systems steadily learn new things, and their behavior is often not interpretable even by the researchers who built them, so asking systems to explain themselves to users is, I believe, an unfair ask for now. However, as the AI research community pushes for increased interpretability, I believe it will become possible and will definitely help users. Imagine if you could explicitly set the features attached to your profile to improve your search recommendations.

Similarly, “relevant social norms” and “mitigate social biases” are not a major focus at present, but I believe they will grow over time into a dominant area of ML research.

I think we can use these guidelines as tools to diversify AI research into more avenues focusing on building systems that inherently maintain these principles. 

Questions

  1. Can we study the feasibility and cost-to-benefit ratio of making changes to present AI systems based on these guidelines?
  2. Can such principles be evaluated from the other perspective? Can we give better data guidelines for AI to help it learn?
  3. How frequently does the list need to evolve with the evolution of ML systems?
  4. Do the users always need to know the changes in the AI? Think about interactive systems where the AI learns in real time. Wouldn’t there be too many notifications for a human user to track? Would it become something like spam?

Word Count: 569


02/05/20 – Fanglan Chen – Principles of Mixed-Initiative User Interfaces

Horvitz’s paper “Principles of Mixed-Initiative User Interfaces” highlighted several principles that allow AI engineers to enhance human-computer interaction through a carefully designed coupling of automated services with direct manipulation by humans. The author demonstrated a middle ground in the human-computer interaction debate between total automation of user needs (via intelligent agents) and the importance of user control and decision making (via graphical user interfaces). By showing how to turn the proposed principles into potential improvements of an application, the LookOut system for scheduling and meeting management, the paper explored the possibility of designing innovative user interfaces and new human-computer interaction modalities by considering, from the ground up, designs that benefit from both the power of direct manipulation and potentially valuable automated reasoning.

I think this discussion can be framed by noting the interesting duality between artificial intelligence (AI) and human-computer interaction (HCI). In AI, the goal is to mimic the way humans learn and think in order to create computer systems that can perform intelligent actions beyond naive tasks. In HCI, the target is to design computer interfaces that leverage human strengths and aid users in the execution of intelligent actions. The basic idea of mixed-initiative interaction is to let agents and users work most effectively through a collaborative process. From the agent’s side, the major challenge is to deal with uncertainty about users’ interests and intentions, and thus to know how to proceed in coordinating with users across a variety of tasks. It is indispensable to bring humans into the interaction through a mode that is convenient for them. To achieve this, intelligent agents must be designed to focus on various subproblems, fill in details, identify problem areas, and collaborate with different users to find the best personalized solutions. Without this mixed initiative, AI designs would very likely fall into either purely human-controlled or purely system-controlled approaches. We also need to be aware that the mixed-initiative, also called co-creative, framework may come with a high cost. System-controlled frameworks are prevalent nowadays because they save companies effort and money. When trying to balance operational expenses against improved customer service, it is important to ask how we can decide which framework to choose and at what stages we need to get humans involved.
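Horvitz frames this as decision-making under uncertainty: given the inferred probability that the user actually has a goal, the agent should take whichever option (act, engage in dialog, or do nothing) has the highest expected utility. Below is a toy sketch of that decision rule; the utility numbers are invented purely for illustration, not taken from the paper.

```python
# Toy illustration of invocation under uncertainty (utility values are invented).
def choose_action(p_goal):
    utilities = {
        # option: (utility if the user has the goal, utility if they do not)
        "act automatically": (1.0, -0.8),   # valuable if right, costly if wrong
        "ask the user":      (0.6, -0.2),   # dialog carries a small attention cost
        "do nothing":        (0.0,  0.0),
    }
    expected = {
        option: p_goal * u_goal + (1 - p_goal) * u_none
        for option, (u_goal, u_none) in utilities.items()
    }
    return max(expected, key=expected.get)

print(choose_action(0.2))   # -> do nothing: stay out of the user's way
print(choose_action(0.5))   # -> ask the user: the mixed-initiative middle ground
print(choose_action(0.9))   # -> act automatically
```

The thresholds at which the agent switches from staying quiet, to asking, to acting fall out of the cost/benefit assessment, which is exactly where the "get humans involved" question above becomes a quantitative one.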

In the mixed-initiative framework, a user and an AI agent work together to produce the final product. Let us take a look at an example of mixed-initiative research led by the University of Rochester. Through years of work on mixed-initiative planning systems, one of their projects has been to develop systems that enhance human performance in managing plans, such as transportation network planning. There is no denying that intelligent planning systems and human experts solve problems in very different ways: automated agents require complete specifications of the goals and situation before they know where to start, while human experts incrementally learn about the scenario and modify the goals during the process of developing the plan. Faced with this dilemma, the research team decided to design a collaborative planning system that takes advantage of the strengths of both the user and the machine to build the plan. The idea is that users bring intuition, concrete goals, trade-offs between goals, and advanced problem-solving strategies, while agents bring an ability to manage details, allocate resources, and perform quantitative analysis of proposed actions. In this way, the capability of humans and AI agents to create the desired output is extended. I think the following questions are worthy of further discussion.

  • What is the boundary between human control, system control, and mixed-initiative frameworks? 
  • How can we decide which framework to choose and at what stages we need to get humans involved to make the systems better?
  • How can we provide a personalized user experience given the countless uncertain decisions involved?
  • Do all kinds of tasks require a mixed-initiative approach? What kinds of projects would benefit most from the mixed-initiative framework?


02/05/20 – Vikram Mohanty – Power to the People: The Role of Humans in Interactive Machine Learning

Paper Authors: Saleema Amershi, Maya Cakmak, W. Bradley Knox, Todd Kulesza

Summary

This paper highlights the usefulness of intelligent user interfaces, or the power of human-in-the-loop workflows, for improving machine learning models, and makes the case for moving from traditional machine learning workflows to interactive machine learning platforms. Implicit in this is that domain experts, the potential users of such applications, can provide high-quality data points. To facilitate that, the role of user interfaces and user experience is illustrated via numerous examples. The paper outlines some challenges and future directions of research for better understanding how user interfaces interact with learning algorithms and vice versa.

Reflections

  1. The case study with proteins and biochemists illustrates a classic case of the frustration associated with iterative design while striving to align with user needs. In this example, however, the problem space was focused on getting an ML model right for the users. As the case study showed, interactive machine learning applications seemed to be the right fit for solving this problem, as opposed to having experts iteratively tune the model by hand. The research community is rightfully moving in the direction of producing smarter applications, and in order to ensure more (better?) intelligibility of these applications, building user interfaces/applications for interactive machine learning seems to be an effective and cost-efficient route.
  2. In the realm of intelligent user interfaces, human users are not just good for providing quality training data; they provide a lot more value beyond that. Still, my reflection will center around the “human-in-the-loop” aspect to keep the discussion aligned with the paper’s narrative. The paper, without explicitly saying so, also shows how we can get good-quality training labels without relying solely on crowdsourcing platforms like AMT or Figure Eight, by instead focusing on the potential users of such applications, who are often domain experts. The trade-off between collecting data from novice workers on AMT and from domain experts is pretty obvious: quality vs. cost.
  3. The authors, through multiple examples, also make an effective argument about the inevitable role of user interfaces in ensuring a stream of good-quality data. The paper further stresses the importance of user experiences in generating rich and meaningful datasets.
  4. “Users are People, Not Oracles” is the first point, and seems to be a pretty important one. If applications are built with the sole intention of collecting training data, there is a risk of the user experience being sacrificed, which in turn hurts data quality, and the cycle breaks down.
  5. Because it is difficult to decouple the contributions of the interface design and the chosen algorithm, coming up with an effective evaluation workflow seems like a challenge. However, evaluation is very context-dependent, and following recent guidelines such as https://pair.withgoogle.com/ or https://www.microsoft.com/en-us/research/project/guidelines-for-human-ai-interaction/ can go a long way toward improving these interfaces.

Questions

  1. For researchers working on crowdsourcing platforms, even if it’s for a simple labeling task, how did you handle poor-quality data? Did you ever re-evaluate your task design (interface/user experience)?
  2. Let’s say you work in a team with domain experts. The domain experts use an intelligent application in their everyday work to accomplish a complex task A (the main goal of the team), and as a result, you get data points (let’s call them A-data). As a researcher, you see the value of collecting data points B-data from the domain experts, which may improve the efficiency of task A. However, in order to collect B-data, the domain experts have to perform task B, which is extra work and deviates from A (their main objective and what they are paid for). How would you handle this situation? [This is pretty open-ended]
  3. Can you think of any examples where collecting negative user feedback (which can significantly improve the learning algorithm) also fits the natural usage of the application?


02/04/2020 – Akshita Jha – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

Summary:
“Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research” by Vaughan is a survey paper that provides an informative overview of crowdsourcing research for the machine learning community. There are four main application areas:
(i) Data generation: This is made up of two types of work. The first is data aggregation, where several crowdworkers are assigned the same data point and asked to annotate it; the responses are then aggregated into a final label (see the sketch after this list). The second type involves modifying the system itself to elicit higher-quality responses from crowdworkers.
(ii) Evaluating and debugging models: Crowdworkers can help debug and evaluate unsupervised machine learning methods such as topic models (e.g., LDA), generative models, etc.
(iii) Hybrid systems that utilize both machines and humans to expand their capabilities: Humans and machines have complementary strengths which, if put to proper use, can result in effective systems that help humans as well as improve the machine’s understanding.
(iv) Crowdsourced behavioral experiments that gather data and improve our understanding of how humans interact with machine learning systems: Behavioral experiments can help us understand how humans would like to interact with the system and what changes can be made to improve end-user satisfaction.
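As a concrete illustration of the data-aggregation idea in (i), here is a minimal sketch of majority-vote label aggregation; real systems often weight workers by estimated reliability (e.g., Dawid-Skene style models), and the example data below is invented.

```python
# Minimal sketch of aggregating redundant crowd labels by majority vote.
from collections import Counter, defaultdict

def aggregate(labels):
    """labels: iterable of (item_id, worker_id, label) triples."""
    votes = defaultdict(Counter)
    for item_id, _worker_id, label in labels:
        votes[item_id][label] += 1
    # The final label for each item is the most common worker response.
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

crowd_labels = [
    ("img_1", "w1", "cat"), ("img_1", "w2", "cat"), ("img_1", "w3", "dog"),
    ("img_2", "w1", "dog"), ("img_2", "w4", "dog"),
]
print(aggregate(crowd_labels))   # {'img_1': 'cat', 'img_2': 'dog'}
```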

Reflections:
In my limited knowledge about crowdworkers, I was aware of their importance for data aggregation. The author does a good job highlighting other areas where machine learning researchers might benefit from utilizing the power of crowdworkers. What I found particularly interesting were the case studies that use crowdworkers to debug models and evaluate their interpretability. When we think of “debugging” models and finding flaws in the system, we mostly view things from the developer’s point of view and rely on them completely to debug and evaluate the model’s performance. Using crowdworkers for this task seems like a useful application area that more machine learning researchers should be aware of. These tasks might also be of greater interest to the crowdworkers because they are not repetitive and involve active participation. “Human debugging” can help by taking crowdworker feedback into account to uncover bottlenecks in machine learning models. Hybrid techniques that use human feedback also seem like a promising application area, where the system relies extensively on human judgement to make the right decisions. This also puts more responsibility on machine learning researchers to be creative and come up with unique ways to involve humans. Setting up pilot studies can help on this front: pilot studies demonstrate how a layperson interacts with a system and reveal the gaps that researchers need to fill in order to ensure a cohesive experience for the end user. However, care should be taken to ensure that the effort put in by crowdworkers to build these systems does not go unappreciated.

Questions:
1. Did you agree with the applications of crowdworkers presented in this survey?
2. What steps can be taken to make machine learning researchers aware of these potential applications?
3. Apart from fairly compensating the workers, what steps can be taken to value their contributions?


02-05-2020 – Ziyao Wang – Principles of Mixed-Initiative User Interfaces

The author proposed the idea of combining automated services and direct manipulation in developing user interfaces. Considering 12 factors, including developing significant value-added automation, uncertainty about a user’s goals, the user’s attention, costs, benefits, dialog, and so on, the author studied the LookOut system, a mixed-initiative user interface that enables users and intelligent agents to collaborate efficiently. Factors such as the value-added calendaring and scheduling service, decision making under uncertainty, multiple interaction modalities, and handling invocation failures were evaluated, and future expectations were set. After the discussion of costs and benefits and the research on the LookOut system, the combination of reasoning machinery and direct manipulation was shown to be a promising way to improve human-computer interaction.

Reflection:

Though this paper was written in 1999, the idea behind it is still valuable now. The combination of automated services and direct manipulation has been widely applied in current user interfaces. For example, in designing the user interface of Taobao, the developers built numerous modules. The modules can be arranged by users according to their preferences, and in the meantime there is an AI system that arranges the modules according to search history and user actions. With the combination of these two arrangements, the user experience is improved significantly. Apart from Taobao, most currently popular applications and websites have similar systems along the lines of what this 1999 paper recommended. Beyond this paper, there must be other old papers that contain ideas which are still valuable today. For this reason, there is a need for current researchers to review old papers regularly.

It is certainly necessary to read up-to-date papers, which represent the current state of the art. However, some of the ideas proposed in old papers still work now. Some of those ideas were impossible to implement at the time, and as a result the papers were ignored by other researchers. With the development of technology, we should revisit from time to time the papers that proposed ideas which were once impossible to implement. Someday, these once-impractical ideas may become feasible as technology develops.

Apart from the idea proposed in the paper, I have another thought about how the author arrived at this idea. At that time, researchers focused either on tools for users to directly manipulate user interfaces or on automated services that can sense user activity and take automated actions; research combining the two aspects was limited. The author considered both sides and charted a new way to improve human-computer interaction. Similarly, if we can combine two up-to-date research topics that have similarities, novel solutions to some current challenges may emerge, and this approach may be applied in our course projects.

Question:

Which applications have applied the proposed approach of combining automated services and direct manipulation?

What should we do if the agent’s decision conflicts with the user’s decision?

Is it ethical for agents to track user activities? If not, how can agents provide services automatically?


02/04/2020 – Akshita Jha – Power to the People: The Role of Humans in Interactive Machine Learning

Summary:
“Power to the People: The Role of Humans in Interactive Machine Learning” by Amershi et al. talks about the tightly coupled interactivity between systems and end users and how to improve user experiences while also improving system performance. The workflow for conventional machine learning involves a long, drawn-out process of training/pre-training, fine-tuning, iteratively tuning hyper-parameters, etc. to improve the target metrics. In comparison, feedback in the interactive machine learning workflow is rapid, focused, and incremental. Prominent real-world examples of interactive machine learning systems include recommender systems like those of Amazon and Netflix. Interactive machine learning has also been used for image segmentation, where users were asked to mark the foreground and background of an image; the system took this feedback into consideration and improved its performance. Similarly, interactive music composition definitely helps improve the system, but has also been shown to help train the students. The authors also present case studies that explore novel interfaces for interactive machine learning: for example, giving the end user the ability to modify the input and observe the effect on the final output, studies attempting to understand the efficacy of active vs. passive learning, enabling users to query the learner as opposed to only answering its questions, and enabling users to provide active feedback and critique the learner’s output. In all these examples, the user and the system are tightly coupled and form a cohesive unit that is difficult to study in isolation.

Reflections:
The paper presents several case studies that highlight the differences between machines and humans. One case study I found particularly interesting was where researchers tried to use human feedback to train a reinforcement-learning-based model. In conventional reinforcement learning, the agent works in a simulated task environment and receives rewards based on each of its actions. The agent then tries to find policies that best complete the task at hand by maximizing these rewards. Unlike a simulated reward function, which freely penalizes the agent, humans in the loop gave positive feedback far more often than negative feedback, which encouraged the agent to behave greedily. This had an undesired effect: the agent actively avoided reaching the goal. This result is fascinating for several reasons: (i) it effectively demonstrates the difference between the way computers learn and the way human psychology operates, and (ii) it shows what can be changed in the system to incorporate human feedback and make it more effective and user friendly. Another unexpected insight was that people value transparency. It was surprising to find that knowing more about the “black box” model helped in getting better labels. In order to design effective systems, it is critical to understand what humans expect while interacting with a system.
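The mechanism behind that case study can be seen in a standard tabular Q-learning update, where the human simply replaces the environment's reward function. Here is a minimal sketch; the environment interface (reset/step/actions) is a hypothetical stand-in for the actual experimental setup.

```python
# Minimal tabular Q-learning sketch showing where the human reward enters.
# The environment interface (reset/step/actions) is a hypothetical stand-in.
import random
from collections import defaultdict

def q_learning(env, reward_fn, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)                              # Q[(state, action)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            actions = env.actions(state)
            if random.random() < eps:
                action = random.choice(actions)         # explore
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, done = env.step(state, action)
            # reward_fn is where the human enters the loop: if people hand out
            # mostly positive rewards, circling near the goal can score better
            # than actually reaching it (the pathology described above).
            r = reward_fn(state, action, next_state)
            best_next = max((Q[(next_state, a)] for a in env.actions(next_state)),
                            default=0.0)
            Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```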

Questions:
1. Which systems do we interact with most on a daily basis? Are they interactive?
2. Can we develop metrics to appropriately evaluate a model’s ability to interact?
3. Apart from reinforcement learning, are there any other specific machine learning algorithms that might benefit from having humans in the loop?


02/05/20 – Lulwah AlKulaib – Making Better Use of the Crowd

Summary

The survey provides an overview of machine learning projects that utilize crowdsourcing research. The author focuses on four application areas where crowdsourcing can be used in machine learning research: data generation, model evaluation and debugging, hybrid intelligence systems, and behavioral studies to inform ML research. She argues that crowdsourced studies of human behavior can be valuable for understanding how end users interact with machine learning systems, and that these studies are also useful for understanding crowdworkers themselves. She explains that it is important to understand crowdworkers and how that would help in defining recommended best practices for working with the crowd. The case studies she presents show how to effectively run a crowdwork study and provide additional sources of motivation for workers. They also examine how common dishonesty is on crowdsourcing platforms and how to mitigate it when encountered, and they reveal the hidden social network of crowdworkers, dispelling the misconception that crowdworkers are independent and isolated. The author concludes with new best practices and tips for projects that use crowdsourcing, and emphasizes the importance of pilots to a project’s success.

Reflection

This paper focuses on answering the question: how can crowdsourcing advance machine learning research? It asks the readers to consider how machine learning researchers think about crowdsourcing, suggesting an analysis of the multiple ways in which crowdsourcing can benefit, and sometimes benefit from, machine learning research. The author focuses her attention on four categories:

  • Data generation: She analyzes case studies that aim to improve the quality of crowdsourced labels.
  • Evaluating and debugging models: She discusses papers that used crowdsourcing to evaluate unsupervised machine learning models.
  • Hybrid intelligence systems: She shows examples of utilizing the “human in the loop” and how these systems are able to achieve more than would be possible with state-of-the-art machine learning or AI systems alone, because they make use of people’s skills and knowledge.
  • Behavioral studies to inform machine learning research: This category covers the design of interpretable machine learning models, the impact of algorithmic decisions on people’s lives, and questions that are interdisciplinary in nature and require a better understanding of how humans interact with machine learning systems and AI.

The remainder of the survey provides best practices for crowdsourcing, drawn from analyzing multiple case studies. She addresses dishonest and spam-like behavior, how to set payments for tasks, what incentives motivate crowdworkers, how crowdworkers motivate each other, and the communication and collaboration among crowdworkers.

I found the discussion of the crowdworker community the most interesting to read. We have always assumed that they are isolated, independent workers; learning about the forums, how workers promote good jobs, and how they encourage one another was surprising.

I also find the suggested tips and best practices beneficial for crowdsourcing task posters, especially if they are new to the environment.

Discussion

  • What was something unexpected that you learned from this reading?
  • What are your tips for new crowdsource platform users?
  • What would you utilize from this reading into your project planning/work?


02/05/20 – Lulwah AlKulaib – Power to the People

Summary

The paper argues that users have little to do with application development nowadays. The authors mention that developers apply machine learning techniques to solve problems but limit their interaction with end users to mediation by practitioners. This results in a long process with multiple iterations and limits the users’ ability to affect the models. The authors shed light on the importance of studying users in these systems and present case studies as examples of how such systems could result in better user experiences and more effective learning systems. They bring to our attention the advantages of studying user interaction with interactive machine learning systems, along with some pitfalls that developers must watch out for. They also present case studies of novel interfaces for interactive machine learning, clarify the different ways that could create richer interactions with users, and emphasize the importance of evaluating them with end users. The authors conclude the paper by underlining that any approach should be appropriately evaluated and tested before deployment, since permitting user interaction was often, but not always, beneficial. They believe that by acknowledging the challenges in this approach, we would produce better machine learning systems as well as better end users.

Reflection

This paper focuses on the importance of the end user’s role in interactive machine learning systems. It raises questions about how users can effectively influence machine learning systems and how machine learning systems can appropriately influence users. The paper also presents case studies that explain how people interact with machine learning systems. In those cases, some unexpected results were found: for example, people violated assumptions of the machine learning algorithm or were not willing to comply with them. Other cases showed that studies can lead to insights about the input and output types that interactive machine learning systems should support. The paper discusses case studies of novel interfaces for interactive machine learning, whether the novelty comes from new methods of receiving input or of presenting output. The authors mention that new input techniques can give users more control over the system, while new output techniques can make the system more transparent or understandable. The paper does note, though, that not all novel interfaces were beneficial, and certain input and output types created obstacles for the user, reducing the accuracy of the learner. The paper raises a good point about how different end users have different needs and expectations of these systems, and therefore rich interaction techniques must be designed accordingly. I agree with the authors that conducting studies of novel interactive machine learning systems is critical, and that those studies could form the basis of guidelines for future interactive learning systems.

Discussion

  • How would you apply interactive machine learning in your project?
  • Have you encountered such systems in other research papers you have read?
  • What are applications that could benefit from utilizing interactive machine learning systems?
  • How would you utilize some of the case study suggestions from the paper in a machine learning model rather than the user experience?


02/05/20 – Nan LI – Guidelines for Human-AI Interaction

Summary

In this paper, the authors proposed and evaluated 18 guidelines for human-AI interaction. These guidelines were summarized and distilled through four main stages; the paper explains these four phases and presents partial results through several representative examples. First, the authors conducted an exhaustive review of AI design guidelines from different companies, industry sources, public articles, and papers. Then, they conducted a modified heuristic evaluation of these guidelines and reflected on the results. In the third phase, they conducted a user study with 49 HCI practitioners to evaluate the guidelines along two main dimensions: 1) the broad applicability of the guidelines, and 2) the semantic intelligibility of the guidelines. Finally, they evaluated and revised the guidelines with experts who have work experience in UX/HCI and are “familiar with discount usability methods such as heuristic evaluation” (from the paper). The guidelines were analyzed, adjusted, and summarized after each stage based on the results of that stage, and the paper presents the results of each stage through tables and figures. The authors close by discussing the scope of these guidelines, as well as issues found during the evaluation phases.

Reflection:

The main content of this article is the evaluation of the authors’ summarized guidelines, a process divided into three phases. There are many times when we need to evaluate our own hypotheses or conclusions in daily study and research; thus, the evaluation process presented in this paper has many valuable points that are worth learning from.

In the first phase, the original version of the guidelines was collected from a wide range of sources. The collection is very comprehensive: it is not limited to published papers and journals but also draws on existing products and applications.

In the next three phases, each stage of the assessment is detailed and comprehensive. For example, when the authors evaluated whether these guidelines are applicable to AI-infused products, only 13 products were inspected; the number is not large, but the functions of these products are very representative.

In addition, the personnel involved in the inspection in each phase are professionals with experience in the HCI area, which also ensures the professionalism of the evaluation.

During the evaluation, the authors focused not only on the applicability and accuracy of the guidelines but also on the quality of their semantic expression. This has a great positive effect on the use and dissemination of the guidelines.

In the final discussion of the article, the authors also point out that the development of AI-infused products should always consider ethical issues instead of just adhering to the design guidelines. I don’t have much to add to this; I just suddenly realized that no matter the area, and no matter what kind of product we design, it is always linked to ethical and bias issues. This is always the most complicated topic.

Questions:

  • This paper gives a very detailed user study process and results. Have you ever conducted a standard HCI user study? What can you learn from the user study in this paper?
  • The original version of the guidelines proposed in this article is based on existing papers and product designs. However, this summary is more about AI design than HCI design. What do you think about this? Do you think the authors should have collected more information about HCI design principles, or is the information they collected adequate?
  • Do you think the inspection process should include more ordinary AI product users?
