02/05/2020 – Palakh Mignonne Jude – Guidelines for Human-AI Interaction

SUMMARY

In this paper, the authors propose 18 design guidelines for human-AI interaction, with the aim that these guidelines serve as a resource for practitioners. The authors codified over 150 AI-related design recommendations and, through multiple phases of refinement, distilled this list into 18 generally applicable principles. In the first phase, the authors reviewed AI products, public articles, and relevant scholarly papers. They obtained a total of 168 potential guidelines, which were then clustered into 35 concepts. A filtering step reduced this set to 20 guidelines. In phase 2, the authors conducted a modified heuristic evaluation, attempting to identify both applications and violations of the proposed guidelines. They used 13 AI-infused products/features in this evaluation. This phase led them to merge, split, and rephrase several guidelines, reducing the total to 18. In the third phase, the authors conducted a user study with 49 HCI practitioners to understand whether the guidelines were applicable across multiple products and to obtain feedback on their clarity. The authors ensured that the participants had experience in HCI and were familiar with discount usability testing methods. The guidelines were then revised based on the clarity and relevance feedback from this study. In the fourth phase, the authors conducted an expert evaluation of the revisions. These experts were people with work experience in UX/HCI who were well versed in discount usability methods. With their help, the authors assessed whether the 18 guidelines were easy to understand, and then published the final set of 18 guidelines.

REFLECTION

After reading the 1999 paper on ‘Principles of Mixed-Initiative User Interfaces’, I found the study in this paper to be much more extensive and also more relatable, since the AI-infused systems considered were ones I had some knowledge of, unlike the LookOut system, which I have never used. I felt that the authors performed a thorough comparison and included the important phases needed to formulate a strong set of guidelines. I found it interesting that this study was performed by researchers from Microsoft 20 years after the original 1999 paper (also done at Microsoft). I believe the authors provided a detailed analysis of each of the guidelines, and it was good that they included identifying applications of the guidelines as part of the user study.

I felt that some of the violations reported by participants were very well thought out; for example, one violation for a navigation product noted that an explanation was provided but was inadequate – a ‘best route’ was suggested, but no criteria were given for why the route was the best. I feel that such notes from the participants were definitely useful in helping the authors distill good, generalizable guidelines.

QUESTION

  1. Which of the 18 guidelines did you, in your experience, find to be most important? Was there any guideline that appeared ambiguous to you? For those who have limited experience in the field of HCI, were there any guidelines that seemed unclear or difficult to understand?
  2. The authors mention that they do not explicitly include broad principles such as ‘build trust’, but instead use indirect methods, focusing on specific and observable guidelines that are likely to contribute to building trust. Is there a more direct evaluation that could be performed to measure trust building?
  3. The authors mention that it is essential that designers evaluate the influences of AI technologies on people and society. What methods can be implemented in order to ensure that this evaluation is performed? What are the long-term impacts of not having designers perform this evaluation?
  4. For the user study (as part of phase 3), 49 HCI practitioners were contacted. How was this done and what environment was used for the study?


02/05/20 – Vikram Mohanty – Principles of Mixed-Initiative User Interfaces

Paper Author: Eric Horvitz

Summary

This is a formative paper on how mixed-initiative user interfaces should be designed, taking into account the principles surrounding users’ ability to directly manipulate objects and combining them with the principles of interface agents targeted towards automation. The paper outlines 12 critical factors for the effective integration of automated services with direct manipulation interfaces, and illustrates these points through different features of LookOut, a piece of software that provides automated scheduling services from emails in Microsoft Outlook.

Reflection

  1. This paper has aged well over the last 20 years. Even though this work has led to updated renditions that take into account recent developments in AI, the core principles outlined in this paper (i.e., being clear about the user’s goals, weighing costs and benefits before intervening in the user’s actions, the ability for users to refine results, etc.) still hold true today.
  2. The AI research landscape has changed a lot since this paper came out. For context, modern AI-based techniques such as deep learning weren’t prevalent, due to the lack of both datasets and computing power. The internet was nowhere near as big as it is right now. The cost of automating everything back then would obviously have been bottlenecked by the lack of datasets. That feels like a strong motivation for aligning automated actions with the user’s goals and actions and factoring in context-dependent costs and benefits – for example, assigning a likelihood that an email message that has just received the focus of attention is in the goal category of “User will wish to schedule or review a calendar for this email” versus “User will not wish to schedule or review a calendar for this email” based on the content of the message (see the sketch after this list). This is predominantly goal-driven and involves exploring the problem space to generate the necessary dataset. Right now, we are not bottlenecked by problems like the lack of computing power or the unavailability of datasets, and if we do not follow what the paper advocates about aligning automated actions with the user’s goals and actions, or factoring in context, we may end up with meaningless datasets or unnecessary automation.
  3. These principles do not treat agent intervention lightly at all. In a fast-paced world, in the race towards automation, this particular point might get lost easily. For LookOut’s intervention with a dialog or action, multiple studies were conducted to identify the most appropriate timing of messaging services as a function of the nature of the message. Carefully handling the presentation of automated agents is crucial for a positive user experience.
  4. The paper highlights how the utility of the system taking action when a goal is not desired can depend on any combination of the user’s attention status, the available screen real estate, or how rushed the user is. This does not seem like something that can be easily determined by the system on its own or by algorithm developers. System developers or designers may have a better understanding of such real-world scenarios, and therefore this calls for researchers from both fields to work together towards a shared goal.
  5. Uncertainties or the limitations of AI should not come in the way of solving hard problems that can benefit users. Designing intelligent user interfaces that leverage the complementary strengths of humans and AI can help solve problems that cannot be solved by either party on its own. HCI folks have long been at the forefront of thinking about how humans will interact with AI, and how to do work that allows them to do so effectively.
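As a rough illustration of the cost-benefit reasoning in points 2 and 4 above, here is a minimal sketch of an expected-utility decision for a LookOut-style scheduling agent. The keyword heuristic, utility numbers, and dialog cost are all hypothetical stand-ins of my own, not Horvitz’s actual model.

```python
# Expected-utility sketch for a LookOut-style agent: act, ask, or stay quiet.
# All numbers and the keyword "classifier" are illustrative assumptions.

def p_goal(email_text: str) -> float:
    """Placeholder goal inference; a real system would use a trained text classifier."""
    keywords = ("meet", "schedule", "calendar", "lunch", "appointment")
    hits = sum(word in email_text.lower() for word in keywords)
    return min(0.95, 0.05 + 0.25 * hits)

# Utilities of (action taken, whether the goal was actually desired).
U_ACT_GOAL = 1.0      # automation helps when the user wanted it
U_ACT_NO_GOAL = -0.8  # intrusive automation when the user did not want it
U_SKIP_GOAL = -0.3    # missed opportunity to help
U_SKIP_NO_GOAL = 0.0  # staying quiet when nothing was wanted

def decide(email_text: str, dialog_cost: float = 0.2) -> str:
    p = p_goal(email_text)
    eu_act = p * U_ACT_GOAL + (1 - p) * U_ACT_NO_GOAL
    eu_skip = p * U_SKIP_GOAL + (1 - p) * U_SKIP_NO_GOAL
    # Asking first hedges between acting and skipping, minus the cost of interrupting.
    eu_ask = p * (U_ACT_GOAL - dialog_cost) + (1 - p) * (U_SKIP_NO_GOAL - dialog_cost)
    return max((eu_act, "act"), (eu_ask, "ask"), (eu_skip, "do nothing"))[1]

print(decide("Can we meet for lunch next Tuesday to review the calendar?"))  # -> act
print(decide("Here is the quarterly report you asked for."))                 # -> do nothing
```

With richer context (user attention, screen real estate, how rushed the user is), the utilities themselves would become functions of that context, which is exactly why designers and algorithm developers need to set them together.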

Questions

  1. Which principles, in particular, do you find useful if you are designing a system where the intelligent agent is supposed to aid the users in open-ended problems that do not have a clear predetermined right/wrong solution i.e. search engines or Netflix recommendations?
  2. Why don’t we see the “genie” or “clippy” anymore? What does it tell about this – “employing socially appropriate behaviors for agent−user interaction”?
  3. A) For folks who work on building interfaces, do you feel some elements can be made smarter? How do you see using these principles in your work? B) For folks who work on developing intelligent algorithms, do you consider end-user applications in your work? How do you see using these principles in your work? Can you imagine different scenarios where your algorithm isn’t 100% accurate?


02/05/2020 – Sukrit Venkatagiri – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

Paper: Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research
Author: Jennifer Wortman Vaughan

Summary:
This is a survey paper that provides an overview of crowdsourcing research as it applies to the machine learning community. It first provides an overview of crowdsourcing platforms, followed by an analysis of how crowdsourcing has been used in ML research – specifically, in generating data, evaluating and debugging models, building hybrid intelligent systems, and running behavioral experiments. The paper then reviews crowdsourcing literature that studies the behavior of the crowd, their motivations, and ways to improve work quality. In particular, the paper focuses on dishonest worker behavior, ethical payment for crowd work, and the communication and collaboration patterns of crowd workers. Finally, the paper concludes with a set of best practices for the optimal use of crowdsourcing in machine learning research.

Reflection:
Overall, the paper provides a thorough and analytic overview of the applications of crowdsourcing in machine learning research, as well as useful best practices for machine learning researchers to better make use of crowdsourcing. 

The paper largely focuses on ways crowdsourcing has been used to advance machine learning research, but also subtly talks about how machine learning can advance crowdsourcing research. This is interesting because it points to how these two fields are highly interrelated and co-dependent. For example, with the GalaxyZoo project, researchers attempted to optimize crowd effort, which meant that fewer judgements were necessary per image, allowing more images to be annotated overall. Other interesting uses of crowdsourcing were in evaluating unsupervised models and model interpretability. 
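To make the effort-optimization idea concrete, here is a minimal sketch of one way “fewer judgements per image” can be achieved: stop collecting votes on an item once the running majority is confident enough. The thresholds and the function are illustrative assumptions of mine, not the actual GalaxyZoo aggregation model.

```python
# Early-stopping aggregation of crowd labels: easy items need fewer judgements.
from collections import Counter

def aggregate_with_early_stop(votes, confidence=0.8, min_votes=3, max_votes=10):
    """Consume an iterable of crowd labels; stop early once one label dominates."""
    counts = Counter()
    for i, vote in enumerate(votes, start=1):
        counts[vote] += 1
        label, top = counts.most_common(1)[0]
        if i >= min_votes and top / i >= confidence:
            return label, i               # confident answer after only i judgements
        if i >= max_votes:
            break
    label, _ = counts.most_common(1)[0]
    return label, sum(counts.values())    # otherwise fall back to plain majority vote

# An easy image gets resolved after 3 judgements instead of the full budget of 10.
print(aggregate_with_early_stop(["spiral", "spiral", "spiral", "elliptical"]))  # -> ('spiral', 3)
```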

On the other hand, I wonder what a paper that was more focused on HCI research would look like. In this paper, humans are placed “in the loop,” while in HCI (and the real world) it’s often the machine that is in the loop of a human’s workflow. For example, the paper states that hybrid intelligent systems “leverage the complementary strengths of humans and machines to expand the capabilities of AI.” A more human-centered version would be “[…] to expand the capabilities of humans.”

Another interesting point is that all the hybrid intelligent systems mentioned in Section 4 had their own metrics to assess human, machine, and human+machine performance. This speaks to the need for a common language for understanding and then assessing human-computer collaboration, which is described in more detail in [1]. Perhaps it is the unique, highly-contextual nature of the work that prevents more standard comparisons across hybrid intelligent systems. Indeed, the paper mentions this with regards to evaluating and debugging models, that “there is no objective notion of ground truth.”

Lastly, the paper touches on two topics relevant to this semester. The first is algorithm aversion: participants who were given more control in algorithmic decision-making systems were more accurate, not because their own judgements were more accurate, but because they were more willing to listen to the algorithm’s recommendations. I wonder if this is true in all contexts, and how best to incorporate this finding into mixed-initiative user interfaces. The second is that the quality of crowd work naturally varied with payment; however, very high wages increased the quantity of work but not always the quality. Combined with the various motivations that workers have, it is not always clear how much to pay for a given task, which necessitates pilot studies – something this paper also heavily insists on. However, even if it was not explicitly mentioned, one thing is certain: we must pay fair wages for fair work [2].
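On the pilot-study and payment point, here is a minimal sketch of one way to turn pilot timing data into a per-task reward that meets a target hourly wage. The wage, buffer, and function are my own illustrative assumptions rather than the procedure from the Fair Work paper [2].

```python
# Pilot-then-pay: set the per-task reward from measured completion times.
from statistics import median

def reward_per_task(pilot_seconds, target_hourly_wage=15.00, buffer=1.1):
    """Compute a per-task payment (in dollars) from pilot completion times."""
    med = median(pilot_seconds)                # robust to a few unusually slow workers
    pay = target_hourly_wage * (med / 3600.0)  # pro-rate the wage to the task length
    return round(pay * buffer, 2)              # small buffer for underestimated effort

pilot_times = [95, 110, 80, 130, 105]          # seconds per task, from a small pilot batch
print(reward_per_task(pilot_times))            # -> 0.48 (dollars per task)
```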

Questions:

  1. What are some new best-practices that you learned about crowdsourcing? How do you plan to apply it in your project?
  2. How might you use crowdsourcing to advance your own research? Even if it isn’t in machine learning.
  3. Plenty of jobs are seemingly menial, e.g., assembly jobs in factories, working in a call center, delivering mail, yet there seems to have been little effort to make these jobs more “meaningful” and motivating to increase people’s willingness to do the task.
    1. Why do you think there is such a large body of work around making crowd work more intrinsically motivating?
    2. Imagine you are doing crowd work for a living, would you prefer to be paid more for a boring task, or paid less for a task masquerading as a fun game?
  4. How much do you plan to pay crowd workers for your project? Additional reference: [2].
  5. ML systems abstract away the human labor that goes into making it work, especially as seen in the popular press. How might we highlight the invaluable role played by humans in ML systems? By “humans,” I mean the developers, the crowd workers, the end-users, etc.

References:
[1] R. Jordon Crouser and Remco Chang. 2012. An Affordance-Based Framework for Human Computation and Human-Computer Collaboration. IEEE Transactions on Visualization and Computer Graphics 18, 12: 2859–2868. https://doi.org/10.1109/TVCG.2012.195
[2] Mark E. Whiting, Grant Hugh, and Michael S. Bernstein. 2019. Fair Work: Crowd Work Minimum Wage with One Line of Code. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7, 1: 197–206.


02/05/20 – Fanglan Chen – Guidelines for Human-AI Interaction

The variability of current AI designs, together with failures of automated inference – ranging from the disruptive or confusing to the more serious – calls for creating more effective and intuitive user experiences with AI. The paper “Guidelines for Human-AI Interaction” enriches the ongoing conversation on heuristics and guidelines for human-centered design of AI systems. In this paper, Amershi et al. identified more than 160 potential recommendations for human-AI interaction from respected sources ranging from scholarly research papers to blog posts and internal documents. Through a four-phase framework, the research team systematically distilled and validated the candidate guidelines into a unified set of 18. This work empowers the community by providing a resource for designers working with AI and facilitates future research into the refinement and development of principles for human-AI interaction.

The proposed 18 guidelines are grouped into four sections that prescribe how an AI system should behave upon initial interaction, as the user interacts with the system, when the system is wrong, and over time. As far as I can see, the major research question is how to keep automated inferences under some degree of control when they are operating under uncertainty. We can imagine that it would be extremely dangerous in scenarios where humans are unable to intervene when AI makes incorrect decisions. Take autonomous vehicles, for example: AI may behave abnormally in real-world situations that it has not faced in training. How to integrate efficient dismissal or correction is an important question to consider in the initial design of an autonomous system.
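As a minimal sketch of what “efficient dismissal and correction” could look like in code – with a hypothetical `predict_with_confidence` interface and callback names of my own choosing, not anything prescribed by the paper – low-confidence predictions are routed to the user, who can accept, correct, or dismiss them:

```python
# Confidence-gated automation with user dismissal and correction.

def handle_prediction(model, item, apply_fn, ask_user_fn, threshold=0.9):
    label, confidence = model.predict_with_confidence(item)
    if confidence >= threshold:
        apply_fn(item, label)                        # act automatically, but keep it reversible
        return "auto-applied"
    decision = ask_user_fn(item, label, confidence)  # "accept", "correct:<label>", or "dismiss"
    if decision == "accept":
        apply_fn(item, label)
    elif decision.startswith("correct:"):
        apply_fn(item, decision.split(":", 1)[1])    # apply the user-supplied correction instead
    # "dismiss" falls through: no action taken; the dismissal itself can be logged as feedback
    return decision

class StubModel:
    def predict_with_confidence(self, item):
        return ("lane_change", 0.62)                 # deliberately uncertain prediction

result = handle_prediction(
    StubModel(), "sensor_frame_042",
    apply_fn=lambda item, label: print(f"applying {label} for {item}"),
    ask_user_fn=lambda item, label, conf: "dismiss",
)
print(result)  # -> dismiss
```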

Also, we need to be aware that while the guidelines for human-AI interaction are developed to support design decisions, they are not intended to be used as a simple checklist. One of the important intentions is to support and stimulate conversations between user experience and engineering practitioners that lead to better AI design. Another takeaway from this paper is that there will always be situations where AI designers must consider trade-offs among guidelines and weigh the importance of one or more over others. Beyond the four-phase framework presented in the paper, I think there are at least two points worthy of discussion. Firstly, the four-phase framework is largely a narrowing-down process, and no open-ended questions are raised in the feedback cycle. The functioning and goals of apps in different categories may vary, and rising capabilities and use cases may suggest a need for additional guidelines. As AI design advances, we may need more innovative ideas about future AI design instead of being constrained to the existing guidelines. Secondly, it seems all the evaluators who participated in the user study are in the domain of HCI, and a number of them have years of experience in the field. I wonder whether the opinions of end users without HCI experience should be considered as well, and how a wider involvement would impact the final results. I think the following questions are worthy of further discussion.

  • Which of the 18 proposed design guidelines are comparatively difficult to employ in AI designs? Why?
  • Besides the proposed guidelines, are there any design guidelines worthy of attention but not discussed in the paper?
  • Some of the guidelines seem to be of greater importance than others in user experience of specific domains. Do you think the guidelines need to be tailored to the specific categories of applications?
  • In the user study, do you think it would be important to include end users who actually use the app but have no experience studying HCI?


2/5/2020 – Jooyoung Whang – Guidelines for Human-AI Interaction

The paper is a good distillation of the various design recommendations for human-AI interaction systems that have accumulated over more than 20 years since the rise of AI. The authors ran four iterations of filtering to end up with a final set of 18 guidelines that have been thoroughly reviewed and used. Their data comes from commercial AI products, user reviews, and related literature. Across the iterations, the authors:

1. Extracted the initial set of guidelines

2. Reduced the number down via internal evaluation

3. Performed a user study to verify relevance and clarity

4. Tested the guidelines with experts in the field

The authors provide a nicely summarized table containing all the guidelines and their examples. Rather than going in-depth about the resulting guidelines themselves, the authors focus more on the process and feedback that they received. The authors conclude by stating that the provided guidelines are mostly for general design cases and not specific ones.

When I was examining the guideline table, I liked how it was divided into four cases across the design iteration. In a usability engineering class that I took, I learned that a product’s design lifecycle consists of Analyze, Design, Prototype, and Evaluate, in that order (and the cycle can repeat). I could see that the guidelines focus a lot on Analyze, Design, and Evaluate. It was interesting that prototyping wasn’t strongly implied in the guidelines. I assume it may be because the whole design iteration was considered a pass of prototyping. It may also be because it is hard to create a low-fidelity prototype of a system involving artificial intelligence. The reason for going through a prototyping process is to quickly filter out what works and what doesn’t. As the nature of artificial intelligence requires extensive training and complicated reasoning, a pass of prototyping will accordingly take longer than for other kinds of products.
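One common HCI workaround for that prototyping cost – offered here as my own suggestion rather than something from the paper – is a Wizard-of-Oz stub: build the interface against a small protocol, let a human operator (or canned answers) stand in for the not-yet-trained model, and swap the real model in later without touching the UI code. A minimal sketch with hypothetical class and method names:

```python
# Wizard-of-Oz prototyping: the UI depends on a protocol, not on a trained model.
from typing import Protocol

class IntentRecognizer(Protocol):
    def recognize(self, utterance: str) -> str: ...

class WizardOfOzRecognizer:
    """A human operator types the 'model output' during early usability tests."""
    def recognize(self, utterance: str) -> str:
        return input(f"[wizard] user said {utterance!r}; enter intent: ")

class TrainedRecognizer:
    """Later, the real model is dropped in without changing any UI code."""
    def __init__(self, model):
        self.model = model
    def recognize(self, utterance: str) -> str:
        return self.model.predict(utterance)

def run_dialog_turn(recognizer: IntentRecognizer, utterance: str) -> str:
    intent = recognizer.recognize(utterance)
    return f"System response for intent '{intent}'"
```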

It is very interesting that the guidelines (for the long term) instruct that the AI system must inform users of its actions. In my experience using AI systems such as voice recognition before I knew anything about machine learning techniques, the system mostly appeared as a black box. I have also observed many people who intentionally avoided these kinds of systems out of suspicion. I think revealing portions of information and giving control to the users is a very good idea. This will allow more people to adjust to the system quickly.

The following are the questions that came to me while reading the paper:

1. As in my reflection, it is expensive to go through an entire design process for human-AI systems. Would there be a good workaround for this problem?

2. How much control do you think is appropriate to give to the users of the system? The paper mentions informing how the system will react to certain user actions and allowing the user to choose whether or not to use the system. But can we and should we allow further control?

3. The paper focuses on general cases of designing human-AI systems. They note that they’ve intentionally left out special cases. What kinds of special systems do you think will not need to follow the guidelines?


02/05/20 – Myles Frantz – Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research

Throughout this paper, a solo Microsoft researcher presents a seemingly comprehensive (and almost exhaustive) survey of the methods by which crowdsourcing can be used to enhance and improve various aspects of machine learning. The study is not limited to one of the most common crowdsourcing platforms; a multitude of other platforms are included as well, including but not limited to CrowdFlower, ClickWorker, and Prolific Academic. Through reading and summarizing around 200 papers, the key areas of affordance were grouped into four categories: data generation (the accuracy and quality of data being generated), evaluating and debugging models (the accuracy of the predictions), hybrid intelligence systems (the collaboration between human and AI), and behavioral studies to inform machine learning research (realistic human interactions and responses to AI systems). Each of these categories has several examples underneath it, further describing various aspects along with their benefits and disadvantages. Included in these sub-categories are factors such as speech recognition, determining human behavior (and general attitudes) towards specific types of ads, and crowd workers’ intercommunication. With these various factors laid out, the author insists that the platforms and the requesters ensure their crowd workers have a good relationship, that tasks are well designed, and that tasks are thoroughly tested.

I agree with the vastness and comprehensiveness of this survey. Many of the points seem to cover most of the research area. Furthermore, it does not seem this work could easily be condensed into a more compact form.

I wholeheartedly agree with one of the lasting points: ensuring the consumers of the platform (requesters and crowd workers) maintain a good working relationship. There is a multitude of platforms overcorrecting or undercorrecting their issues, upsetting their target audience and clientele, thereby creating negative press and temporarily dipping their stock. A leading example of this is YouTube and its child-directed content, where ads were being illegally targeted at children. YouTube in turn overcorrected and still ended up with negative press, since the changes hurt several of its creators.

Though not a fault of the survey, I disagree with the applications of hybrid forecasting (“producing forecasts about geopolitical events”) and of understanding reactions to ads. These seem to be an unfortunate but inevitable outcome of how companies and potentially governments attempt to predict and get ahead of incidents. Advertisements are not as bad in comparison, but in general, the practice of perfectly balancing user targeting and creating the perfect environment for viewing an advertisement seems malicious and not for the betterment of humanity.

  • While likely impractical, I would like to see what industry has created in the area of hybrid forecasting. Not knowing how far this kind of technology has spread conjures scenarios reminiscent of a few Black Mirror episodes.
  • From the authors, I would like to see which platforms host each of the subcategories of features. This could be done on the reader’s side, though it might be a study in and of itself.
  • My final question would be to request a subjective comparison of the “morality” of each platform. This could be done by comparing the quality of worker discussion or how strong the gamification is across platforms.


02/05/20 – Myles Frantz – Guidelines for Human-AI Interaction

In this paper, the Microsoft authors created and survey-tested a set of guidelines (or best practices) for designing and creating human-AI interactions. In their study, they went through 150 AI design recommendations, ran their initial set of guidelines through a strict set of heuristics, and finally put them through multiple rounds of a user study consisting of 49 HCI practitioners of moderate experience (at least 1 year, self-reported). The resulting 18 guidelines fall under the categories of “Initially” (at the start of the user’s interaction), “During interaction”, “When wrong” (when the AI system errs), and “Over time”. These categories include guidelines such as (but not limited to) “Make clear what the system can do”, “Support efficient invocation”, and “Remember recent interactions”. Throughout the user study, the guidelines were tested for how relevant they would be in specific avenues of technology (such as navigation and social networks). Across these ratings, at least 50% of the respondents thought the guidelines were clear, while approximately 75% of the respondents thought the guidelines were at least neutral (i.e., acceptable to understand). Finally, a set of HCI experts was asked to ensure that further revisions of the guidelines were accurate and better reflected the area.

I agree with and really appreciate the insight from testing the relevancy of each guideline in each segment of industry. Not only does this help avoid misappropriation of guidelines into unintended areas, it also helps create a guideline for the guidelines. This will help ensure that people implementing this set of guidelines have a better idea of where they can best be used.

I also agree with and like the thorough testing that went into the vetting process for these guidelines. In last week’s readings, it seemed the surveys were mostly or solely based on reviews of papers and were subjective to the authors. Having multiple rounds of testing with people who have, on average, substantial experience in the field lends strong support to the guidelines.

  • One of my questions for the authors would be about a post-mortem of the results and their impact upon the industry. Regardless of the citation count, it would be interesting to see how many platforms integrate these guidelines into their systems and to what extent.
  • Following up on the previous question, I would like to see another paper (possibly a survey) exploring the different methods of implementation used across the different platforms. A comparison between the platforms would help to better showcase and exemplify each guideline.
  • I would also like to see each of these guidelines run past a sample of expert psychologists to determine their effects in the long run. Along with what was described in the other paper (Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research) as algorithm aversion (“a phenomenon in which people fail to trust an algorithm once they have seen the algorithm make a mistake”), I would like to see whether these guidelines could make the interaction so immersive that human subjects either reject it entirely or accept it completely.


02/05/2020 – Bipasha Banerjee – Power to the People: The Role of Humans in Interactive Machine Learning

Summary 

The article is an interesting read on interactive machine learning, published by Amershi et al. in AI Magazine in 2014. The authors point out the problems with traditional machine learning (ML) – in particular, the time and effort wasted to get a single job done. The process involves time-consuming interactions between machine learning practitioners and domain experts. To make this process efficient, continuous, interactive approaches are needed. The authors mention that updates in interactive strategies are quicker, as the model gets updated rapidly based on user feedback. Another benefit they point out is that users with little or no ML experience can take part, since the interaction is input-output driven. They give several case studies of such applications, like the Crayons system, and mention observations that demonstrate how the end users’ involvement affected the learning process. Some novel interfaces for interactive machine learning were also proposed in this article, such as assessing model quality and timing queries to users.

Reflection

I feel that interactive machine learning is a very useful and novel approach to machine learning. Having users involved in the learning process indeed saves time and effort compared to the traditional approach, where the collaboration between the practitioner and the domain experts is not seamless. I enjoyed reading about interaction-based learning and how users are directly involved in the learning process. Case studies like learning for gesture-based music or image segmentation demonstrate how users provide feedback to the learner immediately after looking at its output. Traditional ML algorithms do involve a human component during training, mainly in the form of annotated training labels. However, whenever domain-specific work is involved (e.g., the clustering of low-level protein problems), the task of labeling by crowd workers becomes tricky. Hence, this method of involving experts and end users in the learning process is productive. This is essentially a human-in-the-loop approach, as mentioned in the “Ghost Work” text, although that kind of hidden human labor is different from the interaction that occurs when humans are actively engaging with the system. The article mentions various observations about dealing with humans, and it was interesting to see how humans behave and exhibit biases. This connects to last week’s reading about affordances: humans tend to have a bias – in this case, a positive bias – whereas machines tend to have an unbiased opinion (debatable, as machines are trained by humans, and that data is prone to bias).
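As a rough illustration of that feedback loop, here is a minimal sketch of an interactive learner that updates on every piece of user feedback, using scikit-learn’s incremental `partial_fit` and a synthetic stand-in for the end user; the Crayons system itself, of course, worked at a much richer level of interaction.

```python
# Interactive learning loop: show a guess, take the user's label, update immediately.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])                 # e.g. 0 = background, 1 = foreground
rng = np.random.default_rng(0)

def user_feedback(x):
    """Stand-in for the end user; the 'true' concept here is simply x[0] + x[1] > 1."""
    return int(x[0] + x[1] > 1)

# Bootstrap on one labeled example, then learn one example at a time.
x0 = rng.random((1, 2))
model.partial_fit(x0, [user_feedback(x0[0])], classes=classes)

for step in range(200):
    x = rng.random((1, 2))
    guess = model.predict(x)[0]            # show the current model's output to the user...
    label = user_feedback(x[0])            # ...who confirms or corrects it
    model.partial_fit(x, [label])          # each piece of feedback updates the model right away
```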

Questions

  1. How to deal with human bias effectively?
  2. How can we evaluate how well the system performs when the input data is not free from human errors? (E.g., humans tend to demonstrate how the learner should behave, which may/may not be the correct approach. They tend to have biases too)
  3. Most of the case studies mentioned are interactive in nature (teaching concepts to robots, interactive Crayons System, etc.). How does this extend to domains that are non-interactive like text analysis?


02/05/2020 – Nurendra Choudhary – Guidelines for Human-AI Interaction

Summary

In this paper, the authors propose a set of 18 guidelines for human-AI interaction design. The guidelines were codified from 150 AI-related design recommendations collected from diverse sources. Additionally, the authors validate the design from both the users’ and the experts’ perspectives.

For the users, the principles are evaluated by 49 HCI practitioners, each testing a familiar AI-driven feature of a product. The goal is to estimate the number of guidelines followed or not followed by the feature. The feedback form also had a field for “does not apply” with a corresponding explanation field, and the review included a clarity component to uncover ambiguity in the guidelines. From the empirical study, the authors were able to conclude that the guidelines were largely clear and hence could be applied to human-AI interactions. The authors revised the guidelines according to the feedback and conducted an expert review.

The guidelines are really suitable when deploying ML systems in the real world. Generally, in the AI community, researchers do not see any immediate, concrete benefit in developing user-friendly systems. However, when such systems need to be deployed for real-world users, the user experience or human-AI interaction becomes a crucial part of the overall mechanism.

For the experts, the old and new guidelines were presented, and they agreed on the revised guidelines for all but one (G15). From this, the authors conclude that the review process was effective.

Reflection

Studying the applicability of the guidelines is really important (as the authors did in the paper), because I do not feel all of them are necessary for the diverse range of applications. It is interesting to notice that for photo organizers, most of the guidelines are already being followed and that they also received the most “does not apply” responses. E-commerce, on the other hand, seems to be plagued with issues. I think this is because of a gap in transparency: the AI systems in photo organizers need to be advertised to users and directly affect their decisions, whereas for e-commerce, the AI systems work in the background to influence user choices.

AI systems steadily learn new things, and their behavior is often not interpretable even by the researchers who invented them, so I believe this is an unfair ask at present. However, as the AI research community pushes for increased interpretability in these systems, I believe it will become possible and will definitely help users. Imagine if you could explicitly set the features attached to your profile to improve your search recommendations.

Similarly, “matching relevant social norms” and “mitigating social biases” are not currently a major focus, but I believe these will grow over time into a dominant area of ML research.

I think we can use these guidelines as tools to diversify AI research into more avenues, focusing on building systems that inherently uphold these principles.

Questions

  1. Can we study the feasibility and cost-to-benefit ratio of making changes to present AI systems based on these guidelines?
  2. Can such principles be evaluated from the other perspective? Can we give better data guidelines for AI to help it learn?
  3. How frequently does the list need to evolve with the evolution of ML systems?
  4. Do the users always need to know the changes in AI? Think about interactive systems, the AI learns in real-time. Wouldn’t there be too many notifications to track for a human user? Would it become something like spam?

Word Count: 569


02/05/20 – Fanglan Chen – Principles of Mixed-Initiative User Interfaces

Horvitz’s paper “Principles of Mixed-Initiative User Interfaces” highlights several principles for enhancing human-computer interaction through a carefully designed coupling of automated services with direct manipulation by humans. The author demonstrates a middle ground in the human-computer interaction debate between the opportunities of fully automating user needs (via intelligent agents) and the importance of user control and decision making (via graphical user interfaces). By showing how to turn the proposed principles into potential improvements to an application – the LookOut system for scheduling and meeting management – the paper explores the possibility of designing innovative user interfaces and new human-computer interaction modalities by considering, from the ground up, designs that benefit from the power of direct manipulation and potentially valuable automated reasoning.

I think this discussion can be framed by noting the interesting duality between artificial intelligence (AI) and human-computer interaction (HCI). In AI, the goal is to mimic the way humans learn and think in order to create computer systems that can perform intelligent actions beyond naive tasks. In HCI, the target is to design computer interfaces that leverage human strengths and aid users in the execution of intelligent actions. The basic idea of mixed-initiative interaction is to let agents work most effectively through a collaborative process. From the agent’s side, the major challenge is to deal with the uncertainty of users’ interests and intentions, and thus to know how to proceed in coordinating with users across a variety of tasks. It is indispensable to bring humans into the interaction through a mode convenient to them. To achieve this, intelligent agents must be designed to focus on various subproblems, fill in details, identify problem areas, and collaborate with different users to find the best personalized solutions. Without this mixed initiative, AI designs would very likely fall into either purely human-controlled or purely system-controlled approaches. We also need to be aware that the mixed-initiative, or co-creative, framework may come with a high cost. System-controlled frameworks are prevalent nowadays because they save companies effort and money. When trying to balance operational expenses against improved customer service, it is important to ask how we can decide which framework to choose and at what stages we need to get humans involved.
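As a concrete, purely illustrative sketch of that turn-taking, here is a minimal mixed-initiative loop in which the agent proposes a plan under uncertainty and the user can accept it, refine individual details, or take over entirely. The class, function names, and defaults are my own assumptions, not the LookOut system or the Rochester planners.

```python
# Mixed-initiative scheduling: agent proposes, user accepts / refines / takes over.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MeetingProposal:
    day: str
    time: str
    duration_min: int

def propose(email_text: str) -> MeetingProposal:
    """Agent fills in plausible defaults; a real system would infer these from context."""
    day = "Tuesday" if "tuesday" in email_text.lower() else "Monday"
    return MeetingProposal(day=day, time="12:00", duration_min=60)

def mixed_initiative_schedule(email_text: str, user_responses):
    proposal = propose(email_text)
    for kind, payload in user_responses:             # e.g. ("refine", {"time": "12:30"})
        if kind == "accept":
            return proposal
        if kind == "refine":
            proposal = replace(proposal, **payload)  # user edits details, agent keeps the rest
        if kind == "takeover":
            return payload                           # user supplies the full plan directly
    return proposal

result = mixed_initiative_schedule(
    "Can we meet for lunch on Tuesday?",
    [("refine", {"time": "12:30"}), ("accept", None)],
)
print(result)  # MeetingProposal(day='Tuesday', time='12:30', duration_min=60)
```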

In the mixed-initiative framework, a user and an AI agent work together to produce the final product. Let us take a look at an example of mixed-initiative research led by the University of Rochester. Through years of work on mixed-initiative planning systems, one of their projects has been to develop systems that enhance human performance in managing plans, such as transportation network planning. There is no denying that intelligent planning systems and humans solve problems very differently: automated agents require complete specifications of the goals and situation before knowing where to start, whereas human experts incrementally learn about the scenario and modify the goals while developing the plan. Faced with this dilemma, the research team decided to design a collaborative planning system that takes advantage of both the user and the machine to build the plan. The idea is that users bring intuition, concrete goals, trade-offs between goals, and advanced problem-solving strategies, while the agents bring an ability to manage details, allocate resources, and perform quantitative analysis of proposed actions. In this way, the combined capability of humans and AI agents in creating the desired output is extended. I think the following questions are worthy of further discussion.

  • What is the boundary between human control, system control, and mixed-initiative frameworks? 
  • How can we decide which framework to choose and at what stages we need to get humans involved to make the systems better?
  • How can we provide a personalized user experience given the countless uncertain decisions involved?
  • Do all kinds of tasks require mixed initiative? What kinds of projects would benefit more from the mixed-initiative framework?
