02/05/20 – Runge Yan – Principles of Mixed-Initiative User Interfaces

Appropriate collaboration between user and machine promises gains in efficiency, and the case for developing “agents” is persuasive. Using the real-world example of the LookOut system, several principles and methods are demonstrated.

To make sure an additional interface is worth using, the system should follow certain principles in design and implementation: significant value should be added by automation; the user’s attention directly influences the effectiveness of the service; weighing costs against benefits often determines the action; a variety of user goals should be understood; a continuous learning process should be maintained; etc.

The LookOut system provides calendaring and scheduling services based on emails and user behavior: in an interactive situation, the system goes through a two-phase analysis to decide whether assistance is needed and what level of service would fit best (manual operation, automated assistance, or a social agent). The probability that the user has a goal and the expected value of the machine providing service (action or dialog) together determine the thresholds for the best response.
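
The decision-theoretic flavor of those thresholds can be sketched in a few lines of Python. This is only my toy illustration of the idea, not code from the paper: the utility numbers are invented, and LookOut additionally infers the probability of a scheduling goal from the email text itself.

```python
# Toy sketch of the expected-utility thresholds behind LookOut's choice of
# service level. Utility values are made up for illustration only.
UTILITY = {
    ("no_action", "no_goal"): 1.0,  # stay quiet when no help is wanted
    ("no_action", "goal"):    0.2,  # missed a chance to assist
    ("dialog",    "no_goal"): 0.6,  # a gentle question is mildly disruptive
    ("dialog",    "goal"):    0.8,  # asking first helps, but more slowly
    ("action",    "no_goal"): 0.0,  # unwanted automation is costly
    ("action",    "goal"):    1.0,  # automated scheduling help, spot on
}

def expected_utility(action: str, p_goal: float) -> float:
    """Expected utility of an action given P(user has a scheduling goal)."""
    return (p_goal * UTILITY[(action, "goal")]
            + (1.0 - p_goal) * UTILITY[(action, "no_goal")])

def choose_service_level(p_goal: float) -> str:
    """Pick the service level with the highest expected utility."""
    return max(("no_action", "dialog", "action"),
               key=lambda a: expected_utility(a, p_goal))

# With these numbers: low P(goal) -> no action, middling -> dialog, high -> action.
for p in (0.1, 0.5, 0.9):
    print(p, choose_service_level(p))
```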

With these principles and problems addressed, the combination of machine reasoning and direct user manipulation is likely to improve further.

Reflection

I’m quite surprised that this paper was published in 1999. By that time, the concepts and guidelines of HCI were already clearly articulated. Some of the points are exactly what we have today, while others have since developed into today’s ideas. Although LookOut handles a simple task compared to the interactions we encounter today, the treatment of “agents” and of both sides of the interaction is quite comprehensive. The principles take several crucial elements into consideration: added value, user attention, decision thresholds, the learning process, etc. These are basically what come to mind when I think about a complicated interaction process. A two-phase analysis is essential for an “agent” we’d like to count on; the several interaction modalities fit real-time situations well; and a failure-recovery mechanism and an evolving learning process round it out.

When I was using Windows 98 and XP on my father’s PC, I saw a cute lion icon on the desktop, provided by a popular antivirus program, “Rising (Rui Xing)”. The lion was quite smart, as I look back at it: it wouldn’t bother you while your mouse was navigating another program’s window; if your mouse passed by and lingered around it, it would gently ask whether you needed any service or just wanted to play a bit; it was also draggable and would settle in the spot where you usually left it. The most amazing thing was that if I stopped what I was working on and stared at the lion for a little while, it would get all sleepy and start snoring in a really cute way!

I already knew several basic ideas about HCI, and now I have so many “that was amazing” moments looking back. I didn’t realize that my (indirect, seemingly meaningless) behavior largely determined the actions of the machine.

Question:

  1. If (as I see it) this paper addressed such important guidelines in HCI so early, what has held back the (fast) development of such systems? What can we do better to accelerate this process?
  2. How important is it to make users feel natural as they interact with a machine? Should users be notified about what’s going on (like “if you play a lot with the lion, it will infer what you want at a certain time based on your behavior”)? Is that one of the reasons companies collect our data, and why we are uncomfortable with it?


02/05/20 – Runge Yan – Power to the People: The Role of Humans in Interactive Machine Learning

When given a pattern and clear instructions for classification, a machine can learn a specific task quickly. Case studies are presented to give a sense of the users’ role in interactive machine learning: how the machine influences the users and vice versa. Then, several characteristics of the people involved in interactive machine learning are laid out as guidelines for understanding the end user’s effect on the learning process:

People are active, tend to give positive rewards, and want to act as a model for the learner. Also, given the nature of human intelligence, they want to provide extra information even for a rather simple decision, which leads to another finding: people value appropriate transparency in the system, and that transparency in turn helps reduce the labeling error rate.

Several guidelines on interactivity are presented. Instead of a small number of professionals designing the system, people can be involved more deeply in the process and collect the data they want. A novel interactive machine learning system should be flexible in its input and output: users could try inputs with reasonable variation, assess the quality of the model, and even query the model directly; the output can be evaluated by the users rather than by “experts”, users can offer possible explanations for error cases, and modifying the model is no longer off-limits to them.

Details are discussed to suggest which methods fit better in a more interactive system: a common language, principles and guidelines, techniques and standards, handling input volume, algorithms, and collaboration with the HCI community, etc. This paper lays a comprehensive foundation for future research on the topic.

Reflection

I once contributed to a dataset on sense making and explanation. My job was to write two similar sentences in which only one word (or phrase) differs – one of them is common sense and the other is nonsense. As additional information, I wrote three sentences trying to explain why the nonsensical one doesn’t make sense, with only one of them best describing the reason. The model should understand the five sentences, pick out the nonsense, and find the best explanation. I was asked to make the contrast rather extreme, for example, to write the pair “I put an eggplant in the fridge” and “I put an elephant in the fridge”. A mild difference was not allowed, for example, “I put a TV in the fridge.” A model will learn quickly from extreme comparisons; however, I’d prefer an iterative learning process in which the difference narrows down (with one sentence still being nonsense and the other common sense).
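
If it helps to picture the task, one contributed record might look roughly like the sketch below. The field names and the example explanation texts are my own reconstruction from memory, not the dataset’s actual schema.

```python
# One record of the sense-making dataset, roughly as I remember the task.
# Field names and example explanations are my own guesses, not the real schema.
record = {
    "common_sense": "I put an eggplant in the fridge.",
    "nonsense":     "I put an elephant in the fridge.",
    # Three candidate explanations of why the nonsense sentence fails;
    # exactly one is marked as the best reason.
    "explanations": [
        "An elephant is far too large to fit inside a fridge.",
        "Elephants do not need to be kept cold.",
        "Fridges are only meant for storing food.",
    ],
    "best_explanation": 0,  # index of the best explanation above
}
```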

When I tried to become a contributor on Figure Eight (previously CrowdFlower), the tutorial and intro tasks were quite friendly. I was asked to check a LinkedIn account – whether the person is still working in the same position at the same company. The decision assistance made me feel comfortable – I knew what my job was and some possible obstacles along the way, and I could tell the difficulty increased in a reasonable way. When there was additional information that couldn’t be captured by selecting options, I was able to provide extra notes to the system, which made me feel that my work was valuable.

More interactivity is needed to take such models to the next level, but given the previously restricted rules and their restricted output, how open the system should be is a crucial point to settle.

Question

  1. More flexibility means more workload on the system and more requirements on users. How do we balance user contributions? For example, if one user wants to try an experimental input and another user is unwilling to, will the system accept input from both, or only from qualified users?
  2. How do we acknowledge the contributions of the users?


01/29/20 – Runge Yan – Beyond Mechanical Turk

An analysis of Amazon Mechanical Turk and other paid crowdsourcing platforms

Ever since the rise of AMT, research and applications have focused on this prominent crowdsourcing platform. With the development of similar platforms, some concerns about using AMT have been addressed in various ways. This paper reviews AMT’s limitations and compares solutions across seven other popular platforms: ClickWorker, CloudFactory, CrowdComputing Systems, CrowdFlower, CrowdSource, LeadGenius and oDesk.

AMT’s limitations are presented in four categories: it falls short in quality control, management tools, support for fraud prevention, and automated tools. These limitations are then mapped to assessment criteria in order to examine the detailed solutions offered by each platform.

These criteria include, on the requester side, contributor identity and skill, management of extra workload, support for complex tasks, and quality control; and, on the platform side, generalized qualifications, task collaboration, and task recommendation. By comparing each platform with AMT, directions for future research are identified. Also, the method used in this paper could be improved by setting an alternative platform as the baseline.

I tried to work on a platform…

I’ve been thinking about this since I tried to work on Amazon Mechanical Turk and CrowdFlower (I believe they changed the name to Figure Eight). How does a requester post a task? The input and output formats expected by the platform may not match the interface the requester provides. I can see that most requesters have to transform the data or write code themselves, but the platforms are also starting to help here.

Both platforms require identification through a credit card, and AMT also requires an SSN. I’m able to use Figure Eight now, but AMT refused my signup. I have a relatively new SSN and credit record, which is probably why I was refused. Although CrowdFlower has come into people’s view and is mentioned more than before, the difference in scale and functionality can easily be spotted in the two sites’ layout and structure.

Figure Eight provided a basic task for me to start with – given a person’s name, current company, and position, my job was to go to their LinkedIn profile and confirm two things: are they still working at the same company, and if so, did they change to another position? How many positions does this person currently hold?

This should be a simple task, even for someone who isn’t familiar with LinkedIn. The reward is relatively low, though: for 10 correct answers I got 1 cent (I think I was still in an evaluation period). More pay comes later if I work in the recommended manner, i.e., try out several simple tasks in different categories, then move on to more complex tasks, and so on.

Still, I found myself quitting after I had made 10 cents. I’m not sure whether it was because I was too casual on the sample quiz – I got only 8 out of 10 correct – and they decided to give me a longer trial. Compared to several fun-driven tasks I have tried, the experience on Figure Eight was not as welcoming, as I see it.

Back to the analysis. There’s an example that represents many dilemmas in this tripartite workspace: AMT doesn’t care about workers’ profiles, while oDesk offers public worker profiles showing their identities, skills, ratings, etc. It’s hard to maintain a platform where workers can switch between these two options, since identification is preferred for some tasks but not for others. And this may keep requesters from posting their needs.

Questions

Do platforms cooperate? How can existing good solutions be combined to improve a platform, or to build a better one?

How has AMT dominated crowdsourcing for so long without any other platform catching up? What are the most significant improvements to AMT in recent years?


01/29/20 – Runge Yan – Human computation: a survey and taxonomy of a growing field

What is human computation?

Human computation is a process that makes use of human abilities to solve problems that computers cannot currently solve (well). As several related fields have emerged in recent years, understanding the similarities and differences between them and human computation will benefit both research and applications.

Collective intelligence encompasses crowdsourcing, social computing, a large portion of human computation, and a small portion of data mining. Crowdsourcing comes in when a designated job is outsourced to a (large) group of people; social computing refers to communication in online communities mediated by technology; data mining is the process of finding patterns in huge amounts of data; and collective intelligence is a broad field in which the products of many people’s actions turn out to be wiser than any individual’s.

Human computation can be classified along six dimensions: motivation, quality control, aggregation, human skill, process order, and task-request cardinality. Current research and projects in human computation each occupy some combination of values along these dimensions.
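
To make the taxonomy concrete, a system can be thought of as a tuple of values along those six dimensions. The sketch below is my own illustration; the example values for reCAPTCHA are my reading of that system, not classifications quoted from the paper.

```python
from dataclasses import dataclass

# Hypothetical description of a human computation system along the paper's
# six dimensions; the example values are my own reading of reCAPTCHA.
@dataclass
class HumanComputationSystem:
    name: str
    motivation: str                # why people contribute
    quality_control: str           # how bad contributions are filtered out
    aggregation: str               # how individual contributions are combined
    human_skill: str               # what people supply that machines lack
    process_order: str             # who initiates and who consumes the result
    task_request_cardinality: str  # how tasks map to contributors

recaptcha = HumanComputationSystem(
    name="reCAPTCHA",
    motivation="implicit work (done while logging in)",
    quality_control="redundancy across many users",
    aggregation="agreement on each transcribed word",
    human_skill="visual recognition",
    process_order="computer poses the task, people answer, requester collects",
    task_request_cardinality="many tasks, many contributors",
)
```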

Reflection

This is a comprehensive paper, and it covers many dimensions of human computation. I found myself stuck for a long time and then came up with many tiny passing thoughts. It’s hard to say something about the overall similarities and differences, or to analyze whether the number of dimensions should really be six…

I believe a human computation system requires a lot of effort, just like making progress on machine intelligence: design, implementation, improvement, extra budget, etc. How do people weigh the trade-off between investing in research toward a better model and relying on human computation assistance? I think for now the preference is to get help from human computation, but that is changing gradually. As the paradox of automation moves to a higher level, the tasks themselves also change – they require higher-level human skills. As machine intelligence approaches fully human-level performance, the human role in human computation will shift toward “verifying the decision” rather than “filling the gap”.

If someone earns their entire paycheck from different human computation tasks, their career will be totally different from that of someone sitting in an office focusing on a fixed set of responsibilities for a long time. The goals of the tasks vary, so they have to switch quickly between fields, actions, and routines. Accuracy on the first few items will be lower and rise gradually. All of this requires careful observation and adjustment, compared to simply collecting results from traditional employees.

All this implicit work is probably spelled out in the privacy policies, but I don’t read them :P For example, situations where identity confirmation is required always annoy me. As a CS student I should understand how important it is to secure my information; however, “Select all squares with traffic lights” really bothers me. As I learned more about machine learning and realized that I was helping to train Google’s AI, I became even more annoyed (I shouldn’t be – I have actually participated in several human computation projects out of altruism). I don’t know whether it’s just me or it bothers other people, too.

Questions

How do we find a good mix of motivations in human computation? An urgent task may pay more to collect the needed results quickly. How do we prevent contributors from rushing for the money? How do we make sure different motivations have less effect on the outcome of the tasks? Will a paycheck have absolutely no influence on contributors? If not, how do we create a balanced incentive that avoids this bias/influence?

What is the best practice for revealing, step by step, what lies behind each smart machine move? Contributors have been around for more than 10 years. If this course hadn’t asked me to understand and use crowdsourcing, I might never have realized that the underlying non-machine intelligence is all around us. Why haven’t I heard much about this group of people and their jobs? Does that come naturally with the nature of their work, or is something keeping their voices from the public?


01/22/20 – Runge Yan – Ghost Work

What is Ghost work?

In the shadow of emerging AI technology and its applications, millions of workers find tasks on online platforms and earn their paychecks by contributing to “human computation”.

These tasks connect end users and machines like an “API”. The work process is totally invisible to users, yet the workers’ contributions are vital to the user experience of all kinds of services that rely on the boom in artificial intelligence. These workers are completely ignored by the majority of users, and their effort is credited to the algorithms and models. The result is that robots are overrated in their performance and capability. As new software is put into use and AI improves, ghost workers will be assigned further tasks on the way toward full automation. This is the paradox of automation’s last mile.

Are they all mindless and unworthy of mention?

I’ve been thinking about the smartness of the apps on my phone. Every time I talk about something that can be purchased, soon afterward I receive promotion notifications for that product from many apps across all my electronic devices. I know Google, Amazon, and Apple have probably already built predictions of my activity and preferences; still, I’m surprised at how quickly those predictions get confirmed. Is it just my history and the model that determine what to show me next, or is there something I can’t see or imagine?

When we could make phone calls only by landline, operators were a necessary workforce, and we knew their importance. It wasn’t the most glamorous occupation, but many people used the job to make ends meet. Today, most people enjoy all kinds of services brought to them by fast Internet and computing power, and they are unaware of the people who glue their interactions with the machines together.

It’s unfair if we only give credit to the silent, formatted models and neglect the significant contribution of ghost workers’ creativity and flexibility. Just as in other collaborative work, effort is effort. Human computation is already an indispensable part of the whole picture. Without that effort, I suspect all the expectations placed on AI would be disappointed in some way.

The jobs (or combinations of tasks) are described as mindless, meaning no specific skill is needed to accomplish the goal. When I tried to register on a crowdsourcing website, Figure Eight, it wasn’t hard to get started, thanks to all the helpful “quizzes” and tips along the way.

Question

  1. Even with this book and other research exposing ghost work to the public, will this seemingly “irrelevant” information actually draw people’s attention to the topic?
  2. What else, similar to ghost work, silently exists in the shadow of a great invention, research breakthrough, or technology?
  3. Is the current legal and social status of these workers a good status quo? Is there something we can make progress on? Do they demand more “employee” benefits (compared to traditional in-office full-time employees)?

From class

They are hidden deliberately to conceal some shortcomings of the platform, so it’s difficult to decide whether we should expose them.

Are we creating jobs to fire other people?
