03/04/20 – Lulwah AlKulaib – SocialAltText

Summary

The authors propose a system that generates alt text for images embedded in social media posts by utilizing crowd workers. Their goal is to improve the experience of blind and visually impaired (BVI) users on social media. Existing tools provide imperfect descriptions, some through automatic caption generation and others through object recognition, and in many cases their results are not descriptive enough for BVI users. The authors study how crowdsourcing can be used for both:

  • evaluating the value provided by existing automated approaches
  • enabling workflows that provide scalable and useful alt text for BVI users

They utilize real-time crowdsourcing to run experiments that vary how deeply the crowd interacts with visually impaired users. They show the shortcomings of existing AI image captioning systems and compare them with their method. The paper proposes two experiences:

  • TweetTalk: a conversational assistant workflow.
  • Structured Q&A: a workflow that builds upon and enhances state-of-the-art generated captions.

They evaluated the conversational assistant with 235 crowdworkers. They evaluated 85 tweets for the baseline image caption; each tweet was evaluated three times, for a total of 255 evaluations.

Reflection

The paper presents a novel concept, and the approach is a different take on utilizing crowdworkers. I believe the experiment would have worked better if it had been tested on visually impaired users. Since the crowdworkers hired were not visually impaired, it is harder to say that BVI users would have the same reaction; because the study targets BVI users, they should have formed the pool of testers. People interact with the same element in different ways, and what the authors showed seemed too controlled. Also, the questions were not the same for all images, which makes the results harder to generalize. The presented model tries to solve a problem for social media photos, and not having a repeatable procedure for each photo might make interpreting images difficult.

I appreciated the authors’ use of existing systems and their attempt at improving the AI-generated captions. Their results achieve better accuracy than state-of-the-art work.

I would have loved to see how different social media applications measure up against each other, since they vary in how they present photos. Twitter, for example, imposes a limited character count, while Facebook can present more text, which might help BVI users understand the image better.

In the limitations section, the authors mention that human-in-the-loop workflows raise privacy concerns and that the alt text approach could generalize to friendsourcing and other social network users. I wonder how that would generalize to social media applications in real time, and how reliable friendsourcing would be for BVI users.

Discussion

  • What improvements would you suggest to make the TweetTalk experiment better?
  • Do you know of any applications that use humans in the loop in real time?
  • Would you have any privacy concerns if one of the social media applications integrated a human-in-the-loop approach to help BVI users?


03/04/2020 – Ziyao Wang – Combining Crowdsourcing and Google Street View to Identify Street-Level Accessibility Problems

In this paper, the authors focus on a mechanism that uses untrained crowdworkers to find and label accessibility problems in Google Street View (GSV) imagery. They show workers images from GSV and ask them to find, label, and assess sidewalk accessibility problems. They compare the results of this labeling task completed by six dedicated labelers, including three wheelchair users, against the results from MTurk workers. The comparison shows that crowdworkers can determine the presence of an accessibility problem with high accuracy, which suggests this mechanism is promising for sidewalk accessibility assessment. However, the mechanism still has problems, such as locating the GSV camera in geographic space, selecting an optimal viewpoint, measuring sidewalk width, and the age of the images. In the experiments, workers could not label some of the images due to the camera position, and some images may have been captured three years earlier. Additionally, there is no method to measure the width of the sidewalk, which wheelchair users need.

Reflections:

The authors combined Google Street View imagery and MTurk crowdsourcing to build a system that can detect accessibility challenges. This kind of hybrid system achieves high accuracy in finding and labeling such accessibility challenges. If the system can be used in practice, people with disabilities will benefit greatly from it.

However, there are some problems with the system. As mentioned in the paper, the images in Google Street View are old; some may have been captured years ago. If detection is based on these pictures, some new accessibility problems will be missed. For this problem, I have a rough idea of letting users of the system update the image library: when they find a difference between the images in the library and the actual sidewalk, they can upload the latest pictures they capture themselves. As a result, other users will not suffer from the image-age problem. However, this solution would change the whole system. Google Street View imagery requires professional capture devices that are not available to most users, so Google Street View will not update its imagery using photos captured by users, and the system cannot update itself from that imagery. Instead, the system would have to build its own image library, which is completely different from the system introduced in the paper. Additionally, the photos provided by users may have low resolution, which would make it difficult for the MTurk workers to label the accessibility challenges.

Similarly, the problem that workers cannot measure the width of the sidewalk could be solved if users could upload the width while using the system. However, this still faces the problem of needing a separate database, and the system would need to be modified heavily.

Instead of detecting accessibility challenges, I think the system would be more useful for tracking and labeling bike lanes. Compared with sidewalk accessibility, detecting the existence of bike lanes suffers less from the image-age problem, because even if the bike lanes were built years ago, they still work. Also, there is no need to measure the width of the lanes, as all lanes should have enough space for bikes to pass.

Questions:

Is there any approach that could solve the image-age problem, the camera-viewpoint problem, and the width-measurement problem in this system?

What do you think about applying such a system to track and label bike lanes?

What other kinds of street-level detection problems could this system be applied to?


03/04/2020 – Ziyao Wang – Real-time captioning by groups of non-experts

Traditional real-time captioning tasks are completed by professional captionists, but hiring them is expensive. Alternatively, some automatic speech recognition systems have been developed, yet these systems perform badly when the audio quality is low or multiple people are talking. In this paper, the authors developed a system that hires several non-expert workers to do the captioning task and merges their work to obtain a high-accuracy caption output. As the workers earn significantly lower wages than experts, the cost is reduced even when multiple workers are hired. The system also performs well at collecting the workers’ contributions and merging them into a high-accuracy output with low latency.

Reflections:

When solving problems that require high accuracy and low latency, I had always held the view that only AI or experts could complete such tasks. However, in this paper, the authors show that non-experts can also complete this kind of task if a group of people work together.

Compared with professionals, hiring non-experts costs much less. Compared with AI, people can handle complicated situations better. This system combines these two advantages and provides a cheap real-time captioning system with high accuracy.

It is certain that this system has many advantages, but we should still consider it critically. On cost, it is true that hiring non-experts costs much less than hiring professional captionists. However, the system needs to hire 10 workers to reach 80 to 90 percent accuracy. Even if the workers earn a low wage, for example 10 dollars per hour, the total cost reaches 100 dollars per hour, while hiring an expert costs only around 120 dollars per hour. This shows that the savings from applying the system are relatively small.
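
A quick back-of-the-envelope sketch of this cost argument in Python. The $10/hour crowd wage and the roughly $120/hour expert rate are the illustrative figures from this reflection, not numbers reported in the paper, so the exact savings are only indicative.

CROWD_WAGE_PER_HOUR = 10.0    # assumed non-expert wage (illustrative)
EXPERT_RATE_PER_HOUR = 120.0  # assumed professional captionist rate (illustrative)

def crowd_cost(num_workers, hours=1.0, wage=CROWD_WAGE_PER_HOUR):
    """Total cost of running num_workers in parallel for the given hours."""
    return num_workers * wage * hours

for n in (3, 5, 10):
    cost = crowd_cost(n)
    print(f"{n:2d} workers: ${cost:6.2f}/hour, "
          f"saves ${EXPERT_RATE_PER_HOUR - cost:6.2f}/hour vs. one expert")

With ten workers the saving is only about $20/hour, while three to five workers would save $70-$90/hour, which is the gap the following paragraphs argue could be closed by letting workers correct an AI-generated caption instead of typing from scratch.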

On accuracy, there is a possibility that all 10 workers miss the same part of the audio. In that case, even after merging all the workers’ results, the system will still be missing that part of the caption. In contrast, although an AI system may produce a caption with errors, it can at least provide something for every word in the audio.

For these two reasons, I think hiring fewer workers, for example three to five, to fix the errors in a system-generated caption would save more money while still maintaining high accuracy. With a provided caption, the workers’ tasks would be easier, and they may produce more accurate results. Also, in circumstances where the AI system performs well, the workers would not need to spend time typing, and the latency of the system would be reduced.

Questions:

What are the advantages of hiring non-expert humans to do the captioning compared with the experts or AI systems?

Will a system that hires fewer workers to fix the errors in an AI-generated caption be cheaper? Will such a system perform better?

For the system mentioned in the second question, does it have any limitations or drawbacks?


03/04/2020 – Palakh Mignonne Jude – Combining Crowdsourcing and Google Street View To Identify Street-Level Accessibility Problems

SUMMARY

The authors of this paper aim to investigate the feasibility of recruiting MTurk workers to label and assess sidewalk accessibility problems as viewed through Google Street View. The authors conducted two studies: the first with 6 people (3 from their team of researchers and 3 wheelchair users), and the second investigating the performance of turkers. The authors created an interactive labeling interface as well as a validation interface (to help users accept or reject previous labels). They proposed different levels of annotation correctness comprising two spectra: a localization spectrum, which covers image-level and pixel-level granularity, and a specificity spectrum, which covers the amount of information evaluated for each label. They defined image-level correctness in terms of accuracy, precision, recall, and f-measure. To compute inter-rater agreement at the image level, they utilized Fleiss’ kappa. To evaluate the more challenging pixel-level agreement, they aimed to verify the labeling by showing that pixel-level overlap was greater between labelers on the same image than across different images. The authors used the labels produced in Study 1 as the ground truth dataset to evaluate turker performance. They also proposed two quality control approaches: filtering turkers based on a performance threshold and filtering labels based on crowdsourced validations.
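
To make the evaluation terminology above concrete, here is a rough sketch of how the image-level metrics, Fleiss’ kappa, and a pixel-level overlap measure could be computed. The label arrays are made-up toy data, and the paper’s exact formulation (e.g., how turker votes are aggregated) may differ; the Fleiss’ kappa call uses the statsmodels implementation.

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Image-level correctness: does each image contain an accessibility problem?
ground_truth  = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # Study 1 labels (toy data)
turker_labels = np.array([1, 0, 1, 0, 0, 1, 1, 0])   # majority-voted turker labels

acc = accuracy_score(ground_truth, turker_labels)
prec, rec, f1, _ = precision_recall_fscore_support(
    ground_truth, turker_labels, average="binary")
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")

# Image-level inter-rater agreement: rows are images, columns are raters.
ratings = np.array([
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
])
counts, _ = aggregate_raters(ratings)   # per-image counts of each category
print("Fleiss' kappa:", fleiss_kappa(counts))

# Pixel-level agreement: overlap (intersection over union) of two binary masks.
def pixel_overlap(mask_a, mask_b):
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0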

REFLECTION

I really liked the motivation of this paper, especially given the large number of people that have physical disabilities. I am very interested to know how something like this would extend to other countries such as India, where it would greatly aid people with physical disabilities, since there are many places with poor walking surfaces and no support for wheelchairs. I think that having such a system in place in India would definitely help disabled people be better informed about places they can visit.

I also liked the quality control mechanisms of filtering turkers and filtering labels, since these appear to be good ways to improve the overall quality of the labels obtained. I thought it was interesting that the performance of the system improved with turker count but that the gains diminished in magnitude as the group size grew. I thought that the design of the labeling and verification interfaces was good and made it easy for users to perform their tasks.

QUESTIONS

  1. As indicated in the limitations section, this work ‘ignored practical aspects such as locating the GSV camera in geographical space and selecting an optimal viewpoint’. Has any follow-up study been performed that takes into account these physical aspects? How complex would it be to conduct such a study?
  2. The authors mention that image quality can be poor in some cases due to a variety of factors. How much of an impact would this have on the task at hand? Which labels would have been most affected if the image quality was very poor?
  3. The validation of labels was performed by crowd workers via the verification interface. Would there have been any change in the results obtained if experts had been used for the validation of labels instead of crowd workers (since they may have been able to identify more errors in the labels as compared to normal crowd workers)?


03/04/20 – Akshita Jha – Pull the Plug? Predicting If Computers or Humans Should Segment Images

Summary:
“Pull the Plug? Predicting If Computers or Humans Should Segment Images” by Gurari et al. addresses image segmentation. The authors propose a resource allocation framework that tries to predict when it is best to use a computer for segmenting images and when to switch to humans. Image segmentation is the process of “partitioning a single image into multiple segments” in order to simplify the image into something that is easier to analyze. The authors implement two systems: one that decides when to replace humans with computers to create the initial coarse segmentations, and one that decides when to replace computers with humans to obtain the final fine-grained segmentations. They demonstrate through experiments that this mixed model of humans and computers beats state-of-the-art systems for image segmentation. The framework, “Pull the Plug”, works by taking an image and predicting whether its annotation should come from a human or a computer. The authors evaluate the model using Pearson’s correlation coefficient (CC) and mean absolute error (MAE). CC indicates how strongly the predicted scores correlate with the actual scores given by the Jaccard index on the ground truth; MAE is the average prediction error. The authors thoroughly experiment with initializing segmentation tools and with reducing the human effort spent on initialization.
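
As a concrete illustration of these two metrics, the short sketch below computes CC and MAE between hypothetical predicted quality scores and the Jaccard scores of the corresponding segmentations; the numbers are invented, and the paper’s exact evaluation protocol may differ.

import numpy as np

def jaccard(pred_mask, gt_mask):
    """Jaccard index (intersection over union) of two binary masks."""
    inter = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return inter / union if union else 1.0

# Hypothetical per-image values: the framework's predicted quality score
# versus the actual Jaccard score of the machine-generated segmentation.
predicted_quality = np.array([0.92, 0.40, 0.75, 0.88, 0.55])
actual_jaccard    = np.array([0.89, 0.35, 0.80, 0.90, 0.48])

# Pearson's correlation coefficient (CC): strength of the linear relationship.
cc = np.corrcoef(predicted_quality, actual_jaccard)[0, 1]

# Mean absolute error (MAE): average magnitude of the prediction error.
mae = np.abs(predicted_quality - actual_jaccard).mean()
print(f"CC = {cc:.3f}, MAE = {mae:.3f}")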

Reflections:
This is an interesting work that successfully makes use of mixed modes involving both humans and computers to enrich the precision and accuracy of a task. The two methods that the authors design for segmenting an image are particularly thoughtful. First, given an image, the authors design a system that tries to predict whether the image requires fine-grained or coarse-grained segmentation. This is non-trivial, as the task requires the system to possess a certain level of “intelligence”. The authors use segmentation tools, but the motivation of the system design is to remain agnostic to the particular segmentation tools used. The system ranks several segmentation tools using a predictor the authors designed to estimate the quality of a segmentation, and then allocates the available human budget to create coarse segmentations. The second system tries to determine whether an image requires fine-grained segmentation by building on the coarse segmentation produced by the first system; it refines the segmentation and allocates the available human budget to create fine-grained segmentations for the segmentations with low predicted quality. Both tasks rely on the authors’ proposed system for predicting the quality of a candidate segmentation.

Questions:
1. The authors rely on their proposed system of predicting the quality of candidate segmentations. What kind of errors do you expect?
2. Can you think of a way to improve this system?
3. Can we replace the segmentation quality prediction system with a human? Do you expect the system to improve or would the performance go down? How would it affect the overall experience of the system?
4. In most such systems, humans are needed only for annotation. Can we think of more creative ways to engage humans while improving the system performance?


Subil Abraham – 03/04/2020 – Real-time captioning by groups of non-experts

This paper pioneers the approach of using crowd work for closed captioning systems. The scenario they target is classes and lectures, where a student can hold up their phone, record the speaker, and have the sound transmitted to crowd workers. The audio is passed along in bite-sized pieces for the crowd workers to transcribe, and the paper’s implementation of multiple sequence alignment algorithms takes those transcriptions and combines them. The focus of the tool is very much on real-time captioning, so the amount of time a crowd worker can spend on a portion of sound is limited. The authors design interfaces on the worker side to promote continuous transcription, and on the user side to allow users to correct the received transcriptions in real time, enhancing the quality further. The authors had to deal with interesting challenges in resolving errors in the transcription, which they addressed by comparing transcriptions of the same section from different crowd workers and using bigram and trigram data to validate the word ordering. Evaluations showed that precision was stable while coverage increased with the number of workers, and the system had a lower error rate than automatic transcription and untrained transcribers.
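
The paper’s actual merging step is a real-time multiple sequence alignment with n-gram filtering; the toy sketch below only illustrates the simpler core intuition of combining overlapping partial transcripts by majority vote at each word position, and it assumes the alignment has already been computed (the function and data are my own illustration, not the authors’ code).

from collections import Counter

def merge_aligned_transcripts(aligned):
    """Majority-vote merge of word-aligned partial transcripts.

    aligned: a list of equal-length token lists, where None marks a word
    a worker did not type. A real system must first compute this alignment;
    here it is assumed to be given.
    """
    merged = []
    for position in zip(*aligned):
        votes = Counter(tok for tok in position if tok is not None)
        if votes:
            merged.append(votes.most_common(1)[0][0])
    return " ".join(merged)

workers = [
    ["the", "quick", None,    "fox", "jumps", None,   "the", "dog"],
    ["the", None,    "brown", "fox", "jumps", "over", None,  "dog"],
    ["the", "quick", "brown", None,  "jumps", "over", "the", "dog"],
]
print(merge_aligned_transcripts(workers))
# -> "the quick brown fox jumps over the dog"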

One thing that needs to be pointed out about this work is that ASR has been rapidly improving and has made significant strides since this paper was published. From my own anecdotal experience, YouTube’s automatic closed captions are getting very close to being fully accurate (although, thinking back on our reading of the Ghost Work book at the beginning of the semester, I wonder if YouTube is cheating a bit and using crowd work intervention on some of their videos to help their captioning AI along). I also find the authors’ solution for merging the transcriptions of the different sound bites interesting. How they would solve that was the first thing on my mind, because it was not going to be a matter of simply aligning the timestamps; those were definitely going to be imprecise. So I do like their clever multi-part solution. Finally, I was a little surprised and disappointed that the WER was at ~45%, which was a lot higher than I expected. I was expecting the error rate to be much closer to that of professional transcribers, but unfortunately it was not. The software still has a way to go in that regard.

  1. How could you get the error rate down to the professional transcriber’s level? What is going wrong there that is causing it to be that high?
  2. It’s interesting to me that they couldn’t just play isolated sound clips but instead had to raise and lower volume on a continuous stream for better accuracy. Where are the other places humans work better when they have a continuous stream of data rather than discrete pieces of data?
  3. Is there an ideal balance between choosing precision and coverage in the context of this paper? This was something that also came up in last week’s readings. Should the user decide what the balance should be? How would they do it when there can be multiple users all at the same location trying to request captioning for the same thing?


03/04/20 – Fanglan Chen – Combining Crowdsourcing and Google Street View to Identify Street-level Accessibility Problems

Summary

Hara et al.’s paper “Combining Crowdsourcing and Google Street View to Identify Street-level Accessibility Problems” explores a crowdsourcing approach to locating and assessing sidewalk accessibility issues by labeling Google Street View (GSV) imagery. Traditional approaches to sidewalk assessment rely on street audits, which are very labor intensive and expensive, or on reports called in by citizens. The researchers propose their interactive labeling interface as an alternative that deals with the issue proactively. Specifically, they investigate the viability of labeling sidewalk issues among two groups of diligent and motivated labelers (Study 1) and then explore the potential of relying on crowd workers to perform this labeling task, evaluating performance at different levels of labeling accuracy (Study 2). The results of Study 1, which examined labeling viability across two groups (three members of the research team and three wheelchair users), are used to provide ground truth labels for evaluating crowd worker performance and to get a baseline understanding of what labeling this dataset looks like. Study 2 explores the potential of using crowd workers to perform the labeling task; their performance is evaluated at both the image and pixel levels of labeling accuracy. The findings suggest that it is feasible to use crowdsourcing for the labeling and verification tasks, which leads to a final result of better quality.

Reflection

Overall, this paper proposes an interesting approach to sidewalk assessment. What I think about most is how feasible it is to use this approach to deal with real-world issues. In the scenario studied by the researchers, a sidewalk in poor condition has severe problems and relates to a larger accessibility issue in urban space. The proposed crowdsourcing approach is novel. However, if we take a close look at the data source, we may question to what extent it can facilitate assessment in real time. It seems impossible to update the Google Street View (GSV) imagery on a daily basis; the image sources are historical rather than reflective of the current conditions of road sidewalks.

I think image quality may be another big problem with this approach. The resolution of the GSV imagery is comparatively low, and some images are taken under poor lighting conditions, which makes it challenging for crowd workers to make the correct judgment. There is the possibility of using existing machine learning models to enhance image quality by increasing resolution or adjusting brightness. That could be a good place to introduce the assistance of machine learning algorithms to achieve better results in the task.
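
As a sketch of the preprocessing idea suggested above (not something the authors did), low-light GSV crops could be brightened and contrast-boosted with standard scikit-image operations before being shown to workers; whether this actually improves labeling accuracy would need to be tested, and the parameters below are untuned guesses.

import numpy as np
from skimage import exposure, img_as_float

def enhance_for_labeling(image):
    """Brighten and boost local contrast of a street-view crop (a sketch only)."""
    img = img_as_float(image)
    # Stretch intensities between the 2nd and 98th percentiles.
    p2, p98 = np.percentile(img, (2, 98))
    img = exposure.rescale_intensity(img, in_range=(p2, p98), out_range=(0, 1))
    # CLAHE for local contrast; clip_limit controls over-amplification.
    return exposure.equalize_adapthist(img, clip_limit=0.03)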

In addition, the focal point of the camera is another issue that may reduce the scalability of the project. The GSV imagery is not collected solely for sidewalk accessibility assessment, so it usually contains a lot of noise (e.g., occluding objects). It would be interesting to conduct a study on what percentage of the GSV imagery is of good quality with respect to the sidewalk assessment task.

Discussion

I think the following questions are worthy of further discussion.

  • Are there any other important accessibility issues that exist but were not considered in the study?
  • What improvements can you think of that the authors could make to their analysis?
  • What other potential human performance tasks could be explored by incorporating street view images?
  • How effectively do you think this approach can deal with urgent real-world problems?


03/04/2020 – Palakh Mignonne Jude – Pull the Plug? Predicting If Computers or Humans Should Segment Images

SUMMARY

The authors of this paper aim to build a prediction system that is capable of determining whether the segmentation of images should be done by humans or computers, keeping in mind that there is a fixed budget of human annotation effort. They focus on the task of foreground object segmentation. They utilized image datasets from varied domains, such as the Biomedical Image Library with 271 grayscale microscopy image sets, Weizmann with 100 grayscale everyday object images, and Interactive Image Segmentation with 151 RGB everyday object images, with the aim of showcasing the generalizability of their technique. They developed a resource allocation framework, ‘PTP’, that predicts when it should ‘Pull The Plug’ on machines or humans. They conducted studies on both coarse segmentation and fine-grained segmentation. The ‘machine’ algorithms were selected from among the algorithms currently used for foreground segmentation, such as Otsu thresholding, the Hough transform, etc. The quality predictor was built using a multiple linear regression model. The 522 images from the 3 datasets mentioned earlier were given to crowd workers from AMT to perform coarse segmentation. The authors found that their proposed system was able to eliminate 30-60 minutes of human annotation time.
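
A loose sketch of the two pieces described above: a coarse foreground segmentation from Otsu thresholding (one of the ‘machine’ algorithms mentioned) and a multiple linear regression model that predicts segmentation quality from a few mask-derived features. The features below are stand-ins of my own choosing, not the descriptors the authors actually used.

import numpy as np
from skimage.filters import threshold_otsu
from sklearn.linear_model import LinearRegression

def otsu_foreground(gray_image):
    """Coarse foreground mask via Otsu thresholding (one 'machine' option)."""
    return gray_image > threshold_otsu(gray_image)

def mask_features(mask):
    """Hypothetical features summarizing a candidate segmentation mask."""
    area_frac = mask.mean()                                        # foreground fraction
    boundary = np.abs(np.diff(mask.astype(float), axis=0)).mean()  # edge density
    rows_touched = (mask.sum(axis=1) > 0).mean()                   # vertical extent
    return [area_frac, boundary, rows_touched]

def train_quality_predictor(masks, jaccard_scores):
    """Fit a multiple linear regression mapping mask features to known quality."""
    X = np.array([mask_features(m) for m in masks])
    return LinearRegression().fit(X, np.array(jaccard_scores))

# predictor.predict([mask_features(new_mask)]) then estimates segmentation
# quality; images with low predicted quality are routed to human annotators.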

REFLECTION

I liked the idea of the proposed system, which capitalizes on the strengths of both humans and machines and aims to identify when the skill of one or the other is more suited to the task at hand. It reminded me of reCAPTCHA (as highlighted by the paper ‘An Affordance-Based Framework for Human Computation and Human-Computer Collaboration’), which also utilized multiple affordances (both human and machine) in order to achieve a common goal.

I found it interesting to learn that this system was able to eliminate 30-60 minutes of human annotation time. I believe that if such a system were used effectively, it would enable developers to build systems faster and ensure that human effort is not wasted. I thought it was good that the authors attempted to incorporate variety when selecting their data sets; however, it would have been interesting if they had combined these data sets with a few more that contained more complex images (ones with many objects that could be considered foreground). I also liked that the authors published their code as an open source repository for future extensions of their work.

QUESTIONS

  1. As part of this study, the authors focus on foreground segmentation. Would the proposed system extend well to other object segmentation tasks, or would the quality of the segmentation and the performance of the system be hampered in any way?
  2. While the authors have attempted to indicate the generalizability of their system by utilizing different data sets, the Weizmann and BU-BIL datasets were grayscale images with relatively clear foreground objects. If the images were to contain multiple objects, would the amount of time that this system eliminated be as high? Is there any relation between the difficulty of the annotation task and the success of this system?
  3. Have there been any new systems (since this paper was published) that attempt to build on top of the methodology proposed by the authors? What modifications/improvements could be made to this proposed system (if any improvement is possible)?


03/04/20 – Sukrit Venkatagiri – Toward Scalable Social Alt Text

Paper: Elliot Salisbury, Ece Kamar, and Meredith Ringel Morris. 2017. Toward Scalable Social Alt Text: Conversational Crowdsourcing as a Tool for Refining Vision-to-Language Technology for the Blind. In Fifth AAAI Conference on Human Computation and Crowdsourcing.

Summary:
This paper explores a variety of approaches for supporting blind and visually impaired (BVI) people with alt-text captions. The authors consider two baseline methods: existing computer vision approaches (Vision-to-Language) and Human Corrected Captions. They also consider two workflows that do not depend on CV approaches: the TweetTalk conversational workflow and the Structured Q&A workflow. Based on the questions asked in TweetTalk, they generated a set of structured questions to be used in the Structured Q&A workflow. They found that V2L performed the worst and that, overall, any approach with CV as a baseline did not perform well. Their TweetTalk conversational approach is more generalizable, but it is also difficult to recruit workers for it. Finally, they conducted a study of TweetTalk with 7 BVI people and learned that the participants found it potentially useful. The authors discuss their findings in relation to prior work, as well as the tradeoffs between human-only and AI-only systems, paid vs. volunteer work, and conversational assistants vs. structured Q&A. They also extensively discuss the limitations of this work.

Reflection:
Overall, I really liked this paper and found it very interesting. Their multiple approaches to evaluating human-AI collaboration were interesting (AI alone, human-corrected, human chat, asynchronous human answers), in addition to the quality perception ratings obtained from third-party workers. I think this paper makes a strong contribution, but I wish the authors could have gone into more detail to clarify exactly how the system worked, the different experimental setups, and any other interesting findings. Sadly, there is an 8-page limit, which may have prevented them from doing so.

I appreciate the fact that they built on and used prior work in this paper, namely MacLeod et al. 2017, Mao et al. 2012, and Microsoft’s Cognitive Services API. This way, they did not need to build their own database, CV algorithms, or real-time crowdworker recruiting system. Instead, it allowed them to focus on more high-level goals.

Their findings were interesting, especially the fact that human-corrected CV descriptions performed poorly. It is unclear how satisfaction differed between the various conditions for first-party ratings. It may be because users had context through conversation, but that context was not reflected in their ratings. The results also show that current V2L systems have worse accuracy than human-in-the-loop approaches. Sadly, there was no significant difference in accuracy between HCC and descriptions generated after TweetTalk, but SQA improved accuracy significantly.

Finally, the validation with BVI users is welcome, and I believe more Human-AI work needs to actually work with real users. I wonder how the findings might differ if they were used in a real, social context, or with people on MTurk instead of the researchers-as-workers.

Overall, this was a great paper to read, and I hope others build on this work, similar to how the authors here directly leveraged prior work to advance our understanding of human-AI collaboration for alt-text generation.

Questions:

  1. Are there any better human-AI workflows that might be used that the authors did not consider? How would they work and why would they be better?
  2. What are the limitations of CV that led to the findings in this paper that any approach with CV performed poorly?
  3. How would you validate this system in the real world?
  4. What are some other next steps for improving the state of the art in alt-text generation?


03/04/20 – Sukrit Venkatagiri – Pull the Plug?

Paper: Danna Gurari, Suyog Jain, Margrit Betke, and Kristen Grauman. 2016. Pull the Plug? Predicting If Computers or Humans Should Segment Images. 382–391. 

Summary: 
This paper proposes a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and methods. The framework uses a “pull-the-plug” model, predicting when to use human versus computer annotators. More specifically, the paper proposes a system that intelligently allocates computer effort to replace human effort for initial coarse segmentations. Second, it automatically identifies images for humans to re-annotate by predicting which images the automated methods did not segment well enough. This method could be used for a variety of use cases, and the paper tests it on three datasets and 8 segmentation methods. The findings show that this method significantly outperformed prior work across a variety of metrics, ranging from quality prediction and initial segmentation to fine-grained segmentation and cost.
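
A minimal sketch of the “pull-the-plug” allocation logic as I read it: score every machine-generated segmentation with a predicted quality, then spend the fixed human budget on the images predicted to be worst. The scoring values and budget below are placeholders, not the authors’ implementation.

def allocate_human_effort(predicted_quality, human_budget):
    """Decide which images humans should (re-)segment under a fixed budget.

    predicted_quality: dict mapping image_id -> predicted quality of the
                       machine segmentation (e.g., an estimated Jaccard score).
    human_budget:      number of images humans can afford to annotate.
    """
    ranked = sorted(predicted_quality, key=predicted_quality.get)  # worst first
    human_images = set(ranked[:human_budget])
    machine_images = set(ranked[human_budget:])
    return human_images, machine_images

# Example with made-up scores and a budget of two human annotations:
scores = {"img_a": 0.91, "img_b": 0.42, "img_c": 0.77, "img_d": 0.35}
humans, machines = allocate_human_effort(scores, human_budget=2)
# humans -> {"img_b", "img_d"}; machines -> {"img_a", "img_c"}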

Reflection:
Overall, this was an interesting paper to read, largely focused on performance and accuracy. The paper shows that the methods are superior to prior work and now represent the state of the art for image segmentation on these three datasets, and for saving costs.

I wonder what this paper might have looked like if it was more focused on creativity and innovation, rather than performance and cost-savings. For example, in HCI there are studies of using crowds to generate ideas, solve mysteries, and critique designs. Perhaps this approach might be used in a way that humans and machines can provide suggestions and they build off of each other.

More specifically, related to this paper, I wonder how the results would generalize to datasets other than the three used here, or to real-world examples, perhaps for self-driving cars. Certainly, a lot more work needs to be done, and the system would need to be real-time, meaning human computation might not be a feasible method for self-driving cars. It could, however, certainly be used for generating training datasets for self-driving car algorithms.

This entire approach relies on the proposed prediction module, and it would be interesting to explore other edge cases where the predictions are better made by humans rather than through machine intelligence.

Finally, the finding that the computer segmented images more similarly to experts than crowd workers was interesting, and I wonder why—was it because the computer algorithms were trained on expert-generated training sets? Perhaps the crowd workers would perform better over time or with training. In that case, the results might have been better overall when combining the two.

Questions:

  1. How might you use this approach in your class project?
  2. Where does CV fail and where can humans augment it? What about the reverse?
  3. What are the limitations of a “pull-the-plug” approach, and how can they be overcome?
  4. Where else might this approach be used?
