Authors: Kotaro Hara, Vicki Le, and Jon Froehlich
Summary
This paper discusses the feasibility of using Amazon Mechanical Turk (AMT) crowd workers to label sidewalk accessibility problems in Google Street View imagery. The authors created ground-truth datasets with the help of wheelchair users and found that Turkers labeled the imagery with 81% accuracy. The paper also evaluates quality-control and improvement methods, which proved effective, raising accuracy to 93%.
Reflection
This paper reminded me of Jeff Bigham’s quote: “Discovery of important problems, mapping them onto computationally tractable solutions, collecting meaningful datasets, and designing interactions that make sense to people is where HCI and its inherent methodologies shine.” It’s a great example of two important things mentioned in the quote: a) discovery of important problems, and b) collecting meaningful datasets. The authors note that the collected datasets will be used to build computer vision algorithms, and the paper’s workflow involves potential end users (wheelchair users) early in the process. Further, the paper attempts to use Turkers to generate datasets comparable in quality to those produced by the wheelchair users, essentially setting a high quality bar for generating potential AI training datasets. This is a desirable approach, as it can help prevent the kinds of problems in popular datasets outlined at https://www.excavating.ai/.
The paper also proposed two generalizable methods for improving data quality from Turkers: filtering out low-quality workers by seeding gold-standard data into the task stream during collection, and aggregating labels from multiple workers by majority vote. Seeding in gold-standard data may require designing modular workflows, but the time investment may well be worth it; a minimal sketch of both techniques follows.
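As a concrete illustration, here is a minimal Python sketch of gold-standard filtering and majority-vote aggregation, assuming labels arrive as (worker_id, image_id, label) tuples. The record format, the 0.8 accuracy threshold, the label names, and the function names are my own hypothetical choices for the sketch, not details from the paper.

```python
from collections import Counter, defaultdict

def filter_workers(labels, gold, min_accuracy=0.8):
    """Keep workers whose accuracy on seeded gold-standard items
    meets a (hypothetical) threshold. `gold` maps image_id -> correct label."""
    hits = defaultdict(int)   # correct answers per worker on gold items
    seen = defaultdict(int)   # gold items each worker labeled
    for worker, image, label in labels:
        if image in gold:
            seen[worker] += 1
            hits[worker] += int(label == gold[image])
    return {w for w in seen if hits[w] / seen[w] >= min_accuracy}

def majority_vote(labels, trusted):
    """Aggregate each image's labels from trusted workers by majority vote."""
    votes = defaultdict(Counter)
    for worker, image, label in labels:
        if worker in trusted:
            votes[image][label] += 1
    return {image: counts.most_common(1)[0][0]
            for image, counts in votes.items()}

# Toy example (all data hypothetical):
labels = [
    ("w1", "img1", "curb_ramp_missing"),
    ("w2", "img1", "curb_ramp_missing"),
    ("w3", "img1", "no_problem"),
    ("w1", "gold1", "obstacle"),
    ("w3", "gold1", "no_problem"),
]
gold = {"gold1": "obstacle"}
trusted = filter_workers(labels, gold)      # {"w1"}; w2 never saw a gold item
consensus = majority_vote(labels, trusted)  # {"img1": "curb_ramp_missing", ...}
```

One design wrinkle the sketch glosses over: workers who never encounter a gold item (like w2 above) are silently excluded, so in practice the seeding rate has to be high enough that every worker is screened.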
It’s great to see how this work evolved to form the basis of Project Sidewalk, a live project where volunteers map sidewalk accessibility in their neighborhoods.
Questions
- What’s your usual process for gathering datasets? How is it different from this paper’s approach? Would you be willing to involve potential end-users in the process?
- What would you do to ensure quality control in your AMT tasks?
- Do you think collecting more fine-grained data for training CV algorithms comes at the cost of an interface that is no longer simple enough for Turkers?