Summary
Chilton et al.’s paper describes a flexible workflow for creating visual blends that the authors dub “VisiBlends” (do you see what they did there?). Lack of imagination in naming notwithstanding, the workflow takes two concepts as input and proceeds through brainstorming, image finding and annotation, automatic shape matching and blending, and evaluation of the automatic blends. They performed three studies testing the workflow, showing that decentralized groups of people could brainstorm and generate blends through microtasks. The first phase involves associating different words with the input concepts to create a broad base of kinds of images. The second phase involves searching for related imagery. The third phase asks crowdworkers to annotate the found images with basic shapes and the coverage of those shapes. The fourth phase is performed by the AI and involves shape matching between images to combine the two concepts, while the final phase (also performed by the AI) blends the images based on that matching. Their studies confirm that decentralized groups, collaborative groups, and novices can all use this workflow to create visual blends.
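To make the shape-matching phase concrete, here is a minimal, hypothetical sketch of the idea as I understand it, not the authors’ actual code: assume each found image has been annotated by a crowdworker with a dominant basic shape and how much of the object that shape covers, and the system pairs images across the two concepts whose annotations agree. All names and the scoring rule here are my own illustration.

```python
# Hypothetical sketch of the shape-matching stage (not the authors' implementation).
# Assumes crowdworker annotations consist of a dominant shape label and a coverage value.
from dataclasses import dataclass
from itertools import product

@dataclass
class Annotation:
    image_id: str
    concept: str      # which of the two input concepts this image represents
    shape: str        # e.g. "circle" or "elongated"
    coverage: float   # 0.0-1.0, fraction of the object the shape covers

def match_score(a: Annotation, b: Annotation) -> float:
    """Higher is better: shapes must agree, and similar coverage is preferred."""
    if a.shape != b.shape:
        return 0.0
    return 1.0 - abs(a.coverage - b.coverage)

def find_blend_candidates(concept_a, concept_b, threshold=0.7):
    """Pair images across the two concepts whose annotated shapes match well."""
    pairs = [(a, b, match_score(a, b)) for a, b in product(concept_a, concept_b)]
    return sorted((p for p in pairs if p[2] >= threshold),
                  key=lambda p: p[2], reverse=True)
```

The blending phase would then presumably take each high-scoring pair and composite one object into the matched shape region of the other.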
Personal Reflection
I liked this work, overall, as a way for people to get interesting designs out of a few keywords related to whatever they’re working on. I was somewhat surprised that the second step (“Finding Images”) was not an automatic process. When I read the introduction, I had figured this step was automated by image recognition software, since these are not complex images but images of single objects. When it was explained in the Workflow section, though, it became clear that finding images is essentially another phase of the brainstorming process. I was still concerned that it might be a rather complex microtask, since it asked workers to apply several somewhat complex filters and return ten images from those filters.
I thought the images in Figure 7 were somewhat deceptive, however. The caption for that figure states that there was “aesthetic editing by an artist,” which implies they already had a visual designer on hand. If that was the case, why was the expert not performing the expert task? I would have liked to see the actual resulting images, as they do show (some of them) for the later studies.
The refinement process they introduced in the first study was also interesting in that the refinement was more than just asking for more results: the user actually iterated on the design process to find similarly shaped items between the two categories. This shows human intelligence solving a problem that the AI had difficulty with; realizing why the AI was having trouble was a key part of the process.
Lastly, I would have liked to see what would have happened if all three groups (decentralized, collaborative, and novice) had been given the same concepts to generate ideas from. Which group might have performed the best? Also, since the workflow is quite decentralized, I would have liked to see a Mechanical Turk deployment to see how well the crowd could perform this task.
Questions
- As discussed above, an interesting part of this paper was how human intelligence was employed to refine the AI’s process, thereby giving it better inputs. Are there other cases where human insight into why an AI is having trouble is a good way to solve the problem?
- When testing a workflow that creates microtasks like this, is it more helpful to recruit participants who come into a lab or to use a crowdworking site like Mechanical Turk? Should both be done? Why or why not?
- Would you use a service like this for a creative image in your work? Why or why not?