04/29/2020 – Myles Frantz – VisiBlends: A flexible workflow for visual blends

Summary

Advanced graphic design techniques such as visual blending are a difficult art to master, let alone for a single person to understand and create on their own. Producing designs that reach the widest market and earn the best public reception typically requires time, effort, and a team of experts to ensure quality. Aiming to lower this barrier of entry for younger or smaller companies, this team created VisiBlends, a crowd-sourced framework that eases the burden on design teams. The framework breaks the task of creating a visual blend into six stages so that validation happens throughout the whole process. Across the studies the team ran to measure the results, the majority of blends were successful, establishing the framework as a potential tool for assisting visual blending from start to finish.

Reflection

I think this problem domain is similar to the research problem domain in various respects. Researchers look at the bleeding edge of technology and aim to solve new or old problems with new techniques. These techniques are usually accumulated from various studies and other methods, or from a new method that uses existing technology to extend previous work. It is interesting to see how the process of creating a new research idea could also be applied across the six steps of this team's framework, with lab mates, an advisor, and a committee filling in for the crowd-sourced workers in the original framework.

I do appreciate the machine learning algorithm that learns to match the two different pictures (or ideas). This gives users several starting points while the system continuously learns better matches. I believe this could be improved by allowing a human-in-the-loop factor or an override, however; a rough sketch of that idea follows. In this kind of work the art is continuously being improved in real time, and to keep up, design artists may have a better sense of emerging ideas and could act as an early wave, feeding the algorithm with new results.
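
As a rough sketch of this pairing-plus-override idea (purely illustrative: the shape tags, scoring, and object names below are my own assumptions, not VisiBlends' actual matching algorithm), the ranking could look something like this, with a designer able to force a specific pair:

```python
# Hypothetical sketch of "rank candidate pairs, but let a human override".
# The shape tags and scoring are illustrative assumptions only.

def shape_compatibility(obj_a, obj_b):
    """Score how blendable two objects are based on simple shape tags."""
    score = 0.0
    if obj_a["shape"] == obj_b["shape"]:
        score += 1.0                      # same basic silhouette (e.g., sphere)
    if abs(obj_a["aspect_ratio"] - obj_b["aspect_ratio"]) < 0.2:
        score += 0.5                      # similar proportions blend more cleanly
    return score

def suggest_pairs(concept_a, concept_b, override=None):
    """Rank candidate pairs, but let a designer force a specific pairing."""
    if override is not None:
        return [override]                 # human-in-the-loop override wins
    pairs = [(a, b) for a in concept_a for b in concept_b]
    return sorted(pairs, key=lambda p: shape_compatibility(*p), reverse=True)

orange = {"name": "orange", "shape": "sphere", "aspect_ratio": 1.0}
globe = {"name": "globe", "shape": "sphere", "aspect_ratio": 1.0}
banana = {"name": "banana", "shape": "cylinder", "aspect_ratio": 3.0}

for a, b in suggest_pairs([orange, banana], [globe])[:1]:
    print(f"Suggested blend: {a['name']} + {b['name']}")
```

Here the override simply short-circuits the ranking, which is one lightweight way to keep the artist's judgment ahead of the model while its suggestions continue to improve.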

Questions

  • Visual blends are potentially a factor when marketing a new item or product. Blending two (potentially very different) objects helps grab the attention of passersby who only give it a split second while scrolling past a story or walking past an ad. Have you had any experience creating a visual blend for one of your products, and did you use this technique to reach a specific demographic? 
  • One of the perceived problems throughout visual blending is finding the specific pairing of elements that can be related to each other. Notably, in this study this was overcome by obtaining a consensus (across the Mechanical Turk workers) on the first idea that came to mind for a concept. Do you think this could be alleviated by pairing another machine learning algorithm with the stream of information from a social media website (accounting for the social consensus of a group)? 
  • This technique could potentially remove part of the need for specific marketing teams who specialize in visual blending. Do you think this program could become widespread or integrated with another bigger system already in use by another company? 


04/29/2020 – Myles Frantz – IdeaHound: Improving large-scale collaborative ideation with crowd-powered real-time semantic modeling

Summary

Creating solutions for everyday problems requires out-of-the-box thinking. For common problems that people have already worked through, this is difficult because many solutions and workarounds have already been created. Crowd-sourcing these solutions is bound to run into redundancy issues if there is no arbiter to ensure forward momentum. To remove the need for such an arbiter while still ensuring fresh ideas are generated throughout the system, the team created IdeaHound, a shared platform for ideation. On this platform, notes (symbolizing new ideas or small excerpts) from other participants can be retrieved through a similarity ranking. 
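
As a minimal sketch of what a similarity ranking over notes could look like (IdeaHound actually derives its semantic model from participants' own spatial arrangements of ideas, so the TF-IDF cosine similarity and example notes here are only stand-in assumptions):

```python
# Minimal sketch of ranking other participants' notes by similarity to a query
# idea. TF-IDF cosine similarity is a stand-in, not IdeaHound's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

notes = [
    "reuse rainwater for watering office plants",
    "gamify recycling with a leaderboard",
    "collect rainwater from the roof for the garden",
]
query = "store rainwater to irrigate the community garden"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(notes + [query])        # last row is the query
query_vec, note_vecs = matrix[len(notes)], matrix[:len(notes)]
scores = cosine_similarity(query_vec, note_vecs)[0]       # query vs. every note

# Show the notes most similar to the current idea, best match first.
for idx in scores.argsort()[::-1]:
    print(f"{scores[idx]:.2f}  {notes[idx]}")
```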

Reflection

I can greatly appreciate the effort the team has put into solving this problem. From working within a team of technologically advanced peers, I have been able to collaborate with and learn greatly from each of the members. Unfortunately, for many teams (teams spread across the globe, or teams working remotely as under the coronavirus work-from-home orders) this kind of open idea generation is greatly limited by the technology being used. Whiteboard tools like Draw.io and Microsoft Whiteboard are great but take extra time to adapt to and write things out in. A design like this, hosted on a company's platform, could provide greater collaboration across teams for potential side projects or for shaping a better future direction for the company. 

I do appreciate how simple and relatively familiar the team's design of the platform is. Reusing a sticky-note design similar to the one in Windows (at least since version 7) is a great way to lower the ramp-up time a user needs to learn the system. Despite this, the team reported that workers needed considerable mental effort to accomplish the tasks they were given. It is interesting to see that even with a familiar design there is still the "writer's block" or mental stopping point that is common to many people.

Questions

  • Collaboration is a great way for ideas to be shared and improved, and even more so when the participants have different experiences and backgrounds that bring new perspectives to an idea. What is the best idea-generating collaboration you have had so far (in a professional or research environment)? 
  • Unfortunately, as described earlier, a multitude of problems can arise when teams of people work or interact remotely. These problems are only exacerbated by the recent stay-at-home orders across the globe due to the coronavirus, which typically permit only essential workers on site. What types of issues (technological or not) have you experienced working with someone through technology? 
  • Given how widespread the coronavirus is and how it may continue to spread, this situation is predicted to last until at least August of 2020. Even then, people will likely be hesitant to go straight back to work, and even more so to openly work together. Would you use this technology to help brainstorm with others in your field (or in general)? 


04/22/20 – Myles Frantz – The Knowledge Accelerator: Big Picture Thinking in Small Pieces

Summary

Maintaining a public, open-source website can be difficult since the website is supported by individuals who are not paid. This team investigated using a crowd-sourcing platform not only to support such a site but also to create its articles. These articles (or tasks) were broken down into micro-tasks that were manageable and scalable for the crowd workers. These micro-tasks were embedded within other HITs, and workers were shown existing contributions to ease any reluctance about editing other crowd workers' work. 

Reflection

I appreciate the competitive nature of comparing both supervised learning (SL) and reinforcement learning (RL) agents in the same type of game scenario, where each tries to help the human succeed by aiding them as best as it can. However, I take issue with one of their contributions: the relative comparison between the SL and RL bots. In their contributions they explicitly say they find "no significant difference in performance" between the different models. While they go on to describe the two methods as performing approximately equally, their own reported data shows one model doing better in most measurements. In Table 1 (the comparison of humans working with each model), SL is reported as having a slightly better Mean Rank (lower is better) and Mean Reciprocal Rank (higher is better). In Table 2 (the comparison of the various team pairings), there was only one scenario where the RL model performed better than the SL model. Lastly, even in the participants' self-reported perceptions, the SL model was rated worse in only 1 of 6 categories. Though the differences may be small, their diction downplays part of the argument they are making. I admit that the SL model having a better Mean Rank by 0.3 (from the Table 1 MR difference or the Table 2 Human row) does not appear to be a big difference, but I believe part of their contribution statement, "This suggests that while self-talk and RL are interesting directions to pursue for building better visual conversational agents…", is not an accurate description, since their own data points the other way. 

Questions

  • I admit I focus on the representation of the data and the delivery of their contributions, while they focus on the human-in-the-loop aspect. Still, within a machine learning setting I imagine a decrease in accuracy of 0.3 (approximately 5%) would not be described as insignificant. Do you think their wording truly represents the machine learning significance of the result? 
  • Do you think more Turk workers (they used data from at least 56 workers) or adding age requirements would change their data? 
  • Though evaluating the quality of collaboration between humans and AI is imperative to ensure AI systems are built adequately, there often seems to be a disparity between that kind of human-AI comparison and AI-with-AI comparison. Given this disconnect, their statement about progress between the two collaboration setups seems like a fundamental observation. Do you think this work is more idealistic or more fundamental in its contributions? 


04/22/20 – Myles Frantz – Opportunities for Automating Email Processing: A Need-Finding Study

Summary

Email is a formalized standard used throughout companies, colleges, and schools. It is also steadily used as documentation within companies, keeping track of requirements. Since email is being used for increasingly many purposes, people have more and more uses for it. The team therefore studied the various ways people use email and looked for a more integrated way to automate email rules. Based on a thorough survey, the team created a domain-specific language; integrated with the Internet Message Access Protocol (IMAP), it lets users create more explicit and dynamic rules. 

Reflection

Working within a company, I can greatly appreciate the granularity of the provided framework. Within companies, emails are used as a kind of rolling documentation. This rolling documentation is in line with Agile, as it captures new requirements added later in a story. Creating very specific rules for certain scrum masters may be necessary to generate reminders for the rest of the team. Continuing the automation into tooling could also lead to a more streamlined deployment pipeline, for example letting an email from the release manager signal a release. Despite the wide acceptance of email, there are also tools like Mattermost that offer more direct integration, largely because of the open application programming interface (API) that Mattermost provides. Despite the tooling Google and Microsoft build around email, the open-source community provides a faster platform for sharing this kind of information. 

In addition to the rules provided through the interfaces, I believe the Python email interface is an incredible extension for automating email. The labeling systems provided within many email clients are limited to rudimentary rules. Integrating richer rules could create better reminders in a school setting or in an advisor-advisee relationship; for example, a reminder rule could help surface messages about grants or ETD deadlines, as sketched below. Since these rules are written in Python, they can be shared among lab groups to ensure that required emails are automatically managed. Instead of being limited to a single restricted rule language, users get Python, which the IEEE top programming languages survey ranks as the most popular language. 
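
As a hedged illustration of the kind of rule described above (this is plain Python with the standard imaplib and email modules, not the paper's actual domain-specific language; the host, accounts, and keywords are placeholders):

```python
# A minimal sketch of a reminder rule: flag unread messages from an advisor
# that mention a deadline. Host, account, and sender are placeholder values.
import imaplib
import email

with imaplib.IMAP4_SSL("imap.example.edu") as box:
    box.login("student@example.edu", "app-password")
    box.select("INBOX")

    # Server-side search keeps the rule cheap: unread mail from one sender.
    _, data = box.search(None, '(UNSEEN FROM "advisor@example.edu")')
    for num in data[0].split():
        _, msg_data = box.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        subject = msg.get("Subject", "")

        # The "rule": flag anything that looks deadline-related for follow-up.
        if any(word in subject.lower() for word in ("grant", "etd", "deadline")):
            box.store(num, "+FLAGS", "\\Flagged")
            print(f"Flagged reminder: {subject}")
```

Because the rule is ordinary Python, a lab group could keep a shared repository of such scripts and run them on a schedule against each member's mailbox.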

Questions

  • Utilizing a common standard ensures there is a good interface for people to learn and get used to across different technologies and companies. Do you think Python scripting is a more approachable interface than other rule languages for non-computer-science users? 
  • The Python language can be used on various platforms thanks to its libraries, and many Python programs are extensible to other systems through an application programming interface. Given this potential for integration, what other systems do you think this email system could be integrated with? 
  • This system was created by adapting current technology: it builds on the common Internet Message Access Protocol, the fundamental mail protocol, so it is adaptable to the servers already in use. What kinds of rules would you integrate with your university email? 


04/15/2020 – Myles Frantz – Algorithmic accountability

Summary

With the prevalence of technology, the mainstream programs that drive its rise dictate not only its technological impact but also the direction of news media and people's opinions. As journalists turn to various outlets and adapt to the efficiency created by technology, the technology they use may introduce bias through its internal sources or optimizations and therefore introduce bias into their stories. This team assessed algorithms against four categories: prioritization, classification, association, and filtering. Using a combination of these categories, they also ran a user survey to measure how autocomplete features bias people's opinions. Through these measurements the team also determined that popular search engines like Google specifically tailor results based on information the user has previously searched. For a normal user this makes sense; for an investigative journalist, however, these results may not accurately represent a source of truth. 

Reflection

As noted by the team, there is a strong tension around the transparency of an algorithm. Some of this opacity may be due to government concerns about keeping certain things secret, which creates a strong sense of resistance and distrust toward the use of certain algorithms. Though such secrets are claimed in the name of national security, the term can be misused or overstretched for personal or political gain in ways that are not correctly appropriated. These kinds of acts can occur at any level of government, from the lowest of actors to the highest of ranks. 

One of the key discussion points raised by the team for fixing this potential bias in independent research is to teach journalists how to better use computer systems. This may only bridge journalists to a new medium they are not yet familiar with; it could also be seen as merely a stopgap that helps journalists better understand a truly fragmented news system. 

Questions

  • Do you think introducing journalists to a computer science program would extend their capabilities, or would it only further channel their ideas while potentially removing some creativity? 
  • Since there is a kind of monopolization throughout the software ecosystem, do you believe people are "forced" to use technologies that tailor their results? 
  • Given how much technology uses personal information in ways open to misuse, do you agree with this information being accompanied by a small disclaimer acknowledging the potential for preferential results? 
  • There are many services that offer to clean your internet trail and clear whatever internet services have cached to provide faster and more tailored search results. Have you personally used any of these programs or step-by-step guides to clean your internet footprint? 
  • Many programs capture and record user activity, with only a small disclaimer at the end detailing how the data is used, and it is likely many users never read it. Do you think that if normal consumers of technology saw how heavily corrected and automatically biased their results could be, they would continue using these services? 


04/15/20 – Myles Frantz – Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking

Summary

In this very politically charged time, it is hard for the average person to pull accurate information out of the news media, and the media sources producing contradictory information make it even more difficult. Alongside the variety of companies running fact-checking services, this team created a fact-checking system that mixes crowd sourcing and machine learning: a machine learning model sits behind a user interface that lets Mechanical Turk workers tweak the reputation of sources and whether citations support a claim. These tools let a user adjust the retrieved sources and read the raw information behind them. The team also created a gamified interface allowing better and more integrated use of the original system. Overall, participants appreciated the ability to tweak the sources and to see the raw sources supporting or refuting a claim. 

Reflection

I think there is an inherent issue with the gaming experiment the researchers created, not because of the environment itself but because of human nature. With a gamified method, I believe humans will inherently try to game the system. This may be manageable at the small scale of their research experiment, but it would be more restrictive in other use cases. 

I do not believe a crowd-worker fact-checking service will work. A fact-checking service that is crowd sourced is an easy target for any group of malicious actors. Using a variety of common techniques, such as distributed denial-of-service (DDoS) attacks, actors have overwhelmed systems and controlled the majority of responses; similar majority-style attacks have been used to control blockchain transactions and the flow of money. A fully fledged crowd-sourced fact-checker could likewise be prone to being overridden by such actors. 

In general, I believe allowing users more visibility into a system encourages more usage. When using some program or Internet of Things (IoT) device, people likely feel as though they do not have much control over its internal workings. Creating this insight and slight control over the algorithm may help give consumers of these devices a greater sense of control, and that sense of control may help encourage people to put their trust back into these programs. This is likely aided by the nature of machine learning algorithms and their iterative learning process. 
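
To make the slider idea concrete, here is a small sketch of how a claim score might be re-aggregated when a user adjusts a source's reputation weight; the weighting scheme, source names, and numbers are assumptions for illustration, not the paper's actual model:

```python
# Illustrative sketch of slider-style control: a claim score recomputed when
# the user overrides a source's reputation weight. Not the paper's model.

def claim_score(sources, reputation_overrides=None):
    """Weighted average of stances (-1 refutes, +1 supports) by reputation."""
    overrides = reputation_overrides or {}
    weighted, total = 0.0, 0.0
    for s in sources:
        rep = overrides.get(s["name"], s["reputation"])  # slider can override
        weighted += rep * s["stance"]
        total += rep
    return weighted / total if total else 0.0

sources = [
    {"name": "outlet-a", "reputation": 0.9, "stance": +1},
    {"name": "outlet-b", "reputation": 0.4, "stance": -1},
    {"name": "outlet-c", "reputation": 0.7, "stance": +1},
]

print(claim_score(sources))                       # system default
print(claim_score(sources, {"outlet-a": 0.1}))    # user drags one slider down
```

Exposing even a toy aggregation like this shows the user why the score moved, which is the kind of visibility and slight control discussed above.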

Questions

  • Mechanical Turk workers' attention is usually measured by inserting a baseline (attention-check) question; workers who are not paying attention (i.e., clicking as fast as they can) will fail to answer it correctly. Given that the team did not discard these workers, do you think removing their answers would have supported the team's conclusions? 
  • Along the same line of questioning, even though the team treated the users' other interactions as a measure of attentiveness, do you think it was wise to ignore the attention check? 
  • Within your project, are you planning on implementing a slider like this team did to help users interact with your machine learning algorithm? 


04/08/2020 – Myles Frantz – CrowdScape: Interactively visualizing user behavior and output

Summary

Crowd sourcing provides a quick and easily scalable way to request help from people, but how do you ensure workers are properly paying attention instead of cheating in some way? Since tasks are handed off through a platform that abstracts away the assignment of work, requesters cannot guarantee participants' full attention. This is why the team created CrowdScape: to better track the attention and focus of participants. Utilizing various JavaScript libraries, CrowdScape keeps track of participants through their interactions, or lack thereof, recording mouse clicks, keystrokes, and browser focus changes. Since Amazon Mechanical Turk is a web-based platform, JavaScript libraries are well suited to capturing this information. Through various visualization libraries, the team demonstrates views that give requesters extra insight, showing, for example, whether a worker only clicks rapidly and switches windows quickly or stays focused on the same window.
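
As an illustration of what could be derived from such traces (the event format and feature names below are assumptions for the sketch, not CrowdScape's actual log schema), a requester might reduce each worker's trace to a few behavioral features before visualizing them:

```python
# Sketch of turning a captured event trace into simple behavioral features,
# similar in spirit to what CrowdScape visualizes. The event format is assumed.
from collections import Counter

# (seconds since task start, event type)
trace = [
    (0.5, "focus"), (1.2, "click"), (1.3, "click"), (1.4, "click"),
    (5.0, "blur"), (9.0, "focus"), (12.0, "keypress"), (30.0, "submit"),
]

counts = Counter(kind for _, kind in trace)
duration = trace[-1][0] - trace[0][0]

features = {
    "clicks_per_minute": 60 * counts["click"] / duration,
    "keypresses": counts["keypress"],
    "focus_changes": counts["blur"] + counts["focus"],   # window switching
    "total_seconds": duration,
}
print(features)  # rapid clicking plus frequent blur/focus hints at a rushed worker
```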

Reflection

I do appreciate the kind of insight this provides when delegating work. I have mentored various workers in some of my past internships, and it has at times been stressful. Some of the more professional workers are easier to manage, but with others it can take more time to manage and teach them than to do the work myself. Being able to do this automatically and discard the work of inattentive participants provides a lot of freedom, since requesters cannot directly oversee the participants as they work.

I do, however, strongly disagree with how much information is being tracked and requested. As a strong proponent of privacy, I think the browser is not the best place to inject programs that watch a participant's session and information. Though the tracking is limited to the browser, other session information, such as cookies, IDs, or UIDs, could potentially be accessed. Even if CrowdScape itself does not collect it, other live JavaScript could track that information alongside the CrowdScape program.

Questions

  • One of my first concerns with this type of project is the amount of privacy invasion. Though it makes sense to ensure the worker is working, there is always the potential for leaks of confidential information. Even if key tracking is limited to the time when the participant has the browser window focused, do you think this would be a major concern for participants?
  • Throughout the cases studied in the team's experiments, it seemed most of the discarded participants were removed because they were using some other tool or external help. Do you think as many people would be discarded in real experiments for similar reasons?
  • Alongside the previous question, is it overreaching, in the sense that it could potentially discredit workers simply for having different working habits than expected?


04/08/2020 – Myles Frantz – Agency plus automation: Designing artificial intelligence into interactive systems

Summary

Throughout the field of artificial intelligence, many recent research efforts aim to fully automate tasks, ignoring the jobs that would be automated away. To keep progress shared between the two sides, this team built on their previous work to create three popular and distinct tools that both visualize data and support collaboration between workers and machine learning. For data analysts, where there have already been efforts to automate cleaning of raw data, one of the team's projects visualizes the data in a loose Excel-like table and suggests transformations for the various cells. Going further into data analysts' needs, they adapted more of their tools so that copied data automatically yields suggested visualizations and tables to better graph the information. To support predictive suggestion, the team produces multiple suggestions from which users choose the one they believe is correct, further improving the algorithm.
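
As a toy sketch of this suggest-then-confirm loop (an illustration in Python/pandas, not the authors' actual tools; the column names and heuristics are assumptions), a helper might propose transformations and apply only those the analyst accepts:

```python
# Toy sketch of "the tool proposes a transformation, the analyst accepts it".
# Illustrative only; not the authors' implementation.
import pandas as pd

df = pd.DataFrame({"name": ["Doe, Jane", "Smith, John"], "score": ["85", "92"]})

def suggest_transforms(frame):
    """Propose simple cleanups based on what the columns look like."""
    suggestions = []
    if frame["name"].str.contains(",").all():
        suggestions.append(("split 'name' into last/first on ','",
                            lambda f: f.join(f["name"].str.split(", ", expand=True)
                                             .rename(columns={0: "last", 1: "first"}))))
    if frame["score"].str.isnumeric().all():
        suggestions.append(("convert 'score' to numeric",
                            lambda f: f.assign(score=pd.to_numeric(f["score"]))))
    return suggestions

accepted = {"convert 'score' to numeric"}     # pretend the analyst accepted this one
for description, transform in suggest_transforms(df):
    print("suggestion:", description)
    if description in accepted:               # the analyst stays in the loop
        df = transform(df)
print(df.dtypes)
```

The point of the sketch is the interaction pattern: the system does the tedious detection and the human keeps the final say, which matches the agency-plus-automation framing of the paper.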

Reflection

Being a proponent of simplicity of design, I appreciate how simple and connected their applications can be. Among today's modularized programs, unless an application programming interface is available, connecting various applications usually has to be done through standardized outputs that someone, or another external application, can edit or adapt. Being able to directly enable suggestions during data validation and connect them to an advanced graphing utility that also suggests new charts and tools is a welcome change.

I also appreciate how applicable their research is. Though not completely unique, creating usable applications greatly expands how far a project will stretch and be used; if it can be used directly and easily, the likelihood increases that it will be extended and adopted in public projects.

Questions

  • For the data analyst role, this may have helped but likely has not alleviated all of the tasks an analyst handles throughout an agile cycle, let alone a full feature. What other positions could be supported by these sorts of tools?
  • Among all of the tool sets available, this may be one of many on GitHub. Having a published paper may improve the program's odds of being used in the future, but it does not necessarily translate into a well-maintained and widely used project. Ignoring the technical factors (such as technical expertise, documentation, and programming style), do you think this program will become a popular project on GitHub?
  • By using standardized languages, the teams were able to build a higher abstraction that allows direct communication between the applications. Though this makes things easier for the development team, it may be more restrictive for any outside tools looking to extend or communicate with theirs. Do you think their domain-specific languages were required for this set of tools, or were they created mainly to help the developers connect their own applications?


03/25/20 – Myles Frantz – All Work and No Play?

Summary

Regardless of anyone's personal perception of chatbots, with around 1.4 billion people using them (smallbizgenius), their impact cannot be ignored. Intended to answer rudimentary (often duplicated) questions, many of these chatbots are focused on the question-and-answer (QA) domain. Across these, how chatbots are used and perceived varies with the user, the chatbot, and the overall interaction. Focusing on the human-centric side of the conversation, the team proposed a conversational agent (CA, a chatbot within the QA domain) along with a method to inspect sections of the conversation and determine whether the user enjoyed it. By introducing a hierarchy of specific natural language classifiers (NLCs), the team could use certain classifications or signals to derive a high-level abstraction of a message or conversation. While the CA did its job sufficiently, the team determined through this signal methodology that approximately 84% of people engaged in some sort of conversation with the CA outside of a normal question-and-answer scenario.
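
As a minimal sketch of the kind of signal such classifiers might produce (the keyword lists below are illustrative assumptions and far simpler than the hierarchy of classifiers the team built), a single signal could separate on-task questions from playful chit-chat:

```python
# Sketch of a single conversational-act signal: on-task vs. playful messages.
# Keyword lists are illustrative assumptions, not the team's trained classifiers.
PLAYFUL_SIGNALS = {"lol", "haha", "thanks", "who are you", "are you human", "joke"}
ON_TASK_SIGNALS = {"how do i", "where is", "policy", "password", "benefits"}

def classify(message):
    text = message.lower()
    if any(signal in text for signal in PLAYFUL_SIGNALS):
        return "playful"
    if any(signal in text for signal in ON_TASK_SIGNALS):
        return "on-task"
    return "unknown"

for msg in ["How do I reset my password?", "Haha, are you human?"]:
    print(classify(msg), "-", msg)
```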

Reflection

I am surprised at the results gleaned from this study. While I should not be, and should assume that the closer a CA (and AI in general) gets to appearing human-like the better the interaction will be, the percentage of "playful" or conversational messages still seemed relatively high. This may be due to the participant group (new hires out of college), though it is a promising sign of the progress being made.

I appreciate the aspect (or angle) this research took. Having a strong technical background, my immediate thought is to ensure all the questions are answered correctly and to investigate how the agent could be integrated with other systems (like a Jenkins Slack bot polling a project's status). The reach of a project, I believe, depends not only on how usable it is but also on how user-friendly it is. Take MySpace and Facebook: Facebook created a much easier-to-use and more user-centric experience (based on connecting people), while MySpace suffered from a lack of improvement on both fronts and continues to decline in usage.

Questions

  • With only 34% of the participants responding to the survey, do you think a higher response rate would have reinforced and backed up the data currently collected?
  • Given the general maturity and free time a new hire from college has, do you think current employees (who have been with the company for a while) would show this same percentage of conversational messages? In short, do you think busier or more senior employees would have given similar conversational time to the CA?
  • Given the percentage of new hires that responded conversationally to the CA, the opportunity arises for a user to chat freely and disregard current work in favor of a conversation (potentially as a form of escapism). Do you think that if this kind of CA were implemented throughout companies, these capabilities would be abused or overused?


03/25/20 – Myles Frantz – Evaluating Visual Conversational Agents via Cooperative Human-AI Games

Summary

While the general public may not realize how much they interact with AI, people depend on it throughout the day, whether in a conversation with a phone assistant or in the backend of the bank they use. While comparing AI against its own metrics is imperative to ensure quality, this team compared how two different models performed when working in collaboration with humans. To ensure a valid and fair comparison, they used a simple game (similar to Guess Who) in which the AI has to work with another AI or with humans to guess a selected image through a series of questions. Though the AI-AI collaboration produces good results, the AI-human collaboration is comparatively weaker.

Reflection

I appreciate the competitive nature of comparing both supervised learning (SL) and reinforcement learning (RL) agents in the same type of game scenario, where each tries to help the human succeed by aiding them as best as it can. However, I take issue with one of their contributions: the relative comparison between the SL and RL bots. In their contributions they explicitly say they find "no significant difference in performance" between the different models. While they go on to describe the two methods as performing approximately equally, their own reported data shows one model doing better in most measurements. In Table 1 (the comparison of humans working with each model), SL is reported as having a slightly better Mean Rank (lower is better) and Mean Reciprocal Rank (higher is better). In Table 2 (the comparison of the various team pairings), there was only one scenario where the RL model performed better than the SL model. Lastly, even in the participants' self-reported perceptions, the SL model was rated worse in only 1 of 6 categories. Though the differences may be small, their diction downplays part of the argument they are making. I admit that the SL model having a better Mean Rank by 0.3 (from the Table 1 MR difference or the Table 2 Human row) does not appear to be a big difference, but I believe part of their contribution statement, "This suggests that while self-talk and RL are interesting directions to pursue for building better visual conversational agents…", is not an accurate description, since their own data points the other way.
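
For reference, Mean Rank and Mean Reciprocal Rank are computed as below (lower MR is better, higher MRR is better); the ranks used here are made-up examples, not the paper's data:

```python
# Standard definitions of Mean Rank (MR) and Mean Reciprocal Rank (MRR).
# The rank lists are illustrative, not taken from the paper.

def mean_rank(ranks):
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    return sum(1 / r for r in ranks) / len(ranks)

sl_ranks = [1, 2, 3, 2]   # rank of the target image after chatting with the SL bot
rl_ranks = [1, 3, 3, 3]   # rank of the target image after chatting with the RL bot

print(f"SL: MR={mean_rank(sl_ranks):.2f}  MRR={mean_reciprocal_rank(sl_ranks):.2f}")
print(f"RL: MR={mean_rank(rl_ranks):.2f}  MRR={mean_reciprocal_rank(rl_ranks):.2f}")
```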

Questions

  • I admit I focus on the representation of the data and the delivery of their contributions, while they focus on the human-in-the-loop aspect. Still, within a machine learning setting I imagine a decrease in accuracy of 0.3 (approximately 5%) would not be described as insignificant. Do you think their wording truly represents the machine learning significance of the result?
  • Do you think more Turk workers (they used data from at least 56 workers) or adding age requirements would change their data?
  • Though evaluating the quality of collaboration between humans and AI is imperative to ensure AI systems are built adequately, there often seems to be a disparity between that kind of human-AI comparison and AI-with-AI comparison. Given this disconnect, their statement about progress between the two collaboration setups seems like a fundamental observation. Do you think this work is more idealistic or more fundamental in its contributions?
