11/7 Reflection 6

Summary

In the paper “Visualizing Email Content: Portraying Relationships from Conversational Histories”, the authors document the ideation, algorithm design, and content analysis of their application Themail. The purpose of this program is to give users a photo-album type visualization of their relationship with another person using the user’s archived e-mails with that person. Themail takes the most frequently used words exchanged by e-mail between the pair, checks their uniqueness (whether they are also frequently exchanged with many other people), and uses an algorithm to arrange the words in one of two meaningful formats.

  • Needle Mode: In the foreground, vertical stacks of words are placed on a timeline by month. Each stack consists of the most frequent/unique words exchanged during that specific month. This provides a more detail-oriented exploration of one’s e-mail archives. Despite its capability for searching these details, only 20% of users took greater interest in this mode.
  • Haystack Mode: In the background, larger words float faintly but visibly. These words represent the most frequent/unique words exchanged over the course of the year being represented. This provides a more “big picture” exploration of trends and themes. 80% of users utilized this mode the most, indicating an interest in the greater relationship they had with people and the aesthetic quality of the application.

The algorithm used to generate word values and populate the screen was based on an earlier algorithm that scores words by their relative frequency in one document out of a collection. The team extended this concept by comparing subsets of e-mails against supersets so that the score takes into account not only relative frequency but also relative uniqueness. The equations for yearly and monthly word values are nearly identical, with the key difference being that yearly word frequencies are cubed in order to increase their overall weight.
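To make the scoring idea above concrete, here is a minimal sketch in Python. It is not the paper's actual equation: the function name `score_words`, the `power` parameter, and the log-based inverse term are all assumptions chosen for illustration. It only shows the general shape of weighting a word's frequency in one conversation by how few other correspondents also use it, with an optional exponent standing in for the cubing of yearly frequencies.

```python
import math
from collections import Counter


def score_words(pair_counts, others_counts, power=1):
    """Score words from one conversation against the rest of an archive.

    pair_counts:   Counter of word counts in e-mails exchanged with one person.
    others_counts: list of Counters, one per other correspondent.
    power:         hypothetical exponent on the raw frequency; power=3
                   imitates the cubing described for yearly words.
    """
    n_groups = len(others_counts) + 1  # this conversation plus all the others
    scores = {}
    for word, freq in pair_counts.items():
        # Number of correspondents (including this one) whose mail has the word.
        groups_with_word = 1 + sum(1 for c in others_counts if word in c)
        # Assumed inverse term: words shared with everyone get the minimum, 1.0.
        inverse = math.log(n_groups / groups_with_word) + 1.0
        scores[word] = (freq ** power) * inverse
    return scores


if __name__ == "__main__":
    pair = Counter({"wedding": 3, "meeting": 3})
    others = [Counter({"meeting": 5}), Counter({"meeting": 2})]
    print(score_words(pair, others))           # monthly-style scores
    print(score_words(pair, others, power=3))  # yearly-style scores
```

In this toy example both words appear equally often in the pair's e-mails, but "wedding" scores higher because no other correspondent uses it.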

 

Reflection

There were two aspects of this paper that interested me. Firstly, the aesthetic factor that made users more interested in the haystack mode than the needle mode shows how important it is to remember that, despite the technical nature of our field, an appreciable level of graphic design is involved in computer science. Secondly, the algorithm design brought to mind my team’s project for the semester. We too must come up with an algorithm that utilizes word frequency to detect importance in a certain context. While our topic is not based upon time, the paper nonetheless provides us with a unique perspective and possibly a stronger foundation from which we can build our equation. Namely, the use of inverse frequency to detect relative frequency rather than raw frequency may allow us to see if we’re focusing on the wrong keywords (as we are currently measuring by the latter).

 

Questions

  • What other visual structures and organizational patterns were considered for this application? What benefits did this horizontal timeline with vertical stacks afford over the other options? What drawbacks did you have to accept with it?
  • The paper mentions the limitations of their content analysis in that the application cannot detect personal weight of e-mails. For example, an e-mail from a mother wishing happy birthday to her son might be taken as having greater word value than one where she is reminding him of a dentist appointment. Is there any progress being made on that front? What ideas are there for identifying overall message value?
  • What factors went into deciding to have the haystack and needle mode occupy the same screen rather than having a way to switch between the individual modes? Can you see any benefit to having a screen for each mode rather than a fused screen? Perhaps it would allow more features?


Reading Reflection #6

Summary

The paper “Visualizing Email Content: Portraying Relationships from Conversational Histories” introduces Themail, a different take on visualizing email archives. Using interaction histories found in email archives, the typographic visualization is able to create a visual representation of relationships. The visual representation is composed of words found in the content of exchanged messages to characterize and show how relationships change over time. Keywords are shown on a timeline in a series of columns, in different colors and sizes based on their frequency and distinctiveness. Themail is capable of displaying multiple layers of information: yearly words are large, faint words that show up in the background, and monthly words are columns of yellow words that show up in the foreground. Yearly words are the most used terms over a year of email exchange while monthly words are the most distinctive and frequently used words over a month of email exchange. When studying the use of Themail by users, two main interaction modes were noticed: the haystack mode and the needle mode. About 80% of participants used the haystack mode while 20% used the needle mode. The haystack mode presents the overall patterns of the visualization while the needle mode focuses on the individual pieces of information.

Reflection

Themail reminded me of word clouds, except the words are in column format. Personally, I do not see how it can be useful, especially since I do not use email that often for long-term communication. I did not understand the anecdotes of users enjoying the visualization for family use because it seems like the actual writings would have more value than the individual words. Nowadays, I think this would be a better fit for text messaging or instant messaging because those are more common ways to communicate. Visualizing instant messaging would probably be different, since people usually do not write as if they are writing an email when texting.

Questions

  • What other structures/organization of the words could be helpful?
  • How could this be used in other modes of communication?


Reading Reflection 11/7

Summary

In “Visualizing Email Content: Portraying Relationships from Conversational Histories,” the authors Viegas, Golder, and Donath describe relationships between individuals over email. Themail, the interface created to show these interactions and relationships, displays words pulled from emails. These words are categorized on a yearly or monthly basis; yearly words portray the overall tone of the relationship, whereas monthly words are more detailed. On the interface of the application, yearly words are displayed in the background in light gray, and monthly words are yellow in the foreground. When a specific word is selected, emails that contain the word appear. In order to test Themail, a study was conducted in which participants could use one of two modes: the haystack mode or the needle mode. The haystack mode allowed users to view an overall picture of the relationships they had over email. Over the course of the study, a majority of the users (about 80%) decided to use the haystack mode because they wanted to see their relationships with their family and friends and confirm their expectations with their findings. On the other hand, the needle mode allowed users to view specific pieces of data to identify patterns in their relationships. The other 20% of the population were more concerned with analyzing their workplace relationships rather than those with their friends and family. The authors also discuss how most users would not utilize this application daily; rather, it is more probable that users would use Themail every so often.

Reflection

Something I found interesting was the difference between the haystack and needle mode users. In the description of the needle mode, the paper discusses that haystack users wanted to see information they already knew, but the needle users wanted to determine information they didn’t know or couldn’t remember. In particular, the authors note that about 20% of users used the needle mode, which made me curious whether all the users only utilized Themail for one sole purpose (i.e. only haystack or only needle). Additionally, it was intriguing that the email archives ranged from 90 MB to more than 1 GB with an average of 456 MB, because I know that there are extremes among people who email: some delete all their emails, and some don’t delete at all. For those who delete all their emails, Themail would obviously not be as applicable. I actually don’t ever clean out or delete my emails, so I probably have a lot of data to sort through; however, I don’t think fruitful information pertaining to my personal relationships would appear. This is especially because in this day and age most people don’t communicate solely over email. It would be interesting to apply this to texts.

Questions

  • Though they would likely vary from person to person, which words were most popularly used?
  • How could the UI change to make it less cluttered and perhaps easier to read?
  • Is there a significance to placing the yearly words in the background rather than just displaying them on the side (or elsewhere in a different manner)?


Reading Reflection #6

Summary

The paper, “Visualizing Email Content: Portraying Relationships from Conversational Histories”, discusses Themail, a visualization that portrays relationships based on interaction histories preserved in email archives. Through Themail, key words that characterize an individual’s correspondence with another are displayed and the user can view how these words change over time. Themail can display multiple layers of information: yearly words and monthly words. Yearly words are shown as large faint words and are the most used terms over an entire year, while monthly words are shown in yellow and are the most used terms over a month. Themail was evaluated by sixteen participants, and from the results two main interaction modes were observed. These two modes are called “the haystack” and “the needle”. The “haystack” is a “big picture” look at the information while the “needle” is a more detail-oriented approach. Most of the participants used “haystack” mode over “needle” mode.

Reflection

I found this article to be an interesting read. With how often we receive emails, it’s easy to forget how emails can contain a lot of information about how we interact with others. I think this tool would be great for reminiscing, like how Facebook has the feature where it reminds you of the yearly anniversary of your friendship with someone. However, since this article was written, email has become less used than it used to be. Most people interact online through instant messaging. I personally use email for professional purposes such as work or school, while I chat with my friends and family through instant messaging tools or apps like Facebook Messenger and Google Hangout. Thus, it would be great to see a visualization tool that could be used on instant messaging.

Questions

  • How could this tool be updated to today’s social media?
  • What could be changed so that Themail would be more likely to be used on a daily basis?


Reading Reflection 6 – 11/7

Summary:

The article “Visualizing Email Content: Portraying Relationships from Conversational Histories”, written by Fernanda B. Viégas, Scott Golder, and Judith Donath, is another look at the way we use email to communicate. The article focuses on Themail, a visualization that portrays relationships using the interaction histories preserved in email archives. The authors had participants use the visualization software to see how they would interact with and analyze the data. They do this by analyzing a large dataset of the users’ emails and filtering out unwanted email, such as spam or one-off emails, until they have the relevant dataset of emails. They found that there were two main interaction modes with the visualization: exploration of “big picture” trends and themes (“haystack”) and more detail-oriented exploration (“needle”). They also found that the vast majority of people tended to stick with the haystack approach and would look at their connections to loved ones and family. Users that chose to analyze with the needle method were less focused on the relationships and were usually searching for a specific piece of information, usually pertaining to work. While this method and the tool of Themail are interesting to users, the study found that most users would not usually utilize Themail or a similar application in their day-to-day lives. They would instead come back to it on occasion, usually for the purpose of reminiscing and wanting to see the old connections and relationships they hold with others.

Reflection:

Overall I really liked the article and the findings that the authors came to. I found it very intriguing to watch how users would use the system when it was provided to them with a proper dataset and they could see an accurate representation of their connections. I understand why, for the scope of their project, they filtered out the spam and other areas they deemed unwanted, but I also wish they had done some studies with it left in so we could see the “fake” relationships that spam would cause the program to think the user has. I think it could be a very interesting experience to view the clutter that users get from spam and try to see its overall effect on the ecosystem. I found it curious that the two final groups, haystack and needle, were sectioned off the way they were. I feel like, from the overall perspective of the app, all users would start out in haystack mode and then utilize the needle function when searching for individual pieces of information. So I see them less as two distinct groups and more as tools to use to comb through all the information as a whole.

Questions:

  • What could leaving all the spam and one-time emails in show us?
  • Could a third group be added in that is an in-between stage? Not full haystack but not quite needle.
  • Should email services work this type of technology into themselves for all users’ benefit?
  • Does this kind of technology create a worry for email privacy?
  • Just how much information is too much to give away for the convenience of data analysis and prediction?


Reading Reflection 6

The study presents Themail, a program that gives users a typographic visualization based on past emails. The team set out to answer two questions: what does the individual talk about over email, and how do their conversations differ across different people? They parsed the words and built two categories, yearly and monthly words. Yearly words revealed the most used words over an entire year of email exchange, while monthly words were the most frequently used in an email conversation over a month. From that, two interaction modes emerged for the user to explore with: haystack mode and needle mode. The first dealt with trying to gain a big-picture visualization of the relationships the user had, which most of the time the user already knew. However, the needle mode served the purpose of finding specific information, which actually revealed information the user may not have been aware of. In the end, the majority of participants used the “haystack” mode to see their relationships with family members and friends. Alternatively, the users that used Themail in “needle” mode used it more to find work-related information.

 

While reading about analyzing relationships via the content of messages, I thought about what kind of results this project would have if it were targeted at romantic couples. However, instead of parsing their email history, it should parse their cell text history, for obvious reasons. The study should examine a couple’s text history from the beginning stages of dating until the couple splits or divorces. For couples that have been together for five, ten, or fifteen years, perhaps it’d be good to build a dataset of common words or phrases these couples use and compare them amongst each other with respect to how long they’ve been together. From that, perhaps researchers can build some kind of predictive model for other couples, as they would have plenty of data to compare against. Does this couple, based on their messaging history, have a greater chance of staying together or breaking up in the near future? The only challenge to this is that most people wouldn’t be open to freely granting access to their messaging history.

 

What is the difference in words used between same-sex and opposite-sex messaging?

Would this study yield the same result if it were done for other languages?


Reading Reflection 11/7

Summary:

This paper talks about a new visualization they’re presenting called “Themail”. This visualization portrays relationships using the interaction histories preserved in email archives. It uses this content to show a word that corresponds to a particular individual for the user and how that word changes over their relationship. These words would appear in a list with the more prominent words being of larger sizes and differing colors. For example, words that were used throughout the year were grey to show they weren’t very important while words that appeared throughout a particular month were separated into their own columns. Furthermore, selecting one of these words would also show more information on how that word was specifically used in the emails. There were also two modes for viewing information, haystack for a broad overview and needle for specific information.

 

Reflection:

Using a single word or few words to represent a whole month or year of communication with another person is an interesting way to try and represent what those communications have been like. Given that this visualization is used for email exclusively I would expect that most representations will have more of a business tone, as most people use email for business. The paper also touches on this a bit as they point out the flaws that come with using email and the nature of emails in general.  Perhaps it would be useful to remember what specific business or work-related topics were being discussed heavily during a specific month. Overall it’s an interesting idea and may be useful for business applications.

 

 

Questions:

-What words do you think would appear in your own email contacts?

-What words are most common across all interactions at the monthly level? Yearly?


Reading Reflection 11/7

Summary:

“Visualizing Email Content” is a continuation of research on how we use emails in everyday life, and what those emails, when analyzed, show. Their previous research was based on patterns in the times at which users created emails. This paper is based on the patterns of communication these emails create over time, especially between two users. They then go into a very lengthy literature review, followed by a description of their new research, “Themail”. Themail is a way to visualize your email content and what you say regularly, either to a specific person or overall. It also works on either a month-by-month or yearly basis. They say the basis for monthly and yearly is that yearly shows what is essentially a summary of your communications, whereas monthly varies much more due to select events occurring in a given month (they cite things like weddings or holidays). Then they go into detail on how their process works. First, they get an email archive and filter out everything that isn’t useful, such as spam; then they calculate the key words from the email archive, reporting them back in their UI. They then talk about the two different modes of searching that were identified from user feedback, the needle and haystack modes. Haystack provided a long-term view of the overall relationship with someone, whereas the needle mode showed very specific details of emails.

 

Reflection:

I like that they start with a large amount of literature review. It’s a really good example of what we need to be doing for our final reports. It also gives a lot more information than most papers if we’re interested in the topic. Their application of using monthly keywords to actually search for emails from that month is ingenious, and it’s something I didn’t even think about until they said it. I will admit, looking at their UI, it seems extremely cluttered. The grey words in the background make everything else very hard to read. Their way of filtering out emails that aren’t useful to the analysis is very interesting though, and might be something for my group to consider when it comes to finding fake news articles. Perhaps checking if a user has used a site before? That seems a bit dangerous though. Their word scoring could also be useful, although time is not a variable in our analysis. The differences between the needle and haystack modes were interesting, although I wish they had been a bit better explained. I had some difficulty figuring out how they majorly differed (or at least what the needle mode was supposed to be doing; the haystack mode sounded like the original idea for their project).

 

Questions:

The researchers describe emails as “abstractly sterile”. Is there a better way to communicate in a way like email, but more akin to face to face speaking?

What do you think your emails show overall? How might an email set with a close friend differ from one with someone else like an employer?

How interested would you be in seeing an overview of something like your texts from a few years ago?


Reading Reflection 6

Summary
The article discussed a project called Themail which created a typographic visualization of an individual’s email content over time. Themail had two modes called haystack and needle which allowed users to choose whether they wanted to see a broad overview of information or specifics. Clicking a word displayed more information about the specific use of the word in emails. The interface showed a series of columns of words that varied in size and color depending on their usage. Yearly words, or words that were most commonly used throughout the year, were gray and in the background. Monthly words were in a series of columns and represented both the frequency and distinctiveness of their usage. This kind of visualization allows users to discover trends and visualize their relationships by studying common interactions and words. During testing, participants enjoyed the visualization and were able to both confirm their relationships and discover how these relationships changed over time.

Reflection
The method of visualization discussed in this article was interesting because it was able to display relationships over time. I don’t agree with the method they chose to filter out spam mail by only considering email addresses that the user has sent at least one email to. I believe that filtering out irrelevant and out of context emails was a difficult obstacle for them, but they were able to overall achieve a well designed visualization. They discussed the limitations of their project which were related to the nature of emails. They found that email had many uses and each use should be analyzed differently (i.e. forwarded emails, spam, etc.).
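As a rough illustration of the filtering rule mentioned above (keeping only mail involving addresses the user has written to at least once), here is a small Python sketch. The message layout (dicts with `from` and `to` keys) and the function name `filter_known_correspondents` are assumptions made for the example, not the paper's implementation.

```python
def filter_known_correspondents(messages, user_address):
    """Keep messages sent by the user, or sent by someone the user has e-mailed.

    messages:     list of dicts with a "from" address and a "to" list
                  (a hypothetical layout chosen for this sketch).
    user_address: the archive owner's own address.
    """
    # Every address the user has sent at least one message to.
    sent_to = {
        rcpt
        for m in messages
        if m["from"] == user_address
        for rcpt in m["to"]
    }
    # Drop mail from senders the user never wrote to (e.g. spam, newsletters).
    return [
        m for m in messages
        if m["from"] == user_address or m["from"] in sent_to
    ]


if __name__ == "__main__":
    me = "me@example.com"
    archive = [
        {"from": me, "to": ["mom@example.com"]},
        {"from": "mom@example.com", "to": [me]},
        {"from": "deals@spam.example", "to": [me]},
    ]
    for msg in filter_known_correspondents(archive, me):
        print(msg["from"], "->", msg["to"])
```

A rule like this also discards legitimate senders the user simply never replied to, which is one reason to question it.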

Questions
How could this visualization over time be applied to other mediums such as social media?
Would it be better to show more statistical information and less graphical information?


Reading Reflection 6

Summary:

Themail is a tool used to create a visualization of the content of emails between people. Words that are frequently used for long periods of time (yearly words) are displayed as well as words that are frequently used over the course of a month. It can be used to show the progression of a relationship between two people or the timeline of events that have occurred during a person’s life. The researchers sent Themail to participants to be used outside of a lab to get a more natural response. They found that 80% of the participants used Themail to look at the broader visualization of their emails (“the haystack”) and 20% of the participants used it to look at the more detailed aspects (“the needle”). The participants that used “the haystack” mode tended to look at their relationships with friends and family. Users compared Themail to photo books as a way to look back on events. On the other hand, participants that used “the needle” mode were more likely to look at the emails sent between coworkers or ones that were work related.

 

Reflection:

I thought the visualization Themail used was really interesting but also a bit confusing and chaotic at first glance. Having the yearly words kind of more faint in the background was an odd choice as you would think those words would have more significance if they’ve been used over the course of a year or more. I also really liked that the monthly words were interactive. It would be fun to scan through emails and easily access the original email the word originated from. I also liked the use of spacing to show how long a person went without communicating with the other person. Adding the use of circles to represent the email messages made the different visualizations almost overwhelming. Since the study was done with only 12 participants, I would like to see it done on a larger scale to see if the trends the researchers found still hold true.

 

Questions:

What would a similar study done on a social media platform (like Facebook Messenger) show?

How would the use of different visualizations change how the users interact with Themail?

 
