Reflecting on my Experience with Online Teaching this Spring

This past spring, as I’ve done for many years now, I taught our CS 1331 – Introduction to Object-Oriented Programming (using Java) class to 300 mostly first- and second-year students. We started the class on campus and in person, but like all other colleges and universities, we transitioned to an online remote format around the midway point. This essay recaps my experience with that transition and what it was like to teach the class in remote form.

To begin, I must qualify that I am not the biggest proponent of remote online classes. I’ve been asked multiple times to create an online version of the 1331 class as well as my graduate Information Visualization course, but I have always declined. Well, there was no choice this semester.

Our spring break this year was March 16-20. During the week before that, concerns about Covid-19 really started to get much worse. Shortly after our in-person class on Wednesday, March 11th, it was announced that students should plan to leave campus and not return after that. The rest of our semester would be completed remotely. On Thursday, due to concerns about the virus in such a big classroom and since many students needed to make imminent travel plans, I decided to make that Friday’s class our first online offering. I knew virtually nothing about online meeting software, so I quickly learned a little about WebEx and set up a meeting that students could join. I asked them all to mute their audio upon joining. Unfortunately, one student realized that he could join the meeting under a fake name (the meeting did not require authenticated identification to enter), and he chose to go on a profanity-laced diatribe about the second exam in the class, which had occurred the week before.

At first, I wasn’t sure if only I could hear him or if all the students could. It eventually became apparent that the other students could. I was frankly flustered, and I did not handle this situation well. I decided to just push on and hope the student would stop, which he eventually did, but not before making a total mess of that class. I should’ve stopped and confronted the individual. Well, hindsight is 20-20.

I bring this incident up for a couple reasons. First, an instructor cannot assume that all students will be on good behavior in a remote environment such as this. I’ve read reports on other small classes where professors described open and engaging meetings that they held, and I just smiled. When you have 300 (mostly) first year students, the chances of getting a knucklehead go up significantly. The experience also strongly influenced me to seek out a meeting tool that would allow me to better control students’ potential interruptions during the event. As they say, you live and you learn.

Georgia Tech decided to take an extra week off after our spring break to help faculty prepare for the transition to remote classes. I then used those two weeks to do a lot of research about the different online meeting platforms and their capabilities. GT officially supports WebEx and BlueJeans, but not Zoom, so those were the two options available to me. Ultimately, I chose BlueJeans Events (BJE) for all my remaining classes.

BJE has several characteristics that worked well for my class. Up to 500 people can join an event, and the sheer size of my class dictated that need. Furthermore, I could have students join as Attendees, which put them in a listen-only mode, that is, not able to speak and interrupt as the one student had done earlier. A few of my teaching assistants (TAs) could join as Moderators. I allowed students to ask questions via an online Q&A chat capability, but their comments would not be made visible to other students by default – a TA had to OK each comment and pass it along. Again, I was concerned about students potentially disrupting the experience.

Another appeal of BJE was its flexibility in delivering an event with multiple screens. To explain this importance, I need to backtrack and explain a little about how I normally conduct this class in person. I make a very explicit decision not to use PowerPoint slides during lectures. Instead, I draw upon my saved notes for a topic and I hand-write class lecture notes on pieces of tablet paper that are displayed to the entire class via a document projector. In addition, I hook up my laptop (the room has two large display screens) and I can show code and run live demos on it. I firmly believe that hand-writing notes forces me to be more deliberate and take my time going through material. I also feel that by hand-copying these notes, students better connect with them and they also build up a nice notebook of the course material. In this sense, my class functions much like a traditional “lecture”. However, I make sure to frequently ask questions to the class, and I work hard early in the semester to encourage students to raise their hands to ask questions when they do not understand something. This seems to work, and it is not uncommon for me to get 10-20 questions in a class, sometimes making it a challenge to get through the material planned for that day. This is really the only form of “active learning” I use, but students seem to gravitate to it and appreciate it.

As an aside, I know that the traditional lecture style of class is not in vogue right now. With that said, I think it is important to note that students seem to appreciate learning the material this way, as evidenced by their comments, my teacher ratings (e.g., https://www.ratemyprofessors.com), and the fact that my class is routinely one of the first to fill up and have a long waitlist.

My primary goal in transitioning to a remote format was to make the experience as close to the live, in-person lectures as I could. I set up my laptop with its webcam to show a video of me giving the lecture. In addition, I had an iPad that I rarely used but which came in handy at this time. I bought the GoodNotes app for taking notes on a simulated piece of notebook paper using the Apple Pencil. This allowed me to keep the live note-taking aspect of class. In addition, I was able to save the notes as a pdf afterwards and make them available to students.

I debated whether to pre-record each class’ content or deliver the lectures live. Ultimately, I decided to deliver the lectures live but use BJE’s capability to record them for subsequent viewing. I hoped that having students connect with class at the usual time on MWF each week would help foster a sense of connection and continuity. However, my students were now scattered all over the world, so it was simply inconvenient for some to join live (2pm in Atlanta is early morning in much of Asia, for instance). Below is an image taken from a lecture. The right display shows the screen of my iPad with the handwritten notes. BJE allows students to use a slider to adjust the proportional size of the two displays, even during playback of a recording.

[Image: lecture view with the webcam video on the left and handwritten iPad notes on the right]

As mentioned above, when students had questions during a lecture, they could ask them via the Q&A chat. I tried to pause from time to time to see if questions had arisen, and I asked my TAs to simply interrupt me when a good, relevant question popped up. It turned out that the TAs were able to answer many of the questions in the chat without asking me. Often, they waited until the end of lecture to push questions up to me. This had pluses and minuses. On the positive side, many different students could ask questions and get answers. Conversely, these questions just didn’t feel as timely as those from an in-person class, and I think in-person questions actually slow me down in a beneficial way, providing students helpful pauses during a lecture. Furthermore, students only watching the lecture via a recording clearly couldn’t ask questions.

Much like in my normal class, I tried to inject a little fun and diversion into classes. Because I was using my laptop for most classes, that meant running via a wireless connection. The wireless router at my house is in the basement, so I set up class there at a table in my basement. Students got used to hearing “Coming to you live from the basement again today.”  One particularly sunny and pleasant Friday I decided to start class out in my backyard, and I showed students our koi pond. That evoked some humorous comments in the chat such as “Dang, Professor Stasko’s flexing on us” and “Inspiration to be good at CS and have a backyard like that someday :’)”. Multiple students commented to me at the end of the class that they enjoyed these changes of pace.

[Images: the outdoor lecture setup in the backyard and the koi pond]

Some diversions were unplanned, however. Our dog Buster wandered into the shot during a lecture (below) prompting one student to comment “Doggo” in the Q&A which received 75 upvotes, the most of the entire semester! Buster ended up making a couple more visits later in the term, each time stealing the show.

[Image: Buster the dog wandering into the lecture shot]

I was able to count roughly how many students joined live during the lectures. The most I noticed was about 220 during one of the first classes after returning from the break. By the end of the semester, when students learned that they could simply watch a lecture’s recording, it seemed to settle into about 150-160 of the nearly 300 students connecting live for a lecture. Students were connecting from about half the states in the U.S. and around the world in countries like India, UAE, Kenya, Bulgaria, St. Maarten, and Paraguay, among many others.

The recordings received a lot of traffic as well, with a few ending up with over 300 views by the end of the term. I created a page within Canvas (our online course support platform) with links to the BlueJeans Events, lecture recordings, PDFs of notes, and other relevant materials (see below). This was popular with students, who liked having a “one stop shopping destination” for the course’s remote content.

[Image: Canvas page indexing the course’s remote content]

Perhaps the biggest challenge I faced moving the course to a remote format was how to administer exams that would be fair to all and not susceptible to being easily compromised. That was a huge challenge/headache, and for the sake of brevity, I’ll put that off for another column at a future date.

After students took the Final Exam for the course, I deployed an anonymous survey in Canvas to learn their opinions of the remote class experience. I asked questions about the transition, comparing in-person to remote lectures, and opinions about recitations and office hours, and I simply gave students the opportunity to say anything on their minds. 73 students completed the survey.

Some initial questions focused specifically on the in-person lecture experience versus the remote analog to that. I asked about their experiences viewing the remote lectures and how they thought that the two experiences compared. See the charts below for a breakdown of their viewing habits.

[Charts: breakdown of students’ lecture viewing habits]

In terms of a qualitative comparison, many students commented that the remote lectures felt akin to the in-class versions because of the common activity of hand-written note-taking. Some students felt that the notes were clearer on the iPad than on paper, while others noted that internet lag, traffic, and buffering were enough to hurt the live remote experience. When asked about the remote lectures, these student comments represented a very common view:

    “They went pretty well. Very similar to our in class format.”

     “The lectures were almost as good as the in person ones. It wasn’t really lacking anything.”

Multiple students noted pragmatic challenges in watching live, however. For example, one commented

     “Something I noticed was that when it was live, my connection was very bad so I couldn’t watch it. It is also very hard to watch it live because my family is very distracting.”

Views on the ability to ask questions and get answers were mixed, with support voiced for both formats. I would estimate that a small majority of students said they preferred the in-person experience for Q&A. Perceptions such as those below were common:

     “I thought that bluejeans events was the best way to do lecture, however, I wasn’t a fan of the fact that Professor Stasko had to be interrupted by TAs to get questions answered.”

     “They were fine, but I feel like the Q&A part was a lot less effective online. I usually enjoy hearing student questions during class because I find them interesting and helpful.”

     “There was much less interaction between professor and students and students with other students after going online.”

Contrasting views were common as well though:

     “I really appreciated the recordings and thought the TA’s handled questions well during live lectures. Online made it a more comfortable place to ask questions.”

One advantage of the remote set-up was very clear – students appreciated the ability to re-watch and review lectures later. This was the strongest, most consistent opinion when comparing the in-person experience to the remote one. Relevant student comments included:

     “I actually liked it. since i live in a different timezone, i watched recordings for all my classes and this class had the best quality of videos and it was better than the actual lectures since i could pause and take time to take notes. i really liked how we could change the portion of the notes in the video.”

     “I actually liked how I was able to rewatch lectures after class as sometimes during class, the concept didn’t click yet or my wifi buffered and I missed a part of lecture, so it was really nice to be able to go back and rewatch/relearn at my own pace.”

     “I liked the live lectures and recordings because I could see Dr. Stasko’s notes clearly and I could go back in the recordings later if I missed something.”

Ultimately, I sought a holistic comparison of the two delivery modes, with everything considered. I asked the students to compare the overall in-person experience in the first half of the semester to the remote experience in the second half. Given the strong views about the value of the lecture recordings and being able to re-watch them, I was prepared to see a preference emerge for the remote experience. That guess would have been wrong. Here are the student views when directly asked about their preference:

[Chart: student preferences between the in-person and remote experiences]

The students clearly spoke: nearly 70% preferred the in-person experience. I was really struck by this result. In a follow-up question on the survey, I asked the students to explain why they chose their answer on the previous question. No one clear reason emerged as to why students preferred the in-person experience, but many answers spoke to the actual experience of meeting face-to-face, the level of engagement, and the sense of the class being a happening, a true experience. Here are some example responses:

     “In person was slightly better just for all the interactions between professor, TAs, and students. It’s just unfortunate that a remote setting cannot replicate that.”

     “I found it much harder to get engaged in the online lectures, but that’s not Professor Stasko’s fault (distance learning is distant, and coming off an extended spring break into online classes didn’t help at all). The one benefit of these online lectures though, was that I could pause and resume as I wished and re-watch the lectures for review. In terms of effectiveness, I think remote lectures were just as good, but in terms of being as enjoyable, they weren’t.”

     “Personally, I prefer the interactive nature of in-person classes.  I also missed the professor’s humor.  I felt that remote classes felt more like lectures than in-person ones.  Perhaps due in part to sitting in the front of the classroom, I felt that in-person classes had more of a small classroom, personal, engaging feel to them.  However, online classes definitely felt more like I was a student sitting in a class with information thrown at me.  I would like to point out that this is not the professor’s fault, but a fault in the nature of remote classes.”

     “Learning is better for me when I have peers around me I can quickly turn to and ask a question if needed. It also feels so much better since I had friends in the class, and the environment was just really positive for focusing and learning.”

Another common comment from students was that viewing lectures remotely at home on their computers was simply prone to distractions. Students stated:

     “I found it much easier to pay attention to in-person lecture, rather than having the distractions of my laptop in remote learning. However, I found the quality of the lecture itself to be the same.”

     “I just loved the environment with everyone listening to the professor. You also concentrate a lot more because you are in the classroom. At home it is very hard to concentrate because family is very distracting and loud.”

     “Nothing to do with the course itself, it’s personally just a lot harder to keep myself focused and stay on task at home.”

Sometimes other factors led students to prefer an in-person experience. One student commented:

     “While most of the class experience was able to be mirrored in an online setting, one glaring flaw was the inability to do in-person group work. As a student with ADHD, it was significantly harder to actually understand and retain new CS topics without communication with others in person. I understand that this is a tough thing to accommodate, but I feel it is important to mention because it made learning and performing as well as I did almost impossible. I am hoping that I never have to take another online CS class again, but that is out of most people’s control, unfortunately. Thank you for the concessions you made to make this transition more tolerable.”

When I process all of the feedback from the students, it ultimately feels like the vast majority preferred and missed the in-classroom experience. Despite the remote lectures being similar to the in-person ones in terms of content, and the added capability of subsequently viewing lecture recordings, the students preferred coming together as a group and meeting face-to-face.

I think that some of this comes from a self-realization in the students that they just learn better this way. The online environment is loaded with distractions (email, Instagram, video games, etc., etc.) and in many cases, the home environment is too. Students seemed to appreciate the shared routine of coming together as a group at a time and a place. Mark Guzdial wrote about this same phenomenon recently in one of his blog columns.

My own personal feelings about the spring semester mirror those of the students. Part of me enjoyed the online experience as something new, a change of pace. I was able to learn some new tools and technologies, and to try out this remote teaching activity that I had explicitly avoided until now. But I also had a little bit of a hollow feeling after each lecture, a stark contrast from the energy and motivation that I typically feel after an in-person class. Ultimately, class is a shared experience between the students and me. We come together three times a week and interact with each other. They sit with their friends and talk about what has been going on in their lives. I enjoy seeing their faces, learning what might be on their minds at the time, and thinking about where I can inject a joke or two.

It’s not just information conveyance, transferring knowledge of Java and object-oriented programming to the students. It’s a performance, an engagement. I worry that some of the advocates of online education miss this point. College is much more than learning a bunch of facts and becoming proficient in some field or fields. Students aren’t just gaining a certification. They are interacting with other people, learning to take care of themselves beyond what good old mom and dad have been doing, and figuring out what they want to do with the rest of their lives.

I ended the spring term realizing just how much I missed the classroom and interacting face-to-face with students. Delivering lectures remotely was different and interesting in some ways, but … you can have it. For me, I’ll continue to enjoy seeing my students in the classroom. Of course, outside factors have changed the world this spring and we do not know what the situation will be like in the fall. We must make sure that we do as much as possible to keep everyone (even 58-year-old professors) safe and healthy. I firmly believe that we need to follow the guidance of our health professionals and proceed cautiously as we return to campus. I just look forward to the time when I can get back to the classroom to see my students’ faces, whenever that is.

A Fall ’19 Sampler of Student InfoVis Projects

I have been teaching our graduate class on Information Visualization (CS 7450) since the fall of 2000. Of perhaps all the different projects, initiatives, and activities that I’ve been involved with here at Georgia Tech, this course is the one that I have enjoyed the most and feel closest to.

Every year about 75 students take the class. The main component of their grade, for as long as I can remember, has been a group project, typically of three or four students. Each team designs and develops a visualization system for a domain of their choosing. More specifically, each team must identify an interesting topic and find a dataset or datasets relevant to that topic. They identify a set of questions that the data may help answer and a high-level goal or purpose for the project. The teams next develop multiple visualization design ideas. Based on feedback from the TAs and me, they move forward to create a working visualization system. While I specify no required platform for system development, nearly all teams now use d3.

The purpose of this article is to publicize some of the top projects developed this fall by our students. I’ve created a webpage with links to the systems for these projects so you can explore each one individually. This fall, the projects split into two main styles. The first is a classic exploratory/analytic system with multiple views and many controls for selecting, filtering, and reorganizing the views. The second is more of a narrative or storytelling presentation, often employing a scrollytelling technique on a web page.

[Image: Spotify songs and artists visualization project]

The two most impressive projects from the fall were both of the exploratory system style. The first, shown just above, presents songs and artists from the Spotify music service. Viewers see the breakdown of an artist’s songs along multiple dimensions such as key, tempo, and dance-ability. It also highlights similar songs and even provides short previews of the songs when they are moused over. The second project, shown at the top of this article, focused on a different type of artist, the greatest painters of all time. The visualization presents paintings as the four key colors from each. Viewers can organize the paintings by these colors, by artist, or by the date of the painting. When the viewer selects a painting, the system identifies other similar paintings and also presents characteristics of that artist’s works. I could literally spend hours playing with each of these systems to learn more about the data they’re presenting.

Another project of this style, shown just below, presents information from one of the Democratic presidential debates, highlighting when the different candidates spoke on various key issues and how the public reacted on social media. Yet another project presents information about videos trending on YouTube for the first three months of this year. Finally, one of the most impressive engineering efforts of the term is a visualization that depicts all the details (19 variables) of 46 million parking tickets in New York City! The system uses elasticsearch to manage access to that massive database.

[Image: Democratic presidential debate visualization project]

On the narrative/storytelling side, one of my favorites showed coffee beans from around the world and their characteristics. This one easily could be characterized as an analytical visualization as well because it contains multiple highly interactive components that allow the viewer to explore a rich data set about coffees. Another strong project focuses on the political polarization between Democrats and Republicans in American politics over the years. Unfortunately, it looks like the two sides are drifting further and further apart. A beautifully done and somber visualization, shown below, illustrates the scale of the Syrian refugee crisis. Two other nice projects, each adopting a scrollytelling style, present information about the growing scourge of plastics in our oceans and the rising popularity of TED talk videos.

[Image: Syrian refugee crisis visualization project]

I’ve chosen to highlight a subset of the projects from the class here, but these represent only about half of the 19 total projects. Others focused on topics such as human trafficking, endangered animals, faculty in computer science departments, and presidential speeches. What a varied set of domains the students addressed this fall!

In the past, I required each team to write a short paper about their project. As that grew to feel more and more like a chore, a few years back I pivoted and asked each group to make a video introducing their topic and demonstrating their system. In my eyes, this has been a great success. This past fall, in particular, the production quality of these videos improved immensely. All the teams presented their videos during our three-hour final exam period, and I came away from the session just amazed at how good they all were. The project summary page also includes a link to the video for each featured project.

The people deserving all the credit for these projects are the students themselves. For brevity, I haven’t included the names of the individual team members in this posting. You can learn their identities by following the links to the individual projects. (All students have given permission for them to be identified in this manner.)

Impressions from VIS ’18

The VIS ’18 Conference concluded a few weeks ago, and I finally had some time to sit down and pull together a few reflections about this year’s conference. For the second time ever, VIS ventured to Europe, and this year the meeting was held in Berlin, Germany. Attendance was an all-time record with 1256 participants. To handle such a large group, the meeting was held in the Estrel Hotel (shown above) which was rumored to be the largest hotel in Germany. It is located in the eastern part of Berlin, and it was difficult for me not to think about how different circumstances are today from some 30 years ago when few of us would have been able to set foot there.

[Image: Berlin, Germany]

A clear theme that permeated the conference this year was the emergence of AI, machine learning, and related technologies, and just how visualization might connect to these topics. Many researchers feel that visualization can play a key role in developing “explainable AI” in the future. Pat Hanrahan’s keynote talk at the VDS Symposium perfectly aligned to this theme and was thoughtful and inspirational as usual. He defined analytical thinking as “A structured approach to answering questions and making decisions based on facts and data” and argued for its importance in our daily lives. He also characterized “responsible analysis” as being explainable, understandable, transparent, fair, vetted, and ethical, and communicated his belief that data visualization can and should be an important component of this concept.

In another invited talk at VDS, Kirk Goldsberry of ESPN, formerly with the NBA’s San Antonio Spurs, spoke about his experiences bringing data visualization to sports analytics. He is perhaps most famous for his heatmap visualizations of shot locations in professional basketball. Much of his talk focused on why we don’t see visualization used more in sports analytics. One simple answer he gave was “politics,” but he enumerated three more specific reasons: 1. the constraints of media; 2. sports analysts don’t know how to make visualizations; and 3. sports executives don’t demand visualizations – it’s simply not a part of their culture. He also argued that visualization experts underestimate how much general managers only care about an answer (“Just tell me how much the house should cost, dude.”). In a very pertinent metaphor, Kirk believes that visualization scientists are good at take-off and flying the plane, but we need to be better at landing it. He also interjected what was likely my favorite quote of the entire conference when he characterized legendary NBA player and announcer Charles Barkley as “more of a qualitative social scientist.” And I learned from Kirk’s talk that Harvard University has eliminated only one academic department in its history: geography.

This year VIS hosted a day-long VisInPractice event which included many invited talks by visualization practitioners. I was only able to see about half the talks, and they were terrific. In one, Shan He of Uber presented the company’s Kepler geovisualization toolkit and system. It looked simply fantastic and left me eager for the opportunity to try it out. In another presentation, Lisa Charlotte Rost described her earlier, newly updated blog review of visualization authoring tools. She analyzed the growing space of tools and highlighted the rise of “data drawing apps” such as Lyra, Data Illustrator, and Charticulator. She concluded with some thoughts about what all visualization authoring tools must improve: better user interfaces, easier creation of artsy, responsive charts, and software that acts more as a teacher, helping users learn the paradigm and the tool as they use it.

I greatly enjoyed the practitioner symposium and lamented the talks during it that I was unable to see. As visualization research becomes more and more focused and narrower in scope, I tend to miss the more design-focused work developing interesting and creative visualizations that we used to see at VIS. New workshops and symposia have sprung up to fill that void, such as OpenVis, Tapestry, Information+, and eyeo. (I’m looking forward to attending Tapestry for the first time next week.)

Beyond these symposia, recently I’ve also enjoyed following on Twitter and through their blogs a number of the stars of the visualization practitioner community, people such as Lynn Cherny, Andy Kirk, Neil Richards, Cole Knaflic, Scott Murray, and Jon Schwabish. Heck, even though he’s a professor, I’ll lump Alberto Cairo in there too. All these people consistently develop and identify interesting, thought-provoking visualizations, usually grounded in some domain and data set. I’ve found many of the visualizations in their posts to be inspirational in my own work, and I also use many of them as examples in my visualization classes. I’d really welcome more involvement by these folks at VIS in the future.


One small thread of an idea in visualization design that I noticed throughout the conference was the use of motion and animation. Examples included the NY Times’ March ’18 story about the effects of racism, which was described as using the “wandering dots” technique; the Times’ earlier work illustrating uncertainty in elections with a spinning roulette wheel metaphor; and HOPs (hypothetical outcome plots), which illustrate uncertainty by making random draws from a distribution and animating through the resulting visualizations. While some of these visualizations are now a few years old, it was interesting to me how this idea popped up in different talks throughout the conference.
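To make the HOP idea concrete, here is a minimal sketch using Python and matplotlib; the two estimates and their means and standard deviations are made-up numbers for illustration, not data from any of the talks.

    # A minimal hypothetical outcome plot (HOP) sketch: rather than drawing
    # error bars, repeatedly sample from each estimate's distribution and
    # animate the resulting bar charts so the viewer experiences the uncertainty.
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    means, sds = [3.0, 3.6], [0.8, 0.5]      # two uncertain estimates (made up)
    fig, ax = plt.subplots()
    bars = ax.bar(["A", "B"], means)
    ax.set_ylim(0, 6)

    def draw_outcome(_frame):
        # one hypothetical outcome: a fresh random draw from each distribution
        for bar, m, s in zip(bars, means, sds):
            bar.set_height(np.random.normal(m, s))
        return bars

    anim = FuncAnimation(fig, draw_outcome, frames=30, interval=400)
    plt.show()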

As for papers that caught my eye, I tend to gravitate toward the InfoVis Conference sessions in general, so most were from there. Just a few of the many that stood out (mostly because of my own personal interests) include:

  • The Draco system for embedding visualization design principles as constraints that can drive the generation of appropriate visualizations for a given data set
  • Efforts to develop new metrics for how users interact with visualization interfaces
  • The VAP system that automatically (drawing from dblp, the vispub data set, and the keyvis data set) generates text profiles, augmented by visualizations, of visualization researchers given just their name
  • Studies of different visualizations’ efficacies on phones and watches
  • Techniques for unifying tables and text in long document viewers
  • A survey and analysis of uses and popularity of visualization dashboards
  • The Charticulator system for creating visualizations from data sets without needing to program
  • The ATOM grammar and toolkit for constructing unit-style visualizations
  • Litvis.org, a site and approach for creating visualization design, explanation, narrative, and reflection notebooks, much as done with Jupyter notebooks for data analysis.

I’ll shamelessly add a plug for two papers in part from my research group: our work on the low-cost ICE-T approach to evaluating the value of a visualization and the Voder system that combines interactive data facts with visualizations to aid data analysis and presentation.

Oh, I learned once again a simple maxim for the conference: If you’re going to present a paper about color, be ready for objections after your talk.

Topics receiving focus at the InfoVis Conference are always changing. For fun, I grabbed the conference session titles/topics from both this year and ten years ago to see how things have changed. Below is a list of the sessions, with 2008 on the left and 2018 on the right. Right away, one can see how much the conference has grown, almost doubling the number of sessions over the ten years. Beyond that, there is a core of consistent topics, but new themes are emerging. Two that jumped out to me were interaction with different types of displays (Immersive analytics and Devices: large & small) and the rise of perceptual/cognitive studies and uncertainty (three sessions this year).

[Image: InfoVis session titles, 2008 (left) vs. 2018 (right)]

Perhaps the topic of most discussion across the entire week was the evolution of the conference itself. A small committee has been studying the possibilities for how the meeting may evolve in the future. It has been noted that the proliferation of different conferences and symposia within the meeting may confuse newcomers who aren’t quite sure where their work fits. This review committee presented the results of their study to the conference during Wednesday’s lunch session. One potential option is to unify more, resulting in one main conference with many subareas underneath it. This seems to be the most popular potential path forward.

Many details remain to be worked out, however, and unfortunately those details (e.g., what are the subareas, what is the new conference named, how does the reviewing work, etc.) are quite challenging. The growing size of the meeting can be viewed both positively and negatively. The growth indicates the popularity and increased interest in visualization, which is terrific. However, that growth also results in more parallel sessions and conflicts, and just an overall busier and more hectic week. (Perhaps the most common sentence I heard uttered during my week there was “Oh, I missed that presentation.”)

Nonetheless, I think that keeping a large central showcase conference for our discipline, much like CHI is for the HCI research community, is likely a good thing. It provides an opportunity for many people to meet and exchange ideas. We may potentially see subareas grow and blossom into their own meetings. For CHI, related conferences such as UIST, CSCW, Multimedia, ISS, and others did just that. We even have seen this to a lesser degree in visualization with the emergence of symposia such as OpenVis, Tapestry, and Information+, as I mentioned earlier in this post. One difference in the visualization community is the presence of geographically-based conferences such as EuroVis and PacificVis that are not focused subareas, but smaller versions of the broad discipline as a whole. Well, it will certainly be interesting to see how things develop over the next few years.

Next year the VIS Conference moves to Canada as it will be held in Vancouver in October. Vancouver is a beautiful city and I would not be surprised if a new attendance record is set yet again.

Finally, I wanted to end with a picture. On Friday afternoon after the conference had ended, some colleagues and I wandered around to a number of the must-see tourist locations in the city. One was Checkpoint Charlie, the infamous crossing point between East and West Berlin during the Cold War. While there, I snapped the photo below. The text reads “You are leaving the American sector,” but I couldn’t help thinking about the irony (?) of the KFC sign just below/above it. Given enough time, I guess things can and do change.

[Image: Checkpoint Charlie sign with a KFC sign nearby]

 

ICE-T @ InfoVis ’18

At the recent IEEE InfoVis Conference in Berlin, my research group collaborated with colleagues on two published papers. This blog entry gives a quick and dirty overview of one of those two papers: “A Heuristic Approach to Value-Driven Evaluation of Visualizations.”


This project was done with Emily Wall, Meeshu Agnihotri, and Alex Endert here at GT, and Laura Matzen, Kristin Divis, and Michael Haass of Sandia National Lab in New Mexico. Back in 2014 I published a paper at the BELIV Workshop that grew out of some frustrations I’d had with the evaluations one finds in many infovis papers. Often, the evaluation consists of a small set of benchmark tasks done with students at the local university. The tasks themselves typically are low-level, detailed questions about a data set that one would answer using a visualization. While there is nothing fundamentally wrong with that approach, it just seemed that it did not strike at the core of why a visualization might be helpful or what utility it could provide. I felt that such evaluations, while reasonably assessing the usability of a visualization system, failed to assess the visualization’s true value.

Hence, I developed a formula that attempted to capture the value of a visualization. The formula consisted of four components: the time a visualization saves, the insights the visualization spurs, the essence of the data set it conveys, and the confidence about the data it evokes. This is an oversimplification, but hopefully enough for you to get the basic idea. For more details about the approach, you can examine the original paper. Over time, I grew frustrated that this value equation was purely descriptive – there was no way to use it to actually evaluate the value of a visualization. This frustration motivated the project Emily and I presented at InfoVis.
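In shorthand (my notation here, not necessarily the exact form or weighting from the original BELIV paper), the equation simply combines the four components:

    % shorthand only; the paper's notation and any weighting may differ
    V = T + I + E + C
    % T: time saved, I: insights spurred, E: essence conveyed, C: confidence evoked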

We began the research by seeking to identify more specific characteristics of or statements about each of these components. We surveyed the visualization literature and conducted multiple brainstorming sessions and workshops to generate ideas. Ultimately, we developed a hierarchical framework in which each of the four components contains a small set of guidelines, and each guideline contains a small set of heuristics. A visualization is then rated using these heuristics, and the scores for the individual heuristics accumulate to provide a rating for each component and for the visualization overall. Ideally, a small set of people rate a visualization using the framework, and these people should have a background and experience in data visualization. The approach is designed to be in the family of “discount” evaluation approaches one finds in HCI, much along the lines of Jakob Nielsen’s heuristic evaluation technique.
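To illustrate the roll-up idea, here is a minimal Python sketch. The component, guideline, and heuristic names below are placeholders, and the simple averaging is my assumption for illustration; the actual heuristics and scoring guidance come from the paper and project webpage.

    # A toy illustration of accumulating ratings up the ICE-T hierarchy
    # (heuristics -> guidelines -> components -> overall). Names and the
    # simple-averaging scheme are illustrative assumptions only.
    from statistics import mean

    # ratings: component -> guideline -> heuristic -> list of rater scores (1-7)
    ratings = {
        "Time": {
            "Minimizes effort needed to answer questions": {
                "Provides a useful overview of the data": [6, 5, 7],
                "Supports flexible filtering": [5, 6, 6],
            },
        },
        "Insight": {
            "Spurs new questions about the data": {
                "Highlights unexpected patterns": [4, 5, 5],
            },
        },
    }

    def guideline_score(heuristics):
        # average each heuristic across raters, then average across heuristics
        return mean(mean(scores) for scores in heuristics.values())

    def component_score(guidelines):
        return mean(guideline_score(h) for h in guidelines.values())

    component_scores = {c: component_score(g) for c, g in ratings.items()}
    overall = mean(component_scores.values())
    print(component_scores, f"overall = {overall:.2f}")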


In our InfoVis paper, we describe a user study in which we had 15 visualization experts evaluate three different visualizations of the same data set using the method. Although the experts expressed some concerns about the technique and the heuristics themselves, ratings across all the experts were quite consistent and aligned with our a priori assessments of the utility of the different visualizations compared to each other. And we used the concerns raised by the experts to refine and clarify some of the heuristics. Thus, the method seems to show promise as a relatively low-cost way of determining a visualization’s potential value or utility.

Finally, we struggled to come up with a name for the approach for quite a while. Ultimately, we took the four lead letters of the value equation’s components (TIEC) and made an anagram of them: ICE-T. To researchers struggling to find an appropriate, helpful method of evaluating their systems, hopefully the ICE-T method will be just as refreshing as a cold iced tea on a warm summer day.

You can learn more about the ICE-T approach in our InfoVis paper, hear and see a replay of the talk we gave in Berlin, or browse the project webpage we have created to help others utilize the technique. Currently, it contains a PDF summarizing the hierarchical value framework, but we are working on an interactive version of the framework (a form) that also will generate a spreadsheet of result data and a report about each evaluation.

Voder @ InfoVis ’18

At the recent IEEE InfoVis Conference in Berlin, my research group collaborated with colleagues on two published papers. This blog entry gives a quick and dirty overview of one of those two papers: “Augmenting Visualizations with Interactive Data Facts to Facilitate Interpretation and Communication.”

In the broad field of data analysis, recently there has been an increasing effort to “automatically” generate insights about a data set. Sophisticated techniques from the database and AI communities help generate these insightful observations about the data, usually in a natural language expository form. Now, precisely what constitutes an “insight” is a matter of debate, something that I explored in a previous column. In our research, we choose to use the term “data fact” instead, reserving “insight” for deeper and more meaningful realizations about a data set.

Our paper at InfoVis was the lead effort of PhD student Arjun Srinivasan, with help from Steve Drucker at Microsoft, and Alex Endert and me here at GT. The key contribution of the work is to think of these data facts that can be generated for a data set not as static utterances, but as interactive components of a more comprehensive data analysis system. We built a system called Voder[1] that illustrates this principle in action.

When an investigator specifies/creates a chart that visualizes variables of interest, Voder generates data facts corresponding to those variables. As the investigator moves the cursor over the facts, the visualization changes (perhaps just a highlight) to emphasize and help explain the fact being examined. Furthermore, Voder presents alternative visualizations that also illustrate the fact, and it gives the investigator different options in how to embellish the visualization to communicate the fact.

[Image: Voder system screenshot]

Voder also provides a search capability in which the investigator can type in terms such as variables in the data set or analytic queries (e.g., “correlation”, “outlier”), and Voder then generates visualizations and data facts pertinent to the query terms. Thus, the system facilitates a flexible data analysis process that can start with visualizations, with data facts, or with keyword searches, and it supports easy, fluid transitions between each of these aspects. Voder also provides a “presentation” mode where interactive data facts and visualizations can be compiled into slide decks or dashboards.
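As a rough illustration of what generating simple data facts might look like (a toy sketch only; Voder’s actual fact types, thresholds, and wording differ), consider:

    # A toy generator of natural-language "data facts" for two numeric
    # variables. The fact types, thresholds, and phrasing are illustrative
    # assumptions and do not reproduce Voder's implementation.
    from statistics import correlation, mean, stdev   # correlation requires Python 3.10+

    def data_facts(name_x, xs, name_y, ys):
        facts = []
        r = correlation(xs, ys)
        if abs(r) > 0.7:
            direction = "positive" if r > 0 else "negative"
            facts.append(f"{name_x} and {name_y} show a strong {direction} correlation (r = {r:.2f}).")
        mu, sd = mean(ys), stdev(ys)
        for x, y in zip(xs, ys):
            if sd and abs(y - mu) > 1.5 * sd:
                facts.append(f"{name_y} = {y} at {name_x} = {x} stands out from the other values.")
        return facts

    # made-up data for illustration
    for fact in data_facts("horsepower", [68, 88, 110, 150, 300],
                           "price",      [15, 18, 24, 35, 120]):
        print(fact)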

A formative user study of the system with people of varying visualization backgrounds identified a great deal of promise for the approach. Less-experienced participants appreciated the help Voder provided for interacting with visualizations. Experts appreciated that too, but also hoped for deeper observations in the data facts. Attendees that we spoke to after the talk at InfoVis expressed excitement about the potential of the system for assisting visualization literacy and education as well.

You can learn more about Voder in our InfoVis paper, hear and see a replay of Arjun’s talk from the conference, or see a video of the system at the project webpage.

——————-

[1] Voder is a disc-shaped voice-box translation device from Star Trek. It was also the name of the Bell Labs device that was the first machine to generate human speech.

What’s an Insight?

One of the key notions of data visualization is that it can inspire insight about the data being presented. The idea of generating or spurring insights has been a core objective that visualization developers strive to achieve. But just what is an insight? How do we identify the insights that a visualization inspires? This is a tough question that the visualization research community has been grappling with for quite a while.

I had cause to revisit that question late last fall when the topic of our weekly Visualization Group meeting was a paper from SIGMOD ’17, “Extracting Top-K Insights from Multi-dimensional Data”, by Tang, Han, Yiu, Ding, and Zhang.[1] In this fascinating project, the research team developed methods to automatically (algorithmically) identify the top insights that can be gleaned from a data set such as sales data over time for a group of products. Note that this research comes from the Database community, which is obviously quite different from the data visualization research community.

To better understand what the developed algorithm does, suppose we have sales records of five different products over a five-year period. Potential insights from that data might be that a particular product’s sales show an increasing trend over time (i.e., the delta or change from year to year is growing), or that another product’s sales ranking within the group is falling each year.

Amidst all the debate within the visualization research community about what constitutes an insight, I was curious to see how Tang et al. would characterize one. They describe an insight as “an interesting observation derived from aggregation in multiple steps.” Furthermore, the researchers explain that such insights have two typical usages in business applications: to “provide informative summaries of the data to non-expert users who do not know exactly what they are looking for” and to “guide directions for data exploration.”

The heart of the paper is their algorithm for finding the “Best-k” insights from a data set. Needless to say, it is quite complex and simply beyond me to follow completely, but ultimately it is about identifying insights and quantifying their “interestingness.” Most of the insights they find take on one of two flavors: “point” insights, where values are remarkably different from the others, or “shape” insights, which show rising or falling trends.
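To give a flavor of these two categories (a toy sketch only; the actual algorithm searches a far larger space of aggregations and scores interestingness formally), simple detectors for the kind of sales examples above might look like this:

    # Toy detectors for the two insight flavors. The thresholds and data are
    # made up for illustration and do not reproduce the paper's method.
    from statistics import mean, stdev

    def point_insight(values_by_key, threshold=1.5):
        """Flag keys whose value stands far out from the rest (a 'point' insight)."""
        mu, sd = mean(values_by_key.values()), stdev(values_by_key.values())
        return [k for k, v in values_by_key.items() if sd and (v - mu) / sd > threshold]

    def shape_insight(series):
        """Detect a rising trend in which the year-to-year delta itself keeps growing."""
        deltas = [b - a for a, b in zip(series, series[1:])]
        return all(later > earlier for earlier, later in zip(deltas, deltas[1:]))

    suv_sales = [120, 135, 160, 200, 260]                      # yearly sales (made up)
    brand_share = {"F": 48, "G": 12, "H": 10, "I": 9, "J": 8}  # share by brand (made up)

    print(shape_insight(suv_sales))    # True: the yearly increase keeps growing
    print(point_insight(brand_share))  # ['F']: one brand stands far out from the rest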

The paper contains a case study on car and computer tablet sales data. Their algorithm identified the following example top insights:

  • When measuring the importance of SUV sales for a certain brand, brand F is outstanding number 1.
  • There is a rising trend of SUV’s market share.
  • In 2014, SUV exhibits most advantage over other categories than ever.
  • The yearly increase of tablet sales is slowing down.
  • 2012/04-07’s yearly increase of tablet sales is remarkably lower than ever.

Finally, the authors conduct a user study in which they have data analysts and managers rate the insights found by their algorithm along usefulness and difficulty dimensions. The algorithm fares well on both measures. Additionally, a comparison study of senior database researchers identifying insights via “traditional” methods uncovers the dramatic result that the time taken (average) using SQL was 29.2 minutes, using Excel pivot tables was 14.2 minutes, and using the Best-k algorithm was 0.17 seconds. The machine triumphs yet again!  :^)

I was fascinated by their characterization of data insights and their descriptions of insight characteristics. But how do those notions compare with other communities’ views of insight?

I believe that a very common impression of an insight, one harbored by many people, is as a kind of “a-ha” moment when a person figures out an answer or a solution to a problem that has been simmering for a while. This perception reminds me of the famous scenario where a light bulb goes on over a person’s head while they’re in the shower, a true “Eureka!” moment.

But I don’t feel that’s how the data visualization community most commonly views insight. Chris North actually defined an insight as an individual observation about data by a person, a unit of discovery.[2] He believes that insights are complex, deep, qualitative, relevant, and unexpected. Would the insights found by Tang et al.’s algorithm meet those criteria? I’m not sure.

Personally, I have always resonated with the characterization of insights by Chang, Ziemkiewicz, Green, and Ribarsky.[3] Their view contrasts with the spontaneous a-ha perception described above. Instead, they believe that insight is much more about knowledge-building and model-confirmation. It is like a substance that people acquire with the aid of systems.

When I hear someone say that a “visualization gave them insights about a data set,” I tend to think along the lines of Chang’s characterization. In fact, my former GT colleagues Ji Soo Yi, Youn-ah Kang, Julie Jacko, and I reflected on insight in an old BELIV workshop paper.[4] In it, we focus on the processes that one undertakes in order to gain insight. This frequently occurs in “sensemaking” scenarios. We found four processes through which people frequently obtain insight using visualizations: provide overview, adjust, detect pattern, and match mental model.

I have always been struck by the importance of context and existing domain knowledge to insights too. A person’s pre-existing knowledge about a data set and its domain has a big influence on what they will consider a data insight. For a data set about wines of the world, the set of insights a novice uncovers may simply be ho-hum background information to a wine connoisseur. When determining insights about a data set, it’s likely safest to assume the person doing the exploration is unfamiliar with the data and its domain, in order to establish a common baseline.

Looping back to the paper by Tang et al, ultimately I’m not sure that I’d describe the statements that their algorithm produces as “insights”. Maybe they’re interesting data facts or data observations, but insights somehow feel to me like deeper understandings of the characteristics and implications of a data set. This in no way diminishes the remarkable achievement of Tang et al. That they can automatically identify salient and useful observations about a data set is quite remarkable.

As we move forward, it will be interesting to see if the different academic sub-communities (cognitive science, databases, KDD, visualization) can come to some shared understanding of just what insight is and how we can better help people find them. Once we do that, then maybe we can start to develop evaluation methods to determine whether particular visualizations actually do a good job generating insights.  I’m also especially excited by systems that will be able to combine techniques from multiple areas – for example, systems that automatically generate insights about a data set, support those insights through illustrative visualizations, and allow analysts to manually explore the data through visualizations to uncover their own unique insights.

[1] B. Tang, S. Han, M.L. Yiu, R. Ding, and D. Zhang. “Extracting Top-K Insights from Multi-dimensional Data.” In Proc. of SIGMOD ’17, May 2017, pp. 1509-1524.

[2] C. North. “Toward Measuring Visualization Insight.” IEEE Computer Graphics & Applications 26, 3 (May 2006), pp. 6-9.

[3] R. Chang, C. Ziemkiewicz, T.M. Green, and W. Ribarsky. “Defining Insight for Visual Analytics.” IEEE Computer Graphics & Applications 29, 2 (March 2009), pp. 14-17.

[4] J.S. Yi, Y. Kang, J. Stasko, and J. Jacko. “Understanding and Characterizing Insights: How Do People Gain Insights Using Information Visualization.” In Proc. of BELIV ’08, April 2008, pp. 39-44.

Impressions from VIS ’17

The VIS ’17 Conference was held almost two months ago in downtown Phoenix, AZ. This column is my woefully late recap of the meeting with a few reactions and thoughts about how it went this year. I was so busy upon returning to school after the conference that I just kept putting off writing this. (If I ever consider being an AC for the CHI Conference again, someone please slap me.) Now that final exams are almost here, I’ve finally gotten a little chunk of time to pull this together.

I really enjoyed the location of the conference this year. I’d never been to Phoenix before, so I wasn’t sure what to expect. The conference was held at the big downtown convention center with a lot of hotels nearby. Even though it’s in the middle of the city, it didn’t feel like that. It was relatively quiet in the surrounding area and it certainly was easy to get around. Plenty of restaurants were nearby too. One night the streets were buzzing as the Diamondbacks beat the Rockies in the NL Wild Card game, and all the fans streamed out afterwards. The home stadiums of the Diamondbacks (baseball) and Coyotes (ice hockey) are close by, and add to the atmosphere of this part of the city.

The convention center itself is quite large. As VIS has grown, we now must use facilities like it just to handle the number of attendees. The rooms for paper presentations also were huge, perhaps even a little too big. Of course, we’re at the mercy of the layout of the venue on this, and no one wants meeting rooms that are too small. However, the sessions often felt a little sterile and impersonal to me this year. It seemed like relatively few questions were asked after talks, and I wonder if the room size and atmosphere somehow contributed to that, even a little.

Three topics/themes stood out to me this year. The first one was data science. In workshops, tutorials, and papers, the topic was everywhere. It seems like VIS is just mirroring what we’re seeing throughout academia now as more schools create Data Science degrees, programs, and even in some cases, departments. Visualization is only one piece of data science too, and sometimes a piece that is overlooked. Machine learning is clearly a large component of the data science equation, and it was ever present at the conference. It seemed like half of the VAST papers were about interfaces to machine learning algorithms and systems.

The second theme that grabbed my attention, particularly at InfoVis, was the growing presence of evaluation-focused papers. I guess this is to be expected – As our area matures and it becomes tougher and tougher to come up with new visualization techniques and systems, it shouldn’t be surprising to see more evaluation papers show up. InfoVis seems to feel a little more like CHI every year to me. (Not sure how I feel about that.)

The final topic I noticed this year was a simple one, word clouds. I couldn’t believe how many papers were about them! OK, OK, maybe that’s an exaggeration, but there was one paper session that seemed to be all about them. While they can be great for advertisements and fun, I always remember Jacob Harris’ great column and quote: “Every time I see a word cloud presented as insight, I die a little inside.” Anyhow, I did like the EdWordle paper and especially the interactive demo at http://www.edwordle.net/.

While I enjoyed many papers at the conference, a few stood out to me. Sandia National Lab’s work developing a data visualization saliency model is fascinating. The computer vision community has good models that can predict where people will look within a picture, that is, what parts of the picture will first draw a person’s attention. The Sandia team is working on developing a similar model for predicting the parts of an abstract data visualization that will draw focus. This model has some very different heuristics than what one finds with natural, photographic images. I also really enjoyed Jorge Poco’s talk and demo about extracting color maps from bitmap images of visualizations. It was fantastic how he and his colleagues could identify color legends and ultimately allow a person to change them, with the changes then reflected in the image. I also enjoyed Dragicevic and Jansen’s replication study of whether charts persuade people to trust textual arguments more, and Lam, Tory, and Munzner’s paper about the challenges of moving from high-level analysis goals to low-level analysis tasks.

Giorgia Lupi’s closing capstone talk on “Data Humanism” was fantastic as well. Giorgia is one of the two correspondents in the Dear Data series of visual postcards about their lives. She sees data visualization becoming much more personal in the future and she advocates that people explore and draw with data to discover what it holds. Giorgia’s column in Medium is a companion to and highlights the key points from her capstone talk.

I certainly missed one thing from many of the talks this year, demos. Sitting in the few presentations that had one reminded me how well a demo can make the ideas of a paper more concrete and illustrate potential applications of the research. Jarke van Wijk’s papers of the past stand out in this respect to me – So often I remember thinking about them, “Wow, that’s cool.” Here’s hoping that more authors and presentations include demos in the future.

Attendance was down a little at the conference this year. I believe that just over 1000 people attended, while the previous few years were up in the 1200-1300 range. I’m hoping this was a momentary blip, perhaps due to the location (Phoenix) being a little out of the way for many people. I certainly see visualization continuing to grow as a topic, so I fully expect VIS to keep growing as well.

With that said, I do have mixed feelings about one side effect of the growth of the conference. Back quite a few years ago, the InfoVis Symposium was a single track. All attendees who were interested in that topic attended all those sessions together and effectively shared the same experience during the week. (The SciVis Conference, then called just “Visualization”, had multiple tracks due to its larger size, but I always stayed at InfoVis.) With today’s configuration of multiple tracks, panels, journal papers, and the addition of VAST, attendees are torn between and eventually scattered about many possible sessions at any one time, and they tend to gravitate toward their own existing interests. Papers that are up against other popular topics may receive relatively little traffic. The single track/shared experience model of the past promoted more exposure to papers and topics outside of a person’s comfort zone. I definitely feel that the single track helped our community prosper and grow. Its loss is an inevitable consequence of growth, which also has its benefits, but sometimes I long for the “all in it together” days of the past.

Next year we’re on to Berlin, VIS’s second trip outside of the United States, in what should be an exciting meeting. Before that, the AVI and EuroVis conferences fall in back-to-back weeks late next spring in Italy and the Czech Republic, respectively. Europe clearly will be the hub of academic data visualization research in 2018!

Next column: Some thoughts about insights from visualization.

 

Impressions from EuroVis ’17

I recently returned from EuroVis ’17 in Barcelona, Spain. The conference was held at the Universitat Politècnica de Catalunya (UPC) which is close to Camp Nou, the home stadium for Barcelona’s famous soccer team, in the suburbs outside the city center. It is a pleasant and relatively quiet area of the city compared to the bustling La Rambla, Gothic quarter, and beachfront. It was my first time ever in Barcelona, and I had heard so many great things about the city, so I was eager to visit.

EuroVis is similar in scope to IEEE VIS, but the three main research areas of information visualization, scientific visualization, and visual analytics are woven together into one program as opposed to the three conferences you see at VIS. The conference is much smaller than VIS — this year just over 300 people attended. Typically, at any time of the meeting, there are about three sessions occurring in parallel. Beyond regular papers, EuroVis hosts the STAR (State-of-the-Art Reports) presentations as well. Think of them as in-depth surveys of specific subareas of visualization. These reports now appear as papers in the journal Computer Graphics Forum, as do the full research papers in EuroVis.

The conference received 170 paper submissions this year and 46 (27%) were accepted for presentation. Of the traditional five visualization paper types, “Algorithm” led with 74 submissions, followed by “Design study” (52), “Evaluation” (20), “Theory” (13), and “System” (11). The Algorithm and Design Study areas also had the highest acceptance percentages, at 32% and 27%, respectively.

In addition to full papers, EuroVis takes short paper (four pages of content plus a page of references) submissions, typically for work that is newer and still developing. This year the conference received 64 short paper submissions and accepted 30. Each of these papers is published as an archived conference paper and it receives a 15-minute talk slot at the conference, so researchers definitely should consider this track in the future. The conference also accepted 35 posters for presentation during the week.

If I had to think of one word to describe the conference this year, it would be “Hot”. No, by that I don’t mean that the papers were dynamic and sizzling, although there were many good presentations. I’m simply referring to the temperature! Every day the high temperature was close to 90°F, and there wasn’t a cloud in the sky the whole week. Typically, the most valuable commodity at our conferences is good wireless service. Instead, this year it was air conditioning and shade. But hey, I’ll take that anytime over clouds and rain. Just think of it as good practice for VIS this fall in Phoenix.

The conference began with a timely and fascinating keynote talk by Fernanda Viégas and Martin Wattenberg of Google. They discussed many ways that machine learning and visualization are connecting and benefiting each other. Martin and Fernanda showed a number of examples, both from their work and others, of how visualization can help people better understand what is going on (beyond the black box, so to speak) in machine learning. Their talk was complemented by Helwig Hauser’s closing capstone that examined how visualization is moving onto larger and larger data sets. Up front, he pondered what problems our community has “solved” in the last 25 years. While it may be difficult to think of many, he rightfully also asked when is a problem really ever “solved”? Developing “sufficient” solutions to a bevy of problems simply may be good enough and may be an indicator of good progress. He provided many examples where visualization has done just that.

I saw many nice presentations at the conference and was trying to come up with a theme or two that emerged, but I had a tough time doing so. Perhaps one broad theme I observed was many papers dealing with the HCI aspects of visualization. Topics ranging from evaluation to interaction to storytelling all seemed to have a strong presence this year. Another nice set of papers concerned text and document visualization as well.

EuroVis traditionally hosts a nice conference dinner on Thursday evening. This year it was at a restaurant on Montjuïc, a mountain (actually more of a hill) on the southwest side of the city. The restaurant’s deck afforded a beautiful view down onto the city. The conference organizers also graciously sponsored a guided tour of the famous Sagrada Familia basilica in downtown Barcelona on Wednesday evening. The church is simply stunning both inside and out, and has become an iconic landmark for the city.

One of my favorite aspects of EuroVis is that the conference provides lunch for attendees there at the conference site. Not having to trudge offsite to a restaurant simply gives more time to sit and talk with fellow attendees, old friends, and new acquaintances. The smaller size of EuroVis compared to VIS also makes it easier to find colleagues. All these things combine to provide a little more relaxed lunchtime. I think my lunch conversations were my favorite aspect of the conference this year. It was great hearing what so many friends are working on currently.

In a lucky coincidence, my home university, Georgia Tech, participates in a cooperative study-abroad program with UPC, the host of EuroVis. Our faculty spend the summer there and teach our courses to our own students who also travel there for the term. My fellow Interactive Computing faculty member and good friend Mark Guzdial was literally teaching classes in the same buildings in which EuroVis was occurring. He even was able to drop in and hear my presentation at the conference. IC PhD student Barbara Ericson is teaching the undergraduate infovis class there this summer too. She asked me about giving a guest lecture while there, but I figured that I’d take a break from the teaching. :^)

If you haven’t submitted a paper to or attended EuroVis yet, I strongly encourage you to do so. I hadn’t attended until about five years ago, but now I try to make it back as often as I can. The paper quality is excellent and it’s usually hosted in a beautiful European city. Next year’s conference is in Brno, the second largest city in the Czech Republic. (With VIS ’18 in Berlin, apparently they didn’t take my suggestion that EuroVis should be in New Orleans, LA.) Just be on the lookout for dragons that look like alligators.

Tips for being a Good Visualization Paper Reviewer

This past year I was papers co-chair for the IEEE VAST (Visual Analytics) Conference, and it gave me the opportunity to read lots of paper reviews again. I had been papers co-chair for VAST once before, in 2009, and twice for the IEEE InfoVis Conference shortly before that. Additionally, I’ve been a (simple) reviewer for hundreds of papers since starting as a professor in 1989, and my students, colleagues, and I have written many papers that have received their own sets of reviews. Add it all up, and I’ve likely read over a thousand reviews in my career.

So what makes a good review, especially for visualization and HCI-related research papers? Reading so many VAST reviews this spring got me thinking about that topic, and I started jotting down some ideas about particular issues I observed. Eventually, I had a couple pages of notes that have served as the motivation of this article.

I have to say that I was inspired to think about this by Niklas Elmqvist’s great article on the same topic. I found myself noting some further points about the specific contents of reviews, however, so view these as additional considerations on top of the advice Niklas passed along.

I decided to keep the list simple and make it just a bullet list of different points. The starting ones are more specific recommendations, while the latter few turn a little more philosophical about visualization research on the whole. OK, here they are:

• Suggest removals to accompany additions – If you find yourself writing a review and suggesting that the authors add further related work, expand the discussion of topic X, insert additional figures, or simply add any other new material AND if the submitted paper is already at the maximum length, then you also need to suggest which material the authors should remove. Most venues have page limits. If you’re suggesting another page of new content be added, then which material should be removed to make room for it? Help the authors with suggestions about that. It’s entirely possible that authors could follow reviewers’ directions to add content, but in order to create the space, they remove other important or valuable sections. Deciding which material to take out is often one of the most difficult aspects of revising a paper. Point out to the authors sections of their paper that were redundant, not helpful, or simply not that interesting. That is review feedback they actually will appreciate receiving.

• Be specific – Don’t write reviews loaded with generalities. Be specific! Don’t simply say that “This paper is difficult to read/follow” or “Important references are missing.” If the paper is difficult to follow, explain which section(s) caused particular difficulties. What didn’t you understand? I know that sometimes it can be difficult to identify particular problematic sections, but do your best, even if it is many sections of the paper. Similarly, don’t just note that key references are missing – tell the authors which ones. You don’t need to provide the full citation if the title, author, and venue make it clear which papers you mean, but do provide enough explanation so that authors can determine which article(s) you believe have been overlooked. Further, provide a quick explanation about why a paper is relevant if that may not be clear. Finally, a particular favorite (not!) of mine, “This paper lacks novelty.” No, don’t just leave it at that. If a paper lacks novelty, then presumably earlier papers and research projects exist that do similar things. What are they? How are they similar (if it isn’t obvious)? Explain it to the authors. The unsupported “lacking novelty” comment seems to be a particular way for reviewers to hide out and is an element of a lazy review.

• Don’t reject a paper for missing related work – I’ve seen reviewers kill papers because the authors failed to include certain pieces of related work. However, this is one of the easiest things to fix in a paper upon revision. Politely point out to the authors what they missed (see the previous item), but don’t sink a paper because of that. Now, in some cases a paper’s research contribution revolves around the claim of introducing a new idea or technique, and thus if the authors were unaware of similar prior work, that can be a major problem. However, I haven’t found that to be the case too often in practice. We all build in some way on the work of earlier researchers. Good reviewers help authors to properly situate their contribution in the existing body of research, but don’t overly punish them for not being aware of some earlier work.

• Fight the urge to think it’s all like your work – When reviewers have done prior research in the area of a new paper, it often seems easy for them to think that everything is just like their work. As a Papers Chair, I’ve read quite a few reviews where a reviewer mentions their own prior work in the area as being highly relevant, but I honestly couldn’t see the connection. This is a type of bias we all have as human beings. Simply be aware of it and be careful to be fair and honest to those whose work you’re critiquing.

• Don’t get locked into paper types – Paper submissions to the IEEE VIS Conference must designate which of five paper types they are: Model, Design study, Technique, System, and Evaluation. Tamara Munzner’s “Process and Pitfalls” paper describing the five paper types and explaining the key components of each can be valuable assistance to authors. There’s no rule that a paper must only be one type, however. Recently, I’ve seen reviewers pigeon-hole a paper by its type and list out a set of requirements for papers of that type. Sometimes this does a disservice to the paper, I feel. It is possible to have innovative, effective papers that are hybrids of multiple types. I’ve observed very nice technique/system and technique/design study papers over the years. The key point here is to be open to different styles of papers. It’s not necessary that a paper be only one type (even if the PCS paper submission system forces the author(s) into making only one selection).

• Spend more review time on the middle – For papers that you give a score around a 3 (on a 1-5 scoring system), spend a little more time and explain your thoughts even further than normal. This will help the Papers Chairs and/or Program Committee when considering the large pile of papers having similar middling scores at the end. By all means, if you really liked a paper and gave it a high score, do explain why, but it’s not quite so crucial to elaborate on every nuance. Similarly, if a paper clearly has many problems and won’t be accepted, extra review time isn’t quite so crucial. For papers that may be “on the fence”, however, carefully and clearly explaining the strengths and limitations of those papers can be very beneficial to the people above you making the final decisions on acceptance.

• Defend papers you like – If you’ve reviewed a paper and you feel it makes a worthwhile contribution, give it a good score and defend your point of view in subsequent discussions with the other reviewers. Particularly if your view is a minority opinion, you can feel pressure to be like the others and not to be seen as being too “easy”. Stand up for your point of view. There simply aren’t that many papers receiving strong, positive reviews. When you find one, go to bat for it and explain all the good things you saw in it.

• Don’t require a user study – OK, here’s one that’s going to ruffle a few feathers. There is virtually nothing that I dislike more in a review than reading, “The paper has no user study, thus I’m not able to evaluate its quality/utility.” Simply put, that’s hogwash. You have been asked to be a reviewer for this prestigious conference or journal, so presumably that means you have good knowledge about this research area. You’ve read about the project in the paper, so judge its quality and utility. Would a user study, which is often small, over-simplified, and assessing relatively unimportant aspects of a system really convince you of its quality? If so, then I think you need higher standards. Now, a paper should convince the reader that its work is an innovative contribution and/or does provide utility. But there are many ways to do that, and I feel others are often better than (simple) user studies. My 2014 BELIV Workshop article argues that visualization papers can do a better job explaining utility and value to readers through mechanisms such as example scenarios of use and case studies. Unfortunately, user studies on visualization systems often require participants to perform simple tasks that don’t adequately show a system’s value and that easily could be performed without visualization. Of course, there are certain types of papers that absolutely do require user studies. For example, if authors introduce a new visualization technique for a particular type of data and they claim that this technique is better than an existing one, then this claim absolutely should be tested. Relatively few papers of that style are submitted, however.

• Be open-minded to new work in your area – I’ve noticed reviewers who have done prior research and published articles in an area who then act like gatekeepers to that area. Similar to the “Thinking it’s all like your work” item above, this issue concerns reviewers whose past work on a topic seems to close their mind to new approaches and ideas. (It’s even led me, as a Papers Chair, to give less credence to an “expert” review because I felt like the individual was not considering a paper objectively.) I suspect there may be a bit of human nature at work here again. New approaches to a problem might seem to diminish the attention on past work. I’ve observed well-established researchers who seem to act as if they are “defending the turf” of a topic area – Nothing is good enough for them; no new approach is worthwhile. Well, don’t be one of those close-minded people. Welcome new ideas to a topic. Those ideas actually might not be that similar to yours if you think more objectively. In the past, I have sometimes thought that we might have a much more interesting program at our conferences if all papers received reviews from non-experts on a topic. People without preexisting biases usually seem to better identify the interesting and exciting projects. Of course, views of experts are still important because they bring the detailed knowledge needed sometimes to point out potential subtle nuances and errors.

 

Hopefully, these points identify a number of practical ways for researchers to become better reviewers. I’ve observed each of these problems occur over and over in the conferences that I’ve chaired. My hope is that this column will lead us all to reflect on our practices and habits when reviewing submitted papers. I personally am an advocate that reviews should be shared and published. At a minimum, faculty advisors can share their reviews with their students to help them learn about the process. It can become another important component of how we train future researchers. Furthermore, I’d be in favor of publishing all the reviews of accepted papers to promote more transparency about the review process and to facilitate a greater discourse about the research involved in each article.

I look forward to hearing your views about these items. Do any strike a particular chord with you?  Disagree about some?  Please feel free to leave a comment and continue the discussion on this important topic.

Impressions from VIS ’16

The VIS 2016 Conference in Baltimore was held just over a month ago. During the conference, I jotted down a few thoughts and impressions that have served as the basis for this post. The goal here is not a summary of particular papers, but is instead some high-level observations about trends and topics of conversation among attendees while there.

One big theme of the conference to me this year was visualization education and pedagogy. I think this thread piggybacked on the excellent Education Panel at VIS ’15 in Chicago. A key idea to emerge from that panel was the use of active learning methods and interactive exercises in visualization courses. At the panel, Marti Hearst from Berkeley and Eytan Adar from Michigan talked about their use of such methods in their respective classes. This fall I’ve tried to incorporate some of these kinds of activities in my graduate CS 7450 Information Visualization course. While my course still primarily follows a lecture/Q&A style (with plenty of videos and demos thrown in), I’ve sought to have at least one interactive exercise per class. In general, these exercises have followed one of two styles. First, I have the class generate analytic tasks or questions for the topic being covered that day. This is particularly effective in the section of the course where we examine visualizations of different types of data (time series, network, hierarchy, text, etc.). A second technique I’ve used is to give a small design challenge, and have students pair up and create visualization design ideas for about 10 minutes. Volunteers then show their designs and we discuss the pluses and minuses of each.

Getting back to this year’s conference, the education focus began with a workshop on pedagogical issues in data visualization. It was exciting to see so many attendees in this workshop, and most seemed to be teaching visualization courses at their respective schools. This theme continued in the main conferences at the meeting: InfoVis had a session with education as a primary theme, and VAST had a session where most of the papers were about visual analytics systems for analyzing and understanding data generated from MOOC classes. The majority of these papers were from Hong Kong University of Science and Technology.

A second theme of the meeting this year that I found interesting was simply “color.” From Theresa-Marie Rhyne’s tutorial to Brown University’s InfoVis paper about the Colorgorical system to the InfoVis Best Poster about Colour Palettes, color seemed to be a topic on everyone’s mind this year. Of course, that’s not surprising at a visualization conference, but it just seemed to have increased emphasis this year. I think that’s a great thing. It helps all of us visualization researchers to have visual perception experts teach us more about all issues color-related.

Another big topic of conversation at the meeting was the panel “On the Death of Scientific Visualization.” It’s been pretty obvious, both via number of submissions and attendance in the meeting rooms, that for a few years now interest in infovis and visual analytics has been expanding while that for scivis has been contracting. I don’t conclude from this that scivis is going away, however. The continued development of better techniques for scientific visualization is extremely important. I simply view this changing interest as being a function of the potential audience in these different subareas. The audience for scientific visualization is just that – scientists, for the most part. This is a relatively small set of people, but extremely important ones! The audience for infovis tools is much bigger, and in many cases, is the general public at large.

I think a huge turning point in these conferences was the InfoVis ’07 Conference in Sacramento. One session of the conference was titled “InfoVis for the Masses.” That was a theme echoing throughout the community that year as Hans Rosling’s GapMinder system and TED video had everyone talking, IBM’s ManyEyes system was extremely popular, and the NY Times had begun to excel at data-driven storytelling on their website. From that point forward, infovis grew tremendously in interest and popularity. So what I see with scivis currently is not at all the “death” of that field. I simply believe that InfoVis and VAST have grown tremendously and they each have a broader reach.

On Monday of conference week I attended the BELIV Workshop that focuses on evaluation-related issues in visualization. I’ve been fortunate to have attended every one of the BELIV workshops going back to the very first one in 2006 in Venice, Italy (not a bad spot for a meeting). I’ve long thought that the evaluation challenge – how do we tell why a specific visualization is more effective than another – is one of the very top open problems in visualization research. Unfortunately, many traditional HCI-based evaluation methods simply don’t get the job done of comparing visualizations’ utility, appeal, and effectiveness. (This idea was at the heart of my value-driven evaluation paper from the BELIV ’14 workshop.)

Reflecting back, I have to admit that I’ve been a little disappointed in the paper contributions at BELIV over the past couple meetings. It just doesn’t seem like interesting, new, useful ideas are emerging on this topic. I think that’s partly understandable as this is a very difficult problem to address – That’s what makes it such an important, challenging open problem to our community. But hopefully we’ll see some innovative evaluation methods and new approaches develop over the next few years. This is a great problem for young researchers to take on.

My final thought about the conference this year emerged as I sat through one of the last paper sessions and struggled to understand the research being presented in it, much as I had done for many of the earlier sessions. While part of this might be explained by the quality of the talks themselves (Jean-luc Doumont’s captivating capstone talk emphasized that issue as did Robert Kosara’s blog on common speaking mistakes), I don’t think that was the primary reason. I simply see it as a natural maturing of the field. Many of the individual subareas of visualization research (geovis, text vis, vis for ML, network vis, biomedical vis, time series data vis, etc.) have matured significantly now and have their own rich body of existing papers. To make a new contribution in these areas, one needs to do some very advanced research. Hence, it shouldn’t be too surprising that someone not well-versed in all the subarea literature has difficulty following the papers in that session of the conference.

I see this as a natural maturation of our field – Something that occurs in other domains as well and is simply difficult to avoid. It’s kind of too bad in a way though, because I think it makes the conference papers as a whole a little less accessible to newcomers without a deep visualization background, or even to us old-timers who haven’t kept up on a specific subarea. But it shows that as a community we are growing, making progress, and solving problems, all good things.

Well, those are some summary thoughts from VIS this year. I’m looking forward to next year’s conference in Phoenix, a city that I have never visited before. Ross Maciejewski tells me that the conference will take place in a nice area downtown and it will definitely be warm!

Next column: Being a good visualization paper reviewer