The Changing Goals of Data Visualization
The visual representation of data has gone through a number of phases, with its goals switching back and forth between analysis and presentation over time. Many introductions to visualization tend to portray historical examples as all being done for the same purpose. That, I argue in this short, incomplete, and cherry-picked history, is not true.
Early to Mid-1800s: Playfair, Nightingale, Snow, Minard
The first uses of graphics to represent data, interestingly, were very bare and abstract, and at the same time were mostly tools for communication. The abstract nature of these early charts is surprising when you consider the amount of ornamentation and decoration that was common on even simple household objects in the early to mid-19th century. John Snow’s and Charles Minard’s maps were downright stark compared with many maps drawn at the time.
Those charts were drawn to communicate, not to analyze. Snow’s cholera map often wrongly serves as an example of visual analysis, when it was drawn to convince. Similarly, Florence Nightingale’s chart of deaths in the Crimean War was used to illustrate her argument that improvements in hygiene would save many lives, and William Playfair illustrated the trade balance between England and its trade partners.
In the first thirty years of the 20th century, Otto Neurath designed a visual language, the ISOTYPE, that not only showed numbers in ways that were easy to read, but that also communicated what they meant. Want to show statistics about workers and factories? Show them as little worker and factory icons. Each icon represents a certain number of the respective object, making it easy to compare and to read off numbers.
Neurath wanted to educate people about the world: ISOTYPE is the acronym for International System of TYpographic Picture Education. His illustrations were meant to stand on their own, without somebody there to explain. Nightingale and Playfair worked under the assumption that there would be explanatory text or they themselves would be there to make the argument, supported by the graphic. Neurath aimed to make his images self-contained and self-explanatory.
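The core rule of this kind of pictogram chart, one icon per fixed quantity, is simple enough to sketch in a few lines of Python. This is only a rough illustration: the function name `isotype_row`, the text "icons", and the numbers are all made up for this example, and dropping any remainder smaller than one icon is an assumption of the sketch, not a claim about how Neurath's charts handled partial quantities.

```python
def isotype_row(label: str, value: int, icon: str, units_per_icon: int) -> str:
    """Render one pictogram row: one icon per `units_per_icon` units.

    Only whole icons are drawn; any remainder smaller than one icon
    is dropped (an assumption made for this sketch).
    """
    if units_per_icon <= 0:
        raise ValueError("units_per_icon must be positive")
    icon_count = value // units_per_icon
    # Pad the label to a fixed width so the rows line up for comparison.
    return f"{label:<10}{icon * icon_count}"


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    print(isotype_row("Workers", 1200, "W", 100))
    print(isotype_row("Factories", 300, "F", 100))
```

Because every icon stands for the same quantity, comparing two rows comes down to comparing their lengths, and reading off a number comes down to counting icons.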
1960–70s: Bertin and Tukey
Thirty years later, the focus shifted from presentation to analysis, and the explanatory parts of the graphics disappeared again. John Tukey in particular was interested in what he could learn from the graphical depiction of the data, not whether it would work as a good presentation device. Jacques Bertin used graphical means for both analysis and communication, though his more presentational graphics were mostly maps.
Tukey invented a number of different plot types, among them the box plot and the stem-and-leaf plot. Bertin, in addition to his seminal theoretical work, created the reorderable matrix, a simple yet powerful tool for finding clusters in data. It represents one of the first uses of interaction in visualization.
The late 1970s and early 1980s saw a new development: the elaborate information graphic, which had existed for a while, was starting to be used to communicate numbers. Nigel Holmes is perhaps the most prominent designer of this kind of visualization.
Holmes actually uses the term explanation graphics, which is not only less misused but also describes the goal more clearly: to explain the data and its context. Holmes also clearly wanted to draw the reader’s attention and to entertain. The results were information graphics that were elaborate and unique, but always based on real data.
In stark contrast to Holmes, Edward Tufte advocates a minimal and unembellished style, with a strong emphasis on displaying the data and just the data. While Tufte keeps talking about showing information, his focus is clearly on displaying data for analysis. What sets his perspective apart from current information visualization research is that he almost entirely talks about static representations (often on paper, for its high potential information density), which the user can examine and use to explore the data and answer questions.
Tufte’s influence is clearly felt in the visualization field today, and his name is often invoked when elaborate information graphics are criticized. Tufte and Holmes represent the two extremes of the embellishment spectrum, and while Tufte’s end has been explored quite well by the scientific community, there is still work to be done on the Holmes side.
2000s: Infographics vs. Visualization
Today’s deluge of infographics is a mix of many different styles, with the loud and crazy ones unfortunately sticking out (and perhaps being the most common). Often, they are used to attract eyeballs and links to otherwise mundane articles, which is not an issue in principle: Holmes’ work partly served a similar purpose.
What many of these infographics lack, unfortunately, is accuracy and depth. While the information graphics of the 1980s were generally useful for understanding the context of the data, many of today’s infographics just add eye candy that is of little practical use, while playing fast and loose with the data.
At the same time, the academic visualization community is all about analysis and exploration of data, and almost entirely ignores information and explanatory graphics. There is clearly value in the work that is being done, but I also feel that a huge opportunity is being missed.
Over roughly 200 years of history, our idea of visualization has changed considerably, and work has been done for different purposes. Visual representations are very malleable, and can often serve different purposes reasonably (or even equally) well. To properly understand why things were done a certain way, we have to look at the work based on what we know about its creator’s goals and ideas. If we ignore this context, we are doing a disservice to the people we inherit from, as well as limiting our own ability to build on their work.