A Tale of Two Types of Visualization and Much Confusion
The term visualization is used to mean different things in different contexts, and even visualization that is based on data can be done for different reasons and with different goals. Mixing up these different types of visualization leads to misunderstandings and confusion. Here is an attempt at teasing apart the two major types of data-based visualization, and understanding the differences.
This discussion is based on a recent paper by yours truly that deals with visualization in a larger context and with how the different types differ. That paper will also be the subject of another posting in about a week.
The image below shows data about the people on board the Titanic using a technique called Parallel Sets. If you know something about that disaster and can read the labels, it is fairly easy to work out what is being shown, understand the visualization method, and learn something about the data (e.g., the relative chances of survival for men and women). And even if understanding this requires some work and experience, the goal of this method is to communicate the data as efficiently as possible. There are many other visualizations of this data that show different aspects more or less effectively.
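To make "learning something about the data" a little more concrete, here is a minimal sketch of the kind of counting that a display of categorical data such as Parallel Sets rests on: how often each combination of categories occurs, and what share of each group survived. The records are made-up toy values, not the real Titanic figures, and the names `crosstab` and `survival_rate` are my own illustrative helpers rather than part of any Parallel Sets implementation.

```python
from collections import Counter

# Hypothetical toy records of (sex, survived); NOT the real Titanic data.
records = [
    ("female", True), ("female", True), ("female", False),
    ("male", True), ("male", False), ("male", False), ("male", False),
]


def crosstab(rows):
    """Count how often each (sex, survived) combination occurs."""
    return Counter(rows)


def survival_rate(counts, sex):
    """Share of survivors within one category of the first dimension."""
    survived = counts[(sex, True)]
    total = survived + counts[(sex, False)]
    return survived / total if total else 0.0


counts = crosstab(records)
rates = {sex: survival_rate(counts, sex) for sex in ("female", "male")}
```

A pragmatic visualization tries to make exactly these proportions readable at a glance, without the viewer having to compute anything.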
If a visualization is designed to visually represent data, and to do that in such a way as to gain new insights into that data, it shall be called a pragmatic visualization. The basic idea is that using the human visual system (instead of automatic means like data mining or statistics), we can gain insight into data, and develop an understanding of the data and the structures in it. To determine whether a visualization is pragmatic, we simply ask if it allows us to efficiently read the data (or at least the relationships between subsets) from the display.
Artists also use visual means to show data, but do that in a different way. When media artists make what Manovich calls data art, they are usually much more interested in the data than in the visual representation. What such projects have in common is that they use visual representation, but not in the pragmatic sense. Their main goal is to convey an idea, for example to alert the viewer to the fact that all communication is (at least potentially) being monitored. The visual means used to do this are fairly simple, though. One might even ask why they use visual means of representation in the first place, when the actual work is mainly conceptual. Are they trying to pretty things up?
The website theyrule is a prime example of this. The point of that website is to collect data about the boards of directors of American companies, showing the many ways these companies are connected by having the same people on their boards, or people who know each other from sitting together on the boards of other companies.
A different example is Jason Salavon's American Varietal, a piece commissioned by the U.S. Census Bureau (shown at the top of this posting). In a comment to a recent posting about this project on infosthetics, somebody asked, "Does this convey the information?" Of course it doesn't; that is not the goal!
What all this leads to is a way to analyze the differences between the two types of visualization. Pragmatic visualization has the following properties.
- Communicate Data. The main point of this type of visualization is to effectively and efficiently make the user understand the data.
- Visual Efficiency. The means by which visualization works, and what makes it so interesting, is that it uses the visual channel to convey a lot of information. To do this, visualizations are designed to be perceptually efficient and make it as easy as possible to see the interesting information.
- Data Is Given. Pragmatic visualization is not concerned with collecting data, though it often requires cleaning and/or interpolating data, and preprocessing it in various other ways. But obtaining the data, or showing the fact that the data exists at all, is not the point of this type of visualization.
Artistic visualization is almost the exact opposite.
- Communicate Concern. The data is a vehicle to communicate deeper concerns or ideas. Making the public aware of the possibilities of surveillance is a typical example of such a project. Generating a useful visual interface can be a part of this (as in theyrule), but it is not the main (or only) goal.
- Visual Effectiveness Not an Issue. Many artistic visualizations are not designed to be effective, but are either strongly based on metaphors (to an extent that can hurt perception) or about the exploration of a form. Artistic visualizations have a sublime or contemplative quality that will be discussed in another posting.
- Data Collection. Because the existence of the data is often a part of the message, the collection of the data is an important part of the work. This can also reflect the amount of work that went into data collection, which can be a considerable effort.
Looking at one type of visualization while expecting the other will lead to disappointment and misunderstandings. While there is undoubtedly an argument to be made about the two types of visualization being able to learn from each other, the first step is recognizing and appreciating the differences. Web sites like infosthetics blur the line and lead to confusion. That is not to say that there is no place for aesthetics in visualization (quite the opposite!), but that it is important to understand the different possible goals that visualization can serve, and to measure the results using the right yardsticks.