When Details Hide the Story
Kaiser Fung doesn't like this graphic that accompanied a recent story about the bird flu in the Wall Street Journal. His redesign shows a lot less overlap and a lot more detail; so much, in fact, that it obscures the point of the chart.
Here is the offending graphic. The overlapping circles are hard to read with any precision and are hiding information. It's a pretty dramatic chart though, with the color choice reinforcing that something severe is happening.
Under the heading Is It Worth the Drama?, Fung proposes this no-drama redesign.
His point seems to be to show individual days, since that is what the original data contains. But that graph is way too detailed and shows information that is of no consequence. The original story kindly links to the underlying data, which made it easy to dig into this a bit and try things out.
I immediately had a sense that we're seeing a weekly pattern here. The time frame is only a few months, and there are fairly regular gaps that I figured were weekends. So I made a quick chart in Tableau of just the main part of the data, with the color showing the day of the week. The color legend isn't important here since the point is to find the days that have the highest peaks (I did label those).
Clearly, certain days stick out, and they're generally not the weekends. Most activity seems to be focused on Tuesday and Friday, and I'm sure there are reasons for that (like meetings where somebody in charge signs off on the results, since the dates are the confirmation dates). The one Sunday in mid-April was presumably an emergency meeting when the first big outbreak had been found.
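The same weekday check can be sketched outside of Tableau. This is only an illustration: the records below are made-up placeholder values, not the USDA data the story links to, and `totals_by_weekday` is a name I picked for the example.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# HYPOTHETICAL (confirmation date, birds affected) records for illustration.
# These are placeholder values, not the figures from the linked data set.
records = [
    (date(2015, 4, 14), 120_000),
    (date(2015, 4, 17), 300_000),
    (date(2015, 4, 19), 50_000),
    (date(2015, 4, 21), 200_000),
    (date(2015, 4, 22), 10_000),
    (date(2015, 4, 24), 250_000),
]

def totals_by_weekday(rows):
    """Sum the counts for each day of the week."""
    totals = defaultdict(int)
    for day, count in rows:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        totals[WEEKDAYS[day.weekday()]] += count
    return dict(totals)

weekday_totals = totals_by_weekday(records)

# Print the weekdays from largest to smallest total.
for name, total in sorted(weekday_totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:<10} {total:>10,}")
```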
Either way, this is not very interesting. The inner workings of the U.S. Department of Agriculture's Animal and Plant Health Inspection Service are not the point of the story. Both Fung's redesign and my weekday-focused chart introduce a lot of noise that drowns out the actual point.
Instead, what makes a lot more sense to look at is a less granular view. Let's look at weekly totals instead of showing each day.
Now we're looking at an outbreak: it starts out quickly, peaks, then drops off more slowly. The weekly boundaries are still arbitrary; a better way of looking at this would be some sort of smoothing. But this is a much better view because it shows a more relevant level of detail.
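For anybody who wants to try both ideas, here is a minimal sketch of weekly totals next to one possible kind of smoothing, a trailing seven-day sum. The daily numbers are again made-up placeholders rather than the real data, the function names are my own, and the trailing window is just one assumption about what "some sort of smoothing" could mean.

```python
from collections import defaultdict
from datetime import date, timedelta

# HYPOTHETICAL daily counts for illustration; not the USDA figures.
daily = {
    date(2015, 4, 14): 120_000,
    date(2015, 4, 17): 300_000,
    date(2015, 4, 19): 50_000,
    date(2015, 4, 21): 200_000,
    date(2015, 4, 22): 10_000,
    date(2015, 4, 24): 250_000,
}

def weekly_totals(counts):
    """Sum counts per calendar week, keyed by the Monday that starts it."""
    totals = defaultdict(int)
    for day, count in counts.items():
        monday = day - timedelta(days=day.weekday())
        totals[monday] += count
    return dict(sorted(totals.items()))

def rolling_sum(counts, window=7):
    """Trailing sum over `window` days for every day from first to last date.

    Days without a record count as zero, so the result does not depend on
    where a week happens to start.
    """
    day, end = min(counts), max(counts)
    result = {}
    while day <= end:
        result[day] = sum(counts.get(day - timedelta(days=i), 0)
                          for i in range(window))
        day += timedelta(days=1)
    return result

weekly = weekly_totals(daily)
smoothed = rolling_sum(daily)

for monday, total in weekly.items():
    print(f"week of {monday}: {total:,}")
```

The weekly version changes if you move the week boundary, while the trailing sum gives one value per day and has no boundaries to argue about.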
This view is also more similar to the original than Fung's. Perhaps the original could have been done better, but the intent was clearly there: the overlapping circles combine and create a sort of smoothing effect. They also cause some issues, no doubt. But the point of the chart is really not individual days; it's the grand totals and the speed with which the outbreak happened.
It's easy to jump on charts that hide parts of the data, and often those are poorly done. But insisting on showing more data does not always lead to a better solution. That is particularly true when the data is as noisy as in this case, and when there are artifacts that only distract from the actual point.
We always need to ask why the chart was made, for what purpose, to what end.
Posted by Robert Kosara on October 25, 2015.