eagereyes

Visualization and Visual Communication

From Data to Trends

Robert Kosara / June 3, 2012

After my recent abstraction exercise created some interesting discussion but kind of went off in a slightly wrong direction, here is another experiment.

Let’s take the data from Nigel Holmes’ famous Monster chart and turn that into a simple bar chart. I chose one about as basic as the one in Bateman et al.’s study, but the filled bars are slightly less ugly. The x axis shows years in the 20th century, the y axis millions of dollars.

The point here is fairly straightforward: costs are rising. But how fast are they rising? Are they rising faster at the end of the time period depicted than in the middle? It’s not that obvious from this chart. So let’s try a line.

The line makes the rate of change easier to see, because it is turned into an angle. We are good at spotting differences in angles, especially with lines that are laid end-to-end. The increase was actually stronger in 1978, flattened out in 1980, and then increased again in 1982. Interesting.
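To make the "rate of change as an angle" idea concrete, here is a minimal sketch that computes the slope of each line segment. The numbers are hypothetical placeholders chosen for illustration, not the values from the Monster chart, and the function name `segment_slopes` is my own.

```python
# Hypothetical placeholder values, NOT the figures from the original chart.
years = [1972, 1974, 1976, 1978, 1980, 1982]
spending = [60, 80, 120, 190, 230, 300]  # millions of dollars (illustrative)


def segment_slopes(xs, ys):
    """Return the slope (change in y per unit of x) of each line segment."""
    return [
        (y1 - y0) / (x1 - x0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    ]


slopes = segment_slopes(years, spending)
for end_year, slope in zip(years[1:], slopes):
    # A steeper segment means a larger increase per year.
    print(f"segment ending {end_year}: {slope:.1f} million dollars per year")
```

The angle a reader actually sees also depends on the chart's aspect ratio, so the slopes are only comparable to each other within one chart, not across charts drawn at different sizes.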

But the story is still about the overall increase. After all, that’s the message of the monster, and it’s what is most obvious in the bar/teeth chart. So let’s depart from the actual numbers a bit and smooth the line out somewhat.

This is smoothing over the 1978 bump for a simpler message: costs have been rising, and they are rising at a faster pace in the early 1980s than they did in the early 70s. That sentence glosses over the little bump, so why does the chart need to show it? In fact, we could even go a tiny bit further and smooth out the line to make it look nicer and do away with the individual points.

This is now a more abstract version of the chart that no longer tells you how many data points there were and where: they could be anywhere on the line. The message is still the same; all I have done is remove some detail that is not essential for the overall point.
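For readers who want to try this kind of smoothing themselves, one simple option is a centered moving average. This is only a sketch of one common technique, not necessarily how the charts in this post were made, and the data values are again hypothetical placeholders rather than the chart's real numbers.

```python
def moving_average(values, window=3):
    """Smooth a series with a centered moving average.

    Near the ends, the window shrinks to the neighbors that exist,
    so the output has the same length as the input.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbors = values[lo:hi]
        smoothed.append(sum(neighbors) / len(neighbors))
    return smoothed


# Hypothetical placeholder values, NOT the figures from the original chart.
spending = [60, 80, 120, 190, 230, 300]  # millions of dollars (illustrative)
print(moving_average(spending))
```

Note that the shrinking window pulls the first value up and the last value down, which is one reason to keep the original points visible behind any smoothed line.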

Where is data and detail necessary, and where does it become a distraction? To make a point, you don’t need every detail. Just like there is no point in showing every single tiny variation in the consumer price index to talk about how the real value of the minimum wage is dropping, it’s often unnecessary to depict every single detail in the data when presenting it.

Filed Under: Journalism

Robert Kosara is Senior Research Scientist at Tableau Software, and formerly Associate Professor of Computer Science. His research focus is the communication of data using visualization. In addition to blogging, Robert also runs and tweets.

Comments

  1. Sebastian says

    June 3, 2012 at 10:58 pm

    Wouldn’t it be a good idea as well to look at the data on a log scale? For growth processes etc. this will reveal information on the relative growth, which might be at least as interesting in this case, in my opinion!

  2. derek says

    June 4, 2012 at 12:58 am

    If you’ve taken the real data away, why have you still got those extraordinarily precise scales? Shouldn’t you detune those as well? If you don’t, it looks like you’re writing a check your data won’t cover. It’s inappropriate precision.

  3. Jorge Camoes says

    June 4, 2012 at 2:28 am

    Robert, I am not impressed by the impressionist style in data visualization, but there is a relationship between detail and chart size. It would be easier to accept your last line chart if you were using sparklines.

    • Hig says

      June 8, 2012 at 8:47 pm

      Overall I agree with the folks arguing this is underkill. I don’t think the details distract.

      When I read a graph, I look for variation in the data to get a sense of noise in the system. The slight up and down in the data-precise curve doesn’t distract me from the overall increase, but it does tell me that there’s actually very little deviation from this trend.

      Given the original data was election expenditures, I’d argue for a bar graph, not a line graph. The line suggests that interpolating between expenditures is reasonable – e.g. that expenditures in 1977 were around $150 million. Actually they were 0 though, since it’s not an election year. Bars that are relatively narrow would match the data better, and I don’t think anyone would have trouble seeing the rising trend. However, if the change in trend is critical, I can see the argument for ignoring this subtlety.

      Also the graph would lend itself to being small since there’s not much in it. Two labeled points on each axis and you could shrink it to <100×100 pixels.

  4. Jon Peltier says

    June 4, 2012 at 3:55 am

    Replacing the bar chart with a line chart was a good move. The rest was underkill. There was little enough data in the original chart. You chose to place all the curvature at the beginning, arbitrarily. Removing the data points makes this a bit dishonest.

  5. David N says

    June 4, 2012 at 6:39 am

    Just to play a bit of Devil’s Advocate – so would a LOESS curve be a “… bit dishonest” of a data representation? As the abstract version is essentially (although not exactly) that.

  6. Jon Peltier says

    June 4, 2012 at 10:27 am

    David –

    Good point. That’s why I always show a LOESS curve with the points in the background. But it’s dishonest to enhance detail from one part of the curve (the bottom half’s curvature) while ignoring the rest (making the top half straight). In fact, if you lay the smoothed curve over the segmented curve that omits no points, you will find that the smooth curve falls beneath three of the points, not just beneath the bump for ’78.

    The only reason not to use a straight line fit is to support an argument that prices are increasing faster in the 80’s than in the 70’s. But if we look at percent change, it was less than 35% from 72-74, more than 55% from 74-76 and 76-78 and about 25% from 78-80 and 80-82.

    • David N says

      June 4, 2012 at 12:47 pm

      Absolutely agreed on the background – as in, (very) light gray – actual data points.

      And I (and hopefully everyone else who reads this far) appreciate the additional info/insights you provided.

      • Jon Peltier says

        June 4, 2012 at 1:02 pm

        Yeah, sorry if I’m letting my data loss aversion show, as well as my distrust for charts without points.

  7. Danyel F says

    June 6, 2012 at 6:19 pm

    I’m going to have to join several of the others in arguing for overkill. Lots of the “deceptive” charts that I see involve carefully selecting sampling intervals, hiding bumps by smoothing averages, and implying trends.

    A person seeing that last chart without hearing your verbal caveats (“the apparent thousands of points are actually six of them, sampled every other year–and they’ve been smoothed to hell and back”) would make a false statement about the chart.

    If all you want to communicate is “it went up, you know”, I’d go with this.
    http://www.my-hometownrealty.com/wp-content/uploads/2011/04/arrow-up.jpg

    Otherwise, if you want data–and not art–let’s put in a couple of data points. Just here and there.


Copyright © 2006–2019 Robert Kosara · All original materials are available under CC-BY-SA