eagereyes

Visualization and Visual Communication

How to Get Excited About Standard Datasets

Robert Kosara / March 21, 2018

It can be hard to get excited about the standard datasets that we keep using to show how visualization and statistics work. But if that’s the case for you, it’s not the datasets’ fault, it’s you! Here’s how to keep that spark going!

Cars

What could be more interesting than cars? I mean, come on – they’re cars! And I’m not talking about boring Priuses or self-driving cars or any of that newfangled stuff. No, these are from the time when cars were still cars: the 1970s and early 80s. That’s what the cars dataset is all about (there are, it turns out, lots of car-related datasets, but there’s only one true cars). Real cars. Manly cars.

So yeah, cars. Like, from the 1970s. Look at them! All those cylinders (whatever those are)! Four and six and even eight cylinders! Crazy! Also weight and mileage and stuff. Who knew they had those in the 70s?

You can learn fascinating things, like that heavier cars have lower mileage – who knew? Or that more cylinders mean lower mileage. I know, somebody should really tell those car makers about that. Even acceleration is correlated with weight. You can’t make this stuff up!

Cars just never get old. I mean, cars. Who doesn’t love cars? Cars, cars, cars…

Iris

If the cars dataset seems a bit dated, surely the iris data will answer your burning questions. Who hasn’t stared at an iris plant and gone crazy trying to decide whether it’s an iris setosa, versicolor, or maybe even virginica? It’s the stuff that keeps you up at night for days at a time.

Luckily, the iris dataset makes that super easy. All you have to do is measure the length and width of your particular iris’s petal and sepal, and you’re ready to rock! What’s that, you still can’t decide because the classes overlap? Well, but at least now you have data!

Actually, it turns out that this data is even older than the cars! It’s from a 1936 paper! They sure knew their irises in the 30s. And it’s not like plants change all that much in 80 years.

Titanic

Of all the datasets, the Titanic data is clearly the most dramatic. Who isn’t obsessed with the disaster that happened over 100 years ago? Who hasn’t seen the movie that came out in 1997, which is, uh, just over 20 years ago now? I mean, who over the age of 40, of course (millennials don’t know anything, as usual)?

Well, the data is fascinating either way. You can see how people in the first class did much better than those in the second and third classes! Fascinating insights that you would never have guessed! And the crew mostly died too. It’s almost as if wealth bought you survival. Of course, by now they’re all dead so it’s not like it matters anymore.
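
If you want to check the class divide yourself, here is a minimal sketch in Python. It assumes the survivor counts per class as tabulated in the Titanic dataset that ships with R, which may not be the exact version of the data you have on hand; the function name survival_rates is made up for this example.

```python
# Survivors and totals per class. These counts are an assumption: they follow
# the class-by-survival table of the Titanic dataset bundled with R.
counts = {
    "1st": (203, 325),
    "2nd": (118, 285),
    "3rd": (178, 706),
    "Crew": (212, 885),
}


def survival_rates(counts):
    """Return the fraction of survivors for each class (hypothetical helper)."""
    return {cls: survived / total for cls, (survived, total) in counts.items()}


rates = survival_rates(counts)
for cls, rate in rates.items():
    # Print each class with its survival rate as a whole percentage.
    print(f"{cls:>4}: {rate:.0%}")
```

With these counts, first class comes out at roughly 62 percent, second class at 41 percent, and third class and crew at about a quarter each.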

Isn’t it amazing how much you can learn from just four variables, though? It doesn’t even matter who all those people were. They’re just numbers now anyway. They’ve all turned into data.

Love the Classics

The classic datasets are fine. If they bore you, maybe it’s you who’s boring? If they don’t interest you, maybe you have the wrong interests? Generations of students have learned to love them, and so will you!

Robert Kosara is Senior Research Scientist at Tableau Software, and formerly Associate Professor of Computer Science. His research focus is the communication of data using visualization. In addition to blogging, Robert also runs and tweets.

Copyright © 2006–2021 Robert Kosara · All original materials are available under CC-BY-SA