eagereyes

Visualization and Visual Communication

When Rankings Are Just Data Porn

Robert Kosara / December 18, 2016

Rankings are a common way of talking about data: who made the most money, who won the most medals, etc. But they hide issues in the underlying data. Is the difference between first and second meaningful or just noise? Here is a data video that nicely demonstrates the problem.

Watch the first few minutes of this video about baby names in the U.S. over time. I find it fascinating, though not for the reason most people probably do.

I couldn’t make it past the first part about girls’ names, and not because the map was so enthralling. I kept staring at that bar chart on the right. That was way more interesting and revealing to me.

Let’s start in 1910: Mary leads, with somewhere between 4.5% and 5% of girls born that year being named Mary. Five percent! That means that over 95% of newborn girls were not named Mary. How is something popular when it’s less than 5%?

But it gets better. Watch the video again and keep an eye on the scale of the bar chart. It keeps getting smaller, until in 2014, the most popular name, Emma, is barely above 1%. Almost 99% of newborn girls are not named Emma! Again, how is that a popular name?

What I find interesting is how clearly this shows that names are becoming more diverse. There are many more names now, and parents no longer feel bound to give only common or well-established names.

The data behind this is not easy to find. The Social Security Administration lists the top five girls’ and boys’ names by state for the years 1980 to 2015. Bafflingly, though, they report the number of kids rather than a percentage of births, and the differences in population between states (and thus, presumably, in the number of births) are not trivial.

But the absolute numbers do show how small the margins are. In 2014, Vermont had the same number of Emmas as Olivias (40 each); in South Dakota, 60 Harpers outnumbered 59 Avas. A single decision here would have flipped the ranking. In many states, the difference between the top names is somewhere between a handful of births and a few dozen. These are not sweeping favorites; they’re flukes.
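
To make the size of these margins concrete, here is a minimal Python sketch that uses only the 2014 counts quoted above. The function name `top_two_margin` and the dictionary layout are made up for illustration; they are not part of anything the Social Security Administration provides.

```python
# Top-two girls' name counts for 2014, as quoted in the text above.
# The data layout is an assumption made for this illustration.
top_two_2014 = {
    "Vermont": [("Emma", 40), ("Olivia", 40)],
    "South Dakota": [("Harper", 60), ("Ava", 59)],
}


def top_two_margin(names):
    """Return the difference in births between the first and second name."""
    ranked = sorted(names, key=lambda pair: pair[1], reverse=True)
    (_, first), (_, second) = ranked[:2]
    return first - second


for state, names in top_two_2014.items():
    margin = top_two_margin(names)
    label = "tie" if margin == 0 else f"decided by {margin} birth(s)"
    print(f"{state}: {label}")
```

A margin of zero or one means the "most popular name" in that state hinges on a single family's choice.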

Even the populous states (or, more precisely, the ones with many births) have narrow margins, though. In 2014, California had 502,879 births according to the National Center for Health Statistics. Ranked first among girls’ names was Sophia, with 3,172 births, ahead of Isabella, with 2,717. That’s a difference of 455, or less than 0.2% of newborn girls (assuming 50% girls, since I can’t find a gender breakdown). In Florida, Isabella beat Sophia that year, 1,461 to 1,237, a margin of about 0.2%. Texas boasts 2,183 Emmas over 2,153 Sophias, a margin in the hundredths of a percent.
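
The percentage for California can be checked with a few lines of Python. This is only a sketch under the same assumption as in the text, namely that half of all births are girls; the function name `margin_share` and its `girl_fraction` parameter are invented for this example. Only the California figures are used, because the total births for Florida and Texas are not given here.

```python
def margin_share(first, second, total_births, girl_fraction=0.5):
    """Margin between two names as a percentage of newborn girls.

    girl_fraction is an assumption (50% by default), since the text
    has no gender breakdown of births.
    """
    girls = total_births * girl_fraction
    return (first - second) / girls * 100


# California, 2014: Sophia (3,172) vs. Isabella (2,717), 502,879 births.
ca = margin_share(3172, 2717, 502_879)
print(f"California margin: {ca:.2f}% of newborn girls")
```

Changing `girl_fraction` by a point or two barely moves the result, so the 50% assumption is not what makes the margin small.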

These rankings are meaningless. The differences they are based on are so tiny that they are of no consequence. Rankings have an air of authority and precision, but they hide all of that uncertainty. Maybe that’s why they’re so popular.

To be fair, the person who made this video tried to take this into account a bit by encoding the difference between first and second in the color’s saturation, but I don’t see how anybody would be able to keep track of that. If you really pay attention, you can see the map get fainter over time, though.

But really, the entire idea of a single most popular name per state is nonsense, especially in the last 20 years or so. It makes for a pretty animated map, sure. But in the end, it’s just data porn.

Robert Kosara is Data Visualization Developer at Observable. Before that, he was Research Scientist at Tableau Software (2012–2022) and Associate Professor of Computer Science (2005–2012). His research focus is the communication of data using visualization. In addition to blogging, Robert also runs and tweets.

Comments

  1. Olha Buchel says

    December 19, 2016 at 2:57 am

    The choropleth is not the best method to visualize such data. Geographically weighted summaries offer a better solution.

  2. Leonard says

    December 19, 2016 at 11:28 am

    You can find the raw data for this here: https://www.ssa.gov/OACT/babynames/limits.html

    Names are diverse, so I agree with you that the video’s focus on rank #1 is relatively meaningless (although names do have clear fashions, which a ranking can reflect, even though the absolute rank isn’t that meaningful).

    That said, I think the video is intended to be entertainment/data porn rather than educational. It was dramatic when all the states apparently flipped to the same name in the same year (ignoring the fatal flaw that larger states have a much bigger visual impact than more populated states).

Copyright © 2006–2022 Robert Kosara · All original materials are available under CC-BY-SA