The United Kingdom’s Met Office recently released temperature data for about 1700 weather stations across the globe from 1701 to 2009. Here is an interactive visualization (built using Protovis) of that data for you to explore.
If you are reading this in your newsreader or using Internet Explorer, you will not see anything interesting below. Use a modern browser like Safari, Firefox, or Chrome. Internet Explorer is missing some crucial features that Protovis depends on.
I will post a more detailed analysis on Monday, but here are some quick facts: about 1.4 million data points from a varying number of stations (around 1700 at peak) over more than 300 years. The data points are monthly averages per station, which I aggregated into overall monthly and yearly averages and standard deviations.
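The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the code behind the visualization: the record layout (station, year, month, temperature), the sample values, and the choice of population standard deviation are all assumptions, and the yearly figure here is taken over all data points in a year rather than over monthly means.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate(records):
    """Group (station, year, month, temp) records into monthly and yearly stats."""
    monthly = defaultdict(list)
    yearly = defaultdict(list)
    for _station, year, month, temp in records:
        monthly[(year, month)].append(temp)
        yearly[year].append(temp)

    def summarize(groups):
        # One summary per group: average, spread, and extremes.
        return {
            key: {
                "mean": mean(values),
                "std": pstdev(values),  # population std dev (an assumption)
                "min": min(values),
                "max": max(values),
            }
            for key, values in sorted(groups.items())
        }

    return summarize(monthly), summarize(yearly)


# Hypothetical sample: two stations, two months of one year.
records = [
    ("A", 1900, 1, -2.0),
    ("B", 1900, 1, 4.0),
    ("A", 1900, 7, 18.0),
    ("B", 1900, 7, 22.0),
]
monthly_stats, yearly_stats = aggregate(records)
print(monthly_stats[(1900, 1)])
print(yearly_stats[1900])
```

Every station counts equally here, wherever it is located, which matches the "no processing" approach described in this post.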
The visualization consists of two views. The bottom view shows yearly averages (dark blue line) and standard deviation spread (light blue background) over about 270 years (no good reason for this choice other than to fit into my theme). Mouse over the view to see the detailed data (average, min/max, and standard deviation) displayed in the lower right.
The top view shows the monthly averages for the currently selected year (from the bottom view). The gray shape in the background gives you some context about the range of values over all years for each month, so you can see whether the current one is close to the top or bottom. Note how the winters are getting less cold and the curve across the year moves up and gets flatter as you move forward in time.
10 responses to “Interactively Explore Climate Data”
This is very nice. Definitely shows the trends in a clear way. I’d love to have you get involved in the discussion at the #climatedata forum – http://climatedata.blprnt.com
This is an excellent representation. The 2d geo viz and our 3d one both show the changing temp within a year but do very little to show the gradual year-on-year increase.
Your representation highlighting the flattening of the curve over time is brilliant!!
Nice! I haven’t seen anyone interactively link the annual cycle to long-term changes in this way before. Neither the grey background nor the standard dev shows up for me (I’m running FF 3.5.5 under XP).
I agree, this is really elegant and quite revealing. But, a question from the southern hemisphere: how have you handled the differences between northern and southern seasons? If you are taking the data per month as is (rather than shifting the “phase” of the southern year to compensate), then you’re mixing northern winter with southern summer, and vice versa – so the seasonal fluctuations in the year graph are being flattened somewhat. Might the northern hemisphere seasonal shape in the year graph be due to the dominance of northern hemisphere data in the set?
Also, I wonder what happened in 1951 to create that spike? Did a whole lot of new data sources become available?
I didn’t do any processing, and nothing to account for north/south differences. These are land-based weather stations, and there is a clear majority in the northern hemisphere. I will look into this further for my next article, though. I also can’t explain the 1951 spike yet, but I’m also going to look into the number of stations over time, and will certainly take a closer peek at that year.
I pointed out a couple of potential problems on Twitter this morning and thought it made sense to attach them to this post in a clearer form.
First of all, I very much like the general approach taken here and agree that linking the annual curve with the longer-term trend works well. As Robert admitted this morning, these graphs are for all the stations. The video by Michael Schieben at
shows quite clearly how the number and geographic location of these weather stations varies over time.
This means we cannot simply draw conclusions about the overall temperature trend or the annual curve, because the underlying data does not cover a consistent geographic region. The average temperature shown in the curve is so obviously higher in the 1900s than in the 1700s at least partly because the data from the 1700s was all from Europe, whereas the data from the 1900s was global and included warmer parts of the world. The jump in average temperature in 1951 is because data from many new weather stations in Africa were added at that time. Again, Michael’s video shows this clearly.
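A toy example makes the effect concrete. The station names, start years, and temperatures below are invented for illustration, and no station warms at all, yet the naive average jumps when the warmer stations join.

```python
from statistics import mean

# Hypothetical stations: each reports the same temperature every year
# from its start year onward, so no individual station ever warms.
stations = {
    "europe_1": {"start": 1900, "temp": 8.0},
    "europe_2": {"start": 1900, "temp": 10.0},
    "africa_1": {"start": 1951, "temp": 26.0},
    "africa_2": {"start": 1951, "temp": 28.0},
}


def naive_average(year):
    """Average the raw temperatures of all stations reporting in `year`."""
    active = [s["temp"] for s in stations.values() if s["start"] <= year]
    return mean(active)


before = naive_average(1950)  # only the two European stations report
after = naive_average(1951)   # the two African stations have joined
print(before, after)
```

The difference between the two values comes entirely from which stations are in the set, not from any change in climate.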
You can see that Robert’s curve gets smoother over time. This is because there are more stations over time so it averages out more of the variation in the data.
I suspect some of the flattening of the annual curve is because initially the data was all from Europe, and more stations from the southern hemisphere were gradually added in later years. The difference in seasonality between the hemispheres means that adding them together will flatten it.
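Here is a rough sketch of that flattening, assuming idealized stations whose yearly cycle is a simple cosine, with southern stations exactly six months out of phase. The mean temperature, amplitude, and station counts are made up.

```python
import math


def monthly_curve(n_north, n_south, mean_temp=10.0, amplitude=10.0):
    """Average monthly temperature over idealized northern and southern stations."""
    curve = []
    for month in range(12):  # 0 = January
        phase = math.cos(2 * math.pi * month / 12)
        north = mean_temp - amplitude * phase  # coldest in January
        south = mean_temp + amplitude * phase  # warmest in January
        curve.append((n_north * north + n_south * south) / (n_north + n_south))
    return curve


def seasonal_range(curve):
    """Difference between the warmest and coldest month of the averaged curve."""
    return max(curve) - min(curve)


all_north = seasonal_range(monthly_curve(10, 0))
mostly_north = seasonal_range(monthly_curve(3, 1))
balanced = seasonal_range(monthly_curve(5, 5))
print(all_north, mostly_north, balanced)
```

In this toy model the seasonal range shrinks as the southern share grows and vanishes when the two hemispheres are equally represented, even though every single station keeps its full seasonal swing.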
Another clue that there is a problem here is that a naive interpretation of this temperature curve shows an increase of about 8 °C between 1800 and 2000. I believe this is much higher than the scientific consensus on what the increase has been.
Note that I’m not claiming there has been no increase in average temperatures – a very strong majority of the world’s climate scientists seem to think there has been. I would like to see Robert (or somebody) repeat this visualization with a consistent set of weather stations so that we can properly control for geographic variation.
I would also like to say that I think it is wonderful that Robert, Jer, Michael, Flink labs and so many others have embraced this challenge.
Jeff has pointed out concerns about using actual temperatures to compute a global average. Since the number and distribution of weather stations has changed over the years, the trend chart does not reflect the true change in temperature.
Climate scientists like James Hansen of NASA recognized this issue in the 1970s and started using temperature anomalies to overcome this problem. You can find the exact details at sites like NASA’s GISTEMP.
The concept is to develop a long-term baseline period (1961-1990 for NASA) for each station by day, then calculate the anomaly (actual temp – baseline) for each station for each day. This way, you can calculate the changes from the baseline period.
Since anomalies are differences between a station’s actual and baseline values, it is appropriate to compute average anomalies across a number of stations.
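A minimal sketch of the anomaly idea, with several assumptions: it works on monthly values rather than daily ones, the record layout and sample data are hypothetical, stations with no data in the baseline period are simply skipped, and it is not the actual GISTEMP procedure.

```python
from collections import defaultdict
from statistics import mean


def yearly_anomalies(records, baseline=(1961, 1990)):
    """Average per-station anomalies by year.

    records: iterable of (station, year, month, temp) tuples.
    """
    first, last = baseline

    # Step 1: baseline mean for each station and calendar month.
    base_values = defaultdict(list)
    for station, year, month, temp in records:
        if first <= year <= last:
            base_values[(station, month)].append(temp)
    base_mean = {key: mean(values) for key, values in base_values.items()}

    # Step 2: anomaly = actual - baseline, collected per year.
    by_year = defaultdict(list)
    for station, year, month, temp in records:
        key = (station, month)
        if key in base_mean:  # skip stations without baseline data
            by_year[year].append(temp - base_mean[key])

    # Step 3: average the anomalies across stations.
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Hypothetical data: a cold station with a long record and a warm station
# that only starts in 1961. Both warm by 1.0 degree between 1961 and 2000.
records = [
    ("cold", 1950, 1, 0.0),
    ("cold", 1961, 1, 0.0),
    ("cold", 2000, 1, 1.0),
    ("warm", 1961, 1, 25.0),
    ("warm", 2000, 1, 26.0),
]
anomalies = yearly_anomalies(records)
print(anomalies)
```

Averaging the raw temperatures in this sample would show a jump of 12.5 degrees between 1950 and 1961 purely because the warm station appears, while the anomalies stay flat until the stations themselves warm.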
Most long-term climate studies use temperature anomalies rather than the raw data.
I have a number of charts of long-term anomaly trends at http://chartsgraphs.wordpress.com/climate-trends/.
D Kelly O’Day
I looked at the data set and found that the distribution of weather stations changes a lot over time. Around 1950, Africa gets a lot of new weather stations. That’s why you can’t simply average over weather stations at different latitudes.
I made some visualizations myself to show this visually; see my last article at vis4.net.
Forgot the link to the mentioned article..