Tableau started the beta of its Tableau Public program today, and what better way to kick the tires than to run some more climate data through it? Below, you can look at temperature data from 343 weather stations over twenty years (77172 observations) to compare the difference from the baseline numbers in the 1970s and 2000s.
[Read more…] about Temperature Baseline Differences
Data
A Look At Climate Data
Whether you believe that global warming is real or not, a bit of validation of the source data is still interesting. This is my second look at the global temperature data recently released by the UK’s Met Office, this time using Tableau. There are some interesting data issues here, and a rather analytical visualization.
[Read more…] about A Look At Climate Data
Interactively Explore Climate Data
The United Kingdom’s Met Office recently released temperature data for about 1700 weather stations across the globe from 1701 to 2009. Here is an interactive visualization (built using Protovis) of that data for you to explore.
[Read more…] about Interactively Explore Climate Data
The Simple Way to Scrape an HTML Table: Google Docs
Raw data is the best data, but a lot of public data can still only be found in tables rather than as directly machine-readable files. One example is the FDIC’s List of Failed Banks. Here is a simple trick to scrape such data from a website: Use Google Docs.
[Read more…] about The Simple Way to Scrape an HTML Table: Google Docs
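For comparison, here is a rough sketch of what scraping a table involves when done by hand. It is not the Google Docs trick described in the post, but a swapped-in alternative using only Python's standard `html.parser` module. The names `TableScraper` and `scrape_tables` are made up for this illustration, and the sketch assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the rows of every <table> in a page as lists of cell strings.

    Illustrative only: assumes tables are not nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def scrape_tables(html):
    """Return all tables found in an HTML string."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Calling `scrape_tables(page_source)` on the text of a downloaded page returns one list of rows per table, which can then be written out with the `csv` module. Real pages tend to be messier than this sketch allows for, which is part of the appeal of letting a spreadsheet do the work.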
A Browser for Data.gov
Data.gov’s selection of data is slowly growing, but even with fewer than 300 datasets, it is difficult to keep an overview of what is there. Below is a little Java applet that provides a way to drill down into data.gov’s catalog using a variety of categories: reporting agency, geographic coverage, frequency, data type, etc. Besides giving a better idea of what is there, it also shows a number of inconsistencies that make finding data more difficult.
[Read more…] about A Browser for Data.gov
Data Is A Dish Best Served Raw
The recent opening of Data.gov has led to a number of discussions on data formats, feeds, what kinds of data, which agencies are or are not participating, etc. One key aspect that gets overlooked very easily, but that is really essential, is that what is being published is actual data: original, raw, unprocessed, undigested, naked data. Everything else is secondary.
[Read more…] about Data Is A Dish Best Served Raw
Pushing Data over Email
Email is still a useful transport mechanism for data (from Google Analytics, for example), despite FTP, web services, etc. Some websites offer data by email cheaply, while other forms of access can cost a lot of money. Email is also a push service, meaning you do not have to ask periodically whether new data has arrived, if you do it right. Of course, that service is rather useless without an automated way to get that data into a database. Here is an introduction to the procmail program and the ancient art of the Unix mail filter.
[Read more…] about Pushing Data over Email
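To give a flavor of what the receiving end of such a mail filter might look like, here is a minimal sketch in Python. It is not the setup from the post: it assumes the data arrives as a `text/csv` attachment, and the names `extract_csv_rows` and `main` are hypothetical. A filter like procmail can pipe a matching message to a program's standard input, which is what `main` reads.

```python
import csv
import io
import sys
from email import message_from_bytes, policy


def extract_csv_rows(raw_message):
    """Return the rows of every text/csv attachment in a raw email message.

    Assumption for this sketch: the data is sent as a text/csv attachment.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    rows = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "text/csv":
            text = part.get_content()  # decoded to str for text/* parts
            rows.extend(csv.reader(io.StringIO(text, newline="")))
    return rows


def main():
    """Read one message from standard input and print its CSV rows."""
    for row in extract_csv_rows(sys.stdin.buffer.read()):
        # A real filter would insert the row into a database here.
        print(row)
```

A script that calls `main()` could serve as the target of the mail filter, with the `print` call replaced by whatever database insert fits the data.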