A National Data Agency
President Obama promises a more responsible and accountable government that openly shares information with its people. This includes publishing executive orders and laws before they are signed, so everybody can comment. But it also needs to include the data decisions are based on. An information society needs its data to be available and accessible to make informed decisions – just like its leaders.
What impressed me most about President Obama's inauguration speech was this passage (emphasis mine):
And those of us who manage the public's dollars will be held to account, to spend wisely, reform bad habits, and do our business in the light of day, because only then can we restore the vital trust between a people and their government.
There is no reason to doubt his words: already, the new White House web site has a blog and new sections for all kinds of reports. They will also post executive orders and legislation for review by everybody who cares to look. There is also a very interesting executive order that Obama issued on his first day that reverses Bush's sweeping use of executive privilege to withhold information and allows much more access to Presidential records (including those of past Presidents).
All these great developments notwithstanding, there is still a piece of the puzzle missing: the numbers. This president, more than many (or all) before him, makes his decisions based on numbers. These numbers need to be published and made accessible so we can understand the decisions and make our own.
The challenge is not only data availability. A lot of data is, in fact, available. The US is the most transparent nation in the world – to an extent that can be frightening to an outsider (think pay data for state employees, property tax data, etc.).
The challenge is that a lot of data is published in a format that is human-readable, not machine-readable. This might sound like a good thing, but it's not. Machine-readable data can be processed and transformed into any number of human-readable forms; that direction is trivial. Making human-readable data accessible to a machine is much more difficult, error-prone, and expensive.
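To illustrate the easy direction, here is a minimal Python sketch. It assumes an agency published its figures as a plain CSV file; the column names (`agency`, `year`, `spending_usd`), the agency names, and the numbers are all invented for this example and are not real data. The same parsed records are turned into two different human-readable forms with a few lines each.

```python
import csv
import io

# Invented sample records, for illustration only (not real figures).
RAW_CSV = """agency,year,spending_usd
Agency A,2008,1500000
Agency B,2008,250000
"""


def load_records(text):
    """Parse CSV text into a list of dicts with typed fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "agency": row["agency"],
            "year": int(row["year"]),
            "spending_usd": int(row["spending_usd"]),
        }
        for row in reader
    ]


def as_sentences(records):
    """Render each record as a plain-English sentence."""
    return [
        f"In {r['year']}, {r['agency']} spent ${r['spending_usd']:,}."
        for r in records
    ]


def as_table(records):
    """Render the same records as a fixed-width text table."""
    lines = [f"{'Agency':<10}{'Year':>6}{'Spending':>12}"]
    for r in records:
        lines.append(
            f"{r['agency']:<10}{r['year']:>6}{r['spending_usd']:>12,}"
        )
    return "\n".join(lines)


records = load_records(RAW_CSV)

if __name__ == "__main__":
    print("\n".join(as_sentences(records)))
    print(as_table(records))
```

Going the other way, from the printed sentences or table back to typed records, would require guessing at layout, separators, and number formats, which is where the difficulty and the errors come from.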
What we need is a National Data Agency (NDA). This agency would be tasked with collecting data that all other agencies collect and produce, and making it available in a central place and in electronic, machine-readable form. There could and should be a reasonable data presentation on its website, perhaps even a National Data Dashboard (showing data of interest like debt, spending, jobless rate, etc.). But the bulk of data analysis would be left to third parties: analysts, journalists, citizens (and also aliens like me). Easily available data would make for more insightful reporting, more informed decisions, and endless business opportunities.
There are a number of great initiatives that make it possible to work with data collected from official sources: the Death and Taxes poster, EveryBlock's collection of local data, They Rule's network of board members, The New York Times' Senate API, etc. These are partly based on data feeds, but mostly on tedious, manual analysis of records and reports. The way data is presented and communicated produces lots of unnecessary work to undo the formatting and get back to the original. Publishing the data as simple data files would be so much more efficient.
The obvious thing to do with the data would of course be to visualize it. Feed it directly to Swivel and Many Eyes and any number of new sites that could build custom visualizations for different kinds of data. We have the tools to build a more informed, more engaged, and more educated society – we just need access to the raw material: data.


Comments
How about an Office of
How about an Office of National Statistics? The Brits have one ( http://www.statistics.gov.uk/ ). Maybe we should too. Perhaps Secretary of Statistics should be a cabinet-level position. I nominate Andrew Gelman.
Total political transparency, anyone?
These might turn out to be good times to head towards the famed "total political transparency".
Carl Malamud
I do hope everyone here knows the work Carl Malamud has done to make government information freely accessible in a machine-friendly form. If we ever do get a National Data Agency -- which I think is a terrific idea -- I can't imagine anyone who could do a better job at it. For once Carl would be directing the bureaucracy rather than having to fight it.
Sunlight Foundation
Carl Malamud is doing excellent individual work; I would also strongly suggest that you read about the Sunlight Foundation ( http://www.sunlightfoundation.com/ ), which explores this idea in fascinating detail. Sunlight is rapidly turning into the center for tracking and distributing government sources of information.
Economic Data
This type of agency would be a wonderful help for me. When I want to find economic data for my site VisualizingEconomics (even when the data is in a database or Excel spreadsheet), I end up with a long list of government sites: BEA, CBO, Fed Reserve, Census, BLS, CIA, not to mention World Bank and IMF data, each with a different interface for downloading the data in different formats. And while there are attempts to gather this type of economic data in one place (like Economagic), I don't think any private organization can ever be the central repository.