Turning a Table into a Tree: Growing Parallel Sets into a Purposeful Project

Academic software projects tend to grow organically from an initial idea into something complex and unwieldy that has enough novelty to publish a paper about. Features get added at the last minute to be able to include them in the paper, without much time to think about how to integrate them well – or how to adapt the underlying architecture of the entire program to make them fit. The result is that many of these programs are hacked together, buggy, and embarrassing. Consequently, they do not get released together with the paper, which leads to a fundamental problem in visualization: reproducibility is possible in theory, but in practice rarely happens. Many programs and new techniques are also built from scratch rather than based on existing ones. The optimal model would be to release the software right away, then come back to it later to refine and re-architect it to reflect the overall design goals of the project. This is seldom done, because there is usually nothing to be gained from a re-implementation (or thorough refactoring), so people move on to the next project. The original prototype implementation of Parallel Sets was no different. But we decided that in order to get the idea out of academia into actual use, we would need a working program. So we set out to rethink and redesign it, based on a better understanding of the necessary internal structures we gained over time. We not only re-engineered the program, but also revised the visualization itself to clarify the overall idea.

Robert Kosara, Turning a Table into a Tree: Growing Parallel Sets into a Purposeful Project, in Steele, Iliinsky, Beautiful Visualization, pp. 193–204, O'Reilly Media, 2010.

bibtex
@inbook{Kosara:BeautifulVis:2010,
	year = 2010,
	title = {Turning a Table into a Tree: Growing Parallel Sets into a Purposeful Project},
	author = {Robert Kosara},
	editor = {Julie Steele and Noah Iliinsky},
	booktitle = {Beautiful Visualization},
	pages = {193--204},
	publisher = {O'Reilly Media},
	abstract = {Academic software projects tend to grow organically from an initial idea into something complex and unwieldy that has enough novelty to publish a paper about. Features get added at the last minute to be able to include them in the paper, without much time to think about how to integrate them well -- or how to adapt the underlying architecture of the entire program to make them fit. The result is that many of these programs are hacked together, buggy, and embarrassing. Consequently, they do not get released together with the paper, which leads to a fundamental problem in visualization: reproducibility is possible in theory, but in practice rarely happens. Many programs and new techniques are also built from scratch rather than based on existing ones. The optimal model would be to release the software right away, then come back to it later to refine and re-architect it to reflect the overall design goals of the project. This is seldom done, because there is usually nothing to be gained from a re-implementation (or thorough refactoring), so people move on to the next project. The original prototype implementation of Parallel Sets was no different. But we decided that in order to get the idea out of academia into actual use, we would need a working program. So we set out to rethink and redesign it, based on a better understanding of the necessary internal structures we gained over time. We not only re-engineered the program, but also revised the visualization itself to clarify the overall idea.},
}