Parallel Sets: Interactive Exploration and Visual Analysis of Categorical Data

Categorical data dimensions appear in many real-world data sets, but few visualization methods exist that properly deal with them. Parallel Sets are a new method for the visualization and interactive exploration of categorical data that shows data frequencies instead of the individual data points. The method is based on the axis layout of parallel coordinates, with boxes representing the categories and parallelograms between the axes showing the relations between categories. In addition to the visual representation, we designed a rich set of interactions. Parallel Sets allow the user to interactively remap the data to new categorizations, and thus to consider more data dimensions during exploration and analysis than usually possible. At the same time, a meta-level, semantic representation of the data is built. Common procedures, like building the cross product of two or more dimensions, can be performed automatically, thus complementing the interactive visualization. We demonstrate Parallel Sets by analyzing a large CRM data set, as well as investigating housing data of two US states.

Robert Kosara, Fabian Bendix, and Helwig Hauser, Parallel Sets: Interactive Exploration and Visual Analysis of Categorical Data, Transactions on Visualization and Computer Graphics (TVCG), vol. 12, no. 4, pp. 558–568, 2006. DOI: 10.1109/TVCG.2006.76

bibtex
@article{Kosara:TVCG:2006,
	year = 2006,
	title = {Parallel Sets: Interactive Exploration and Visual Analysis of Categorical Data},
	author = {Robert Kosara and Fabian Bendix and Helwig Hauser},
	journal = {Transactions on Visualization and Computer Graphics (TVCG)},
	volume = {12},
	number = {4},
	pages = {558--568},
	doi = {10.1109/TVCG.2006.76},
	abstract = {Categorical data dimensions appear in many real-world data sets, but few visualization methods exist that properly deal with them. Parallel Sets are a new method for the visualization and interactive exploration of categorical data that shows data frequencies instead of the individual data points. The method is based on the axis layout of parallel coordinates, with boxes representing the categories and parallelograms between the axes showing the relations between categories. In addition to the visual representation, we designed a rich set of interactions. Parallel Sets allow the user to interactively remap the data to new categorizations, and thus to consider more data dimensions during exploration and analysis than usually possible. At the same time, a meta-level, semantic representation of the data is built. Common procedures, like building the cross product of two or more dimensions, can be performed automatically, thus complementing the interactive visualization. We demonstrate Parallel Sets by analyzing a large CRM data set, as well as investigating housing data of two US states.},
}