A Protovis Primer, Part 1

Protovis is a very powerful visualization toolkit. Part of what makes it special is that it is written in JavaScript and runs in the browser without the need for any plugins. Its clever use of JavaScript’s language features makes it very elegant, but it can also be confusing to people who are not familiar with functional programming concepts and the finer points of JavaScript. This multi-part tutorial shows how to create a visualization (my interactive Presidents Chart) in Protovis, and explains the concepts that are involved along the way.

This introduction is based on my experiences with using Protovis in my Visualization and Visual Communication class earlier this spring. While the concepts involved are really not that difficult, they are rather foreign to students who have not been exposed to functional programming. And since that is also the case for a lot of hobbyists and people wanting to do visualization who do not have a computer science background, I imagine they run into the same problems.

This has grown from being a single article into several parts (and is still expanding). Let me know if there are things that you don’t understand or that you think need to be covered in more detail, so I can tailor the next parts accordingly.

Protovis requires a modern browser, which means any recent version of Safari, Chrome, Firefox, or Opera. Internet Explorer does not work, because it does not support SVG, which Protovis uses for drawing. The visualizations in this article are all Protovis drawings (check out the source code!), with a fall-back to images for RSS readers and IE users. There is no real difference at this point, but once we get to interaction, you will want to read this in a supported browser.

A Simple Example

Let’s start with a simple example, taken almost verbatim from the Protovis paper presented at InfoVis 2009 (PDF).

The following code creates this chart:

var vis = new pv.Panel().width(150).height(140)
    .add(pv.Bar)
        .data([1, 1.2, 1.7, 1.5, .7, .2])
        .bottom(0).width(20)
        .height(function(d) d * 80)
        .left(function() this.index * 25)
    .root.render();

Let’s look at this line by line. The first line defines a new Panel, the base element for any Protovis chart. The panel is the canvas on which everything is drawn (in fact, it becomes an actual SVG element in the webpage). This line also defines its size by calling two functions that set its width and height.

Function Chaining

This brings us to the first bit of magic: Almost all functions in Protovis return the object they are called on. That makes it possible to chain function calls instead of having to repeat the variable name over and over. While this leads to more elegant, simpler code, it can make it a bit terse and more challenging to read. But consider the alternative:

var vis = new pv.Panel();
vis.width(150);
vis.height(140);
var bar = vis.add(pv.Bar);
// etc ...

Function chaining lets you treat functions like statements: I want a panel, and I want its width to be 150 and its height to be 140; this translates very directly into function calls. While this is not all there is to what Bostock and Heer call a “declarative style of programming,” it certainly is a big part of it.
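To see why chaining works at all, here is a minimal sketch in plain JavaScript. ToyPanel is a made-up object for illustration and not part of Protovis; the only thing it shares with the real thing is that each setter returns the object it was called on.

```javascript
// ToyPanel is a hypothetical stand-in, not a Protovis class.
function ToyPanel() {
    this.w = 0;
    this.h = 0;
}

// Each setter stores the value and then returns the object it was
// called on, which is what makes the next call in the chain possible.
ToyPanel.prototype.width = function(value) {
    this.w = value;
    return this;
};

ToyPanel.prototype.height = function(value) {
    this.h = value;
    return this;
};

var panel = new ToyPanel().width(150).height(140);
console.log(panel.w, panel.h);   // prints: 150 140
```

If the setters returned nothing, the second call in the chain would fail, and we would be back to repeating the variable name on every line.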

Marks

But back to our example. The second line adds something to the panel by calling the add() function. What is added is called a mark, a graphical element that can represent data. Let’s briefly skip over the details of that and look at the last line: here, we need to call the render function of the panel to tell it to create the necessary infrastructure and start rendering. Since at this point, the functions being called are called on the bar mark that was added in the second line, we need a way to go back to the panel. This could be done by assigning the panel to a variable vis and calling vis.render(). The way it is done here is using the special variable root, which exists in all marks. It returns the panel that the mark sits on, so we can add other marks to it or call its functions.

The following code snippet does exactly the same thing as the first example. I showed the other one first because it is the style that the Protovis examples are written in, so it makes sense to get familiar with it.

var vis = new pv.Panel().width(150).height(140);
vis.add(pv.Bar)
    .data([1, 1.2, 1.7, 1.5, .7, .2])
    .bottom(0).width(20)
    .height(function(d) d * 80)
    .left(function() this.index * 25);
vis.render();

Using variables is a good idea in general, especially when creating more complex visualizations. But the clever design of Protovis’ functions makes it possible to all but eliminate them in many cases (at least as far as Protovis objects are concerned).
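To make the roles of add() and root a bit more concrete, here is a rough sketch of the idea in plain JavaScript. ToyMark is a hypothetical object I made up for this purpose; it only mimics the two behaviors described above (add() hands back the new mark, and root leads back to the panel) and says nothing about how Protovis implements them internally.

```javascript
// ToyMark is a hypothetical stand-in for Protovis marks and panels.
function ToyMark(parent) {
    this.parent = parent;
    // The root is the outermost panel; a mark without a parent is its own root.
    this.root = parent ? parent.root : this;
    this.children = [];
    this.rendered = false;
}

// add() returns the newly created child, not the object it was called on.
ToyMark.prototype.add = function() {
    var child = new ToyMark(this);
    this.children.push(child);
    return child;
};

ToyMark.prototype.width = function(value) {
    this.w = value;
    return this;
};

ToyMark.prototype.render = function() {
    this.rendered = true;
    return this;
};

var panel = new ToyMark(null);
var bar = panel.add().width(20);   // from add() on, the chain refers to the bar

bar.root.render();                 // root leads back to the panel
console.log(bar.root === panel, panel.rendered, bar.rendered);   // prints: true true false
```

The last line shows that render() was called on the panel, not on the bar, even though the chain ended on the bar.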

Mark Properties

Now let’s look at the definition of the bars, which is the same in both alternatives. The data() call is given an array of numbers, which are the data to be shown on the chart. We will later see data in variables and arrays of objects, not just numbers, but this kind of inline definition of the data is also possible. The square brackets delimit an array in JavaScript (and lots of other programming languages); the numbers are separated by commas.

The line after the data statement specifies two important but constant parts of the layout. Since we’re using bars, we have to specify at least their position (x and y), their width, and height. For a bar chart, it makes sense to have all the bars have the same width, so it is set to a number here. And since we want all bars to sit on the same baseline, we specify their y position as 0. The terminology here, bottom, has to do with the way locations and sizes can be specified in Protovis. I’ll cover more of that later.

More Fun With Functions

The next two lines are where it gets really interesting, and where a lot of confusion comes from. Remember that we are creating a bar chart here, but there is no loop to draw each of the bars. We specified the entire data array two lines earlier. Now how do we get Protovis to draw something?

The way Protovis works is that a definition like this tells it to iterate over the array of data it is given, instantiate a mark for each entry in the array, and evaluate all the information it is given about what to do with the mark. In the case of the width and bottom attributes, this is simple: we specified constants, so it just uses these same numbers for all of the bars it creates.

But we want the bars to represent the data by varying their height. How do we do that? The answer here is a function. Instead of specifying a number or variable, we assign a function to the height property. For every entry in the array, Protovis will call this function with the value, and use the result of the function as the height of the bar. The beauty of this approach is that JavaScript, unlike Java and most of the more common imperative languages, can use a function like a variable: you can pass it to another function, store it in a variable, and evaluate it by calling it.

As with the function chaining before, the particular style in which this is done in Protovis makes this a bit more difficult to understand. Here is the definition of the height again:

.height(function(d) d * 80)

And here is the same definition in a more verbose style that is actually correct JavaScript (Protovis does some magic with function parsing that allows the sloppy but more compact style above):

.height(function(d) {
    return d * 80;
})

Both statements do the exact same thing: they specify a function without a name (called an anonymous function) that takes one argument called d. The function returns the result of a simple computation, multiplying the value of that argument by 80. The unit here is pixels, which means that our highest bar will be 1.7 * 80 = 136 pixels high (1.7 is the maximum value in the data array), just short of the panel height of 140.
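The arithmetic is easy to check outside of Protovis. The following sketch uses only plain JavaScript; the variable names panelHeight, scale, and maxScale are mine and do not appear in the chart definition.

```javascript
var data = [1, 1.2, 1.7, 1.5, .7, .2];
var panelHeight = 140;   // the height we gave the panel
var scale = 80;          // the factor used in the height function

// Apply the same computation as the height function to every value.
var heights = data.map(function(d) {
    return d * scale;
});

var tallest = Math.max.apply(null, heights);
console.log(Math.round(tallest));      // prints: 136
console.log(tallest <= panelHeight);   // prints: true

// The largest factor that would still fit the tallest bar into the panel.
var maxScale = panelHeight / Math.max.apply(null, data);
console.log(maxScale.toFixed(1));
```

So with this data there is very little headroom: a scale factor much above 82 would push the tallest bar past the top of the panel.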

Functions used in property definitions can do a lot more complex things. You can also assign functions to variables and reuse them in different definitions. But many functions used in visualization definitions in Protovis perform fairly straight-forward calculations or access data in more complex data structures (we’ll get to that, too).
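Here is a rough sketch of what "constant or function" means for a property, and of a function being stored in a variable and reused. evaluateProperty and scaleHeight are hypothetical names of my own, and the way Protovis actually evaluates properties is more involved than this; the sketch is only meant to illustrate the idea.

```javascript
// evaluateProperty is a hypothetical helper, not part of Protovis.
// A property is either a constant or a function of the current value.
function evaluateProperty(property, d) {
    return (typeof property === "function") ? property(d) : property;
}

// A function stored in a variable, just like a number would be.
var scaleHeight = function(d) {
    return d * 80;
};

var data = [1, 1.2, 1.7, 1.5, .7, .2];

// One "bar" per entry in the array.
var bars = data.map(function(d) {
    return {
        width: evaluateProperty(20, d),            // constant: same for every bar
        height: evaluateProperty(scaleHeight, d)   // function: called once per value
    };
});

console.log(bars.length);                      // prints: 6
console.log(bars[0].width, bars[0].height);    // prints: 20 80
```

Because scaleHeight is an ordinary variable, the same function could be handed to a second mark that should scale the same way.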

The last line in the bar definition works the same way as the height definition, with a small difference. We want our bars to be next to each other, not drawn on top of each other. Even though the mark is called bar, it is really more a general-purpose rectangle: it doesn’t know anything about being a bar in a bar chart. That makes it very flexible and powerful, but it also means that you need to do a lot of things by hand that might seem obvious.

Bar Layout

To move the bars, we specify their left property, which together with the bottom means we’re specifying the lower left corner. Since this value will have to be different between the bars, we need to again specify a function. We are specifying everything at once, so there is no obvious way to order the bars and use that order to space them. Protovis therefore has a special variable that is available when it evaluates functions, which is called this.index. That variable’s value is the index of the current value in the array, starting at 0 and going all the way to the size of the array minus 1.

.left(function() this.index * 25);

Remember that we set the width of the bars to 20. Putting the bars at multiples of 25 means that we’re leaving five pixels of space between them. You can easily tweak any of these numbers to change spacing, scaling, etc. Just remember that when you increase the scale factor, you will have to also change the size of the panel, or your bars will be cut off!
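To check the positions by hand, the sketch below imitates this.index in plain JavaScript by calling the function with an object that carries the index. That is a simplification I am assuming for illustration; in Protovis, this refers to the mark itself. The names step, lefts, and rightEdge are made up for this example.

```javascript
var data = [1, 1.2, 1.7, 1.5, .7, .2];
var panelWidth = 150;
var barWidth = 20;
var step = 25;

// The same function as in the chart definition, written out in full.
var left = function() {
    return this.index * step;
};

// Call the function once per value, with `this` set to an object
// that knows the index of the current value.
var lefts = data.map(function(d, i) {
    return left.call({ index: i }, d);
});

console.log(lefts.join(", "));                      // prints: 0, 25, 50, 75, 100, 125

// The right edge of the last bar must not exceed the panel width.
var rightEdge = lefts[lefts.length - 1] + barWidth;
console.log(rightEdge, rightEdge <= panelWidth);    // prints: 145 true
console.log(step - barWidth);                       // prints: 5
```

The last bar ends at 145 pixels, so it fits into the 150-pixel panel, but a seventh value would already be cut off.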

Adding A Variable

The last step in this first part of the tutorial is simply to move the data into a variable. This is a simple but important step. To do this, we will define a new variable, numbers, that contains the exact same numbers as above. The only other thing that changes is the data() function call, which now uses the variable.

var numbers = [1, 1.2, 1.7, 1.5, .7, .2];

new pv.Panel().width(150).height(140)
    .add(pv.Bar)
        .data(numbers)
        .bottom(0).width(20)
        .height(function(d) d * 80)
        .left(function() this.index * 25)
    .root.render();

Running this does not actually make any difference, which is what we intended. But it opens the doors to some more flexibility. What if we wanted the bars to change size depending on how much data there is? Let’s define another variable for the panel, panelWidth, and calculate the width of the bars, barWidth, from the size of the array.

var panelWidth = 150;

var barWidth = panelWidth/numbers.length;

The barWidth is the total space a bar takes up, including its spacing to the next bar. If we simply use this for our definition:

new pv.Panel().width(panelWidth).height(140)
    .add(pv.Bar)
        .data(numbers)
        .bottom(0).width(barWidth)
        .height(function(d) d * 80)
        .left(function() this.index * barWidth)
    .root.render();

We end up with something slightly unexpected: each bar now fills its entire slot, so the bars touch and there is no space left between them.

We need to add some space between the bars again. Remember that we set the width to 20 earlier, but multiplied the index by 25. We need something similar here, so we’ll introduce another variable, barSpacing. This will be subtracted from the barWidth in the width definition, so the bar does not use the entire width it got assigned.

var barSpacing = 5;

new pv.Panel().width(panelWidth).height(140)
    .add(pv.Bar)
        .data(numbers)
        .bottom(0).width(barWidth-barSpacing)
        .height(function(d) d * 80)
        .left(function() this.index * barWidth)
    .root.render();
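A quick check of the numbers shows what barSpacing changes. The gap() function below is a hypothetical helper of my own that compares the right edge of the first bar with the left edge of the second; it is plain JavaScript and does not use Protovis.

```javascript
var numbers = [1, 1.2, 1.7, 1.5, .7, .2];
var panelWidth = 150;
var barWidth = panelWidth / numbers.length;   // 25 pixels per bar, gap included
var barSpacing = 5;

// Distance between the right edge of the first bar and the
// left edge of the second one, for a given drawn width.
function gap(drawnWidth) {
    var rightEdgeOfFirst = 0 * barWidth + drawnWidth;
    var leftEdgeOfSecond = 1 * barWidth;
    return leftEdgeOfSecond - rightEdgeOfFirst;
}

console.log(gap(barWidth));                // prints: 0 (the bars touch)
console.log(gap(barWidth - barSpacing));   // prints: 5 (the bars are separated)
```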

The result is a readable chart that looks like the one before:

This may not look very exciting, but it gives us a lot of flexibility. Now we can change the width of the chart by simply changing panelWidth, let’s say to 250:

As you can see, the bars have adapted to the new width. The spacing looks a bit odd because barSpacing is still a constant, but that would be easy to change, too. But what happens if we add more data to our numbers array?

var numbers = [1, 1.2, 1.7, 1.5, .7, .2, .5, .9];

The bars adapt to the number of values, because the barWidth depends on the number of values in the numbers array, numbers.length. This only works to a certain point, because our spacing is still constant; but it’s a start.
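To see where the constant spacing stops working, here is the same calculation wrapped in a small function. Both layout and proportionalLayout are hypothetical helpers written for this illustration, and the 20% spacing in the second one is an arbitrary choice of mine, not something Protovis prescribes.

```javascript
// layout repeats the calculation from the text for any panel width and count.
function layout(panelWidth, count, barSpacing) {
    var barWidth = panelWidth / count;
    return {
        barWidth: barWidth,                 // space per bar, including the gap
        drawnWidth: barWidth - barSpacing   // what is left for the bar itself
    };
}

console.log(layout(150, 6, 5).drawnWidth);    // prints: 20
console.log(layout(150, 8, 5).drawnWidth);    // prints: 13.75
console.log(layout(150, 30, 5).drawnWidth);   // prints: 0 (nothing left to draw)

// One possible alternative: make the spacing a fraction of the slot,
// so the bars shrink with the data but never disappear entirely.
function proportionalLayout(panelWidth, count) {
    var barWidth = panelWidth / count;
    return {
        barWidth: barWidth,
        drawnWidth: barWidth * 0.8
    };
}

console.log(proportionalLayout(150, 30).drawnWidth);   // prints: 4
```

With 30 values and a fixed spacing of 5 pixels, the bars have no width left at all, which is the limit mentioned above.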

This concludes the first part of this tutorial. The next parts will cover more complex layouts, Protovis scales, other mark types, reading of complex data, labels, etc.

Start Playing!

If you want to play with Protovis, I am providing a ZIP file with two simple HTML files: one is the basic bar chart, the other one uses variables for the data and the widths.

I have also started a repository on github for people familiar with git.


Comments

  1. Nicolas says

    Nice Review! I’ve just seen they released a new version with new cool visualizations that look really nice.

    BTW, I thought they didn’t support IE because they used SVG and not Canvas. Actually even though IE does not support Canvas you can use ExCanvas to make it work, like with my toolkit (that does support IE): http://thejit.org .

    There’s an SVG like language in IE called VML, libraries like RaphaelJS add simple geometric primitives that translate into VML and SVG, Protovis could use that also to add support for IE.

    With regard to performance and readability, I don’t really understand what all these function calls are about. I don’t want to be too negative but it’d seem that they’re using the functional paradigm to do declarative stuff. Why not just pass an object? It would save all function calls and would be more readable too:


    new pv.Panel({
    width: bla,
    height: bla,
    left: bla,
    data: []
    });

    Anyway, nice review.

  2. Robert Kosara says

    I haven’t thought this through entirely, but you’re probably right. It may be that a more declarative approach would be as flexible as the functions, and I’m sure it would be more readable. It doesn’t really make the part about passing functions around easier for people who aren’t familiar with this kind of thinking, though.

    As for IE: My understanding is that VML uses some other element, so it’s both canvas and SVG that are missing. But I don’t know the specifics of this. I’ve also seen various claims that different libraries make Protovis work with IE or not, but I didn’t want to get distracted by that. Maybe in a later part.

    I also know thejit, but I haven’t looked at it very carefully so far. It’s on my list of things to do, though. ;)

  3. Alex Kerin says

    I believe that Protovis is indicative of the future of interactive web-based visualizations, and even dashboards. I created [this animated Dorling Cartogram](http://www.datadrivenconsulting.com/2010/05/dorling-cartogram-of-unemployment-1980-2009/) by hacking the code they showed in the ACM paper, and changing the data source.

    It didn’t take that long to understand what was going on – there’s no way I would have been able to create this in Flash without manually drawing the chart time slices.

    Great review, and great job by the contributors to Protovis (and jQuery, etc. etc.)

  4. John C. Munoz says

    Thanks for the excellent tutorial. I don’t know JavaScript, but found what you wrote to be easy to understand.

    Along the lines of what Alex Kerin wrote above, I hope to use Protovis as a dashboard creation tool. I currently use Excel with a $250 plug-in from XLCubed, which allows for crisp and small sparklines, bullet charts, and bar charts. But these dashboards on the web can only be presented in static image form. And updating them is usually an offline process.

    I’d love to see if Protovis can be made to lay out a grid-type canvas on which to paint a good dashboard with live data.

    Thanks,

    John C. Munoz
    http://www.bizintelguru.com

  5. Robert Kosara says

    It’s certainly possible. I’ll cover multiple panels and layouts later, those should make it easier to build a dashboard. It’s easy to modify the data the chart is based on for dynamic visualization, you can check out the Protovis examples to see that. I’ll also talk about that at some point in the future.

  6. Derek says

    @Nicolas: “There’s an SVG like language in IE called VML, libraries like RaphaelJS add simple geometric primitives that translate into VML and SVG, Protovis could use that also to add support for IE.”

    I’d rather __not__ see this happen at all; Microsoft should be moving in support of open standards, rather than developers having to support proprietary and closed standards. Google’s plug-in can be used to create static images for those who insist on using a non-standards-compliant browser.

  7. Adam says

    Well, *I* think the jQuery-like syntax is more readable; the inheritance concept seems more ‘intuitive’. Anyway, Nicolas is right about Protovis using SVG (it might have used canvas but the current version definitely uses SVG). As for VML, you could say SVG is the W3C-standardized version of Microsoft’s VML.
    Btw, I couldn’t find the presidents’ chart mentioned in the post..?

  8. Adam says

    Not true, you could just display a screenshot of the image for IE. Once you wrap your head around the concepts, protovis enables even static visualizations that are both easy and rich

  9. ruggero says

    This tutorial is great! You taught the meaning of the JavaScript syntax and also why you’re using it… and it’s what I was looking for.
    I can’t wait for the next part!

  10. Daniel says

    Thank you very much for 2 great articles – looking very much forward to the next one!

    Excuse my chutzpah, but here’s a short wish-list.

    - Would be wonderful to learn more about tweaking a stack / stream graph. For example recreating the Crimean War stack graph here:
    http://vis.stanford.edu/protovis/jsdoc/symbols/pv.Layout.Stack.html

    - I know protovis’ animation features are being worked on but what can be done right now? Using generic javascript or intrinsic features …

    Thanks once again!

  11. Wai Yip Tung says

    Protovis definitely uses SVG, not HTML5 canvas. From the second paragraph of its own home page:

    “Protovis is free and open-source, provided under the BSD License. It uses JavaScript and SVG for web-native visualizations; no plugin required (though you will need a modern web browser)!”

    In either case, no IE8 support.

  12. santhosh says

    You have no idea how much you have helped me. I had to figure out how protovis works for integration into one of my projects and you are a lifesaver.

  15. laertes13 says

    I’ve been searching for an HTML way to make charts after getting frustrated with Flex’s complexity. I’d messed with Protovis a bit, but couldn’t get my head around it. After your tutorial, I smiled big while adjusting the bar spacing and widths. So, simple. Thank you!
