Change and transition dataset in chord diagram with D3

Creating a chord diagram

There are a number of layers to creating a chord diagram with d3, corresponding to d3’s careful separation of data manipulation from data visualization. If you’re going to not only create a chord diagram, but also update it smoothly, you’ll need to clearly understand what each piece of the program does and how they interact.

Sample Chord Diagram, from the example linked above

First, the data-manipulation aspect. The d3 Chord Layout tool takes your data about the interactions between different groups and creates a set of data objects which contain the original data but are also assigned angle measurements. In this way, it is similar to the pie layout tool, but there are some important differences related to the increased complexity of the chord layout.

Like the other d3 layout tools, you create a chord layout object by calling a function (d3.layout.chord()), and then you call additional methods on the layout object to change the default settings. Unlike the pie layout tool and most of the other layouts, however, the chord layout object isn’t a function that takes your data as input and outputs the calculated array of data objects with layout attributes (angles) set.

Instead, your data is another setting for the layout, which you define with the .matrix() method, and which is stored within the layout object. The data has to be stored within the object because there are two different arrays of data objects with layout attributes, one for the chords (connections between different groups), and one for the groups themselves. The fact that the layout object stores the data is important when dealing with updates, as you have to be careful not to over-write old data with new if you still need the old data for transitions.

var chordLayout = d3.layout.chord() //create layout object
                  .sortChords( d3.ascending ) //set a property
                  .padding( 0.01 ); //property-setting methods can be chained

chordLayout.matrix( data );  //set the data matrix

The group data objects are accessed by calling .groups() on the chord layout after the data matrix has been set. Each group is equivalent to a row in your data matrix (i.e., each subarray in an array of arrays). The group data objects have been assigned start angle and end angle values representing a section of the circle. This much is just like a pie graph, with the difference being that the values for each group (and for the circle as a whole) are calculated by summing up values for the entire row (subarray). The group data objects also have properties representing their index in the original matrix (important because they might be sorted into a different order) and their total value.
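To make the row-sum idea concrete, here is a rough plain-JavaScript sketch of how group angles could be derived from a matrix. It is a simplified illustration, not d3's actual source: the name computeGroups is made up for this example, and the layout's sorting options are left out.

```javascript
// Simplified sketch: derive group angles from the row sums of a matrix.
// computeGroups is a hypothetical helper, not part of d3.
function computeGroups(matrix, padding) {
    var n = matrix.length;

    // each group's value is the sum of its row
    var rowSums = matrix.map(function (row) {
        return row.reduce(function (sum, v) { return sum + v; }, 0);
    });
    var total = rowSums.reduce(function (sum, v) { return sum + v; }, 0);

    // radians available per unit of value, after reserving the gaps
    var k = (2 * Math.PI - padding * n) / total;

    var angle = 0;
    return rowSums.map(function (value, index) {
        var group = {
            index: index,            // position in the original matrix
            startAngle: angle,
            endAngle: angle + value * k,
            value: value
        };
        angle = group.endAngle + padding; // leave a gap before the next group
        return group;
    });
}

var sampleGroups = computeGroups([[1, 2], [3, 2]], 0);
console.log(sampleGroups);
```

With no padding, the first row (sum 3 out of 8) gets three eighths of the circle and the second row gets the rest.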

The chord data objects are accessed by calling .chords() on the chord layout after the data matrix has been set. Each chord represents two values in the data matrix, equivalent to the two possible relationships between two groups. For example, in @latortue09’s example, the relationships are bicycle trips between neighbourhoods, so the chord that represents trips between Neighbourhood A and Neighbourhood B represents the number of trips from A to B as well as the number from B to A. If Neighbourhood A is in row a of your data matrix and Neighbourhood B is in row b, then these values should be at data[a][b] and data[b][a], respectively. (Of course, sometimes the relationships you’re drawing won’t have this type of direction to them, in which case your data matrix should be symmetric, meaning that those two values should be equal.)

Each chord data object has two properties, source and target, each of which is its own data object. Both the source and target data object have the same structure with information about the one-way relationship from one group to the other, including the original indexes of the groups and the value of that relationship, and start and end angles representing a section of one group’s segment of the circle.

The source/target naming is kind of confusing, since as I mentioned above, the chord object represents both directions of the relationship between two groups. The direction that has the larger value determines which group is called source and which is called target. So if there are 200 trips from Neighbourhood A to Neighbourhood B, but 500 trips from B to A, then the source for that chord object will represent a section of Neighbourhood B’s segment of the circle, and the target will represent part of Neighbourhood A’s segment of the circle. For the relationship between a group and itself (in this example, trips that start and end in the same neighbourhood), the source and target objects are the same.

One final important aspect of the chord data object array is that it only contains objects where relationships between two groups exist. If there are no trips between Neighbourhood A and Neighbourhood B in either direction, then there will be no chord data object for those groups. This becomes important when updating from one dataset to another.
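The source/target rule and the "no chord for an empty relationship" rule can be sketched in plain JavaScript as below. This is only an illustration of the behaviour described above, with the angle calculations left out; computeChords and the sample trips matrix are invented for the example.

```javascript
// Sketch of how chord objects pair up the two directions of a relationship.
// computeChords is a hypothetical helper; angles are omitted for brevity.
function computeChords(matrix) {
    var chords = [];
    for (var i = 0; i < matrix.length; i++) {
        for (var j = i; j < matrix.length; j++) {
            var forward = { index: i, subindex: j, value: matrix[i][j] };
            // a group's relationship with itself uses one shared object
            var backward = (i === j) ? forward
                : { index: j, subindex: i, value: matrix[j][i] };

            // skip pairs with no relationship in either direction
            if (!forward.value && !backward.value) continue;

            // the direction with the larger value becomes the source
            chords.push(forward.value < backward.value
                ? { source: backward, target: forward }
                : { source: forward, target: backward });
        }
    }
    return chords;
}

// rows/columns: 0 = Neighbourhood A, 1 = Neighbourhood B, 2 = Neighbourhood C
var trips = [
    [10, 200, 0],
    [500, 0, 0],
    [0, 0, 7]
];
var sampleChords = computeChords(trips);
console.log(sampleChords.length); // A-A, A-B and C-C only
```

Here the A-B chord has B as its source, because the 500 trips from B to A outnumber the 200 from A to B, and there are no chords at all for A-C, B-B or B-C.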

Second, the data-visualization aspect. The Chord Layout tool creates arrays of data objects, converting information from the data matrix into angles of a circle. But it doesn’t draw anything. To create the standard SVG representation of a chord diagram, you use d3 selections to create elements joined to an array of layout data objects. Because there are two different arrays of layout data objects in the chord diagram, one for the chords and one for the groups, there are two different d3 selections.

In the simplest case, both selections would contain <path> elements (and the two types of paths would be distinguished by class). The <path>s that are joined to the data array for the chord diagram groups become the arcs around the outside of the circle, while the <path>s that are joined to the data for the chords themselves become the bands across the circle.

The shape of a <path> is determined by its "d" (path data or directions) attribute. D3 has a variety of path data generators, which are functions that take a data object and create a string that can be used for a path’s "d" attribute. Each path generator is created by calling a d3 method, and each can be modified by calling its own methods.

The groups in a standard chord diagram are drawn using the d3.svg.arc() path data generator. This arc generator is the same one used by pie and donut graphs. After all, if you remove the chords from a chord diagram, you essentially just have a donut diagram made up of the group arcs. The default arc generator expects to be passed data objects with startAngle and endAngle properties; the group data objects created by the chord layout work with this default. The arc generator also needs to know the inside and outside radius for the arc. These can be specified as functions of the data or as constants; for the chord diagram they will be constants, the same for every arc.

var arcFunction = d3.svg.arc() //create the arc path generator
                               //with default angle accessors
                  .innerRadius( radius )
                  .outerRadius( radius + bandWidth); 
                               //set constant radius values

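var groupPaths = g.selectAll("path.group")
                 //g is the <g> container inside your <svg>, so that
                 //entering paths get appended to it and not to the page root
                 .data( chordLayout.groups() ); 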
    //join the selection to the appropriate data object array 
    //from the chord layout 

groupPaths.enter().append("path") //create paths if this isn't an update
          .attr("class", "group"); //set the class
          /* also set any other attributes that are independent of the data */

groupPaths.attr("fill", groupColourFunction )
          //set attributes that are functions of the data
          .attr("d", arcFunction ); //create the shape
   //d3 will pass the data object for each path to the arcFunction
   //which will create the string for the path "d" attribute
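If the idea of a "path data generator" feels abstract, the sketch below shows roughly what such a function does: it takes a data object with startAngle and endAngle and returns a string for the "d" attribute. It is a stand-in written for illustration, not d3.svg.arc() itself; the name simpleArcGenerator and the rounding to two decimals are my own choices.

```javascript
// Minimal stand-in for an arc path generator (not d3's implementation).
// Returns a function that turns {startAngle, endAngle} into path data.
function simpleArcGenerator(innerRadius, outerRadius) {
    return function (d) {
        // angles are measured clockwise from 12 o'clock,
        // so shift by a quarter turn before using cos/sin
        var a0 = d.startAngle - Math.PI / 2;
        var a1 = d.endAngle - Math.PI / 2;
        var largeArc = (d.endAngle - d.startAngle) > Math.PI ? 1 : 0;

        function point(r, a) {
            return (r * Math.cos(a)).toFixed(2) + "," +
                   (r * Math.sin(a)).toFixed(2);
        }

        return "M" + point(outerRadius, a0) +
            // outer edge, drawn clockwise
            "A" + outerRadius + "," + outerRadius + " 0 " + largeArc + ",1 " +
                point(outerRadius, a1) +
            "L" + point(innerRadius, a1) +
            // inner edge, drawn back counter-clockwise
            "A" + innerRadius + "," + innerRadius + " 0 " + largeArc + ",0 " +
                point(innerRadius, a0) +
            "Z";
    };
}

var arcSketch = simpleArcGenerator(50, 60);
var quarterArc = arcSketch({ startAngle: 0, endAngle: Math.PI / 2 });
console.log(quarterArc);
```

Because the generator is just a function of the data object, it can be passed directly to .attr("d", ...) or called by hand inside a tween.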

The chords in a chord diagram have a shape unique to this type of diagram. Their shapes are defined using the d3.svg.chord() path data generator. The default chord generator expects data of the form created by the chord layout object, so the only thing that needs to be specified is the radius of the circle (which will usually be the same as the inner radius of the group arcs).

var chordFunction = d3.svg.chord() //create the chord path generator
                                   //with default accessors
                    .radius( radius );  //set constant radius

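var chordPaths = g.selectAll("path.chord")
                 //g is the <g> container inside your <svg>, so that
                 //entering paths get appended to it and not to the page root
                 .data( chordLayout.chords() ); 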
    //join the selection to the appropriate data object array 
    //from the chord layout 

chordPaths.enter().append("path") //create paths if this isn't an update
          .attr("class", "chord"); //set the class
          /* also set any other attributes that are independent of the data */

chordPaths.attr("fill", chordColourFunction )
          //set attributes that are functions of the data
          .attr("d", chordFunction ); //create the shape
   //d3 will pass the data object for each path to the chordFunction
   //which will create the string for the path "d" attribute

That’s the simple case, with <path> elements only. If you also want text labels associated with your groups or chords, then your data is joined to <g> elements, and the <path> elements and the <text> elements for the labels (and any other elements, like the tick mark lines in the hair-colour example) are children of the <g> that inherit its data object. When you update the graph, you’ll need to update all the sub-components that are affected by the data.

Updating a chord diagram

With all that information in mind, how should you approach creating a chord diagram that can be updated with new data?

First, to minimize the total amount of code, I usually recommend making your update method double as your initialization method. Yes, you’ll still need some initialization steps for things that never change in the update, but for actually drawing the shapes that are based on the data you should only need one function regardless of whether this is an update or a new visualization.

For this example, the initialization steps will include creating the <svg> and the centered <g> element, as well as reading in the array of information about the different neighbourhoods. Then the initialization method will call the update method with a default data matrix. The buttons that switch to a different data matrix will call the same method.

/*** Initialize the visualization ***/
var g = d3.select("#chart_placeholder").append("svg")
        .attr("width", width)
        .attr("height", height)
    .append("g")
        .attr("id", "circle")
        .attr("transform", 
              "translate(" + width / 2 + "," + height / 2 + ")");
//the entire graphic will be drawn within this <g> element,
//so all coordinates will be relative to the center of the circle

g.append("circle")
    .attr("r", outerRadius);

d3.csv("data/neighborhoods.csv", function(error, neighborhoodData) {

    //alert() only displays its first argument, so build a single string
    if (error) {alert("Error reading file: " + error.statusText); return; }

    neighborhoods = neighborhoodData; 
        //store in variable accessible by other functions
    updateChords(dataset); 
    //call the update method with the default dataset url

} ); //end of d3.csv function

/* example of an update trigger */
d3.select("#MenOnlyButton").on("click", function() {
    updateChords( "/data/men_trips.json" );
    disableButton(this);
});

I’m just passing a data URL to the update function, which means that the first line of that function will be a data-parsing function call. The resulting data matrix is used as the matrix for a new chord layout object. We need a new layout object in order to keep a copy of the old layout for the transition functions. (If you weren’t going to transition the changes, you could just call the matrix method on the same layout to create the new one.) To minimize code duplication, I use a function to create the new layout object and to set all its options:

/* Create OR update a chord layout from a data matrix */
function updateChords( datasetURL ) {

  d3.json(datasetURL, function(error, matrix) {

    //alert() only displays its first argument, so build a single string
    if (error) {alert("Error reading file: " + error.statusText); return; }

    /* Compute chord layout. */
    layout = getDefaultLayout(); //create a new layout object
    layout.matrix(matrix);

    /* main part of update method goes here */

  }); //end of d3.json
}

Then on to the main part of the update-or-create drawing function: you’re going to need to break down all your method chains into four parts for data join, enter, exit and update. This way, you can handle the creation of new elements during an update (e.g., new chords for groups that didn’t have a relationship in the previous data set) with the same code that you use to handle the original creation of the visualization.

First, the data join chain. One for the groups and one for the chords.
To maintain object constancy through transitions (and to reduce the number of graphical properties you have to set on update), you’ll want to set a key function within your data join. By default, d3 matches data to elements within a selection based only on their order in the page/array. Because our chord layout’s .chords() array doesn’t include chords where there is zero relationship in this data set, the order of the chords can be inconsistent between update rounds. The .groups() array could also be re-sorted into orders that don’t match the original data matrix, so we also add a key function for that to be safe. In both cases, the key functions are based on the .index properties that the chord layout stored in the data objects.

/* Create/update "group" elements */
var groupG = g.selectAll("g.group")
    .data(layout.groups(), function (d) {
        return d.index; 
        //use a key function in case the 
        //groups are sorted differently between updates
    });

/* Create/update the chord paths */
var chordPaths = g.selectAll("path.chord")
    .data(layout.chords(), chordKey );
        //specify a key function to match chords
        //between updates

/* Elsewhere, chordKey is defined as: */

function chordKey(data) {
    return (data.source.index < data.target.index) ?
        data.source.index  + "-" + data.target.index:
        data.target.index  + "-" + data.source.index;

    //create a key that will represent the relationship
    //between these two groups *regardless*
    //of which group is called 'source' and which 'target'
}

Note that the chords are <path> elements, but the groups are <g> elements, which will contain both a <path> and a <text>.

The variables created in this step are data-join selections; they will contain all the existing elements (if any) that matched the selector and matched a data value, and they will contain null pointers for any data values which did not match an existing element. They also have the .enter() and .exit() methods to access those chains.
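To see what the key function buys you, here is a plain-JavaScript sketch of the kind of matching a keyed data join performs, splitting the data into enter, update and exit sets. It is a toy model working on data arrays only (d3’s real join works on DOM elements and is more involved); keyedJoin is a made-up name, and chordKey is copied from the listing above so the snippet runs on its own.

```javascript
// chordKey as defined in the answer: the same key for a pair of groups
// regardless of which one is called source and which target
function chordKey(data) {
    return (data.source.index < data.target.index) ?
        data.source.index + "-" + data.target.index :
        data.target.index + "-" + data.source.index;
}

// Toy model of a keyed join (hypothetical helper, not d3's implementation)
function keyedJoin(oldData, newData, key) {
    var oldByKey = {};
    oldData.forEach(function (d) { oldByKey[key(d)] = d; });

    var result = { enter: [], update: [], exit: [] };
    newData.forEach(function (d) {
        var k = key(d);
        if (Object.prototype.hasOwnProperty.call(oldByKey, k)) {
            result.update.push(d); // matched an existing element
            delete oldByKey[k];
        } else {
            result.enter.push(d);  // needs a new element
        }
    });
    // whatever is left over had no match in the new data
    result.exit = Object.keys(oldByKey).map(function (k) {
        return oldByKey[k];
    });
    return result;
}

var oldChordData = [
    { source: { index: 0 }, target: { index: 1 } },
    { source: { index: 1 }, target: { index: 2 } }
];
var newChordData = [
    { source: { index: 1 }, target: { index: 0 } }, // same pair, roles swapped
    { source: { index: 0 }, target: { index: 2 } }  // relationship is new
];
var joined = keyedJoin(oldChordData, newChordData, chordKey);
console.log(joined);
```

The swapped 0-1 chord still counts as an update, which is exactly why the key ignores which group is called source.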

Second, the enter chain. For all the data objects which didn’t match an element (which is all of them if this is the first time the visualization is drawn), we need to create the element and its child components. At this time, you want to also set any attributes that are constant for all elements (regardless of the data), or which are based on the data values that you use in the key function, and therefore won’t change on update.

var newGroups = groupG.enter().append("g")
    .attr("class", "group");
//the enter selection is stored in a variable so we can
//enter the <path>, <text>, and <title> elements as well

//Create the title tooltip for the new groups
newGroups.append("title");

//create the arc paths and set the constant attributes
//(those based on the group index, not on the value)
newGroups.append("path")
    .attr("id", function (d) {
        return "group" + d.index;
        //using d.index and not i to maintain consistency
        //even if groups are sorted
    })
    .style("fill", function (d) {
        return neighborhoods[d.index].color;
    });

//create the group labels
newGroups.append("svg:text")
    .attr("dy", ".35em")
    .attr("color", "#fff")
    .text(function (d) {
        return neighborhoods[d.index].name;
    });


//create the new chord paths
var newChords = chordPaths.enter()
    .append("path")
    .attr("class", "chord");

// Add title tooltip for each new chord.
newChords.append("title");

Note that the fill colours for the group arcs are set on enter, but not the fill colours for the chords. That’s because the chord colour is going to change depending on which group (of the two the chord connects) is called ‘source’ and which is ‘target’, i.e., depending on which direction of the relationship is stronger (has more trips).

Third, the update chain. When you append an element to an .enter() selection, that new element replaces the null place holder in the original data-join selection. After that, if you manipulate the original selection, the settings get applied to both the new and the updating elements. So this is where you set any properties that depend on the data.

//Update the (tooltip) title text based on the data
groupG.select("title")
    .text(function(d) {
        return numberWithCommas(d.value) 
            + " trips started in " 
            + neighborhoods[d.index].name;
            //use d.index, not i, in case the groups are sorted
    });

//update the paths to match the layout
groupG.select("path") 
    .transition()
        .duration(1500)
        .attr("opacity", 0.5) //optional, just to observe the transition
    .attrTween("d", arcTween( last_layout ) )
        .transition().duration(10).attr("opacity", 1) //reset opacity
    ;

//position group labels to match layout
groupG.select("text")
    .transition()
        .duration(1500)
        .attr("transform", function(d) {
            d.angle = (d.startAngle + d.endAngle) / 2;
            //store the midpoint angle in the data object

            return "rotate(" + (d.angle * 180 / Math.PI - 90) + ")" +
                " translate(" + (innerRadius + 26) + ")" + 
                (d.angle > Math.PI ? " rotate(180)" : " rotate(0)"); 
            //include the rotate zero so that transforms can be interpolated
        })
        .attr("text-anchor", function (d) {
            return d.angle > Math.PI ? "end" : "start";
        });

// Update all chord title texts
chordPaths.select("title")
    .text(function(d) {
        if (neighborhoods[d.target.index].name !== 
                neighborhoods[d.source.index].name) {

            return [numberWithCommas(d.source.value),
                    " trips from ",
                    neighborhoods[d.source.index].name,
                    " to ",
                    neighborhoods[d.target.index].name,
                    "\n",
                    numberWithCommas(d.target.value),
                    " trips from ",
                    neighborhoods[d.target.index].name,
                    " to ",
                    neighborhoods[d.source.index].name
                    ].join(""); 
                //joining an array of strings keeps a long
                //message like this one easy to read
        } 
        else { //source and target are the same
            return numberWithCommas(d.source.value) 
                + " trips started and ended in " 
                + neighborhoods[d.source.index].name;
        }
    });

//update the path shape
chordPaths.transition()
    .duration(1500)
    .attr("opacity", 0.5) //optional, just to observe the transition
    .style("fill", function (d) {
        return neighborhoods[d.source.index].color;
    })
    .attrTween("d", chordTween(last_layout))
    .transition().duration(10).attr("opacity", 1) //reset opacity
;

//add the mouseover/fade out behaviour to the groups
//this is reset on every update, so it will use the latest
//chordPaths selection
groupG.on("mouseover", function(d) {
    chordPaths.classed("fade", function (p) {
        //returns true if *neither* the source nor the target of the chord
        //matches the group that has been moused-over
        return ((p.source.index != d.index) && (p.target.index != d.index));
    });
});
//the "unfade" is handled with the CSS :hover pseudo-class on g#circle
//you could also do it using a mouseout event on the g#circle

The changes are done using d3 transitions to create a smooth shift from one diagram to another. For the changes to the path shapes, custom functions are used to do the transition while maintaining the overall shape. More about those below.

Fourth, the exit() chain. If any elements from the previous diagram no longer have a match in the new data (for example, if a chord doesn’t exist because there are no relationships between those two groups, e.g., no trips between those two neighbourhoods, in this data set), then you have to remove that element from the visualization. You can either remove them immediately, so they disappear to make room for transitioning data, or you can transition them out and then remove them. (Calling .remove() on a transition-selection will remove the element when that transition completes.)

You could create a custom transition to make shapes shrink into nothing, but I just use a fade-out to zero opacity:

//handle exiting groups, if any, and all their sub-components:
groupG.exit()
    .transition()
        .duration(1500)
        .attr("opacity", 0)
        .remove(); //remove after transitions are complete


//handle exiting paths:
chordPaths.exit().transition()
    .duration(1500)
    .attr("opacity", 0)
    .remove();

About the custom tween functions:

If you just use a default tween to switch from one path shape to another, the results can look kind of strange. Try switching from “Men Only” to “Women Only” and you’ll see that the chords get disconnected from the edge of the circle. If the arc positions had changed more significantly, you would see them crossing the circle to reach their new position instead of sliding around the ring.

That’s because the default transition from one path shape to another just matches up points on the path and transitions each point in a straight line from one to the other. It works for any type of shape without any extra code, but it doesn’t necessarily maintain that shape throughout the transition.

The custom tween function lets you define how the path should be shaped at every step of the transition. I’ve written up comments about tween functions here and here, so I’m not going to rehash it. But the short description is that the tween function you pass to .attrTween(attribute, tween) has to be a function that gets called once per element, and must itself return a function that will be called at every “tick” of the transition to return the attribute value at that point in the transition.

To get smooth transitions of path shapes, we use the two path data generator functions (the arc generator and the chord generator) to create the path data at each step of the transition. That way, the arcs will always look like arcs and the chords will always look like chords. The part that is transitioning is the start and end angle values. Given two different data objects that describe the same type of shape, but with different angle values, you can use d3.interpolateObject(a,b) to create a function that will give you an object at each stage of the transition with appropriately interpolated angle properties. So if you have the data object from the old layout and the matching data object from the new layout, you can smoothly shift the arcs or chords from one position to the other.
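As a rough picture of what interpolating the angle properties means, here is a hand-rolled linear interpolator for arc data objects. It only mimics the idea for the two angle properties and is not d3’s interpolator; interpolateAngles and the sample values are invented for this example.

```javascript
// Hand-rolled stand-in for interpolating two arc data objects.
// Returns a function of t (0 to 1) giving the in-between angles.
function interpolateAngles(from, to) {
    return function (t) {
        return {
            startAngle: from.startAngle + (to.startAngle - from.startAngle) * t,
            endAngle: from.endAngle + (to.endAngle - from.endAngle) * t
        };
    };
}

var oldArc = { startAngle: 0, endAngle: 1 };
var newArc = { startAngle: 2, endAngle: 4 };
var angleAt = interpolateAngles(oldArc, newArc);

var midway = angleAt(0.5);
console.log(midway); // halfway between the old and new angles
```

Feeding each in-between object to the arc (or chord) generator is what keeps the shape a proper arc (or chord) at every tick, instead of morphing the raw path points.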

However, what should you do if you don’t have an old data object, either because this chord didn’t have a match in the old layout, or because this is the first time the visualization is drawn and there is no old layout? If you pass an empty object as the first parameter to d3.interpolateObject, the transitioned object will always be exactly the final value. In combination with other transitions, such as opacity, this could be acceptable. However, I decided to make the transition such that it starts with a zero-width shape (that is, a shape where the start angles match the end angles) and then expands to the final shape:

function chordTween(oldLayout) {
    //this function will be called once per update cycle

    //Create a key:value version of the old layout's chords array
    //so we can easily find the matching chord 
    //(which may not have a matching index)

    var oldChords = {};

    if (oldLayout) {
        oldLayout.chords().forEach( function(chordData) {
            oldChords[ chordKey(chordData) ] = chordData;
        });
    }

    return function (d, i) {
        //this function will be called for each active chord

        var tween;
        var old = oldChords[ chordKey(d) ];
        if (old) {
            //old is not undefined, i.e.
            //there is a matching old chord value

            //check whether source and target have been switched:
            if (d.source.index != old.source.index ){
                //swap source and target to match the new data
                old = {
                    source: old.target,
                    target: old.source
                };
            }

            tween = d3.interpolate(old, d);
        }
        else {
            //create a zero-width chord object
            var emptyChord = {
                source: { startAngle: d.source.startAngle,
                         endAngle: d.source.startAngle},
                target: { startAngle: d.target.startAngle,
                         endAngle: d.target.startAngle}
            };
            tween = d3.interpolate( emptyChord, d );
        }

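        return function (t) {
            //this function calculates the intermediary shapes;
            //`path` must be the d3.svg.chord() path generator
            //(the one called chordFunction earlier in this answer)
            return path(tween(t));
        };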
    };
}

(Check the fiddle for the arc tween code, which is slightly simpler)
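For reference, an arc tween along the same lines could look something like the sketch below. It mirrors the structure of chordTween but is not necessarily the fiddle’s exact code. To keep it runnable without d3, the interpolator and the path generator are passed in as parameters (that is my assumption for the example; in the page you would pass d3.interpolate and your arc generator), and linearAngles and describeArc are simple stand-ins for them.

```javascript
// Sketch of an arc tween factory in the style of chordTween.
// makeArcTween, linearAngles and describeArc are hypothetical names.
function makeArcTween(oldGroups, interpolate, arcGenerator) {
    //index the old groups by their original matrix index
    var oldByIndex = {};
    (oldGroups || []).forEach(function (group) {
        oldByIndex[group.index] = group;
    });

    return function (d) {
        //called once per group: start from the old arc if there is one,
        //otherwise from a zero-width arc at the new start angle
        var start = oldByIndex[d.index] ||
            { startAngle: d.startAngle, endAngle: d.startAngle };
        var tween = interpolate(start, d);

        return function (t) {
            //called at every tick of the transition
            return arcGenerator(tween(t));
        };
    };
}

//stand-in for d3.interpolate, handling only the two angle properties
function linearAngles(a, b) {
    return function (t) {
        return {
            startAngle: a.startAngle + (b.startAngle - a.startAngle) * t,
            endAngle: a.endAngle + (b.endAngle - a.endAngle) * t
        };
    };
}

//stand-in for the arc generator: describes the arc instead of drawing it
function describeArc(d) {
    return d.startAngle + " to " + d.endAngle;
}

var tweenFactory = makeArcTween(
    [{ index: 0, startAngle: 0, endAngle: 1 }], linearAngles, describeArc);
var existingArc = tweenFactory({ index: 0, startAngle: 1, endAngle: 3 });
var brandNewArc = tweenFactory({ index: 1, startAngle: 3, endAngle: 5 });

console.log(existingArc(0.5)); // partway between the old and new arcs
console.log(brandNewArc(0));   // zero-width arc at the start of the transition
```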

Live version altogether: http://jsfiddle.net/KjrGF/12/
