Using D3.js – part 2

 

Last week, we made a simple interactive bar chart using the JavaScript library D3.js. Today, we are going to build on this by making a simple scatterplot and adding axes. We will build this from the beginning to review all steps of the process.

First we need our data. We will be using some data compiled by Nate Silver’s 538. In his story “Do Pulitzers Help Newspapers Keep Readers,” Nate Silver uses a scatterplot to look at the relationship between the number of Pulitzer winners and finalists and newspapers’ change in circulation during the last 10 years.

We are going to pull their data and look at it in a slightly different way. Instead of looking at the change in circulation, we are just going to look at current circulation and compare it to the number of Pulitzer winners and finalists during the last 10 years. I guess our headline would be “Do the Biggest Newspapers Win the Most Pulitzers.”

We can grab the data they compiled from their GitHub site. We could actually just tell D3 to read a CSV file and point it to the raw file hosted on Github, but that adds a few extra steps beyond the scope of this example.

Regardless, I have pulled together the data we need. Remember, we start by creating a variable called “dataset” and filling it with our data. For the bar graph, we used an array: [value 1, value 2]. This time we will use an array of arrays (see below). In the first row, we create the variable and open the array. In the second row, we enter another array with two values, 20 and 2.5. Twenty is the number of Pulitzers (or finalists) this organization has had and is the value for the x-axis; 2.5 is the circulation (in millions) and is the value for the y-axis.

var dataset = [
 [20, 2.5],
 [62, 1.9],
 [1, 1.7],
 [41, 0.7],
 [2, 0.6],
 [2, 0.5],
 [0, 0.5],
 [48, 0.5],
 [1, 0.4],
 [8, 0.4],
 [15, 0.4],
 [6, 0.4],
 [6, 0.4],
 [3, 0.4],
 [2, 0.3],
 [6, 0.3],
 [11, 0.3],
 [7, 0.3],
 [8, 0.3],
 [4, 0.3],
 [2, 0.3],
 [2, 0.2],
 [15, 0.2],
 [5, 0.2],
 [5, 0.2],
 ];

Next, we need to define the size of our graphic. This is as simple as creating a variable for the width and the height.

var w = 700;
var h = 500;

Now we can create the graphic. First, we need to create the wrapper for the graphic (this is the space on the page where the graphic will be inserted).

var svg = d3.select("body")
 .append("svg")
 .attr("width", w)
 .attr("height", h);

Basically, the above code says, “Use D3 to append an ‘SVG’ or Scalable Vector Graphic to the body and make it ‘w’ width and ‘h’ height.” Now we just need to tell D3 what to put in that space.

svg.selectAll("circle")
 .data(dataset)
 .enter()
 .append("circle")
 .attr("cx", function(d){
 return d[0];
 })
 .attr("cy", function(d) {
 return d[1];
 })
 .attr("r", 5);

This code starts much like the last example, by appending a circle for each item in our dataset. Then it adds some attributes to the circles with the “.attr” method.

The first attribute is “cx” or the position on the x axis. We pull the value for this attribute by using the magical “d” variable. Remember, the “d” variable, when passed into a function, will automatically cycle through our whole dataset creating a value for each item. In our last example we used it to define the height of each bar. Now we are using it to define the position on the x axis. The value the function returns is d[0] or the value in the zero position – which is the first number – for each item in our dataset.

Next we do the same thing for the “cy” attribute, except we return d[1], the second number in each inner array. Finally, we must define the radius of our circle. For now we will just give each circle a radius of 5 px.
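
To make the role of “d” a little more concrete, here is a rough sketch in plain JavaScript, with no D3 involved. The names “sample,” “getX” and “getY” are made up for this illustration, and the map call only approximates how D3 calls our function once per bound item.

```javascript
// A small subset of the tutorial's dataset: [pulitzers, circulation in millions].
var sample = [
  [20, 2.5],
  [62, 1.9],
  [1, 1.7]
];

// The same kind of accessor functions we pass to .attr("cx", ...) and .attr("cy", ...).
// getX and getY are hypothetical names used only in this sketch.
var getX = function(d) { return d[0]; };
var getY = function(d) { return d[1]; };

// Call each accessor once per item, roughly as D3 does for bound data.
var cxValues = sample.map(function(d) { return getX(d); });
var cyValues = sample.map(function(d) { return getY(d); });

console.log(cxValues); // [20, 62, 1]
console.log(cyValues); // [2.5, 1.9, 1.7]
```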

We can now see our amazing graphic.

WHAT?!?! What happened?

Here’s the code. What is going wrong?


Did you figure it out?

Yep! It’s that we need to scale the variables! Remember in the bar chart, we used the following code to scale up the height.

.style("height", function(d) {
	var barHeight = d * 5;
	return barHeight + "px";
});

We multiplied the data by 5 to make the bars tall enough for the page. This code worked, but we can also have D3 do this for us. We should have D3 do it for three reasons:

  1. It works better for axes (which we’ll be adding soon enough).
  2. It’s easier. When we have a lot of data, it’s not always easy to guess what we should multiply by to make it look nice.
  3. It’s adaptable. If we end up needing to add data to our chart, it might throw off our manual scaling. If we automate it, D3 will do it all for us.

To automate the scaling, all we need to do is write a function that maps our data values to pixel positions, and then change how the attributes compute the x and y positions. We’ll start with the function:

 var xScale = d3.scale.linear()
   .domain([0, d3.max(
     dataset, function(d){
       return d[0]; 
     })])
   .range([0, w]);

What we’re doing here is assigning a function to the variable “xScale.” The function figures out the correct scale in order to fit all of our data in the space we allotted in the wrapper (i.e., 700 x 500 pixels). It does this through setting a domain and a range. The domain is the span of values in our data; the range is the span of output values, in pixels. In our example, the domain is 0 to 62 and the range is 0 to 700.

The above code calls the linear scale function built into D3. Then we assign the domain and the range. The domain is set to 0 and the maximum value in our dataset. Then the range is set to 0 and the width of our space. Our data only goes up to 62, but if we all of a sudden need to add a 100 to the data, D3 will use the max function to find that value and adjust the scale properly.
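
If it helps to see the arithmetic, here is a minimal sketch of what a linear scale boils down to. The “linearScale” helper and the “xScaleSketch” name are made up for this illustration; this is not D3’s own code, just the same kind of mapping written by hand, with the domain maximum of 62 typed in rather than found with d3.max.

```javascript
// Hypothetical helper (not part of D3): builds a function that maps
// a value from the domain [d0, d1] onto the range [r0, r1].
function linearScale(domain, range) {
  return function(value) {
    // How far along the domain the value sits, as a fraction.
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    // The same fraction of the way along the range.
    return range[0] + t * (range[1] - range[0]);
  };
}

var w = 700;
var xScaleSketch = linearScale([0, 62], [0, w]);

console.log(xScaleSketch(0));  // 0
console.log(xScaleSketch(31)); // 350
console.log(xScaleSketch(62)); // 700
```

The largest value in the data lands on the right edge of the wrapper, and everything else falls proportionally in between.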

We do the same thing for the y-axis.

 var yScale = d3.scale.linear()
 .domain([0, d3.max(
 dataset, function(d){
 return d[1]; 
 })])
 .range([h , 0]);
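
Notice that the range here runs from “h” down to 0 instead of 0 up to “h.” In an SVG, y = 0 is the top edge and y grows downward, so flipping the range puts small values near the bottom of the chart. A rough sketch of the effect, again using a made-up “linearScale” helper rather than D3 itself and a hand-typed maximum of 2.5:

```javascript
// Hypothetical helper (not part of D3): maps domain [d0, d1] onto range [r0, r1].
function linearScale(domain, range) {
  return function(value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var h = 500;
// Range is [h, 0]: the smallest value maps to the bottom edge of the SVG.
var yScaleSketch = linearScale([0, 2.5], [h, 0]);

console.log(yScaleSketch(0));    // 500 (bottom edge)
console.log(yScaleSketch(1.25)); // 250 (middle)
console.log(yScaleSketch(2.5));  // 0   (top edge)
```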

Then all we need to do is update the attributes in the code that actually draws the graphic.

 .attr("cx", function(d){
    return xScale(d[0]); 
 })
 .attr("cy", function(d) {
    return yScale(d[1]);
 })

You’ll notice that this isn’t much different from the last time around. Instead of directly returning the value from our dataset, we now run it through the scale function we created so it is properly scaled.

We can now see our updated amazing graphic.

What is wrong now?? And why is it happening?

Here is the code.


Did you figure it out?

You’re right…our circles on the edge are getting cut off. That’s because they are falling outside of the space we set up for our wrapper. For example, one of our data points has a value of zero. That means the center of the circle is on the “0” line, which is the very edge of the wrapper. So half the circle falls outside the wrapper.
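
We can check this with a little arithmetic. The sketch below uses a made-up “linearScale” helper (not D3 itself) and just four points from our dataset, and tests whether any part of each 5 px circle falls outside the 700 x 500 wrapper.

```javascript
// Hypothetical helper (not part of D3): maps domain [d0, d1] onto range [r0, r1].
function linearScale(domain, range) {
  return function(value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var w = 700;
var h = 500;
var r = 5; // circle radius in pixels

// Four points taken from the tutorial's dataset.
var sample = [[20, 2.5], [62, 1.9], [0, 0.5], [48, 0.5]];

// Unpadded scales, with the dataset maximums (62 and 2.5) typed in by hand.
var xScaleSketch = linearScale([0, 62], [0, w]);
var yScaleSketch = linearScale([0, 2.5], [h, 0]);

// Keep the points whose circle crosses any edge of the wrapper.
var clipped = sample.filter(function(d) {
  var cx = xScaleSketch(d[0]);
  var cy = yScaleSketch(d[1]);
  return cx - r < 0 || cx + r > w || cy - r < 0 || cy + r > h;
});

console.log(clipped.length); // 3
```

Three of these four circles get cut off: the one at x = 0, the one at the maximum x, and the one at the maximum y.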

We fix that by putting a little padding around the chart. We start by adding a variable to define the width of the padding. We don’t need much.

var padding = 30;

Then we change the scaling functions to incorporate the padding. Specifically, we need to change the range.

 var xScale = d3.scale.linear()
 .domain([0, d3.max(
 dataset, function(d){
 return d[0]; 
 })])
 .range([padding, w - padding]);

All that I changed was the last line of code. It was “0” and “w” or 700. Now it is “padding” or 30 and “w – padding” or 700 – 30.

You do the same thing for the yScale variable. You can see what it ends up looking like here. Here is the current full code.
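
Here is a rough sketch of what the padded ranges do to the numbers. The “linearScale” helper is made up for this illustration and is not D3’s code. The padded y range of [h - padding, padding] is my assumption of what “the same thing” looks like for yScale, keeping the flipped order from before.

```javascript
// Hypothetical helper (not part of D3): maps domain [d0, d1] onto range [r0, r1].
function linearScale(domain, range) {
  return function(value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var w = 700;
var h = 500;
var padding = 30;

// Padded ranges; the y range is still flipped so small values sit at the bottom.
var xScaleSketch = linearScale([0, 62], [padding, w - padding]);
var yScaleSketch = linearScale([0, 2.5], [h - padding, padding]);

console.log(xScaleSketch(0));   // 30
console.log(xScaleSketch(62));  // 670
console.log(yScaleSketch(0));   // 470
console.log(yScaleSketch(2.5)); // 30
```

With 30 px of padding on every side, a circle with a 5 px radius centered on any of these extremes still sits comfortably inside the wrapper.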

It really is starting to look nice, but we need to add some axes to this chart.

First we start by creating a variable that calls D3’s axis function. Like so:

 var xAxis = d3.svg.axis()
   .scale(xScale)
   .orient("bottom")
   .ticks(5);
 
 var yAxis = d3.svg.axis()
   .scale(yScale)
   .orient("left")
   .ticks(5);

So we call the function, then tell it to use the scales we already set up, orient the axes on the bottom and left, respectively, and break each axis up into about 5 ticks, or major units in Excel parlance.
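
One thing worth knowing is that the number passed to ticks is a hint, not a guarantee: D3 prefers round tick values, so you may get a few more or fewer than you asked for. The sketch below is a hand-written heuristic of that general idea. The “roundTicks” function is made up for this illustration, and D3’s own logic may differ in its details.

```javascript
// Hypothetical helper (not D3's API): pick a round step of 1, 2, or 5
// times a power of ten so that roughly `count` ticks fit in [min, max].
function roundTicks(min, max, count) {
  var span = max - min;
  // Start from the power of ten just below the ideal step size.
  var step = Math.pow(10, Math.floor(Math.log(span / count) / Math.LN10));
  // If that step would give far too many ticks, bump it to 2x, 5x or 10x.
  var err = (count / span) * step;
  if (err <= 0.15) { step *= 10; }
  else if (err <= 0.35) { step *= 5; }
  else if (err <= 0.75) { step *= 2; }

  // Collect every multiple of the step that falls inside [min, max].
  var ticks = [];
  for (var i = Math.ceil(min / step); i <= Math.floor(max / step); i++) {
    ticks.push(i * step);
  }
  return ticks;
}

console.log(roundTicks(0, 62, 5)); // [0, 10, 20, 30, 40, 50, 60]
```

We asked for 5 ticks on a 0 to 62 domain and this heuristic hands back 7, because a step of 10 is the nearest round number.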

Next, we have to actually draw the lines on to the SVG we already created.

 svg.append("g")
 .attr("transform", "translate(0," + (h - padding) + ")")
 .call(xAxis);
 
 svg.append("g")
 .attr("transform", "translate(" + padding + ", 0)")
 .call(yAxis);

With the above code, we are adding to our SVG. Unlike the “circle” or “bar” we have already drawn, we are now using the “g” element to draw a whole group of things (e.g., lines, text). We add a “transform” attribute to move each axis to where we want it, and then we use the call function to pull in the axis info we already defined. It should look like this now:
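
The “transform” values are just strings we glue together from our variables. A quick sketch of what they work out to with our numbers (the “xAxisTransform” and “yAxisTransform” names are made up for this illustration):

```javascript
var h = 500;
var padding = 30;

// Same string concatenation used in the .attr("transform", ...) calls.
var xAxisTransform = "translate(0," + (h - padding) + ")";
var yAxisTransform = "translate(" + padding + ", 0)";

console.log(xAxisTransform); // translate(0,470)
console.log(yAxisTransform); // translate(30, 0)
```

So the x-axis group is shifted 470 px down, to the bottom of the padded area, and the y-axis group is shifted 30 px right, to its left edge.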

[Screenshot of the chart with axes]

 

Next, we can style the axes using CSS and adding an attribute. The styling is just like styling the bars in the example from Thursday.

<style type="text/css">
 
 .axis path, 
 .axis line {
 fill: none;
 stroke: black; 
 shape-rendering: crispEdges;
 }
 
 .axis text {
 font-family: sans-serif;
 font-size: 11px;
 }
 </style>

Then we apply it by adding one more attribute to each of the “g” groups we appended for the axes.

.attr("class", "axis")

Finally, we have a good looking graphic with axes. Anything weird?

Here is the current code.