I’d like D3 to become the standard library of data visualization: not just a tool you use directly to visualize data by writing code, but also a suite of tools that underpin more powerful software.

To this end, D3 espouses abstractions that are useful for any visualization application and rejects the tyranny of charts.

As Leland Wilkinson wrote in The Grammar of Graphics,

If we endeavor to develop a charting instead of a graphing program, we will accomplish two things. First, we inevitably will offer fewer charts than people want. Second, our package will have no deep structure. Our computer program will be unnecessarily complex, because we will fail to reuse objects or routines that function similarly in different charts. And we will have no way to add new charts to our system without generating complex new code. Elegant design requires us to think about a theory of graphics, not charts.

If visualization is constructing “visual representations of abstract data to amplify cognition”, then perhaps the most important concept in D3 is the scale, which maps a dimension of abstract data to a visual variable.

And now scales are available in a standalone library, d3-scale.

But what is a “dimension” of data? Or a “visual variable”? Consider a table of data, as in a spreadsheet. Each row in the table is a vector, and each column is a dimension. A dimension is just a named attribute whose values have a particular meaning, such as a price in dollars.

We typically think of dimensions as spatial and quantitative, such as a position in space represented by real numbers ⟨x, y, z⟩. Yet with abstract data there are also non-quantitative dimensions; for example, diamond cut quality (fair, good, very good, ideal) is ordinal, while diamond cut shape (princess, round, marquise, etc.) is categorical.

Visual variables are best explained by Jacques Bertin in Semiology of Graphics. He described how graphical marks (say, dots in a scatterplot) can represent data using planar position ⟨x, y⟩ and a luminous dimension z:

Within the plane a mark can be at the top or the bottom, to the right or the left. The eye perceives two independent dimensions along X and Y, which are distinguished orthogonally. A variation in light energy produces a third dimension in Z, which is independent of X and Y… The eye is sensitive, along the Z dimension, to 6 independent visual variables, which can be superimposed on the planar figures: the size of the marks, their value, texture, color, orientation, and shape. They can represent differences (≠), similarities (≡), a quantified order (Q), or a nonquantified order (O), and can express groups, hierarchies, or vertical movements.

From Semiology of Graphics, colorized by the author.

Thus, a scale is a function that takes an abstract value of data, such as the mass of a diamond in carats, and returns a visual value such as the horizontal position of a dot in pixels. With two scales (one each for x and y), we have the basis for a scatterplot.