1 Introduction

Graphics is all about geometry. In particular, there are points that have a certain color, and the pattern of colors gives you a certain picture. You can change the way the points are distributed by applying certain transformations like translation (movement), rotation and scaling. While most people familiar with the subject know that these kinds of transformations are generally done with matrices, many seem unaware of what these operations actually mean. They know how to do a matrix multiplication, but not why it's necessary, or why to use this matrix and not that one. It's just a bit of magic you need to do.

In this document, I'll try to show how the magic works. This consists of two parts. First, I'll discuss a way of looking at points, vectors and coordinate systems that differs from what you might have learned in school. This perspective will give you a better understanding of what these concepts actually mean. After that, I'll explain what matrices mean in a geometrical environment, why they're used for these transformations and how to construct your own matrices with ease.

After that, I'll provide some examples so that you can see how it works in practice, and I'll probably derail to some things that might not have anything to do with matrices themselves, but are still related to analytical geometry and worthwhile to know.

2 Points, vectors and coordinate systems

2.1 Standard view of coordinates

Your first experience with the concept of coordinates was probably along the lines of Fig 1. First, you have a coordinate system, composed of a number of axes with numbers along them. Then you have points, given as sets of coordinates. To plot a point, you take its coordinates, find the lines of those coordinates on the axes and put a dot on the intersection of those lines.

Fig 1 is an example of this in two dimensions. The axes are marked by x and y, and you have a set of coordinate pairs for five points. The right-most image shows how to find the right position of point P, which has coordinates (x, y) = (3, 4). The lines x=3 and y=4 are given in blue, and their intersection marks the location of point P. The other points can be found in a similar fashion.

In this method, the coordinate system is fixed and the points live inside it, each with its position indicated by its coordinates. This view works and even works well, but it has downsides too. For one, it gives the coordinate system and coordinates more important roles than they should have. Defining points by their coordinates hides the nature of both. It also hides the difference between points and vectors, since both are represented by sets of coordinates.

For an understanding of geometry, it's better to take a different view: instead of keeping the coordinate system fixed and the points inside it, consider the points as fixed entities in space, and the coordinate system imposed on this space. This is a better presentation of things, since geometry is about points and vectors, not coordinates.

2.2 Alternate view of coordinates

Points and vectors

The points-first perspective considers points and vectors to be the primary concepts, and the coordinates and coordinate systems as secondary. Before going to the latter group, I first have to define what points and vectors are.

Points are … well, they're points. It's hard to really describe something so basic. Technically, points are dimensionless entities indicating positions in space. They're the foundation of geometry; everything else is built up from points or describes relations between points.

A line is a connection between two points. A vector is a special kind of line: a geometric entity with a direction and a magnitude (length). Vectors are usually written in bold (u). Another common notation puts an arrow over the letter, but this usually requires a special text editor or viewer, so I'll stick to bold.

While vectors are essentially the difference between two points, they aren't fixed to any position. You can have identical vectors at different locations (see Fig 3a). As long as the direction and magnitude are equal, they count as the same vector.

You can scale a vector by changing its magnitude (see Fig 3b). Scaling by two makes the line twice as long. A negative scale makes it point the other way, effectively reversing its direction. Scaling a vector is written as a multiplication by a number: cu. The c here is called a scalar, because it scales whatever it's multiplied with.

A vector is a step from one point to another. You can reach different points by concatenating vectors: putting them head to tail. The difference between the first and final point is another vector. This whole procedure works as an addition, hence it's written as such and named vector addition: w = u + v. Note that subtraction is just adding a negative, so you also have u = w + (−v), or u = w − v. Also remember that multiplication is just repeated addition, which is also true for vectors. Doing u + u just gives a vector twice as long as the original u, which is the definition of a 2× scaling.

Scaling and adding vectors is called a linear combination. For example, if you have two vectors u and v, then a third vector, w, can be constructed via a linear combination of u and v: w = au + bv, where a and b are arbitrary scalars.

I know this is a mouthful, but what it really means is just scaling terms and adding them together. The concept of linear combinations is important in linear algebra and other fields, so try to remember it.
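To make this concrete, here is a minimal Python sketch of scaling, adding and combining vectors. It assumes vectors are stored as plain tuples of numbers; the helper names `scale`, `add` and `linear_combination` and the example values are made up for this illustration.

```python
def scale(c, u):
    """Scale vector u by the scalar c."""
    return tuple(c * ui for ui in u)


def add(u, v):
    """Add two vectors component by component (head to tail)."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def linear_combination(scalars, vectors):
    """Return scalars[0]*vectors[0] + scalars[1]*vectors[1] + ..."""
    result = tuple(0 for _ in vectors[0])
    for c, vec in zip(scalars, vectors):
        result = add(result, scale(c, vec))
    return result


# Hypothetical example vectors
u = (1, 0)
v = (1, 2)

w = linear_combination((3, -1), (u, v))  # w = 3u - v
print(w)  # (2, -2)
```

Note that `add(u, u)` and `scale(2, u)` give the same result, which is the "multiplication is repeated addition" remark in code form.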

So: points exist in space; vectors are the differences between points. In a 2D space, you can reach any point P from another point, O, via a linear combination of two vectors, u and v. This can be summed up as follows.

$$P = O + x\mathbf{u} + y\mathbf{v} \tag{1}$$

In Eq 1, u and v are two arbitrary vectors, O is some reference point and P is the point we want to reach. The terms x and y are the scaling factors for u and v, respectively, that make the equation fit. To put it another way, to reach P from O, you need to take x steps along vector u and y steps along v.

Side note: Linear dependencies

Technically, Eq 1 isn't quite complete. There is a condition on u and v, namely that they are linearly independent. Suppose you have a set of vectors. A vector from this set is said to be linearly dependent when it can be formed via a linear combination of the others, and linearly independent when it can't. Linearly dependent vectors don't add additional information, and can be removed until you are left with only independent ones. For an N-dimensional space, you only need N linearly independent vectors to span the whole space. I'm assuming all my vectors here are linearly independent, because that's the only case of relevance.
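For two vectors in 2D the check is simple: they are linearly dependent exactly when one is a scalar multiple of the other (or one of them is zero). The following is a rough sketch under the assumption that vectors are tuples of Cartesian components; the function name and the tolerance `eps` are arbitrary choices for this illustration.

```python
def is_independent_2d(u, v, eps=1e-12):
    """Return True if the 2D vectors u and v are linearly independent.

    If v = c*u for some scalar c (or either vector is zero), then
    u[0]*v[1] - u[1]*v[0] is zero and the pair spans at most a line.
    """
    return abs(u[0] * v[1] - u[1] * v[0]) > eps


print(is_independent_2d((1, 0), (1, 2)))  # True: usable as base vectors
print(is_independent_2d((1, 2), (2, 4)))  # False: second is 2x the first
```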

Creating a coordinate system

Eq 1 is a very important equation. It basically defines what coordinates and coordinate systems mean. A coordinate system is a way of dividing space. It consists of two things: a reference point called the origin, and a number of base vectors defining the principal axes of the system. The main purpose of a coordinate system is to assign a set of numbers to each point in the space. You can reach any point from the origin via a linear combination of the base vectors, exactly like Eq 1 says. The coordinates are the scaling factors for the base vectors: the number of steps along the base vectors needed to reach a point.

This may sound like a roundabout way of stating the obvious, but it's important to get these ground rules down. So, the coordinates of a point are the scalars in the linear combination of base vectors. The thing here is that the choice of base vectors (and the origin) is arbitrary: any set of vectors will do. Consequently, the values of the coordinates for a point will depend on your choice of coordinate system.
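As a small sketch of Eq 1 in code: the function below walks from the origin along two base vectors. It assumes the origin and base vectors are themselves given as number pairs relative to some underlying reference (think of positions on the sheet of paper); the function name and all values are hypothetical.

```python
def point_from_coords(origin, u, v, x, y):
    """Eq 1: P = O + x*u + y*v, with everything given as (x, y) tuples."""
    return (origin[0] + x * u[0] + y * v[0],
            origin[1] + x * u[1] + y * v[1])


O = (0, 0)

# The same coordinates (3, 4) used with two different sets of base vectors
p1 = point_from_coords(O, (1, 0), (0, 1), 3, 4)
p2 = point_from_coords(O, (1, 0), (1, 2), 3, 4)

print(p1)  # (3, 4)
print(p2)  # (7, 8)
```

The coordinates are identical, yet the points reached differ, because the coordinates only have meaning together with a choice of origin and base vectors.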

As a way of visualising this, consider the following. Take a sheet of paper and place some dots on it. This is the space with a number of points. Then take a transparent sheet and draw a single point and two arrows springing from it. These are the origin and the base vectors of the coordinate system. With the base vectors, you can create a grid on this sheet to make it easier to read off coordinates. Now you can place this transparency over the space in any position and orientation and find the coordinates of the points. Placing it differently will give another set of coordinates.

Coordinate system examples

An example is given in Fig 4. First, there's the set of points from before. In Fig 4b, we take the lines AB and AC to be the base vectors u and v, respectively. The coordinate space formed by this looks like Fig 4c.

Note that I've marked the ticks on the main axes by multiples of u and v, not just as numbers. This is really how a coordinate system is supposed to work. You can consider the base vectors as the units of geometry, and the coordinates are the scalars indicating how many of those units you need. This is just like working with metres, seconds and all that jazz. And just like those things, the units are usually ignored during calculations; but that doesn't mean they're not there.

Finally, in Fig 4d, the coordinate system is superimposed over the space, giving the points coordinates. In this case, point P can be reached by three steps along u and four along v, so its coordinates are (3, 4).

Now, that's one example of a coordinate system. But like I said, you can pick any kind of system you like. In Fig 5, I've used lines AB and AD for the base vectors. The coordinate system now looks a little different. But more importantly, the coordinates of P have changed as well! This is only natural, as different base vectors require different scalars as compensation.

Note that so far I haven't really attached any kind of numbers to the vectors yet; just the coordinates. While points and vectors aren't exactly the same, they are related. Vectors are the differences or distances between points. The coordinates of vectors are essentially the differences between those of the points. It's a small variation of Eq 1:

$$\mathbf{p} = P - O = x\mathbf{u} + y\mathbf{v} = (x, y) \tag{2}$$

So the coordinates of a point form a vector (specifically, a coordinate vector). Conversely, a vector can be expressed as a set of coordinates. The base vectors are no exception to this. Following Eq 2, we have u = 1u + 0v = (1, 0) and v = 0u + 1v = (0, 1). These values shouldn't be too surprising, as that's the whole point of base vectors: in their own base, they are the unit vectors. When you start to use them in another base, however, things change. And this brings us to the subject of coordinate transformations.

2.3 Coordinate transformations

As Fig 4 and Fig 5 show, a point will have different coordinates in different systems. The process of calculating one set of coordinates from another is called a coordinate transformation. What it actually means, though, is a change of bases: going from one set of base vectors to another.

Before I continue, I do have to do some renaming. Because we're now dealing with multiple systems, it's vital that the components don't all use the same names. Fig 6 shows the two systems I've been using. The one on the left can be recognized as the standard Cartesian coordinate system. I'll call this E, and its base vectors are e_1 and e_2; or E = {e_1, e_2} for short. The Cartesian coordinate system is more or less the standard form. It's so common that its existence is usually taken for granted. The second system is S = {u, v}. This is not the standard system, and u and v can be any (linearly independent) vectors.

The same point can be expressed in the different systems by different coordinates (see Eq 3). The coordinates in S are x or x_S, and those in E are x′ or x_E.

$$P = O + x\mathbf{u} + y\mathbf{v} = O + x'\mathbf{e}_1 + y'\mathbf{e}_2 \tag{3}$$

Performing a coordinate transformation starts by placing one system inside another – usually the non-Cartesian system is embedded in the Cartesian one: S inside E (see Fig 7). This means that u and v are now ordinary vectors in E and can be expressed as such. In this particular case, u = (1, 0) and v = (1, 2).

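$$\begin{aligned} \mathbf{u} &= 1\mathbf{e}_1 + 0\mathbf{e}_2 = (1, 0) \\ \mathbf{v} &= 1\mathbf{e}_1 + 2\mathbf{e}_2 = (1, 2) \end{aligned} \tag{4}$$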

With these expressions, you can map coordinates inside S to ones in E via linear combinations.

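$$\mathbf{x}_E = x\mathbf{u} + y\mathbf{v} = 1\begin{bmatrix}1\\0\end{bmatrix} + 2\begin{bmatrix}1\\2\end{bmatrix} = \begin{bmatrix}3\\4\end{bmatrix} \tag{5}$$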

And indeed, the coordinates of P in E are (3, 4). You can also do the inverse in a similar manner, although the first step, expressing e_1 and e_2 in terms of u and v, is a bit trickier.
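Both directions can be sketched in a few lines of Python. The values for u and v are the ones given above; the expressions for e_1 and e_2 in S were worked out by hand for this particular S (e_1 = u and e_2 = (v − u)/2), and the names used are purely illustrative.

```python
# Base vectors of S expressed in E (from the text: u = (1, 0), v = (1, 2))
U_IN_E = (1.0, 0.0)
V_IN_E = (1.0, 2.0)

# Base vectors of E expressed in S, solved by hand for this particular S:
#   e1 = u            -> (1, 0)
#   e2 = (v - u) / 2  -> (-1/2, 1/2)
E1_IN_S = (1.0, 0.0)
E2_IN_S = (-0.5, 0.5)


def combine(x, y, a, b):
    """Linear combination x*a + y*b of two 2D vectors."""
    return (x * a[0] + y * b[0], x * a[1] + y * b[1])


def s_to_e(p_s):
    """Map S-coordinates to E-coordinates."""
    return combine(p_s[0], p_s[1], U_IN_E, V_IN_E)


def e_to_s(p_e):
    """Map E-coordinates back to S-coordinates."""
    return combine(p_e[0], p_e[1], E1_IN_S, E2_IN_S)


print(s_to_e((1, 2)))  # (3.0, 4.0)
print(e_to_s((3, 4)))  # (1.0, 2.0)
```

The two functions have exactly the same shape; only the base vectors plugged into the linear combination differ.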

In the previous example, I expressed the same point in different systems. This is a passive coordinate transformation. An active transformation is when the points themselves change. What you do here is take the coordinates from one system and simply use them in the other. For example, point C in E and point D in S have the same coordinates: (0, 1). Transforming the latter back to E-coordinates, we find that D_E = (1, 2). In effect, the transformation has scaled and slanted the vector.

Active vs passive transformations

Both active and passive transformations use something like Eq 5. In a passive transformation, you start with the coordinates in the system you're transforming from (the source system); in an active one, you use those of the destination system as if they were from the source. A passive transform takes the points as fixed, with two coordinate systems over them. An active transform lays down one system, locks it in, and then morphs it into the other system, taking all the points with it.
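The difference is easiest to see when both cases run through the same function. This is only a sketch using the u and v from the running example; the variable names are invented for illustration.

```python
# S embedded in E, as in the running example: u = (1, 0), v = (1, 2)
U_IN_E = (1, 0)
V_IN_E = (1, 2)


def s_to_e(coords):
    """The linear combination of Eq 5: x*u + y*v, expressed in E."""
    x, y = coords
    return (x * U_IN_E[0] + y * V_IN_E[0],
            x * U_IN_E[1] + y * V_IN_E[1])


# Passive: P really has S-coordinates (1, 2); we ask for its E-coordinates.
# The point itself stays where it is.
p_in_s = (1, 2)
p_in_e = s_to_e(p_in_s)

# Active: take C's E-coordinates (0, 1) and use them as if they were
# S-coordinates. The result is a different point, D.
c_in_e = (0, 1)
d_in_e = s_to_e(c_in_e)

print(p_in_e)  # (3, 4)
print(d_in_e)  # (1, 2)
```

The arithmetic is identical in both cases; only the interpretation of the input coordinates changes.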

Oh, and please do not forget that any kind of coordinate transformation always involves two systems, with two sets of base vectors, origins and coordinates. Just because one is usually implicit does not mean it is not there.

3 Enter the matrix

3.1 Geometric interpretation of matrices

As mentioned before, vectors can be seen as lists of coordinates. The two most common notations for vectors are a comma-separated list between parentheses and a column-vector. As the name implies, a column-vector places the coordinates in a column. What vectors are for coordinates, matrices are for vectors. A matrix is essentially a set of column-vectors (see Eq 6). In a way, it is a concise notation for the base vectors of a coordinate system.

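$$\mathbf{M} = \begin{bmatrix}\mathbf{u} & \mathbf{v}\end{bmatrix} = \begin{bmatrix}u_x & v_x \\ u_y & v_y\end{bmatrix} \tag{6}$$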

A matrix-vector multiplication is simply another way of writing down Eq 5. In the matrix-vector multiplication M·x, you scale the column(-vector)s of M by the coordinates of x and add the results. In other words, it's just the linear combination of Eq 5 again.

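$$\mathbf{M}\cdot\mathbf{x} = \begin{bmatrix}u_x & v_x \\ u_y & v_y\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = x\begin{bmatrix}u_x\\u_y\end{bmatrix} + y\begin{bmatrix}v_x\\v_y\end{bmatrix} = \begin{bmatrix}x u_x + y v_x \\ x u_y + y v_y\end{bmatrix} \tag{7}$$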

Eq 7 essentially defines how a matrix-vector multiplication works. Applying that to point P=(1, 2) again, we get the following:

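$$\mathbf{M}\cdot\mathbf{x} = \begin{bmatrix}1 & 1 \\ 0 & 2\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix} = 1\begin{bmatrix}1\\0\end{bmatrix} + 2\begin{bmatrix}1\\2\end{bmatrix} = \begin{bmatrix}3\\4\end{bmatrix} \tag{8}$$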

As you can see, this is exactly the same as Eq 5.

When it comes to geometry, matrices are merely a notational device for writing down the base vectors of a coordinate system. Also, matrix-vector multiplications are shorthand for a linear combination, with the elements in the coordinate vector used as the scalars for the base vectors. Note that this is not the only interpretation for matrices, but for geometry it is the most useful one.
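The column view can be checked against the more familiar row-times-column recipe. In this sketch a matrix is assumed to be stored as a list of rows, so column j of `m` is `m[0][j], m[1][j], ...`; both function names are made up for the comparison.

```python
def matvec_columns(m, x):
    """Geometric view: scale column j of m by x[j], then add the results."""
    rows = len(m)
    result = [0] * rows
    for j, xj in enumerate(x):
        for i in range(rows):
            result[i] += xj * m[i][j]
    return result


def matvec_rows(m, x):
    """Textbook view: each output element is a row of m dotted with x."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


# Columns are u = (1, 0) and v = (1, 2) from the running example
M = [[1, 1],
     [0, 2]]

print(matvec_columns(M, [1, 2]))  # [3, 4]
print(matvec_rows(M, [1, 2]))     # [3, 4]
```

Both give the same numbers; the first version simply makes the linear combination of base vectors explicit.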

3.2 Examples of transformations

To see which matrix you need for a given coordinate transformation, all you need to do is look at the way the base vectors change. Base vectors e_1 and e_2 turn into u and v, respectively, and these vectors are the contents of the matrix. The coordinate transformation itself consists of using the old coordinates in the new system.

Here are two more examples of how this works in practice: a rotation and a scaling.

Rotation

A rotation keeps the length of a vector the same, but changes the direction. Effectively, it describes a movement along a circle. The coordinates will, of course, be some combination of cos θ and sin θ. Which combination will depend on how you define the angle, θ. To figure out where the cosine goes, remember that for a zero angle the cosine will be one, and the sine will be zero.

With θ defined as in Fig 8, it's easy to see that u = cos θ e_1 + sin θ e_2, or u = (cos θ, sin θ); and v = (−sin θ, cos θ). The matrix M = [u v] is then:

$$\mathbf{M} = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}$$
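A quick way to check a rotation matrix is to feed it the base vectors and see where they end up. The sketch below assumes the column-vector convention and the u and v given above; `rotation_matrix` and `transform` are hypothetical helper names.

```python
import math


def rotation_matrix(theta):
    """Matrix whose columns are u = (cos t, sin t) and v = (-sin t, cos t)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def transform(m, p):
    """Column-vector convention: scale the columns of m by p and add."""
    x, y = p
    return (x * m[0][0] + y * m[0][1],
            x * m[1][0] + y * m[1][1])


quarter_turn = rotation_matrix(math.pi / 2)
rotated = transform(quarter_turn, (1, 0))  # approximately (0, 1)
```

For a zero angle the function returns the identity matrix, matching the "cosine is one, sine is zero" check, and the length of a transformed vector stays the same up to floating-point rounding.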

Scaling

A scaling keeps the angles of lines constant, but changes the sizes. In Fig 9, u = s_x e_1 and v = s_y e_2. In other words, u = (s_x, 0) and v = (0, s_y), giving the following matrix:

$$\mathbf{M} = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix}$$

Note that the ‘F’ in Fig 9 is flipped vertically. Reflections are scalings as well: negative scalings, to be precise. The values for Fig 9 are two for the horizontal stretch and negative half for the flip and shrink: s_x = 2 and s_y = −½.
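With the values quoted for Fig 9, the scaling can be tried out directly. This is a sketch under the column-vector convention; the helper names and the sample point are assumptions for illustration.

```python
def scaling_matrix(sx, sy):
    """Matrix whose columns are u = (sx, 0) and v = (0, sy)."""
    return [[sx, 0],
            [0, sy]]


def transform(m, p):
    """Column-vector convention: scale the columns of m by p and add."""
    x, y = p
    return (x * m[0][0] + y * m[0][1],
            x * m[1][0] + y * m[1][1])


M = scaling_matrix(2, -0.5)

# A point one unit right and one unit up ends up two units right and
# half a unit down: stretched horizontally, flipped and shrunk vertically.
print(transform(M, (1, 1)))  # (2, -0.5)
```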

Sketch first

When you need to do transformations, always sketch the situation first. No, seriously, do; it will save you so much time. Note how the transformation changes the base vectors and the origin. Once you know that, you know the matrix you need.

3.3 Translations and homogeneous coordinates

In all the cases presented so far, I've kept the origin of the systems in the same place. Of course, this can vary as well. In Fig 10, the embedded coordinate system is moved by t = (t_x, t_y). This kind of transformation is called a translation, and is represented by simply adding t to the matrix transformation:

$$\mathbf{x}' = \mathbf{M}\cdot\mathbf{x} + \mathbf{t} \tag{9}$$

Unfortunately, the translation cannot be captured by a matrix transformation. Or can it? Remember what the matrix-vector multiplication meant again: a linear combination. So what Eq 9 actually says is this:

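$$\mathbf{x}' = \begin{bmatrix}\mathbf{u} & \mathbf{v}\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} + \mathbf{t} = x\mathbf{u} + y\mathbf{v} + \mathbf{t} \tag{10}$$

$$\mathbf{x}' = x\mathbf{u} + y\mathbf{v} + 1\,\mathbf{t} = \begin{bmatrix}\mathbf{u} & \mathbf{v} & \mathbf{t}\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix} \tag{11}$$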

In this equation, we now have three terms in the linear combination instead of just two: u and v as before, plus t, with coordinates x, y and 1, respectively. The translation is effectively another dimension. By extending all the vectors to 3 coordinates, you can include translation in the matrix as well.

These extended coordinates are called homogeneous coordinates. Because now every transformation can be written as a matrix, these things are everywhere in computer graphics. There are some special rules to work with them, but mostly it's just business as usual.

I would like to point out one thing though. The extra coordinate has a special meaning. If you look at the equations, you'll see that you only get a translation if it's non-zero. So a coordinate vector with an extended coordinate of zero is a true vector: something with a direction but no real location. If non-zero, you have the representation of a point.
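The point-versus-vector distinction can be seen directly in a small sketch. It assumes the usual 3×3 layout for 2D homogeneous coordinates, with u, v and t as the three columns and a bottom row of (0, 0, 1); the helper names and the translation (5, −1) are invented for this example.

```python
def homogeneous_matrix(u, v, t):
    """3x3 matrix with columns (u, 0), (v, 0) and (t, 1)."""
    return [[u[0], v[0], t[0]],
            [u[1], v[1], t[1]],
            [0, 0, 1]]


def matvec(m, x):
    """Plain matrix times column-vector product."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


# Running example S = {u, v}, moved by a hypothetical t = (5, -1)
M = homogeneous_matrix((1, 0), (1, 2), (5, -1))

point = matvec(M, [1, 2, 1])      # extra coordinate 1: gets translated
direction = matvec(M, [1, 2, 0])  # extra coordinate 0: translation ignored

print(point)      # [8, 3, 1]
print(direction)  # [3, 4, 0]
```

The first two elements of `direction` are the plain (3, 4) from Eq 5, while `point` has t added on top.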

3.4 Row vectors

Expressing vectors as columns is the standard, but in some fields (glowers at Direct3D), you'll also see row-vectors. The math is still the same, only everything is transposed (mirrored along the diagonal).

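$$\mathbf{x}'^{T} = \mathbf{x}^{T}\cdot\mathbf{M}^{T} = \begin{bmatrix}x & y\end{bmatrix}\begin{bmatrix}u_x & u_y \\ v_x & v_y\end{bmatrix} = \begin{bmatrix}x u_x + y v_x & x u_y + y v_y\end{bmatrix} \tag{12}$$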

This notation has one upside and two major downsides. The upside is that the elements are ordered like C-style matrices, making it easier to write them down in such languages.

The first downside is that it is not the mathematical standard. This can be more troublesome than it sounds. When documents cover coordinate transformations, they tend to just give the matrices. This is a problem because the matrices for row-vectors and column-vectors are each other's transposes, and if you use a matrix in the wrong environment, you'll almost certainly get the wrong effect. Do not blindly trust the matrices you see. Always find out if they were intended to be used on column-vectors or row-vectors first.

The second downside is probably a matter of personal preference, but I want to mention it anyway. There is a strong relation between the matrices and the base vectors of a coordinate system, as shown by Eq 7. The nice thing about column-vectors is that the transition from the form x' = xu + yv feels natural, with scaling and additions along one axis and the different dimensions in the other. This creates a nice block of equations. In row-vector form, however, everything gets dumped on one very long line, which just looks ugly. But perhaps that's just me.
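The transposition issue is easy to demonstrate. The sketch below assumes matrices stored as lists of rows and uses the matrix from the running example; the function names are illustrative only.

```python
def matvec(m, x):
    """Column-vector convention: matrix m times column x."""
    return [sum(m[i][j] * x[j] for j in range(len(x)))
            for i in range(len(m))]


def vecmat(x, m):
    """Row-vector convention: row x times matrix m."""
    return [sum(x[i] * m[i][j] for i in range(len(x)))
            for j in range(len(m[0]))]


def transpose(m):
    """Mirror the matrix along its diagonal."""
    return [list(col) for col in zip(*m)]


M = [[1, 1],
     [0, 2]]   # written for column-vectors: columns are u and v
x = [1, 2]

print(matvec(M, x))             # [3, 4]
print(vecmat(x, transpose(M)))  # [3, 4]: same result once M is transposed
print(vecmat(x, M))             # [1, 5]: wrong convention, wrong answer
```

The last line is what happens when a column-vector matrix is copied verbatim into row-vector code.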

3.5 Inverse transformations

If you can transform from coordinates in S to ones in E, it stands to reason you can do the reverse. The same principles apply as before: you need to know what the base vectors of the system you're transforming from are in terms of the system you're transforming to. In this case, that means knowing how e_1 and e_2 are constructed from u and v.

The inverse of a transformation is just another matrix. By definition, multiplying a matrix by its inverse gives back identity. The inverse matrix works similarly to a division, which is visible in the notation: the inverse of matrix A is written as A⁻¹. So we have A·A⁻¹ = A⁻¹·A = I.

Finding the matrix for the inverse transformation tends to be more difficult than finding the forward one. Because the system you're transforming to in this case tends to have non-Cartesian vectors, reading off the base vectors from a graph can be tricky. It can also be done algebraically, which can also be a real PITA, especially when the number of dimensions increases. Because I want to keep the amount of hardcore math on this page to a minimum, I'll just mention Cramer's Rule and an example of an iterative approach called the Gauss-Seidel method and leave it at that.

Well, maybe just a little more than that. The major geometric transformations have the very nice property that their inverses are just the normal version with different terms. For example, the inverse of a rotation by θ is a rotation by −θ. Finding their inverse matrix is a cinch when you take that into account.
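For rotations this is simple to verify numerically: multiplying the matrix for θ by the one for −θ should give identity, up to rounding. A minimal sketch, assuming 2×2 matrices stored as lists of rows; the helper names and the angle 0.3 are arbitrary.

```python
import math


def rotation_matrix(theta):
    """2D rotation matrix for column-vectors."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


theta = 0.3
product = matmul(rotation_matrix(theta), rotation_matrix(-theta))
# product is the identity matrix, apart from floating-point rounding
```

The same trick works for the other basic transformations: scaling by 1/s undoes scaling by s (for non-zero s), and translating by −t undoes translating by t.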

I'll also derive the matrix inverse for a 2D case via row reduction. To do this, you start with a matrix containing both the matrix you want to invert, A, and the identity: [A I], and work your way to [I A⁻¹]. I'll call the elements of A a, b, c and d to save typing. I'll also make use of the quantity D = ad − bc.

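$$
\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right]
\rightarrow
\left[\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & D & -c & a \end{array}\right]
\rightarrow
\left[\begin{array}{cc|cc} aD & 0 & ad & -ab \\ 0 & D & -c & a \end{array}\right]
\rightarrow
\left[\begin{array}{cc|cc} 1 & 0 & d/D & -b/D \\ 0 & 1 & -c/D & a/D \end{array}\right]
\quad\Rightarrow\quad
\mathbf{A}^{-1} = \frac{1}{D}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\tag{13}
$$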

Yeah, I know what the process looks like. Now remember that 2D is the simplest case; writing out the thing even for 3D is just horrible, which is why I'm not mentioning it here. The factor D I've used here is called the determinant. Inverse matrices always have 1/D in front of them. This gives a nice way of seeing if a matrix is invertible: if D is zero, it isn't.
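The 2D result is short enough to write out as code. This sketch assumes a 2×2 matrix stored as a list of rows, [[a, b], [c, d]], and uses the standard closed form with D = ad − bc; the function name and the choice to raise `ValueError` for D = 0 are illustrative.

```python
def inverse_2x2(m):
    """Invert [[a, b], [c, d]] using the determinant D = a*d - b*c."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return [[d / det, -b / det],
            [-c / det, a / det]]


# Matrix of the running example: columns u = (1, 0) and v = (1, 2)
M = [[1, 1],
     [0, 2]]

M_inv = inverse_2x2(M)
print(M_inv)  # [[1.0, -0.5], [-0.0, 0.5]]
```

The columns of `M_inv` are (1, 0) and (−½, ½), which is how e_1 and e_2 look when written in terms of u and v: e_1 = u and e_2 = (v − u)/2.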

The forward transformation (by which I mean morphing the Cartesian base vectors into non-Cartesian ones; see Fig 11) is easy to visualize. You re-orient the base vectors and use the same coordinates in both cases. This effectively means that the points move along with the base vectors.

The inverse has two interpretations; the first and obvious one being to undo the transformation. The second one is where you lay down the altered coordinate system over the un-altered image, and then change those vectors into E again (see Fig 12). This sort of thing is usually how texture mapping works, where you have to fill in scanlines on screen by sampling pixels from a texture.

4 Vectors and matrices in N-D space

N-D cases

Embedded spaces (note MxN * Nx1)

∞-D cases

5 Random interesting bits

Inner/outer products.

Matrix interpretations of inner/outer products.

Creating orthonormal bases.

Rotations around arbitrary axes.

6 Summary

A point marks a location in space. A vector is a line with a direction and magnitude, but not really a location. The difference between points is a vector. You can add vectors and scale them to create other vectors (Fig 3).

A linear combination is a summation of scaled vectors.

A coordinate system is a set of N base vectors and a reference point. A linear combination of the base vectors spans an N-dimensional space, and any point in that space can be reached via:

$$P = O + \sum_{i=1}^{N} x_i \mathbf{u}_i \tag{14}$$

This is the equation to memorize. Everything else is derived from it.

Coordinates are the scalar multipliers in Eq 14. The coordinates of point P represent how many steps along the base vectors you need to take to get there, starting at O. Different coordinate systems will result in different coordinates for the same point.

A coordinate transformation is either representing the same point in different systems (passive transformation), or representing different points that share coordinates in different systems (active transformation).

Numerically, a (column-)vector is a column of coordinates. A matrix is a row of column vectors. A matrix-vector multiplication is a notational device for Eq 14.

Coordinate transformations always involve two coordinate systems, say, S and S′. To go from coordinates x in S to x′ in S′, express the base vectors of S in terms of those of S′ and take a linear combination using x for the multipliers. Yes, it's Eq 14 again. Hence, the matrix for this transformation is formed by the base vectors of S.

Homogeneous coordinates have an additional coordinate marking the coordinate vector as a point (non-zero) or a directional vector (zero). These extended vectors allow translations to be written as matrices as well.

While the column-vector notation of matrices is standard, you'll also see row-vector based ones. Do not blindly trust matrices you see in documents. Find out what representation they follow before use.

Changes

(20100325) fixed for comment 2
