In Praise of the Gershgorin Disc Theorem

Posted by Tom Leinster

I’m revising the notes for the introductory linear algebra class that I teach, and wondering whether I can find a way to fit in the wonderful but curiously unpromoted Gershgorin disc theorem.

The Gershgorin disc theorem is an elementary result that allows you to make very fast deductions about the locations of eigenvalues. For instance, it lets you look at the matrix

$$\begin{pmatrix} 3 & i & 1 \\ -1 & 4 + 5i & 2 \\ 2 & 1 & -1 \end{pmatrix}$$

and see, with only the most trivial mental arithmetic, that the real parts of its eigenvalues must all lie between $-4$ and $7$ and the imaginary parts must lie between $-3$ and $8$.
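If you want to see these bounds confirmed numerically, here is a quick sketch using NumPy (the variable names are mine, not from the post):

```python
import numpy as np

# The example matrix from the post.
A = np.array([[3, 1j, 1],
              [-1, 4 + 5j, 2],
              [2, 1, -1]])

eigs = np.linalg.eigvals(A)

# Gershgorin's theorem guarantees these bounds on the eigenvalues
# (small tolerance added for floating-point arithmetic).
assert all(-4 - 1e-9 <= lam.real <= 7 + 1e-9 for lam in eigs)
assert all(-3 - 1e-9 <= lam.imag <= 8 + 1e-9 for lam in eigs)
```

Of course, the whole point of the theorem is that you get the bounds without computing the eigenvalues at all.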

I wasn’t taught this theorem as an undergraduate, and ever since I learned it a few years ago, I have wondered why not. I feel ever so slightly resentful about it. The theorem is so useful, and the proof is a pushover. Was it just me? Did you get taught the Gershgorin disc theorem as an undergraduate?

Here’s the statement:

Theorem (Gershgorin)   Let $A = (a_{ij})$ be a square complex matrix. Then every eigenvalue of $A$ lies in one of the Gershgorin discs
$$\{ z \in \mathbb{C} \colon |z - a_{ii}| \leq r_i \}, \quad \text{where } r_i = \sum_{j \neq i} |a_{ij}|.$$

For example, if
$$A = \begin{pmatrix} 3 & i & 1 \\ -1 & 4 + 5i & 2 \\ 2 & 1 & -1 \end{pmatrix}$$
(as above) then the three Gershgorin discs have:

centre $3$ and radius $|i| + |1| = 2$;

centre $4 + 5i$ and radius $|-1| + |2| = 3$;

centre $-1$ and radius $|2| + |1| = 3$.

Gershgorin’s theorem says that every eigenvalue lies in the union of these three discs. My statement about real and imaginary parts follows immediately.

Even the proof is pathetically simple. Let $\lambda$ be an eigenvalue of $A$. Choose a $\lambda$-eigenvector $x$, and choose $i$ so that $|x_i|$ is maximized. Taking the $i$th coordinate of the equation $A x = \lambda x$ gives
$$(\lambda - a_{ii}) x_i = \sum_{j \neq i} a_{ij} x_j.$$
Now take the modulus of each side:
$$|\lambda - a_{ii}| \, |x_i| = \Bigl| \sum_{j \neq i} a_{ij} x_j \Bigr| \leq \sum_{j \neq i} |a_{ij}| \, |x_j| \leq \Bigl( \sum_{j \neq i} |a_{ij}| \Bigr) |x_i| = r_i |x_i|,$$
where to get the inequalities, we used the triangle inequality and then the maximal property of $|x_i|$. Since $x \neq 0$ and $|x_i|$ is maximal, $|x_i| > 0$, so cancelling $|x_i|$ gives $|\lambda - a_{ii}| \leq r_i$. And that’s it!

The theorem is often stated with a supplementary part that gives further information about the location of the eigenvalues: if the union of $k$ of the discs forms a connected component of the union of all of them, then exactly $k$ eigenvalues lie within it. In the example shown, this tells us that there’s exactly one eigenvalue in the blue disc at the top right and exactly two eigenvalues in the union of the red and green discs. (But the theorem says nothing about where those two eigenvalues are within that union.) That’s harder to prove, so I can understand why it wouldn’t be taught in a first course. But the main part is entirely elementary in both its statement and its proof, as well as being immediately useful.

As far as that main part is concerned, I’m curious to know: when did you first meet Gershgorin’s disc theorem?
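Both parts of the theorem can be checked on the example matrix. The sketch below (my own variable names) computes the three discs, verifies that every eigenvalue lands in at least one of them, and then checks the counting claim: the disc around $4 + 5i$ is disjoint from the other two, since the distance between its centre and each of the others ($\sqrt{26}$ and $\sqrt{50}$) exceeds the corresponding sum of radii ($5$ and $6$), so it must contain exactly one eigenvalue.

```python
import numpy as np

A = np.array([[3, 1j, 1],
              [-1, 4 + 5j, 2],
              [2, 1, -1]])

# Disc i has centre a_ii and radius r_i = sum of |a_ij| over j != i.
centres = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centres)

eigs = np.linalg.eigvals(A)

# Main theorem: each eigenvalue lies in at least one Gershgorin disc.
for lam in eigs:
    assert any(abs(lam - c) <= r + 1e-9 for c, r in zip(centres, radii))

# Supplementary part: the disc centred at 4 + 5i (radius 3) is a connected
# component on its own, so it contains exactly one eigenvalue.
assert sum(abs(lam - (4 + 5j)) <= 3 + 1e-9 for lam in eigs) == 1
```

This is only a sanity check on one matrix, of course, not a proof of anything.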

Posted at August 9, 2016 5:21 PM UTC