MAT225 Section Summary: 7.4

The Singular Value Decomposition (SVD):

Fundamental Theorem of Linear Algebra

Summary

That's right: The Fundamental Theorem of Linear Algebra. The SVD ties it all together. Rather than focus on the technicalities, I want to focus on the ``bang''. If you can understand the Singular Value Decomposition, then you understand this course. If you are weak in any part, then you will not really understand this theorem. Understanding this section is the best preparation for the final exam.

The Singular Value Decomposition: Let $A$ be an $m \times n$ matrix with rank $r$. There exist

  1. an $m \times n$ matrix

    \[ \Sigma = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} \]

    for which $D$ is diagonal, with positive entries (the singular values) $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$, and

  2. orthogonal matrices $U$ ($m \times m$) and $V$ ($n \times n$)
such that

\[ A = U \Sigma V^T \]

The 0 matrices are included simply to pad $D$ (if necessary) to make the dimensions right. Here is $\Sigma$ with the dimensions indicated explicitly:

\[ \Sigma = \begin{bmatrix} D_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix} \]

For example, if $A$ happens to be invertible, then there are no zero matrices, and $\Sigma = D$.
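
To see the pieces concretely, here is a minimal numerical sketch (my own illustration, not part of the text) using NumPy. Note that NumPy's svd returns $V^T$ directly, and returns the singular values as a vector that must be padded into the $m \times n$ matrix $\Sigma$:

    import numpy as np

    A = np.array([[4.0, 11.0, 14.0],
                  [8.0,  7.0, -2.0]])      # an arbitrary 2 x 3 example
    m, n = A.shape

    U, s, Vt = np.linalg.svd(A)            # s holds sigma_1 >= sigma_2 >= ...

    Sigma = np.zeros((m, n))               # pad D with zero blocks, as in the theorem
    Sigma[:len(s), :len(s)] = np.diag(s)

    print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T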

Now your first impulse might be to say ``so what?'' (But don't say it in my hearing!) Understanding this is the true key to understanding $A$ either

  1. as an image made up of rank-one subimages, or
  2. as a linear transformation $\mathbf{x} \mapsto A\mathbf{x}$ taking a ball to an ellipsoid
(my two favorite applications of matrices).

You might also want to know what this has to do with all the symmetric matrices we've been looking at: the connection is crucial. One way that the SVD arises is by considering a constrained optimization problem:

What is the maximum value of $\|A\mathbf{x}\|$ given that $\|\mathbf{x}\| = 1$?

Solving this is equivalent to solving the problem

What is the maximum value of $\|A\mathbf{x}\|^2$ given that $\|\mathbf{x}\| = 1$?

But since $\|A\mathbf{x}\|^2 = (A\mathbf{x})^T(A\mathbf{x}) = \mathbf{x}^T A^T A \mathbf{x}$, this is equivalent to solving

What is the maximum value of $\mathbf{x}^T A^T A \mathbf{x}$ given that $\|\mathbf{x}\| = 1$?

So it's equivalent to solving a problem about constrained optimization of quadratic forms....
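
(The standard answer, for the record: the maximum is $\sigma_1^2$, the largest eigenvalue of $A^T A$, attained at its top unit eigenvector $\mathbf{v}_1$.) Here is a quick numerical sanity check of that claim, a sketch of my own on a random matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))

    U, s, Vt = np.linalg.svd(A)
    v1 = Vt[0]                              # top right singular vector v_1

    # ||A v_1|| equals sigma_1 ...
    print(np.isclose(np.linalg.norm(A @ v1), s[0]))            # True

    # ... and no random unit vector does better
    X = rng.standard_normal((3, 1000))
    X /= np.linalg.norm(X, axis=0)          # normalize each column
    print(np.linalg.norm(A @ X, axis=0).max() <= s[0] + 1e-12) # True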

Now, to make my life easy, I'm going to think of $m \ge n$ ($A$ is rectangular, and its ``height'' is greater than or equal to its ``width'').

The matrix $A^T A$ is positive semi-definite: that is, it has non-negative ($\lambda \ge 0$) eigenvalues, which we can calculate from the orthogonal diagonalization

\[ A^T A = V \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} V^T, \qquad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_r), \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0 \]

Now, here's a curious fact, which I'm going to gloss over entirely (in order to get to the ``bang!''): $A A^T$ (which is also positive semi-definite) can be orthogonally diagonalized as

\[ A A^T = U \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} U^T \]

(Same $\Lambda$! - same nonzero eigenvalues). Again, the zero matrices are only included to (possibly) make the dimensions work out. If $r = m$, then they disappear. Now for the kicker:

  1. the singular values of $A$, $\sigma_1$, $\sigma_2$, ..., $\sigma_r$, are the positive square roots of the nonzero eigenvalues in $\Lambda$! I.e., $\sigma_i = \sqrt{\lambda_i}$.
  2. The singular vectors are the eigenvectors of $A^T A$ and $A A^T$: the columns of $V$ (the right singular vectors) are eigenvectors of $A^T A$, and the columns of $U$ (the left singular vectors) are eigenvectors of $A A^T$.

It's easy to check that the SVD formula recreates $A^T A$ and $A A^T$. Give it a try!
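
If you'd like the computer to try it for you, here is a small check (my own sketch, on an arbitrary $3 \times 2$ matrix) that the eigenvalues and eigenvectors line up as claimed:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [0.0, 3.0]])              # an arbitrary 3 x 2 matrix

    U, s, Vt = np.linalg.svd(A)
    lam = np.linalg.eigvalsh(A.T @ A)[::-1] # eigenvalues of A^T A, largest first

    print(np.allclose(s, np.sqrt(lam)))     # True: sigma_i = sqrt(lambda_i)
    print(np.allclose(A.T @ A, Vt.T @ np.diag(lam) @ Vt))   # A^T A = V Lambda V^T

    lam_pad = np.zeros(3)
    lam_pad[:2] = lam                       # pad with a zero for the 3 x 3 case
    print(np.allclose(A @ A.T, U @ np.diag(lam_pad) @ U.T)) # A A^T = U Lambda' U^T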

Now the best way to think about $A$ from the standpoint of the image problem is to throw out the irrelevant stuff, and write

\[ A = U_r D V_r^T \]

where we've thrown out the eigenvectors of $A^T A$ and the eigenvectors of $A A^T$ corresponding to the zero weights of $\Sigma$ (so $U_r$ is $m \times r$, $D$ is $r \times r$, and $V_r$ is $n \times r$). This might be called the reduced SVD of $A$. Hence we can write a decomposition of $A$ analogous to the spectral decomposition of a symmetric matrix, as

\[ A = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^T + \cdots + \sigma_r \mathbf{u}_r \mathbf{v}_r^T \]

This expresses the matrix $A$ as a sum of rank-one outer-product matrices, weighted from most important to least (by the size of $\sigma_i$). An example of this can be found at http://www.nku.edu/~longa/classes/mat225/days/dad/Abe.jpg
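
Here is a sketch of the same idea (mine, with a small random matrix standing in for the Abe image): the partial sums of rank-one pieces approximate $A$ better and better, and reproduce it exactly once all $r$ pieces are included.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((8, 6))
    U, s, Vt = np.linalg.svd(A)

    A_k = np.zeros_like(A)
    for k in range(len(s)):
        A_k += s[k] * np.outer(U[:, k], Vt[k])  # add the k-th rank-one piece
        print(k + 1, np.linalg.norm(A - A_k))   # error shrinks to 0 at k = rank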

You know that the rank of $A$ and the rank of $A^T$ are the same, and that the row space of one equals the column space of the other. The SVD makes that clear.

From the standpoint of the transformation $\mathbf{x} \mapsto A\mathbf{x}$, taking a ball in $\mathbb{R}^n$ to an ellipsoid in $\mathbb{R}^m$, the better way to think of $A$ is as

\[ A = U \Sigma V^T \]

where by $A\mathbf{x} = U \Sigma V^T \mathbf{x}$ we understand the succession of transformations as follows (a numerical sketch of the three steps appears after the list):

  1. \[ V^T \mathbf{x} = \begin{bmatrix} \mathbf{v}_1^T \mathbf{x} \\ \vdots \\ \mathbf{v}_n^T \mathbf{x} \end{bmatrix} \]

    (the expression of $\mathbf{x}$ in the basis of $V$, as projections onto the basis vectors). This is effectively a rotation/reflection of the unit (radius) ball in $\mathbb{R}^n$ into position for easy scaling.

  2. \[ \Sigma (V^T \mathbf{x}) \]

    represents the scaling of the vector along each of the principal axes (the conversion of a ball into an ellipsoid), including possible squashing of some of the dimensions corresponding to the null space of $A$. The resulting vectors are in $\mathbb{R}^m$.

  3. Then

    \[ U (\Sigma V^T \mathbf{x}) \]

    represents the rotation/reflection of the resulting ellipsoid in $\mathbb{R}^m$ so that the result is oriented as it should be, since the ellipsoid is not necessarily aligned with the standard basis. This is the image of the vector $\mathbf{x}$ in the column space of $A$, expressed as a linear combination of the column-space basis given by $U$.
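
Here is the promised numerical sketch (my own) of the three steps, verifying that rotate-scale-rotate really does reproduce $A\mathbf{x}$:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 3))         # a map from R^3 to R^4
    x = rng.standard_normal(3)

    U, s, Vt = np.linalg.svd(A)
    Sigma = np.zeros((4, 3))
    Sigma[:3, :3] = np.diag(s)

    step1 = Vt @ x         # 1. coordinates of x in the basis V (rotate/reflect)
    step2 = Sigma @ step1  # 2. scale along the principal axes; now in R^4
    step3 = U @ step2      # 3. rotate/reflect into final position

    print(np.allclose(step3, A @ x))        # True: U(Sigma(V^T x)) = Ax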

Here's a nice picture due to Cliff Long and Tom Hern that captures these three steps, at

http://www.nku.edu/~longa/classes/mat225/days/dad/SVD.jpg

The rank of a matrix is equal to the number of non-zero singular values. The condition number (or at least the most common definition of it) is given as the ratio of the largest to the smallest singular value (infinity if the rank $r < n$, since it's as if the matrix has a singular value of 0).
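
A sketch (mine) of reading both quantities off the singular values; the tolerance used to decide ``non-zero'' is a common choice, not the only one:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],          # a dependent row, so rank is 2
                  [1.0, 0.0, 1.0]])

    s = np.linalg.svd(A, compute_uv=False)
    tol = max(A.shape) * np.finfo(float).eps * s[0]   # a common tolerance choice

    rank = int((s > tol).sum())
    cond = s[0] / s[-1] if rank == min(A.shape) else np.inf

    print(rank, cond)                       # 2 inf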

It is all really too marvellous for words. We'll need to look at some pictures, and a few example problems.

Example: #15, p. 481

Note: in order to calculate the SVD of a matrix, you need only find $D$ and either $U$ or $V$ - you don't need both. That's because if $A = U \Sigma V^T$, then $AV = U \Sigma$; so if we know $D$ and $V$, each needed column of $U$ is $\mathbf{u}_i = A \mathbf{v}_i / \sigma_i$.
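
A quick numerical illustration of that shortcut (my own sketch; the example matrix is arbitrary):

    import numpy as np

    A = np.array([[3.0,  0.0],
                  [0.0, -2.0],
                  [0.0,  0.0]])

    U_full, s, Vt = np.linalg.svd(A)
    V = Vt.T

    # rebuild the first r columns of U without asking numpy for them:
    U_r = (A @ V[:, :len(s)]) / s           # column i divided by sigma_i

    print(np.allclose(U_r, U_full[:, :len(s)]))   # True (all sigma_i > 0 here)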

This leads to the idea of a pseudo-inverse of a matrix:

\[ A^+ = V_r D^{-1} U_r^T \]

This is the closest thing to an inverse that a general matrix A has!
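
Here is a sketch (mine) of building $A^+$ from the reduced SVD and comparing it with NumPy's built-in pinv; the last line checks that $A^+ A = I$ when $A$ has full column rank:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])              # full column rank, not invertible

    U, s, Vt = np.linalg.svd(A)
    r = int((s > 1e-12).sum())              # numerical rank

    A_plus = Vt[:r].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T   # V_r D^{-1} U_r^T

    print(np.allclose(A_plus, np.linalg.pinv(A)))  # matches numpy's pseudo-inverse
    print(np.allclose(A_plus @ A, np.eye(2)))      # A+ A = I (full column rank)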

There's a nice picture due to Cliff Long and Tom Hern at

http://www.nku.edu/~longa/classes/mat225/days/dad/pseudo.jpg

Example: #9, p. 481


LONG ANDREW E
Sat Jan 29 21:08:51 EST 2011