Further matrices

Introduction

(I've written this topic specifically for students taking MEI FP2.)

I'm going to introduce this topic by running through some important definitions.

A lower triangular matrix is a square matrix that has zeroes in all of the elements above and to the right of the diagonal $$ \begin{array}{cc} \left(\begin{array}{cc} 1 & 0 \\ 1 & 3 \end{array}\right) & \left(\begin{array}{ccc} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 2 & 1 \end{array}\right) \end{array} $$

An upper triangular matrix is a square matrix that has zeroes in all of the elements below and to the left of the diagonal $$ \begin{array}{cc} \left(\begin{array}{cc} 1 & 2 \\ 0 & 1 \end{array}\right) & \left(\begin{array}{ccc} 4 & 2 & 5 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{array}\right) \end{array} $$

A diagonal matrix is a square matrix that has zeroes in all of the elements except on the diagonal $$ \begin{array}{cc} \left(\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right) & \left(\begin{array}{ccc} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{array}\right) \end{array} $$

Now take the following square matrix $\mathbf{A}$ of constant values $a_i$ $$ \mathbf{A} = \left(\begin{array}{cc} a_1 & a_2 \\ a_3 & a_4 \end{array}\right) $$ The transpose of this matrix is $$ \mathbf{A}^{\textrm{T}} = \left(\begin{array}{cc} a_1 & a_3 \\ a_2 & a_4 \end{array}\right) $$ Transposing a matrix swaps its rows and columns. The transpose of a lower triangular matrix is always an upper triangular matrix and vice versa, and a diagonal matrix is always equal to its transpose.

Strictly speaking, if $a_{ij}$ is the element of a matrix $\mathbf{A}$ in row $i$ and column $j$, then the element of $\mathbf{A}^{\textrm{T}}$ in row $i$ and column $j$ is $a_{ji}$ $$ \left(\mathbf{A}^{\textrm{T}}\right)_{ij} = a_{ji} $$ This holds for every $i$ and $j$; the diagonal elements, where $i = j$, simply stay where they are.

The matrix does not have to be square at all. You can find the transpose of any matrix you want. The transpose is just a matrix with the rows and columns swapped. If $\mathbf{A}$ is an $m\times n$ matrix then $\mathbf{A}^{\textrm{T}}$ is an $n\times m$ matrix.

Now check out this 2x3 matrix $$ \mathbf{B} = \left(\begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right) $$ its transpose is $$ \mathbf{B}^{\textrm{T}} = \left(\begin{array}{cc} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{array}\right) $$

A matrix that is equal to its transpose, i.e. $\mathbf{A}^{\textrm{T}} = \mathbf{A}$, is symmetric. A matrix whose transpose is equal to $-1$ times itself, i.e. $\mathbf{A}^{\textrm{T}} = -\mathbf{A}$, is skew-symmetric. Both kinds of matrix must be square.
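If you want to experiment with these definitions, here is a minimal Python sketch. The function names `transpose`, `is_symmetric` and `is_skew_symmetric` are my own choices for illustration, and a matrix is assumed to be stored as a list of rows.

```python
def transpose(matrix):
    """Entry (i, j) of the result is entry (j, i) of the input."""
    return [list(column) for column in zip(*matrix)]


def is_symmetric(matrix):
    return transpose(matrix) == matrix


def is_skew_symmetric(matrix):
    negated = [[-entry for entry in row] for row in matrix]
    return transpose(matrix) == negated


B = [[1, 2, 3],
     [4, 5, 6]]
print(transpose(B))                          # [[1, 4], [2, 5], [3, 6]]
print(is_symmetric([[1, 7], [7, 3]]))        # True
print(is_skew_symmetric([[0, 2], [-2, 0]]))  # True
```

Notice that the diagonal of a skew-symmetric matrix has to be all zeroes, since each diagonal element must equal its own negative.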

Determinant of a 3x3 matrix

In FP2 you need to know how to find the determinants of $3\times 3$ matrices. It's a bit harder than the process for finding the determinant of a $2\times 2$ from FP1.

The determinant $|\mathbf{A}|$ of a $3\times 3$ matrix $\mathbf{A}$ is $$ |\mathbf{A}| = \left|\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right| = a\left|\begin{array}{cc} e & f \\ h & i \end{array}\right| - b\left|\begin{array}{cc} d & f \\ g & i \end{array}\right| + c\left|\begin{array}{cc} d & e \\ g & h \end{array}\right| $$

What we have done is go along one row of the matrix, and sum up each element multiplied by the determinant of the $2\times 2$ matrix formed by the elements not in the same row or column of that element. In this module I will denote the determinant $|\ldots|$ unlike in FP1 where I denoted it $\Delta$.

We call each element's corresponding $2\times 2$ determinant that element's minor.

However, notice that the $b$ term is made to be $-b$. That's because each element in the row or column you go along is actually multiplied by the $(i,j)$ cofactor, itself obtained by multiplying the minor by $(-1)^{i+j}$, where $i$ is the row and $j$ is the column.

You can actually calculate the determinant by going along any row or column you want; the determinant you get is exactly the same. This can be useful if a row or column is mostly zeroes. But you need to remember the following sign convention for the cofactors when you go along a different row or column $$ \left(\begin{array}{cccc} + & - & + & \ldots \\ - & + & - & \ldots \\ + & - & + & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array}\right) $$ Many just go along the top row for simplicity, but this can be very handy to know if you want to make life simpler.

Example

Q) Find the determinant of $\left(\begin{array}{ccc} 4 & 2 & 1 \\ 0 & 0 & 1 \\ 2 & -3 & 1 \end{array}\right)$.

A) I'll walk you through finding the determinant using several rows. First let's go over the top row $$ \left|\begin{array}{ccc} 4 & 2 & 1 \\ 0 & 0 & 1 \\ 2 & -3 & 1 \end{array}\right| = 4\left|\begin{array}{cc} 0 & 1 \\ -3 & 1\end{array}\right| -2\left|\begin{array}{cc}0 & 1 \\ 2 & 1\end{array} \right| + \left|\begin{array}{cc}0 & 0 \\ 2 & -3\end{array} \right| $$

Recall that the determinant of a $2\times 2$ matrix $\left(\begin{array}{cc} a & b \\ c & d \end{array} \right)$ is $ad-bc$, therefore $$ 4\left|\begin{array}{cc} 0 & 1 \\ -3 & 1\end{array}\right| -2\left|\begin{array}{cc}0 & 1 \\ 2 & 1\end{array} \right| + \left|\begin{array}{cc}0 & 0 \\ 2 & -3\end{array} \right| = 4\left(0-(-3)\right)-2\left(0-2\right)+\left(0-0\right) = 16 $$

So the determinant is 16. Now I'm going to go over the second row because it has mostly zeroes. This means you can cancel out the first two terms in the calculation $$ -0\left|\begin{array}{cc}2 & 1 \\ -3 & 1\end{array} \right| + 0\left|\begin{array}{cc}4 & 1 \\ 2 & 1\end{array} \right| -1\left|\begin{array}{cc}4 & 2 \\ 2 & -3\end{array} \right| = -1\left(-12-4\right) = 16 $$

See how much easier the second time was? The zeroes cancel out their minors, greatly reducing the amount of work needed. Just make sure you remember to apply the right pluses and minuses.
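The cofactor expansion can be sketched in a few lines of Python. This is only an illustration under my own naming (`det2`, `minor`, `det3`), with rows and columns counted from 0, which leaves the sign $(-1)^{i+j}$ unchanged.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def minor(m, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [r for k, r in enumerate(m) if k != i]
    sub = [[v for c, v in enumerate(r) if c != j] for r in rows]
    return det2(sub)


def det3(m, row=0):
    """Expand along the chosen row, applying the sign (-1)**(i + j)."""
    return sum((-1) ** (row + j) * m[row][j] * minor(m, row, j) for j in range(3))


A = [[4, 2, 1],
     [0, 0, 1],
     [2, -3, 1]]
print([det3(A, row) for row in range(3)])  # [16, 16, 16]
```

Whichever row you expand along, the answer is 16, which matches the two hand calculations above.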

Properties of the determinant

You should know some basic properties of the determinant of an $n\times n$ square matrix.

$|\mathbf{I}_n| = 1$ where $\mathbf{I}_n$ is the $n\times n$ identity matrix
$|\mathbf{A}^\textrm{T}| = |\mathbf{A}|$
$|\mathbf{A}^{-1}| = \frac{1}{|\mathbf{A}|}$
$|\mathbf{AB}| = |\mathbf{A}||\mathbf{B}|$ for square matrices of equal size
$|k\mathbf{A}| = k^n |\mathbf{A}|$ where $k$ is a scalar and $\mathbf{A}$ is $n\times n$
The determinant of a diagonal matrix or either kind of triangular matrix is the product of its diagonal elements
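A quick numerical spot check of these properties in the $3\times 3$ case might look like the sketch below. The matrix `B`, the scalar `k` and the helper names are arbitrary choices of mine, and a check on one example is of course not a proof.

```python
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def transpose(m):
    return [list(col) for col in zip(*m)]


A = [[4, 2, 1], [0, 0, 1], [2, -3, 1]]
B = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]   # arbitrary second matrix for the check
k = 2                                   # arbitrary scalar
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
upper = [[4, 2, 5], [0, 3, 1], [0, 0, 2]]

checks = {
    "|I| = 1": det3(identity) == 1,
    "|A^T| = |A|": det3(transpose(A)) == det3(A),
    "|AB| = |A||B|": det3(matmul(A, B)) == det3(A) * det3(B),
    "|kA| = k^3 |A|": det3([[k * v for v in row] for row in A]) == k ** 3 * det3(A),
    "triangular: product of diagonal": det3(upper) == 4 * 3 * 2,
}
for name, ok in checks.items():
    print(name, ok)
```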

Inverse of a 3x3 matrix

In FP2 you need to know how to invert a $3\times 3$ matrix. Like finding determinants it's a bit harder than inverting a $2\times 2$ and requires a bit of extra work.

The inverse of a $3\times 3$ matrix is its adjugate matrix divided by its determinant. A matrix's adjugate is a transposed matrix of the cofactors of each element. That means you replace each element in the matrix by that element's cofactor, found by multiplying its minor by $(-1)^{i+j}$, where $i$ is the row and $j$ is the column.

Remember this formula $$ \mathbf{A}^{-1} = \frac{1}{\left|\mathbf{A}\right|} \mathbf{C}^{\textrm{T}} $$ where $\left|\mathbf{A}\right|$ is the determinant and $\mathbf{C}$ is the matrix of cofactors, so that $\mathbf{C}^{\textrm{T}}$ is the adjugate matrix. Once again remember the following convention for determining if a cofactor is positive or negative $$ \left(\begin{array}{cccc} + & - & + & \ldots \\ - & + & - & \ldots \\ + & - & + & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array}\right) $$

As you learned in FP1, if a matrix has a determinant equal to zero, it is singular and doesn't have an inverse.

Example

Q) Find the inverse of $\left(\begin{array}{ccc} 4 & 2 & 1 \\ 0 & 0 & 1 \\ 2 & -3 & 1 \end{array}\right)$.

A) Let $\mathbf{A} = \left(\begin{array}{ccc} 4 & 2 & 1 \\ 0 & 0 & 1 \\ 2 & -3 & 1 \end{array}\right)$. In the last question we found $|\mathbf{A}|=16$ so we need to find the matrix of cofactors and then transpose it. $$ \mathbf{C} = \left(\begin{array}{ccc} \left(0-(-3)\right) & -\left(0-2\right) & \left(0-0\right) \\ -\left(2-(-3)\right) & \left(4-2\right) & -\left(-12-4\right) \\ \left(2-0\right) & -\left(4-0\right) & \left(0-0\right) \end{array}\right) = \left(\begin{array}{ccc} 3 & 2 & 0 \\ -5 & 2 & 16 \\ 2 & -4 & 0 \end{array}\right) $$

Then $$ \mathbf{C}^{\textrm{T}} = \left(\begin{array}{ccc} 3 & -5 & 2 \\ 2 & 2 & -4 \\ 0 & 16 & 0 \end{array}\right) $$ The inverse is therefore $$ \mathbf{A}^{-1}=\frac{1}{16} \left(\begin{array}{ccc} 3 & -5 & 2 \\ 2 & 2 & -4 \\ 0 & 16 & 0 \end{array}\right) = \left(\begin{array}{ccc} \frac{3}{16} & -\frac{5}{16} & \frac{1}{8} \\ \frac{1}{8} & \frac{1}{8} & -\frac{1}{4} \\ 0 & 1 & 0 \end{array}\right) $$
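The adjugate method can be sketched in Python as follows. The helper names are my own, and I've used `fractions.Fraction` from the standard library so that the entries stay as exact fractions rather than decimals.

```python
from fractions import Fraction


def minor(m, i, j):
    sub = [[v for c, v in enumerate(r) if c != j]
           for k, r in enumerate(m) if k != i]
    return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]


def cofactor_matrix(m):
    return [[(-1) ** (i + j) * minor(m, i, j) for j in range(3)] for i in range(3)]


def inverse3(m):
    cof = cofactor_matrix(m)
    det = sum(m[0][j] * cof[0][j] for j in range(3))  # expansion along the top row
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    # Adjugate = transpose of the cofactor matrix; divide each entry by the determinant.
    return [[Fraction(cof[j][i], det) for j in range(3)] for i in range(3)]


A = [[4, 2, 1],
     [0, 0, 1],
     [2, -3, 1]]
for row in inverse3(A):
    print([str(entry) for entry in row])
# ['3/16', '-5/16', '1/8']
# ['1/8', '1/8', '-1/4']
# ['0', '1', '0']
```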

Simultaneous equations

As you learned in FP1, matrices can be used to solve simultaneous linear equations. With $3\times 3$ matrices we can solve systems with three unknowns in three simultaneous linear equations.

I keep repeating the word 'linear' because it's important - it means each unknown appears only to the first power, as in $ax+b$, whereas $ax^2 +b$ is non-linear. This branch of maths is called 'linear algebra' for that reason.

Take the system of three simultaneous linear equations with coefficients $a_i$, unknowns $x,y,z$ and constants on the right-hand side $b_i$ $$ \begin{align} a_1 x + a_2 y + a_3 z &= b_1 \\ a_4 x + a_5 y + a_6 z &= b_2 \\ a_7 x + a_8 y + a_9 z &= b_3 \end{align} $$ The system can be expressed by $\mathbf{A}$ as the matrix of $a_i$, $\mathbf{x}$ for the unknowns $x,y,z$ and $\mathbf{b}$ for the constants $b_i$ $$ \left(\begin{array}{ccc} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{array}\right)\left(\begin{array}{c} x \\ y \\ z \end{array}\right) = \left(\begin{array}{c} b_1 \\ b_2 \\ b_3 \end{array}\right) $$ Or in matrix algebra terms $$ \mathbf{Ax} = \mathbf{b} $$

The system can be solved for the three variables $x,y,z$ by left-multiplying both sides by $\mathbf{A}^{-1}$

$$ \begin{align} \mathbf{A}^{-1}\mathbf{Ax} &= \mathbf{A}^{-1}\mathbf{b} \\ \therefore ~ \mathbf{x} &= \mathbf{A}^{-1}\mathbf{b} \end{align} $$

This depends on the matrix $\mathbf{A}$ being non-singular. If $\mathbf{A}$ is singular then the system does not have a unique solution in $x,y,z$: depending on $\mathbf{b}$, it has either no solution at all or infinitely many.

Example

Q) Solve the following system of equations for $x,y,z$ $$ \begin{align} -5x + 4 y -2z &= 2 \\ 8 x -2 y + z &= 2 \\ -2 x + \frac{1}{2}y + \frac{5}{2} z &= 4 \end{align} $$

A) Let $\mathbf{A}$ be a $3\times 3$ matrix of the $a_i$ coefficients, $\mathbf{x}$ be a $3\times 1$ matrix of the unknowns $x,y,z$, and $\mathbf{b}$ be a $3\times 1$ matrix of the constants on the right hand side. Then $$ \mathbf{Ax} = \mathbf{b} $$ That is, $$ \left(\begin{array}{ccc} -5 & 4 & -2 \\ 8 & -2 & 1 \\ -2 & \frac{1}{2} & \frac{5}{2} \end{array}\right)\left(\begin{array}{c} x \\ y \\ z \end{array}\right) = \left(\begin{array}{c} 2 \\ 2 \\ 4 \end{array}\right) $$

We will invert $\mathbf{A}$ and left-multiply both sides by the inverse to find $\mathbf{x}$. Firstly find the determinant by going along the top row $$ |\mathbf{A}| = -5 \left|\begin{array}{cc} -2 & 1 \\ \frac{1}{2} & \frac{5}{2} \end{array}\right| -4\left|\begin{array}{cc} 8 & 1 \\ -2 & \frac{5}{2} \end{array}\right| -2\left|\begin{array}{cc} 8 & -2 \\ -2 & \frac{1}{2} \end{array}\right| $$ Then evaluating each term $$ -5\left(-5-\frac{1}{2}\right) -4\left(20-(-2)\right)-2\left(4-4\right) = -\frac{121}{2} = -60.5 $$

So the determinant is non-zero, which means $\mathbf{A}$ is non-singular and a unique solution exists for $x,y,z$. Now to calculate the inverse of $\mathbf{A}$. First find the matrix of cofactors $\mathbf{C}$ $$ \mathbf{C} = \left(\begin{array}{ccc} \left(-5-\frac{1}{2}\right) & -\left(20-(-2)\right) & \left(4-4\right) \\ -\left(10-(-1)\right) & \left(-12.5-4\right) & -\left(-2.5-(-8)\right) \\ \left(4-4\right) & -\left(-5-(-16)\right) & \left(10-32\right) \end{array}\right) = \left(\begin{array}{ccc} -5.5 & -22 & 0 \\ -11 & -16.5 & -5.5 \\ 0 & -11 & -22 \end{array}\right) $$ Then transpose it and divide by the determinant to find the inverse $$ \mathbf{A}^{-1} = \frac{1}{|\mathbf{A}|}\mathbf{C}^{\textrm{T}} = -\frac{1}{60.5}\left(\begin{array}{ccc} -5.5 & -11 & 0 \\ -22 & -16.5 & -11 \\ 0 & -5.5 & -22 \end{array}\right) $$ Now to left multiply $\mathbf{b}$ by $\mathbf{A}^{-1}$ to find $\mathbf{x}$ $$ \mathbf{x} = -\frac{1}{60.5}\left(\begin{array}{ccc} -5.5 & -11 & 0 \\ -22 & -16.5 & -11 \\ 0 & -5.5 & -22 \end{array}\right)\left(\begin{array}{c} 2 \\ 2 \\ 4 \end{array}\right) = -\frac{1}{60.5} \left(\begin{array}{c} \left(-11-22\right) \\ \left(-44-33-44\right) \\ \left(-11-88\right) \end{array}\right) = \left(\begin{array}{c} 6/11 \\ 2 \\ 18/11 \end{array}\right) $$

Therefore the solution is $(x,y,z) = (\frac{6}{11},2,\frac{18}{11})$. Substituting back into the first equation as a check, $-5\left(\frac{6}{11}\right)+4(2)-2\left(\frac{18}{11}\right) = -\frac{66}{11}+8 = 2$ as required.
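It is always worth substituting a solution back into the original equations. The sketch below repeats the method $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ in Python with exact fractions and then checks that $\mathbf{Ax} = \mathbf{b}$. The helper names `minor`, `inverse3` and `matvec` are my own.

```python
from fractions import Fraction as F


def minor(m, i, j):
    sub = [[v for c, v in enumerate(r) if c != j]
           for k, r in enumerate(m) if k != i]
    return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]


def inverse3(m):
    cof = [[(-1) ** (i + j) * minor(m, i, j) for j in range(3)] for i in range(3)]
    det = sum(m[0][j] * cof[0][j] for j in range(3))
    if det == 0:
        raise ValueError("singular matrix: no unique solution")
    # Transposed cofactor matrix divided by the determinant.
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]


def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


A = [[F(-5), F(4), F(-2)],
     [F(8), F(-2), F(1)],
     [F(-2), F(1, 2), F(5, 2)]]
b = [F(2), F(2), F(4)]

x = matvec(inverse3(A), b)
print([str(value) for value in x])  # the solution as exact fractions
print(matvec(A, x) == b)            # True: the solution satisfies all three equations
```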

Eigenvalues and eigenvectors

Take the following matrix multiplication.

$$ \left(\begin{array}{cc} 1 & 3 \\ 3 & 1 \end{array}\right)\left(\begin{array}{c}1 \\ 1\end{array}\right) = \left(\begin{array}{c}4 \\ 4\end{array}\right) $$

What do you notice? The result of multiplying the $2\times 2$ matrix with the $2\times 1$ matrix is a scalar multiple of the $2\times 1$ matrix.

The scalar multiplier 4 is an eigenvalue of the $2\times 2$ matrix, and the original $2\times 1$ matrix $$\left(\begin{array}{c}1 \\ 1\end{array}\right)$$ is a corresponding eigenvector paired with that eigenvalue.

Every square matrix $\mathbf{A}$ with dimensions $n\times n$ has $n$ eigenvalues (counting repeated roots, and allowing complex values), each paired with at least one eigenvector. In matrix algebra terms we say that $$ \mathbf{Av} = \lambda\mathbf{v} $$ where $\mathbf{v}$ is an eigenvector and $\lambda$ is an eigenvalue. It means that there is a non-zero vector (an $n\times 1$ matrix) such that left-multiplying the vector by the matrix $\mathbf{A}$ has the same effect as multiplying the vector by some number.

In FP2 you will need to know how to find the eigenvalues and corresponding eigenvectors of a square matrix. The first step is to find the eigenvalues. Now, strictly speaking the eigenvalues of a matrix $\mathbf{A}$ are the roots of the matrix's characteristic polynomial, but we'll get to that in a minute. Starting from the matrix algebra definition I showed you above, subtract $\lambda\mathbf{v}$ from both sides $$ \mathbf{Av} - \lambda\mathbf{v} = \mathbf{0} $$ where $\mathbf{0}$ is the $n\times 1$ zero vector. Now if this were standard algebra you might be tempted to factorise out $\mathbf{v}$ and solve for $\lambda$. That is conceptually on the right track, but $\lambda$ is a scalar and $\mathbf{A}$ is a matrix, so we first write $\lambda\mathbf{v}$ as $\lambda\mathbf{Iv}$ and then factorise

$$ \left(\mathbf{A} - \lambda\mathbf{I}\right)\mathbf{v} = \mathbf{0} $$ where $\mathbf{I}$ is the $n\times n$ identity matrix with the same dimensions as $\mathbf{A}$. However, eigenvectors are never zero. If $\mathbf{A} - \lambda\mathbf{I}$ had an inverse, left-multiplying by it would force $\mathbf{v} = \mathbf{0}$, so $\mathbf{A} - \lambda\mathbf{I}$ must be singular. Therefore $$ \left|\mathbf{A} - \lambda\mathbf{I} \right| = 0 $$

If, for example, we say that $\mathbf{A}$ is a $3\times 3$ matrix of constants $a_i$ then this means $$ \left|\left(\begin{array}{ccc} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{array}\right) - \lambda\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right) \right| = 0 $$ Which is equivalent to $$ \left|\begin{array}{ccc} a_1-\lambda & a_2 & a_3 \\ a_4 & a_5-\lambda & a_6 \\ a_7 & a_8 & a_9-\lambda \end{array} \right| = 0 $$ So to find the eigenvalues of $\mathbf{A}$ you find the $\lambda$ that make the matrix formed by $\mathbf{A} - \lambda\mathbf{I}$ singular. The polynomial of $\lambda$ formed on the left-hand side by finding the determinant of that matrix is the characteristic polynomial of $\mathbf{A}$. That is, $$ p(\lambda) = |\mathbf{A} - \lambda\mathbf{I}| $$

The characteristic polynomial $p(\lambda)$ can get quite messy, but it's much easier for diagonal or triangular matrices. For diagonal, lower triangular, and upper triangular matrices, the eigenvalues are equal to the diagonal elements. This is obvious because the determinant of a matrix of these three types is just the product of the diagonal elements.

Proof

Let $\mathbf{A}$ be a diagonal $n\times n$ matrix with constants $a_i$ as the diagonal elements. Then $$ |\mathbf{A} - \lambda\mathbf{I}| = \left|\begin{array}{cccc} a_1- \lambda & 0 & 0 & \ldots \\ 0 & a_2- \lambda & 0 & \ldots \\ 0 & 0 & a_3- \lambda & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array}\right| = 0 $$ From the properties of determinants $$ p(\lambda) = 0 \Rightarrow (a_1- \lambda)(a_2-\lambda)(a_3-\lambda)\cdots (a_n-\lambda) = 0 $$ Therefore each of the diagonal elements $a_i$ is a root of the characteristic polynomial $p(\lambda)$. The same reasoning applies to triangular matrices, because subtracting $\lambda\mathbf{I}$ only changes the diagonal and the matrix stays triangular.

Example

Q) Find the eigenvalues of $\left(\begin{array}{cc} -1 & 2 \\ -6 & 6 \end{array}\right)$.

A) Start by finding the characteristic polynomial $|\mathbf{A} - \lambda\mathbf{I}|$ $$ p(\lambda) = \left| \begin{array}{cc} -1-\lambda & 2 \\ -6 & 6-\lambda \end{array} \right| = \left(-1-\lambda\right)\left(6-\lambda\right)-(-12) = \lambda^2-5\lambda +6 $$

Then set $p(\lambda) = 0$ and solve for $\lambda$ $$ \lambda^2-5\lambda +6 = 0 \Rightarrow \left(\lambda-2\right)\left(\lambda-3\right) = 0 $$

Therefore the eigenvalues are 2 and 3.
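For a $2\times 2$ matrix the characteristic polynomial always works out as $\lambda^2 - (a+d)\lambda + (ad-bc)$, so the eigenvalues come straight from the quadratic formula. Here is a minimal Python sketch of that idea. The function name is my own, and it assumes the eigenvalues are real.

```python
import math


def eigenvalues_2x2(m):
    (a, b), (c, d) = m
    trace = a + d            # sum of the diagonal elements
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues: outside the scope of this sketch")
    root = math.sqrt(disc)
    # Roots of lambda^2 - trace * lambda + det = 0
    return (trace - root) / 2, (trace + root) / 2


print(eigenvalues_2x2([[-1, 2], [-6, 6]]))  # (2.0, 3.0)
```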

Once you've found the eigenvalues, you need to find each corresponding eigenvector $\mathbf{v}$. Take each eigenvalue $\lambda$ and sub it into the following equation $$ \left(\mathbf{A} - \lambda\mathbf{I} \right)\mathbf{v} = \mathbf{0} $$ So we're not finding a determinant this time, we're finding a $\mathbf{v}$ that makes the left side equal to $\mathbf{0}$. Like many concepts in matrices, it's often best explained with a worked example.

Example

Q) Find the eigenvectors corresponding to the eigenvalues you found for $\left(\begin{array}{cc} -1 & 2 \\ -6 & 6 \end{array}\right)$.

A) We found that the eigenvalues were 2 and 3. So we'll sub each one into $\left(\mathbf{A} - \lambda\mathbf{I} \right)\mathbf{v} = \mathbf{0}$ and find each corresponding eigenvector.

$\lambda = 2$: $$ \left(\begin{array}{cc} -1-2 & 2 \\ -6 & 6-2 \end{array}\right) = \left(\begin{array}{cc} -3 & 2 \\ -6 & 4 \end{array}\right) $$ Then multiply with the vector of unknowns and set to zero $$ \left(\begin{array}{cc} -3 & 2 \\ -6 & 4 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \end{array}\right) $$ $$\begin{align} \Rightarrow -3x+2y &= 0 \\ -6x+4y &=0 \end{align} $$ See that these equations are scalar multiples of each other, and the rows and columns of the matrix are linearly dependent. (We specifically wanted to find values of $\lambda$ to make it this way!) So we can just look at one of the equations and get a working answer.

$$ -3x+2y = 0 \Rightarrow y = \frac{3}{2}x $$ This means that any vector of the form $\left(\begin{array}{c} x \\ \frac{3}{2}x \end{array}\right)$ will work as an eigenvector. Let's make it easy on ourselves and pick $$ \mathbf{v} = \left(\begin{array}{c} 2 \\ 3 \end{array}\right) $$

$\lambda = 3$: $$ \left(\begin{array}{cc} -1-3 & 2 \\ -6 & 6-3 \end{array}\right) = \left(\begin{array}{cc} -4 & 2 \\ -6 & 3 \end{array}\right) $$ Then multiply with the vector of unknowns and set to zero $$ \left(\begin{array}{cc} -4 & 2 \\ -6 & 3 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \end{array}\right) $$ $$\begin{align} \Rightarrow -4x+2y &= 0 \\ -6x+3y &=0 \end{align} $$ $$ -4x+2y = 0 \Rightarrow y = 2x $$ This means that any vector of the form $\left(\begin{array}{c} x \\ 2x \end{array}\right)$ will work as an eigenvector. So we pick

$$ \mathbf{v} = \left(\begin{array}{c} 1 \\ 2 \end{array}\right) $$
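You can always check an eigenvector by multiplying it by the matrix and seeing whether you get $\lambda\mathbf{v}$ back. The sketch below does this in Python. The names `matvec` and `is_eigenpair` are my own.

```python
def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def is_eigenpair(m, lam, v):
    """True when v is non-zero and m v equals lam v."""
    return any(v) and matvec(m, v) == [lam * x for x in v]


A = [[-1, 2],
     [-6, 6]]
print(is_eigenpair(A, 2, [2, 3]))    # True
print(is_eigenpair(A, 3, [1, 2]))    # True
print(is_eigenpair(A, 2, [4, 6]))    # True: any non-zero multiple also works
print(is_eigenpair(A, 2, [1, 2]))    # False: paired with the wrong eigenvalue
```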

Example

Q) Find all the eigenvalues and corresponding eigenvectors of $\left(\begin{array}{ccc} -2 & 5 & -2 \\ -8 & 11 & -2 \\ -4 & 5 & 0 \end{array} \right)$, given that one of the eigenvalues is 1.

A) Start by finding the characteristic polynomial $|\mathbf{A} - \lambda\mathbf{I}|$ $$ \left|\begin{array}{ccc} -2-\lambda & 5 & -2 \\ -8 & 11-\lambda & -2 \\ -4 & 5 & -\lambda \end{array} \right| = 0 $$ Evaluating the determinant $$ \left(-2-\lambda\right)\left(-\lambda\left(11-\lambda\right)-(-10)\right)-5\left(8\lambda-8 \right)-2\left(-40+4\left(11-\lambda\right)\right) = 0 $$ Simplifying and finding the characteristic polynomial $$ p(\lambda)=-\lambda^3+9\lambda^2-20\lambda+12 $$ So we need to solve $$ \lambda^3-9\lambda^2+20\lambda-12=0 $$

We were told that one of the eigenvalues is 1, so by the factor theorem $p(\lambda)$ is divisible by $(\lambda-1)$. Using algebraic long division or any other method of your choice, we factorise the polynomial to $$ \left(\lambda-1\right)\left(\lambda^2-8\lambda+12\right) = 0 $$ which further factorises to $$ \left(\lambda-1\right)\left(\lambda-2\right)\left(\lambda-6\right) = 0 $$

Therefore the eigenvalues are 1, 2, and 6. Now to find the corresponding eigenvectors.

$\lambda=1$: $$ \left(\begin{array}{ccc} -3 & 5 & -2 \\ -8 & 10 & -2 \\ -4 & 5 & -1 \end{array} \right)\left(\begin{array}{c} x \\ y \\ z \end{array}\right) = 0 $$ $$\begin{align} \Rightarrow -3x+5y-2z &= 0 \\ -8x+10y-2z &=0 \\ -4x+5y-z &= 0 \end{align} $$ Subtract the second equation from the first $$ 5x-5y = 0 \Rightarrow y=x $$ Choose $x=y=1$ and sub into the third equation $$ -4+5 -z =0 \Rightarrow z = 1 $$ So a working eigenvector is $$ \mathbf{v}= \left(\begin{array}{c} 1 \\ 1 \\ 1 \end{array}\right) $$

$\lambda=2$: $$ \left(\begin{array}{ccc} -4 & 5 & -2 \\ -8 & 9 & -2 \\ -4 & 5 & -2 \end{array} \right)\left(\begin{array}{c} x \\ y \\ z \end{array}\right) = 0 $$ $$\begin{align} \Rightarrow -4x+5y-2z &= 0 \\ -8x+9y-2z &=0 \\ -4x+5y-2z &= 0 \end{align} $$ The first and third rows are identical, so they are linearly dependent. Subtract twice the first equation from the second equation $$ -y+2z = 0 \Rightarrow y = 2z $$ Choose $z=1$ so that $y=2$. Then subtract the second equation from the first equation $$ 4x-4y = 0 \Rightarrow y= x$$ Since $y=2$ this gives $x=2$, so a working eigenvector is $$ \mathbf{v}= \left(\begin{array}{c} 2 \\ 2 \\ 1 \end{array}\right) $$

$\lambda=6$: $$ \left(\begin{array}{ccc} -8 & 5 & -2 \\ -8 & 5 & -2 \\ -4 & 5 & -6 \end{array} \right)\left(\begin{array}{c} x \\ y \\ z \end{array}\right) = 0 $$ $$\begin{align} \Rightarrow -8x+5y-2z &= 0 \\ -8x+5y-2z &=0 \\ -4x+5y-6z &= 0 \end{align} $$ The first and second rows are linearly dependent. Subtract thrice the first equation from the third equation $$ 20x -10y = 0 \Rightarrow y=2x $$ Choose $x=1$ so that $y=2$. Then subtract the third equation from the second equation $$ -4x + 4z=0 \Rightarrow x=z $$ Choose $z=x=1$, so a working eigenvector is $$ \mathbf{v}= \left(\begin{array}{c} 1 \\ 2 \\ 1 \end{array}\right) $$
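With three eigenvalues and three eigenvectors there is plenty of room for arithmetic slips, so a check is worthwhile. The sketch below, using my own helper names, confirms for each pair that $|\mathbf{A}-\lambda\mathbf{I}| = 0$ and that $\mathbf{Av} = \lambda\mathbf{v}$.

```python
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly(m, lam):
    """Evaluate |m - lam * I| for a 3x3 matrix m."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return det3(shifted)


def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


A = [[-2, 5, -2],
     [-8, 11, -2],
     [-4, 5, 0]]
pairs = [(1, [1, 1, 1]), (2, [2, 2, 1]), (6, [1, 2, 1])]

for lam, v in pairs:
    print(lam, char_poly(A, lam) == 0, matvec(A, v) == [lam * x for x in v])
# 1 True True
# 2 True True
# 6 True True
```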

Diagonalisation

Recall from earlier that a diagonal matrix is a square matrix whose only non-zero elements lie on its diagonal. Let $\mathbf{A}$ be an $n\times n$ diagonal matrix $$ \left(\begin{array}{cccc} a_1 & 0 & 0 & \ldots \\ 0 & a_2 & 0 & \ldots \\ 0 & 0 & a_3 & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array}\right) $$

Then we can raise the matrix to any power by raising each diagonal element to that power $$ \mathbf{A}^n = \left(\begin{array}{cccc} a_1^n & 0 & 0 & \ldots \\ 0 & a_2^n & 0 & \ldots \\ 0 & 0 & a_3^n & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{array}\right) $$

For other matrices, to find $\mathbf{A}^n$ you'd need to multiply the matrix by itself over and over, a long and tedious task. However, matrices can be diagonalisable, which makes raising them to large powers much easier.

A diagonalisable matrix $\mathbf{A}$ can be written as $$ \mathbf{A} = \mathbf{PDP}^{-1} $$ where $\mathbf{D}$ is a diagonal matrix of the eigenvalues of $\mathbf{A}$, $\mathbf{P}$ is a matrix whose columns are the eigenvectors corresponding to the eigenvalues of $\mathbf{A}$, and $\mathbf{P}^{-1}$ is the inverse of $\mathbf{P}$. If $\mathbf{P}$ is singular then $\mathbf{A}$ isn't diagonalisable. The eigenvectors in $\mathbf{P}$ and their corresponding eigenvalues in $\mathbf{D}$ must be in the same order.

In the $2\times 2$ case that means if the diagonal $\mathbf{D}$ has the first eigenvalue as the top left element, and the second eigenvalue as the bottom right element, then in $\mathbf{P}$ the first column must be the eigenvector corresponding to the first eigenvalue, and the second column must be the eigenvector corresponding to the second eigenvalue.

Then $$ \mathbf{A}^n = \mathbf{PD}^n\mathbf{P}^{-1} $$

Proof

Let $\mathbf{A}$ be a diagonalisable matrix. Then $\mathbf{A}$ can be written as $\mathbf{PDP}^{-1}$. Also $$ \begin{align} \mathbf{A}^n &= \underbrace{\mathbf{A}\mathbf{A}\mathbf{A}\cdots \mathbf{A}}_\text{n times} \\ &= \underbrace{\left(\mathbf{PDP}^{-1}\right)\left(\mathbf{PDP}^{-1}\right)\left(\mathbf{PDP}^{-1}\right)\cdots \left(\mathbf{PDP}^{-1}\right)}_\text{n times} \\ &= \mathbf{PD}\left(\mathbf{P}^{-1}\mathbf{P}\right)\mathbf{D}\left(\mathbf{P}^{-1}\mathbf{P}\right)\cdots \mathbf{D}\left(\mathbf{P}^{-1}\mathbf{P}\right)\mathbf{DP}^{-1} \\ &= \mathbf{PDIDI}\cdots \mathbf{DIDP}^{-1} \\ &= \mathbf{P}\underbrace{\mathbf{DD}\cdots\mathbf{D}}_\text{n times}\mathbf{P}^{-1} \\ &= \mathbf{PD}^n\mathbf{P}^{-1} \end{align} $$

This formula is very useful when raising $\mathbf{A}$ to large powers, because simply raising the eigenvalues in $\mathbf{D}$ to the nth power is much faster than multiplying $\mathbf{A}$ by itself loads of times.

Example

Q) Find $\mathbf{A}^7$ where $\mathbf{A} = \left(\begin{array}{ccc} -2 & 5 & -2 \\ -8 & 11 & -2 \\ -4 & 5 & 0 \end{array} \right)$.

A) We found the eigenvalues 1, 2, and 6 and the corresponding eigenvectors in the previous example, so we'll write $\mathbf{A}$ as $\mathbf{PDP}^{-1}$, where $$ \mathbf{D} = \left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{array} \right) $$ and $$ \mathbf{P} = \left(\begin{array}{ccc} 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 1 & 1 \end{array} \right) $$

We need to find $\mathbf{P}^{-1}$. Find its determinant first by going along the top row $$ |\mathbf{P}| = \left(2-2\right)-2\left(1-2\right)+\left(1-2\right) = 1 $$ Then find the matrix of cofactors $\mathbf{C}_{\mathbf{P}}$ $$ \mathbf{C}_{\mathbf{P}} = \left(\begin{array}{ccc} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 2 & -1 & 0 \end{array} \right) $$ Then transpose it and divide by the determinant, which here is just 1, to find $\mathbf{P}^{-1}$ $$ \mathbf{P}^{-1} = \left(\begin{array}{ccc} 0 & -1 & 2 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{array} \right) $$

Now we can write the diagonalised form of $\mathbf{A}$ $$ \mathbf{PDP}^{-1} = \left(\begin{array}{ccc} 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 1 & 1 \end{array} \right)\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{array} \right)\left(\begin{array}{ccc} 0 & -1 & 2 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{array} \right) $$ Therefore $\mathbf{A}^7$ is $$ \mathbf{A}^7 = \mathbf{PD}^7\mathbf{P}^{-1} = \left(\begin{array}{ccc} 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 1 & 1 \end{array} \right)\left(\begin{array}{ccc} 1^7 & 0 & 0 \\ 0 & 2^7 & 0 \\ 0 & 0 & 6^7 \end{array} \right)\left(\begin{array}{ccc} 0 & -1 & 2 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{array} \right) $$ Evaluating this product $$ \mathbf{PD}^7\mathbf{P}^{-1} = \left(\begin{array}{ccc} 1 & 256 & 279936 \\ 1 & 256 & 559872 \\ 1 & 128 & 279936 \end{array} \right)\left(\begin{array}{ccc} 0 & -1 & 2 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{array} \right) = \left(\begin{array}{ccc} -279680 & 279935 & -254 \\ -559616 & 559871 & -254 \\ -279808 & 279935 & -126 \end{array} \right) $$
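To convince yourself that the shortcut gives the same answer as brute force, you could compare the two in Python. This is a sketch with my own function names; it takes $\mathbf{P}$, $\mathbf{P}^{-1}$ and the eigenvalues from the working above rather than computing them.

```python
def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
            for r in range(len(x))]


def power_by_multiplication(m, n):
    """Multiply m by itself repeatedly (n - 1 multiplications)."""
    result = m
    for _ in range(n - 1):
        result = matmul(result, m)
    return result


A = [[-2, 5, -2], [-8, 11, -2], [-4, 5, 0]]
P = [[1, 2, 1], [1, 2, 2], [1, 1, 1]]          # eigenvectors as columns
P_inv = [[0, -1, 2], [1, 0, -1], [-1, 1, 0]]
eigenvalues = [1, 2, 6]                        # same order as the columns of P
n = 7

# D^n: raise each diagonal element to the nth power.
D_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(3)] for i in range(3)]
via_diagonalisation = matmul(matmul(P, D_n), P_inv)

print(via_diagonalisation == power_by_multiplication(A, n))  # True
print(via_diagonalisation[0])  # [-279680, 279935, -254]
```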

Example

Q) Let $\mathbf{M} = \left(\begin{array}{cc} 1 & 0 \\ 2 & 4 \end{array}\right)$. Show that $\mathbf{M}^n = \left(\begin{array}{cc} 1 & 0 \\ \frac{2}{3}\left(4^n -1\right) & 4^n \end{array}\right)$.

A) Find the characteristic polynomial of $\mathbf{M}$ $$ p(\lambda) = \left| \begin{array}{cc} 1-\lambda & 0 \\ 2 & 4-\lambda \end{array} \right| = \left(1-\lambda\right)\left(4-\lambda\right) $$ Indeed since $\mathbf{M}$ is a lower triangular matrix its eigenvalues are its diagonal elements 1 and 4.

$\lambda = 1$: $$ \left(\begin{array}{cc} 0 & 0 \\ 2 & 3 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \end{array}\right) $$ $$ \Rightarrow 2x+3y = 0 \Rightarrow y=-\frac{2}{3}x $$ Choose $x=3$ which makes $y=-2$. So the eigenvector is $$ \left(\begin{array}{c} 3 \\ -2 \end{array}\right) $$

$\lambda = 4$: $$ \left(\begin{array}{cc} -3 & 0 \\ 2 & 0 \end{array}\right)\left(\begin{array}{c} x \\ y \end{array}\right) = \left(\begin{array}{c} 0 \\ 0 \end{array}\right) $$ $$ \Rightarrow x = 0 $$ No other condition is placed on $x$ and $y$ other than that $x$ must be zero. So pick $y=1$ which means the eigenvector is $$ \left(\begin{array}{c} 0 \\ 1 \end{array}\right) $$

We aim to write $\mathbf{M}$ as $\mathbf{PDP}^{-1}$. We have $\mathbf{D}$ $$ \mathbf{D} = \left(\begin{array}{cc} 1 & 0 \\ 0 & 4 \end{array}\right) $$ and $\mathbf{P}$ is $$ \mathbf{P} = \left(\begin{array}{cc} 3 & 0 \\ -2 & 1 \end{array}\right) $$

Find $\mathbf{P}^{-1}$ by using FP1 methods $$ \mathbf{P}^{-1} = \frac{1}{3}\left(\begin{array}{cc} 1 & 0 \\ 2 & 3 \end{array}\right) = \left(\begin{array}{cc} \frac{1}{3} & 0 \\ \frac{2}{3} & 1 \end{array}\right) $$ Then $$ \mathbf{M}^n = \left(\begin{array}{cc} 3 & 0 \\ -2 & 1 \end{array}\right)\left(\begin{array}{cc} 1 & 0 \\ 0 & 4^n \end{array}\right)\left(\begin{array}{cc} \frac{1}{3} & 0 \\ \frac{2}{3} & 1 \end{array}\right) $$

Multiplying the left two matrices $$ \mathbf{M}^n = \left(\begin{array}{cc} 3 & 0 \\ -2 & 4^n \end{array}\right)\left(\begin{array}{cc} \frac{1}{3} & 0 \\ \frac{2}{3} & 1 \end{array}\right) $$

Then finishing the multiplication $$ \mathbf{M}^n = \left(\begin{array}{cc} 1 & 0 \\ -\frac{2}{3}+\frac{2}{3}\left(4^n\right) & 4^n \end{array}\right) = \left(\begin{array}{cc} 1 & 0 \\ \frac{2}{3}\left(4^n -1\right) & 4^n \end{array}\right) $$
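A formula like this is easy to test for small $n$. The sketch below compares it with repeated multiplication for $n = 1$ to $8$. It is a sanity check rather than a proof, and the function names are my own. Since $4^n - 1$ is always divisible by 3, integer division is exact here.

```python
def matmul2(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def closed_form(n):
    # The bottom-left entry is (2/3)(4^n - 1), which is always a whole number.
    return [[1, 0], [2 * (4 ** n - 1) // 3, 4 ** n]]


M = [[1, 0],
     [2, 4]]

power = [[1, 0], [0, 1]]   # start from the identity, i.e. M^0
results = []
for n in range(1, 9):
    power = matmul2(power, M)          # power is now M^n
    results.append(power == closed_form(n))

print(all(results))  # True
```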

The Cayley-Hamilton theorem

The Cayley-Hamilton theorem is a rather astonishing fact that allows you to quickly find matrix inverses and the relations between powers of matrices.

Every square matrix is a root of its own characteristic polynomial.

While pretty astonishing in and of itself, I'm going to show you with a practical example why this theorem is so useful.

Take the $2\times 2$ matrix $\mathbf{A}$ $$ \mathbf{A} = \left(\begin{array}{cc} 4 & 1 \\ 1 & \frac{1}{2} \end{array}\right) $$ The characteristic polynomial is $$ p(\lambda) = \left|\begin{array}{cc} 4-\lambda & 1 \\ 1 & \frac{1}{2}-\lambda \end{array}\right| = \left(4-\lambda\right)\left(1/2-\lambda\right)-1 = \lambda^2-\frac{9}{2}\lambda+1 $$

By the Cayley-Hamilton theorem $p(\mathbf{A})=\mathbf{0}$ $$ \mathbf{A}^2-\frac{9}{2}\mathbf{A}+\mathbf{I} = \mathbf{0} $$ Rearranging $$ \mathbf{A}^2 = \frac{9}{2}\mathbf{A}-\mathbf{I} $$ We now have a formula for $\mathbf{A}^2$ in terms of $\mathbf{A}$ itself, so we can find the square of the matrix without any matrix multiplication. Indeed $$ \frac{9}{2}\mathbf{A}-\mathbf{I} = \left(\begin{array}{cc} 18 & \frac{9}{2} \\ \frac{9}{2} & \frac{9}{4} \end{array}\right) - \left(\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right) = \left(\begin{array}{cc} 17 & \frac{9}{2} \\ \frac{9}{2} & \frac{5}{4} \end{array}\right) $$ $$ \mathbf{A}^2 = \mathbf{AA} = \left(\begin{array}{cc} 4 & 1 \\ 1 & \frac{1}{2} \end{array}\right)\left(\begin{array}{cc} 4 & 1 \\ 1 & \frac{1}{2} \end{array}\right) = \left(\begin{array}{cc} 17 & \frac{9}{2} \\ \frac{9}{2} & \frac{5}{4} \end{array}\right) $$

Additionally you can do this to find the inverse in terms of $\mathbf{A}$. Start by left-multiplying everything by $\mathbf{A}^{-1}$ $$ \mathbf{A}^{-1}\mathbf{A}^2-\frac{9}{2}\mathbf{A}^{-1}\mathbf{A}+\mathbf{A}^{-1}\mathbf{I} = \mathbf{0} $$ $$ \Rightarrow \mathbf{A} -\frac{9}{2}\mathbf{I}+\mathbf{A}^{-1}=\mathbf{0} $$ $$ \Rightarrow \mathbf{A}^{-1} = \frac{9}{2}\mathbf{I} - \mathbf{A} $$

Now this is pretty cool. We've found a formula for finding the inverse without having to do all of those steps. Indeed $$ \frac{9}{2}\mathbf{I} - \mathbf{A} = \left(\begin{array}{cc} \frac{9}{2} & 0 \\ 0 & \frac{9}{2} \end{array}\right) - \left(\begin{array}{cc} 4 & 1 \\ 1 & \frac{1}{2} \end{array}\right) = \left(\begin{array}{cc} \frac{1}{2} & -1 \\ -1 & 4 \end{array}\right) $$ $$ \mathbf{A}\left(\frac{9}{2}\mathbf{I} - \mathbf{A}\right) = \left(\begin{array}{cc} 4 & 1 \\ 1 & \frac{1}{2} \end{array}\right)\left(\begin{array}{cc} \frac{1}{2} & -1 \\ -1 & 4 \end{array}\right) = \left(\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right) = \mathbf{I} $$
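Both results can be checked with exact fractions in Python. This is a sketch with my own variable names; it compares $\mathbf{A}^2$ with $\frac{9}{2}\mathbf{A}-\mathbf{I}$ and confirms that $\frac{9}{2}\mathbf{I}-\mathbf{A}$ really is the inverse.

```python
from fractions import Fraction as F


def matmul2(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


A = [[F(4), F(1)],
     [F(1), F(1, 2)]]
I = [[F(1), F(0)],
     [F(0), F(1)]]

# Cayley-Hamilton rearranged: A^2 = (9/2) A - I
from_theorem = [[F(9, 2) * A[r][c] - I[r][c] for c in range(2)] for r in range(2)]
print(matmul2(A, A) == from_theorem)   # True

# Cayley-Hamilton rearranged again: A^{-1} = (9/2) I - A
inverse = [[F(9, 2) * I[r][c] - A[r][c] for c in range(2)] for r in range(2)]
print(matmul2(A, inverse) == I)        # True
```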

Example

Q) Let $\mathbf{M} = \left(\begin{array}{ccc} 2 & 3 & 0 \\ 2 & 4 & 0 \\ -5 & -6 & 1 \end{array}\right)$. Find an expression for $\mathbf{M}^{-1}$ in terms of powers of $\mathbf{M}$ by using the Cayley-Hamilton theorem.

A) Find the characteristic polynomial by going down the third column, which allows us to eliminate the first two terms $$ p(\lambda) = \left|\begin{array}{ccc} 2-\lambda & 3 & 0 \\ 2 & 4-\lambda & 0 \\ -5 & -6 & 1-\lambda \end{array}\right| = \left(1-\lambda\right)\left(\left(2-\lambda\right)\left(4-\lambda\right)-6\right) $$

Multiplying out the brackets $$ p(\lambda) = \left(1-\lambda\right)\left(8-6\lambda+\lambda^2-6\right) = -\lambda^3 + 7\lambda^2-8\lambda+2 $$ By the Cayley-Hamilton theorem $p(\mathbf{M}) = \mathbf{0}$, and multiplying through by $-1$ gives $$ \mathbf{M}^3 - 7\mathbf{M}^2+8\mathbf{M}-2\mathbf{I} = \mathbf{0} $$

Then left-multiply every term by $\mathbf{M}^{-1}$ $$ \mathbf{M}^{-1}\mathbf{M}^3 - 7\mathbf{M}^{-1}\mathbf{M}^2+8\mathbf{M}^{-1}\mathbf{M}-2\mathbf{M}^{-1}\mathbf{I} = \mathbf{0} $$ $$ \Rightarrow \mathbf{M}^2- 7\mathbf{M}+8\mathbf{I}-2\mathbf{M}^{-1}= \mathbf{0} $$ Rearranging to make $\mathbf{M}^{-1}$ the subject $$ 2\mathbf{M}^{-1} = \mathbf{M}^2-7\mathbf{M}+8\mathbf{I} $$ $$ \Rightarrow \mathbf{M}^{-1} = \frac{1}{2}\left(\mathbf{M}^2-7\mathbf{M}+8\mathbf{I}\right) $$
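An expression like this is easy to check: build $\frac{1}{2}\left(\mathbf{M}^2-7\mathbf{M}+8\mathbf{I}\right)$ and multiply it by $\mathbf{M}$ to see whether the identity comes out. Here is a sketch in Python using exact fractions. The names `matmul` and `candidate` are my own.

```python
from fractions import Fraction as F


def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


M = [[2, 3, 0],
     [2, 4, 0],
     [-5, -6, 1]]
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

M2 = matmul(M, M)
# Candidate inverse from the Cayley-Hamilton rearrangement: (M^2 - 7M + 8I) / 2
candidate = [[F(M2[r][c] - 7 * M[r][c] + 8 * I[r][c], 2) for c in range(3)]
             for r in range(3)]

print(matmul(M, candidate) == I)  # True
print(matmul(candidate, M) == I)  # True
```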
