#### (A new question of the week)

Looking for a new topic, I realized that a recent question involves determinants, and an older one provides the background for that. We’ll continue the series on determinants by seeing how they can be used in finding the inverse of a matrix, and how something called the adjugate matrix might fit in (with side trips into Cramer’s Rule and row reduction).

## Finding an inverse using determinants

This question came from Sarah, in February of last year:

I was studying matrices, and was thinking, is there some proof on finding the inverse of a matrix? I know how to do it step by step by heart but l do not understand what I’m doing and why it is like that.

For example, the inverse uses the determinant of a matrix – how do you interpret it? For instance, if the determinant of a 3×3 matrix is 2, what is that telling you about the matrix? We also find minors – if an element has a minor of -1, what does that really mean, please?

We’ve recently seen what a determinant means, algebraically and geometrically; but the “meaning” in this context is a little different. We haven’t yet looked at minors, which are determinants of sub-matrices.

Doctor Fenton answered:

Hi Sarah,

Yes, there are ways of proving that a given algorithm does produce an inverse to a matrix, and there is more than one way to compute the inverse, one of which is to use determinants. It would help to know what you already know about matrices. Do you use matrices to solve systems of linear equations, to transform vectors (column matrices), or for some other application?

Sarah replied,

Thanks for your reply. I’m using it in a course about mathematical economics where it is mostly applied to finding inverses to solve a system of 3 equations. If you’re familiar with some economic theory, there is also an application to find OLS estimators in a regression. We had covered matrices before, but now l want to understand a bit deeper what I’m actually doing.

So l know if I’m using determinants, l can find the reciprocal of that and multiply by the adjoint, where the adjoint is the transpose of the cofactor matrix; but beyond that, l still don’t know what the determinant is. I’ve always learnt it as “ad – bc”.

Even minors, l get the definition that you delete the ith row and jth column and find the determinant of the resultant matrix, but doing that by heart is a bit strange because l don’t understand why l am doing that, in the sense l don’t know what the minor shows you and how it leads to the inverse matrix. I think that logic is why you can only apply inverses to square matrices, although to solve systems of equations, number of equations = number of unknowns shouldn’t be a problem.

We had previously covered the row reduction technique; l also know Laplace expansion and the shorthand rule. And we have solved systems using Cramer’s rule. Thanks

We’ll touch on most of these topics: finding the inverse using what she calls the “adjoint”, more often today called the “adjugate”, and also by row reduction; “minors” in a determinant (used in finding the adjugate, and also in the Laplace expansion for evaluating a determinant); and Cramer’s rule for solving a system of equations.

Sometime we will look into what matrices *are*, *why* they are added and multiplied as they are, and so on. But we’ll see the basics of multiplication and inverses momentarily.

### What is a matrix inverse?

Doctor Fenton responded, first stating what an inverse is:

Thank you for clarifying what you already know. Using the **adjugate** (previously called the **adjoint**) matrix to find the inverse is **not the most efficient way** to compute the inverse. I will illustrate the ideas with 2×2 matrices, although the idea works for square matrices of any size (only square matrices can have an inverse). When I multiply two 2×2 matrices AB, with

$$A=\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix}\text{ and }B=\begin{bmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{bmatrix},$$

note that the product is

$$AB=\begin{bmatrix}a_{11}b_{11}+a_{12}b_{21}&a_{11}b_{12}+a_{12}b_{22}\\a_{21}b_{11}+a_{22}b_{21}&a_{21}b_{12}+a_{22}b_{22}\end{bmatrix}=\begin{bmatrix}AB_1&AB_2\end{bmatrix},$$

where \(B_1\) and \(B_2\) are the first and second columns of B. That is, to multiply A by the matrix \(B=[B_1\ B_2]\) on the right, you just multiply each of the columns in B by A.

To help us follow this, I’ll make a simple 2×2 example:

$$A=\begin{bmatrix}1&2\\3&4\end{bmatrix},B=\begin{bmatrix}2&-1\\1&3\end{bmatrix}\\

AB=\begin{bmatrix}1&2\\3&4\end{bmatrix}\begin{bmatrix}2&-1\\1&3\end{bmatrix}=\begin{bmatrix}1\cdot2+2\cdot1&1\cdot-1+2\cdot3\\3\cdot2+4\cdot1&3\cdot-1+4\cdot3\end{bmatrix}=\begin{bmatrix}4&5\\10&9\end{bmatrix}=Y$$

The first column of the product is A times the first column of B:

$$\begin{bmatrix}1&2\\3&4\end{bmatrix}\begin{bmatrix}2\\1\end{bmatrix}=\begin{bmatrix}1\cdot2+2\cdot1\\3\cdot2+4\cdot1\end{bmatrix}=\begin{bmatrix}4\\10\end{bmatrix}$$
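This column-at-a-time view of multiplication is easy to verify numerically; here is a quick sketch using NumPy (my own illustration, not part of the original exchange):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, -1], [1, 3]])

AB = A @ B                 # the full product, [[4, 5], [10, 9]]

# Each column of AB is A applied to the corresponding column of B.
col1 = A @ B[:, 0]         # A times the first column of B
col2 = A @ B[:, 1]         # A times the second column of B
```

Comparing `col1` and `col2` with the columns of `AB` confirms the identity \(AB=[AB_1\ AB_2]\).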

That’s how we multiply. So what is the inverse?

The **inverse** of a matrix A (if it exists) is the matrix \(A^{-1}\) such that

$$AA^{-1}=A^{-1}A=I,$$

where I is the identity matrix.

If A is invertible, and we want to solve the matrix equation AX=B, where X is a 2×1 column matrix $$X=\begin{bmatrix}x_1\\x_2\end{bmatrix}$$ and B is a column matrix $$B=\begin{bmatrix}b_1\\b_2\end{bmatrix},$$ we multiply AX=B by \(A^{-1}\) and get \(X=A^{-1}B\) as the solution.

For our A, the inverse (which we’ll calculate below in two ways) turns out to be $$A^{-1}=\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix},$$ which we can check by seeing that $$AA^{-1}=\begin{bmatrix}1&2\\3&4\end{bmatrix}\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix}=\begin{bmatrix}1\cdot-2+2\cdot\frac{3}{2}&1\cdot1+2\cdot-\frac{1}{2}\\3\cdot-2+4\cdot\frac{3}{2}&3\cdot1+4\cdot-\frac{1}{2}\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}$$ and

$$A^{-1}A=\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix}\begin{bmatrix}1&2\\3&4\end{bmatrix}=\begin{bmatrix}-2\cdot1+1\cdot3&-2\cdot2+1\cdot4\\\frac{3}{2}\cdot1-\frac{1}{2}\cdot3&\frac{3}{2}\cdot2-\frac{1}{2}\cdot4\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}.$$

If we wanted to solve the equation \(AX=Y\), $$\begin{bmatrix}1&2\\3&4\end{bmatrix}X=\begin{bmatrix}4&5\\10&9\end{bmatrix},$$ we could multiply both sides by \(A^{-1}\) to get

$$X=A^{-1}Y=\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix}\begin{bmatrix}4&5\\10&9\end{bmatrix}=\begin{bmatrix}-2\cdot4+1\cdot10&-2\cdot5+1\cdot9\\\frac{3}{2}\cdot4-\frac{1}{2}\cdot10&\frac{3}{2}\cdot5-\frac{1}{2}\cdot9\end{bmatrix}=\begin{bmatrix}2&-1\\1&3\end{bmatrix},$$ which is our B above.
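These checks can be replicated numerically; the sketch below uses NumPy’s `np.linalg.inv` in place of the hand computation (an illustration only, not the method the post is building toward):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
Y = np.array([[4, 5], [10, 9]])   # the product AB computed earlier

Ainv = np.linalg.inv(A)           # [[-2, 1], [3/2, -1/2]]

# Multiplying AX = Y through by the inverse recovers X = B.
X = Ainv @ Y
```

Here `X` comes out as the matrix B from the start of the example, as claimed.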

### Inverse by solving equations

So, how do we find that inverse matrix?

To simplify notation by reducing the number of super- and subscripts, let me denote the inverse matrix of A, \(A^{-1}\), by C, so that \(C_1\) is the **first column** of C and \(C_2\) the **second**. The equation \(AA^{-1}=AC=I\) can be written as

$$AC=A[C_1\ C_2]=[AC_1\ AC_2]=[E_1\ E_2],$$

since $$E_1=\begin{bmatrix}1\\0\end{bmatrix}$$ is the first column of I and $$E_2=\begin{bmatrix}0\\1\end{bmatrix}$$ is the second. Then \(AC_1=E_1\) and \(AC_2=E_2\), which says that \(C_1\) is the solution to \(AX=E_1\), and \(C_2\) is the solution to \(AX=E_2\).

In our example, we find the two columns of the inverse by solving $$AC_1=E_1$$ $$\begin{bmatrix}1&2\\3&4\end{bmatrix}C_1=\begin{bmatrix}1\\0\end{bmatrix}$$ and $$AC_2=E_2$$ $$\begin{bmatrix}1&2\\3&4\end{bmatrix}C_2=\begin{bmatrix}0\\1\end{bmatrix}$$

But you know how to solve AX=B by **row reducing the augmented matrix** [A:B] (the matrix A augmented with B as an extra column) to the form [I:X], so that the solution X is the last column of the reduced augmented matrix.

Then, to find the **inverse matrix**, we augment the matrix A with the identity matrix, [A:I] (a 2×4 matrix), and row reduce to the form [I:C]; the inverse matrix will be the right half of the reduced 2×4 matrix. (If the left half cannot be reduced to I, then the matrix A is not invertible.) That is the efficient way to find \(A^{-1}\).
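This [A:I] procedure can be sketched in code; the following is a minimal Gauss–Jordan implementation (the function name and the partial-pivoting detail are my own additions):

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row reduce [A | I] to [I | A^-1]; raises if A is singular."""
    n = len(A)
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])  # augmented n x 2n
    for col in range(n):
        # Partial pivoting: pick the largest available pivot in this column.
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0):
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]   # swap rows
        M[col] /= M[col, col]               # scale the pivot row to get a 1
        for r in range(n):                  # clear the rest of the column
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, n:]                         # the right half is the inverse

Ainv = inverse_by_row_reduction([[1, 2], [3, 4]])   # [[-2, 1], [3/2, -1/2]]
```

Running it on the 3×3 matrix used later in the post reproduces the same inverse found there.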

This is the standard method that he referred to before, and which we’ll see below. But we can also use determinants to solve this equation, which will lead to the adjugate. For that, keep reading …

### Cramer’s rule and the inverse

Finding \(C_1\) and \(C_2\) each amounts to solving a system of equations, which we can do with determinants:

If you solve

$$\begin{aligned}ax+by&=u\\cx+dy&=v\end{aligned}$$

with elimination, multiplying the first equation by d and the second equation by b, and then subtracting, you get

$$(ad-bc)x=du-bv,$$

so

$$x=\frac{du-bv}{ad-bc},$$

or

$$x=\frac{\det\begin{bmatrix}u&b\\v&d\end{bmatrix}}{\det\begin{bmatrix}a&b\\c&d\end{bmatrix}},$$

and similarly \(y=\frac{av-cu}{ad-bc}\) is a quotient of determinants. This indicates **where determinants can come from** and can lead to **Cramer’s Rule**, but using determinants is not the best way to find the inverse.

Here we have derived Cramer’s Rule by brute force in the 2×2 case. As Wikipedia puts it,

Consider a system of *n* linear equations for n unknowns, represented in matrix multiplication form as follows: $$A\mathbf{x}=\mathbf{b}$$

where the *n* × *n* matrix A has a nonzero determinant, and the vector \(\mathbf{x}=(x_1,\dots,x_n)^T\) is the column vector of the variables. Then the theorem states that in this case the system has a unique solution, whose individual values for the unknowns are given by: $$x_i=\frac{\det(A_i)}{\det(A)}\; \; \; i=1,\dots n$$ where \(A_i\) is the matrix formed by replacing the i-th column of A by the column vector \(\mathbf{b}\).
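This general statement translates directly into a short function (a sketch; `np.linalg.det` does the determinant work, and the function name is mine):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's rule: x_i = det(A_i) / det(A),
    where A_i is A with its i-th column replaced by b."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0):
        raise ValueError("det(A) = 0: no unique solution")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                     # replace the i-th column by b
        x[i] = np.linalg.det(Ai) / d
    return x

# First column of the inverse of [[1, 2], [3, 4]]: solve A C1 = E1.
C1 = cramer_solve([[1, 2], [3, 4]], [1, 0])   # ≈ [-2, 3/2]
```

Solving against \(E_2\) in the same way gives the second column of the inverse.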

So let’s solve our system this way, in order to find the inverse of A:

To find the first column of our inverse, we need to solve

$$\begin{bmatrix}1&2\\3&4\end{bmatrix}C_1=\begin{bmatrix}{\color{Green}1}\\{\color{Green}0}\end{bmatrix}$$

Cramer’s rule gives this solution:

$$C_{11}=\frac{\begin{vmatrix}{\color{Green}1}&2\\{\color{Green}0}&{\color{Red}4}\end{vmatrix}}{\begin{vmatrix}1&2\\3&4\end{vmatrix}}=\frac{1\cdot{\color{Red}4}-2\cdot0}{1\cdot4-2\cdot3}=\frac{{\color{Red}4}}{-2}=-2$$

$$C_{21}=\frac{\begin{vmatrix}1&{\color{Green}1}\\ {\color{Red}3}&{\color{Green}0}\end{vmatrix}}{\begin{vmatrix}1&2\\3&4\end{vmatrix}}=\frac{1\cdot0-1\cdot{\color{Red}3}}{1\cdot4-2\cdot3}=\frac{-{\color{Red}3}}{-2}=\frac{3}{2}$$

But observe that the determinant on the top, in each case, is just the element (4 or 3) opposite the 1, with an alternating sign; I’ve highlighted them. These, as we’ll see, are **cofactors**.

So the first column is $$C_{1}=\begin{bmatrix}-2\\\frac{3}{2}\end{bmatrix}$$

Similarly, to solve

$$\begin{bmatrix}1&2\\3&4\end{bmatrix}C_2=\begin{bmatrix}{\color{Green}0}\\{\color{Green}1}\end{bmatrix}$$

we use

$$C_{12}=\frac{\begin{vmatrix}{\color{Green}0}&{\color{Red}2}\\{\color{Green}1}&4\end{vmatrix}}{\begin{vmatrix}1&2\\3&4\end{vmatrix}}=\frac{0\cdot4-{\color{Red}2}\cdot1}{1\cdot4-2\cdot3}=\frac{-{\color{Red}2}}{-2}=1$$

$$C_{22}=\frac{\begin{vmatrix}{\color{Red}1}&{\color{Green}0}\\3&{\color{Green}1}\end{vmatrix}}{\begin{vmatrix}1&2\\3&4\end{vmatrix}}=\frac{{\color{Red}1}\cdot1-0\cdot3}{1\cdot4-2\cdot3}=\frac{{\color{Red}1}}{-2}=-\frac{1}{2}$$

So the second column of the inverse is $$C_{2}=\begin{bmatrix}1\\-\frac{1}{2}\end{bmatrix}$$

This gives us the inverse I showed before,

$$A^{-1}=\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix}$$

We almost used the adjugate here, though we haven’t yet even talked about what it is. We’ll get there eventually, but first, he answered the side questions:

Determinants have a **geometric interpretation**. The determinant of $$\begin{bmatrix}a&b\\c&d\end{bmatrix}$$ is the area of the parallelogram with sides given by the vectors (a, b) and (c, d) in the plane.

I don’t know of any significance of this fact for solving linear systems, other than the fact that if the determinant is 0, then the system either has no solution or infinitely many solutions, depending upon the right side B. Does this help?

This is the subject of our last two posts.

## Finding the inverse by row reduction

Sarah asked for a little more:

Thank you so much for that, Dr Fenton.

Just to make sure l understood, could you kindly **illustrate through an example**? I can then apply that myself to a 3×3, don’t worry 🙂

Why is there such an emphasis on determinants **not being the most efficient way**, please? The part on deriving the determinant and how it can lead to Cramer’s Rule is very interesting, thank you.

What about the part on **minors**, particularly interpreting them – the idea behind WHY we delete the ith row and jth column and take the determinant of the resultant matrix.

Thank you!

Doctor Fenton replied with, first, a statement of what we did above with Cramer’s Rule:

By an example, I assume that you want an example of **using row reduction to compute an inverse of a matrix**. In the 2×2 case, the determinant approach gives the inverse matrix

$$\begin{bmatrix}a&b\\c&d\end{bmatrix}^{-1}=\frac{1}{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix},$$

which doesn’t require much computation.

That matrix is, in fact, the adjugate.
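That 2×2 formula is short enough to code directly (a hypothetical helper, written for illustration):

```python
def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]]: 1/(ad - bc) times [[d, -b], [-c, a]]."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det],
            [-c / det, a / det]]
```

For the running example, `inverse_2x2(1, 2, 3, 4)` gives `[[-2.0, 1.0], [1.5, -0.5]]`, matching the inverse found earlier.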

Then he gave an example of the more efficient method of finding inverses, before getting back to minors:

For a 3×3 example, to find

$$\begin{bmatrix}1&-1&0\\1&0&-1\\-6&2&3\end{bmatrix}^{-1},$$

we write

$$\left[\begin{array}{rrr|rrr}1&-1&0&1&0&0\\1&0&-1&0&1&0\\-6&2&3&0&0&1\end{array}\right]$$

and row reduce to

$$\left[\begin{array}{rrr|rrr}1&0&0&-2&-3&-1\\0&1&0&-3&-3&-1\\0&0&1&-2&-4&-1\end{array}\right],$$

so

$$\begin{bmatrix}1&-1&0\\1&0&-1\\-6&2&3\end{bmatrix}^{-1}=\begin{bmatrix}-2&-3&-1\\-3&-3&-1\\-2&-4&-1\end{bmatrix}.$$

We’ll see the adjugate method, for the same matrix, later.

The reason for preferring row operations is complexity. Even in the 3×3 case, the arithmetic work required is not onerous, but **for larger matrices, there is a big difference**. It’s not hard to see that in general, **computing an n×n determinant** directly requires computing n! terms, while row-reducing an n×n matrix to upper triangular form takes roughly \(n^3/6\) operations, so reducing the left half of the augmented n × (2n) matrix to the identity will take about \(n^3/3\) operations. For n = 2 or 3, n! and \(n^3/3\) are comparable, but for larger n, say n = 10, 10! is over \(3\times10^6\), while \(10^3/3\) is about 300. For n = 100, the value of 100! is an integer with 158 digits, while \(100^3/3\) is in the hundreds of thousands.

To compute the value of **large determinants**, it is more efficient to use row operations to transform the matrix to upper triangular form, since the determinant of a triangular matrix is just the product of its diagonal elements, and the effect of each row operation on a determinant is easy to determine: interchanging rows changes the sign of the determinant; multiplying a row by a constant multiplies the determinant by the same constant; and replacing a row by the sum of itself and a multiple of another row doesn’t change the determinant.
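The triangular-form method for large determinants can be sketched as follows (my own minimal implementation, tracking only the sign changes from row swaps):

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via row reduction to upper triangular form:
    a row swap flips the sign, and adding a multiple of one row to
    another leaves the determinant unchanged, so the answer is the
    (sign-corrected) product of the diagonal of the triangular matrix."""
    M = np.array(A, dtype=float)
    n = len(M)
    sign = 1.0
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0):
            return 0.0                      # no usable pivot: determinant is 0
        if pivot != col:
            M[[col, pivot]] = M[[pivot, col]]
            sign = -sign                    # a row swap changes the sign
        for r in range(col + 1, n):
            M[r] -= (M[r, col] / M[col, col]) * M[col]
    return sign * float(np.prod(np.diag(M)))

d = det_by_elimination([[1, -1, 0], [1, 0, -1], [-6, 2, 3]])   # ≈ -1
```

No cofactor expansion is needed, which is the point of the complexity comparison above.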

This provides a way to find determinants that is quicker than doing it directly; but in the adjugate method we’re about to see, we’d need to calculate *many* determinants!

### Minors and cofactors

The adjugate is defined in terms of **minors**, which arise in the **Laplace expansion of a determinant**; so he explained that first. Here is what it looks like for a 3×3 determinant, starting with the algebraic definition we saw two weeks ago:

As for the Laplace expansion, I don’t know how Laplace discovered it, but if you look at the 3×3 case,

$$\det\begin{bmatrix}a&b&c\\d&e&f\\g&h&i\end{bmatrix}=aei+cdh+bfg-ceg-afh-bdi\\=a(ei-hf)+b(fg-di)+c(dh-eg)\\=a\det\begin{bmatrix}e&f\\h&i\end{bmatrix}-b\det\begin{bmatrix}d&f\\g&i\end{bmatrix}+c\det\begin{bmatrix}d&e\\g&h\end{bmatrix}.$$

You can pick any row (or column) and rewrite the determinant as a **sum of the entries in that row** (or column) times **determinants** which are the **minors** of the entries.

Each element of one row (here, the top) is multiplied by the determinant of the matrix formed by removing that element’s row and column. The **minor** of the bold entry here is the determinant of the part in red, and the **cofactor** is the minor multiplied by \(\pm1\):

$$\begin{vmatrix}\mathbf{a}&b&c\\d&{\color{Red}e}&{\color{Red}f}\\g&{\color{Red}h}&{\color{Red}i}\end{vmatrix}\quad\begin{vmatrix}a&\mathbf{b}&c\\ {\color{Red}d}&e&{\color{Red}f}\\ {\color{Red} g}&h&{\color{Red}i}\end{vmatrix}\quad\begin{vmatrix}a&b&\mathbf{c}\\ {\color{Red}d}&{\color{Red}e}&f\\ {\color{Red} g}&{\color{Red}h}&i\end{vmatrix}$$

The same pattern is true, almost trivially, of the 2×2 determinant: the minors are just the diagonally opposite entries, as I mentioned above.
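The delete-a-row-and-column recipe is mechanical, which a short sketch makes concrete (helper names are mine; `np.linalg.det` evaluates the submatrix determinants):

```python
import numpy as np

def minor(A, i, j):
    """Determinant of A with row i and column j deleted (0-indexed)."""
    sub = np.delete(np.delete(np.array(A, dtype=float), i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(A, i, j):
    """The minor with the alternating sign (-1)^(i+j) attached."""
    return (-1) ** (i + j) * minor(A, i, j)

A = [[1, -1, 0], [1, 0, -1], [-6, 2, 3]]
# Laplace expansion along the first row reproduces det(A):
d = sum(A[0][j] * cofactor(A, 0, j) for j in range(3))   # ≈ -1
```

The same expansion works along any row or column, as the quote says.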

## Inverse by adjugate

Sarah now asked for the one missing piece:

Thank you Dr Fenton! This is why l love asking questions here – l always learn more than l ever thought l would before asking!

The part about number of operations isn’t as obvious to me, but l do get the gist why row operations are quicker.

Could you elaborate on the notion of **minors**, please? I’m still unsure what a minor of 4 would really be saying. I think there’s more to it that l just don’t know about.

And what about proving that **1/det multiplied by the adjugate** indeed gives you the **inverse matrix**, please? Thank you 🙂

Doctor Fenton answered:

As I think I said earlier, I just regard minors as **quantities which arise in evaluating determinants**. As a determinant, a minor has a geometric interpretation as an area or volume in 2 or 3 dimensions, but I am not aware of any geometric significance to that fact. The Laplace expansion (or cofactor expansion) tells you that the absolute value of a 3×3 determinant is the volume of a 3-dimensional parallelepiped, which is a linear combination of some 2-dimensional areas (the areas corresponding to the minors of the determinant), but I don’t know that this interpretation helps understand what a determinant is.

This could be interesting to think more about, but if there is a meaning, it is not obvious.

Now we finally get to the adjugate:

As for the inverse formula of an invertible matrix A, you form the **cofactor matrix** C of A, where the entry in the ith row and jth column is \(c_{ij}\), the cofactor of the entry \(a_{ij}\) in A — that is, \((-1)^{i+j}M_{ij}\), where the minor \(M_{ij}\) is the determinant of the matrix obtained by deleting the ith row and jth column of A. Next, you **transpose the cofactor matrix**, \(C^T\). This is the **adjugate matrix**.

Then the matrix product \(AC^T\) is

$$\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{bmatrix}\begin{bmatrix}c_{11}&c_{21}&\cdots&c_{n1}\\c_{12}&c_{22}&\cdots&c_{n2}\\\vdots&\vdots&&\vdots\\c_{1n}&c_{2n}&\cdots&c_{nn}\end{bmatrix},$$

so the 1,1 entry of the product is

$$a_{11}c_{11}+a_{12}c_{12}+\cdots+a_{1n}c_{1n},$$

which is exactly the cofactor expansion of det(A). The 1,2 entry of the product is

$$a_{11}c_{21}+a_{12}c_{22}+\cdots+a_{1n}c_{2n},$$

which is the cofactor expansion of the determinant of the matrix

$$\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{11}&a_{12}&\cdots&a_{1n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{bmatrix}$$

(A with its second row replaced by a copy of its first row). This matrix has a repeated row, so the determinant of this matrix is 0. The same happens for every off-diagonal entry.

Then the product \(AC^T\) is

$$\begin{bmatrix}\det(A)&0&\cdots&0\\0&\det(A)&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\det(A)\end{bmatrix}=\det(A)\,I,$$

where I is the n×n identity matrix.

### 2×2 example

We’ve already done this in our 2×2 example. With $$A=\begin{bmatrix}1&2\\3&4\end{bmatrix},$$ the cofactor matrix is $$C=\begin{bmatrix}4&-3\\-2&1\end{bmatrix},$$ swapping diagonally opposite entries and changing the sign of every other one. Its transpose is $$C^T=\begin{bmatrix}4&-2\\-3&1\end{bmatrix},$$ which is the adjugate. Dividing this by the determinant, \(1\cdot4-2\cdot3=-2,\) we get $$A^{-1}=\begin{bmatrix}\frac{4}{-2}&\frac{-2}{-2}\\\frac{-3}{-2}&\frac{1}{-2}\end{bmatrix}=\begin{bmatrix}-2&1\\\frac{3}{2}&-\frac{1}{2}\end{bmatrix}.$$ This is what we got before.

Can you see the connection between this and what we did with Cramer’s Rule?

### 3×3 example

Now let’s do a 3×3 example; using the example Doctor Fenton used above, I’ll take $$A=\begin{bmatrix}1&-1&0\\1&0&-1\\-6&2&3\end{bmatrix}.$$

The cofactor of the first entry, \(a_{11}\), is $$(-1)^{1+1}\begin{vmatrix}0&-1\\2&3\end{vmatrix}=2,$$ so that is the first entry of the cofactor matrix. The cofactor of \(a_{12}\) is $$(-1)^{1+2}\begin{vmatrix}1&-1\\-6&3\end{vmatrix}=-(-3)=3.$$ Continuing, the cofactor matrix is $$C=\begin{bmatrix}2&3&2\\3&3&4\\1&1&1\end{bmatrix},$$ and the adjugate is $$C^T=\begin{bmatrix}2&3&1\\3&3&1\\2&4&1\end{bmatrix}.$$

The determinant of A is (using cofactors in the first row) $$\det(A)=\begin{vmatrix}1&-1&0\\1&0&-1\\-6&2&3\end{vmatrix}=1\cdot2+-1\cdot3+0\cdot2=2-3+0=-1.$$

So the inverse is $$A^{-1}=\frac{C^T}{\det(A)}=\frac{1}{-1}\begin{bmatrix}2&3&1\\3&3&1\\2&4&1\end{bmatrix}=\begin{bmatrix}-2&-3&-1\\-3&-3&-1\\-2&-4&-1\end{bmatrix},$$ as we got by row reduction. We can check this by multiplying:

$$AA^{-1}=\begin{bmatrix}1&-1&0\\1&0&-1\\-6&2&3\end{bmatrix}\begin{bmatrix}-2&-3&-1\\-3&-3&-1\\-2&-4&-1\end{bmatrix}=\\\begin{bmatrix}1\cdot-2+-1\cdot-3+0\cdot-2&1\cdot-3+-1\cdot-3+0\cdot-4&1\cdot-1+-1\cdot-1+0\cdot-1\\1\cdot-2+0\cdot-3+-1\cdot-2&1\cdot-3+0\cdot-3+-1\cdot-4&1\cdot-1+0\cdot-1+-1\cdot-1\\-6\cdot-2+2\cdot-3+3\cdot-2&-6\cdot-3+2\cdot-3+3\cdot-4&-6\cdot-1+2\cdot-1+3\cdot-1\end{bmatrix}=\\\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}$$
