線性代數(Linear Algebra)

Mathematics   Yoshio   Mar 24th, 2024 at 8:00 PM



In math terms, an operation F is linear if scaling inputs scales the output, and adding inputs adds the outputs:

\(F(\alpha\overset\rightharpoonup x)=\alpha\cdot F(\overset\rightharpoonup x)\)

\(F(\overset\rightharpoonup x+\overset\rightharpoonup y)=F(\overset\rightharpoonup x)+F(\overset\rightharpoonup y)\)


[Ex] Balance \(x\,\mathrm{C_2H_5OH}+y\,\mathrm{O_2}\Rightarrow z\,\mathrm{CO_2}+w\,\mathrm{H_2O}\)

Counting C, O, H atoms on each side gives

\(x\begin{bmatrix}2\\1\\6\end{bmatrix}+y\begin{bmatrix}0\\2\\0\end{bmatrix}=z\begin{bmatrix}1\\2\\0\end{bmatrix}+w\begin{bmatrix}0\\1\\2\end{bmatrix}\)

\(\begin{bmatrix}2&0&-1&0\\1&2&-2&-1\\6&0&0&-2\end{bmatrix}\begin{bmatrix}x\\y\\z\\w\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)
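As a quick check of this setup, the null space of that coefficient matrix can be computed and scaled to integers, which recovers the balanced reaction. A minimal SymPy sketch (added for illustration, not part of the original notes):

```python
from sympy import Matrix

# Coefficient matrix of x C2H5OH + y O2 => z CO2 + w H2O
# (rows count C, O, H atoms, left side minus right side)
M = Matrix([[2, 0, -1,  0],
            [1, 2, -2, -1],
            [6, 0,  0, -2]])

ns = M.nullspace()[0]     # one free direction: [1/3, 1, 2/3, 1]
print(ns.T)
print((3 * ns).T)         # scaled to integers: [1, 3, 2, 3]
# i.e. C2H5OH + 3 O2 => 2 CO2 + 3 H2O
```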





Linear algebra provides an invaluable toolkit for computational tasks such as solving a system of linear equations or finding the inverse of a square matrix. Because these tasks usually involve a large number of calculations, the geometric interpretation is often underappreciated. Linear algebra is a genuine oddity in that this entire discipline can be understood from multiple, distinct perspectives, each with its own merits and its own way into this vast and elegant subject. A criminally overlooked perspective for new students of linear algebra is the one in which we think of matrices as a way of transforming vectors, hence offering us a well-developed toolkit for describing (linear) spatial geometry.

Vectors are special types of matrices which can be split into two categories: row vectors and column vectors. A row vector is a matrix of order \(1\times n\), which has \(1\) row and \(n\) columns, whereas a column vector is a matrix of order \(m\times1\), which has \(m\) rows and \(1\) column. Although convention differs from source to source, it is arguably most sensible to think only in terms of column vectors. There are two reasons for this: column vectors are used more often and can be related to row vectors by transposition if needed; and column vectors can be combined with matrices under matrix multiplication with the matrix on the left (which is also simply a matter of convention, but for some reason it just feels better).

\(a=\begin{pmatrix}3\\2\end{pmatrix},\;b=\begin{pmatrix}-5\\2\end{pmatrix}\)

The matrix multiplication \(AB\) is well defined so long as \(A\) has order \(m\times n\) and \(B\) has order \(n\times p\). The resulting matrix is of order \(m\times p\).

Suppose now that we were to consider the matrix products \(a'=Ma\) and \(b'=Mb\).

\(M=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}7\\3\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-1\\11\end{pmatrix}\)

\(M=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}4\\5\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-12\\-3\end{pmatrix}\)
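A short NumPy sketch (added for illustration) reproduces these products and shows the "matrix as transformation" reading: the same vectors \(a\) and \(b\) are sent to different images by the two different matrices \(M\).

```python
import numpy as np

a = np.array([3, 2])
b = np.array([-5, 2])

for M in (np.array([[1, 2], [-1, 3]]),
          np.array([[2, -1], [1, 1]])):
    # M @ v applies the linear transformation M to the column vector v
    print(M @ a, M @ b)
# [7 3] [-1 11]
# [4 5] [-12 -3]
```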


Eigenvalues and Eigenvectors

The prefix eigen- is adopted from the German word eigen (cognate with the English word own) for 'proper', 'characteristic', 'own'.

\(A \vec{v} =\lambda \vec{v} \)

Notice that we multiply a square matrix by a vector and get the same result as when we multiply a scalar (just a number) by that vector.

We start by finding the eigenvalue. We know this equation must be true:

\(A \vec{v} =\lambda \vec{v} \)

Next we put in an identity matrix so we are dealing with matrix-vs-matrix:

\(A \vec{v} =\lambda {I} \vec{v} \)

Bring all to left hand side:

\(A \vec{v} -\lambda I \vec{v} = 0\)

If v is non-zero then we can (hopefully) solve for λ using just the determinant:

\(det(A - \lambda I) = 0\)

\(| A − \lambda I | = 0\)

Trace and determinant of A

The matrix A has n eigenvalues (including each according to its multiplicity). The sum of the n eigenvalues of A is the same as the trace of A (that is, the sum of the diagonal elements of A). The product of the n eigenvalues of A is the same as the determinant of A.
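These trace and determinant identities are easy to verify numerically. A minimal NumPy sketch (the matrix here is an arbitrary example, not from the notes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)

# Sum of eigenvalues = trace, product of eigenvalues = determinant
print(np.isclose(eigvals.sum(),  np.trace(A)))        # True
print(np.isclose(eigvals.prod(), np.linalg.det(A)))   # True

# Each eigenpair satisfies A v = lambda v
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))                # True
```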



Perpendicular Vector

While both involve multiplying the magnitudes of two vectors, the dot product results in a scalar quantity, which indicates magnitude but not direction, while the cross product results in a vector, which indicates magnitude and direction.


Inner product (Dot product)

Since the inner product generalizes the dot product, it is reasonable to say that two vectors are “orthogonal” (or “perpendicular”) if their inner product is zero. With this definition, \(\sin x\) and \(\cos x\) are orthogonal on the interval \([-\pi,\pi]\), since \(\int_{-\pi}^{\pi}\sin x\cos x\,dx=0\).

\(\overset\rightharpoonup A\cdot\overset\rightharpoonup B=\sum_{i=1}^n\;a_ib_i=a_1b_1+a_2b_2+\cdots+a_nb_n\)
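A small sketch (not part of the original text) evaluates both flavours of inner product: the ordinary dot product of two coordinate vectors, and a discretized version of \(\int_{-\pi}^{\pi}\sin x\cos x\,dx\), which comes out essentially zero, matching the orthogonality claim. The sample vectors are arbitrary.

```python
import numpy as np

# Ordinary dot product of coordinate vectors
a = np.array([1, 2, 3])
b = np.array([4, -5, 2])
print(np.dot(a, b))        # 4 - 10 + 6 = 0, so a and b are orthogonal

# Function inner product <f, g> = integral of f*g over [-pi, pi],
# approximated here by a simple Riemann sum
x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]
print(np.sum(np.sin(x) * np.cos(x)) * dx)   # ~0: sin and cos are orthogonal
```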


Outer product (Cross product)

A cross product of any two vectors yields a vector which is perpendicular to both vectors.

For two vectors \(\overrightarrow A\) and \(\overrightarrow B\), the vector \(\overrightarrow C\) perpendicular to both is given by

\(\overrightarrow C=\overrightarrow A\times\overrightarrow B=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\A_1&A_2&A_3\\B_1&B_2&B_3\end{vmatrix}\\=(A_2B_3-B_2A_3)\widehat i-(A_1B_3-B_1A_3)\widehat j+(A_1B_2-B_1A_2)\widehat k\)

Inserting given vectors we obtain

\(\overrightarrow C=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\2&3&4\\1&2&3\end{vmatrix}\)

\(={(3\times3-2\times4)}\widehat i-{(2\times3-1\times4)}\widehat j+{(2\times2-1\times3)}\widehat k\)

\(=\widehat i-2\widehat j+\widehat k\)
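The same computation in NumPy (a quick check, not part of the original notes); the dot products confirm that the result is perpendicular to both inputs.

```python
import numpy as np

A = np.array([2, 3, 4])
B = np.array([1, 2, 3])

C = np.cross(A, B)
print(C)                            # [ 1 -2  1]
print(np.dot(C, A), np.dot(C, B))   # 0 0  -> C is perpendicular to A and B
```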


Cross product trick

\( \vec{A} \times \vec{B}=\begin{vmatrix} \hat{i}&\hat{j}&\hat{k} \\ A_x & A_y & A_z \\ B_x&B_y&B_z \end{vmatrix} \)

\( \vec{A}\times\vec{B}=(A_y B_z- A_z B_y)\hat{i}+(A_z B_x - A_x B_z) \hat{j}+(A_x B_y- A_y B_x) \hat{k} \nonumber \)



Matrix multiplication

There are two basic types of matrix multiplication: inner (dot) product and outer product. The inner product results in a matrix of reduced dimensions, the outer product results in one of expanded dimensions. A helpful mnemonic: Expand outward, contract inward.
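A tiny NumPy illustration of the mnemonic (added for clarity; the vectors are arbitrary): the inner product of two length-3 vectors contracts to a single scalar, while their outer product expands to a 3x3 matrix.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

print(np.inner(u, v))    # 32: contract inward to a scalar
print(np.outer(u, v))    # expand outward to a 3x3 matrix
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]
```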


Eigenvalue Decomposition


\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\]

\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,1}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-1}\end{bmatrix}\]
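The same decomposition can be checked numerically with NumPy (a sketch; np.linalg.eig returns unit-length eigenvectors, so its \(S\) differs from the hand-picked one above by column scaling, but \(S\Lambda S^{-1}\) still reproduces \(A\)).

```python
import numpy as np

A = np.array([[ 0.0,  1.0],
              [-2.0, -3.0]])

eigvals, S = np.linalg.eig(A)
Lam = np.diag(eigvals)

print(eigvals)                                      # [-1. -2.] (order may vary)
print(np.allclose(A @ S, S @ Lam))                  # A S = S Lambda
print(np.allclose(S @ Lam @ np.linalg.inv(S), A))   # A = S Lambda S^-1
```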



Example (A)

As a real example of this, consider a simple model of the weather. Let’s say it either is sunny or it rains. If it’s sunny, there’s a 90% chance that it remains sunny, and a 10% chance that it rains the next day. If it rains, there’s a 40% chance it rains the next day, but a 60% chance of becoming sunny the next day. We can describe the time evolution of this process through a matrix equation:

\(\begin{bmatrix}p_{s,n+1}\\p_{r,n+1}\end{bmatrix}=\begin{bmatrix}0.9&0.6\\0.1&0.4\end{bmatrix}\begin{bmatrix}p_{s,n}\\p_{r,n}\end{bmatrix}\)

where \(p_{s,n}\), \(p_{r,n}\) are the probabilities that it is sunny or that it rains on day \(n\). (If you are curious, this kind of process is called a Markov process, and the matrix that I have defined is called the transition matrix. This is, by the way, not a very good model of the weather.) One way to see that this matrix does provide the intended result is to see what results from matrix multiplication:

\(p_{s,n+1}=0.9p_{s,n}+0.6p_{r,n}\)
\(p_{r,n+1}=0.1p_{s,n}+0.4p_{r,n}\)

This is just the law of total probability! To get a sense of the long run probabilities of whether it rains or not, let’s look at the eigenvectors and eigenvalues.

\(\begin{bmatrix}0.986394\\0.164399\end{bmatrix}\) with eigenvalue 1, \(\begin{bmatrix}-0.707\\0.707\end{bmatrix}\) with eigenvalue 0.3

Now, this matrix ends up having an eigenvector of eigenvalue 1 (in fact, all valid Markov transition matrices do), while the other eigenvector is both unphysical (probabilities cannot be negative) and has an eigenvalue less than 1. The eigenvector for eigenvalue 1 is only determined up to scale; rescaling it so its entries sum to 1 gives \((6/7,\;1/7)\approx(0.857,\;0.143)\). Therefore, in the long run, we expect the probability of it being sunny to be about 0.857 and the probability of it being rainy to be about 0.143 in this model. This is known as the stationary distribution. Any component of the vector in the direction of the unphysical eigenvector will wither away in the long run (its eigenvalue is 0.3, and \(0.3^k\rightarrow0\) as \(k\rightarrow\infty\)), while the one with eigenvalue 1 will survive and dominate.
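A short NumPy check (illustrative) finds the eigenvalue-1 eigenvector, rescales it to sum to 1, and compares the result with simply running the chain for many days; both give the stationary distribution \((6/7,\;1/7)\).

```python
import numpy as np

T = np.array([[0.9, 0.6],
              [0.1, 0.4]])

eigvals, eigvecs = np.linalg.eig(T)
v = eigvecs[:, np.argmax(np.isclose(eigvals, 1.0))]   # eigenvector for eigenvalue 1
print(v / v.sum())                    # [0.857... 0.142...] = (6/7, 1/7)

# Same answer by iterating the chain from any starting distribution
p = np.array([0.5, 0.5])
for _ in range(50):
    p = T @ p
print(p)                              # converges to [0.857... 0.142...]
```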

Eigenspaces

The set of all eigenvectors of T corresponding to the same eigenvalue, together with the zero vector, is called an eigenspace, or the characteristic space of T associated with that eigenvalue. If a set of eigenvectors of T forms a basis of the domain of T, then this basis is called an eigenbasis.

Eigenvectors of Symmetric Matrices Are Orthogonal

Pivot (軸元)

The pivot or pivot element is the element of a matrix, an array, or another finite set that an algorithm (e.g. Gaussian elimination, quicksort, the simplex algorithm) selects first, in order to carry out certain calculations.



Markov chain

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain (DTMC). A continuous-time process is called a continuous-time Markov chain (CTMC). It is named after the Russian mathematician Andrey Markov.

Markov chains have many applications as statistical models of real-world processes, such as studying cruise control systems in motor vehicles, queues or lines of customers arriving at an airport, currency exchange rates and animal population dynamics.


Example (B)

The Linear Algebra View of the Fibonacci Sequence


The Fibonacci sequence first appears in the book Liber Abaci (The Book of Calculation, 1202) by Fibonacci where it is used to calculate the growth of rabbit populations.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ....

Each term in the Fibonacci sequence equals the sum of the previous two. That is, if we let \(F_n\) denote the \(n\)th term in the sequence, we can write

\(F_n=F_{n-1}+F_{n-2}\)

To make this into a linear system, we need at least one more equation. Let’s use an easy one:

\(F_{n-1}=F_{n-1}+\left(0\right)\times F_{n-2}\)

Putting these together, here’s our system of linear equations for the Fibonacci sequence:

\(F_n=F_{n-1}+F_{n-2}\)
\(F_{n-1}=F_{n-1}+0\)

We can write this in matrix notation as the following:

\(\begin{bmatrix}F_n\\F_{n-1}\end{bmatrix}=\begin{bmatrix}1&1\\1&0\end{bmatrix}\begin{bmatrix}F_{n-1}\\F_{n-2}\end{bmatrix}\)

To understand the long-run behaviour of the sequence, we find the eigenvalues of this matrix from its characteristic equation. Here’s how we do that:

\(\begin{vmatrix}1-\lambda&1\\1&-\lambda\end{vmatrix}=0\)

\((1-\lambda)(-\lambda)-1=0\)

\(\lambda^2-\lambda-1=0\)

Solving this quadratic, we get the following two solutions, which are our two eigenvalues:

\(\lambda_1=\frac{1+\sqrt5}2\)
\(\lambda_2=\frac{1-\sqrt5}2\)

The next step is to find the two eigenvectors that correspond to the eigenvalues.

\(\left(A-\lambda I\right)x=0\)

Now that we have the eigenvalues and eigenvectors, we can write the \(\mathrm S\) and \(\Lambda\) matrices as follows:

\(\mathrm S=\begin{bmatrix}\frac{1+\sqrt5}2&\frac{1-\sqrt5}2\\1&1\end{bmatrix}\)

\(\mathrm\Lambda=\begin{bmatrix}\frac{1+\sqrt5}2&0\\0&\frac{1-\sqrt5}2\end{bmatrix}\)

\(\mathrm S^{-1}=\frac1{\sqrt5}\begin{bmatrix}1&\frac{\sqrt5-1}2\\-1&\frac{\sqrt5+1}2\end{bmatrix}=\begin{bmatrix}\frac1{\sqrt5}&\frac{\sqrt5-1}{2\sqrt5}\\-\frac1{\sqrt5}&\frac{\sqrt5+1}{2\sqrt5}\end{bmatrix}\)


\(F_{38}=39088169\); what is \(F_{39}\)?
\(39088169\times\frac{1+\sqrt5}2=39088169\times1.61803398875\approx63245986\)
\(39088169\div\frac{1-\sqrt5}2=39088169\div(-0.61803398875)\approx-63245986\),
\(\left|-63245986\right|=63245986\)

\(F_{57}=365435296162\); what is \(F_{56}\)?
\(365435296162\div\frac{1+\sqrt5}2=365435296162\div1.61803398875\approx225851433717\)
\(365435296162\times\frac{1-\sqrt5}2=365435296162\times(-0.61803398875)\approx-225851433717\),
\(\left|-225851433717\right|=225851433717\)
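These golden-ratio shortcuts, and the matrix form itself, can be checked in plain Python (an illustrative sketch; exact integer arithmetic is used for the matrix power, floating point for the ratio estimates).

```python
# The n-th power of [[1,1],[1,0]] holds F(n+1), F(n), F(n-1).
def fib_matrix(n):
    a, b, c, d = 1, 1, 1, 0          # the matrix [[1,1],[1,0]]
    x, y, z, w = 1, 0, 0, 1          # start from the identity
    for _ in range(n):               # repeated multiplication, exact integers
        x, y, z, w = x*a + y*c, x*b + y*d, z*a + w*c, z*b + w*d
    return x, y, w                   # F(n+1), F(n), F(n-1)

print(fib_matrix(38)[0])                 # F(39) = 63245986

phi = (1 + 5 ** 0.5) / 2                 # dominant eigenvalue
psi = (1 - 5 ** 0.5) / 2                 # the other eigenvalue
print(round(39088169 * phi))             # next term:     63245986
print(abs(round(365435296162 * psi)))    # previous term: 225851433717
```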



Every beehive has one female queen bee who lays all the eggs. If an egg is fertilized by a male bee, then the egg produces a female worker bee, who doesn’t lay any eggs herself. If an egg is not fertilized it eventually hatches into a male bee, called a drone.


The Fibonacci sequence also occurs in real populations; honeybees provide an example. A look at the pedigree (family tree) of a honeybee suggests that the ancestry counts follow the Fibonacci numbers.

In a colony of honeybees there is one distinctive female called the queen. The other females are worker bees who, unlike the queen bee, produce no eggs. Male bees do no work and are called drone bees. When the queen lays eggs, each egg may be either fertilised or unfertilised. An unfertilised egg results in a drone (male bee), which carries 100% of its genome from the queen, while a fertilised egg results in a female worker bee, which carries 50% of her genome from the queen.

Males are produced by the queen’s unfertilised eggs, so male bees only have a mother but no father. All the females are produced when the queen has mated with a male and so have two parents. Females usually end up as worker bees but some are fed with a special substance called royal jelly which makes them grow into queens ready to go off to start a new colony when the bees form a swarm and leave their home (a hive) in search of a place to build a new nest.

Vector and Linear combination

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix},\;column\;vector\)

Vector Addition

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix}\;and\;\underline w=\begin{bmatrix}w_1\\w_2\end{bmatrix}\;\Rightarrow\underline v+\underline w=\begin{bmatrix}v_1+w_1\\v_2+w_2\end{bmatrix}\)

Scalar Multiplication

\(2\underline v=\begin{bmatrix}2v_1\\2v_2\end{bmatrix}\;\Rightarrow\;c\underline v=\begin{bmatrix}cv_1\\cv_2\end{bmatrix}\)

Linear Combination

\(c\underline v+d\underline w=\begin{bmatrix}cv_1+dw_1\\cv_2+dw_2\end{bmatrix},\;c,\;d\in\mathfrak R\)



derivative is linear

\(f(x)\rightarrow f'(x)\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot f'(x)\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot f'(x)+\beta\cdot g'(x)\)

integral is linear

\(f(x)\rightarrow\int_a^bf(x)dx\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot\int_a^bf(x)dx\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot\int_a^bf(x)dx+\beta\cdot\int_a^bg(x)dx\)

The Fourier Transform, Laplace transform, and Wavelet transforms are linear operators.


Linear Equations

\(x-2y=1\)

\(3x+2y=11\)

\(\begin{bmatrix}1&-2\\3&2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

\(A\underline x=\underline b\)

where \(A\) is the coefficient matrix

row picture:

solution: x=3, y=1

column picture:

\(x\begin{bmatrix}1\\3\end{bmatrix}+y\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

find the linear combination of the vectors on the left side that equals the vector on the right side.
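Both pictures can be confirmed with NumPy (a quick sketch, not part of the original notes): solve returns \((x,\;y)=(3,\;1)\), and those same coefficients combine the two columns into the right-hand side.

```python
import numpy as np

A = np.array([[1, -2],
              [3,  2]])
b = np.array([1, 11])

x, y = np.linalg.solve(A, b)
print(x, y)                         # 3.0 1.0

# Column picture: x*(first column) + y*(second column) = b
print(x * A[:, 0] + y * A[:, 1])    # [ 1. 11.]
```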







\(A\underline x=\underline b\)

row picture: three different planes intersecting at a point.

column picture: a combination of the three column vectors producing \(\underline b\).


Question: can we solve \(A\underline x=\underline b\) for every \(\underline b\) ?

Do the linear combinations of the columns fill the 3-dimensional space?

Yes for \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

No when the three column vectors lie in the same plane.


Gaussian Elimination

In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients. This method can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix.



Key ideas: pivots, forward elimination, and back substitution.



In general, \(\begin{bmatrix}A&\vert&\underline b\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}\;U&\vert&\underline c\;\end{bmatrix}\) upper triangular matrix

\(A\underline x=\underline b\Rightarrow U\underline x=\underline c\)

In order to solve for the unknowns, the pivots cannot be zero.

When would the process break down?

(i) Sometimes rows must be exchanged, after which elimination can continue: the nonsingular case.

(ii) Sometimes no exchange helps; the equations may then be solvable or unsolvable: the singular case.
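The forward elimination / back substitution idea can be written out in a few lines of NumPy. This is a minimal teaching sketch with partial pivoting (a row exchange whenever a larger pivot is available), not production code.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve A x = b by forward elimination and back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):                        # forward elimination
        p = k + np.argmax(np.abs(A[k:, k]))       # choose the largest pivot (row exchange)
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]                 # multiplier l_ik
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):                # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])
b = np.array([5, -2, 9])
print(gauss_solve(A, b))    # [1. 1. 2.]
```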


Triangular matrix

Two particularly important types of triangular matrices are termed upper triangular and lower triangular—these are matrices that have, respectively, all 0 elements below and above the diagonal:

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(L)}=\prod_{k=1}^na_{kk}\)

\(U=\left[\begin{array}{llll}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(U)}=\prod_{k=1}^na_{kk}\)

The determinant and permanent of a triangular matrix equal the product of the diagonal entries.



Inverse Matrices

identity matrix \(I_n\)

\(I=\left[ \begin{array}{c}1\end{array} \right] , \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] ,\left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] ,\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right]\)

\(\delta _{ij}=\left\{ \begin{array}{c} 1 \text{ if }i=j \\ 0\text{ if }i\neq j \end{array} \right.\nonumber\)

\(AI=IA=A\) for any \(n\times n\) matrix \(A\)

Def: An \(n\times n\) matrix \(A\) is invertible if there exists a matrix \(B\) such that \(BA=I\) and \(AB=I\).

\(B\) is called an inverse of \(A\).

Claim: Suppose \(A\) is invertible. Then its inverse is unique.

Proof:

Suppose \(A\) has two inverses \(B\) and \(C\)

Then \(BA=I\) and \(AC=I\)

We have \(B=BI=B(AC)=(BA)C=IC=C\)

\(B=IB=(CA)B=C(AB)=CI=C\)

Remark: The inverse of \(A\) is denoted as \(A^{-1}\)

Remark: The proof actually shows that if \(BA=AC=I\) then \(B=C\)

"left inverse" = "right inverse" = "inverse"

Claim: the inverse of \(A^{-1}\) is \(A\) itself

Proof: \(AA^{-1}=I\;and\;A^{-1}A=I\) since \(A^{-1}\) is the inverse of \(A\)

Therefore \(A\) is the inverse of \(A^{-1}\).

Claim: If \(A\) is invertible, then the one and only solution to \(A\underline x=\underline b\) is

\(\underline x=A^{-1}\underline b\)

Proof: \(A^{-1}A\underline x=A^{-1}\underline b\)

\(\Leftrightarrow I\underline x=A^{-1}\underline b\)

\(\Leftrightarrow\underline x=A^{-1}\underline b\)

Claim: Suppose there is a nonzero solution \(\underline x\;to\;A\underline x=\underline0\;then\;A\;cannot\;have\;an\;inverse\)

Proof: If \(A\) is invertible, \(\Rightarrow(A^{-1}A)\underline x=A^{-1}\underline0\)

\(\therefore\underline x=\underline0\)

Definition: A square \(n \times n\) matrix \(D\) is a Diagonal Matrix if all entries off the main diagonal are zero, that is for all \(i \neq j\), \(a_{ij} = 0\).

\[\begin{split}\boldsymbol D = \begin{bmatrix} d_1 & 0 & 0 & ... & 0 \\ 0 & d_2 & 0 & ... & 0 \\ 0 & 0 & d_3 & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & d_n \end{bmatrix}\end{split}\]

\[\begin{split}\boldsymbol D^{-1} = \begin{bmatrix} \frac{1}{d_1} & 0 & 0 & ... & 0 \\ 0 & \frac{1}{d_2} & 0 & ... & 0 \\ 0 & 0 & \frac{1}{d_3} & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & \frac{1}{d_n} \end{bmatrix}\end{split}\]

Claim: A diagonal matrix has an inverse provided no diagonal entries are zero.

Proof: The candidate inverse is the diagonal matrix whose diagonal entries are the reciprocals \(\frac1{d_i}\); multiplying the two matrices in either order gives \(I\).

Claim: If A and B are invertible matrices with same size, then so is (AB). \({(AB)}^{-1}=B^{-1}A^{-1}\)

Proof: \((B^{-1}A^{-1})(AB)=B^{-1}(A^{-1}A)B=B^{-1}IB=B^{-1}B=I\)

\((AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AIA^{-1}=AA^{-1}=I\)

Claim: \({(ABC)}^{-1}=C^{-1}B^{-1}A^{-1}\)


[Ex] \(E=\begin{bmatrix}1&0&0\\-4&1&0\\0&0&1\end{bmatrix},\;E^{-1}=\begin{bmatrix}1&0&0\\4&1&0\\0&0&1\end{bmatrix}\)

[Ex] Permutation Matrix

row exchange matrix \(P_{32}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix},\;P_{32}^{-1}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}=P_{32}\)
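These inverse rules are easy to sanity-check numerically (an illustrative sketch; the last two matrices are randomly generated, shifted to be safely invertible).

```python
import numpy as np

E = np.array([[1, 0, 0], [-4, 1, 0], [0, 0, 1]])
E_inv = np.array([[1, 0, 0], [4, 1, 0], [0, 0, 1]])
print(np.allclose(E @ E_inv, np.eye(3)))          # True

P32 = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])
print(np.allclose(P32 @ P32, np.eye(3)))          # True: P32 is its own inverse

# (AB)^-1 = B^-1 A^-1
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)
B = rng.standard_normal((3, 3)) + 3 * np.eye(3)
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # True
```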


What is Block Diagonal Matrix?

A matrix which is split into blocks is called a block matrix. A block diagonal matrix is a square block matrix whose main-diagonal blocks are square matrices and whose off-diagonal blocks are zero matrices: \(D_{ij}=0\) whenever \(i\neq j\).

An \(n\times n\) block diagonal matrix has the form (each \(a_{ii}\) may itself be a square block):

\(\begin{array}{l}\begin{bmatrix} a_{11} & 0 & 0 & . & . & 0\\ 0 & a_{22} & 0 & .& . &0 \\ 0& 0 & a_{33} & .& . & 0\\ .& . & . &. & .&. \\ .& . & . & . & . &. \\ 0& 0 & 0 &0 &0 & a_{nn} \end{bmatrix}\end{array} \)

Inverse of a Diagonal Matrix

The inverse of a diagonal matrix \(D\) is the diagonal matrix whose main-diagonal entries are the reciprocals of the corresponding entries of \(D\).

\(\begin{array}{l}Let\ D = \begin{bmatrix} a_{11} & 0& 0\\ 0 & a_{22} & 0\\ 0& 0 & a_{33} \end{bmatrix}\end{array} \)

Determinant of the above matrix:

\(\begin{array}{l}\left | D \right | = a_{11}\begin{vmatrix} a_{22} & 0\\ 0 & a_{33} \end{vmatrix}- 0 \begin{vmatrix} 0& 0\\ 0 & a_{33} \end{vmatrix} + 0 \begin{vmatrix} 0 & a_{22}\\ 0 & 0 \end{vmatrix}\end{array} \)

\(= a_{11}a_{22}a_{33}\)

\(\begin{array}{l}Adj D = \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(\begin{array}{l}D^{-1} = \frac{1}{\left | D \right |} adj D\end{array} \)

\(=\begin{array}{l}\frac{1}{a_{11}a_{22}a_{33}} \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(=\begin{array}{l}\begin{bmatrix} \frac{1}{a_{11}} &0 & 0\\ 0 & \frac{1}{a_{22}} &0 \\ 0& 0 & \frac{1}{a_{33}} \end{bmatrix}\end{array} \)

Anti Diagonal Matrix

A matrix is anti-diagonal if all of its entries are zero except (possibly) those on the diagonal running from the lower-left corner to the upper-right corner.

It is of the form:

\(\begin{array}{l}\begin{bmatrix} 0 & 0 &a_{13} \\ 0 & a_{22}&0 \\ a_{31}& 0 & 0 \end{bmatrix}\end{array} \)

Gauss-Jordan Elimination

Given \(A\), we want to find its inverse \(A^{-1}\)

\(AA^{-1}=I\)

\(\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=A^{-1}\)

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\underline{e_3}\end{bmatrix}\)

\(\underline{e_1}=\begin{bmatrix}1\\0\\0\end{bmatrix},\;\underline{e_2}=\begin{bmatrix}0\\1\\0\end{bmatrix},\;\underline{e_3}=\begin{bmatrix}0\\0\\1\end{bmatrix},\)

\(A\underline{x_1}=\underline{e_1}\)

\(A\underline{x_2}=\underline{e_2}\)

\(A\underline{x_3}=\underline{e_3}\)

\(A^{-1}\cdot\begin{bmatrix}A&I\end{bmatrix}=\begin{bmatrix}A^{-1}A&A^{-1}I\end{bmatrix}=\begin{bmatrix}I&A^{-1}\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\-1&2&-1&0&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

The steps above are the Gauss (forward elimination) phase.

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}2&0&0&\frac32&1&\frac12\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0&\frac34&\frac12&\frac14\\0&1&0&\frac12&1&\frac12\\0&0&1&\frac14&\frac12&\frac34\end{bmatrix}\)

The steps above are the Jordan (backward elimination) phase.

\(\therefore\underline{x_1}=\begin{bmatrix}\frac34\\\frac12\\\frac14\end{bmatrix},\;\underline{x_2}=\begin{bmatrix}\frac12\\1\\\frac12\end{bmatrix},\;\underline{x_3}=\begin{bmatrix}\frac14\\\frac12\\\frac34\end{bmatrix}\)

\(A^{-1}=\begin{bmatrix}\frac34&\frac12&\frac14\\\frac12&1&\frac12\\\frac14&\frac12&\frac34\end{bmatrix}\)
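The same \(\lbrack\;A\;\vert\;I\;\rbrack\rightarrow\lbrack\;I\;\vert\;A^{-1}\;\rbrack\) procedure can be scripted directly. A minimal NumPy sketch (no row exchanges are needed for this particular matrix, matching the hand computation above):

```python
import numpy as np

A = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])
n = A.shape[0]
M = np.hstack([A, np.eye(n)])          # the augmented matrix [A | I]

for k in range(n):
    M[k] /= M[k, k]                    # make the pivot 1
    for i in range(n):
        if i != k:
            M[i] -= M[i, k] * M[k]     # clear the rest of column k

A_inv = M[:, n:]
print(A_inv)                                   # [[0.75 0.5 0.25] [0.5 1. 0.5] [0.25 0.5 0.75]]
print(np.allclose(A_inv, np.linalg.inv(A)))    # True
```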


In general, start from \(\lbrack\;X\;\vert\;I\;\rbrack\).

Each elementary row operation is a left multiplication by some matrix \(A_i\):

\(\lbrack\;A_1X\;\vert\;A_1\;\rbrack\)

\(\lbrack\;A_n\cdots A_1X\;\vert\;A_n\cdots A_1\;\rbrack\)

Let’s use \(Y\;=\;A_n\cdots A_1\) for convenience. So we have \(\lbrack\;YX\;\vert\;Y\;\rbrack\), and when the left half has been reduced to the identity,

\(Y\;X\;=\;I\)

That can only mean that \(Y\;=\;X^{-1}\)


Def: An \(n\times n\) matrix is nonsingular if it has a full set of \(n\) (nonzero) pivots.

Claim: A matrix is invertible if and only if (iff) it is nonsingular.

[Proof]

Suppose A is nonsingular, i.e. it has a full set of n pivots.

Then by Gauss-Jordan elimination, we can find a matrix B such that \(AB=I\).

\(since\;A\underline{x_i}=\underline{e_i}\;is\;solvable\;for\;all\;i=1,\;2,\;...,\;n\)

On the other hand, Gauss-Jordan elimination is really a sequence of multiplications by elementary matrices on the left.

\((D^{-1}..E..P..E)A=I\)

\(where\;E_{ij}:\;to\;subtract\;a\;multiple\;l_{ij}\;of\;row\;j\;from\;row\;i\)

\(P_{ij}:\;to\;exchange\;rows\;i\;and\;j\)

\(D^{-1}:\;to\;divide\;all\;rows\;by\;their\;pivots\)

\(That\;is,\;there\;is\;a\;matrix\;C\;such\;that\;C=(D^{-1}..E..P..E),\;CA=I\)

\(Therefore,\;B=C=A^{-1}\;and\;A\;is\;invertible\)

If \(A\) does not have \(n\) pivots, elimination will lead to an all-zero row.

That is, there is an invertible \(M\) (the product of the elimination steps) such that a row of \(MA\) is zero.

If \(AC=I\) were possible, then \(MAC=M\).

But \(MA\) has a zero row, so \(MAC=M\) would also have a zero row.

Hence \(M\) would have a zero row, which is a contradiction since \(M\) is invertible.

Therefore \(A\) is not invertible.


From another viewpoint:

\(\lbrack A\;I\rbrack\Rightarrow\lbrack I\;A^{-1}\rbrack\) by performing elementary row operations

\(A^{-1}\lbrack A\;I\rbrack=\lbrack I\;A^{-1}\rbrack\)


Elimination=Factorization \(A=LU\)

[Ex] \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\-2&7&2\end{bmatrix}\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&8&3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\), an upper triangular matrix

\(E_{21}=\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}=\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\)

\(E_{32}=\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\)

\(E_{21}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\)

\(E_{32}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\l_{21}&1&0\\0&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\l_{31}&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\0&l_{32}&1\end{bmatrix}\)

Inverse \(\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\0&-l_{32}&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\)

\({(E_{32}E_{31}E_{21})}^{-1}(E_{32}E_{31}E_{21})A={(E_{32}E_{31}E_{21})}^{-1}U\)

\(A=(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1})U\)

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\\boxed2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\\boxed{-1}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&\boxed{-1}&1\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&0\\\boxed2&1&0\\\boxed{-1}&\boxed{-1}&1\end{bmatrix}\)

In general,

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&-l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&-l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\-l_{21}&1&0\\-l_{31}&-l_{32}&1\end{bmatrix}\)=L, lower triangular matrix

\(Note:\;E_{32}E_{31}E_{21}\neq\begin{bmatrix}1&0&0\\l_{21}&1&0\\l_{31}&l_{32}&1\end{bmatrix}\)

\(A=LU\) if no row exchanges are required

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}\boxed2&1&1\\0&\boxed{-8}&-2\\0&0&\boxed1\end{bmatrix}\Rightarrow2,\;-8,\;1\;are\;pivots\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}=\begin{bmatrix}\ell_{11}&0&0\\\ell_{21}&\ell_{22}&0\\\ell_{31}&\ell_{32}&\ell_{33}\end{bmatrix}\begin{bmatrix}u_{11}&u_{12}&u_{13}\\0&u_{22}&u_{23}\\0&0&u_{33}\end{bmatrix}.\)


We can further split U into

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}=\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(\therefore A=LDU\) if no row exchanges are required.

Let A be a square matrix. When a factorization A = LDU exists where
1. L is a lower triangular matrix with all main diagonal entries 1, (which records the steps of elimination.)
2. U is an upper triangular matrix with all diagonal entries 1, and
3. D is a diagonal matrix with all main diagonal entries nonzero, (with pivots on the diagonal.)
then the factorization is unique.

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(=LDU\)
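A quick numerical confirmation of this \(A=LDU\) factorization (a sketch only; note that library routines such as scipy.linalg.lu use partial pivoting, so their factors generally differ from the hand elimination here):

```python
import numpy as np

A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])

L = np.array([[1, 0, 0], [2, 1, 0], [-1, -1, 1]])
D = np.diag([2, -8, 1])
U = np.array([[1, 0.5, 0.5], [0, 1, 0.25], [0, 0, 1]])

print(np.allclose(L @ D @ U, A))    # True: A = L D U
print(np.allclose(D @ U, np.array([[2, 1, 1], [0, -8, -2], [0, 0, 1]])))  # D U is the earlier U
```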


Claim:

\(If\;A=L_1D_1U_1\;and\;A=L_2D_2U_2\)

\(Then\;L_1=L_2,\;D_1=D_2,\;U_1=U_2\) (whenever the factorization exists).


Uniqueness Proof:

Assume \(LDU = L_1D_1U_1\). Objective: Prove \(X = X_1\) for each \(X = L, D, U\).

In keeping with the hint, \(\color{green}{L_1^{-1}}LDU\color{#D555D1}{U^{-1}} = \color{green}{L_1^{-1}}L_1D_1U_1\color{#D555D1}{U^{-1}} \iff \color{green}{L_1^{-1}}LD = D_1U_1\color{#D555D1}{U^{-1}}\) \(\Longrightarrow (\text{Lower triangular})D = D_1(\text{Upper triangular}).\)

\(\color{red}{\bigstar} \) So both sides are diagonal. \( \; L,U,L_1,U_1\) have diagonal \(1\)’s so \(D = D_1\). Then \(\color{green}{L_1^{-1}}L = I\) and \(U_1\color{#D555D1}{U^{-1}} = I. \blacksquare\)

First notice that \(L_1^{-1}L\) is a lower triangular matrix and has diagonal 1's, since both \(L_1^{-1}\) and \(L\) are lower triangular and have diagonal 1's.

Nonzero entries below the diagonal of \(L_1^{-1}L\) will cause nonzero entries below the diagonal of \(L_1^{-1}LD\) (since \(D\) is also diagonal). And if that happens, \(L_1^{-1}LD\) cannot equal to \(D_1U_1U^{-1}\), thus entries below the diagonal of \(L_1^{-1}L\) are all 0's.

Now we can conclude \(L_1^{-1}L=I\). Similarly, \(U_1U^{-1}=I\).




LU decomposition

Solving linear systems:

LU decomposition, short for Lower-Upper decomposition, is a matrix factorization technique used to break down a square matrix into the product of a lower triangular matrix (L) and an upper triangular matrix (U). It is commonly employed to simplify solving systems of linear equations and calculating determinants. Solving \(A\underline x=\underline b\) can then be done by first solving a system with the lower triangular matrix L and then solving a system with the upper triangular matrix U.


One Square System = Two Triangular Systems

\(A\underline x=\underline b\)

Suppose elimination requires no row exchanges

\(A=LU\)

\(\Rightarrow LU\underline x=\underline b\)

\(U\underline x=L^{-1}\underline b=\underline c\)

We have \(U\underline x=\underline c\;where\;L\underline c=\underline b\)

1. Factor A=LU by Gaussian elimination.

2. Solve \(\underline c\;from\;L\underline c=\underline b\) (forward elimination)

and then solve \(U\underline x=\underline c\) (back substitution)


[Ex]

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(A\underline x=\underline b\)

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

A=LU

\(\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(L\underline c=\underline b\)

\(\underline c=\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(U\underline x=\underline c\)

\(\underline x=\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}1\\1\\2\end{bmatrix}\)
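With the factors in hand, the two triangular solves can be carried out with scipy.linalg.solve_triangular (an illustrative sketch reproducing the numbers above).

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1., 0., 0.], [2., 1., 0.], [-1., -1., 1.]])
U = np.array([[2., 1., 1.], [0., -8., -2.], [0., 0., 1.]])
b = np.array([5., -2., 9.])

c = solve_triangular(L, b, lower=True)     # forward:  L c = b
x = solve_triangular(U, c, lower=False)    # backward: U x = c
print(c)    # [  5. -12.   2.]
print(x)    # [1. 1. 2.]
```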



Complexity of Elimination

1. solve \(A\underline x=\underline b\)

\(A=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]=LU\)

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right],\;U=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

[Elimination]: the 1st stage requires about \(n(n-1)\approx n^2\) operations (counting multiplications and additions).

2nd stage \((n-1)(n-2)\approx{(n-1)}^2\)

3rd stage \(\approx{(n-2)}^2\)

\(n^2+{(n-1)}^2+...+1^2=\frac13n(n+\frac12)(n+1)\)

\(\approx\frac13n^3\)


\(L\underline c=\underline b\)

\(\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix}\)

\((n-1)+(n-2)+...+1=\frac12n(n-1)\approx\frac12n^2\)


\(U\underline x=\underline c\)

\(\left[\begin{array}{lccc}p_1&a_{12}&\cdots&a_{1n}\\0&p_2&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&p_n\end{array}\right]\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(1+2+3+...+n=\frac12n(n+1)\approx\frac12n^2\)

\(\#=\frac12n^2+\frac12n^2=n^2\ll\frac13n^3\)

\(Total\;\#=\frac13n^3\)

The counts above are for solving with a single right-hand side (one column) only.


Compute \(A^{-1}\)

As \(A^{-1}\) is an \(n\times n\) matrix, we need to solve \(n\) column systems; do not confuse this complexity with the single-column count of the previous section.

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\cdots&\underline{x_n}\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\cdots&\underline{e_n}\end{bmatrix}\)

\(where\;\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\Rightarrow i_{th}\;component\;=\;1\)

solve \(A\underline x=\underline b,\;\#=\frac13n^3\)

\(L\underline{c_i}=\;\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\)

\(\#=\frac12{(n-i+1)}^2\)

\(Since\;1\;\leq i\;\leq\;n,\;\#=\frac{n^2}2+\frac{{(n-1)}^2}2+...+\frac{2^2}2+\frac{1^2}2\approx\frac{n^3}6\)


\(U\underline{x_i}=\underline{c_i}\)

\(\#=\frac12n^2\)

\(Since\;1\;\leq i\;\leq\;n,\;\#=n\cdot\frac{n^2}2=\frac{n^3}2\)

\(Total\;\#=\frac{n^3}3+\frac{n^3}6+\frac{n^3}2=n^3\)

\(compared\;with\;\#\;for\;A^2=A\cdot A\Rightarrow n\cdot n^2=n^3\)



LU decompositions are NOT unique

\(A=LU=\begin{bmatrix}l_{11}&0&0\\l_{21}&l_{22}&0\\l_{31}&l_{32}&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&0&0\\0&l_{22}&0\\0&0&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&l_{11}u_{12}&l_{11}u_{13}\\0&l_{22}&l_{22}u_{23}\\0&0&l_{33}\end{bmatrix}\)

LDU decomposition

If a square, invertible matrix has an LDU (factorization with all diagonal entries of L and U equal to 1), then the factorization is unique.

Transposes and Permutations

\(A=\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix},\;A^T=\begin{bmatrix}1&4&7\\2&5&8\\3&6&9\end{bmatrix}\;\rightarrow\;transpose\;of\;A\)

\(A=\begin{bmatrix}1&2&3\\0&0&4\end{bmatrix},\;A^T=\begin{bmatrix}1&0\\2&0\\3&4\end{bmatrix}\)

\({(A^T)}_{ij}=A_{ji}\)

\(claim:\;{(A+B)}^T=A^T+B^T\)

\(claim:\;{(AB)}^T=B^TA^T,\;A:n\times m,\;B:m\times l\)

Proof: \({\lbrack{(AB)}^T\rbrack}_{ij}={(AB)}_{ji}=\overset m{\underset{k=1}\sum}A_{jk}B_{ki}=\sum_{k=1}^m{(A^T)}_{kj}{(B^T)}_{ik}\)

\(=\sum_{k=1}^m{(B^T)}_{ik}{(A^T)}_{kj}={(B^TA^T)}_{ij}\)

Remark \({(ABC)}^T=C^TB^TA^T\)

\(claim:\;{(A^{-1})}^T={(A^T)}^{-1}\)

Proof: \(AA^{-1}=I\Rightarrow{(A^{-1})}^TA^T={(AA^{-1})}^T=I^T=I\)

\(A^{-1}A=I\Rightarrow A^T{(A^{-1})}^T={(A^{-1}A)}^T=I^T=I\)

\(\Rightarrow{(A^{-1})}^T={(A^T)}^{-1}\)

Def \(An\;n\times n\;matrix\;A\;is\;symmetric\;if\;A^T=A\)

Remark \(A_{ij}=A_{ji}\;if\;A\;is\;symmetric\)

\(A=\begin{bmatrix}1&2\\2&5\end{bmatrix}=A^T,\;D=\begin{bmatrix}1&0\\0&10\end{bmatrix}=D^T\)

\(claim:\;Given\;any\;matrix\;R_{n\times m},\;R_{m\times n}^TR_{n\times m}\;is\;symmetric\)

Proof: \({(R^TR)}^T=R^T{(R^T)}^T=R^TR\)

Remark \(RR^T\;is\;also\;symmetric\)

\(A=\begin{bmatrix}1&2\\2&7\end{bmatrix}\rightarrow symmetric\)

\(\begin{bmatrix}1&0\\-2&1\end{bmatrix}\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&0\\0&3\end{bmatrix}\begin{bmatrix}1&2\\0&1\end{bmatrix}\)

\(A=LDU\Rightarrow LDL^T\;as\;U=L^T\)

\(claim:\;If\;a\;symmetric\;matrix\;is\;factored\;into\;LDU\;with\;no\;row\;exchanges,\;then\;U=L^T\)

\(A=LDL^T\)

Proof: \(A=LDU\Rightarrow A^T=U^TD^TL^T=U^TDL^T\)

\(Since\;A\;is\;symmetric,\;A=LDU=U^TDL^T\)

\(Recall\;that\;this\;factorization\;is\;unique.\;We\;then\;have\;U=L^T\)


Permutation Matrix

Def: A permutation matrix has the rows of the Identity I in any order.

There are six 3x3 permutation matrices.



\(\therefore\;There\;are\;n!\;permutation\;matrices\;of\;order\;n\)

\(claim:\;If\;P\;is\;a\;permutation\;matrix,\;then\;P^{-1}=P^T\)

Proof: Row \(i\) of \(P\) is a standard basis row vector \(\begin{bmatrix}0&\cdots&1&\cdots&0\end{bmatrix}\) (a single 1, the rest 0), and column \(i\) of \(P^T\) is that same vector.

\({(PP^T)}_{ii}=1\;for\;all\;i\)

\({(PP^T)}_{ij}=0\;for\;all\;i\neq j\)

\(\therefore PP^T=I\)

similarly, \(P^TP=I\)


\(Recall\;if\;no\;row\;exchanges\;are\;required,\;then\;A=LU\)

\(If\;row\;exchanges\;are\;needed,\;we\;then\;have\)

\((\cdots P_1\cdots E_2\cdots P_3\cdots E_4)A=U\)

\(\Rightarrow A=(\cdots E_4^{-1}\cdots P_3^{-1}\cdots E_2^{-1}\cdots P_1^{-1})U\)

If row exchanges are needed during elimination, we can do them in advance.

The product \(PA\) will put the rows in the right order so that no row exchanges are needed for \(PA\),

Hence \(PA=LU\)

\(\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&3&7\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\2&3&1\end{bmatrix}\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(PA=LU\)

If we hold row exchanges until after elimination, we then have \(A=LPU\)


Elimination with only the necessary row exchanges will produce the triangular factorization \(A=LPU\), with the (unique) permutation P in the middle. The entries in L are reordered in comparison with the more familiar \(A=PLU\) (where P is not unique).
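For comparison, scipy.linalg.lu follows the \(A=PLU\) convention with partial pivoting, so its permutation generally differs from the single exchange used in the hand computation above. A small sketch:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 1., 1.],
              [1., 2., 1.],
              [2., 7., 9.]])

P, L, U = lu(A)                      # A = P @ L @ U (partial pivoting)
print(P)                             # a permutation matrix (not unique in general)
print(np.allclose(P @ L @ U, A))     # True
```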


[Ex]

\(A=\begin{bmatrix}2&1&1\\2&1&-1\\1&3&1\end{bmatrix}\)

\(P_{23}A=\begin{bmatrix}2&1&1\\1&3&1\\2&1&-1\end{bmatrix}\)

\(P_{23}A=LU=\begin{bmatrix}1&0&0\\\frac12&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=P_{23}^{-1}LU\)


\(E_1A=\begin{bmatrix}1&0&0\\-1&1&0\\-\frac12&0&1\end{bmatrix}A=\begin{bmatrix}2&1&1\\0&0&-2\\0&\frac52&\frac12\end{bmatrix}\)

\(P_{23}E_1A=\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}=U'\)

\(E_1A=P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=E_1^{-1}P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\1&1&0\\\frac12&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=LPU\)

Vector Space and Subspaces

To have a vector space, the following eight axioms must be satisfied for every u, v and w in V, and all a and b in F.

Associativity of vector addition: u + (v + w) = (u + v) + w
Commutativity of vector addition: u + v = v + u
Identity element of vector addition: there exists a "unique" element 0 ∈ V, called the zero vector, such that v + 0 = v for all v ∈ V.
Inverse elements of vector addition: for every v ∈ V, there exists a "unique" element −v ∈ V, called the additive inverse of v, such that v + (−v) = 0.
Compatibility of scalar multiplication with field multiplication: a(bv) = (ab)v
Identity element of scalar multiplication: 1v = v, where 1 denotes the multiplicative identity in F.
Distributivity of scalar multiplication with respect to vector addition: a(u + v) = au + av
Distributivity of scalar multiplication with respect to field addition: (a + b)v = av + bv

\(\underline A=(a_1,\;a_2,\;...,\;a_n)\)

\(\underline B=(b_1,\;b_2,\;...,\;b_n)\)

\(\underline A+\underline B=(a_1+b_1,\;a_2+b_2,\;...,\;a_n+b_n)\)

Vector Space: V: a set of vectors

Two operations: vector addition & scalar multiplication.

\(\underline v\;\in\;V,\;\underline w\;\in\;V\;\Rightarrow\underline v+\underline w\;\in\;V\)

\(\underline v\;\in\;V\;\Rightarrow c\underline v\;\in\;V,\;c:field\)


Remark: \(0\cdot\underline v=\underline0\)


Remark: \(z=\{\underline0\}\)

\(\underline0+\underline0=\underline0\)

\(c\cdot\underline0=\underline0\)

\(z\;is\;a\;vector\;space.\)


Remark: \((-1)\underline v=-\underline v\)


Subspace, Column space, Null space

A subspace of \(\mathbb{R}^n\) is a subset \(V\) of \(\mathbb{R}^n\) satisfying:

  1. Non-emptiness: The zero vector is in \(V\).
  2. Closure under addition: If \(u\) and \(v\) are in \(V\text{,}\) then \(u+v\) is also in \(V\).
  3. Closure under scalar multiplication: If \(v\) is in \(V\) and \(c\) is in \(\mathbb{R}\text{,}\) then \(cv\) is also in \(V\).

Let \( U \subset V \) be a subset of a vector space \(V\) over \(F\). Then \(U\) is a subspace of \(V\) if and only if the following three conditions hold.

  1. additive identity: \( 0 \in U \);
  2. closure under addition: \(u, v \in U \Rightarrow u + v \in U\);
  3. closure under scalar multiplication: \( a \in \mathbb{F}, ~u \in U \implies au \in U \).

Spans are Subspaces and Subspaces are Spans

If \(v_1,v_2,\ldots,v_p\) are any vectors in \(\mathbb{R}^n\text{,}\) then \(\text{Span}\{v_1,v_2,\ldots,v_p\}\) is a subspace of \(\mathbb{R}^n\). Moreover, any subspace of \(\mathbb{R}^n\) can be written as a span of a set of \(p\) linearly independent vectors in \(\mathbb{R}^n\) for \(p\leq n\).


Column Space and Null Space

Let \(A\) be an \(m\times n\) matrix.

  • The column space of \(A\) is the subspace of \(\mathbb{R}^m\) spanned by the columns of \(A\). It is written \(\text{Col}(A)\).
  • The null space of \(A\) is the subspace of \(\mathbb{R}^n\) consisting of all solutions of the homogeneous equation \(Ax=0\text{:}\)

    \[ \text{Nul}(A) = \bigl\{ x \text{ in } \mathbb{R}^n\bigm| Ax=0 \bigr\}. \nonumber \]

[Ex]

\(M=\{\begin{bmatrix}a&b\\c&d\end{bmatrix}:\;a,\;b,\;c,\;d\;\in\mathfrak R\}\)

\(U=\{\begin{bmatrix}a&b\\0&d\end{bmatrix}:\;a,\;b,\;d\;\in\mathfrak R\}\)

\(If\;A,\;B\;\in U,\;A+B\;\in U\)

\(If\;A\in U,\;cA\in U\)

\(\therefore U\;is\;a\;subspace\;of\;M\)



Column Space

The column space of a matrix is the span, or all possible linear combinations, of its columns.

definition of the column space

\(C(A)=column\;space\)

\(C(A)\;is\;the\;span\;of\;the\;columns\;\{v_1,\;v_2,\;...,\;v_n\}\;of\;A\)

\(\underline b\in C(A),\;\underline b=x_1v_1+x_2v_2+...+x_nv_n\)

\(The\;system\;A\underline x=\underline b\;is\;solvable\;iff\;\underline b\in C(A)\)


claim \(If\;A\;is\;an\;m\times n\;real\;matrix,\;C(A)\;is\;a\;subspace\;of\;\mathfrak R^m\)


S = set of vectors in a vector space V (probably NOT a subspace)

SS = the set of all linear combinations of vectors in S

We call SS the "span" of S

Then SS is a subspace of V, called the subspace "spanned" by S

NullSpace

As an example, suppose we want a linear map \(A\) from \(V=\mathbb{F}^5\) to \(W=\mathbb{F}^4\) whose null space has dimension 3. The rank-nullity theorem (stated below) says

\(\mathrm{rk}(A)+\mathrm{nul}(A)=\dim(V)\Leftrightarrow{}\\\mathrm{rk}(A)+3=5{}\Leftrightarrow\\\mathrm{rk}(A)=2\)

\( A= \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \)

Take \(A=[\mathbf{e_1},\mathbf{e_2},0,0,0]\) (its columns are \(\mathbf{e_1},\mathbf{e_2}\in\mathbb{F}^4\) and three zero columns).

Let \(\mathbf{v}=(v_1,...,v_5) \in\mathbb F^5\). Then \(A \mathbf{v} = (v_1,v_2,0,0)\). The kernel of \(A\) is the subspace of \(\mathbb F^5\) spanned by \(\mathbf{e}_3,\mathbf{e}_4,\mathbf{e}_5\),

so it has dimension 3. Alternatively, we can see that \(\dim \mathbb F^5- \operatorname{rank} A=\dim (\ker A)\), and the rank is exactly 2.



The null space of an \(m\)-by-\(n\) matrix \(A\) is the collection of those vectors in \(\mathbb{R}^{n}\) that \(A\) maps to the zero vector in \(\mathbb{R}^m\). More precisely,

\[\mathcal{N}(A) = \{x \in \mathbb{R}^n | Ax = 0\} \nonumber\]


As an example, we examine the matrix \(A\)

\[A = \begin{pmatrix} {0}&{1}&{0}&{0}\\ {-1}&{0}&{1}&{0}\\ {0}&{0}&{0}&{1} \end{pmatrix} \nonumber\]

It is fairly easy to see that the null space of this matrix is:

\[\mathcal{N}(A) = \{t \begin{pmatrix} {1}\\ {0}\\ {1}\\ {0} \end{pmatrix} | t \in \mathbb{R}\} \nonumber\]

This is a line in \(\mathbb{R}^{4}\)



In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the part of the domain which is mapped to the zero vector of the co-domain; the kernel is always a linear subspace of the domain.

The nullspace of A (or called the "kernel" of A) consists of all solutions to

\(A\underline x=\underline0\)

\(N(A)=\{\underline x:A\underline x=\underline0\}\)


claim: If A is \(m\times n\), then N(A) is a subspace of \(\mathfrak R^n\)


Proof: \((i)\;If\;A\underline x=\underline0\;and\;A\underline y=\underline0,\;then\;A(\underline x+\underline y)=A\underline x+A\underline y=\underline0+\underline0=\underline0\)

\((ii)\;If\;A\underline x=\underline0,\;then\;A(c\underline x)=c(A\underline x)=c\underline0=\underline0\)


[Ex] \(A=\begin{bmatrix}1&2\\3&6\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&0\end{bmatrix}=U,\;x+2y=0\;\Rightarrow N(A)=\{c\begin{bmatrix}-2\\1\end{bmatrix}:\;c\in\mathfrak R\}\)


[Ex] \(A=\begin{bmatrix}1&2\\3&8\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&2\end{bmatrix},\;x+2y=0,\;2y=0\)

\(\therefore\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow N(A)=\underline0\)


[Ex] \(B=\begin{bmatrix}1&2\\3&8\\2&4\\6&16\end{bmatrix}=\begin{bmatrix}A\\2A\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&2\\0&0\\0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&2\\0&2\\0&0\\0&0\end{bmatrix}\Rightarrow x+2y=0,\;2y=0\)

\(\therefore\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow N(B)=\underline0\)


\(C=\begin{bmatrix}1&2&2&4\\3&8&6&16\end{bmatrix}=\begin{bmatrix}A&2A\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2&2&4\\0&2&0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&2&0\\0&2&0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&2&0\\0&1&0&2\end{bmatrix}=R\)

reduced row echelon form (RREF)

\(C\underline x=\underline0\;\Leftrightarrow\;R\underline x=\underline0\)

\(x_1+2x_3=0\)

\(x_2+2x_4=0\)

\(x_1,\;x_2\;pivot\;variables\)

\(x_3,\;x_4\;free\;variables\)

\(\Rightarrow\left\{\begin{array}{l}x_1=-2x_3\\x_2=-2x_4\end{array}\right.\)

\(\therefore\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_3\\-2x_4\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_3\\0\\x_3\\0\end{bmatrix}+\begin{bmatrix}0\\-2x_4\\0\\x_4\end{bmatrix}\)

\(=x_3\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}+x_4\begin{bmatrix}0\\-2\\0\\1\end{bmatrix}\)

\(N(C)=\{\underline x:\underline x=x_3\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}+x_4\begin{bmatrix}0\\-2\\0\\1\end{bmatrix},\;x_3,\;x_4\in\mathfrak R\}\)

\(How\;to\;get\;\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}:\;set\;x_3=1,\;x_4=0,\;then\;x_1=-2,\;x_2=0\)

\(How\;to\;get\;\begin{bmatrix}0\\-2\\0\\1\end{bmatrix}:set\;x_3=0,\;x_4=1,\;then\;x_1=0,\;x_2=-2\)

\(A\underline x=\underline0,\;A\xrightarrow{Gaussian\;elimination}U\xrightarrow[{Produce\;zeros\;above\;pivots}]{Reduced\;row\;echelon\;form}R\)

\(Produce\;ones\;in\;the\;pivots\)

\(Solve\;R\underline x=\underline0\)
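The same pipeline, \(A\rightarrow U\rightarrow R\rightarrow\) special solutions, is available symbolically. A short SymPy sketch for the matrix \(C\) above (added for illustration):

```python
from sympy import Matrix

C = Matrix([[1, 2, 2, 4],
            [3, 8, 6, 16]])

R, pivot_cols = C.rref()
print(R)                  # Matrix([[1, 0, 2, 0], [0, 1, 0, 2]])
print(pivot_cols)         # (0, 1): x1, x2 are the pivot variables

for s in C.nullspace():   # one special solution per free variable
    print(s.T)            # [-2, 0, 1, 0] and [0, -2, 0, 1]
```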


[Ex] \(A=\begin{bmatrix}1&1&2&3\\2&2&8&10\\3&3&10&13\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&2&3\\0&0&4&4\\0&0&4&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&2&3\\0&0&4&4\\0&0&0&0\end{bmatrix}=U\)

\(\Rightarrow\begin{bmatrix}1&1&0&1\\0&0&4&4\\0&0&0&0\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&0&1\\0&0&1&1\\0&0&0&0\end{bmatrix}=R\)

\(x_1+x_2+x_4=0\)

\(x_3+x_4=0\)

\(x_1,\;x_3\;pivot\;variables\)

\(x_2,\;x_4\;free\;variables\)

\(x_1=-x_2-x_4\)

\(x_3=-x_4\)

\(\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-x_2-x_4\\x_2\\-x_4\\x_4\end{bmatrix}=\begin{bmatrix}-x_2\\x_2\\0\\0\end{bmatrix}+\begin{bmatrix}-x_4\\0\\-x_4\\x_4\end{bmatrix}\)

\(=x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix}\) special solution

\(\left\{\begin{array}{l}x_2=1,\;x_4=0,\Rightarrow x_1=-1,\;x_3=0\\x_2=0,\;x_4=1,\Rightarrow x_1=-1,\;x_3=-1\end{array}\right.\)

\(N(A)=\{\underline x:\underline x=x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix},\;x_2,\;x_4\in\mathfrak R\}\)

Remark: Suppose \(A\) is \(m\times n\). If there are \(r\) pivots, \(r\leq m,\;r\leq n\), there are \(n-r\) free variables.

And there are \(n-r\) special solutions.

\(N(A)\) consists of all the linear combinations of these \(n-r\) special solutions.

\(N(A)=the\;subspace\;spanned\;by\;these\;n-r\;special\;solutions\)


Rank Nullity Theorem

\[A \in \mathbb{R}^{m \times n} \Rightarrow rank(A) + nullity(A) = n\]

For example, if B is a 4 \(\times\) 3 matrix and \(rank(B) = 2\), then from the rank-nullity theorem, one can deduce that

\[rank(B) + nullity(B) = 2 + nullity(B) = 3 \Rightarrow nullity(B) = 1\]


\(If\;n\;>\;m\;(more\;columns\;than\;rows),\;i.e.,\;A\underline x=\underline0\;has\;more\;unknowns\;than\;equations,\)

\(then\;r\;\leq\;m\;<\;n\)

\(Hence\;n-r\;>\;0,\;i.e.,\;there\;must\;be\;nonzero\;solution.\)

\(Def\;The\;rank\;of\;a\;matrix\;A\;is\;the\;number\;of\;pivots.\)

\(Pivot\;columns\;and\;special\;solutions\)

\(A=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}=R\)

\(x_1+3x_2+2x_4-x_5=0\)

\(x_3+4x_4-3x_5=0\)

\(x_1=-3x_2-2x_4+x_5\)

\(x_3=-4x_4+3x_5\)

\(x_2=1,\;x_4=0,\;x_5=0\;\Rightarrow x_1=-3,\;x_3=0\)

\(x_2=0,\;x_4=1,\;x_5=0\;\Rightarrow x_1=-2,\;x_3=-4\)

\(x_2=0,\;x_4=0,\;x_5=1\;\Rightarrow x_1=1,\;x_3=3\)

\(Three\;special\;solutions:\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix},\;\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix},\;\begin{bmatrix}1\\0\\3\\0\\1\end{bmatrix}\)

\(\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.0, 0.5, 0.0}3}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}2}&\color[rgb]{0.0, 0.5, 0.0}{-1}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.0, 0.5, 0.0}4}&\color[rgb]{0.0, 0.5, 0.0}{-3}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}\mathrm blue}:\;\mathrm{pivot}\;\mathrm{columns}.\;{\color[rgb]{0.0, 0.5, 0.0}\mathrm green}:\;\mathrm{free}\;\mathrm{columns}\)

\(A\underline x=\underline0\;\Rightarrow R\underline x=\underline0\)

\(A\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix},\;R\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(-3\begin{bmatrix}{\underline a}_1\end{bmatrix}+\begin{bmatrix}{\underline a}_2\end{bmatrix}=\underline0\)

\(\Rightarrow\underbrace{\begin{bmatrix}{\underline a}_2\end{bmatrix}}_{free\;columns}=3\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}\)

\(A\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix},\;R\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\underbrace{\begin{bmatrix}{\underline a}_4\end{bmatrix}}_{free\;columns}=2\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}+4\underbrace{\begin{bmatrix}{\underline a}_3\end{bmatrix}}_{pivot\;columns}\)

\(similarly,\;\underbrace{\begin{bmatrix}{\underline a}_5\end{bmatrix}}_{free\;columns}=-\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}-3\underbrace{\begin{bmatrix}{\underline a}_3\end{bmatrix}}_{pivot\;columns}\)

\(For\;R,\;we\;can\;also\;obtain\)

\(\begin{bmatrix}{\underline r}_2\end{bmatrix}=3\begin{bmatrix}{\underline r}_1\end{bmatrix}\)

\(\begin{bmatrix}{\underline r}_4\end{bmatrix}=2\begin{bmatrix}{\underline r}_1\end{bmatrix}+4\begin{bmatrix}{\underline r}_3\end{bmatrix}\)

\(\begin{bmatrix}{\underline r}_5\end{bmatrix}=-\begin{bmatrix}{\underline r}_1\end{bmatrix}-3\begin{bmatrix}{\underline r}_3\end{bmatrix}\)

Free columns are linear combinations of pivot columns, and the special solutions describe these combinations.

\(Note\;R=\;\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(N=\begin{bmatrix}-3&-2&1\\1&0&0\\0&-4&3\\0&1&0\\0&0&1\end{bmatrix},\;nullspace\;matrix\)

\(\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&3&{\color[rgb]{0.0, 0.0, 1.0}0}&2&-1\\{\color[rgb]{0.0, 0.0, 1.0}0}&0&{\color[rgb]{0.0, 0.0, 1.0}1}&4&-3\\{\color[rgb]{0.0, 0.0, 1.0}0}&0&{\color[rgb]{0.0, 0.0, 1.0}0}&0&0\end{bmatrix}\leftrightarrow\begin{bmatrix}\color[rgb]{0.0, 0.0, 1.0}{-3}&\color[rgb]{0.0, 0.0, 1.0}{-2}&{\color[rgb]{0.0, 0.0, 1.0}1}\\1&0&0\\{\color[rgb]{0.0, 0.0, 1.0}0}&\color[rgb]{0.0, 0.0, 1.0}{-4}&{\color[rgb]{0.0, 0.0, 1.0}3}\\0&1&0\\0&0&1\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}l}{\color[rgb]{0.0, 0.0, 1.0}u}{\color[rgb]{0.0, 0.0, 1.0}e}:pivot\)

\(R=\begin{bmatrix}\mathrm I&\vdots&\mathrm F\end{bmatrix}\)

\(\mathrm N=\begin{bmatrix}-\mathrm F\\\cdots\\\mathrm I\end{bmatrix}\)

\(\mathrm R=\begin{bmatrix}\mathrm I&\mathrm F\\0&0\end{bmatrix}\), where \(\mathrm I\) is \(\mathrm r\times\mathrm r\), \(\mathrm F\) is \(\mathrm r\times(\mathrm n-\mathrm r)\), and the zero block has \(\mathrm m-\mathrm r\) rows (so \(\mathrm R\) has \(\mathrm m\) rows in total).

\(\mathrm N=\begin{bmatrix}-\mathrm F\\\cdots\\\mathrm I\end{bmatrix},\;-\mathrm F:\mathrm r\times(\mathrm n-\mathrm r),\;\mathrm I:(\mathrm n-\mathrm r)\times(\mathrm n-\mathrm r)\)
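The block formula \(N=\begin{bmatrix}-F\\I\end{bmatrix}\) can be assembled mechanically from \(R\). A short NumPy sketch for the example above (pivot columns are the 1st and 3rd, i.e. indices 0 and 2):

```python
import numpy as np

R = np.array([[1, 3, 0, 2, -1],
              [0, 0, 1, 4, -3],
              [0, 0, 0, 0,  0]])

pivot_cols = [0, 2]
free_cols = [1, 3, 4]
r, n = len(pivot_cols), R.shape[1]

F = R[:r, free_cols]                        # the r x (n-r) block F
N = np.zeros((n, n - r), dtype=int)
N[pivot_cols, :] = -F                       # -F goes into the pivot rows
N[free_cols, :] = np.eye(n - r, dtype=int)  # identity into the free rows

print(N.T)                 # columns of N: [-3 1 0 0 0], [-2 0 -4 1 0], [1 0 3 0 1]
print(np.all(R @ N == 0))  # True: every column of N is a special solution
```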









complete solution to \(A\underline x=\underline b\)

\(\begin{bmatrix}1&3&0&2\\0&0&1&4\\1&3&1&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}1\\6\\7\end{bmatrix}\)

\(\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\1&3&1&6&\vdots&7\end{bmatrix}\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\0&0&1&4&\vdots&6\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\0&0&0&0&\vdots&0\end{bmatrix}\)

\(x_1+3x_2+2x_4=1\)

\(x_3+4x_4=6\)

\(\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-3x_2-2x_4+1\\x_2\\-4x_4+6\\x_4\end{bmatrix}\)

\(=x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}+\begin{bmatrix}1\\0\\6\\0\end{bmatrix}\)

\(what\;is\;x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}?\)

\(It\;is\;the\;general\;solution\;({\underline x}_n)\;to\;A\underline x=\underline0\)

\(what\;is\;\begin{bmatrix}1\\0\\6\\0\end{bmatrix}?\)

It is a particular solution to \(\;A\underline x=\underline b\)

\(\therefore\underline x={\underline x}_p+{\underline x}_n\)

Claim: \(If\;\;A\underline x=\underline b,\;then\;the\;complete\;solution\;\underline x={\underline x}_p+{\underline x}_n\)

\(where\;{\underline x}_p\;is\;a\;particular\;solution\;to\;\;A\underline x=\underline b\;and\;{\underline x}_n\;is\;the\;general\;solution\;to\;A\underline x=\underline0\)

Proof: \(If\;\underline x={\underline x}_p+{\underline x}_n,\;then\;A\underline x=A({\underline x}_p+{\underline x}_n)=A{\underline x}_p+A{\underline x}_n\)

\(=\underline b+\underline0=\underline b\)

\(\therefore\underline x\;is\;a\;solution\;to\;A\underline x=\underline b\)

\(If\;\underline x\;is\;a\;solution\;to\;A\underline x=\underline b\)

\(then\;A(\underline x-{\underline x}_p)=A\underline x-A{\underline x}_p=\underline b-\underline b=\underline0\)

\(\therefore\underline x-{\underline x}_p\;is\;a\;solution\;to\;A\underline x=\underline0\)

\(\Rightarrow\underline x-{\underline x}_p={\underline x}_n\;\Rightarrow\underline x={\underline x}_p+{\underline x}_n\)

\(\begin{bmatrix}1&3&0&2\\0&0&1&4\\1&3&1&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\1&3&1&6&\vdots&b_3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\0&0&0&0&\vdots&b_3-b_1-b_2\end{bmatrix}\)

\(For\;A\underline x=\underline b\;to\;be\;solvable,\;we\;must\;have\;b_3-b_1-b_2=0,\;i.e.\;b_3=b_1+b_2\)

\(If\;b_3=b_1+b_2,\;then\)

\(\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\0&0&0&0&\vdots&0\end{bmatrix}\)

\(\left\{\begin{array}{l}x_1+3x_2+2x_4=b_1\\x_3+4x_4=b_2\end{array}\right.\)

\(A\;particular\;solution\;{\underline x}_p\;to\;\;A\underline x=\underline b\;can\;be\;found\;by\;setting\;x_2=x_4=0\)

\(which\;gives\;x_1=b_1,\;x_3=b_2\)

\(\therefore{\underline x}_p=\begin{bmatrix}b_1\\0\\b_2\\0\end{bmatrix}\)

\(\left\{\begin{array}{l}x_1+3x_2+2x_4=0\\x_3+4x_4=0\end{array}\right.\)

\(\Rightarrow{\underline x}_n=x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}\)

\(Therefore,\;\underline x={\underline x}_p+{\underline x}_n\)

\(=\begin{bmatrix}b_1\\0\\b_2\\0\end{bmatrix}+x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}\)

\(If\;b_3=b_1+b_2\)
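A small numpy sketch confirming the structure of the complete solution (hedged: the specific \(\underline b=(1,6,7)\) is just one choice satisfying \(b_3=b_1+b_2\)):

```python
import numpy as np

A = np.array([[1, 3, 0, 2],
              [0, 0, 1, 4],
              [1, 3, 1, 6]])
b = np.array([1, 6, 7])               # any b with b3 = b1 + b2 works; this is one example
x_p = np.array([1, 0, 6, 0])          # particular solution (x2 = x4 = 0)
s1 = np.array([-3, 1, 0, 0])          # special solutions spanning N(A)
s2 = np.array([-2, 0, -4, 1])

for x2, x4 in [(0, 0), (1, 2), (-5, 3)]:
    assert np.allclose(A @ (x_p + x2 * s1 + x4 * s2), b)
print("x_p + x_n solves A x = b for every choice of x2, x4")
```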

Question: If \(A\) is a square invertible matrix, i.e., \(m=n=r\),

\(what\;are\;{\underline x}_p\;and\;{\underline x}_n?\)

\(The\;only\;particular\;solution\;to\;A\underline x=\underline b\;is\;\underline x=A^{-1}\underline b\)

\(And\;the\;only\;solution\;to\;A\underline x=\underline0\;is\;{\underline x}_n=\underline0\)

\(\therefore\underline x={\underline x}_p+{\underline x}_n=A^{-1}\underline b+\underline0=A^{-1}\underline b\)

\(\begin{bmatrix}A&\vdots&\underline b\end{bmatrix}\Rightarrow\begin{bmatrix}I&\vdots&A^{-1}\underline b\end{bmatrix}\)


[Ex] \(\begin{bmatrix}1&1\\1&2\\-2&-3\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\begin{bmatrix}1&1&\vdots&b_1\\1&2&\vdots&b_2\\-2&-3&\vdots&b_3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&\vdots&b_1\\0&1&\vdots&b_2-b_1\\0&-1&\vdots&b_3+2b_1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&\vdots&2b_1-b_2\\0&1&\vdots&b_2-b_1\\0&0&\vdots&b_1+b_2+b_3\end{bmatrix}\)

\(\Rightarrow For\;A\underline x=\underline b\;to\;be\;solvable,\;we\;must\;have\;b_1+b_2+b_3=0\)

\(\therefore{\underline x}_p=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix}\)

\(and\;{\underline x}_n=\begin{bmatrix}0\\0\end{bmatrix},\;(no\;free\;variables)\)

\(\Rightarrow\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix}+\begin{bmatrix}0\\0\end{bmatrix}\)

\(=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix},\;if\;b_1+b_2+b_3=0\)


In general, if \(r=n,\;\;(m\geq n)\)

A has full column rank

\(A=\underset n{\underbrace{\left.\begin{bmatrix}?&?\\?&?\end{bmatrix}\right\}}m\;}\Rightarrow R=\begin{bmatrix}I\\0\end{bmatrix}\)

There are no free variables or free columns.

\(Also\;N(A)=\{\underline0\}\)

Therefore, if A has full column rank (r=n),

Then (i) All columns of A are pivot columns.

(ii) There are no free variables and no special solutions.

(iii) \(N(A)=\{\underline0\}\)

(iv) \(If\;A\underline x=\underline b\;has\;a\;solution\;(it\;might\;not),\;then\;it\;has\;only\;one\;solution.\)


[Ex]

\(\begin{bmatrix}1&1&1\\1&2&-1\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}3\\4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&1&\vdots&3\\1&2&-1&\vdots&4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&3&\vdots&2\\0&1&-2&\vdots&1\end{bmatrix}\)

\(\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2\\1\\0\end{bmatrix}+x_3\begin{bmatrix}-3\\2\\1\end{bmatrix}\)

\(\begin{bmatrix}-3\\2\\1\end{bmatrix}:\;line\;of\;solutions\;to\;A\underline x=\underline0\)

\(\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2\\1\\0\end{bmatrix}+x_3\begin{bmatrix}-3\\2\\1\end{bmatrix},\;line\;of\;solutions\;to\;A\underline x=\underline b\)


\(If\;r=m\;(m\leq n),\;A\;has\;full\;row\;rank\)

\(A=\underset n{\underbrace{\left.\begin{bmatrix}?&?&?\\?&?&?\end{bmatrix}\right\}}m\;}\Rightarrow R=\begin{bmatrix}I&F\end{bmatrix}\)

If A has full row rank (r=m)

then (i) All rows have pivots (R has no zero rows)

(ii) \(A\underline x=\underline b\;is\;solvable\;for\;all\;\underline b\)

(iii) \(C(A)=\mathcal R^m\)

(iv) \(There\;are\;n-r=n-m\;special\;solutions\;in\;N(A)\)


Summary \(A\underline x=\underline b\)

Case I: \(r=m=n,\;(A\;invertible)\)

\(R=I,\;A\underline x=\underline b\;has\;1\;solution.\;\underline x=A^{-1}\underline b\)

Case II: \(r=m\;<\;n\;(full\;row\;rank),\;A:short\;and\;wide\)

\(R=\begin{bmatrix}I&F\end{bmatrix},\;A\underline x=\underline b\;has\;\infty\;solutions\)

Case III: \(r=n\;<\;m\;(full\;column\;rank),\;A:tall\;and\;thin\)

\(R=\begin{bmatrix}I\\0\end{bmatrix},\;A\underline x=\underline b\;has\;0\;or\;1\;solution\)

Case IV: \(r\;<\;m,\;r\;<\;n.\;(A\;not\;full\;rank)\)

\(R=\begin{bmatrix}I&F\\0&0\end{bmatrix},\;A\underline x=\underline b\;has\;0\;or\;\infty\;solutions\)
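The four cases can be read off from the rank alone. The sketch below (assuming numpy; the sample matrices are illustrative only) classifies a matrix accordingly:

```python
import numpy as np

def classify(A):
    """Classify A x = b by rank, following Cases I-IV above."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    if r == m == n:
        return "Case I: exactly 1 solution"
    if r == m < n:
        return "Case II: infinitely many solutions"
    if r == n < m:
        return "Case III: 0 or 1 solution"
    return "Case IV: 0 or infinitely many solutions"

print(classify(np.array([[1., 2.], [3., 4.]])))            # invertible, Case I
print(classify(np.array([[1., 1., 1.], [1., 2., -1.]])))   # full row rank, Case II
```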


Independence, Basis, and Dimension







\(Def:\;If\;x_1{\underline v}_1+x_2{\underline v}_2+...+x_n{\underline v}_n=\underline0\;holds\;only\;when\;x_1=x_2=...=x_n=0,\)

\(then\;the\;vectors\;{\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_n\;are\;linearly\;independent.\)

\(otherwise,\;they\;are\;linearly\;dependent.\)

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\;\begin{bmatrix}0\\1\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\0\end{bmatrix}+x_2\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow x_1=0,\;x_2=0\)

\(\therefore They\;are\;(linearly)\;independent.\)

[Ex] \(\begin{bmatrix}1\\2\\1\\0\end{bmatrix},\begin{bmatrix}3\\4\\0\\1\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\2\\1\\0\end{bmatrix}+x_2\begin{bmatrix}3\\4\\0\\1\end{bmatrix}=\begin{bmatrix}0\\0\\0\\0\end{bmatrix}\)

\(\Rightarrow\;x_1+3x_2=0\)

\(2x_1+4x_2=0\)

\(x_1=0\)

\(x_2=0\)

\(\Rightarrow\;x_1=x_2=0\)

\(\therefore They\;are\;independent\)

Remark: The special solutions we find in the nullspace of a matrix are independent.

[Ex] \(\begin{bmatrix}1\\2\\1\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\begin{bmatrix}3\\5\\3\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\2\\1\end{bmatrix}+x_2\begin{bmatrix}0\\1\\0\end{bmatrix}+x_3\begin{bmatrix}3\\5\\3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}1&0&3\\2&1&5\\1&0&3\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}1&0&3\\2&1&5\\1&0&3\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&3\\0&1&-1\\0&0&0\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&3\\0&1&-1\\0&0&0\end{bmatrix}=R\)

\(\therefore\underline x=x_3\begin{bmatrix}-3\\1\\1\end{bmatrix}\)

\(We\;can\;have\;\;-3\cdot\begin{bmatrix}1\\2\\1\end{bmatrix}+1\cdot\begin{bmatrix}0\\1\\0\end{bmatrix}+1\cdot\begin{bmatrix}3\\5\\3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

These vectors are linearly dependent.

\(\begin{bmatrix}3\\5\\3\end{bmatrix}=3\begin{bmatrix}1\\2\\1\end{bmatrix}-\begin{bmatrix}0\\1\\0\end{bmatrix}\)

In general, the columns of A are linearly independent exactly when r=n (full column rank).

\(R=\begin{bmatrix}I\\0\end{bmatrix},\;There\;are\;n\;pivots\;and\;no\;free\;variables\)

\(Hence\;N(A)=\{\underline0\}\)
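In practice, independence of a set of column vectors can be checked by comparing the rank with the number of columns. A minimal numpy illustration, reusing the dependent vectors from the example above:

```python
import numpy as np

# Columns are independent exactly when the rank equals the number of columns.
V = np.array([[1, 0, 3],
              [2, 1, 5],
              [1, 0, 3]])
print(np.linalg.matrix_rank(V))   # 2 < 3, so the three columns are dependent
```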

Claim: \(any\;set\;of\;n\;vectors\;in\;\mathcal R^m\;must\;be\;linearly\;dependent\;if\;n\;>\;m\)

Proof \(Put\;these\;n\;vectors\;as\;columns\;of\;A\)

\(A={\begin{bmatrix}{\underline v}_1&{\underline v}_2&...&{\underline v}_n\end{bmatrix}}_{m\times n}\)

\(\Rightarrow R=\begin{bmatrix}I&F\\0&0\end{bmatrix}\)

\(There\;are\;n-r\;>\;0\;free\;variables\;and\;hence\;n-r\;special\;solutions.\)

\(Therefore,\;N(A)\;contains\;vectors\;other\;than\;\underline0.\)


Def: A set of vectors spans a vector space if their linear combinations fill the space.

Remark: The columns of a matrix span its column space.

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\;span\;\mathcal R^2\)

\(\begin{bmatrix}a\\b\end{bmatrix}=a\begin{bmatrix}1\\0\end{bmatrix}+b\begin{bmatrix}0\\1\end{bmatrix}\)

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix},\begin{bmatrix}4\\7\end{bmatrix}\;span\;\mathcal R^2?\;Yes\)

[Ex] \(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}-1\\-1\end{bmatrix}\;span\;\mathcal R^2?\;No\)

\(\begin{bmatrix}a\\b\end{bmatrix}=x_1\begin{bmatrix}1\\1\end{bmatrix}+x_2\begin{bmatrix}-1\\-1\end{bmatrix}\;?\)

\(\begin{bmatrix}1&-1\\1&-1\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}a\\b\end{bmatrix}?\)

\(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}-1\\-1\end{bmatrix}\;only\;span\;a\;line\;in\;\mathcal R^2,\;as\;does\;\begin{bmatrix}1\\1\end{bmatrix}\;by\;itself.\)


Def: \(The\;\boldsymbol r\boldsymbol o\boldsymbol w\boldsymbol\;\boldsymbol s\boldsymbol p\boldsymbol a\boldsymbol c\boldsymbol e\;of\;an\;m\times n\;matrix\;A\;is\;the\;subspace\;of\;\mathcal R^n\;spanned\;by\;the\;rows\;of\;A.\)

\(The\;row\;space\;of\;A\;is\;\mathcal C(A^T).\;It\;is\;the\;column\;space\;of\;A^T.\)

\(The\;column\;space\;of\;A\;is\;\mathcal C(A)\)

\(The\;null\;space\;of\;A\;is\;\mathcal N(A)\)


Def: \(A\;\boldsymbol b\boldsymbol a\boldsymbol s\boldsymbol i\boldsymbol s\;for\;a\;vector\;space\;is\;a\;sequence\;of\;vectors\;satisfying\;two\;properties:\)

(1) The basis vectors are linearly independent.

(2) They span the space.


[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^2.\)

[Ex] \(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\begin{bmatrix}0\\0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^3.\)

The standard basis \(for\;\mathcal R^n\)

[Ex] \(\begin{bmatrix}1\\0\\\vdots\\\vdots\\0\end{bmatrix}\begin{bmatrix}0\\1\\0\\\vdots\\0\end{bmatrix}\cdots\begin{bmatrix}0\\\vdots\\\vdots\\0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^n\)


[Ex] \(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\(Let\;A=\begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}\)

3 pivots -> columns are independent.

\(\begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\;is\;solvable\;for\;every\;\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\therefore\mathcal C(A)=\mathcal R^3\)

\(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}\;are\;a\;basis\;for\;\mathcal R^3\)

[Ex] \(Any\;three\;independent\;vectors\;in\;\mathcal R^3\;are\;a\;basis\;for\;\mathcal R^3\)

\(Assume\;{\underline v}_1,\;{\underline v}_2,\;{\underline v}_3\;are\;independent\)

\(A=\begin{bmatrix}{\underline v}_1&{\underline v}_2&{\underline v}_3\end{bmatrix}\)

\(rank=3,\;A\;is\;invertible\)

\(A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\)

\(\mathcal C(A)=\mathcal R^3\)

\({\underline v}_1,\;{\underline v}_2,\;{\underline v}_3\;are\;a\;basis\;for\;\mathcal R^3\)

Claim: \(Any\;n\;independent\;vectors\;in\;\mathcal R^n\;are\;a\;basis\;for\;\mathcal R^n\)

Claim: \(The\;vectors\;{\underline v}_1,\;{\underline v}_2,...,\;{\underline v}_n\;are\;a\;basis\;for\;\mathcal R^n\)

\(exactly\;when\;they\;are\;the\;columns\;of\;an\;invertible\;matrix\)

Remark: \(\mathcal R^n\) has infinitely many different bases.

Claim: There is one and only one way to write any vector in a vector space as a linear combination of the basis vectors.

Proof \(Suppose\;{\underline v}=a_1{\underline v}_1+a_2{\underline v}_2+...+a_n{\underline v}_n\)

\(=b_1{\underline v}_1+b_2{\underline v}_2+...+b_n{\underline v}_n\)

\(\Rightarrow(a_1-b_1){\underline v}_1+(a_2-b_2){\underline v}_2+...+(a_n-b_n){\underline v}_n=\underline0\)

\(a_1-b_1=0,\;a_2-b_2=0,\;...,\;a_n-b_n=0\)

\(since\;{\underline v}_1,\;{\underline v}_2,...,\;{\underline v}_n\;are\;independent\)

\(\Rightarrow a_1=b_1,\;a_2=b_2,\;...,\;a_n=b_n\)


[Ex] \(A=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(\Rightarrow R=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(\Rightarrow R=\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.68, 0.46, 0.12}3}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}2}&\color[rgb]{0.68, 0.46, 0.12}{-1}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.68, 0.46, 0.12}4}&\color[rgb]{0.68, 0.46, 0.12}{-3}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.68, 0.46, 0.12}0}\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}l}{\color[rgb]{0.0, 0.0, 1.0}u}{\color[rgb]{0.0, 0.0, 1.0}e}:pivot\;column.\;{\color[rgb]{0.68, 0.46, 0.12}g}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}e}{\color[rgb]{0.68, 0.46, 0.12}e}{\color[rgb]{0.68, 0.46, 0.12}n}:free\;column.\)

Recall that free columns of R are linear combinations of pivot columns of R.

Pivot columns of R are independent.

\(pivot\;columns\;of\;R\;are\;a\;basis\;for\;\mathcal C(R)\)

Recall that free columns of A are linear combinations of pivot columns of A.

Pivot columns of A are independent.

\(pivot\;columns\;of\;A\;are\;a\;basis\;for\;\mathcal C(A)\)

\((A\underline x=\underline0\;\Leftrightarrow R\underline x=\underline0)\)

\(\Rightarrow\begin{bmatrix}1\\0\\1\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix}\;are\;a\;basis\;for\;\mathcal C(A)\)

\(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix}\;are\;a\;basis\;for\;\mathcal C(R)\)

\(\mathcal C(A)\neq\mathcal C(R)\)


Dimension

Claim: If \({\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_m\) and \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are both bases for the same vector space, then \(m=n\)

Proof: Suppose \(n\;>\;m\)

We can have \({\underline w}_j=a_{1j}{\underline v}_1+a_{2j}{\underline v}_2+...+a_{mj}{\underline v}_m,\;for\;j=1,2,...,n\)

Consider \(x_1{\underline w}_1+x_2{\underline w}_2+...+x_n{\underline w}_n=\underline0\)

\(\Rightarrow x_1(a_{11}{\underline v}_1+a_{21}{\underline v}_2+...+a_{m1}{\underline v}_m)\)

\(+x_2(a_{12}{\underline v}_1+a_{22}{\underline v}_2+...+a_{m2}{\underline v}_m)\)

\(+...\)

\(+x_n(a_{1n}{\underline v}_1+a_{2n}{\underline v}_2+...+a_{mn}{\underline v}_m)=\underline0\)

\(\Rightarrow(a_{11}x_1+a_{12}x_2+...+a_{1n}x_n){\underline v}_1\)

\(+(a_{21}x_1+a_{22}x_2+...+a_{2n}x_n){\underline v}_2\)

\(+...\)

\(+(a_{m1}x_1+a_{m2}x_2+...+a_{mn}x_n){\underline v}_m=\underline0\)

Since \({\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_m\) are independent (because they are a basis), the coefficients must satisfy

\(\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\dots&a_{mn}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}={\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}}_{m\times1}\)

Since \(n\;>\;m\), there exist nonzero solutions for \(x_1,\;x_2,\;...,\;x_n\), so \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are dependent.

This is impossible because \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are a basis.

If \(m\;>\;n\), we exchange the \(\underline v's\;and\;\underline w's\) and repeat the same steps.

We will then also have a contradiction. The only way to avoid a contradiction is to have \(m=n\)

Def The dimension of a vector space is the number of vectors in every basis.


[Ex] \(dim(\mathcal R^2)=2\)

\(\begin{bmatrix}1\\0\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}\)

\(dim(\mathcal R^n)=n\)


[Ex] \(A=\begin{bmatrix}1&3&0&2&1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(R=\begin{bmatrix}1&3&0&2&1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(dim(\mathcal C(A))=dim(\mathcal C(R))=2=r\)

\((note\;\mathcal C(A)\neq\mathcal C(R))\)


[Ex] M = the vector space of all 2x2 real matrices

A basis for M:

\(\begin{bmatrix}1&0\\0&0\end{bmatrix},\;\begin{bmatrix}0&1\\0&0\end{bmatrix},\;\begin{bmatrix}0&0\\1&0\end{bmatrix},\;\begin{bmatrix}0&0\\0&1\end{bmatrix}\)

dim(M)=4


[Ex] dim of the \(n\times n\) real matrix space \(=n^2\)


dim of the subspace of all \(n\times n\) upper triangular matrices \(=n+(n-1)+...+1\)

\(=\frac{n(n+1)}2\)


dim of the subspace of all \(n\times n\) diagonal matrices \(=n\)


dim of the subspace of all \(n\times n\) symmetric matrices \(=\frac{n(n+1)}2\)


Dimension of the Four subspaces


\(A\;m\times n\)

\(\begin{array}{ccc}row\;space&\mathcal C(A^T)&a\;subspace\;of\;\mathcal R^n\\column\;space&\mathcal C(A)&a\;subspace\;of\;\mathcal R^m\\nullspace&\mathcal N(A)&a\;subspace\;of\;\mathcal R^n\\left\;nullspace&\mathcal N(A^T)&a\;subspace\;of\;\mathcal R^m\end{array}\)

\(\mathcal N(A^T)=\{\underline y\;:\;A^T\underline y=\underline0\}\)

\(=\{\underline y\;:\;\underline y^TA=\underline0^T\}\)



\(A=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(R=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(=EA\)

1. Row Space

\(basis\;for\;\mathcal C(R^T):\begin{pmatrix}1&3&5&0&7\end{pmatrix},\;\begin{pmatrix}0&0&0&1&2\end{pmatrix}\)

pivot rows

\(dim(\mathcal C(R^T))=r=2\)

We can have \(EA=R\) for an invertible matrix \(E\)

\(EA=R\Leftrightarrow\;A=E^{-1}R\)

Every row of \(A\) is a linear combination of the rows of \(R\)

Every row of \(R\) is a linear combination of the rows of \(A\)

\(\mathcal C(A^T)=\mathcal C(R^T)\)

\(\therefore dim(\mathcal C(A^T))=dim(\mathcal C(R^T))=2=r\equiv rank(A^T)=rank(A)\)

2. Column Space

\(Basis\;for\;\mathcal C(R):\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\;(pivot\;columns\;of\;R)\)

\(Basis\;for\;\mathcal C(A):\begin{bmatrix}1\\0\\1\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix},\;(pivot\;columns\;of\;A)\)

\(Bases\;are\;different\;for\;\mathcal C(A)\;and\;\mathcal C(R)\)

\(dim(\mathcal C(R))=dim(\mathcal C(A))=r=2=rank(A)\)

3. Nullspace

\(A\underline x=\underline0\;\Leftrightarrow\;R\underline x=\underline0\)

\(\therefore\mathcal N(A)=\mathcal N(R)\)

\(R=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}\)

\(Basis\;for\;\mathcal N(R),\;(or\;\mathcal N(A))\)

\(\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix},\begin{bmatrix}-5\\0\\1\\0\\0\end{bmatrix},\begin{bmatrix}-7\\0\\0\\-2\\1\end{bmatrix}\)

\(dim(\mathcal N(R))=dim(\mathcal N(A))=n-r=n-rank(A)=5-2=3\)

\(Bases\;are\;identical\;for\;\mathcal N(A)\;and\;\mathcal N(R)\)

4. Left Nullspace

\(\underline y^TR=\underline0^T\)

\(\Leftrightarrow{\begin{pmatrix}y_1&y_2&y_3\end{pmatrix}}_{1\times3}{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}}_{3\times5}={\begin{pmatrix}0&0&0&0&0\end{pmatrix}}_{1\times5}\)

\(\Rightarrow y_1\begin{pmatrix}1&3&5&0&7\end{pmatrix}\)

\(+y_2\begin{pmatrix}0&0&0&1&2\end{pmatrix}\)

\(=\begin{pmatrix}0&0&0&0&0\end{pmatrix}\)

\(\Rightarrow y_1=0,\;y_2=0\)

\(\begin{pmatrix}y_1&y_2&y_3\end{pmatrix}=y_3\begin{pmatrix}0&0&1\end{pmatrix}\)

\(Basis\;for\;\mathcal N(R^T)\;:\;\begin{pmatrix}0&0&1\end{pmatrix}\)

\(dim(\mathcal N(R^T))=3-2=1\)

In general,

\(\underline y^TR=\underline0\)

\(\Rightarrow\underline y^T=\begin{pmatrix}y_1&y_2&\cdots&y_m\end{pmatrix}=\begin{pmatrix}0&\cdots&0&y_{r+1}&\cdots&y_m\end{pmatrix}\)

\((\underbrace{0,\;...,0}_r,\underbrace1_{r+1},0,...,0)(\underbrace{0,\;...,0}_r,0,\underbrace1_{r+2},...,0)...(\underbrace{0,\;...,0}_r,0,0,...,\underbrace1_m)\)

\(for\;a\;basis\;for\;\mathcal N(R^T)\)

In other words, the last \(m-r\) rows of \(I_m\), the \(m\times m\) identity matrix, form a basis for \(\mathcal N(R^T)\)

\(\therefore dim(\mathcal N(R^T))=m-r\)

\(For\;\mathcal N(A^T),\;since\;A^T\;is\;n\times m\)

\(dim(\mathcal N(A^T))=m-rank(A^T)=m-rank(A)=m-r\)

\(Recall\;EA=R\;where\;E\;is\;invertible\)

\(\underbrace{\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}}_E\underbrace{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}}_A=\underbrace{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}}_R\)

\(\underline y^TA=\underline0^T\)

\(\Rightarrow\begin{pmatrix}-1&-1&1\end{pmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}=\begin{pmatrix}0&0&0&0&0\end{pmatrix}\)

\(Since\;dim(\mathcal N(A^T))=m-r=3-2=1\)

\(\underline y^TA=\underline0^T\Rightarrow\underline y^T=y_1\begin{pmatrix}-1&-1&1\end{pmatrix}\)

\(\begin{pmatrix}-1&-1&1\end{pmatrix}\;forms\;a\;basis\;for\;\mathcal N(A^T)\)

In general, \(EA=R\)

\({\begin{bmatrix}&&\\&E&\\\cdots&\cdots&\cdot_{m-r}\end{bmatrix}}_{m\times m}\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}=\underbrace{\begin{bmatrix}I&F\\0&{\left.0\right|}_{m-r}\end{bmatrix}}_R\)

\({\begin{bmatrix}\cdots\\\vdots\\\cdots\end{bmatrix}}_{last\;m-r\;rows\;of\;E}\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}={\begin{bmatrix}0&\cdots&0\\&&\\0&\cdots&0\end{bmatrix}}_{m-r}\)

Since \(E\) is invertible, all rows of \(E\) are independent

\(dim(\mathcal N(A^T))=m-r\)

The last \(m-r\) rows of \(E\) form a basis for \(\mathcal N(A^T)\)


Fundamental theorem of linear algebra

Here is a summary of the four subspaces associated with an \(m\times n\) matrix \(A\) of rank \(r\).

\(\begin{array}{c|cccc}&dim\;\mathcal C(A^T)&dim\;\mathcal N(A)&dim\;\mathcal C(A)&dim\;\mathcal N(A^T)\\\hline r=m=n&r&0&r&0\\r=n<m&r&0&r&m-r\\r=m<n&r&n-r&r&0\\r<m,\;r<n&r&n-r&r&m-r\end{array}\)

\(dim(C(A^T))=dim(C(R^T))=r,\;subspace\;of\;\mathcal R^n\)

\(dim(C(A))=dim(C(R))=r,\;subspace\;of\;\mathcal R^m\)

\(dim(N(A))=dim(N(R))=n-r,\;subspace\;of\;\mathcal R^n\)

\(dim(N(A^T))=dim(N(R^T))=m-r,\;subspace\;of\;\mathcal R^m\)
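A quick numerical check of these four dimensions for the example matrix used above (a minimal numpy sketch; matrix_rank computes \(r\)):

```python
import numpy as np

A = np.array([[1, 3, 5, 0, 7],
              [0, 0, 0, 1, 2],
              [1, 3, 5, 1, 9]])
m, n = A.shape
r = np.linalg.matrix_rank(A)
print("dim C(A^T) =", r)          # row space:      2
print("dim C(A)   =", r)          # column space:   2
print("dim N(A)   =", n - r)      # nullspace:      3
print("dim N(A^T) =", m - r)      # left nullspace: 1
```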



Fundamental theorem of linear algebra part I

The column space and row space both have dimension r. The nullspaces have dimensions n-r and m-r.

Orthogonality

Dot product or inner product \((over\;\mathcal R^n)\)

\(\underline v\cdot\underline w=\underline v^T\underline w={\begin{bmatrix}v_1&v_2&\cdots&v_n\end{bmatrix}}_{1\times n}{\begin{bmatrix}w_1\\w_2\\\vdots\\w_n\end{bmatrix}}_{n\times1}\)

\(=v_1w_1+v_2w_2+\cdots+v_nw_n\)

\(\underline v^T\underline v=v_1^2+v_2^2+\cdots+v_n^2\overset\Delta={\parallel\underline v\parallel}^2\)

Length of a vector \(\parallel\underline v\parallel={(\underline v^T\underline v)}^\frac12\)

Def \(Two\;vectors\;\underline v\;and\;\underline w\;are\;orthogonal\;if\;\underline v^T\underline w=0\)

[Ex] in \(\mathcal R^2,\;\begin{bmatrix}1\\-1\end{bmatrix}\;and\;\begin{bmatrix}1\\1\end{bmatrix}\;are\;orthogonal.\)

Remark \(\underline v^T\underline w=0\;\Leftrightarrow{\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2={\parallel\underline v+\underline w\parallel}^2\)

Proof \({\parallel\underline v+\underline w\parallel}^2={(\underline v+\underline w)}^T(\underline v+\underline w)\)

\(=\underline v^T\underline v+\underline w^T\underline w+\underline v^T\underline w+\underline w^T\underline v\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2+\underline v^T\underline w+\underline w^T\underline v\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2+\underline v^T\underline w+{(\underline v^T\underline w)}^T,\;\underline v^T\underline w=0\Rightarrow{(\underline v^T\underline w)}^T=0\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2\)

Def \(Two\;subspaces\;V\;and\;W\;are\;orthogonal\;if\;\underline v^T\underline w=0\;for\;all\;\underline v\in V\;and\;all\;\underline w\in W\)

\(A\underline x=\begin{bmatrix}row\;1\\row\;2\\\vdots\\row\;m\end{bmatrix}\begin{bmatrix}\\\underline x\\\\\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\Leftrightarrow\begin{array}{c}(row\;1)\cdot\underline x=0\\(row\;2)\cdot\underline x=0\\\vdots\\(row\;m)\cdot\underline x=0\end{array}\)

\(The\;nullspace\;N(A)\;and\;the\;row\;space\;C(A^T)\;are\;orthogonal\;subspaces\;of\;\mathcal R^n\)

Similarly, \(A^T\underline y=\begin{bmatrix}{(column\;1)}^T\\{(column\;2)}^T\\\vdots\\{(column\;n)}^T\end{bmatrix}\begin{bmatrix}\\\underline y\\\\\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\)

\(The\;left\;nullspace\;N(A^T)\;and\;the\;column\;space\;C(A)\;are\;orthogonal\;subspaces\;of\;\mathcal R^m\)


\(N(A)\perp C(A^T)\)

\(N(A^T)\perp C(A)\)


Def: The orthogonal complement \(V^\perp\) of a subspace \(V\) contains every vector that is orthogonal to \(V\).

\(i.e.,\;V^\perp=\{\underline w:\underline w^T\underline v=0\;for\;all\;\underline v\in V\}\)

\(V^\perp\) is read as "V perp".

Remark \(V^\perp\) is also a subspace.

Claim \((C{(A^T))}^\perp=N(A)\)

\(\begin{bmatrix}\cdots\\\cdots\\\cdots\end{bmatrix}\begin{bmatrix}\underline x\end{bmatrix}=\begin{bmatrix}0\\\vdots\\0\end{bmatrix}\)

Proof \(We\;need\;to\;show\;that\;if\;\underline v^T\underline w=0\;for\;all\;\underline w\in N(A)\;then\;\underline v\in C(A^T)\)

\(If\;\underline v\not\in C(A^T),\;then\;we\;can\;add\;\underline v\;as\;an\;extra\;row\;to\;A\)

\(A'=\begin{bmatrix}A\\\underline v\end{bmatrix}\)

\(\begin{bmatrix}A\\\underline v\end{bmatrix}\begin{bmatrix}\underline x\end{bmatrix}=\begin{bmatrix}0\\\vdots\\0\end{bmatrix}\)

\(It\;then\;follows\;that\;N(A')=N(A),\;since\;\underline v^T\underline x=0\;for\;all\;\underline x\in N(A)\)

\(But\;dim(C(A^T))=r\;and\;dim(C({A'}^T))=r+1\)

\(While\;dim(N(A'))=dim(N(A))=n-r,\;which\;yields\;a\;contradiction\)


Fundamental theorem of linear algebra part II

\((C{(A^T))}^\perp=N(A)\)

\((N{(A))}^\perp=C{(A^T)}\)

Similarly

\((C{(A))}^\perp=N(A^T)\)

\((N{(A^T))}^\perp=C{(A)}\)


\(N(A)\;is\;the\;orthogonal\;complement\;of\;C(A^T)\;in\;\mathcal R^n\)

\(N(A^T)\;is\;the\;orthogonal\;complement\;of\;C(A)\;in\;\mathcal R^m\)



Claim \({\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent\;in\;\mathcal R^n\;iff\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\)

Proof \(Let\;A=\begin{bmatrix}{\underline v}_1&{\underline v}_2&\cdots&{\underline v}_n\end{bmatrix}\)

\(\Rightarrow\)

\(If\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent,\;then\;they\;form\;a\;basis\;for\;C(A)\;and\;rank(A)=n\)

\(Hence\;A\;is\;invertible\;and\;A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\in\mathcal R^n\)

\(Then\;C(A)=\mathcal R^n\)

\(Therefore\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\)

\(\Leftarrow\)

\(If\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\;then\;A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\in\mathcal R^n\)

\(Hence\;elimination\;produces\;no\;zero\;rows,\;there\;are\;n\;pivots\;and\;thus\;rank(A)=n\)

\(The\;nullspace\;N(A)=\{\underline0\},\;which\;implies\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent.\)

Remark \(Any\;n\;independent\;vectors\;form\;a\;basis\;in\;\mathcal R^n.\;Also\;any\;n\;vectors\;that\;span\;\mathcal R^n\;are\;a\;basis\;for\;\mathcal R^n.\)


Claim \(If\;V\perp W,\;then\;V\cap W=\{\underline0\}\)

\(Suppose\;\underline v\in V\cap W,\;then\;\;\underline v\in V\;and\;\;\underline v\in W\)

\(We\;have\;{\parallel\underline v\parallel}^2=\underline v^T\underline v=0,\;which\;implies\;\underline v=\underline0\)

Claim \(For\;any\;vector\;\underline x\in\mathcal R^n,\;we\;can\;have\;\underline x={\underline x}_r+{\underline x}_n\)

\(when\;{\underline x}_r\in C(A^T)\;and\;{\underline x}_n\in N(A)\)

Proof \(Let\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r\;be\;a\;basis\;for\;C(A^T)\;and\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;be\;a\;basis\;for\;N(A)\)

\(Suppose\;a_1{\underline v}_1+a_2{\underline v}_2+\cdots+a_r{\underline v}_r\;\;+\;\;b_1{\underline w}_1+b_2{\underline w}_2+\cdots+b_{n-r}{\underline w}_{n-r}=\underline0\)

\(Let\;\underline u=a_1{\underline v}_1+a_2{\underline v}_2+\cdots+a_r{\underline v}_r\)

\(=-b_1{\underline w}_1-b_2{\underline w}_2-\cdots-b_{n-r}{\underline w}_{n-r}\)

\(Then\;\underline u\in C(A^T)\;and\;\underline u\in N(A)\)

\(Since\;C(A^T)\perp N(A),\;we\;must\;have\;\underline u=\underline0\)

\(Since\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r\;are\;a\;basis\;for\;C(A^T)\;and\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;are\;a\;basis\;for\;N(A)\)

\(a_1=a_2=\cdots=a_r\;\;=\;\;b_1=b_2=\cdots=b_{n-r}\;\;=\;\;0\)

\(Therefore,\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r,\;\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;are\;independent\;and\;hence\;form\;a\;basis\;for\;\mathcal R^n\)

\(For\;every\;\underline x\in\mathcal R^n,\;we\;can\;have\;\underline x=c_1{\underline v}_1+c_2{\underline v}_2+\cdots+c_r{\underline v}_r+c_{r+1}{\underline w}_1+c_{r+2}{\underline w}_2+\cdots+c_n{\underline w}_{n-r}\)

\(={\underline x}_r+{\underline x}_n\)

\(where\;{\underline x}_r\in C(A^T)\;and\;{\underline x}_n\in N(A)\)

Remark \(The\;decomposition\;\underline x={\underline x}_r+{\underline x}_n\;is\;unique\)

Remark \(For\;every\;\underline x\in\mathcal R^m,\;\underline x={\underline x}_c+{\underline x}_{ln}\)

\(where\;{\underline x}_c\in C(A)\;and\;{\underline x}_{ln}\in N(A^T)\)

\(For\;\underline x\in\mathcal R^n\)

\(A\underline x=A({\underline x}_r+{\underline x}_n)\)

\(=A{\underline x}_r+\underline0\)

\(=A{\underline x}_r\)

\(Via\;A,\)

\({\underline x}_n\;goes\;to\;\underline0\;\because\;A{\underline x}_n=\underline0\)

\({\underline x}_r\;goes\;to\;the\;column\;space\;\because A{\underline x}_r=A\underline x\)




Claim: Every \(\underline b\) in the column space comes from one and only one vector in the row space.

Proof: \(Let\;\underline b=A{\underline x}_r=A\underline x'_r\in C(A)\;where\;{\underline x}_r\;and\;\underline x'_r\in C(A^T)\)

\(\Rightarrow A(\;{\underline x}_r-\underline x'_r)=\underline0\)

\(\Rightarrow{\underline x}_r-\underline x'_r\in N(A)\)

\(Since\;also\;{\underline x}_r-\underline x'_r\in C(A^T)\;and\;C(A^T)\cap N(A)=\{\underline0\},\;we\;have\;{\underline x}_r-\underline x'_r=\underline0,\;yielding\;{\underline x}_r=\underline x'_r\)

[Ex]

\(A=\begin{bmatrix}1&2\\3&6\end{bmatrix},\;we\;have\;\underline x=\begin{bmatrix}4\\3\end{bmatrix}=\underbrace{\begin{bmatrix}2\\4\end{bmatrix}}_{{\underline x}_r}+\underbrace{\begin{bmatrix}2\\-1\end{bmatrix}}_{{\underline x}_n}\)
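For this 2×2 example, the split can be computed by projecting \(\underline x\) onto the row space, which is spanned by \((1,2)\); a minimal numpy sketch:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 6.]])
x = np.array([4., 3.])
a = np.array([1., 2.])                 # spans the row space C(A^T)
x_r = (a @ x) / (a @ a) * a            # component in the row space
x_n = x - x_r                          # remainder lies in the nullspace
print(x_r, x_n)                        # [2. 4.] [ 2. -1.]
print(A @ x_n)                         # [0. 0.], so x_n is indeed in N(A)
```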

Projections

\(\underline b=\begin{bmatrix}2\\3\\4\end{bmatrix}\)

\(what\;are\;the\;projections\;of\;\underline b\;onto\;the\;z\;axis\;and\;the\;xy\;plane?\)

\({\underline p}_1=\begin{bmatrix}0\\0\\4\end{bmatrix},\;{\underline p}_2=\begin{bmatrix}2\\3\\0\end{bmatrix}\)

\({\underline p}_1={\underline P}_1\underline b=\begin{bmatrix}0&0&0\\0&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}2\\3\\4\end{bmatrix}=\begin{bmatrix}0\\0\\4\end{bmatrix}\)

\({\underline p}_2={\underline P}_2\underline b=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\begin{bmatrix}2\\3\\4\end{bmatrix}=\begin{bmatrix}2\\3\\0\end{bmatrix}\)

\(For\;vectors\;{\underline p}_1+{\underline p}_2=\underline b\)

\(\begin{bmatrix}0\\0\\4\end{bmatrix}+\begin{bmatrix}2\\3\\0\end{bmatrix}=\begin{bmatrix}2\\3\\4\end{bmatrix}\)

\(For\;matrices,\;{\underline P}_1+{\underline P}_2=I\)


Projections onto a line


\(We\;want\;to\;find\;the\;projection\;of\;\underline b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\;onto\;the\;line\;in\;the\;direction\;of\;\underline a=\begin{bmatrix}a_1\\a_2\\\vdots\\a_m\end{bmatrix}\)

\(Let\;\underline p=\widehat x\underline a\)

\(then\;\underline e=\underline b-\underline p=\underline b-\widehat x\underline a,\;\underline e\;and\;\underline a\;should\;be\;orthogonal\)

\(\Rightarrow\underline a^T\underline e=\underline a^T(\underline b-\widehat x\underline a)=0\)

\(\Rightarrow\underline a^T\underline b-\widehat x\underline a^T\underline a=0\)

\(\Rightarrow\widehat x=\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\therefore\underline p=\widehat x\underline a=(\frac{\underline a^T\underline b}{\underline a^T\underline a})\underline a\)

\(If\;\underline b=\underline a,\;then\;\widehat x=1\;and\;\underline p=\underline a\)

\(If\;\underline b\;and\;\underline a\;are\;orthogonal,\;then\;\;\widehat x=0\;and\;\underline p=\underline0\)

\(\underline p=\widehat x\underline a=\underline a\widehat x=\underline a(\frac{\underline a^T\underline b}{\underline a^T\underline a})=(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})\underline b=\underline P\;\underline b\)

\(where\;\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}\;is\;the\;projection\;matrix,\;\underline a\;\underline a^T\rightarrow m\times m\)

\(\underline e=\underline b-\underline p=\underline b-\underline P\;\underline b=(I-\underline P)\underline b\)


[Ex] \(\underline b=\begin{bmatrix}1\\1\\1\end{bmatrix},\;\underline a=\begin{bmatrix}1\\2\\2\end{bmatrix}\)

\(\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}=\frac1{1^2+2^2+2^2}\begin{bmatrix}1\\2\\2\end{bmatrix}\begin{bmatrix}1&2&2\end{bmatrix}\)

\(=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\)

\(\underline p=\underline P\;\underline b=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\begin{bmatrix}1\\1\\1\end{bmatrix}=\frac19\begin{bmatrix}5\\10\\10\end{bmatrix}=\begin{bmatrix}\frac59\\\frac{10}9\\\frac{10}9\end{bmatrix}\)

\(\underline e=\underline b-\underline p\)

\(=\begin{bmatrix}1\\1\\1\end{bmatrix}-\begin{bmatrix}\frac59\\\frac{10}9\\\frac{10}9\end{bmatrix}=\begin{bmatrix}\frac49\\-\frac19\\-\frac19\end{bmatrix}\)

\(\underline a^T\underline e=\begin{bmatrix}1&2&2\end{bmatrix}\begin{bmatrix}\frac49\\-\frac19\\-\frac19\end{bmatrix}=\frac49-\frac29-\frac29=0\)



Note that \(\underline P^2\)

\(=\underline P\;\underline P=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\)

\(=\frac1{81}\begin{bmatrix}9&18&18\\18&36&36\\18&36&36\end{bmatrix}=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}=\underline P\)

(since projecting a second time does not change anything.)

In general, \(\underline P^2=(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})=\frac{\underline a(\underline a^T\underline a)\underline a^T}{{(\underline a^T\underline a)}^2}=\frac{(\cancel{\underline a^T\underline a})\underline a\;\underline a^T}{{(\underline a^T\underline a)}^\cancel2}\)

\(=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}=\underline P\)
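The rank-one projection example above is easy to replicate numerically (a minimal numpy sketch, not part of the derivation):

```python
import numpy as np

a = np.array([[1.], [2.], [2.]])
b = np.array([[1.], [1.], [1.]])
P = (a @ a.T) / (a.T @ a)        # P = a a^T / (a^T a)
p = P @ b
e = b - p
print(p.ravel())                 # [5/9, 10/9, 10/9]
print((a.T @ e).item())          # 0: the error is orthogonal to a
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```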


Projection onto a subspace


\(Start\;with\;n\;independent\;vectors\;{\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n\;in\;\mathcal R^m.\)

\(We\;want\;to\;find\;\underline p={\widehat x}_1{\underline a}_1+{\widehat x}_2{\underline a}_2+\cdots+{\widehat x}_n{\underline a}_n\;as\;the\;projection\;of\;a\;given\;vector\;\underline b\)

\(Let\;A={\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}}_{m\times n}\)

\(then\;\;\underline p=\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\\\vdots\\{\widehat x}_n\end{bmatrix}=A\underline{\widehat x}\)

\(The\;error\;vector\;\underline b-A\underline{\widehat x}\;should\;be\;orthogonal\;to\;the\;subspace\;spanned\;by\;{\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n.\;(the\;column\;space\;of\;A)\)

\(\begin{array}{c}\underline a_1^T(\underline b-A\underline{\widehat x})=0\\\underline a_2^T(\underline b-A\underline{\widehat x})=0\\\vdots\\\underline a_n^T(\underline b-A\underline{\widehat x})=0\end{array}\)

\(\Rightarrow\begin{bmatrix}\underline a_1^T\\\underline a_2^T\\\vdots\\\underline a_n^T\end{bmatrix}\begin{bmatrix}\underline b-A\underline{\widehat x}\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\)

\(\Rightarrow A^T(\underline b-A\underline{\widehat x})=\underline0\)

\(\Rightarrow A^T\underline b-A^TA\underline{\widehat x}=\underline0\)

\(\Rightarrow\underbrace{(A^TA)}_{n\times n}\;\underbrace{\underline{\widehat x}}_{n\times1}=\underbrace{A^T\underline b}_{n\times1}\)

The matrix \(A^TA\) is square and symmetric, and it is invertible iff \({\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n\) are independent.

\(\Rightarrow\underline{\widehat x}={(A^TA)}^{-1}A^T\underline b\)

\(\therefore\underline p=A\underline{\widehat x}={\color[rgb]{0.0, 0.0, 1.0}(}A{(A^TA)}^{-1}A^T{\color[rgb]{0.0, 0.0, 1.0})}\underline b\)

The projection matrix \(\underline P=A{(A^TA)}^{-1}A^T\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline{\widehat x}}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}{\color[rgb]{0.9, 0.76, 0.02}\underline b}\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline p}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}\underline{\widehat x}}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}{\color[rgb]{0.9, 0.76, 0.02}\underline b}\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline P}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}\)

compared with the case n=1,

\(\widehat x=\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\underline p=\underline a\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}\)

Note \(\underline{\mathrm P}^2=\mathrm A{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}\cancel{\mathrm A^{\mathrm T}\mathrm A}\cancel{{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}}\mathrm A^{\mathrm T}\)

\(=\mathrm A{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}\mathrm A^{\mathrm T}=\underline{\mathrm P}\)


[Ex] \(A=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\;and\;\underline b=\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(A^TA=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}=\begin{bmatrix}3&3\\3&5\end{bmatrix}\)

\(A^T\underline b=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(A^TA\widehat x=A^T\underline b\)

\(\Rightarrow\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

\(\underline p=A\underline{\widehat x}=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}5\\-3\end{bmatrix}=\begin{bmatrix}5\\2\\-1\end{bmatrix}\)

\(\underline e=\underline b-\underline p=\begin{bmatrix}6\\0\\0\end{bmatrix}-\begin{bmatrix}5\\2\\-1\end{bmatrix}=\begin{bmatrix}1\\-2\\1\end{bmatrix}\)

\(A^T\underline e=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1\\-2\\1\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\underline P=A{(A^TA)}^{-1}A^T\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}{(\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix})}^{-1}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}{(\begin{bmatrix}3&3\\3&5\end{bmatrix})}^{-1}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}\frac56&-\frac12\\-\frac12&\frac12\end{bmatrix}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\frac16\begin{bmatrix}5&-3\\3&0\\-1&3\end{bmatrix}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(\underline P=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}\)

\(\underline P^2=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}=\underline P\)
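The same example, done with numpy (a sketch; np.linalg.solve applies the normal equations directly):

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])
b = np.array([6., 0., 0.])
x_hat = np.linalg.solve(A.T @ A, A.T @ b)      # A^T A x_hat = A^T b
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(x_hat)                                   # [ 5. -3.]
print(A @ x_hat)                               # p = [ 5.  2. -1.]
print(np.allclose(P @ P, P))                   # True
```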


Claim \(rank(A^TA)=rank(AA^T)=rank(A)\)

Proof:

We first show that \(A^TA\) has the same nullspace as \(A\),

\(i.e.,\;A^TA\underline x=\underline0\;\Leftrightarrow\;A\underline x=\underline0\)

\(''\Leftarrow''\;A\underline x=\underline0\;\Rightarrow A^TA\underline x=A^T\underline0\Rightarrow A^TA\underline x=\underline0\)

\(''\Rightarrow''\;A^TA\underline x=\underline0\)

\(\Rightarrow\underline x^TA^TA\underline x=\underline x^T\underline0\;=0\)

\(\Rightarrow{(A\underline x)}^TA\underline x=0\)

\(\Rightarrow\left\|A\underline x\right\|^2=0\)

\(\Rightarrow A\underline x=\underline0\)

\(Therefore,\;N(A^TA)=N(A)\;which\;implies,\)

\(n-rank(A^TA)=n-rank(A)\)

\(\Rightarrow rank(A^TA)=rank(A)\)

\(Similarly,\;replacing\;A\;with\;A^T,\;we\;get\;rank(AA^T)=rank(A^T)=rank(A)\)
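A one-line numerical confirmation of the rank identities (a minimal sketch with numpy, using an example matrix from earlier):

```python
import numpy as np

A = np.array([[1., 3., 0., 2., -1.],
              [0., 0., 1., 4., -3.],
              [1., 3., 1., 6., -4.]])
print(np.linalg.matrix_rank(A),
      np.linalg.matrix_rank(A.T @ A),
      np.linalg.matrix_rank(A @ A.T))   # all three equal 2
```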



Remark 1: \(A^TA\;is\;invertible\;iff\;A\;has\;independent\;columns.\)

Proof: "\(\Rightarrow\)" \(A\) has independent columns,

\(\Rightarrow rank(A)=n\)

\(\Rightarrow rank(A^TA)=n\)

\(\Rightarrow A^TA\;is\;invertible\)

"\(\Leftarrow\)"

\(A^TA\;is\;invertible\)

\(\Rightarrow rank(A^TA)=n\)

\(\Rightarrow rank(A)=n\)

\(\Rightarrow A\;has\;independent\;columns\)


Remark 2: \(When\;A\;has\;independent\;columns,\;A^TA\;is\;square,\;symmetric,\;and\;invertible.\)

Least Squares Approximations

\(It\;often\;happens\;that\;A\underline x=\underline b\;has\;no\;solution.\)

\(\Rightarrow We\;cannot\;always\;get\;the\;error\;\underline e\;=\;\underline b-A\underline x\;down\;to\;zero.\)

\(In\;this\;case,\;we\;want\;to\;make\;the\;length\;of\;\underline e,\;or\;\left\|\underline e\right\|^2,\;as\;small\;as\;possible.\)

\(\Rightarrow\;least\;squares\;solution\)


[Ex] Find the closest line to the points (0,6), (1,0), and (2,0).



\(b=C+Dt\)

\(\begin{array}{c}C+D\cdot0=6\\C+D\cdot1=0\\C+D\cdot2=0\end{array}\)

\(A=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix},\;\underline x=\begin{bmatrix}C\\D\end{bmatrix},\;\underline b=\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(A\underline x=\underline b\;is\;not\;solvable.\;We\;want\;to\;make\;\left\|\underline b-A\underline x\right\|^2\;as\;small\;as\;possible.\;How?\)

\(By\;geometry,\;the\;best\;fit\;is\;\widehat{\underline x}\;such\;that\;\underline p=A\widehat{\underline x}\;is\;the\;projection\;of\;\underline b\;onto\;\mathcal C(A)\)



\(A^T(\underline b-A\widehat{\underline x})=\underline0\)

\(\Rightarrow A^TA\widehat{\underline x}=A^T\underline b\)

\(\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(\widehat{\underline x}=\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

Therefore, the best line for the 3 points is b=5-3t.

Recall \(\underline x={\underline x}_r+{\underline x}_n,\;where\;\underline{\;x}\in\mathcal R^n,\;{\underline x}_r\in\mathcal C(A^T),\;{\underline x}_n\in\mathcal N(A)\)

By linear algebra, \(\underline b=\underline p+\underline e,\;where\;\underline b\in\mathcal R^m,\;\underline p\in\mathcal C(A),\;\underline e\in\mathcal N(A^T),\;and\;\underline p^T\underline e=0\)

\(We\;know\;that\;A\underline x=\underline b=\underline p+\underline e\;is\;impossible\;and\;A\widehat{\underline x}=\underline p\;is\;solvable.\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|A\underline x-\underline p-\underline e\right\|^2\)

\(={(A\underline x-\underline p-\underline e)}^T(A\underline x-\underline p-\underline e)\)

\(=\left\|A\underline x-\underline p\right\|^2+\left\|\underline e\right\|^2-{(A\underline x-\underline p)}^T\underline e-\underline e^T(A\underline x-\underline p)\)

\({(A\underline x-\underline p)}^T\underline e=0\;since\;(A\underline x-\underline p)\in\mathcal C(A)\;and\;\underline e\in\mathcal N(A^T)\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|A\underline x-\underline p\right\|^2+\left\|\underline e\right\|^2\)

\(\geq\left\|\underline e\right\|^2\)

\(\therefore\left\|A\underline x-\underline b\right\|^2\;is\;minimized\;when\;\underline x=\widehat{\underline x}\)

\(i.e.,\;\underset{\underline x}{min}\left\|A\underline x-\underline b\right\|^2=\left\|A\widehat{\underline x}-\underline b\right\|^2\)

\(=\left\|A\widehat{\underline x}-\underline p\right\|^2+\left\|\underline e\right\|^2=\left\|\underline e\right\|^2\)

\(where\;A^TA\widehat{\underline x}=A^T\underline b\;and\;\underline p=A\widehat{\underline x}\)



\(By\;calculus,\;E=\left\|A\underline x-\underline b\right\|^2\)

\(={(C+D\cdot0-6)}^2+{(C+D\cdot1)}^2+{(C+D\cdot2)}^2\)

\(\frac{\partial E}{\partial C}=2(C-6)+2(C+D)+2(C+2D)=0\)

\(\frac{\partial E}{\partial D}=0+2(C+D)+2(C+2D)\cdot2=0\)

\(\Rightarrow3C+3D=6\)

\(3C+5D=0\)

\(\Rightarrow\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(A^TA\widehat{\underline x}=A^T\underline b\)

\(\therefore\widehat{\underline x}=\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

\(The\;partial\;derivatives\;of\;\left\|A\underline x-\underline b\right\|^2\;are\;zero\;when\;A^TA\widehat{\underline x}=A^T\underline b.\)

\(b=5-3t\;is\;best\;fit.\)

\(At\;t=0,1,2,\;this\;line\;goes\;through\;5,\;2,\;-1;\;it\;cannot\;go\;through\;6,\;0,\;0.\)

\(The\;errors\;are\;1,\;-2,\;1,\;which\;form\;the\;vector\;\underline e.\)

\(\underset{\underline x}{min}\left\|A\underline x-\underline b\right\|^2=\left\|\underline e\right\|^2=6\)

\(When\;A\underline x=\underline b\;is\;not\;solvable,\;the\;least\;squares\;solution\;\widehat{\underline x}\;satisfies\)

\(A^TA\widehat{\underline x}=A^T\underline b\)


Fitting a straight line

\(Problem:\;Fit\;heights\;b_1,b_2,\cdots,b_m\;at\;times\;t_1,t_2,\cdots,t_m\;by\;a\;straight\;line\;C+Dt\)

solution: \(\left\{\begin{array}{l}C+Dt_1=b_1\\C+Dt_2=b_2\\\vdots\\C+Dt_m=b_m\end{array}\right.\)

\(\begin{bmatrix}1&t_1\\1&t_2\\\vdots&\vdots\\1&t_m\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\)

\(A\underline x=\underline b\)

\(This\;is\;generally\;not\;solvable.\;Best\;fit:\;A^TA\widehat{\underline x}=A^T\underline b\)

\(A^TA=\begin{bmatrix}1&1&\cdots&1\\t_1&t_2&\cdots&t_m\end{bmatrix}\begin{bmatrix}1&t_1\\1&t_2\\\vdots&\vdots\\1&t_m\end{bmatrix}=\begin{bmatrix}m&\underset i{\sum t_i}\\\underset i{\sum t_i}&\underset i{\sum t_i^2}\end{bmatrix}\)

\(A^T\underline b=\begin{bmatrix}1&1&\cdots&1\\t_1&t_2&\cdots&t_m\end{bmatrix}\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}=\begin{bmatrix}\underset i{\sum b_i}\\\underset i{\sum t_ib_i}\end{bmatrix}\)

\(solve\;\begin{bmatrix}m&\underset i{\sum t_i}\\\underset i{\sum t_i}&\underset i{\sum t_i^2}\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}\underset i{\sum b_i}\\\underset i{\sum t_ib_i}\end{bmatrix}\)

\(for\;C\;and\;D,\;the\;best\;\widehat{\underline x}=(C,D)\;minimizes\;\left\|A\underline x-\underline b\right\|^2\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|\underline e\right\|^2=\sum_{i=1}^m{(C+Dt_i-b_i)}^2\)
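For concreteness, a small numpy sketch fitting a line through hypothetical data (the values of \(t_i\) and \(b_i\) below are made up for illustration):

```python
import numpy as np

t = np.array([0., 1., 2., 3.])                 # hypothetical times
b = np.array([6., 0., 0., 2.])                 # hypothetical heights
A = np.column_stack([np.ones_like(t), t])      # columns: all ones, and t
C, D = np.linalg.solve(A.T @ A, A.T @ b)       # normal equations A^T A x_hat = A^T b
print(C, D)
print(np.linalg.lstsq(A, b, rcond=None)[0])    # numpy's least squares gives the same C, D
```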


Fitting a Parabola

\(Problem:\;Fit\;heights\;b_1,b_2,\cdots,b_m\;at\;t_1,t_2,\cdots,t_m\;by\;a\;parabola\;C+Dt+Et^2\)

Solution: \(\left\{\begin{array}{l}C+Dt_1+Et_1^2=b_1\\C+Dt_2+Et_2^2=b_2\\\vdots\\C+Dt_m+Et_m^2=b_m\end{array}\right.\)

\(A=\begin{bmatrix}1&t_1&t_1^2\\1&t_2&t_2^2\\\vdots&\vdots&\vdots\\1&t_m&t_m^2\end{bmatrix},\;\underline x=\begin{bmatrix}C\\D\\E\end{bmatrix},\;\underline b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\)

The same least squares approach works for other fitting functions, e.g. \(C+Dt_m+E\cdot e^{t_m}\;or\;C+Dt_m+E\cdot\log(t_m)\); only the columns of \(A\) change.

Orthogonal Bases and Gram-Schmidt


Def \(The\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;orthogonal\;if\;\underline q_i^T{\underline q}_j=0\;whenever\;i\neq j\)

Claim \(If\;nonzero\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;orthogonal,\;then\;they\;are\;independent.\)

Proof \(consider\;x_1{\underline q}_1+x_2{\underline q}_2+\;\dots+x_n{\underline q}_n=\underline0\)

\(\Rightarrow\underline q_1^T(x_1{\underline q}_1+x_2{\underline q}_2+\;\dots+x_n{\underline q}_n)=x_1\underline q_1^T{\underline q}_1+\underbrace{x_2\underline q_1^T{\underline q}_2}_{=0}+\;\dots+\underbrace{x_n\underline q_1^T{\underline q}_n}_{=0}\)

\(=\underline q_1^T\underline0=0\)

\(\Rightarrow x_1\underline q_1^T{\underline q}_1=0\)

\(\Rightarrow x_1\left\|{\underline q}_1\right\|^2=0\)

\(\left\|{\underline q}_1\right\|^2\neq0\;\Rightarrow\;x_1=0\)

\(Similarly,\;x_2=x_3=\dots=x_n=0\)


Def \(The\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;{\color[rgb]{0.0, 0.0, 1.0}\boldsymbol{orthonormal}}\;if\)

\(\underline q_i^T{\underline q}_j=\left\{\begin{array}{l}0,\;if\;i\neq j\\1,\;if\;i=j\end{array}\right.\)

\(A\;matrix\;Q_{m\times n}\;with\;orthonormal\;columns\;satisfies\)

\(Q^TQ={\begin{bmatrix}\underline q_1^T\\\underline q_2^T\\\vdots\\\underline q_n^T\end{bmatrix}}_{n\times m}{\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}\)

\(={\begin{bmatrix}1&0&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{bmatrix}}_{n\times n}=I\)

\(When\;Q\;is\;a\;square\;matrix\;(m=n)\;Q^TQ=I\;means\;that\;Q^T=Q^{-1}.\)

\(In\;this\;case,\;Q\;is\;called\;an\;{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}t}{\color[rgb]{0.68, 0.46, 0.12}h}{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}g}{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}n}{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}l}{\color[rgb]{0.68, 0.46, 0.12}\;}{\color[rgb]{0.68, 0.46, 0.12}m}{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}t}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}i}{\color[rgb]{0.68, 0.46, 0.12}x}.\)


[Ex] (Rotation matrix) \(Q=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\)

\(Q\begin{bmatrix}1\\0\end{bmatrix}=\begin{bmatrix}\cos\theta\\\sin\theta\end{bmatrix}\)

\(Q\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}-\sin\theta\\\cos\theta\end{bmatrix}\)



\(Q^{-1}=Q^T=\begin{bmatrix}\cos\theta&\sin\theta\\-\sin\theta&\cos\theta\end{bmatrix}\)

\(Q^TQ=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


reflection across the y axis, \(Q_1=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

\(Q_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\y\end{bmatrix}\)

\({Q_1}^{-1}={Q_1}^T=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

\({Q_1}^T{Q_1}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


reflection across \(45^\circ\)-line \(Q_2=\begin{bmatrix}0&1\\1&0\end{bmatrix}\)

\(Q_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}y\\x\end{bmatrix}\)

\(Q_2^TQ_2=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)

\(Q_2^{-1}=Q_2^T=\begin{bmatrix}0&1\\1&0\end{bmatrix}=Q_2\)


Claim \(If\;Q\;has\;orthonormal\;columns.\;i.e.,\;Q^TQ=I,\;then\)

(i) \(\left\|Q\underline x\right\|=\left\|\underline x\right\|\)

(ii) \({(Q\underline x)}^T(Q\underline y)=\underline x^T\underline y\)

Proof \((i)\;\left\|Q\underline x\right\|^2=(Q\underline x{)^T(Q\underline x)}\)

\(=(\underline x^TQ^T)(Q\underline x)\)

\(=\underline x^T(Q^TQ)\underline x\)

\(=\underline x^TI\underline x\)

\(=\underline x^T\underline x\)

\(=\left\|\underline x\right\|^2\)

Proof: \((ii)\;(Q\underline x{)^T(Q\underline y)}\)

\(=(\underline x^TQ^T)(Q\underline y)\)

\(=\underline x^T(Q^TQ)\underline y\)

\(=\underline x^TI\underline y\)

\(=\underline x^T\underline y\)


Projection using Orthonormal Bases

Suppose the basis vectors are orthonormal, \(Q=\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}\;with\;Q^TQ=I\)

\(The\;least\;squares\;solution\;of\;Q\underline x=\underline b\)

\(Q^TQ\widehat{\underline x}=Q^T\underline b\)

\(\Rightarrow I\widehat{\underline x}=Q^T\underline b\)

\(\Rightarrow\widehat{\underline x}=Q^T\underline b\)

\(Q^T(\underline b-Q\widehat{\underline x})=\underline0\)

The projection vector,

\(\left\{\begin{array}{l}\underline p=Q\widehat{\underline x}\\\widehat{\underline x}=Q^T\underline b\end{array}\right.\Rightarrow\underline p=Q\widehat{\underline x}=QQ^T\underline b\)

\(={\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}{\begin{bmatrix}\underline q_1^T\\\underline q_2^T\\\vdots\\\underline q_n^T\end{bmatrix}}_{n\times m}{\underline b}_{m\times1}\)

\(={\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}{\begin{bmatrix}\underline q_1^T\underline b\\\underline q_2^T\underline b\\\vdots\\\underline q_n^T\underline b\end{bmatrix}}_{n\times1}\)

\(={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)

recall: projection vector,

\(\underline p=\frac{\underline a^T\underline b}{\underline a^T\underline a}\underline a=\frac{\underline a^T\underline b}{\left\|\underline a\right\|^2}\underline a\)

\(={(\frac{\underline a}{\left\|\underline a\right\|})}^T\underline b\frac{\underline a}{\left\|\underline a\right\|}\)

\(=(\underline q^T\underline b)\underline q\)

\(\underline q^T\underline b=\left\|\underline q\right\|\left\|\underline b\right\|\cos\theta=\left\|\underline b\right\|\cos\theta\)

\({\underline q}_1,\;{\underline q}_2,\;\cdots,\;{\underline q}_n\;orthonormal\)

\(\underline p={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)

The projection matrix \(\underline P=QQ^T\)

\(When\;Q\;is\;a\;square\;matrix\;(m=n),\;the\;subspace\;(\mathcal C(Q))\;is\;the\;whole\;vector\;space\;\mathcal R^n\;and\;Q^T=Q^{-1}\)

\(\underline{\widehat x}=Q^T\underline b=Q^{-1}\underline b\)

\(which\;is\;the\;exact\;solution\;to\;Q\underline x=\underline b\)

\(In\;this\;case,\;\underline P=QQ^T=QQ^{-1}=I\)

\(and\;the\;projection\;of\;\underline b\;is\;\underline b\;itself.\;i.e.,\;\underline p=\underline b\)

\(Therefore,\;\underline b={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)
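A short numpy check of the expansion \(\underline b=\sum_i{\underline q}_i(\underline q_i^T\underline b)\) for a square orthogonal matrix (the rotation matrix is used here only as a convenient example):

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthonormal columns, Q^T Q = I
b = np.array([2., 5.])
coeffs = Q.T @ b                                  # the numbers q_i^T b
print(np.allclose(Q @ coeffs, b))                 # True: b = sum of q_i (q_i^T b)
```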


Gram–Schmidt (Orthogonalization) process

\(Given\;n\;independent\;vector\;{\underline a}_1,\;{\underline a}_2,\;\cdots,\;{\underline a}_n\)

\(we\;want\;to\;find\;n\;orthonormal\;vectors\;{\underline q}_1,\;{\underline q}_2,\;\cdots,\;{\underline q}_n\;with\;the\;same\;span.\)

1. \({\underline A}_1={\underline a}_1,\;then\;{\underline q}_1=\frac{{\underline A}_1}{\left\|{\underline A}_1\right\|}\)

2. \({\underline A}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1,\;then\;{\underline q}_2=\frac{{\underline A}_2}{\left\|{\underline A}_2\right\|}\)

3. \({\underline A}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2,\;then\;{\underline q}_3=\frac{{\underline A}_3}{\left\|{\underline A}_3\right\|}\)

In general, \({\underline A}_i={\underline a}_i-\sum_{j=1}^{i-1}(\underline q_j^T{\underline a}_i){\underline q}_j,\;{\underline q}_i=\frac{{\underline A}_i}{\left\|{\underline A}_i\right\|},\;for\;i=1,2,3,\dots,n\)



We now come to a fundamentally important algorithm, which is called the Gram-Schmidt orthogonalization procedure. This algorithm makes it possible to construct, for each list of linearly independent vectors (resp. basis), a corresponding orthonormal list (resp. orthonormal basis).

If \((v_1,\ldots,v_m) \) is a list of linearly independent vectors in \(V\), then there exists an orthonormal list \((e_1,\ldots,e_m) \) such that

\[ span(v_1,\ldots,v_k) = span(e_1,\ldots,e_k), \quad \text{for all \(k=1,\ldots,m\).} \]

The proof is constructive, that is, we will actually construct vectors \(e_1,\ldots,e_m \) having the desired properties. Since \((v_1,\ldots,v_m) \) is linearly independent, \(v_k\neq 0 \) for each \(k=1,2,\ldots,m\). Set \(e_1=\frac{v_1}{\left\|v_1\right\|}\). Then \(e_{1} \) is a vector of norm 1 and satisfies Equation above for \(k=1\). Next, set

\[e_2=\frac{v_2-\left\langle v_2,e_1\right\rangle e_1}{\left\|v_2-\left\langle v_2,e_1\right\rangle e_1\right\|}.\]

This is, in fact, the normalized version of the orthogonal decomposition Equation. I.e.,

\[w=v_2-\left\langle v_2,e_1\right\rangle e_1,\]

where \(w\bot e_1\). Note that \(\left\|e_2\right\|=1\) and \(span(e_1,e_2)=span(v_1,v_2)\).

Now, suppose that \(e_1,\ldots,e_{k-1} \) have been constructed such that \((e_1,\ldots,e_{k-1})\) is an orthonormal list and \(span(v_1,\ldots,v_{k-1}) = span(e_1,\ldots,e_{k-1})\). Then define
\[e_k=\frac{v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|}\]

Since \((v_1,\ldots,v_k) \) is linearly independent, we know that \(v_k\not\in span(v_1,\ldots,v_{k-1})\). Hence, we also know that \(v_k\not\in span(e_1,\ldots,e_{k-1})\). It follows that the norm in the definition of \(e_k \) is not zero, and so \(e_k \) is well-defined (i.e., we are not dividing by zero). Note that a vector divided by its norm has norm 1 so that \(\left\|e_k\right\|=1\). Furthermore,

\[\begin{array}{rl}\left\langle e_k,e_i\right\rangle&=\left\langle\frac{v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|},e_i\right\rangle\end{array}\]

\[=\frac{\left\langle v_k,e_i\right\rangle-\left\langle v_k,e_i\right\rangle}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|}=0\]

for each \(1\le i<k\). Hence, \((e_1,\ldots,e_k) \) is orthonormal.


[Ex 1] \({\underline a}_1=\begin{bmatrix}1\\-1\\0\end{bmatrix},\;{\underline a}_2=\begin{bmatrix}2\\0\\-2\end{bmatrix},\;{\underline a}_3=\begin{bmatrix}3\\-3\\3\end{bmatrix}\)

\({\underline A}_1={\underline a}_1=\begin{bmatrix}1\\-1\\0\end{bmatrix}\)

\({\underline q}_1=\frac{{\underline A}_1}{\left\|{\underline A}_1\right\|}=\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}\)

\({\underline A}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1=\begin{bmatrix}2\\0\\-2\end{bmatrix}-\frac2{\sqrt2}\cdot\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}=\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\({\underline q}_2=\frac{{\underline A}_2}{\left\|{\underline A}_2\right\|}=\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\({\underline A}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2\)

\(=\begin{bmatrix}3\\-3\\3\end{bmatrix}-\frac6{\sqrt2}\cdot\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}-\frac{-6}{\sqrt6}\cdot\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\(=\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\({\underline q}_3=\frac{{\underline A}_3}{\left\|{\underline A}_3\right\|}=\frac1{\sqrt3}\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\(\therefore{\underline q}_1=\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix},\;{\underline q}_2=\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix},\;{\underline q}_3=\frac1{\sqrt3}\begin{bmatrix}1\\1\\1\end{bmatrix}\)
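The steps above translate directly into code. Below is a small, illustrative NumPy implementation of classical Gram–Schmidt (a sketch, not a numerically robust production routine); running it on \({\underline a}_1,{\underline a}_2,{\underline a}_3\) from Ex 1 reproduces \({\underline q}_1,{\underline q}_2,{\underline q}_3\).

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize a list of independent vectors."""
    qs = []
    for a in vectors:
        A = np.asarray(a, dtype=float).copy()
        for q in qs:                      # subtract components along earlier q's
            A = A - (q @ a) * q
        qs.append(A / np.linalg.norm(A))
    return np.column_stack(qs)

a1 = np.array([1.0, -1.0, 0.0])
a2 = np.array([2.0,  0.0, -2.0])
a3 = np.array([3.0, -3.0, 3.0])
Q = gram_schmidt([a1, a2, a3])

print(np.round(Q, 4))                    # columns are q1, q2, q3 from Ex 1
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: the columns are orthonormal
```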


[Ex 2]

Let \(u_1=\begin{bmatrix}1\\0\\-1\\\end{bmatrix} ,u_2=\begin{bmatrix}2\\-1\\0\\\end{bmatrix} ,u_3=\begin{bmatrix}1\\2\\1\\\end{bmatrix}\).

To find the required orthonormal basis \(\{w_1,w_2,w_3\}\),

first we have \(w_1=\frac{u_1}{\Arrowvert u_1\Arrowvert}=\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}.\)

Second, find \(u_2-(w_1\cdot u_2)w_1\) as follows: \(u_2-(w_1\cdot u_2)w_1=\begin{bmatrix}2\\-1\\0\end{bmatrix}-\sqrt2\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}=\begin{bmatrix}1\\-1\\1\end{bmatrix}.\)

By taking the dot product, you can see that \(w_1\) is orthogonal to the above vector: \(w_1\cdot[u_2-(w_1\cdot u_2)w_1]=w_1\cdot u_2-(w_1\cdot u_2)w_1\cdot w_1=0\) since \(w_1\) is a unit vector.

So we can take \(w_2=\frac{u_2-(w_1\cdot u_2)w_1}{\Arrowvert u_2-(w_1\cdot u_2)w_1\Arrowvert}=\begin{bmatrix}\frac1{\sqrt3}\\-\frac1{\sqrt3}\\\frac1{\sqrt3}\end{bmatrix}.\)

Finally, find \(u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2\) as follows:

\(u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2=\begin{bmatrix}1\\2\\1\end{bmatrix}-0\cdot\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}-0\cdot\begin{bmatrix}\frac1{\sqrt3}\\-\frac1{\sqrt3}\\\frac1{\sqrt3}\end{bmatrix}=\begin{bmatrix}1\\2\\1\end{bmatrix}.\)

By taking the dot product, you can again see that \(w_1\) and \(w_2\) are orthogonal to the above vector.

So we can take \(w_3=\frac{u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2}{\Arrowvert u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2\Arrowvert}=\begin{bmatrix}\frac1{\sqrt6}\\\frac2{\sqrt6}\\\frac1{\sqrt6}\end{bmatrix}.\)

\(w_1=\frac1{\sqrt2}\begin{bmatrix}1\\0\\-1\end{bmatrix},\;w_2=\frac1{\sqrt3}\begin{bmatrix}1\\-1\\1\end{bmatrix},\;w_3=\frac1{\sqrt6}\begin{bmatrix}1\\2\\1\end{bmatrix}\)


The Factorization A=QR


QR-Factorization

One of the main virtues of orthogonal matrices is that they can be easily inverted—the transpose is the inverse. This fact, combined with the factorization theorem in this section, provides a useful way to simplify many matrix calculations (for example, in least squares approximation).



Proposition

Let \(A\) be a \(K\times L\) matrix. If the columns of \(A\) are linearly independent, then \(A\) can be factorized as,

\(A=QR\)

where \(Q\) is a \(K\times L\) matrix whose columns form an orthonormal set, and \(R\) is an \(L\times L\) upper triangular matrix whose diagonal entries are strictly positive.


Proposition

Under the assumptions of the previous proposition, the \(QR\) decomposition is unique, that is, the matrices \(Q\) and \(R\) satisfying the stated properties are unique.


\(Given\;independent\;vectors\;{\underline a}_1,{\underline a}_2,{\underline a}_3,\;Gram-Schmidt\;constructs\;orthonormal\;{\underline q}_1,{\underline q}_2,{\underline q}_3\;(via\;the\;intermediate\;vectors\;{\underline A}_1,{\underline A}_2,{\underline A}_3)\)

\({\underline a}_1,{\underline A}_1\;and\;{\underline q}_1\;span\;the\;same\;subspace.\)

\({\underline a}_1,{\underline a}_2\;and\;{\underline A}_1,{\underline A}_2\;and\;{\underline q}_1,{\underline q}_2\;span\;the\;same\;subspace.\)

\({\underline a}_1,{\underline a}_2,{\underline a}_3\;and\;{\underline A}_1,{\underline A}_2,{\underline A}_3\;and\;{\underline q}_1,{\underline q}_2,{\underline q}_3\;span\;the\;same\;subspace.\)

\(Therefore,\;we\;can\;have\)

\({\underline a}_1=(\underline q_1^T{\underline a}_1){\underline q}_1\)

\({\underline a}_2=(\underline q_1^T{\underline a}_2){\underline q}_1+(\underline q_2^T{\underline a}_2){\underline q}_2\)

\({\underline a}_3=(\underline q_1^T{\underline a}_3){\underline q}_1+(\underline q_2^T{\underline a}_3){\underline q}_2+(\underline q_3^T{\underline a}_3){\underline q}_3\)

\(\underbrace{{\begin{bmatrix}{\underline a}_1&{\underline a}_2&{\underline a}_3\end{bmatrix}}_{m\times3}}_A=\underbrace{{\begin{bmatrix}{\underline q}_1&{\underline q}_2&{\underline q}_3\end{bmatrix}}_{m\times3}}_{Q,\;orthonormal\;columns}\underbrace{{\begin{bmatrix}\underline q_1^T{\underline a}_1&\underline q_1^T{\underline a}_2&\underline q_1^T{\underline a}_3\\0&\underline q_2^T{\underline a}_2&\underline q_2^T{\underline a}_3\\0&0&\underline q_3^T{\underline a}_3\end{bmatrix}}_{3\times3}}_{R,\;upper\;triangular}\)

\(In\;general,\;from\;independent\;vectors\;{\underline a}_1,{\underline a}_2,\dots,{\underline a}_n,\;Gram-Schmidt\;constructs\;{\underline q}_1,{\underline q}_2,\dots,{\underline q}_n\)

\(We\;can\;have\;A_{m\times n}=Q_{m\times n}R_{n\times n}\)

\(A_{m\times n}=\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}\)

\(Q_{m\times n}=\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}\)

\(R_{n\times n}=upper\;triangular\)

\(A=QR\Rightarrow\)

\(Q^TA=Q^TQR=IR=R\)

\(Note\;{\underline A}_1=\left\|{\underline A}_1\right\|{\underline q}_1={\underline a}_1\)

\(\Rightarrow{\underline a}_1=\left\|{\underline A}_1\right\|{\underline q}_1\)

\(\therefore\underline q_1^T{\underline a}_1=\left\|{\underline A}_1\right\|\)

\({\underline A}_2=\left\|{\underline A}_2\right\|{\underline q}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1\)

\(\Rightarrow{\underline a}_2=(\underline q_1^T{\underline a}_2){\underline q}_1+\left\|{\underline A}_2\right\|{\underline q}_2\)

\(\therefore\underline q_2^T{\underline a}_2=\left\|{\underline A}_2\right\|\)

\({\underline A}_3=\left\|{\underline A}_3\right\|{\underline q}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2\)

\(\Rightarrow{\underline a}_3=(\underline q_1^T{\underline a}_3){\underline q}_1+(\underline q_2^T{\underline a}_3){\underline q}_2+\left\|{\underline A}_3\right\|{\underline q}_3\)

\(\therefore\underline q_3^T{\underline a}_3=\left\|{\underline A}_3\right\|>0\)

\(Therefore,\;the\;diagonal\;elements\;of\;R\;are,\)

\(\underline q_1^T{\underline a}_1=\left\|{\underline A}_1\right\|>0,\;\underline q_2^T{\underline a}_2=\left\|{\underline A}_2\right\|>0,\underline q_3^T{\underline a}_3=\left\|{\underline A}_3\right\|>0,\dots,\underline q_n^T{\underline a}_n=\left\|{\underline A}_n\right\|>0\)

R is upper triangular with positive diagonal elements \(\Rightarrow\)R is invertible (n nonzero pivots)

\(The\;least\;squares\;solution\;to\;A\underline x=\underline b\;is\;\underline{\widehat x}\;satisfying\;A^TA\underline{\widehat x}=A^T\underline b\)

\(\Rightarrow{(QR)}^T(QR)\underline{\widehat x}={(QR)}^T\underline b\)

\(\Rightarrow R^TQ^TQR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow R^TIR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow R^TR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow{(R^T)}^{-1}R^TR\underline{\widehat x}={(R^T)}^{-1}R^TQ^T\underline b\)

\(\Rightarrow R\underline{\widehat x}=Q^T\underline b\;(can\;be\;solved\;by\;back\;substitution)\)

\(or\;\underline{\widehat x}=R^{-1}Q^T\underline b\)
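Here is a hedged NumPy/SciPy sketch of this least-squares recipe (the matrix \(A\) and vector \(\underline b\) are made-up data): factor \(A=QR\), then solve \(R\widehat{\underline x}=Q^T\underline b\) by back substitution and compare with the normal-equations solution.

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])    # made-up overdetermined system
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)                    # reduced QR: Q is 3x2, R is 2x2 upper triangular
x_hat = solve_triangular(R, Q.T @ b)      # back substitution for R x_hat = Q^T b

x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations, for comparison
print(x_hat, np.allclose(x_hat, x_normal))     # same least-squares solution
```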



Inner Product Space

\(\underline v^T\underline w\overset\triangle=v_1w_1+v_2w_2+\cdots+v_nw_n\)

Def: \(An\;inner\;product\;on\;a\;vector\;space\;V\;is\;a\;function\;that\;assigns\;to\;every\;ordered\;pair\;of\;vectors\;\underline v\;and\;\underline w\;in\;V,\)

\(a\;scalar\;\left\langle\underline v,\underline w\right\rangle\;such\;that\;for\;all\;\underline u,\underline v,\underline w\;in\;V\;and\;all\;scalars\;c,\;the\;following\;hold:\)

(1) \(\left\langle\underline u+\underline v,\underline w\right\rangle=\left\langle\underline u,\underline w\right\rangle+\left\langle\underline v,\underline w\right\rangle\)

(2) \(\left\langle c\underline v,\underline w\right\rangle=c\left\langle\underline v,\underline w\right\rangle\)

(3) \(\left\langle\underline v,\underline w\right\rangle=\left\langle\underline w,\underline v\right\rangle\)

(4) \(\left\langle\underline v,\underline v\right\rangle\geq0\;with\;equality\;iff\;\underline v=\underline0\)

A vector space with an inner product is called an inner product space.

Remark: This definition is for real scalars.

[Ex] \(V=\mathcal R^n\)

\(\left\langle\underline v,\underline w\right\rangle\overset\triangle=\underline v^T\underline w=v_1w_1+v_2w_2+\cdots+v_nw_n\;is\;an\;inner\;product.\)

\(\left\langle\underline u+\underline v,\underline w\right\rangle=\left\langle\underline u,\underline w\right\rangle+\left\langle\underline v,\underline w\right\rangle\Rightarrow{(\underline u+\underline v)}^T\underline w=\underline u^T\underline w+\underline v^T\underline w\)

\(\left\langle\underline v,\underline w\right\rangle=\left\langle\underline w,\underline v\right\rangle\Rightarrow\underline v^T\underline w=\underline w^T\underline v\)

[Ex] \(V=C\lbrack a,b\rbrack\)

\(=the\;vector\;space\;of\;all\;real-valued\;continuous\;functions\;on\;\lbrack a,b\rbrack\)

\(For\;f,g\in V,\;define\;\left\langle f,g\right\rangle=\int_a^bf(t)g(t)\operatorname dt\)

\(\left\langle f_1+f_2,g\right\rangle=\left\langle f_1,g\right\rangle+\left\langle f_2,g\right\rangle\)

\(\int_a^b\lbrack f_1(t)+f_2(t)\rbrack g(t)\operatorname dt=\int_a^bf_1(t)g(t)\operatorname dt+\int_a^bf_2(t)g(t)\operatorname dt\)

(1) and (2) are immediate and (3) is trivial.

For (4) \(\left\langle f,f\right\rangle=\int_a^bf^2(t)\operatorname dt\geq0\)

\(If\;f\neq0,\;then\;f^2\;is\;bounded\;away\;from\;zero\;on\;some\;interval\;of\;\lbrack a,b\rbrack\;(since\;f\;is\;continuous)\)

\(and\;hence\;\left\langle f,f\right\rangle=\int_a^bf^2(t)\operatorname dt>0\)


[Ex] \(Consider\;\mathcal C\lbrack0,2\mathrm\pi\rbrack,\;\left\langle\mathrm f,\mathrm g\right\rangle=\int_0^{2\mathrm\pi}\mathrm f(\mathrm t)\mathrm g(\mathrm t)\operatorname d\mathrm t\)

\(consider\;1,\;\cos t,\;\sin t,\;\cos2t,\;\sin2t,\dots,\cos Nt,\;\sin Nt\)

\(\left\langle1,\cos nt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\operatorname d\mathrm t=0\)

\(\left\langle1,\sin nt\right\rangle=\int_0^{2\mathrm\pi}\sin nt\operatorname d\mathrm t=0\)

\(\left\langle\cos nt,\sin mt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\sin mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\sin(n+m)t-\sin(n-m)t\rbrack dt=0,\;1\leq n,m\leq N\)

\(\left\langle\cos nt,\cos mt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\cos mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\cos(n-m)t+\cos(n+m)t\rbrack dt=0,\;1\leq n\neq m\leq N\)

\(\left\langle\sin nt,\sin mt\right\rangle=\int_0^{2\mathrm\pi}\sin nt\sin mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\cos(n-m)t-\cos(n+m)t\rbrack dt=0,\;1\leq n\neq m\leq N\)

So they are mutually orthogonal.

Also \(\left\langle1,1\right\rangle=\int_0^{2\mathrm\pi}1^2\operatorname d\mathrm t=2\mathrm\pi\)

\(\left\langle\cos nt,\cos nt\right\rangle=\int_0^{2\mathrm\pi}\cos^2nt\operatorname d\mathrm t=\mathrm\pi\)

\(\cos(2x)=2\cos^2(x)-1\)

\(\therefore S_N=\{\frac1{\sqrt{2\mathrm\pi}},\frac1{\sqrt{\mathrm\pi}}\cos t,\frac1{\sqrt{\mathrm\pi}}\sin t,\frac1{\sqrt{\mathrm\pi}}\cos2t,\frac1{\sqrt{\mathrm\pi}}\sin2t,\dots,\frac1{\sqrt{\mathrm\pi}}\cos Nt,\frac1{\sqrt{\mathrm\pi}}\sin Nt\}\;is\;an\;orthonormal\;set.\)

\(The\;projection\;of\;f(t)\;onto\;the\;subspace\;spanned\;by\;S_N\;is,\)

\(f_N(t)=A_0\cdot\frac1{\sqrt{2\mathrm\pi}}+\sum_{n=1}^N(A_n\frac1{\sqrt{\mathrm\pi}}\cos nt+B_n\frac1{\sqrt{\mathrm\pi}}\sin nt)\)

\(where\;A_0=\left\langle\frac1{\sqrt{2\mathrm\pi}},f(t)\right\rangle=\frac1{\sqrt{2\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(A_n=\left\langle\frac1{\sqrt{\mathrm\pi}}\cos nt,f(t)\right\rangle=\frac1{\sqrt{\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t\)

\(B_n=\left\langle\frac1{\sqrt{\mathrm\pi}}\sin nt,f(t)\right\rangle=\frac1{\sqrt{\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t\)

\(or\;f_N(t)=a_0+\sum_{n=1}^N(a_n\cos nt+b_n\sin nt)\)

\(where\;a_0=\frac1{2\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(a_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t\)

\(b_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t\)

\(Let\;N\rightarrow\infty,\;we\;can\;have\)

\(f(t)=\frac{A_0}{\sqrt{2\mathrm\pi}}+\sum_{n=1}^\infty(\frac{A_n}{\sqrt{\mathrm\pi}}\cos nt+\frac{B_n}{\sqrt{\mathrm\pi}}\sin nt)\)

\(=a_0+\sum_{n=1}^\infty(a_n\cos nt+b_n\sin nt)\)

\(where\;a_0=\frac1{2\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(a_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t,\;\mathrm n=1,2,3,\dots\)

\(b_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t,\;\mathrm n=1,2,3,\dots\)

This is the Fourier series
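As an illustration, the coefficient formulas can be evaluated numerically. The sketch below (the square-wave \(f\) is just an example choice) approximates \(a_0,a_n,b_n\) with simple Riemann sums and builds the partial sum \(f_N(t)\).

```python
import numpy as np

f = lambda t: np.sign(np.sin(t))               # example function: a square wave
t = np.linspace(0.0, 2*np.pi, 4000, endpoint=False)
dt = 2*np.pi / len(t)

N = 5
a0 = np.sum(f(t)) * dt / (2*np.pi)
a = [np.sum(f(t)*np.cos(n*t)) * dt / np.pi for n in range(1, N+1)]
b = [np.sum(f(t)*np.sin(n*t)) * dt / np.pi for n in range(1, N+1)]

print(np.round(b, 3))    # approx [1.273, 0, 0.424, 0, 0.255], i.e. 4/(n*pi) for odd n
# Partial sum f_N(t) built from the projection coefficients:
fN = a0 + sum(a[n-1]*np.cos(n*t) + b[n-1]*np.sin(n*t) for n in range(1, N+1))
```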







Note \(\left\langle f,f\right\rangle=\left\|f\right\|^2\)

\(=\int_0^{2\mathrm\pi}f^2(t)dt\)

\(=A_0^2+\sum_{n=1}^\infty(A_n^2+B_n^2)\)

\(=2\mathrm\pi a_0^2+\mathrm\pi\sum_{n=1}^\infty(a_n^2+b_n^2)\)

Determinants


In mathematics, the determinant is a scalar-valued function of the entries of a square matrix. The determinant of a matrix A is commonly denoted det(A), det A, or |A|. Its value characterizes some properties of the matrix and the linear map represented, on a given basis, by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the corresponding linear map is an isomorphism.

[Ex] \(det\;\begin{bmatrix}a&b\\c&d\end{bmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}=ad-bc\)

Def: \(The\;determinant\;of\;an\;n\times n\;(real)\;matrix\;A,\;denoted\;by\;det(A)\;or\;\left|A\right|,\)

\(is\;a\;real\;number\;defined\;by\;the\;following\;three\;rules:\)

1. det I=1

2. The determinant changes sign when two rows are exchanged.

3. The determinant is a linear function of each row separately.

[Ex] \(det\;\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}=-det\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=-1\)

\(if\;\underline P\;is\;a\;permutation\;matrix,\;then\;det\;\underline P=\pm1\)

\((+1\;if\;an\;even\;number\;of\;row\;exchanges\;of\;I\;gives\;\underline P;\;-1\;for\;an\;odd\;number\;of\;row\;exchanges.)\)

[Ex] \(\begin{vmatrix}ta&tb\\c&d\end{vmatrix}=t\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

\(\begin{vmatrix}a+a'&b+b'\\c&d\end{vmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}+\begin{vmatrix}a'&b'\\c&d\end{vmatrix}\)

\(\begin{vmatrix}sa+ta'&sb+tb'\\c&d\end{vmatrix}=s\begin{vmatrix}a&b\\c&d\end{vmatrix}+t\begin{vmatrix}a'&b'\\c&d\end{vmatrix}\)

[Ex] \(\begin{vmatrix}2&0\\0&2\end{vmatrix}=2\begin{vmatrix}1&0\\0&2\end{vmatrix}=2^2\begin{vmatrix}1&0\\0&1\end{vmatrix}=4\)

\(\therefore det(tI)=t^n\)

We can then derive the following rules 4-10.

4. If two rows are equal, then \(det\;A=0\)

[Ex] \(\begin{vmatrix}a&b\\a&b\end{vmatrix}=0\)

Proof: Exchanging the two equal rows does not change A, but by rule 2 it changes the sign of the determinant.

\(det\;A=-det\;A,\;therefore\;det\;A=0\)

5. Subtracting a multiple of one row from another row leaves \(det\;A\) unchanged.

[Ex] \(\begin{vmatrix}a&b\\c-la&d-lb\end{vmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

Proof: \(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}-la_{i1}&a_{j2}-la_{i2}&\vdots&a_{jn}-la_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\)

\(=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}&a_{j2}&\vdots&a_{jn}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}-l\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\) (by rule 3)

\(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}=0\) (by rule 4)

\(\therefore=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}&a_{j2}&\vdots&a_{jn}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\)

Remark \(Suppose\;A\rightarrow U,\;elimination\;(without\;row\;exchange)\)

\(Then\;det\;A=det\;U\)

6. A matrix with a row of zeros has \(det\;A=0\)

[Ex] \(\begin{vmatrix}0&0\\c&d\end{vmatrix}=\begin{vmatrix}a&b\\0&0\end{vmatrix}=0\)

Proof: \(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\0&0&\vdots&0\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\) (by rule 5)

add row i to the zero row

\(=0\) (by rule 4)

7. Determinant of a triangular matrix: If A is triangular, then \(det\;A=a_{11}a_{22}\cdots a_{nn}\) = product of diagonal entries.

[Ex] \(\begin{vmatrix}a&b\\0&d\end{vmatrix}=\begin{vmatrix}a&0\\c&d\end{vmatrix}=ad\)

Proof case 1: Suppose all diagonal entries are nonzero,

\(A=\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}\\0&a_{22}&\dots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\dots&a_{nn}\end{bmatrix}\;or\;\begin{bmatrix}a_{11}&0&\dots&0\\a_{21}&a_{22}&\dots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{bmatrix}\)

Then elimination steps can reduce \(A\) to a diagonal matrix \(D\).

\(Then\;det\;A=det\;D\;by\;rule\;5,\;and\;det\;D=a_{11}a_{22}\cdots a_{nn}\;by\;rule\;3\;(factor\;out\;each\;diagonal\;entry)\;and\;rule\;1\)

case 2: Suppose a diagonal entry is zero. Then the triangular A is singular. Elimination produces a zero row.

By rule 5, the determinant is unchanged and by rule 6, a zero row means det A=0.

\(which\;equals\;a_{11}a_{22}\cdots a_{nn},\;since\;one\;of\;the\;diagonal\;entries\;is\;zero\)

8. Determinant of a (non)singular matrix: \(If\;A\;is\;singular,\;then\;det\;A=0.\;If\;A\;is\;invertible,\;then\;det\;A\neq0\)

\(\begin{bmatrix}a&b\\c&d\end{bmatrix}\;is\;singular\;iff\;\begin{vmatrix}a&b\\c&d\end{vmatrix}=0\)

Proof \(A\rightarrow U\;(upper\;triangular)\)

elimination & row exchanges

\(By\;rule\;2\;and\;rule\;5,\;det\;A=\pm det\;U\;(\pm\;depends\;on\;the\;number\;of\;row\;exchanges)\)

\(If\;A\;is\;singular,\;then\;U\;has\;a\;zero\;row.\;By\;rule\;6,\;det\;U=0,\;which\;gives\;det\;A=0\)

\(If\;A\;is\;invertible,\;then\;U\;has\;the\;nonzero\;pivots\;along\;its\;diagonal.\)

\(Then\;det\;A=\pm det\;U=\pm p_1p_2\cdots p_n\neq0\) by rule 7

Remark \(det\;A=\pm p_1p_2\cdots p_n\)

9. \(\left|AB\right|=\left|A\right|\left|B\right|\)

Proof: consider the case \(\left|B\right|=0\). Then B is singular (by rule 8).

It implies AB is singular.

\((B\;is\;singular\;\Rightarrow B\underline x=\underline0\;for\;some\;\underline x\neq\underline0)\)

\(\Rightarrow AB\underline x=\underline0\;for\;some\;\underline x\neq\underline0\)

\(\Rightarrow AB\;is\;singular\)

\(Therefore\;\left|AB\right|=0\;and\;\left|A\right|\left|B\right|=0\)

\(When\;\left|B\right|\neq0\;consider\;d(A)=\frac{\left|AB\right|}{\left|B\right|}\)

\(We\;now\;show\;that\;d(A)\;satisfies\;rules\;1,2,3.\;Hence\;d(A)=\left|A\right|\)

\(1.\;If\;A=I\;then\;d(A)=\left|IB\right|/\left|B\right|=\left|B\right|/\left|B\right|=1\)

\(2.\;A=\begin{bmatrix}{\underline a}_1\\{\underline a}_2\\\vdots\\{\underline a}_n\end{bmatrix},\;AB=\begin{bmatrix}{\underline a}_1\\{\underline a}_2\\\vdots\\{\underline a}_n\end{bmatrix}B=\begin{bmatrix}{\underline a}_1B\\{\underline a}_2B\\\vdots\\{\underline a}_nB\end{bmatrix}\)

\(When\;two\;rows\;of\;A\;are\;exchanged,\;so\;are\;the\;same\;two\;rows\;of\;AB.\)

\(Therefore,\;\left|AB\right|\;changes\;sign\;and\;so\;does\;the\;ratio\;\frac{\left|AB\right|}{\left|B\right|}\)

\(3.\;Add\;row\;i\;of\;A\;to\;row\;i\;of\;A'.\;Then\;row\;i\;of\;AB\;adds\;to\;row\;i\;of\;A'B.\)

\(By\;rule\;3,\;determinants\;add.\;After\;dividing\;by\;\left|B\right|,\;the\;ratios\;add,\;as\;desired.\)

10. \(\left|A^T\right|=\left|A\right|\)

[Ex] \(\begin{vmatrix}a&b\\c&d\end{vmatrix}=\begin{vmatrix}a&c\\b&d\end{vmatrix}\)

Proof: when \(A\) is singular, \(A^T\) is singular.

\(det\;A=0=det\;A^T\;(by\;rule\;8)\)

\(Otherwise,\;we\;can\;have\;PA=LU\)

\(\Rightarrow A^TP^T=U^TL^T\)

by rule 9, \(\left|P\right|\left|A\right|=\left|L\right|\left|U\right|\)

\(\left|A^T\right|\left|P^T\right|=\left|U^T\right|\left|L^T\right|\)

\(first,\;\left|L\right|=\left|L^T\right|=1\;since\;both\;have\;1's\;on\;the\;diagonal\;(by\;rule\;7)\)

\(second,\;\left|U\right|=\left|U^T\right|\;since\;both\;triangular\;matrices\;have\;the\;same\;diagonal\;(by\;rule\;7)\)

\(third,\;PP^T=I,\;(P^{-1}=P^T)\)

\(\Rightarrow\left|P\right|\left|P^T\right|=1,\;(by\;rule\;1\;and\;rule\;9)\)

\(\Rightarrow\left|P\right|\;and\;\left|P^T\right|\;are\;both\;+1\;or\;both\;-1,\;(by\;rule\;1\;and\;rule\;2)\)

\(\Rightarrow\left|P\right|\;=\;\left|P^T\right|,\;therefore\;\left|A\right|\;=\;\left|A^T\right|\)

Remark Every rule for the rows can apply to the columns \((since\;\left|A\right|\;=\;\left|A^T\right|)\)




Theoretical Development

Components
  • Rule of Null Product: The determinant of a square matrix is zero if the matrix has a zero row (or column).

  • Row (or Column) Interchange: If two rows (or columns) of a matrix are interchanged, the absolute value of the determinant is unchanged but its sign flips, so the determinant becomes the opposite of the original value.

  • Multiplication of a Single Row (or Column) by a Scalar: If a row (or column) of a matrix is multiplied by a scalar k, the determinant is multiplied by k.

  • Addition of a Multiple of One Row (or Column) to Another Row (or Column): If a row (or column) is replaced by the sum of that row (or column) and a multiple of another row (or column), the determinant is unchanged.

Permutations and cofactors


The Pivot Formula

Recall for an \(n\times n\) matrix \(A\), we have \(PA=LU\)

\(Then\;det(P)det(A)=det(L)det(U)\) by rule 9

\(\Rightarrow\pm det(A)=1\cdot(d_1d_2\cdots d_n)\)

\(\Rightarrow det(A)=\pm d_1d_2\cdots d_n\)

\(If\;there\;are\;fewer\;than\;n\;pivots,\;det(A)=0\;and\;A\;is\;singular.\)

Observe that the first pivot depends only on the left corner of the original matrix A,

\(\begin{bmatrix}a_{11}&a_{12}&\cdots&\cdots\\a_{21}&a_{22}&\cdots&\cdots\\\vdots&\vdots&\ddots&\\\vdots&\vdots&&\ddots\end{bmatrix}\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}&\cdots\\a_{21}&a_{22}&a_{23}&\cdots\\a_{31}&a_{32}&a_{33}&\cdots\\\vdots&\vdots&\vdots&\ddots\end{bmatrix}\)

In general, the first k pivots come from the \(k\times k\) submatrix \(A_k\) in the upper left corner of \(A\),

provided no row exchanges are needed. Hence \(det(A_k)=d_1d_2\cdots d_k,\;for\;k=1,2,\cdots,n\)

Therefore, the \(k_{th}\) pivot is,

\(d_k=\frac{d_1d_2\cdots d_k}{d_1d_2\cdots d_{k-1}}=\frac{det(A_k)}{det(A_{k-1})}\)

(assuming no row exchanges)

Note that we do not need row exchanges when all these corner submatrices \(A_k\;have\;det(A_k)\neq0\)
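A small NumPy check of the pivot formula (the matrix here is an arbitrary example needing no row exchanges): the ratios \(det(A_k)/det(A_{k-1})\) reproduce the pivots from elimination, and their product reproduces \(det(A)\).

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])    # example matrix, no row exchanges needed

dets = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]          # det(A_1), det(A_2), det(A_3)
pivots = [dets[0]] + [dets[k]/dets[k-1] for k in range(1, 3)]   # d_k = det(A_k)/det(A_{k-1})
print(np.round(pivots, 4))                                      # [2.0, 1.5, 1.3333]
print(np.isclose(np.prod(pivots), np.linalg.det(A)))            # True: det A = d1 d2 d3
```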



The Big Formula

\(\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\c&d\end{vmatrix}+\begin{vmatrix}0&b\\c&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\c&0\end{vmatrix}+\begin{vmatrix}a&0\\0&d\end{vmatrix}+\begin{vmatrix}0&b\\c&0\end{vmatrix}+\begin{vmatrix}0&b\\0&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\0&d\end{vmatrix}+\begin{vmatrix}0&b\\c&0\end{vmatrix}\)

\(=ad\begin{vmatrix}1&0\\0&1\end{vmatrix}+bc\begin{vmatrix}0&1\\1&0\end{vmatrix}\)

\(=ad-bc\)

\(det\;A=\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(we\;can\;split\;a\;row\;(\begin{array}{ccc}a_{11}&a_{12}&a_{13}\end{array})\;into\;3\;rows\)

\((\begin{array}{ccc}a_{11}&0&0\end{array})+(\begin{array}{ccc}0&a_{12}&0\end{array})+(\begin{array}{ccc}0&0&a_{13}\end{array})\)

using linearity of each row, det A splits into \(3^3=27\) simple determinants.

Yet if a column choice is repeated, the corresponding simple determinant is zero.

e.g., \(\begin{vmatrix}a_{11}&0&0\\a_{21}&0&0\\0&0&a_{33}\end{vmatrix}=0\)

Hence there are only 3!=6 simple determinants left.

\(\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(=\begin{vmatrix}a_{11}&0&0\\0&a_{22}&0\\0&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\0&0&a_{23}\\a_{31}&0&0\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\a_{21}&0&0\\0&a_{32}&0\end{vmatrix}\)

\(+\begin{vmatrix}a_{11}&0&0\\0&0&a_{23}\\0&a_{32}&0\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\a_{21}&0&0\\0&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\0&a_{22}&0\\a_{31}&0&0\end{vmatrix}\)

\(=a_{11}a_{22}a_{33}\begin{vmatrix}1&0&0\\0&1&0\\0&0&1\end{vmatrix}+a_{12}a_{23}a_{31}\begin{vmatrix}0&1&0\\0&0&1\\1&0&0\end{vmatrix}+a_{13}a_{21}a_{32}\begin{vmatrix}0&0&1\\1&0&0\\0&1&0\end{vmatrix}\)

\(+a_{11}a_{23}a_{32}\begin{vmatrix}1&0&0\\0&0&1\\0&1&0\end{vmatrix}+a_{12}a_{21}a_{33}\begin{vmatrix}0&1&0\\1&0&0\\0&0&1\end{vmatrix}+a_{13}a_{22}a_{31}\begin{vmatrix}0&0&1\\0&1&0\\1&0&0\end{vmatrix}\)

\(=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}\)

\(+(-a_{11}a_{23}a_{32})+(-a_{12}a_{21}a_{33})+(-a_{13}a_{22}a_{31})\)

\(In\;general,\;for\;an\;n\times n\;matrix\;A,\)

\(det\;A=\sum_{\sigma=(\alpha,\beta,\cdots,\omega)}(det\;P_\sigma)a_{1\alpha}a_{2\beta}\cdots a_{n\omega}\)

\(where\;\sigma=(\alpha,\beta,\cdots,\omega)\;is\;a\;permutation\;of\;(1,2,\cdots,n)\)

\(and\;P_\sigma\;is\;the\;corresponding\;permutation\;matrix\;of\;\sigma\)

\(\sigma=(2\;1\;3)\)

\(P_\sigma=\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\)

The summation is over all possible permutations of \((1,2,\cdots,n)\)

\(\therefore There\;are\;n!\;terms\;in\;det\;A.\)

[Ex] \(n=4,\;\sigma=(1,4,2,3)\)

The corresponding term is \((det\;P_\sigma)\;a_{11}a_{24}a_{32}a_{43}\)

and \(det\;P_\sigma=\begin{vmatrix}1&0&0&0\\0&0&0&1\\0&1&0&0\\0&0&1&0\end{vmatrix}=1\)

[Ex] \(\begin{vmatrix}1&0&a&0\\0&1&b&0\\0&0&c&0\\0&0&d&1\end{vmatrix}=\begin{vmatrix}1&0&0&0\\0&1&0&0\\0&0&c&0\\0&0&0&1\end{vmatrix}=c\begin{vmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{vmatrix}=c\)

[Ex] \(\begin{vmatrix}0&\boxed1&0&0\\1&0&\cancel1&0\\0&\cancel1&0&1\\0&0&\boxed1&0\end{vmatrix}=\begin{vmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{vmatrix}=1\)

Cofactor Formula

\(\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}=a_{11}(a_{22}a_{33}-a_{23}a_{32})+a_{12}(a_{23}a_{31}-a_{21}a_{33})+a_{13}(a_{21}a_{32}-a_{22}a_{31})\)

\(=\begin{vmatrix}a_{11}&0&0\\0&a_{22}&a_{23}\\0&a_{32}&a_{33}\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\a_{21}&0&a_{23}\\a_{31}&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\a_{21}&a_{22}&0\\a_{31}&a_{32}&0\end{vmatrix}\)

\(=a_{11}\cdot(+detM_{11})+a_{12}\cdot(-detM_{12})+a_{13}\cdot(+detM_{13})\)

\(\triangleq a_{11}C_{11}+a_{12}C_{12}+a_{13}C_{13}\)

\(where\;C_{1j}={(-1)}^{1+j}\cdot detM_{1j}\)

\(In\;general,\;along\;row\;i\;of\;an\;n\times n\;matrix\;A,\)

\(det\;A=a_{i1}C_{i1}+a_{i2}C_{i2}+\cdots+a_{in}C_{in}\)

\(where\;C_{ij}={(-1)}^{i+j}\cdot detM_{ij}\;and\;M_{ij}\;is\;the\;(n-1)\times(n-1)\;submatrix\;of\;A\)

\(with\;row\;i\;and\;column\;j\;removed.\)


\(det\;A=a_{i1}C_{i1}+a_{i2}C_{i2}+\cdots+a_{in}C_{in}\)

\(where\;{\color[rgb]{0.0, 0.0, 1.0}c}{\color[rgb]{0.0, 0.0, 1.0}o}{\color[rgb]{0.0, 0.0, 1.0}f}{\color[rgb]{0.0, 0.0, 1.0}a}{\color[rgb]{0.0, 0.0, 1.0}c}{\color[rgb]{0.0, 0.0, 1.0}t}{\color[rgb]{0.0, 0.0, 1.0}o}{\color[rgb]{0.0, 0.0, 1.0}r}\;C_{ij}={(-1)}^{i+j}\cdot detM_{ij}\)


[Ex] \(M_{43}\) is the submatrix obtained from \(A\) by deleting row 4 and column 3 (the row and column through \(a_{43}\)).

\(sign{(-1)}^{i+j}\Rightarrow\begin{bmatrix}+&-&+&-\\-&+&-&+\\+&-&+&-\\-&+&-&+\end{bmatrix}\)


\(since\;det\;A=det\;A^T\)

\(det\;A=a_{1j}C_{1j}+a_{2j}C_{2j}+\cdots+a_{nj}C_{nj}\)

cofactors down column \(j\)

[Ex] \(\begin{vmatrix}2&-1&0&0\\-1&2&-1&0\\0&-1&2&-1\\0&0&-1&2\end{vmatrix}=2\begin{vmatrix}2&-1&0\\-1&2&-1\\0&-1&2\end{vmatrix}-(-1)\begin{vmatrix}-1&-1&0\\0&2&-1\\0&-1&2\end{vmatrix}\)

\(=2\begin{vmatrix}2&-1&0\\-1&2&-1\\0&-1&2\end{vmatrix}-\begin{vmatrix}2&-1\\-1&2\end{vmatrix}\)

\(Let\;A_n=\begin{bmatrix}2&-1&0&\cdots&0\\-1&2&\ddots&\ddots&\vdots\\0&\ddots&\ddots&\ddots&0\\\vdots&\ddots&\ddots&\ddots&-1\\0&\cdots&0&-1&2\end{bmatrix}\)

\(and\;D_n=det\;A_n\)

\(\therefore D_4=2D_3-D_2\)

In general,

\(D_n=2D_{n-1}-D_{n-2}\;for\;n\geq3\) finite difference equation

\(D_1=2\;and\;D_2=4-1=3\)

\(D_3=2D_2-D_1=2\cdot3-2=4\)

\(D_4=2D_3-D_2=2\cdot4-3=5\)

In general, guess \(D_n=n+1\)

Then \(D_1=1+1=2\)

\(D_2=2+1=3\)

\(D_3=3+1=4\)

\(D_n=2D_{n-1}-D_{n-2}=n+1\)

\(=2\lbrack(n-1)+1\rbrack-\lbrack(n-2)+1\rbrack=2n-n+1=n+1\)
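A short numerical check of the recurrence and the guess \(D_n=n+1\), building \(A_n\) explicitly in NumPy:

```python
import numpy as np

def A_n(n):
    """The -1, 2, -1 tridiagonal matrix of size n."""
    return 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

for n in range(1, 8):
    print(n, round(np.linalg.det(A_n(n))), n + 1)   # D_n equals n + 1
```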


Cramer's Rule, Inverses, and Volumes

Cramer's Rule, try to solve \(A\underline x=\underline b\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}\;\\\underline b\\\;\end{bmatrix}\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}x_1&0&0\\x_2&1&0\\x_3&0&1\end{bmatrix}=\begin{bmatrix}\;&a_{12}&a_{13}\\\underline b&a_{22}&a_{23}\\\;&a_{32}&a_{33}\end{bmatrix}\)

\(\Rightarrow det\;A\cdot x_1=det\;B_1\)

\(\Rightarrow x_1=\frac{det\;B_1}{det\;A}\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}1&x_1&0\\0&x_2&0\\0&x_3&1\end{bmatrix}=\begin{bmatrix}a_{11}&b_1&a_{13}\\a_{21}&b_2&a_{23}\\a_{31}&b_3&a_{33}\end{bmatrix}\)

\(\Rightarrow det\;A\cdot x_2=det\;B_2\)

\(\Rightarrow x_2=\frac{det\;B_2}{det\;A}\)

\(If\;det\;A\neq0,\;A\underline x=\underline b\;is\;solved\;by\;x_1=\frac{det\;B_1}{det\;A},\;x_2=\frac{det\;B_2}{det\;A},\cdots,x_n=\frac{det\;B_n}{det\;A}\)

\(where\;B_j\;has\;the\;jth\;column\;of\;A\;replaced\;by\;\underline b\)



\(\begin{bmatrix}3&4\\5&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}2\\4\end{bmatrix}\)

\(x_1=\frac{\begin{vmatrix}2&4\\4&6\end{vmatrix}}{\begin{vmatrix}3&4\\5&6\end{vmatrix}}=\frac{12-16}{18-20}=\frac{-4}{-2}=2\)

\(x_2=\frac{\begin{vmatrix}3&2\\5&4\end{vmatrix}}{\begin{vmatrix}3&4\\5&6\end{vmatrix}}=\frac{12-10}{18-20}=\frac2{-2}=-1\)
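Cramer's rule is easy to state in code. Below is a small illustrative sketch (fine for tiny systems, far too slow for large ones) that reproduces the \(2\times2\) example above.

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b by Cramer's rule (only sensible for small n)."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b              # B_j: column j of A replaced by b
        x[j] = np.linalg.det(Bj) / detA
    return x

A = np.array([[3.0, 4.0], [5.0, 6.0]])
b = np.array([2.0, 4.0])
print(cramer(A, b))               # [ 2. -1.], matching x1 = 2, x2 = -1 above
```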


Inverse

\(AA^{-1}=I\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\begin{bmatrix}x_{11}&x_{12}&x_{13}\\x_{21}&x_{22}&x_{23}\\x_{31}&x_{32}&x_{33}\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\begin{bmatrix}x_{11}\\x_{21}\\x_{31}\end{bmatrix}=\begin{bmatrix}1\\0\\0\end{bmatrix}\)

\(x_{11}=\frac{\begin{vmatrix}1&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&a_{32}&a_{33}\end{vmatrix}}{det\;A}=\frac{C_{11}}{det\;A}=\frac{\begin{vmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{vmatrix}}{det\;A}\)

\(x_{21}=\frac{\begin{vmatrix}a_{11}&1&a_{13}\\a_{21}&0&a_{23}\\a_{31}&0&a_{33}\end{vmatrix}}{det\;A}=\frac{C_{12}}{det\;A}=\frac{-\begin{vmatrix}a_{21}&a_{23}\\a_{31}&a_{33}\end{vmatrix}}{det\;A}\)

\(x_{31}=\frac{\begin{vmatrix}a_{11}&a_{12}&1\\a_{21}&a_{22}&0\\a_{31}&a_{32}&0\end{vmatrix}}{det\;A}=\frac{C_{13}}{det\;A}=\frac{\begin{vmatrix}a_{21}&a_{22}\\a_{31}&a_{32}\end{vmatrix}}{det\;A}\)

\({(A^{-1})}_{ij}=\frac{C_{ji}}{det\;A}\)

\(and\;A^{-1}=\frac{C^T}{det\;A}\)

\(where\;C_{ij}={(-1)}^{i+j}\;det\;M_{ij}\)

\(and\;\boldsymbol C=\lbrack C_{ij}\rbrack\;cofactor\;matrix\)

Inverse Formula





[Ex] \(A=\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix}\)

\({(A^{-1})}_{11}=\frac{C_{11}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac{\begin{vmatrix}1&0\\1&1\end{vmatrix}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac11=1\)

\({(A^{-1})}_{12}=\frac{C_{21}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac{-\begin{vmatrix}0&0\\1&1\end{vmatrix}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=0\)

\(\therefore A^{-1}=\begin{bmatrix}1&0&0\\-1&1&0\\0&-1&1\end{bmatrix}=\frac{C^T}{det\;A}\)
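The cofactor formula \(A^{-1}=\frac{C^T}{det\;A}\) can also be checked numerically. The sketch below builds the cofactor matrix entry by entry (only sensible for small matrices) and compares the result with np.linalg.inv on the example above.

```python
import numpy as np

def cofactor_inverse(A):
    """Inverse via A^{-1} = C^T / det(A), with C_ij = (-1)^(i+j) det(M_ij)."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            M = np.delete(np.delete(A, i, axis=0), j, axis=1)   # remove row i, column j
            C[i, j] = (-1) ** (i + j) * np.linalg.det(M)
    return C.T / np.linalg.det(A)

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
print(np.round(cofactor_inverse(A)))                        # matches the A^{-1} above
print(np.allclose(cofactor_inverse(A), np.linalg.inv(A)))   # True
```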


Volume

The area of the parallelogram with edge vectors \((x_1,\;y_1)\) and \((x_2,\;y_2)\) is positive if the rotation from \((x_1,\;y_1)\) to \((x_2,\;y_2)\) is counter-clockwise.

claim \(Area\;=det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}\)

Proof: we show that the area satisfies the three rules of determinants.

(1) \(A=\begin{bmatrix}1&0\\0&1\end{bmatrix},\;area=1\cdot1=1\;and\;det\begin{vmatrix}1&0\\0&1\end{vmatrix}=1\)

(2) When the rows are exchanged, the absolute area remains unchanged but the sign changes.

\(det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}=-det\begin{bmatrix}x_2&y_2\\x_1&y_1\end{bmatrix}\)

(3-a) If row 1 is multiplied by t, the area is also multiplied by t.

\(det\begin{bmatrix}tx_1&ty_1\\x_2&y_2\end{bmatrix}=t\cdot det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}\)

(3-b) Suppose row \((x_1,y_1)\) is replaced by \((x_1,y_1)+(x_1',y_1')\). The parallelogram with edges \((x_1,y_1)+(x_1',y_1')\) and \((x_2,y_2)\) has area equal to the sum of the areas of the parallelograms with edges \((x_1,y_1),(x_2,y_2)\) and \((x_1',y_1'),(x_2,y_2)\), so the corresponding areas add.

\(det\begin{bmatrix}x_1+x_1'&y_1+y_1'\\x_2&y_2\end{bmatrix}=det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}+det\begin{bmatrix}x_1'&y_1'\\x_2&y_2\end{bmatrix}\)

Similarly, in three dimensions the value of the volume is positive for a right-handed triple of edge vectors and negative for a left-handed triple.

Claim \(Volume=det\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

Proof: The volume satisfies the three rules of determinants.

Matrix

Singular Value Decomposition



Example (C) Google PageRank

Let's assume the Web contains 6 pages only. Page 2 provides the links connecting to page 3 and page 4. The links between these and the other pages in this simple web are summarised in this diagram.



Their task was to find the "most important" page for a particular search query. The higher priority page would return at the top of the search result.

This is Google's use of eigenvalues and eigenvectors. Considering Page 1, it has 4 outgoing links (to pages 2, 4, 5, and 6). So in the first column of our "links matrix", we place the value \(\frac14\) in each of rows 2, 4, 5 and 6.

\(A=\begin{bmatrix}0&0&0&0&\frac12&0\\\frac14&0&0&0&0&0\\0&\frac12&0&0&0&0\\\frac14&\frac12&0&0&\frac12&0\\\frac14&0&1&1&0&1\\\frac14&0&0&0&0&0\end{bmatrix}\)

Next, to find the eigenvalues. We have

\(\left|A-\lambda I\right|=\begin{vmatrix}-\lambda&0&0&0&\frac12&0\\\frac14&-\lambda&0&0&0&0\\0&\frac12&-\lambda&0&0&0\\\frac14&\frac12&0&-\lambda&\frac12&0\\\frac14&0&1&1&-\lambda&1\\\frac14&0&0&0&0&-\lambda\end{vmatrix}\)

\(=\lambda^6-\frac{5\lambda^4}8-\frac{\lambda^3}4-\frac{\lambda^2}8\)

This expression is zero for \(\lambda=-0.72031,\;-0.13985,\;\pm0.39240j,\;0,\;1\).

We can only use non-negative real values of \(\lambda\), so the relevant eigenvalue is \(\lambda=1\).

We find the corresponding eigenvector is: \(V_1=\begin{bmatrix}4&1&0.5&5.5&8&1\end{bmatrix}^T\)

As Page 5 has the highest PageRank, we conclude it is the most "important", and it will appear at the top of the search results.

We often normalize this vector so the sum of its elements is 1. We just add up the amounts and divide each amount by that total. The result is the normalized vector P, for "PageRank".

\(P=\begin{bmatrix}0.2&0.05&0.025&0.275&0.4&0.05\end{bmatrix}^T\)
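A minimal NumPy sketch of this computation, using the \(6\times6\) link matrix above: take the eigenvector for the eigenvalue closest to 1 and normalize it so its entries sum to 1.

```python
import numpy as np

A = np.array([
    [0,    0,   0, 0, 0.5, 0],
    [0.25, 0,   0, 0, 0,   0],
    [0,    0.5, 0, 0, 0,   0],
    [0.25, 0.5, 0, 0, 0.5, 0],
    [0.25, 0,   1, 1, 0,   1],
    [0.25, 0,   0, 0, 0,   0],
])

vals, vecs = np.linalg.eig(A)
k = np.argmin(np.abs(vals - 1.0))        # pick the eigenvalue closest to 1
v = np.real(vecs[:, k])
P = v / v.sum()                          # normalize so the entries sum to 1
print(np.round(P, 3))   # approx [0.2, 0.05, 0.025, 0.275, 0.4, 0.05]
```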

Visualization of Singular Value decomposition of a Symmetric Matrix

The Singular Value Decomposition of a matrix A satisfies

\(\mathbf A = \mathbf U \mathbf \Sigma \mathbf V^\top\) The visualization of it would look like



But when \(\mathbf A\) is symmetric we can do:

\( \begin{align*} \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf U\mathbf \Sigma\mathbf V^\top)^\top\\ \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf V\mathbf \Sigma\mathbf U^\top) \end{align*} \)

and since \(\mathbf V\) is an orthogonal matrix (\(\mathbf V^\top \mathbf V=\mathbf I\)), we have:

\(\mathbf A\mathbf A^\top=\mathbf U\mathbf \Sigma^2 \mathbf U^\top\)



singular value decomposition



Multilinear singular value decomposition and low multilinear rank approximation

\[{\left[\begin{array}{cc}{\mathbf A}_{11}&{\mathbf A}_{12}\\{\mathbf A}_{21}&{\mathbf A}_{22}\end{array}\right]}{\left[\begin{array}{c}{\mathbf x}_1\\{\mathbf x}_2\end{array}\right]}={\left[\begin{array}{c}{\mathbf b}_1\\{\mathbf b}_2\end{array}\right]}\]

\[\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\]



tricks for computing eigenvalues



Matrix 2x2

\(\begin{bmatrix}a&b\\c&d\end{bmatrix}\xrightarrow{quick}\lambda_1,\;\lambda_2\)

1) \(\frac12tr\left(\begin{bmatrix}\boxed a&b\\c&\boxed d\end{bmatrix}\right)=\frac{a+d}2=\frac{\lambda_1+\lambda_2}2=m\;(mean)\)

2) \(det\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=ad-bc=\lambda_1\lambda_2=p\;(product)\)

Write \(\lambda_1,\;\lambda_2=m\pm\delta\). Then \(p=\lambda_1\lambda_2=(m+\delta)(m-\delta)=m^2-\delta^2\)

\(\Rightarrow\delta^2\;=\;m^2\;-\;p\)

3) \(\lambda_1,\;\lambda_2\;=\;m\pm\sqrt{m^2-p}\;\)


Matrix 3x3

\(A=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

\(\lambda^3-\beta_1\lambda^2+\beta_2\lambda-\beta_3=0\)

\(\beta_1=Trace\;of\;Matrix\;=\;Sum\;of\;diagonal\;elements\)

\(\beta_1=a_{11}+a_{22}+a_{33}\)

\(\beta_2=Sum\;of\;Minors\;of\;diagonal\;elements\)

\(\begin{vmatrix}\left(a_{11}\right)&a_{12}&a_{13}\\a_{21}&\left(a_{22}\right)&a_{23}\\a_{31}&a_{32}&\left(a_{33}\right)\end{vmatrix}\)


\(\beta_2=\begin{vmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{vmatrix}+\begin{vmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{vmatrix}+\begin{vmatrix}a_{11}&a_{13}\\a_{31}&a_{33}\end{vmatrix}\)

\(\beta_3=det(A)\)

\(\beta_3=\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(det(A)=\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}\)

(Rule of Sarrus) \(det(A)\) can also be read off from the diagonals: the three "down-right" diagonal products (wrapping around the columns) enter with a plus sign and the three "down-left" diagonal products with a minus sign,

\(det(A)=\underbrace{a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}}_{i+j+k}-\underbrace{(a_{13}a_{22}a_{31}+a_{11}a_{23}a_{32}+a_{12}a_{21}a_{33})}_{x+y+z}\)


Example (D)

Humans and Zombies

Let me give a toy example. Suppose that I have a population of humans and zombies. We’ll check in on them once every few days (what’s the worst that can happen?). Suppose that every human has an 80% chance of surviving those few days, and a 20% chance of turning into a zombie. Thankfully, the humans aren’t completely defenseless—they have found a cure! Unfortunately, it’s very difficult to apply the cure, so every zombie has a 99% chance of remaining a zombie, and a 1% chance of actually being cured. Let’s write this information down in matrix form:

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\)

Here is why this is useful: if I write the number of humans \(H\) and the number of zombies \(Z\) as a vector \((H, Z)\), then multiplying by the above matrix will exactly tell me how many humans and how many zombies I should expect in a couple of days. For instance, suppose that I start with one zombie and 99 humans, and check back in two days. Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}79.21\\20.79\end{pmatrix}\)

so I’m expecting that there should be about 79 humans and 21 zombies at this point. What if I check back in another two days? Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}79.21\\20.79\end{pmatrix}\approx\begin{pmatrix}63.58\\36.42\end{pmatrix}\)

so I’m expecting that there should be about 64 humans and 36 zombies. I can keep going: after six days, there will be 51 humans and 49 zombies; after eight days, there will be 41 humans and 59 zombies; after ten days, there will be 34 humans and 66 zombies…

This is starting to look worrisome. Are the humans going to be wiped out? It’s certainly possible, but we can use some techniques from linear algebra and calculus to figure out what will happen in the long term. To wit, what we have shown is that the number of humans and zombies after \(2n\) days will be

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}H\\Z\end{pmatrix}\)

On the other hand, one can work out (look up Jordan decomposition if you are interested—you may need to brush up on some linear algebra first) that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\approx\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

from which it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n=\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^n\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

but since

\(\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^n\rightarrow\begin{pmatrix}1&0\\0&0\end{pmatrix}\)

as \(n→∞\) (that is, as \(n\) gets very large, the matrix that we multiply by itself over and over again gets closer and closer to the matrix on the right hand side), it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n\rightarrow\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}=\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\)

Thus, if we start with 99 humans and 1 zombie, the expectation is that after a long period of time, we’ll have

\(\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}\approx\begin{pmatrix}4.76\\95.24\end{pmatrix}\)

—that is, about 5 humans and 95 zombies. The good news is that the humans survive; the bad news is that they only just barely survive.

Naturally, zombies aren’t very “real life,” but the spread of genes through a population is, and Markov chains can be applied to that perfectly well. Figuring out how to weigh different websites based on how many other sites link to them can also be modeled as a Markov process—this is the fundamental observation behind PageRank, Google’s algorithm for figuring out what websites it should display first. Predictive text works on basically the same principle. In fact, I strongly recommend looking at the Wikipedia page for applications of Markov chains, because there are a lot of them, coming from all sorts of different disciplines.
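For concreteness, here is a small NumPy sketch of the zombie Markov chain above: it iterates the transition matrix many times and also recovers the same limit from the eigenvector for \(\lambda=1\).

```python
import numpy as np

M = np.array([[0.8, 0.01],
              [0.2, 0.99]])
v = np.array([99.0, 1.0])

for _ in range(500):          # many two-day steps
    v = M @ v
print(np.round(v, 2))         # approx [4.76, 95.24]

vals, vecs = np.linalg.eig(M)
w = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
print(np.round(100 * w / w.sum(), 2))   # same limit, from the eigenvector for lambda = 1
```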


Zombie outbreak

Let \(y_t, z_t\) represent the numbers of humans and zombies (respectively) \(t\) months after an initial outbreak. Suppose the outbreak obeys the predator-prey model

\(\begin{bmatrix}y_{t+1}\\z_{t+1}\end{bmatrix}=\begin{bmatrix}1.02&-0.1\\0.08&0.83\end{bmatrix}\begin{bmatrix}y_t\\z_t\end{bmatrix}\)

Prove that any number of zombies will eventually destroy humanity

Before we get to the solution, it's probably important to say a bit more about what this “predator-prey model” means. This matrix equation is equivalent to a system of two linear equations:

\(y_{t+1}=1.02y_t-0.1z_t\)

\(z_{t+1}=0.08y_t+0.83z_t\)

The first equation says that the human population \((y_t)\) would grow by 2% per month in the absence of zombies, but the zombies kill off one-tenth of their own number in humans each month. The second equation says the zombie population decreases in the absence of humans (they require humans as prey) but that most of the humans they kill turn into more zombies. Details and exact numbers aside, this is a reasonable approximation of how one could model a predator-prey relationship. We could replace humans and zombies with the more traditional rabbits and foxes or whatever. The math doesn't care.

Solution outline: use eigenvalues! You can calculate them manually or by calculator, but they are approximately the following:

\(\lambda_1\approx.957,\;\lambda_2\approx.893\)

According to the theorem above, this means that regardless of the initial populations, we have a general solution of the form

\(\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1\left(.957\right)^tv_1+c_2\left(.893\right)^tv_2\)

Since both eigenvalues are less than 1, this population vector converges to zero as t increases. That is, humanity slowly perishes (and the zombies with them).

We could even be a bit more precise about how they perish — in the long run, the second term is asymptotically smaller in magnitude than the first. We can say that

\(\lim_{t\rightarrow\infty}\frac1{\lambda_1^t}\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1v_1\)

so the population decays exponentially along the \(λ_1\) eigenvector \(v_1\), which in this case represents a ratio of approximately 846 humans to 533 zombies. So the leading eigenvalue tells you the rate of growth/decay, and the leading eigenvector tells you the limiting population ratios as they grow or decay.

I could go on or do a more complex example, but hopefully this conveys what eigenvalues and eigenvectors tell you at least for the predator/prey linear dynamical system model.
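A short NumPy sketch of this eigen-analysis (the initial populations below are made up): both eigenvalues come out below 1, the leading eigenvector gives the roughly 1.59 : 1 human-to-zombie ratio (about 846 : 533) quoted above, and iterating the model drives any starting population toward zero.

```python
import numpy as np

A = np.array([[1.02, -0.10],
              [0.08,  0.83]])
vals, vecs = np.linalg.eig(A)
print(np.round(vals, 3))                 # approx [0.957, 0.893]

v1 = vecs[:, np.argmax(vals)]            # leading eigenvector
print(np.round(v1 / v1[1], 3))           # human:zombie ratio approx 1.587 : 1

x = np.array([1000.0, 10.0])             # made-up initial populations
for _ in range(200):
    x = A @ x
print(x)                                 # both populations have essentially vanished
```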

In general, eigenvectors and eigenvalues are crucial for determining the long-term behavior of all kinds of models. A partial list of applications includes the following:

Population Dynamics (biology/epidemiology)

Stabilizing a system e.g. antilock brakes (control theory/engineering)

Finding resonant frequencies (engineering, physics)

Ranking internet pages by importance (computer science)

Principal component analysis (statistics)

Finding stable axes of rotation (physics)

Example (E)

Truck Rental Company

A truck rental company has locations all over Vancouver, where you can rent moving trucks. You can return them to any other location. For simplicity, pretend that there are three locations, and that every customer returns their truck the next day. Let \(v_t\) be the vector whose entries \(x_t, y_t, z_t\) are the number of trucks in locations \(1, 2\), and \(3\), respectively. Let \(A\) be the matrix whose \(i, j\) -entry is the probability that a customer renting a truck from location \(j\) returns it to location \(i\). For example, the matrix

\(A=\begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\)

\(AQ=Q\Lambda\)

\( \begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}=\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&-0.2&0\\0&0&0.1\end{pmatrix} \)


Therefore, in the long run the numbers of trucks at locations \(1, 2\), and \(3\) settle into the ratio \(1.4:1.2:1\), that is, \(7:6:5\).


Eigenvalue Decomposition


Orthogonal projectors


P is an orthogonal projector if and only if \(P^2=P\) and \(P\) is symmetric \((P^T=P)\).

In linear algebra, an \(Orthogonal\) \(Matrix\), or \(Orthonormal\) \(Matrix\), is a real square matrix whose columns and rows are orthonormal vectors. One way to express this is

\(Q^TQ=QQ^T=I,\)

where \(Q^T\) is the transpose of \(Q\), and \(I\) is the identity matrix.

This leads to the equivalent characterization: a matrix \(Q\) is orthogonal if its transpose is equal to its inverse:

\(Q^T=Q^{-1},\)

where \(Q^{-1}\) is the inverse of \(Q\).

An orthogonal matrix Q is necessarily invertible (with inverse \(Q^{-1}=Q^T\)), unitary \((Q^{-1}=Q^*)\), where \(Q^∗\) is the Hermitian adjoint (conjugate transpose) of \(Q\), and therefore normal \((Q^∗Q = QQ^∗)\) over the real numbers.
The determinant of any orthogonal matrix is either +1 or -1. As a linear transformation, an orthogonal matrix preserves the inner product of vectors, and therefore acts as an isometry of Euclidean space, such as a rotation, reflection or rotoreflection. In other words, it is a unitary transformation.



Orthogonal Matrix


Determinant

A determinant can be a “negative area” in the case of a 2x2 matrix or a “negative volume” in the case of a 3x3 matrix. The sign is related to the relative orientation of the vectors. You may have heard of the “right hand rule”.






Why determinant of a 2 by 2 matrix is the area of a parallelogram?

Parallelogram = Rectangle - Extra Stuff.

\((c+a)(b+d)-2ad-cd-ab\)

Also interesting to note that if you swap vectors places then you get a negative(opposite of what \(bc−ad\) would produce) area, which is basically:

-Parallelogram = Rectangle - (2*Rectangle - Extra Stuff)

Or more concretely:

\((c+a)(b+d) - [2*(c+a)(b+d) - (2ad+cd+ab)]\)

Also it's \(ad−bc\), when simplified.


so with a little algebra (shearing the parallelogram, which preserves its area) we find that the base length is \(a-\frac{bc}{d}\) and the height is \(d\)

Hence the area is just \(d(a-\frac{bc}{d}) = ad - bc\)




Determinant of a \(3×3\) matrix

Proof that the determinant of a 3×3 matrix is the volume of the parallelepiped spanned by the columns.



General Formula for the Inverse of a \(2×2\) matrix

The inverse of \(A\) is \(A^{-1}\) only when:\(AA^{-1} = A^{-1}A = I\). Sometimes there is no inverse at all.

\(A=\begin{bmatrix}a&b\\c&d\end{bmatrix}\)

\(det(A)=ad-bc\)

\(A^{-1}=\begin{bmatrix}a&b\\c&d\end{bmatrix}^{-1}=\frac1{det(A)}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}=\frac1{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}\)


General Formula for the Inverse of a \(3×3\) matrix

\(let M=\begin{bmatrix}a&b&c\\d&e&f\\g&h&i\end{bmatrix},\)

Now produce the matrix of minors

\(\begin{bmatrix}ei-fh&di-fg&dh-eg\\bi-ch&ai-cg&ah-bg\\bf-ce&af-cd&ae-bd\end{bmatrix},\)

Use the alternating law of signs to produce the matrix of cofactors \(M^\backprime \)

\(M^\backprime =\begin{bmatrix}ei-fh&fg-di&dh-eg\\ch-bi&ai-cg&bg-ah\\bf-ce&cd-af&ae-bd\end{bmatrix}\)

Transpose\(M^\backprime\)

\(M^{\backprime T}=\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)

\(det\;M=a(ei-fh)-b(di-fg)+c(dh-eg)\)

The grand formula:

\(M^{-1}=\frac1{a(ei-fh)-b(di-fg)+c(dh-eg)}\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)


Laplace expansion

Laplace expansion expresses the determinant of a matrix \(A\) recursively in terms of determinants of smaller matrices, known as its minors. The minor \(M_{i,j}\) is defined to be the determinant of the \((n - 1) × (n - 1)\)-matrix that results from \(A\) by removing the \(i\)-th row and the \(j\)-th column. The expression \({(-1)}^{i+j}M_{i,j}\) is known as a cofactor. For every \(i\), one has the equality

\(det(A)=\sum_{j=1}^n{(-1)}^{i+j}a_{i,j}M_{i,j}\)




Example (F)

A Population Growth Model

A population of rabbits has the following characteristics.

(i) Half of the rabbits survive their first year. Of those, half survive their second year. The maximum life span is 3 years.

(ii) During the first year, the rabbits produce no offspring. The average number of offspring is 6 during the second year and 8 during the third year.

The population now consists of 24 rabbits in the first age class, 24 in the second, and 20 in the third. How many rabbits will there be in each age class in 1 year?

\(A=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\)

\(x_1=\begin{bmatrix}24\\24\\20\end{bmatrix}\)

\(x_2=Ax_1=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}24\\24\\20\end{bmatrix}=\begin{bmatrix}304\\12\\12\end{bmatrix}\)

Finding a Stable Age Distribution Vector

\(Ax=\lambda x\Rightarrow\vert A-\lambda I\vert=0\)

\(det(A-\lambda I)=-\lambda^3+3\lambda+2=-{(\lambda+1)}^2(\lambda-2)\)

\(\lambda=2\Rightarrow x=\begin{bmatrix}16\\4\\1\end{bmatrix}\)

\(\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}16\\4\\1\end{bmatrix}=2\cdot\begin{bmatrix}16\\4\\1\end{bmatrix}=\begin{bmatrix}32\\8\\2\end{bmatrix}\)

Notice that the ratio of the three age classes is still 16 : 4 : 1, and so the percent of the population in each age class remains the same.
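A small NumPy sketch of this age-structured model: iterating \(x_{k+1}=Ax_k\) shows the age distribution settling into the stable ratio 16 : 4 : 1 while the total population grows by a factor approaching \(\lambda=2\) each year.

```python
import numpy as np

A = np.array([[0.0, 6.0, 8.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
x = np.array([24.0, 24.0, 20.0])

for year in range(10):
    x_new = A @ x
    growth = x_new.sum() / x.sum()      # total-population growth factor this year
    x = x_new

print(np.round(x / x[2], 2))            # age ratios approach the stable 16 : 4 : 1
print(round(growth, 3))                 # growth factor approaches lambda = 2
```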


Convolution as Toeplitz matrix

In linear algebra, a Toeplitz matrix or diagonal-constant matrix, named after Otto Toeplitz, is a matrix in which each descending diagonal from left to right is constant.

Given \(2n-1\) numbers \(a_k\) where \(k=-n+1,\;...,\;-1,\;0,\;1,\;...,\;n-1\), a Toeplitz matrix is a matrix which has constant values along negative-sloping diagonals, i.e., a matrix of the form

\(\begin{bmatrix}a_0&a_{-1}&a_{-2}&\cdots&a_{-n+1}\\a_1&a_0&a_{-1}&\ddots&\vdots\\a_2&a_1&a_0&\ddots&a_{-2}\\\vdots&\ddots&\ddots&\ddots&a_{-1}\\a_{n-1}&\cdots&a_2&a_1&a_0\end{bmatrix}\)

Matrix equations of the form

\(\sum_{j=1}^na_{i-j}x_j=y_i\)

\(A_{i,j}=A_{i+1,j+1}=a_{i-j}\)

can be solved with \(\mathcal O(n^2)\) operations. Typical problems modelled by Toeplitz matrices include the numerical solution of certain differential and integral equations (regularization of inverse problems), the computation of splines, time series analysis, signal and image processing, Markov chains, and queuing theory (Bini 1995).
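As a sketch of "convolution as a Toeplitz matrix" (the kernel and signal below are made-up examples), one can build a banded Toeplitz matrix from a kernel with scipy.linalg.toeplitz and check that multiplying by it reproduces np.convolve.

```python
import numpy as np
from scipy.linalg import toeplitz

kernel = np.array([1.0, 2.0, 3.0])      # example kernel (the a_k values)
x = np.array([4.0, 5.0, 6.0, 7.0])      # example signal

n, m = len(x), len(kernel)
# (n + m - 1) x n Toeplitz matrix whose columns are shifted copies of the kernel
first_col = np.r_[kernel, np.zeros(n - 1)]
first_row = np.r_[kernel[0], np.zeros(n - 1)]
T = toeplitz(first_col, first_row)

print(T @ x)
print(np.convolve(x, kernel))           # same result: full convolution of x with the kernel
```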



Matrix multiplication

Horizontal shear with m = 1.25: \(\begin{bmatrix}1&1.25\\0&1\end{bmatrix}\)

Reflection through the vertical axis: \(\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

Squeeze mapping with r = 3/2: \(\begin{bmatrix}\frac32&0\\0&\frac23\end{bmatrix}\)

Scaling by a factor of 3/2: \(\begin{bmatrix}\frac32&0\\0&\frac32\end{bmatrix}\)

Rotation by π/6 = 30°: \(\begin{bmatrix}\cos(\frac\pi6)&-\sin(\frac\pi6)\\\sin(\frac\pi6)&\cos(\frac\pi6)\end{bmatrix}\)

If one row of the left matrix is zero, the corresponding row of the product is zero:

\(\begin{bmatrix}X&Y&Z\\0&0&0\\A&B&C\end{bmatrix}\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}=\begin{bmatrix}\alpha&\beta&\gamma\\0&0&0\\\chi&\psi&\omega\end{bmatrix}\)


If one column of the right matrix is zero, the corresponding column of the product is zero:

\(\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}\begin{bmatrix}X&0&A\\Y&0&B\\Z&0&C\end{bmatrix}=\begin{bmatrix}\alpha&0&\chi\\\beta&0&\psi\\\gamma&0&\omega\end{bmatrix}\)


Examples with n = 3:
Diagonal matrix \(\begin{bmatrix}a_{11}&0&0\\0&a_{22}&0\\0&0&a_{33}\end{bmatrix}\)
Lower triangular matrix \(\begin{bmatrix}a_{11}&0&0\\a_{21}&a_{22}&0\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)
Upper triangular matrix \(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&0&a_{33}\end{bmatrix}\)

Translation by \((t_x,t_y)\) in homogeneous coordinates:

\(\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}1&0&t_x\\0&1&t_y\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}\)


Matrix Stretch

\(S_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\y\end{bmatrix}\)

\(S_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\(1+\delta)y\end{bmatrix}\)

\(S_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\(1+\delta)y\end{bmatrix}\)


Matrix Shrink

\(A\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac1{1+\delta}&0\\0&\frac1{1+\delta}\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac x{1+\delta}\\\frac y{1+\delta}\end{bmatrix}\)


Matrix Reflection

\(R_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\y\end{bmatrix}\)

\(R_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-y\end{bmatrix}\)

\(R_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\-y\end{bmatrix}\)


Matrix Shear

\(T_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\y\end{bmatrix}\)

\(T_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-\delta x+y\end{bmatrix}\)

\(T_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\-\delta x+y\end{bmatrix}\)
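A minimal sketch, assuming NumPy, applying the stretch, reflection, and shear matrices above to a sample point with \(\delta=0.25\):

```python
import numpy as np

delta = 0.25
p = np.array([2.0, 1.0])

S1  = np.array([[1 + delta, 0], [0, 1]])    # stretch along x
R1  = np.array([[-1, 0], [0, 1]])           # reflection across the y-axis
T1  = np.array([[1, delta], [0, 1]])        # horizontal shear
T12 = np.array([[1, delta], [-delta, 1]])   # combined shear

# Expected: S1 -> (2.5, 1), R1 -> (-2, 1), T1 -> (2.25, 1), T12 -> (2.25, 0.5)
for name, M in [("S1", S1), ("R1", R1), ("T1", T1), ("T12", T12)]:
    print(name, M @ p)
```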


Matrix Rotation
Counter-clockwise Rotation Matrix in 2D Derivation

Expressing \((x, y)\) in polar form we have

\(x=r\cos\nu\;\;(1)\)

\(y=r\sin\nu\;\;(2)\)

Similarly, expressing \((x', y')\) in polar form,

\(x'=r\cos(\nu+\theta)\)

\(y'=r\sin(\nu+\theta)\)

Expanding the brackets using the angle-sum identities we get

\(x'=r(\cos\nu\cos\theta-\sin\nu\sin\theta)=r\cos\nu\cos\theta-r\sin\nu\sin\theta\)

From (1) and (2) we have

\(x'=x\cos\theta-y\sin\theta\;\;(3)\)

Likewise,

\(y'=r(\sin\nu\cos\theta+\cos\nu\sin\theta)=r\sin\nu\cos\theta+r\cos\nu\sin\theta\)

\(y'=y\cos\theta+x\sin\theta\;\;(4)\)

Writing (3) and (4) with a 2 × 2 matrix gives

\(\begin{bmatrix}x'\\y'\end{bmatrix}=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\)

Thus \(\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\) is the counter-clockwise rotation matrix.


Rotation Matrix in 3D

In 3D space, rotation can occur about the x, y, or z-axis. A rotation about a single one of these axes is known as a basic or elementary rotation. Given below are the rotation matrices that rotate a vector through an angle about a particular axis.

P (x, \(\gamma\)) = \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\). This is also known as a roll. It is defined as the counterclockwise rotation of \(\gamma\) about the x axis.

P (y, \(\beta\)) = \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\). Such a matrix is known as a pitch. Here, it represents the counterclockwise rotation of \(\beta\) about the y axis.

P (z, \(\alpha\)) = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\). This rotation matrix is called a yaw and it is the counterclockwise rotation of \(\alpha\) about the z axis.

According to convention, a positive angle θ denotes a counter-clockwise rotation; changing the signs accordingly represents clockwise rotations. The right-hand rule relates the two: curl the fingers of your right hand in the direction of increasing θ, and your thumb points along the axis of rotation, perpendicular to the plane of rotation.

Now if we want to find the new coordinates (x', y', z') of a vector \((x, y, z)\) after rotation about a particular axis, we follow the formula given below:

\(\begin{bmatrix} x'\\ y'\\ z' \end{bmatrix}\) = P(x, y or z) \(\begin{bmatrix} x\\ y\\ z \end{bmatrix}\)

Suppose an object is rotated about all three axes, then such a rotation matrix will be a product of the three aforementioned rotation matrices [P (z, \(\alpha\)), P (y, \(\beta\)) and P (x, \(\gamma\))]. The general rotation matrix is represented as follows:

P = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\) \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\) \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\)

To find the coordinates of the rotated vector about all three axes we multiply the rotation matrix P with the original coordinates of the vector.
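A minimal sketch, assuming NumPy, that builds the three elementary rotations and composes them as \(P=P(z,\alpha)\,P(y,\beta)\,P(x,\gamma)\); the product should be orthogonal with determinant 1:

```python
import numpy as np

def Rx(g):  # roll: counterclockwise rotation by g about the x axis
    return np.array([[1, 0, 0],
                     [0, np.cos(g), -np.sin(g)],
                     [0, np.sin(g),  np.cos(g)]])

def Ry(b):  # pitch: counterclockwise rotation by b about the y axis
    return np.array([[ np.cos(b), 0, np.sin(b)],
                     [ 0,         1, 0        ],
                     [-np.sin(b), 0, np.cos(b)]])

def Rz(a):  # yaw: counterclockwise rotation by a about the z axis
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a),  np.cos(a), 0],
                     [0,          0,         1]])

alpha, beta, gamma = np.pi / 6, np.pi / 4, np.pi / 3
P = Rz(alpha) @ Ry(beta) @ Rx(gamma)

print(np.allclose(P @ P.T, np.eye(3)))    # True: rotations are orthogonal
print(np.isclose(np.linalg.det(P), 1.0))  # True
print(P @ np.array([1.0, 0.0, 0.0]))      # a rotated unit vector
```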


Matrix Inverse

\(A^{-1}A=AA^{-1}=I\)

\(S=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix},\;its\;inverse\;S^{-1}=\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}\)

\(SS^{-1}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}\frac{1+\delta}{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)

\(T_1=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\;has\;inverse\;T_1^{-1}=\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}\)

\(T_1T_1^{-1}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\;\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&-\delta+\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


Matrix Projection

A square matrix \(P\) is called a projection matrix if it is equal to its square \(P^2=P\)

Orthogonal projection

\(P=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\)

\(P\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\Rightarrow P^2\begin{bmatrix}x\\y\\z\end{bmatrix}=P\begin{bmatrix}x\\y\\0\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\)

\(P^2=P\)

Oblique projection

\(P=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\)

\(P^2=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=P\)




Orthogonal Matrices
Circulant Matrices
Hankel matrix
Moore-Penrose Pseudoinverse



Eigenvalues and Eigenvectors

Introduction to Eigenvalues

[Ex] In a certain town, 30% of the married women get divorced each year and 20% of the single women get married each year. Suppose there are initially 8,000 married women and 2,000 single women, and the total number of women remains constant.

\(Let\;{\underline w}_0=\begin{bmatrix}8000\\2000\end{bmatrix}\)

\({\underline w}_1=\underbrace{\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}}_A\underbrace{\begin{bmatrix}8000\\2000\end{bmatrix}}_{{\underline w}_0}=\underbrace{\begin{bmatrix}6000\\4000\end{bmatrix}}_{{\underline w}_1}\)

\({\underline w}_2=A{\underline w}_1=\underbrace{\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}}_A\underbrace{\begin{bmatrix}6000\\4000\end{bmatrix}}_{{\underline w}_1}=\begin{bmatrix}5000\\5000\end{bmatrix}\)

\(=A(A{\underline w}_0)=A^2{\underline w}_0\)

After n years, \({\underline w}_n=A^n{\underline w}_0\)

\({\underline w}_{10}=\begin{bmatrix}4004\\5996\end{bmatrix},\cdots,\;{\underline w}_{20}=\begin{bmatrix}4000\\6000\end{bmatrix}\)

\({\underline w}_{100}=\begin{bmatrix}4000\\6000\end{bmatrix}\leftarrow steady\;state\)

\(If\;\underline w=\begin{bmatrix}4000\\6000\end{bmatrix},\;A\;\underline w=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}4000\\6000\end{bmatrix}=\begin{bmatrix}4000\\6000\end{bmatrix}=\underline w\)

\(Suppose\;{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix}\)

\(Then\;A{\underline x}_1=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}2\\3\end{bmatrix}={\underline x}_1\)

\({\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(Then\;A{\underline x}_2=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}-1\\1\end{bmatrix}=\begin{bmatrix}-\frac12\\\frac12\end{bmatrix}=\frac12\begin{bmatrix}-1\\1\end{bmatrix}=\frac12{\underline x}_2\)

\(We\;can\;have\;{\underline w}_0=\begin{bmatrix}8000\\2000\end{bmatrix}=2000\begin{bmatrix}2\\3\end{bmatrix}-4000\begin{bmatrix}-1\\1\end{bmatrix}\)

\(=2000{\underline x}_1-4000{\underline x}_2\)

\(It\;follows\;that\;{\underline w}_1=A{\underline w}_0=A(2000{\underline x}_1-4000{\underline x}_2)\)

\(=2000A{\underline x}_1-4000A{\underline x}_2\)

\(=2000{\underline x}_1-4000(\frac12{\underline x}_2)\)

\({\underline w}_2=A{\underline w}_1=A\lbrack2000{\underline x}_1-4000(\frac12{\underline x}_2)\rbrack\)

\(=2000{\underline x}_1-4000{(\frac12)}^2{\underline x}_2\)

In general, \({\underline w}_n=A^n{\underline w}_0=2000{\underline x}_1-4000{(\frac12)}^n{\underline x}_2\)

\(Let\;n\rightarrow\infty,\;{\underline w}_n\rightarrow2000{\underline x}_1=2000\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}4000\\6000\end{bmatrix}\)
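Both the iteration \({\underline w}_n=A^n{\underline w}_0\) and its limit can be checked numerically; a small sketch assuming NumPy:

```python
import numpy as np

A = np.array([[0.7, 0.2],
              [0.3, 0.8]])
w = np.array([8000.0, 2000.0])

for n in range(100):
    w = A @ w
print(w)  # approaches [4000. 6000.], the steady state

# The steady state is the eigenvector for eigenvalue 1, scaled so that
# the entries sum to the total of 10000 women.
vals, vecs = np.linalg.eig(A)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
print(v / v.sum() * 10000)  # [4000. 6000.]
```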


Def: \(For\;an\;n\times n\;matrix\;A,\;if\;A\underline x=\lambda\underline x\;for\;a\;nonzero\;(n\times1)\;vector\;\underline x\)

\(then\;\lambda\;is\;called\;an\;eigenvalue\;of\;A\)

\(and\;\underline x\;is\;associated\;eigenvector.\)



\(A\underline x=\lambda\underline x\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow A\underline x-\lambda\underline x=\underline0\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow(A-\lambda I)\underline x=\underline0\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow N(A-\lambda I)\neq\{\underline0\}\)

\(\Leftrightarrow A-\lambda I\;is\;singular\)

\(\Leftrightarrow det(A-\lambda I)=0\)


[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(det(A-\lambda I)=\begin{vmatrix}0.7-\lambda&0.2\\0.3&0.8-\lambda\end{vmatrix}\)

\(=\lambda^2-1.5\lambda+0.56-0.06\)

\(=\lambda^2-\frac32\lambda+\frac12\)

(In general, \(det(A-\lambda I)\) is a polynomial in \(\lambda\) of the form \(a_n\lambda^n+a_{n-1}\lambda^{n-1}+\cdots+a_1\lambda+a_0\).)

\(=(\lambda-1)(\lambda-\frac12)=0\)

\(\Rightarrow\lambda=1,\;\frac12\)

\((A-\lambda I)\underline x=\underline0\)

\(For\;\lambda=1,\;A-\lambda I=\begin{bmatrix}-0.3&0.2\\0.3&-0.2\end{bmatrix}\)

\(\therefore{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix},\;(or\;any\;multiple\;of\;{\underline x}_1)\)

\(For\;\lambda=\frac12,\;A-\lambda I=\begin{bmatrix}0.2&0.2\\0.3&0.3\end{bmatrix}\)

\({\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(A{\underline x}_1=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}2\\3\end{bmatrix}={\underline x}_1\)

\(A{\underline x}_2=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}-1\\1\end{bmatrix}=\frac12\begin{bmatrix}-1\\1\end{bmatrix}=\frac12{\underline x}_2\)

[Ex] \(A=\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix}\)

\(A^2=A,\;A^T=A,\;A\;is\;a\;projection\;matrix\)

\(det(A-\lambda I)=\begin{vmatrix}\frac12-\lambda&\frac12\\\frac12&\frac12-\lambda\end{vmatrix}\)

\(=\lambda^2-\lambda+\frac14-\frac14\)

\(=\lambda^2-\lambda=0\)

\(\Rightarrow\lambda=1,\;0\)

\(For\;\lambda=1,\;\begin{bmatrix}-\frac12&\frac12\\\frac12&-\frac12\end{bmatrix}{\underline x}_1=\begin{bmatrix}0\\0\end{bmatrix}\)

\({\underline x}_1=\begin{bmatrix}1\\1\end{bmatrix}\)

\(For\;\underbrace{\lambda=0}_{A\;is\;singular},\;\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix}{\underline x}_2=\begin{bmatrix}0\\0\end{bmatrix}\)

\({\underline x}_2=\begin{bmatrix}1\\-1\end{bmatrix}\)

\(A\underline x=\underbrace\lambda_{=0}\underline x=\underline0\)

\(\underbrace{det(A-\lambda I)}_{characteristic\;polynomial}=0\)

\((A-\lambda I)\underline x=\underline0\)

[Ex] \(Q=\begin{bmatrix}0&1\\-1&0\end{bmatrix}\)

\(Q\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0&1\\-1&0\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}y\\-x\end{bmatrix}\)

\(-90^\circ\;rotation\)

\(det\;(Q-\lambda I)=\begin{vmatrix}-\lambda&1\\-1&-\lambda\end{vmatrix}\)

\(=\lambda^2+1=0\)

\(\Rightarrow\lambda=\pm i\)

No real eigenvalues. Hence there are no real eigenvectors.

\(Q\underline x=\lambda\underline x\)

\(Since\;Q\;is\;a\;rotation\;by\;-90^\circ,\;no\;vector\;Q\underline x\;stays\;in\;the\;same\;direction\;as\;\underline x\)

\(If\;A\underline x=\lambda\underline x,\;then\;A^2\underline x=\lambda A\underline x=\lambda^2\underline x\)

\(Therefore,\;if\;\lambda\;is\;an\;eigenvalue\;of\;A,\;then\;A^2\;has\;an\;eigenvalue\;of\;\lambda^2.\)

\(And\;the\;associated\;eigenvector\;remains\;the\;same.\)

[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(\lambda_1=1,\;\lambda_2=\frac12\)

\(\lambda_1+\lambda_2=\frac32\)

\(\lambda_1\lambda_2=\frac12\)

\(det\;A=0.56-0.06=\frac12\)

\(trace(A)=0.7+0.8=\frac32\)

[Ex] \(A=\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix},\;\lambda_1=1,\;\lambda_2=0\)

\(\lambda_1+\lambda_2=1\)

\(\lambda_1\lambda_2=0\)

\(det\;A=0\)

\(trace(A)=\frac12+\frac12=1\)

Claim \(\lambda_1\lambda_2\cdots\lambda_n=det\;A\)

Remark \(det\;A=\pm p_1p_2\cdots p_n\), where the \(p_i\) are the pivots of \(A\)

Proof \(det\;(A-\lambda I)=\begin{vmatrix}a_{11}-\lambda&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}-\lambda&\cdots&\vdots\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&\cdots&\cdots&a_{nn}-\lambda\end{vmatrix}\)

\(det(A-\lambda I)=(\lambda_1-\lambda)(\lambda_2-\lambda)\cdots(\lambda_n-\lambda)\)

\(Putting\;\lambda=0\;gives\;det\;A=\lambda_1\lambda_2\cdots\lambda_n\)

Claim \(\lambda_1+\lambda_2+\cdots+\lambda_n=a_{11}+a_{22}+\cdots+a_{nn}\triangleq trace(A)\)

\(compare\;the\;coefficients\;of\;\lambda^{n-1}\;on\;both\;sides.\)

\(The\;only\;term\;in\;det(A-\lambda I)\;containing\;the\;\lambda^{n-1}\;is\;(a_{11}-\lambda)(a_{22}-\lambda)\cdots(a_{nn}-\lambda)\)

\(We\;can\;have\;{(-1)}^{n-1}(a_{11}+a_{22}+\cdots+a_{nn})=\;{(-1)}^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n)\;which\;gives\;the\;desired\;result.\)

\(\begin{bmatrix}\cdots&\cdots&\cdots\\\cdots&\cdots&\cdots\\\cdots&\cdots&\cdots\end{bmatrix}\Rightarrow\begin{bmatrix}\ddots&&\\&\ddots&\\&&\ddots\end{bmatrix}\)

\(\begin{bmatrix}1&&\\&2&\\&&3\end{bmatrix}^2=\begin{bmatrix}1^2&&\\&2^2&\\&&3^2\end{bmatrix}\)

\(S^{-1}AS=\Lambda\)

\(=\begin{bmatrix}\lambda_1&0&0&0\\0&\lambda_2&0&0\\0&0&\ddots&0\\0&0&0&\lambda_n\end{bmatrix}\)

\(\Rightarrow A=S\Lambda S^{-1}\)

\(A^n=\underbrace{(S\Lambda S^{-1})(S\Lambda S^{-1})\cdots(S\Lambda S^{-1})}_n\)

\(=S\Lambda^nS^{-1}\)

Eigenvalue & Eigenvectors example

There are three viruses in a lab

\(A=\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\)

\(\lambda=-1,\;2,\;5\)

\(\lambda_1=-1,\;\overset\rightharpoonup v=\begin{bmatrix}-11\\7\\1\end{bmatrix}\)

\(\lambda_2=2,\;\overset\rightharpoonup v=\begin{bmatrix}-1\\-1\\2\end{bmatrix}\)

\(\lambda_3=5,\;\overset\rightharpoonup v=\begin{bmatrix}1\\1\\1\end{bmatrix}\)


\(A=\begin{bmatrix}-11&-1&1\\7&-1&1\\1&2&1\end{bmatrix}\begin{bmatrix}-1&0&0\\0&2&0\\0&0&5\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\-\frac19&-\frac29&\frac13\\\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\)


\(\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}2024\\2022\\1996\end{bmatrix}\)

\(\begin{bmatrix}-11&-1&1\\\:\:\:7&-1&1\\\:\:\:1&2&1\end{bmatrix}\begin{bmatrix}\left(-1\right)^4&0&0\\\:\:0&2^4&0\\\:\:0&0&5^4\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\\:\:\:\frac{-1}9&\frac{-2}9&\frac13\\\:\:\:\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\begin{bmatrix}5\\\:3\\\:2\end{bmatrix}=\begin{bmatrix}2024\\2022\\1996\end{bmatrix}\)


\(\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}406\\408\\394\end{bmatrix}\)

\(\begin{bmatrix}-11&-1&1\\7&-1&1\\1&2&1\end{bmatrix}\begin{bmatrix}\left(-1\right)^3&0&0\\0&2^3&0\\0&0&5^3\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\\frac{-1}9&\frac{-2}9&\frac13\\\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}406\\408\\394\end{bmatrix}\)
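Both computations are easy to reproduce numerically; a sketch assuming NumPy, checking the \(A^4\) case via `numpy.linalg.matrix_power` and via \(S\Lambda^4S^{-1}\):

```python
import numpy as np

A = np.array([[1, 3, 1],
              [2, 2, 1],
              [1, 1, 3]], dtype=float)
x = np.array([5.0, 3.0, 2.0])

S   = np.array([[-11, -1, 1],
                [  7, -1, 1],
                [  1,  2, 1]], dtype=float)
Lam = np.diag([-1.0, 2.0, 5.0])

direct  = np.linalg.matrix_power(A, 4) @ x
via_eig = S @ np.linalg.matrix_power(Lam, 4) @ np.linalg.inv(S) @ x
print(direct)   # [2024. 2022. 1996.]
print(via_eig)  # the same, up to rounding
```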


\(Find\;eigenvalues\;for\:\begin{bmatrix}4&4&2&3&-2\\\:0&1&-2&-2&2\\\:6&12&11&2&-4\\\:9&20&10&10&-6\\\:15&28&14&5&-3\end{bmatrix}:\)

\(\lambda=3\;with\;multiplicity\;of\;2,\;\lambda=5\;with\;multiplicity\;of\;2,\;\lambda=7\)

\(Eigenvectors\;for\;\lambda=3:\begin{bmatrix}4\\-3\\1\\2\\0\end{bmatrix},\begin{bmatrix}-2\\1\\1\\0\\2\end{bmatrix}\)

\(Eigenvectors\;for\;\lambda=5:\begin{bmatrix}-2\\2\\0\\-2\\2\end{bmatrix},\begin{bmatrix}0\\-1\\2\\0\\0\end{bmatrix}\)

\(Eigenvector\;for\;\lambda=7:\begin{bmatrix}1\\0\\0\\3\\3\end{bmatrix}\)



Diagonalizing a Matrix

\(Suppose\;an\;n\times n\;matrix\;A\;has\;n\;independent\;eigenvectors\;{\underline x}_1,{\underline x}_2,\cdots,{\underline x}_n.\)

\(\left\{\begin{array}{l}A{\underline x}_1=\lambda_1{\underline x}_1\\A{\underline x}_2=\lambda_2{\underline x}_2\\\vdots\\A{\underline x}_n=\lambda_n{\underline x}_n\end{array}\right.\)

\(A{\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}}_{n\times n}=\begin{bmatrix}\lambda_1{\underline x}_1&\lambda_2{\underline x}_2&\cdots&\lambda_n{\underline x}_n\end{bmatrix}\)

\(={\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}}_{n\times n}{\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}}_{n\times n}\)

\(=S\Lambda\)

\(\Rightarrow AS=S\Lambda\)

\(\Rightarrow S^{-1}AS=\Lambda\)

\(or\;A=S\Lambda S^{-1}\)

\(Then\;A^2=AA=(S\Lambda\cancel{S^{-1}})(\cancel S\Lambda S^{-1})=S\Lambda^2S^{-1}\)

\(Similarly\;A^k=AA\cdots A=(S\Lambda\cancel{S^{-1}})(\cancel S\Lambda\cancel{S^{-1}})\cdots(\cancel S\Lambda S^{-1})=S\Lambda^kS^{-1}\)


[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(\lambda_1=1,\;{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix}\)

\(\lambda_2=\frac12,\;{\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(S=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\)

\(S^{-1}=\frac15\begin{bmatrix}1&-3\\1&2\end{bmatrix}^T=\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(A^k=S\Lambda^kS^{-1}=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&\frac12\end{bmatrix}^k\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1^k&0\\0&{(\frac12)}^k\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&{(\frac12)}^k\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5\cdot{(\frac12)}^k&\frac25\cdot{(\frac12)}^k\end{bmatrix}\)

\(=\begin{bmatrix}\frac25+\frac35{(\frac12)}^k&\frac25-\frac25{(\frac12)}^k\\\frac35-\frac35{(\frac12)}^k&\frac35+\frac25{(\frac12)}^k\end{bmatrix}\)

\(As\;k\rightarrow\infty,\;{(\frac12)}^k\rightarrow0\)

\(A^k=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}=\begin{bmatrix}\frac25&\frac25\\\frac35&\frac35\end{bmatrix}\)

\(\begin{bmatrix}\frac25&\frac25\\\frac35&\frac35\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix}=\begin{bmatrix}\frac25(a+b)\\\frac35(a+b)\end{bmatrix}\)

Claim: If \(\lambda_1,\;\lambda_2,\;\cdots,\;\lambda_n\) are all distinct, then \({\underline x}_1,\;{\underline x}_2,\;\dots,\;{\underline x}_n\) are linearly independent.

Proof: Suppose \(c_1{\underline x}_1+c_2{\underline x}_2=\underline0\)

\(\Rightarrow A(c_1{\underline x}_1+c_2{\underline x}_2)=A\underline0=\underline0\)

\(\Rightarrow c_1\lambda_1{\underline x}_1+c_2\lambda_2{\underline x}_2=\underline0\)

\(Multiplying\;c_1{\underline x}_1+c_2{\underline x}_2=\underline0\;by\;\lambda_2\;also\;gives\;c_1\lambda_2{\underline x}_1+c_2\lambda_2{\underline x}_2=\underline0\)

\(\Rightarrow c_1(\lambda_1-\lambda_2){\underline x}_1=\underline0\)

\(since\;\lambda_1\neq\lambda_2\;and\;{\underline x}_1\neq\underline0,\;c_1=0\)

\(similarly,\;c_2=0,\;therefore\;{\underline x}_1\;and\;{\underline x}_2\;are\;independent.\)

\(proof\;extends\;directly\;to\;n\;eigenvectors\)

\(suppose\;c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n=\underline0\)

\(\Rightarrow c_1\lambda_1{\underline x}_1+c_2\lambda_2{\underline x}_2+\cdots+c_n\lambda_n{\underline x}_n=\underline0\)

\(multiplying\;by\;(A-\lambda_nI)\;removes\;{\underline x}_n\;first,\;then\;(A-\lambda_{n-1}I)\;removes\;{\underline x}_{n-1},\;and\;eventually\;only\;{\underline x}_1\;is\;left:\)

\((\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_n)c_1{\underline x}_1=\underline0\)

\(which\;forces\;c_1=0\)

\(similarly,\;every\;c_i=0\)

Remark An \(n\times n\) matrix that has n different eigenvalues (no repeated eigenvalues) must be diagonalizable.

Question: When does \(A^k\) approach the zero matrix as \(k\rightarrow\infty\)?

Answer: When all \(\left|\lambda_i\right|<1\)

\(A^k=S\Lambda^kS^{-1}=S\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}S^{-1}\)


Diagonalizable Matrices

Claim An \(n\times n\) matrix A is diagonalizable iff A has n linearly independent eigenvectors.

Proof \(''\Leftarrow''\;Proved\;previously\)

\(''\Rightarrow''\;If\;A\;is\;diagonalizable,\;then\;there\;exists\;an\;invertible\;matrix\;S\)

\(and\;a\;diagonal\;matrix\;\Lambda\;such\;that\;S^{-1}AS=\Lambda\)

\(S=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\)

\(\Lambda=\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\)

\(Then\;AS=S\Lambda\)

\(\Rightarrow A\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\)

\(\Rightarrow A{\underline x}_i=\lambda_i{\underline x}_i,\;for\;i=1,2,\cdots,n.\)

\(We\;can\;obtain\;that\;\lambda_i\;is\;an\;eigenvalue\;and\;{\underline x}_i\;is\;the\;associated\;eigenvector\;for\;i=1,2,\cdots,n.\)

\(Since\;S\;is\;invertible,\;{\underline x}_1,\;{\underline x}_2,\;\dots,\;{\underline x}_n\;are\;independent\)

Suppose \(\lambda\) is an eigenvalue.

(1) Algebraic Multiplicity (AM) counts the repetitions of \(\lambda\) among the eigenvalues.

Look at all the roots of \(det(A-\lambda I)\)

(2) Geometric Multiplicity (GM) is the dimension of \(\mathcal N(A-\lambda I)\)

\(\mathcal N(A-\lambda I)\) is called the eigenspace corresponding to \(\lambda\)

Claim: For every distinct eigenvalue, \(GM\leq AM\)

Claim: \(suppose\;A\;is\;n\times n\;and\;A\;has\;k\;distinct\;eigenvalues\)

\(A\;is\;diagonalizable\;iff\)

(1) \(\sum_{i=1}^kAM_i=n\)

(2) \(For\;every\;distinct\;eigenvalue\;\lambda_i,\;GM_i=AM_i\)


[Ex] \(A=\begin{bmatrix}2&0&0\\0&4&0\\1&0&2\end{bmatrix},\;B=\begin{bmatrix}2&0&0\\-1&4&0\\-3&6&2\end{bmatrix}\)

\(det(A-\lambda I)=(4-\lambda){(2-\lambda)}^2=det(B-\lambda I)\)

\(\lambda=4,\;2,\;2\)

\(For\;A,\;\lambda_1=4,\;A-\lambda_1I=\begin{bmatrix}-2&0&0\\0&0&0\\1&0&-2\end{bmatrix}\)

\({\underline x}_1=\begin{bmatrix}0\\1\\0\end{bmatrix},\;GM_1=AM_1=1\)

\(\lambda_2=2,\;A-\lambda_2I=\begin{bmatrix}0&0&0\\0&2&0\\1&0&0\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}0\\0\\1\end{bmatrix}\)

\(GM_2=1<AM_2=2\)

\(\therefore A\;is\;not\;diagonalizable\)

\(For\;B,\;\lambda_1=4,\;B-\lambda_1I=\begin{bmatrix}-2&0&0\\-1&0&0\\-3&6&-2\end{bmatrix},\;{\underline x}_1=\begin{bmatrix}0\\1\\3\end{bmatrix}\)

\(GM_1=1=AM_1\)

\(\lambda_2=2,\;B-\lambda_2I=\begin{bmatrix}0&0&0\\-1&2&0\\-3&6&0\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}0\\0\\1\end{bmatrix},\;\begin{bmatrix}2\\1\\0\end{bmatrix}\)

\(GM_2=2=AM_2\)

\(\therefore B\;is\;diagonalizable\)
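Numerically, the difference shows up as the nullity of \(M-2I\); a sketch assuming NumPy, where `geometric_multiplicity` is an illustrative helper:

```python
import numpy as np

A = np.array([[2, 0, 0],
              [0, 4, 0],
              [1, 0, 2]], dtype=float)
B = np.array([[ 2, 0, 0],
              [-1, 4, 0],
              [-3, 6, 2]], dtype=float)

def geometric_multiplicity(M, lam):
    """dim N(M - lam*I) = n - rank(M - lam*I)."""
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n))

print(geometric_multiplicity(A, 2))  # 1 < AM = 2, so A is not diagonalizable
print(geometric_multiplicity(B, 2))  # 2 = AM = 2, so B is diagonalizable
```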


Solving Difference Equation

\(F_{k+2}=F_{k+1}+F_k,\;k\geq0\)

\(F_0=0,\;F_1=1\)

\(F_2=F_1+F_0=1\)

\(F_3=F_2+F_1=1+1=2\)

\(0,\;1,\;1,\;2,\;3,\;5,\;8,\;13,\;\cdots\)

Fibonacci numbers (sequence)


\(We\;want\;to\;find\;F_k\;for\;general\;k.\)

\(Let\;{\underline u}_k=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(The\;relation\;between\;{\underline u}_{k+1}=\begin{bmatrix}F_{k+2}\\F_{k+1}\end{bmatrix}\;and\;{\underline u}_k=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\;can\;be\;found\;as\)

\({\underline u}_{k+1}=\begin{bmatrix}F_{k+2}\\F_{k+1}\end{bmatrix}=\begin{bmatrix}F_{k+1}+F_k\\F_k\end{bmatrix}\)

\(=\underbrace{\begin{bmatrix}1&1\\1&0\end{bmatrix}}_A\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(=A{\underline u}_k\)

\(Then\;{\underline u}_k=A{\underline u}_{k-1}\)

\(=AA{\underline u}_{k-2}\)

\(=A^2{\underline u}_{k-2}\)

\(=\cdots\)

\(=A^k{\underline u}_0\)

\(det(A-\lambda I)=\begin{vmatrix}1-\lambda&1\\1&-\lambda\end{vmatrix}\)

\(=\lambda^2-\lambda-1=0\)

\(\therefore\lambda=\frac{1\pm\sqrt5}2,\;\lambda_1=\frac{1+\sqrt5}2,\;\lambda_2=\frac{1-\sqrt5}2\)

\(For\;\lambda=\lambda_1,\)

\(A-\lambda_1I=\begin{bmatrix}\frac{1-\sqrt5}2&1\\1&-\frac{1+\sqrt5}2\end{bmatrix}\)

\(we\;can\;have\;{\underline x}_1=\begin{bmatrix}\frac{1+\sqrt5}2\\1\end{bmatrix}=\begin{bmatrix}\lambda_1\\1\end{bmatrix}\)

\(For\;\lambda=\lambda_2,\)

\(A-\lambda_2I=\begin{bmatrix}\frac{1+\sqrt5}2&1\\1&-\frac{1-\sqrt5}2\end{bmatrix}\)

\(we\;can\;have\;{\underline x}_2=\begin{bmatrix}\frac{1-\sqrt5}2\\1\end{bmatrix}=\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(Therefore\;A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}\begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix}\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}^{-1}\)

\(1.\;Write\;{\underline u}_0\;as\;a\;linear\;combination\;of\;{\underline x}_1\;and\;{\underline x}_2\)

\({\underline u}_0=\begin{bmatrix}F_1\\F_0\end{bmatrix}=\begin{bmatrix}1\\0\end{bmatrix}=c_1\begin{bmatrix}\lambda_1\\1\end{bmatrix}+c_2\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\({\underline u}_0=S\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}c_1\\c_2\end{bmatrix}=S^{-1}{\underline u}_0\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}^{-1}\begin{bmatrix}1\\0\end{bmatrix}\)

\(=\frac{\begin{bmatrix}1&-\lambda_2\\-1&\lambda_1\end{bmatrix}}{\lambda_1-\lambda_2}\begin{bmatrix}1\\0\end{bmatrix}\)

\(=\frac1{\lambda_1-\lambda_2}\begin{bmatrix}1\\-1\end{bmatrix}\)

\({\underline u}_0=S\begin{bmatrix}c_1\\c_2\end{bmatrix}=\begin{bmatrix}{\underline x}_1&{\underline x}_2\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2\end{bmatrix}\frac1{\lambda_1-\lambda_2}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\therefore{\underline u}_0=\frac1{\lambda_1-\lambda_2}({\underline x}_1-{\underline x}_2)\)

\({\underline u}_k=A^k{\underline u}_0\)

\(=A^k\frac1{\lambda_1-\lambda_2}({\underline x}_1-{\underline x}_2)\)

\(=\frac{\lambda_1^k}{\lambda_1-\lambda_2}{\underline x}_1-\frac{\lambda_2^k}{\lambda_1-\lambda_2}{\underline x}_2\)

\(\therefore{\underline u}_k=\frac{\lambda_1^k}{\lambda_1-\lambda_2}\begin{bmatrix}\lambda_1\\1\end{bmatrix}-\frac{\lambda_2^k}{\lambda_1-\lambda_2}\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(\Rightarrow F_k=\frac1{\lambda_1-\lambda_2}(\lambda_1^k-\lambda_2^k)\)

\[F_k=\frac1{\sqrt5}\lbrack{(\frac{1+\sqrt5}2)}^k-{(\frac{1-\sqrt5}2)}^k\rbrack,\;for\;k\geq0,\]

\(F_{100}=\frac1{\sqrt5}\lbrack{(\frac{1+\sqrt5}2)}^{100}-{(\frac{1-\sqrt5}2)}^{100}\rbrack\)

\(\approx3.54\cdot10^{20}\)

\(For\;large\;k,\;F_k\;is\;the\;nearest\;integer\;to\;\frac1{\sqrt5}{(\frac{1+\sqrt5}2)}^k\)

\(\frac{F_{k+1}}{F_k}\;is\;very\;close\;to\;\frac{1+\sqrt5}2\approx1.618,\;when\;k\;is\;large\)

golden ratio or golden mean
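A short check, assuming NumPy, that the closed form reproduces the recurrence and that the ratio of consecutive terms approaches the golden ratio:

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2   # lambda_1
psi = (1 - np.sqrt(5)) / 2   # lambda_2

def fib_closed(k):
    return (phi**k - psi**k) / np.sqrt(5)

# Build the first 20 Fibonacci numbers from the recurrence F_{k+2} = F_{k+1} + F_k
F = [0, 1]
for _ in range(18):
    F.append(F[-1] + F[-2])

print([int(round(fib_closed(k))) for k in range(20)] == F)  # True
print(F[11] / F[10], phi)  # consecutive ratio ~ 1.618, the golden ratio
```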



Fast matrix multiplication algorithm for large matrices



Strassen algorithm
Matrix multiplication tensor and algorithms


Figure: a, Tensor \({\mathcal T}_2\) representing the multiplication of two 2 × 2 matrices. Tensor entries equal to 1 are depicted in purple, and 0 entries are semi-transparent. The tensor specifies which entries from the input matrices to read, and where to write the result. For example, as c1 = a1b1 + a2b3, tensor entries located at (a1, b1, c1) and (a2, b3, c1) are set to 1. b, Strassen's algorithm for multiplying 2 × 2 matrices using 7 multiplications. c, Strassen's algorithm in tensor factor representation. The stacked factors U, V and W (green, purple and yellow, respectively) provide a rank-7 decomposition of \({\mathcal T}_2\). The correspondence between arithmetic operations (b) and factors (c) is shown by using the aforementioned colours.






Strassen's algorithm multiplies two 2 × 2 block matrices using only 7 multiplications (one for each \(M_{k}\)) instead of 8.

The same trick appears when multiplying two two-digit numbers: the cross term \(ad+bc\) can be obtained from a single extra multiplication.

\(({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}{\color[rgb]{0.0, 0.0, 1.0}d})\Rightarrow({\color[rgb]{0.68, 0.46, 0.12}a}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}d})\)

\(={\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}\cdot10^2+\underbrace{({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}d}+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.68, 0.46, 0.12}c})}_{=({\color[rgb]{0.68, 0.46, 0.12}a}+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}+{\color[rgb]{0.0, 0.0, 1.0}d})-{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}-{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

\(={\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}\cdot10^2+({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}d}+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.68, 0.46, 0.12}c})\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

\(({\color[rgb]{0.68, 0.46, 0.12}a}+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}+{\color[rgb]{0.0, 0.0, 1.0}d})-{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}-{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

Normal Algorithm: \(n^{3}\)

Volker Strassen (1969): \(n^{2.807}\)

Arnold Schönhage (1981): \(n^{2.522}\)

Alman, Williams (2020): \(n^{2.3728596}\)
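A compact sketch of Strassen's recursion for square matrices whose side is a power of two, assuming NumPy; it falls back to ordinary multiplication below a cutoff and is meant to be illustrative rather than fast.

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Multiply two n x n matrices (n a power of two) with 7 recursive products."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]

    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)

    C = np.empty((n, n))
    C[:m, :m] = M1 + M4 - M5 + M7
    C[:m, m:] = M3 + M5
    C[m:, :m] = M2 + M4
    C[m:, m:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
print(np.allclose(strassen(A, B), A @ B))  # True
```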

Solving Difference Equations

In general, \(G_{k+n}=a_1G_{k+n-1}+a_2G_{k+n-2}+\cdots+a_nG_k\)

nth-order linear homogeneous difference equation with constant coefficients and initial conditions

\(G_{n-1},\;G_{n-2},\;\cdots,\;G_1,\;G_0\)

\(S_n=1^2+2^2+\cdots+n^2\)

\(S_n-S_{n-1}=n^2,\;S_1=1\)

\(Let\;{\underline u}_k=\begin{bmatrix}G_{k+n-1}\\\vdots\\G_{k+1}\\G_k\end{bmatrix}\)

\(Then\;{\underline u}_{k+1}=\begin{bmatrix}G_{k+n}\\\vdots\\G_{k+2}\\G_{k+1}\end{bmatrix}=\begin{bmatrix}a_1&a_2&\cdots&a_{n-1}&a_n\\1&0&\cdots&0&0\\0&1&\ddots&\vdots&0\\\vdots&\ddots&\ddots&0&\vdots\\0&\cdots&0&1&0\end{bmatrix}\begin{bmatrix}G_{k+n-1}\\\vdots\\G_{k+1}\\G_k\end{bmatrix}\)

\(=A{\underline u}_k,\;(A\;is\;n\times n)\)

[Ex] \(G_{k+3}=G_{k+2}+G_{k+1}+2G_k\)

\(Then\;{\underline u}_{k+1}=\begin{bmatrix}G_{k+3}\\G_{k+2}\\G_{k+1}\end{bmatrix}=\begin{bmatrix}1&1&2\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}G_{k+2}\\G_{k+1}\\G_k\end{bmatrix}\)

\(1.\;Diagonalize\;A\;(if\;A\;is\;diagonalizable)\)

\(A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}^{-1}\)

\(2.\;write\;{\underline u}_0=\begin{bmatrix}G_{n-1}\\G_{n-2}\\\vdots\\G_0\end{bmatrix}=c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n,\;where\;\underline c=S^{-1}{\underline u}_0\)

\(c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n=S\underline c\)

\(3.\;{\underline u}_k=A^k{\underline u}_0\)

\(=A^k(c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n)\)

\(=c_1\lambda_1^k{\underline x}_1+c_2\lambda_2^k{\underline x}_2+\cdots+c_n\lambda_n^k{\underline x}_n\)

\(or\;{\underline u}_k=A^k{\underline u}_0\)

\(=(S\Lambda^kS^{-1}){\underline u}_0\)

\(=S\Lambda^k\underline c\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(4.\;G_k\;can\;be\;found\;as\;the\;nth\;component\;of\;{\underline u}_k\)
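A sketch of these four steps, assuming NumPy, applied to the example \(G_{k+3}=G_{k+2}+G_{k+1}+2G_k\) with some hypothetical initial values \(G_2,G_1,G_0\):

```python
import numpy as np

# Companion matrix of G_{k+3} = G_{k+2} + G_{k+1} + 2 G_k
A = np.array([[1.0, 1.0, 2.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

u0 = np.array([2.0, 1.0, 0.0])   # hypothetical initial conditions [G_2, G_1, G_0]

# Step 1: diagonalize A.  Step 2: c = S^{-1} u0.  Step 3: u_k = S Lambda^k c.
lam, S = np.linalg.eig(A)
c = np.linalg.solve(S, u0)

k = 10
u_k = (S * lam**k) @ c           # same as S @ diag(lam**k) @ c
print(u_k.real)                  # [G_{k+2}, G_{k+1}, G_k]

# Step 4 check: direct iteration gives the same vector.
u = u0.copy()
for _ in range(k):
    u = A @ u
print(u)
```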


\({\underline u}_{k+1}=A{\underline u}_k\;with\;{\underline u}_0\)

1. \(A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}^{-1}\)

2. \({\underline u}_0=c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n\)

\({\underline u}_0=S\underline c\;\Rightarrow\;\underline c=S^{-1}{\underline u}_0\)

3. \({\underline u}_k=A^k{\underline u}_0\)

\(=S\Lambda^kS^{-1}{\underline u}_0\)

\(=S\Lambda^k\underline c\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(=c_1\lambda_1^k{\underline x}_1+c_2\lambda_2^k{\underline x}_2+\cdots+c_n\lambda_n^k{\underline x}_n\)

Solving Differential Equations

\(\begin{bmatrix}\frac{dy}{dt}\\\frac{dz}{dt}\end{bmatrix}=\frac{d\underline u}{dt}=A\underline u\)

\(=\begin{bmatrix}-2&1\\1&-2\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}\)

\(For\;t=0,\;{\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}\)

\(\left\{\begin{array}{l}\frac{dy}{dt}=-2y+z\\\frac{dz}{dt}=y-2z\end{array}\right.,\;y_0=2,\;z_0=0\)

\(suppose\;\lambda_1,\;\lambda_2\;are\;eigenvalues\;of\;A\;and\;{\underline x}_1,\;{\underline x}_2\;are\;the\;associated\;eigenvectors.\)

\(Then\;\frac d{dt}(e^{\lambda_1t}{\underline x}_1)=\lambda_1e^{\lambda_1t}{\underline x}_1=e^{\lambda_1t}(\lambda_1{\underline x}_1)\)

\(=e^{\lambda_1t}A{\underline x}_1\)

\(=A(e^{\lambda_1t}{\underline x}_1)\)

\(and,\;similarly,\;\frac d{dt}(e^{\lambda_2t}{\underline x}_2)=A(e^{\lambda_2t}{\underline x}_2)\)

\(Both\;e^{\lambda_1t}{\underline x}_1\;and\;e^{\lambda_2t}{\underline x}_2\;are\;solutions\;to\;\frac{d\underline u}{dt}=A\underline u\)

\(\Rightarrow\underline u=c_1e^{\lambda_1t}{\underline x}_1+c_2e^{\lambda_2t}{\underline x}_2\)


\(det(A-\lambda I)=\begin{vmatrix}-2-\lambda&1\\1&-2-\lambda\end{vmatrix}=0\)

\(\Rightarrow\lambda^2+4\lambda+3=0\)

\(\Rightarrow(\lambda+1)(\lambda+3)=0\)

\(\Rightarrow\lambda=-1,-3\)

\(For\;\lambda_1=-1,\;A-\lambda_1I=\begin{bmatrix}-1&1\\1&-1\end{bmatrix},\;{\underline x}_1=\begin{bmatrix}1\\1\end{bmatrix}\)

\(For\;\lambda_2=-3,\;A-\lambda_2I=\begin{bmatrix}1&1\\1&1\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\therefore\underline u=c_1e^{-t}\begin{bmatrix}1\\1\end{bmatrix}+c_2e^{-3t}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(=\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(If\;t=0,\;{\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}=c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\Rightarrow c_1=c_2=1\)

\(\therefore\underline u=e^{-t}\begin{bmatrix}1\\1\end{bmatrix}+e^{-3t}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(=\begin{bmatrix}e^{-t}+e^{-3t}\\e^{-t}-e^{-3t}\end{bmatrix}\)

\({\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}=c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\underbrace{\begin{bmatrix}1&1\\1&-1\end{bmatrix}}_{=S}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(\underline c=S^{-1}{\underline u}_0\)

\(\underline u=\begin{bmatrix}e^{-t}+e^{-3t}\\e^{-t}-e^{-3t}\end{bmatrix}\;(=S\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}\underline c)\)

\(=S\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}S^{-1}{\underline u}_0\)
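The closed-form solution can be compared with the matrix exponential; a sketch assuming NumPy and SciPy's `scipy.linalg.expm`:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-2.0,  1.0],
              [ 1.0, -2.0]])
u0 = np.array([2.0, 0.0])

def u_closed(t):
    # u(t) = e^{-t} (1, 1) + e^{-3t} (1, -1)
    return np.exp(-t) * np.array([1.0, 1.0]) + np.exp(-3 * t) * np.array([1.0, -1.0])

for t in [0.0, 0.5, 2.0]:
    print(t, expm(A * t) @ u0, u_closed(t))  # the two results agree
```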


Recall \(e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots\)

\(For\;an\;n\times n\;matrix\;A,\;define\)

\(e^{At}\triangleq I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\)

\(Then\;\frac d{dt}e^{At}=A+A^2t+\frac1{2!}A^3t^2+\frac1{3!}A^4t^3+\cdots\)

\(=A\lbrack I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\rbrack\)

\(=Ae^{At}\)

It can also be shown that \((e^{As})(e^{At})=e^{A(s+t)}\)

\((e^{At})(e^{-At})=I\)

\(e^{A0}=I+A\cdot0+\frac{{(A\cdot0)}^2}{2!}+\frac{{(A\cdot0)}^3}{3!}+\cdots\)

\(=I\)

\(Suppose\;\underline u=e^{At}\underline c\)

\(Then\;\frac{d\underline u}{dt}=A(e^{At}\underline c)=A\underline u\)

\(Hence\;\underline u=e^{At}\underline c\;solves\;\frac{d\underline u}{dt}=A\underline u\)

\(For\;t=0,\;\underline u={\underline u}_0=I\underline c=\underline c\)

\(\therefore\;\underline u=e^{At}{\underline u}_0\)

\(\frac{d\underline u}{dt}=A\underline u\;with\;{\underline u}_0\;when\;t=0\)

\(\therefore\;\underline u=e^{At}{\underline u}_0\)

\(If\;A\;is\;diagonalizable,\;A=S\Lambda S^{-1}\)

\(A^k=S\Lambda^kS^{-1}\)

\(e^{At}=I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\)

\(=I+S\Lambda S^{-1}t+\frac{S\Lambda^2S^{-1}}{2!}t^2+\frac{S\Lambda^3S^{-1}}{3!}t^3+\cdots\)

\(=S\lbrack I+\Lambda t+\frac{\Lambda^2t^2}{2!}+\frac{\Lambda^3t^3}{3!}+\cdots\rbrack S^{-1}\)

\(=Se^{\Lambda t}S^{-1}\)

Therefore, \(\underline u=e^{At}{\underline u}_0=Se^{\Lambda t}S^{-1}{\underline u}_0\)

We can have \(e^{\Lambda t}=I+\Lambda t+\frac{{(\Lambda t)}^2}{2!}+\frac{{(\Lambda t)}^3}{3!}+\cdots\)

\(=\begin{bmatrix}1&0&\cdots&0\\0&1&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&1\end{bmatrix}+\begin{bmatrix}\lambda_1t&0&\cdots&0\\0&\lambda_2t&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_nt\end{bmatrix}\)

\(+\begin{bmatrix}\frac{\lambda_1^2t^2}{2!}&0&\cdots&0\\0&\frac{\lambda_2^2t^2}{2!}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\frac{\lambda_n^2t^2}{2!}\end{bmatrix}+\begin{bmatrix}\frac{\lambda_1^3t^3}{3!}&0&\cdots&0\\0&\frac{\lambda_2^3t^3}{3!}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\frac{\lambda_n^3t^3}{3!}\end{bmatrix}+\cdots\)

\(=\begin{bmatrix}e^{\lambda_1t}&0&\cdots&0\\0&e^{\lambda_2t}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&e^{\lambda_nt}\end{bmatrix}\)

\(\therefore\underline u=Se^{\Lambda t}S^{-1}{\underline u}_0\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}e^{\lambda_1t}&0&\cdots&0\\0&e^{\lambda_2t}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&e^{\lambda_nt}\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(=c_1e^{\lambda_1t}{\underline x}_1+c_2e^{\lambda_2t}{\underline x}_2+\cdots+c_ne^{\lambda_nt}{\underline x}_n\)

\(where\;\underline c=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}=S^{-1}{\underline u}_0\)

\(\frac{d\underline u}{dt}=A\underline u\;with\;{\underline u}_0\)


[Ex] \(y'''-3y''+2y'=0\)

Introduce \(v=y'\;and\;w=y''\)

\(\underline u=\begin{bmatrix}w\\v\\y\end{bmatrix}\)

\(w'=y'''=3y''-2y'=3w-2v\)

\(v'=y''=w\)

\(y'=v\)

\(\Rightarrow\frac{d\underline u}{dt}=\begin{bmatrix}w'\\v'\\y'\end{bmatrix}=\begin{bmatrix}3&-2&0\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}w\\v\\y\end{bmatrix}\)

Stability of Differential Equations

\(If\;\lambda=a+ib,\;where\;i=\sqrt{-1}\)

\(e^{\lambda t}=e^{at}e^{ibt}\)

\(=e^{at}(\cos(bt)+i\sin(bt))\)

\(and\;\left|e^{\lambda t}\right|=e^{at}\)

The differential equation \(\frac{d\underline u}{dt}=A\underline u\) is

\(stable,\;with\;e^{At}\rightarrow0\;(as\;t\rightarrow\infty),\;whenever\;all\;\mathcal R(\lambda_i)<0\)

\(neutrally\;stable\;when\;all\;\mathcal R(\lambda_i)\leq0\)

\(and\;some\;\mathcal R(\lambda_i)=0\)

\(unstable,\;with\;e^{At}\;unbounded,\;if\;any\;eigenvalue\;has\;\mathcal R(\lambda_i)>0\)

stochastic matrix

A stochastic matrix is a square matrix whose entries are non-negative and represent probabilities. In a row-stochastic matrix each row sums to 1; in a column-stochastic matrix each column sums to 1. (The first example below is column-stochastic, the second row-stochastic.)

\(P=\begin{bmatrix}P_{11}&P_{12}&\dots&P_{1j}&\dots&P_{1n}\\P_{21}&P_{22}&\dots&P_{2j}&\dots&P_{2n}\\\vdots&\vdots&\ddots&\vdots&\ddots&\vdots\\P_{i1}&P_{i2}&\dots&P_{ij}&\dots&P_{in}\\\vdots&\vdots&\ddots&\vdots&\ddots&\vdots\\P_{n1}&P_{n2}&\dots&P_{nj}&\dots&P_{nn}\end{bmatrix}\)

\(\sum_{j=1}^nP_{i,j}=1\)



\(A=\begin{bmatrix}0.6&0.2&0.4\\0.3&0.3&0.1\\0.1&0.5&0.5\end{bmatrix}\)

\(v=\begin{bmatrix}\frac{15}{11}\\\frac8{11}\\1\end{bmatrix},\;\lambda_1=1\)

\(v=\begin{bmatrix}\frac{-1+i}2\\\frac{-1-i}2\\1\end{bmatrix},\;\lambda_2=\frac{1-i}5\)

\(v=\begin{bmatrix}\frac{-1-i}2\\\frac{-1+i}2\\1\end{bmatrix},\;\lambda_3=\frac{1+i}5\)


\(A=\begin{bmatrix}0.2&0.5&0.3\\0.6&0.1&0.3\\0.2&0.4&0.4\end{bmatrix}\)

\(v=\begin{bmatrix}1\\1\\1\end{bmatrix},\;\lambda_1=1\)

\(v=\begin{bmatrix}2\\-3\\1\end{bmatrix},\;\lambda_2=-0.4\)

\(v=\begin{bmatrix}-0.5\\-0.5\\1\end{bmatrix},\;\lambda_3=0.1\)
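In both cases the eigenvalue \(\lambda=1\) carries the long-run behaviour. A sketch, assuming NumPy, that extracts the \(\lambda=1\) eigenvector of the first (column-stochastic) example and normalizes it into a stationary probability vector:

```python
import numpy as np

A = np.array([[0.6, 0.2, 0.4],
              [0.3, 0.3, 0.1],
              [0.1, 0.5, 0.5]])   # each column sums to 1

vals, vecs = np.linalg.eig(A)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
pi = v / v.sum()                   # normalize into a probability vector
print(pi)                          # approx [0.441, 0.235, 0.324] = (15, 8, 11)/34
print(np.allclose(A @ pi, pi))     # True: A pi = pi
```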




PageRank example: four web pages \(A,B,C,D\) link to one another, and each page passes its importance equally to the pages it links to.

  • Page \(\color{Red}A\) has \(3\) links, so it passes \(\frac 13\) of its importance to pages \(B,C,D\).
  • Page \(\color{blue}B\) has \(2\) links, so it passes \(\frac 12\) of its importance to pages \(C,D\).
  • Page \(\color{Green}C\) has one link, so it passes all of its importance to page \(A\).
  • Page \(\color{Purple}D\) has \(2\) links, so it passes \(\frac 12\) of its importance to pages \(A,C\).

In terms of matrices, if \(v = (a,b,c,d)\) is the vector containing the ranks \(a,b,c,d\) of the pages \(A,B,C,D\text{,}\) then

\[\left(\begin{array}{cccc}\color{Red}{0}&\color{blue}{0}&\color{Green}{1}&\color{Purple}{\frac{1}{2}} \\ \color{Red}{\frac{1}{3}}&\color{blue}{0}&\color{Green}{0}&\color{Purple}{0} \\ \color{Red}{\frac{1}{3}}&\color{blue}{\frac{1}{2}}&\color{Green}{0}&\color{Purple}{\frac{1}{2}} \\ \color{Red}{\frac{1}{3}}&\color{blue}{\frac{1}{2}}&\color{Green}{0}&\color{Purple}{0}\end{array}\right)\left(\begin{array}{c}a\\b\\c\\d\end{array}\right)=\left(\begin{array}{c}c+\frac{1}{2}d\\ \frac{1}{3}a\\ \frac{1}{3}a+\frac{1}{2}b+\frac{1}{2}d\\ \frac{1}{3}a+\frac{1}{2}b\end{array}\right)=\left(\begin{array}{c}a\\b\\c\\d\end{array}\right).\]

The matrix on the left is the importance matrix, and the final equality expresses the importance rule.
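The ranks are therefore the entries of the eigenvector of the importance matrix for eigenvalue 1, normalized to sum to 1; a sketch assuming NumPy:

```python
import numpy as np

# Importance matrix for pages A, B, C, D (column j lists how page j
# distributes its importance).
M = np.array([[0,   0,   1, 1/2],
              [1/3, 0,   0, 0  ],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0  ]])

vals, vecs = np.linalg.eig(M)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
ranks = v / v.sum()
print(dict(zip("ABCD", np.round(ranks, 3))))
# approx A: 0.387, B: 0.129, C: 0.290, D: 0.194
```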

Example: Cat and mouse



Quaternion spin



