線性代數(Linear Algebra)

Mathematics · Yoshio · Mar 24th, 2024 at 8:00 PM


“If you can't explain it to a six year old, you don't understand it yourself.”



Linear algebra provides an invaluable tool kit for computation tasks such as solving a system of linear equations or finding the inverse of a square matrix. Given that these tasks usually involve a large number of calculations, it is often the case that the geometric interpretation is underappreciated. Linear algebra is a genuine oddity in that it is possible to understand this entire discipline with multiple, distinct perspectives, each of which has its own merits and its own way of understanding this vast and elegant subject. A criminally overlooked perspective for new students of linear algebra is the one in which we think of matrices as a way of transforming vectors, hence offering us a well-developed tool kit to start describing (linear) spatial geometry.

Vectors are special types of matrices which can be split into two categories: row vectors and column vectors. A row vector is a matrix of order \(1\times n\), which has \(1\) row and \(n\) columns, whereas a column vector is a matrix of order \(m\times1\), which has \(m\) rows and \(1\) column. Although convention differs from source to source, it is arguably most sensible to think only in terms of column vectors. There are two reasons for this: column vectors are used more often and can be related to row vectors by transposition if needed; and column vectors combine with matrices under matrix multiplication with the matrix on the left-hand side (which is also simply a matter of convention, but it is the one used throughout this post).

\(a=\begin{pmatrix}3\\2\end{pmatrix},\;b=\begin{pmatrix}-5\\2\end{pmatrix}\)

The matrix multiplication \(AB\) is well defined so long as \(A\) has order \(m\times n\) and \(B\) has order \(n\times p\). The resulting matrix is of order \(m\times p\).

Suppose now that we were to consider the matrix products \(a'=Ma\) and \(b'=Mb\) for the following two choices of \(M\).

\(M=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}7\\3\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-1\\11\end{pmatrix}\)

\(M=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}4\\5\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-12\\-3\end{pmatrix}\)
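A quick numerical check of the two transformations above, written as a minimal NumPy sketch (assuming NumPy is available):

```python
import numpy as np

# The two vectors from above
a = np.array([3, 2])
b = np.array([-5, 2])

# First choice of M
M1 = np.array([[1, 2],
               [-1, 3]])
print(M1 @ a)   # [7 3]
print(M1 @ b)   # [-1 11]

# Second choice of M
M2 = np.array([[2, -1],
               [1, 1]])
print(M2 @ a)   # [4 5]
print(M2 @ b)   # [-12 -3]
```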


Eigenvalues and Eigenvectors

The prefix eigen- is adopted from the German word eigen (cognate with the English word own) for 'proper', 'characteristic', 'own'.

\(A \vec{v} =\lambda \vec{v} \)

Notice we multiply a square matrix by a vector and get the same result as when we multiply a scalar (just a number) by that vector.

We start by finding the eigenvalue. We know this equation must be true:

\(A \vec{v} =\lambda \vec{v} \)

Next we put in an identity matrix so we are dealing with matrix-vs-matrix:

\(A \vec{v} =\lambda {I} \vec{v} \)

Bring all to left hand side:

\(A \vec{v} -\lambda I \vec{v} = 0\)

If v is non-zero then we can (hopefully) solve for λ using just the determinant:

\(det(A - \lambda I) = 0\)

\(| A − \lambda I | = 0\)

Trace and determinant of A

The matrix A has n eigenvalues (including each according to its multiplicity). The sum of the n eigenvalues of A is the same as the trace of A (that is, the sum of the diagonal elements of A). The product of the n eigenvalues of A is the same as the determinant of A.
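These two facts are easy to check numerically; here is a small sketch (using NumPy, with the first transformation matrix from above as the test case):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [-1.0, 3.0]])

eigvals = np.linalg.eigvals(A)   # here they are complex: 2 + i and 2 - i

# Sum of the eigenvalues equals the trace, their product equals the determinant
print(np.isclose(eigvals.sum(), np.trace(A)))         # True  (4 = 1 + 3)
print(np.isclose(eigvals.prod(), np.linalg.det(A)))   # True  (5 = 1*3 - 2*(-1))
```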



Perpendicular Vector

While both involve multiplying the magnitudes of two vectors, the dot product results in a scalar quantity, which indicates magnitude but not direction, while the cross product results in a vector, which indicates magnitude and direction.


Inner product (Dot product)

Since the inner product generalizes the dot product, it is reasonable to say that two vectors are “orthogonal” (or “perpendicular”) if their inner product is zero. With this definition, the functions sin x and cos x are orthogonal on the interval [−π, π], since the integral of their product over that interval is zero.

\(\overset\rightharpoonup A\cdot\overset\rightharpoonup B=\sum_{i=1}^n\;a_ib_i=a_1b_1+a_2b_2+\cdots+a_nb_n\)
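The same idea can be checked numerically, both for ordinary vectors and for the function inner product mentioned above; a minimal sketch assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.integrate import quad

# Finite-dimensional dot product: zero means the vectors are orthogonal
u = np.array([1.0, 2.0, -1.0])
v = np.array([3.0, -1.0, 1.0])
print(np.dot(u, v))   # 0.0 -> orthogonal

# Function inner product <f, g> = integral of f(x) g(x) over [-pi, pi]
inner, _ = quad(lambda x: np.sin(x) * np.cos(x), -np.pi, np.pi)
print(round(inner, 12))   # 0.0 -> sin x and cos x are orthogonal on [-pi, pi]
```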


Outer product (Cross product)

A cross product of any two vectors yields a vector which is perpendicular to both of them.

That is, for two vectors \(\overrightarrow A\) and \(\overrightarrow B\), \(\overrightarrow C=\overrightarrow A\times\overrightarrow B\) is the vector perpendicular to both:

\(\overrightarrow C=\overrightarrow A\times\overrightarrow B=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\A_1&A_2&A_3\\B_1&B_2&B_3\end{vmatrix}\\=(A_2B_3-B_2A_3)\widehat i-(A_1B_3-B_1A_3)\widehat j+(A_1B_2-B_1A_2)\widehat k\)

For example, inserting \(\overrightarrow A=(2,\;3,\;4)\) and \(\overrightarrow B=(1,\;2,\;3)\) we obtain

\(\overrightarrow C=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\2&3&4\\1&2&3\end{vmatrix}\)

\(={(3\times3-2\times4)}\widehat i-{(2\times3-1\times4)}\widehat j+{(2\times2-1\times3)}\widehat k\)

\(=\widehat i-2\widehat j+\widehat k\)
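The same computation with NumPy, checking that the result really is perpendicular to both inputs (a minimal sketch):

```python
import numpy as np

A = np.array([2, 3, 4])
B = np.array([1, 2, 3])

C = np.cross(A, B)
print(C)                            # [ 1 -2  1]

# C is perpendicular to both A and B: the dot products vanish
print(np.dot(C, A), np.dot(C, B))   # 0 0
```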


Cross product trick

\( \vec{A} \times \vec{B}=\begin{vmatrix} \hat{i}&\hat{j}&\hat{k} \\ A_x & A_y & A_z \\ B_x&B_y&B_z \end{vmatrix} \)

\( \vec{A}\times\vec{B}=(A_y B_z- A_z B_y)\hat{i}+(A_z B_x - A_x B_z) \hat{j}+(A_x B_y- A_y B_x) \hat{k} \)



Matrix multiplication

Two important special cases of matrix multiplication are the inner (dot) product and the outer product. The inner product contracts two vectors into an object of reduced dimensions (a scalar), while the outer product expands them into one of expanded dimensions (a matrix). A helpful mnemonic: expand outward, contract inward.
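A small NumPy sketch of the mnemonic (assuming NumPy is available): the inner product contracts two length-3 vectors to a single number, while the outer product expands them to a 3×3 matrix.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Inner product "contracts": two length-3 vectors collapse to one scalar
print(np.inner(u, v))   # 32

# Outer product "expands": the same two vectors produce a 3x3 matrix
print(np.outer(u, v))
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]
```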


Eigenvalue Decomposition


\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\]

\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,1}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-1}\end{bmatrix}\]
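A quick check of this factorization with NumPy (a minimal sketch; the eigenvector matrix and eigenvalue matrix are taken straight from the display above):

```python
import numpy as np

A = np.array([[0, 1],
              [-2, -3]])

S = np.array([[-1, -1],
              [2, 1]])     # columns are the eigenvectors
Lam = np.diag([-2, -1])    # eigenvalues on the diagonal

# A S = S Lambda, hence A = S Lambda S^{-1}
print(np.allclose(A @ S, S @ Lam))                  # True
print(np.allclose(A, S @ Lam @ np.linalg.inv(S)))   # True
```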



Example (A)

As a real example of this, consider a simple model of the weather. Let’s say it either is sunny or it rains. If it’s sunny, there’s a 90% chance that it remains sunny, and a 10% chance that it rains the next day. If it rains, there’s a 40% chance it rains the next day, but a 60% chance of becoming sunny the next day. We can describe the time evolution of this process through a matrix equation:

\( \begin{bmatrix}p_{s,n+1}\\p_{r,n+1}\end{bmatrix}=\begin{bmatrix}0.9&0.6\\0.1&0.4\end{bmatrix}\begin{bmatrix}p_{s,n}\\p_{r,n}\end{bmatrix} \)

where \(p_{s,n}\) and \(p_{r,n}\) are the probabilities that it is sunny or that it rains on day \(n\). (If you are curious, this kind of process is called a Markov process, and the matrix that I have defined is called the transition matrix. This is, by the way, not a very good model of the weather.) One way to see that this matrix does provide the intended result is to see what results from matrix multiplication:

\(p_{s,n+1}=0.9p_{s,n}+0.6p_{r,n}\)
\(p_{r,n+1}=0.1p_{s,n}+0.4p_{r,n}\)

This is just the law of total probability! To get a sense of the long run probabilities of whether it rains or not, let’s look at the eigenvectors and eigenvalues.

\(\begin{bmatrix}0.986394\\0.164399\end{bmatrix}\) with eigenvalue 1, \(\begin{bmatrix}‐0.707\\0.707\end{bmatrix}\) with eigenvalue 0.3

Now, this matrix ends up having an eigenvector of eigenvalue 1 (in fact, all valid Markov transition matrices do), while the other eigenvector is both unphysical (probabilities cannot be negative) and has an eigenvalue less than 1. Rescaling the first eigenvector so that its entries sum to 1, we expect the long-run probability of it being sunny to be \(6/7\approx0.857\) and the probability of it being rainy to be \(1/7\approx0.143\) in this model. This is known as the stationary distribution. Any component of the vector in the direction of the unphysical eigenvector will wither away in the long run (its eigenvalue is 0.3, and \(0.3^k\rightarrow0\) as \(k\rightarrow\infty\)), while the one with eigenvalue 1 will survive and dominate.
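Both routes to the stationary distribution (iterating the chain, or rescaling the eigenvalue-1 eigenvector) can be checked with a few lines of NumPy; a minimal sketch:

```python
import numpy as np

T = np.array([[0.9, 0.6],
              [0.1, 0.4]])   # column-stochastic transition matrix

# Iterate the chain from an arbitrary starting distribution
p = np.array([0.5, 0.5])
for _ in range(50):
    p = T @ p
print(p)                     # ~[0.857 0.143], i.e. [6/7, 1/7]

# Same answer from the eigenvector of eigenvalue 1, rescaled to sum to 1
w, V = np.linalg.eig(T)
v = V[:, np.argmax(np.isclose(w, 1.0))].real
print(v / v.sum())           # ~[0.857 0.143]
```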

Eigenspaces

The set of all eigenvectors of T corresponding to the same eigenvalue, together with the zero vector, is called an eigenspace, or the characteristic space of T associated with that eigenvalue. If a set of eigenvectors of T forms a basis of the domain of T, then this basis is called an eigenbasis.

Eigenvectors of Symmetric Matrices Are Orthogonal

Pivot (軸元)

The pivot or pivot element is the element of a matrix, an array, or another finite set which is selected first by an algorithm (e.g. Gaussian elimination, quicksort, the simplex algorithm, etc.) to do certain calculations.



Markov chain

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain (DTMC). A continuous-time process is called a continuous-time Markov chain (CTMC). It is named after the Russian mathematician Andrey Markov.

Markov chains have many applications as statistical models of real-world processes, such as studying cruise control systems in motor vehicles, queues or lines of customers arriving at an airport, currency exchange rates and animal population dynamics.


Example (B)

The Linear Algebra View of the Fibonacci Sequence


The Fibonacci sequence first appears in the book Liber Abaci (The Book of Calculation, 1202) by Fibonacci where it is used to calculate the growth of rabbit populations.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ....

Each term in the Fibonacci sequence equals the sum of the previous two. That is, if we let \(F_n\) denote the \(n\)-th term in the sequence we can write

\(F_n=F_{n-1}+F_{n-2}\)

To make this into a linear system, we need at least one more equation. Let’s use an easy one:

\(F_{n-1}=F_{n-1}+\left(0\right)\times F_{n-2}\)

Putting these together, here’s our system of linear equations for the Fibonacci sequence:

\(F_n=F_{n-1}+F_{n-2}\)
\(F_{n-1}=F_{n-1}+0\)

We can write this in matrix notation as the following:

\(\begin{bmatrix}F_n\\F_{n-1}\end{bmatrix}=\begin{bmatrix}1&1\\1&0\end{bmatrix}\begin{bmatrix}F_{n-1}\\F_{n-2}\end{bmatrix}\)

To understand the long-run behaviour of this system we diagonalize the matrix, which starts with finding its eigenvalues. Here’s how we do that:

\(\begin{vmatrix}1-\lambda&1\\1&-\lambda\end{vmatrix}=0\)

\((1-\lambda)(-\lambda)-1=0\)

\(\lambda^2-\lambda-1=0\)

Solving this quadratic, we get the following two solutions, which are our two eigenvalues:

\(\lambda_1=\frac{1+\sqrt5}2\)
\(\lambda_2=\frac{1-\sqrt5}2\)

The next step is to find the two eigenvectors that correspond to the eigenvalues.

\(\left(A-\lambda I\right)x=0\)

Once we have the eigenvalues and eigenvectors, we can write the \(\mathrm S\) and \(\Lambda\) matrices as follows:

\(\mathrm S=\begin{bmatrix}\frac{1+\sqrt5}2&\frac{1-\sqrt5}2\\1&1\end{bmatrix}\)

\(\mathrm\Lambda=\begin{bmatrix}\frac{1+\sqrt5}2&0\\0&\frac{1-\sqrt5}2\end{bmatrix}\)

\(\mathrm S^{-1}=\frac1{\sqrt5}\begin{bmatrix}1&\frac{\sqrt5-1}2\\-1&\frac{\sqrt5+1}2\end{bmatrix}=\begin{bmatrix}\frac1{\sqrt5}&\frac{\sqrt5-1}{2\sqrt5}\\-\frac1{\sqrt5}&\frac{\sqrt5+1}{2\sqrt5}\end{bmatrix}\)


\(F_{38}=39088169\); what is \(F_{39}\)?
\(39088169\times\frac{1+\sqrt5}2=39088169\times1.61803398875\approx63245986\)
\(39088169\div\frac{1-\sqrt5}2=39088169\div(-0.61803398875)\approx-63245986\),
\(\left|-63245986\right|=63245986\)

\(F_{57}=365435296162\); what is \(F_{56}\)?
\(365435296162\div\frac{1+\sqrt5}2=365435296162\div1.61803398875\approx225851433717\)
\(365435296162\times\frac{1-\sqrt5}2=365435296162\times(-0.61803398875)\approx-225851433717\),
\(\left|-225851433717\right|=225851433717\)
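The same relations can be reproduced with the matrix form of the recurrence; a minimal NumPy sketch (64-bit integers are exact up to \(F_{92}\), which is plenty here):

```python
import numpy as np

F = np.array([[1, 1],
              [1, 0]], dtype=np.int64)

def fib_pair(n):
    """Return (F_n, F_{n-1}) from the matrix power [[1,1],[1,0]]^(n-1)."""
    M = np.linalg.matrix_power(F, n - 1)
    return M[0, 0], M[1, 0]

print(fib_pair(38))          # (39088169, 24157817)
print(fib_pair(39))          # (63245986, 39088169)

# The ratio of consecutive terms approaches the golden ratio (1 + sqrt(5)) / 2
print(63245986 / 39088169)   # 1.6180339887...
```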



Every beehive has one female queen bee who lays all the eggs. If an egg is fertilized by a male bee, then the egg produces a female worker bee, who doesn’t lay any eggs herself. If an egg is not fertilized it eventually hatches into a male bee, called a drone.


Fibonacci numbers also occur in real populations, and honeybees provide an example. A look at the pedigree (family tree) of a honey bee shows that the numbers of ancestors in successive generations follow the Fibonacci numbers.

In a colony of honeybees there is one distinctive female called the queen. The other females are worker bees who, unlike the queen bee, produce no eggs. Male bees do no work and are called drone bees. When the queen lays eggs, each egg can be either fertilised or unfertilised. An unfertilised egg results in a drone (male bee), which inherits all of its genome from the queen, while a fertilised egg results in a female worker bee, which inherits half of her genome from each parent.

Males are produced by the queen’s unfertilised eggs, so male bees only have a mother but no father. All the females are produced when the queen has mated with a male and so have two parents. Females usually end up as worker bees but some are fed with a special substance called royal jelly which makes them grow into queens ready to go off to start a new colony when the bees form a swarm and leave their home (a hive) in search of a place to build a new nest.

Vector and Linear combination

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix},\;column\;vector\)

Vector Addition

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix}\;and\;\underline w=\begin{bmatrix}w_1\\w_2\end{bmatrix}\;\Rightarrow\underline v+\underline w=\begin{bmatrix}v_1+w_1\\v_2+w_2\end{bmatrix}\)

Scalar Multiplication

\(2\underline v=\begin{bmatrix}2v_1\\2v_2\end{bmatrix}\;\Rightarrow\;c\underline v=\begin{bmatrix}cv_1\\cv_2\end{bmatrix}\)

Linear Combination

\(c\underline v+d\underline w=\begin{bmatrix}cv_1+dw_1\\cv_2+dw_2\end{bmatrix},\;c,\;d\in\mathbb R\)



derivative is linear

\(f(x)\rightarrow f'(x)\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot f'(x)\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot f'(x)+\beta\cdot g'(x)\)

integral is linear

\(f(x)\rightarrow\int_a^bf(x)dx\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot\int_a^bf(x)dx\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot\int_a^bf(x)dx+\beta\cdot\int_a^bg(x)dx\)

The Fourier Transform, Laplace transform, and Wavelet transforms are linear operators.


Linear Equations

\(x-2y=1\)

\(3x+2y=11\)

\(\begin{bmatrix}1&-2\\3&2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

\(A\underline x=\underline b\)

where \(A\) is the coefficient matrix, \(\underline x\) is the vector of unknowns, and \(\underline b\) is the right-hand-side vector.

row picture:

solution: x=3, y=1

column picture:

\(x\begin{bmatrix}1\\3\end{bmatrix}+y\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

Find the linear combination of the vectors on the left side that equals the vector on the right side: here \(3\) of the first column plus \(1\) of the second column gives \(\underline b\), so again \(x=3\), \(y=1\).
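Both pictures give the same numbers, which is easy to confirm with NumPy (a minimal sketch):

```python
import numpy as np

A = np.array([[1, -2],
              [3, 2]])
b = np.array([1, 11])

# Row picture: the intersection point of the two lines
x = np.linalg.solve(A, b)
print(x)                            # [3. 1.]

# Column picture: the same numbers combine the columns of A to give b
print(3 * A[:, 0] + 1 * A[:, 1])    # [ 1 11]
```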







\(A\underline x=\underline b\)

row picture: three different planes, one per equation, meeting at a point.

column picture: the same system seen as combining three column vectors to produce \(\underline b\).


Question: can we solve \(A\underline x=\underline b\) for every \(\underline b\) ?

Do the linear combinations of the columns fill the 3-dimensional space?

Yes for \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

No when the three column vectors lie in the same plane.


Gaussian Elimination

In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients. This method can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix.



Key ingredients: pivots, forward elimination, and back substitution.



In general, \(\begin{bmatrix}A&\vert&\underline b\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}\;U&\vert&\underline c\;\end{bmatrix}\) upper triangular matrix

\(A\underline x=\underline b\Rightarrow U\underline x=\underline c\)

In order to solve for the unknowns, the pivots cannot be zero.

When would the process break down?

(i) Sometimes rows need to be exchanged (a pivot position holds a zero but a nonzero entry lies below it): the nonsingular case.

(ii) Elimination produces a zero pivot that no row exchange can cure; the equations may then be solvable or unsolvable: the singular case.


Triangular matrix

Two particularly important types of triangular matrices are termed upper triangular and lower triangular—these are matrices that have, respectively, all 0 elements below and above the diagonal:

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(L)}=\prod_{k=1}^na_{kk}\)

\(U=\left[\begin{array}{llll}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(U)}=\prod_{k=1}^na_{kk}\)

The determinant and permanent of a triangular matrix equal the product of the diagonal entries.

Inverse Matrices

identity matrix \(I_n\)

\(I=\left[ \begin{array}{c}1\end{array} \right] , \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] ,\left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] ,\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right]\)

\(\delta _{ij}=\left\{ \begin{array}{c} 1 \text{ if }i=j \\ 0\text{ if }i\neq j \end{array} \right.\)

\(AI=IA=A\) for any \(n\times n\) matrix \(A\)

Def: An \(n\times n\) matrix \(A\) is invertible if there exists a matrix \(B\) such that \(BA=I\) and \(AB=I\).

\(B\) is called an inverse of \(A\).

Claim: Suppose \(A\) is invertible. Then its inverse is unique.

Proof:

Suppose \(A\) has two inverses \(B\) and \(C\)

Then \(BA=AB=I\) and \(CA=AC=I\)

We have \(B=BI=B(AC)=(BA)C=IC=C\)

\(B=IB=(CA)B=C(AB)=CI=C\)

Remark: The inverse of \(A\) is denoted as \(A^{-1}\)

Remark: The proof actually shows that if \(BA=AC=I\) then \(B=C\)

"left inverse" = "right inverse" = "inverse"

Claim: the inverse of \(A^{-1}\) is \(A\) itself

Proof: \(AA^{-1}=I\;and\;A^{-1}A=I\) since \(A^{-1}\) is the inverse of \(A\)

Therefore \(A\) is the inverse of \(A^{-1}\)

Claim: If \(A\) is invertible, then the one and only solution to \(A\underline x=\underline b\) is

\(\underline x=A^{-1}\underline b\)

Proof: \(A^{-1}A\underline x=A^{-1}\underline b\)

\(\Leftrightarrow I\underline x=A^{-1}\underline b\)

\(\Leftrightarrow\underline x=A^{-1}\underline b\)

Claim: Suppose there is a nonzero solution \(\underline x\) to \(A\underline x=\underline0\); then \(A\) cannot have an inverse.

Proof: If \(A\) is invertible, \(\Rightarrow(A^{-1}A)\underline x=A^{-1}\underline0\)

\(\therefore\underline x=\underline0\), contradicting the assumption that \(\underline x\) is nonzero.

Definition: A square \(n \times n\) matrix \(D\) is a Diagonal Matrix if all entries off the main diagonal are zero, that is for all \(i \neq j\), \(a_{ij} = 0\).

\[\begin{split}\boldsymbol D = \begin{bmatrix} d_1 & 0 & 0 & ... & 0 \\ 0 & d_2 & 0 & ... & 0 \\ 0 & 0 & d_3 & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & d_n \end{bmatrix}\end{split}\]

\[\begin{split}\boldsymbol D^{-1} = \begin{bmatrix} \frac{1}{d_1} & 0 & 0 & ... & 0 \\ 0 & \frac{1}{d_2} & 0 & ... & 0 \\ 0 & 0 & \frac{1}{d_3} & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & \frac{1}{d_n} \end{bmatrix}\end{split}\]

Claim: A diagonal matrix has an inverse provided no diagonal entries are zero.

Proof: Take the diagonal matrix whose \(i\)-th diagonal entry is \(1/d_i\), as displayed above; multiplying it with \(D\) in either order gives \(I\).

Claim: If \(A\) and \(B\) are invertible matrices of the same size, then so is \(AB\), and \({(AB)}^{-1}=B^{-1}A^{-1}\)

Proof: \((B^{-1}A^{-1})(AB)=B^{-1}(A^{-1}A)B=B^{-1}IB=B^{-1}B=I\)

\((AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AIA^{-1}=AA^{-1}=I\)

Claim: \({(ABC)}^{-1}=C^{-1}B^{-1}A^{-1}\)


[Ex] \(E=\begin{bmatrix}1&0&0\\-4&1&0\\0&0&1\end{bmatrix},\;E^{-1}=\begin{bmatrix}1&0&0\\4&1&0\\0&0&1\end{bmatrix}\)

[Ex] Permutation Matrix

row exchange matrix \(P_{32}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix},\;P_{32}^{-1}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}=P_{32}\)


What is Block Diagonal Matrix?

A matrix which is split into blocks is called a block matrix. In a block diagonal matrix the main-diagonal blocks are square matrices and all off-diagonal blocks are zero matrices: \(D_{ij}=0\) whenever \(i\neq j\).

An \(n\times n\) block diagonal matrix has the form (each \(a_{ii}\) may itself be a square block):

\(\begin{array}{l}\begin{bmatrix} a_{11} & 0 & 0 & . & . & 0\\ 0 & a_{22} & 0 & .& . &0 \\ 0& 0 & a_{33} & .& . & 0\\ .& . & . &. & .&. \\ .& . & . & . & . &. \\ 0& 0 & 0 &0 &0 & a_{nn} \end{bmatrix}\end{array} \)

Inverse of a Diagonal Matrix

The inverse of a diagonal matrix \(D\) is again a diagonal matrix whose main-diagonal entries are the reciprocals of the corresponding entries of \(D\).

Let \(D = \begin{bmatrix} a_{11} & 0& 0\\ 0 & a_{22} & 0\\ 0& 0 & a_{33} \end{bmatrix}\)

The determinant of the above matrix:

\(\left | D \right | = a_{11}\begin{vmatrix} a_{22} & 0\\ 0 & a_{33} \end{vmatrix} - 0 \begin{vmatrix} 0& 0\\ 0 & a_{33} \end{vmatrix} + 0 \begin{vmatrix} 0 & a_{22}\\ 0 & 0 \end{vmatrix}\)

\(= a_{11}a_{22}a_{33}\)

\(\begin{array}{l}Adj D = \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(\begin{array}{l}D^{-1} = \frac{1}{\left | D \right |} adj D\end{array} \)

\(=\begin{array}{l}\frac{1}{a_{11}a_{22}a_{33}} \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(=\begin{array}{l}\begin{bmatrix} \frac{1}{a_{11}} &0 & 0\\ 0 & \frac{1}{a_{22}} &0 \\ 0& 0 & \frac{1}{a_{33}} \end{bmatrix}\end{array} \)

Anti Diagonal Matrix

An anti-diagonal matrix is one whose entries are all zero except those on the anti-diagonal, which runs from the lower-left corner to the upper-right corner.

It is of the form:

\(\begin{array}{l}\begin{bmatrix} 0 & 0 &a_{13} \\ 0 & a_{22}&0 \\ a_{31}& 0 & 0 \end{bmatrix}\end{array} \)

Gauss-Jordan Elimination

Given \(A\), we want to find its inverse \(A^{-1}\)

\(AA^{-1}=I\)

\(\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=A^{-1}\)

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\underline{e_3}\end{bmatrix}\)

\(\underline{e_1}=\begin{bmatrix}1\\0\\0\end{bmatrix},\;\underline{e_2}=\begin{bmatrix}0\\1\\0\end{bmatrix},\;\underline{e_3}=\begin{bmatrix}0\\0\\1\end{bmatrix},\)

\(A\underline{x_1}=\underline{e_1}\)

\(A\underline{x_2}=\underline{e_2}\)

\(A\underline{x_3}=\underline{e_3}\)

\(A^{-1}\cdot\begin{bmatrix}A&I\end{bmatrix}=\begin{bmatrix}A^{-1}A&A^{-1}I\end{bmatrix}=\begin{bmatrix}I&A^{-1}\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\-1&2&-1&0&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

above - Gauss

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}2&0&0&\frac32&1&\frac12\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0&\frac34&\frac12&\frac14\\0&1&0&\frac12&1&\frac12\\0&0&1&\frac14&\frac12&\frac34\end{bmatrix}\)

above - Jordan

\(\therefore\underline{x_1}=\begin{bmatrix}\frac34\\\frac12\\\frac14\end{bmatrix},\;\underline{x_2}=\begin{bmatrix}\frac12\\1\\\frac12\end{bmatrix},\;\underline{x_3}=\begin{bmatrix}\frac14\\\frac12\\\frac34\end{bmatrix}\)

\(A^{-1}=\begin{bmatrix}\frac34&\frac12&\frac14\\\frac12&1&\frac12\\\frac14&\frac12&\frac34\end{bmatrix}\)
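The whole Gauss-Jordan procedure fits in a short function; the sketch below (my own illustration, not a library routine) row-reduces \(\lbrack A\;\vert\;I\rbrack\) to \(\lbrack I\;\vert\;A^{-1}\rbrack\) and reproduces the inverse just computed:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing the augmented matrix [A | I] to [I | A^-1]."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])              # the augmented matrix [A | I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # best available pivot
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]      # row exchange if needed
        M[col] /= M[col, col]                  # scale the pivot row to get a 1
        for row in range(n):                   # clear the rest of the column
            if row != col:
                M[row] -= M[row, col] * M[col]
    return M[:, n:]                            # the right half is A^-1

A = np.array([[2, -1, 0],
              [-1, 2, -1],
              [0, -1, 2]])
print(gauss_jordan_inverse(A))
# [[0.75 0.5  0.25]
#  [0.5  1.   0.5 ]
#  [0.25 0.5  0.75]]
```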


In general, start from \(\lbrack\;X\;\vert\;I\;\rbrack\)

Each elementary row operation is a multiplication on the left, say by \(A_1\): \(\lbrack\;A_1X\;\vert\;A_1\;\rbrack\)

After all of them: \(\lbrack\;A_n\cdots A_1X\;\vert\;A_n\cdots A_1\;\rbrack\)

Let’s use \(Y\;=\;A_n\cdots A_1\) for convenience. So we have \(\lbrack\;Y X\;\vert\;Y\;\rbrack\)

When the left half has been reduced to the identity, \(Y X\;=\;I\)

That can only mean that \(Y\;=\;X^{-1}\)


Def: an \(n\times n\) matrix is nonsingular if it has a full set of \(n\) (nonzero) pivots.

Claim: A matrix is invertible if and only if (iff) it is nonsingular.

[Proof]

Suppose A is nonsingular, i.e. it has a full set of n pivots.

Then by Gauss-Jordan elimination, we can find a matrix B such that \(AB=I\).

since \(A\underline{x_i}=\underline{e_i}\) is solvable for all \(i=1,\;2,\;...,\;n\)

On the other hand, Gauss-Jordan elimination is really a sequence of multiplications by elementary matrices on the left.

\((D^{-1}..E..P..E)A=I\)

where \(E_{ij}\): subtract a multiple \(l_{ij}\) of row \(j\) from row \(i\)

\(P_{ij}\): exchange rows \(i\) and \(j\)

\(D^{-1}\): divide all rows by their pivots

That is, there is a matrix \(C=(D^{-1}..E..P..E)\) such that \(CA=I\).

Therefore \(B=C=A^{-1}\), and \(A\) is invertible.

If \(A\) does not have \(n\) pivots, elimination will lead to an all-zero row.

That is, there is an invertible \(M\) such that a row of \(MA\) is zero.

If \(AC=I\) were possible, then \(MAC=M\).

But \(MA\) has a zero row, so \(MAC\) has a zero row,

hence \(M\) would have a zero row, which is a contradiction since \(M\) is invertible.

Therefore \(A\) is not invertible.


From another viewpoint:

\(\lbrack A\;I\rbrack\Rightarrow\lbrack I\;A^{-1}\rbrack\) by performing elementary row operations

\(A^{-1}\lbrack A\;I\rbrack=\lbrack I\;A^{-1}\rbrack\)


Elimination=Factorization \(A=LU\)

[Ex] \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\-2&7&2\end{bmatrix}\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&8&3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\) upper triangular

\(E_{21}=\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}=\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\)

\(E_{32}=\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\)

\(E_{21}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\)

\(E_{32}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

In general \(E_{21}=\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix},\;E_{31}=\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&0&1\end{bmatrix},\;E_{32}=\begin{bmatrix}1&0&0\\0&1&0\\0&-l_{32}&1\end{bmatrix}\)

Inverse \(E_{21}^{-1}=\begin{bmatrix}1&0&0\\l_{21}&1&0\\0&0&1\end{bmatrix},\;E_{31}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\l_{31}&0&1\end{bmatrix},\;E_{32}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\0&l_{32}&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\)

\({(E_{32}E_{31}E_{21})}^{-1}(E_{32}E_{31}E_{21})A={(E_{32}E_{31}E_{21})}^{-1}U\)

\(A=(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1})U\)

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\\boxed2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\\boxed{-1}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&\boxed{-1}&1\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&0\\\boxed2&1&0\\\boxed{-1}&\boxed{-1}&1\end{bmatrix}\)

In general,

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\l_{31}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\l_{31}&l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\l_{21}&1&0\\l_{31}&l_{32}&1\end{bmatrix}\)=L, lower triangular matrix: the multipliers drop straight into place

Note: \(E_{32}E_{31}E_{21}\neq\begin{bmatrix}1&0&0\\-l_{21}&1&0\\-l_{31}&-l_{32}&1\end{bmatrix}\)

\(A=LU\) if no row exchanges are required

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}\boxed2&1&1\\0&\boxed{-8}&-2\\0&0&\boxed1\end{bmatrix}\Rightarrow2,\;-8,\;1\;are\;pivots\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}=\begin{bmatrix}\ell_{11}&0&0\\\ell_{21}&\ell_{22}&0\\\ell_{31}&\ell_{32}&\ell_{33}\end{bmatrix}\begin{bmatrix}u_{11}&u_{12}&u_{13}\\0&u_{22}&u_{23}\\0&0&u_{33}\end{bmatrix}.\)


We can further split U into

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}=\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(\therefore A=LDU\) if no row exchanges are required.

Let A be a square matrix. When a factorization A = LDU exists where
1. L is a lower triangular matrix with all main diagonal entries 1, (which records the steps of elimination.)
2. U is an upper triangular matrix with all diagonal entries 1, and
3. D is a diagonal matrix with all main diagonal entries nonzero, (with pivots on the diagonal.)
then that factorization is unique.

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(=LDU\)
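Elimination without row exchanges is short enough to code directly; the sketch below (a teaching version, not production code) reproduces the \(L\), \(U\), and \(D\) factors of the worked example:

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU factorization without row exchanges.

    Assumes every pivot turns out nonzero, as in the worked example above.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for col in range(n):
        for row in range(col + 1, n):
            L[row, col] = U[row, col] / U[col, col]   # the multiplier l_ij
            U[row] -= L[row, col] * U[col]            # eliminate below the pivot
    return L, U

A = np.array([[2, 1, 1],
              [4, -6, 0],
              [-2, 7, 2]])
L, U = lu_no_pivot(A)
print(L)   # [[1, 0, 0], [2, 1, 0], [-1, -1, 1]]
print(U)   # [[2, 1, 1], [0, -8, -2], [0, 0, 1]]

# Pull the pivots out of U to obtain A = L D U
D = np.diag(np.diag(U))
U_unit = U / np.diag(U)[:, None]          # divide each row of U by its pivot
print(np.allclose(A, L @ D @ U_unit))     # True
```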


Claim:

If \(A=L_1D_1U_1\) and \(A=L_2D_2U_2\) are two such factorizations of the same matrix,

then \(L_1=L_2,\;D_1=D_2,\;U_1=U_2\).


Uniqueness Proof:

Assume \(LDU = L_1D_1U_1\). Objective: Prove \(X = X_1\) for each \(X = L, D, U\).

Multiply both sides on the left by \(\color{green}{L_1^{-1}}\) and on the right by \(\color{#D555D1}{U^{-1}}\): \(\color{green}{L_1^{-1}}LDU\color{#D555D1}{U^{-1}} = \color{green}{L_1^{-1}}L_1D_1U_1\color{#D555D1}{U^{-1}} \iff \color{green}{L_1^{-1}}LD = D_1U_1\color{#D555D1}{U^{-1}}\) \(\Longrightarrow (\text{Lower triangular})D = D_1(\text{Upper triangular}).\)

\(\color{red}{\bigstar} \) The left side is lower triangular and the right side is upper triangular, so both sides are diagonal. Since \( L,U,L_1,U_1\) have diagonal \(1\)’s, it follows that \(D = D_1\). Then \(\color{green}{L_1^{-1}}L = I\) and \(U_1\color{#D555D1}{U^{-1}} = I. \blacksquare\)

First notice that \(L_1^{-1}L\) is a lower triangular matrix and has diagonal 1's, since both \(L_1^{-1}\) and \(L\) are lower triangular and have diagonal 1's.

Nonzero entries below the diagonal of \(L_1^{-1}L\) will cause nonzero entries below the diagonal of \(L_1^{-1}LD\) (since \(D\) is also diagonal). And if that happens, \(L_1^{-1}LD\) cannot equal to \(D_1U_1U^{-1}\), thus entries below the diagonal of \(L_1^{-1}L\) are all 0's.

Now we can conclude \(L_1^{-1}L=I\). Similarly, \(U_1U^{-1}=I\).



LU decomposition

Solving linear systems:

LU decomposition, short for Lower-Upper decomposition, is a matrix factorization technique used to break down a square matrix into the product of a lower triangular matrix (L) and an upper triangular matrix (U). It's commonly employed to simplify solving systems of linear equations and calculating determinants. It can be done by first solving a system with the lower triangular matrix L and then solving a system with the upper triangular matrix U.


One Square System = Two Triangular Systems

\(A\underline x=\underline b\)

Suppose elimination requires no row exchanges

\(A=LU\)

\(\Rightarrow LU\underline x=\underline b\)

\(U\underline x=L^{-1}\underline b=\underline c\)

We have \(U\underline x=\underline c\;where\;L\underline c=\underline b\)

1. Factor A=LU by Gaussian elimination.

2. Solve \(\underline c\;from\;L\underline c=\underline b\) (forward elimination)

and then solve \(U\underline x=\underline c\) (back substitution)


[Ex]

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(A\underline x=\underline b\)

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

A=LU

\(\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(L\underline c=\underline b\)

\(\underline c=\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(U\underline x=\underline c\)

\(\underline x=\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}1\\1\\2\end{bmatrix}\)
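The two triangular solves can be done with SciPy's triangular solver; a minimal sketch using the factors from above (assuming SciPy is available):

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1, 0, 0],
              [2, 1, 0],
              [-1, -1, 1]], dtype=float)
U = np.array([[2, 1, 1],
              [0, -8, -2],
              [0, 0, 1]], dtype=float)
b = np.array([5, -2, 9], dtype=float)

# Step 1: forward substitution for L c = b
c = solve_triangular(L, b, lower=True)
print(c)   # [  5. -12.   2.]

# Step 2: back substitution for U x = c
x = solve_triangular(U, c, lower=False)
print(x)   # [1. 1. 2.]
```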



Complexity of Elimination

1. solve \(A\underline x=\underline b\)

\(A=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]=LU\)

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right],\;U=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

[Elimination]: the 1st stage requires about \(n(n-1)\approx n^2\) operations (counting multiplications and, separately, additions)

2nd stage \((n-1)(n-2)\approx{(n-1)}^2\)

3rd stage \(\approx{(n-2)}^2\)

\(n^2+{(n-1)}^2+...+1^2=\frac13n(n+\frac12)(n+1)\)

\(\approx\frac13n^3\)


\(L\underline c=\underline b\)

\(\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix}\)

\((n-1)+(n-2)+...+1=\frac12n(n-1)\approx\frac12n^2\)


\(U\underline x=\underline c\)

\(\left[\begin{array}{lccc}p_1&a_{12}&\cdots&a_{1n}\\0&p_2&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&p_n\end{array}\right]\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(1+2+3+...+n=\frac12n(n+1)\approx\frac12n^2\)

\(\#=\frac12n^2+\frac12n^2=n^2\ll\frac13n^3\)

\(Total\;\#=\frac13n^3\)

The counts above are for a single right-hand side, i.e. for solving one column vector \(\underline b\) only.


Compute \(A^{-1}\)

As \(A^{-1}\) is an \(n\times n\) matrix, we need to solve \(n\) column systems, one for each column of the identity. Please don't confuse this complexity with that of the previous section.

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\cdots&\underline{x_n}\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\cdots&\underline{e_n}\end{bmatrix}\)

where \(\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\) has its \(i\)-th component equal to \(1\) and all others \(0\)

solve \(A\underline x=\underline b,\;\#=\frac13n^3\)

\(L\underline{c_i}=\;\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\)

\(\#=\frac12{(n-i+1)}^2\)

Since \(1\leq i\leq n\), \(\#=\frac{n^2}2+\frac{{(n-1)}^2}2+...+\frac{2^2}2+\frac{1^2}2\approx\frac{n^3}6\)


\(U\underline{x_i}=\underline{c_i}\)

\(\#=\frac12n^2\)

Since \(1\leq i\leq n\), \(\#=n\cdot\frac{n^2}2=\frac{n^3}2\)

Total \(\#=\frac{n^3}3+\frac{n^3}6+\frac{n^3}2=n^3\)

compared with the \(\#\) for \(A^2=A\cdot A\Rightarrow n\cdot n^2=n^3\)



LU decompositions are NOT unique

\(A=LU=\begin{bmatrix}l_{11}&0&0\\l_{21}&l_{22}&0\\l_{31}&l_{32}&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&0&0\\0&l_{22}&0\\0&0&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&l_{11}u_{12}&l_{11}u_{13}\\0&l_{22}&l_{22}u_{23}\\0&0&l_{33}\end{bmatrix}\)

LDU decomposition

If a square, invertible matrix has an LDU (factorization with all diagonal entries of L and U equal to 1), then the factorization is unique.

Transposes and Permutations

\(A=\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix},\;A^T=\begin{bmatrix}1&4&7\\2&5&8\\3&6&9\end{bmatrix}\;\rightarrow\;transpose\;of\;A\)

\(A=\begin{bmatrix}1&2&3\\0&0&4\end{bmatrix},\;A^T=\begin{bmatrix}1&0\\2&0\\3&4\end{bmatrix}\)

\({(A^T)}_{ij}=A_{ji}\)

\(claim:\;{(A+B)}^T=A^T+B^T\)

\(claim:\;{(AB)}^T=B^TA^T,\;A:n\times m,\;B:m\times l\)

Proof: \({\lbrack{(AB)}^T\rbrack}_{ij}={(AB)}_{ji}=\overset m{\underset{k=1}\sum}A_{jk}B_{ki}=\sum_{k=1}^m{(A^T)}_{kj}{(B^T)}_{ik}\)

\(=\sum_{k=1}^m{(B^T)}_{ik}{(A^T)}_{kj}={(B^TA^T)}_{ij}\)

Remark \({(ABC)}^T=C^TB^TA^T\)

\(claim:\;{(A^{-1})}^T={(A^T)}^{-1}\)

Proof: \(AA^{-1}=I\Rightarrow{(A^{-1})}^TA^T={(AA^{-1})}^T=I^T=I\)

\(A^{-1}A=I\Rightarrow A^T{(A^{-1})}^T={(A^{-1}A)}^T=I^T=I\)

\(\Rightarrow{(A^{-1})}^T={(A^T)}^{-1}\)

Def: An \(n\times n\) matrix \(A\) is symmetric if \(A^T=A\)

Remark: \(A_{ij}=A_{ji}\) if \(A\) is symmetric

\(A=\begin{bmatrix}1&2\\2&5\end{bmatrix}=A^T,\;D=\begin{bmatrix}1&0\\0&10\end{bmatrix}=D^T\)

Claim: Given any matrix \(R\) of size \(n\times m\), the product \(R^TR\) (of size \(m\times m\)) is symmetric

Proof: \({(R^TR)}^T=R^T{(R^T)}^T=R^TR\)

Remark: \(RR^T\) is also symmetric

\(A=\begin{bmatrix}1&2\\2&7\end{bmatrix}\rightarrow symmetric\)

\(\begin{bmatrix}1&0\\-2&1\end{bmatrix}\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&0\\0&3\end{bmatrix}\begin{bmatrix}1&2\\0&1\end{bmatrix}\)

\(A=LDU\Rightarrow LDL^T\;as\;U=L^T\)

Claim: If a symmetric matrix is factored into \(LDU\) with no row exchanges, then \(U=L^T\), i.e.

\(A=LDL^T\)

Proof: \(A=LDU\Rightarrow A^T=U^TD^TL^T=U^TDL^T\)

Since \(A\) is symmetric, \(A=LDU=U^TDL^T\)

Recall that this factorization is unique. We then have \(U=L^T\)


Permutation Matrix

Def: A permutation matrix has the rows of the Identity I in any order.

There are six 3x3 permutation matrices.



In general, there are \(n!\) permutation matrices of order \(n\).

Claim: If \(P\) is a permutation matrix, then \(P^{-1}=P^T\)

Proof: Each row of \(P\) is a row of \(I\): it has a single \(1\) in some column and \(0\)s elsewhere, and the corresponding column of \(P^T\) is that same pattern. Different rows of \(P\) have their \(1\)s in different columns, so

\({(PP^T)}_{ii}=1\) for all \(i\)

\({(PP^T)}_{ij}=0\) for all \(i\neq j\)

\(\therefore PP^T=I\)

similarly, \(P^TP=I\)


Recall that if no row exchanges are required, then \(A=LU\).

If row exchanges are needed, we then have

\((\cdots P_1\cdots E_2\cdots P_3\cdots E_4)A=U\)

\(\Rightarrow A=(\cdots E_4^{-1}\cdots P_3^{-1}\cdots E_2^{-1}\cdots P_1^{-1})U\)

If row exchanges are needed during elimination, we can do them in advance.

The product \(PA\) will put the rows in the right order so that no row exchanges are needed for \(PA\).

Hence \(PA=LU\)

\(\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&3&7\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\2&3&1\end{bmatrix}\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(PA=LU\)
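SciPy's `scipy.linalg.lu` computes exactly this kind of factorization, returning the factors as \(A=PLU\); note that its partial-pivoting strategy may pick a different (but equally valid) permutation than the hand computation above. A minimal sketch:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [2.0, 7.0, 9.0]])

P, L, U = lu(A)        # factors satisfying A = P @ L @ U
print(P)
print(L)
print(U)
print(np.allclose(A, P @ L @ U))   # True
```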

If we hold row exchanges until after elimination, we then have \(A=LPU\)


Elimination with only the necessary row exchanges will produce the triangular factorization \(A=LPU\), with the (unique) permutation P in the middle. The entries in L are reordered in comparison with the more familiar \(A=PLU\) (where P is not unique).


[Ex]

\(A=\begin{bmatrix}2&1&1\\2&1&-1\\1&3&1\end{bmatrix}\)

\(P_{23}A=\begin{bmatrix}2&1&1\\1&3&1\\2&1&-1\end{bmatrix}\)

\(P_{23}A=LU=\begin{bmatrix}1&0&0\\\frac12&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=P_{23}^{-1}LU\)


\(E_1A=\begin{bmatrix}1&0&0\\-1&1&0\\-\frac12&0&1\end{bmatrix}A=\begin{bmatrix}2&1&1\\0&0&-2\\0&\frac52&\frac12\end{bmatrix}\)

\(P_{23}E_1A=\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}=U'\)

\(E_1A=P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=E_1^{-1}P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\1&1&0\\\frac12&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=LPU\)

Vector Space and Subspaces


Matrix

Singular Value Decomposition



Example (C) Google PageRank

Let's assume the Web contains 6 pages only. Page 2 provides links to page 3 and page 4. The links between these and the other pages in this simple web are summarised in the link matrix \(A\) below.



Google's task was to find the "most important" page for a particular search query; the higher-priority page would return at the top of the search results. This is where eigenvalues and eigenvectors come in. Considering Page 1, it has 4 outgoing links (to pages 2, 4, 5, and 6), so in the first column of our "links matrix" we place the value \(\frac14\) in each of rows 2, 4, 5 and 6.

\(A=\begin{bmatrix}0&0&0&0&\frac12&0\\\frac14&0&0&0&0&0\\0&\frac12&0&0&0&0\\\frac14&\frac12&0&0&\frac12&0\\\frac14&0&1&1&0&1\\\frac14&0&0&0&0&0\end{bmatrix}\)

Next, to find the eigenvalues. We have

\(\left|A-\lambda I\right|=\begin{vmatrix}-\lambda&0&0&0&\frac12&0\\\frac14&-\lambda&0&0&0&0\\0&\frac12&-\lambda&0&0&0\\\frac14&\frac12&0&-\lambda&\frac12&0\\\frac14&0&1&1&-\lambda&1\\\frac14&0&0&0&0&-\lambda\end{vmatrix}\)

\(=\lambda^6-\frac{5\lambda^4}8-\frac{\lambda^3}4-\frac{\lambda^2}8\)

This expression is zero for \(\lambda=-0.72031,\;-0.13985,\;\pm0.39240j,\;0,\;1\).

We can only use non-negative, real values of \(\lambda\), so the relevant eigenvalue is \(\lambda=1\).

we find the corresponding eigenvector is: \(V_1=\begin{bmatrix}4&1&0.5&5.5&8&1\end{bmatrix}^T\)

As Page 5 has the highest PageRank, we conclude it is the most "important", and it will appear at the top of the search results.

We often normalize this vector so the sum of its elements is 1: we just add up the entries and divide each one by that total. The result is the normalized vector \(P\), for "PageRank".

\(P=\begin{bmatrix}0.2&0.05&0.025&0.275&0.4&0.05\end{bmatrix}^T\)
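The same ranking comes out of a short power iteration, which is how this eigenvector is found in practice for large webs; a minimal NumPy sketch using the 6-page link matrix above:

```python
import numpy as np

A = np.array([
    [0,    0,   0, 0, 0.5, 0],
    [0.25, 0,   0, 0, 0,   0],
    [0,    0.5, 0, 0, 0,   0],
    [0.25, 0.5, 0, 0, 0.5, 0],
    [0.25, 0,   1, 1, 0,   1],
    [0.25, 0,   0, 0, 0,   0],
])

# Power iteration: repeatedly apply A and renormalise; it converges to the
# eigenvector of the dominant eigenvalue (which is 1 here)
p = np.full(6, 1 / 6)
for _ in range(200):
    p = A @ p
    p = p / p.sum()
print(np.round(p, 3))   # [0.2   0.05  0.025 0.275 0.4   0.05 ]
```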

Visualization of Singular Value decomposition of a Symmetric Matrix

The Singular Value Decomposition of a matrix A satisfies

\(\mathbf A = \mathbf U \mathbf \Sigma \mathbf V^\top\)



But when \(\mathbf A\) is symmetric we can do:

\( \begin{align*} \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf U\mathbf \Sigma\mathbf V^\top)^\top\\ \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf V\mathbf \Sigma\mathbf U^\top) \end{align*} \)

and since \(\mathbf V\) is an orthogonal matrix (\(\mathbf V^\top \mathbf V=\mathbf I\)), we have:

\(\mathbf A\mathbf A^\top=\mathbf U\mathbf \Sigma^2 \mathbf U^\top\)
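A quick numerical confirmation of this identity (a minimal NumPy sketch with an arbitrary symmetric 2×2 matrix chosen for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # a symmetric matrix

U, s, Vt = np.linalg.svd(A)

# A A^T equals U diag(s^2) U^T, as derived above
print(np.allclose(A @ A.T, U @ np.diag(s**2) @ U.T))   # True
```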


singular value decomposition



Multilinear singular value decomposition and low multilinear rank approximation

\[{\left[\begin{array}{cc}{\mathbf A}_{11}&{\mathbf A}_{12}\\{\mathbf A}_{21}&{\mathbf A}_{22}\end{array}\right]}{\left[\begin{array}{c}{\mathbf x}_1\\{\mathbf x}_2\end{array}\right]}={\left[\begin{array}{c}{\mathbf b}_1\\{\mathbf b}_2\end{array}\right]}\]

\[\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\]



tricks for computing eigenvalues



Matrix 2x2

\(\begin{bmatrix}a&b\\c&d\end{bmatrix}\xrightarrow{quick}\lambda_1,\;\lambda_2\)

1) \(\frac12tr\left(\begin{bmatrix}\boxed a&b\\c&\boxed d\end{bmatrix}\right)=\frac{a+d}2=\frac{\lambda_1+\lambda_2}2=m\;(mean)\)

2) \(det\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=ad-bc=\lambda_1\lambda_2=p\;(product)\)

\(p\;=\;m^2\;-\;\delta^2\;=(m\;+\;\delta)(m\;-\;\delta)\), where \(\delta\) is the distance of each eigenvalue from the mean

\(\delta^2\;=\;m^2\;-\;p\)

3) \(\lambda_1,\;\lambda_2\;=\;m\pm\delta\;=\;m\pm\sqrt{m^2-p}\;\)
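The trick translates directly into a tiny function; a sketch (the `complex` call keeps it working even when the eigenvalues are complex):

```python
import numpy as np

def eig2x2(a, b, c, d):
    """Mean-product trick for the eigenvalues of [[a, b], [c, d]]."""
    m = (a + d) / 2              # mean of the eigenvalues = half the trace
    p = a * d - b * c            # product of the eigenvalues = determinant
    delta = np.sqrt(complex(m * m - p))
    return m + delta, m - delta

print(eig2x2(0, 1, -2, -3))      # ((-1+0j), (-2+0j)), matching the earlier example
```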


Matrix 3x3

\(A=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

\(\lambda^3-\beta_1\lambda^2+\beta_2\lambda-\beta_3=0\)

\(\beta_1\) = trace of the matrix = sum of the diagonal elements:

\(\beta_1=a_{11}+a_{22}+a_{33}\)

\(\beta_2\) = sum of the minors of the diagonal elements (for each diagonal entry, delete its row and column and take the determinant of what remains):

\(\begin{vmatrix}\left(a_{11}\right)&a_{12}&a_{13}\\a_{21}&\left(a_{22}\right)&a_{23}\\a_{31}&a_{32}&\left(a_{33}\right)\end{vmatrix}\)

\(\beta_2=\begin{vmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{vmatrix}+\begin{vmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{vmatrix}+\begin{vmatrix}a_{11}&a_{13}\\a_{31}&a_{33}\end{vmatrix}\)

\(\beta_3=det(A)\)

\(\beta_3=\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(det(A)=\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}\)

Equivalently, by the rule of Sarrus (copy the leading columns alongside the matrix, add the products along the three down-right diagonals, and subtract the products along the three up-right diagonals):

\(det(A)=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}-a_{13}a_{22}a_{31}-a_{11}a_{23}a_{32}-a_{12}a_{21}a_{33}\)


Example (D)

Humans and Zombies

Let me a give a toy example. Suppose that I have a population of humans and zombies. We’ll check in on them once every few days (what’s the worst that can happen?). Suppose that every human has an 80% chance of surviving those few days, and a 20% chance of turning into a zombie. Thankfully, the humans aren’t completely defenseless—they have found a cure! Unfortunately, it’s very difficult to apply the cure, so every zombie has a 99% chance of remaining a zombie, and a 1% chance of actually being cured. Let’s write this information down in matrix form:

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\)

Here is why this is useful: if I write the number of humans \(H\) and the number of zombies \(Z\) as a vector \((H, Z)\), then multiplying by the above matrix will tell me exactly how many humans and how many zombies I should expect a couple of days later. For instance, suppose that I start with one zombie and 99 humans, and check back in two days. Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}79.21\\20.79\end{pmatrix}\)

so I’m expecting that there should be about 79 humans and 21 zombies at this point. What if I check back in another two days? Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}79.21\\20.79\end{pmatrix}\approx\begin{pmatrix}63.58\\36.42\end{pmatrix}\)

so I’m expecting that there should be about 64 humans and 36 zombies. I can keep going: after six days, there will be 51 humans and 49 zombies; after eight days, there will be 41 humans and 59 zombies; after ten days, there will be 34 humans and 66 zombies…

This is starting to look worrisome. Are the humans going to be wiped out? It’s certainly possible, but we can use some techniques from linear algebra and calculus to figure out what will happen in the long term. To wit, what we have shown is that the number of humans and zombies after \(2n\) days will be

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}H\\Z\end{pmatrix}\)

On the other hand, one can work out (look up Jordan decomposition if you are interested—you may need to brush up on some linear algebra first) that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\approx\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

from which it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n=\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^n\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

but since

\(\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^{n\rightarrow\infty}\rightarrow\begin{pmatrix}1&0\\0&0\end{pmatrix}\)

as \(n→∞\) (that is, as \(n\) gets very large, the matrix that we multiply by itself over and over again gets closer and closer to the matrix on the right hand side), it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n=\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}=\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\)

Thus, if we start with 99 humans and 1 zombie, the expectation is that after a long period of time, we’ll have

\(\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}\approx\begin{pmatrix}4.76\\95.24\end{pmatrix}\)

—that is, about 5 humans and 95 zombies. The good news is that the humans survive; the bad news is that they only just barely survive.
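The whole story can be replayed with matrix powers; a minimal NumPy sketch:

```python
import numpy as np

M = np.array([[0.80, 0.01],
              [0.20, 0.99]])
pop = np.array([99.0, 1.0])    # 99 humans, 1 zombie

print(M @ pop)                                # [79.21 20.79]   after 2 days
print(np.linalg.matrix_power(M, 2) @ pop)     # ~[63.58 36.42]  after 4 days
print(np.linalg.matrix_power(M, 500) @ pop)   # ~[ 4.76 95.24]  in the long run
```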

Naturally, zombies aren’t very “real life,” but the spread of genes through a population is, and Markov chains can be applied to that perfectly well. Figuring out how to weigh different websites based on how many other sites link to them can also be modeled as a Markov process—this is the fundamental observation behind PageRank, Google’s algorithm for figuring out what websites it should display first. Predictive text works on basically the same principle. In fact, I strongly recommend looking at the Wikipedia page for applications of Markov chains, because there are a lot of them, coming from all sorts of different disciplines.


Zombie outbreak

Let \(y_t, z_t\) represent the number of human and zombies (respectively) \(t\) months after an initial outbreak. Suppose the outbreak obeys the predator-prey model

\(\begin{bmatrix}y_{t+1}\\z_{t+1}\end{bmatrix}=\begin{bmatrix}1.02&-0.1\\0.08&0.83\end{bmatrix}\begin{bmatrix}y_t\\z_t\end{bmatrix}\)

Prove that any number of zombies will eventually destroy humanity

Before we get to the solution, it's probably important to say a bit more about what this “predator-prey model” means. This matrix equation is equivalent to a system of two linear equations:

\(y_{t+1}=1.02y_t-0.1z_t\)

\(z_{t+1}=0.08y_t+0.83z_t\)

The first equation says that the human population \((y_t)\) would grow by 2% per month in the absence of zombies, but the zombies kill off one-tenth of their own number in humans each month. The second equation says the zombie population decreases in the absence of humans (they require humans as prey), but most of the humans they kill turn into more zombies. Details and exact numbers aside, this is a reasonable approximation of how one could model a predator-prey relationship. We could replace humans and zombies with the more traditional rabbits and foxes or whatever. The math doesn't care.

Solution outline: use eigenvalues! You can calculate them manually or by calculator, but they are approximately the following:

\(\lambda_1\approx.957,\;\lambda_2\approx.893\)

Because the matrix has two distinct eigenvalues, and hence two linearly independent eigenvectors \(v_1, v_2\), regardless of the initial populations we have a general solution of the form

\(\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1\left(.957\right)^tv_1+c_2\left(.893\right)^tv_2\)

Since both eigenvalues are less than 1, this population vector converges to zero as t increases. That is, humanity slowly perishes (and the zombies with them).

We could even be a bit more precise about how they perish — in the long run, the second term is asymptotically smaller in magnitude than the first. We can say that

\(\lim_{t\rightarrow\infty}\frac1{\lambda_1^t}\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1v_1\)

so the population decays exponentially along the \(λ_1\) eigenvector \(v_1\), which in this case represents a ratio of approximately 846 humans to 533 zombies. So the leading eigenvalue tells you the rate of growth/decay, and the leading eigenvector tells you the limiting population ratios as they grow or decay.

I could go on or do a more complex example, but hopefully this conveys what eigenvalues and eigenvectors tell you at least for the predator/prey linear dynamical system model.

In general, eigenvectors and eigenvalues are crucial for determining the long-term behavior of all kinds of models. A partial list of applications includes the following:

Population Dynamics (biology/epidemiology)

Stabilizing a system e.g. antilock brakes (control theory/engineering)

Finding resonant frequencies (engineering, physics)

Ranking internet pages by importance (computer science)

Principal component analysis (statistics)

Finding stable axes of rotation (physics)

Example (E)

Truck Rental Company

A truck rental company has locations all over Vancouver, where you can rent moving trucks. You can return them to any other location. For simplicity, pretend that there are three locations, and that every customer returns their truck the next day. Let \(v_t\) be the vector whose entries \(x_t, y_t, z_t\) are the number of trucks in locations \(1, 2\), and \(3\), respectively. Let \(A\) be the matrix whose \(i, j\) -entry is the probability that a customer renting a truck from location \(j\) returns it to location \(i\). For example, the matrix

\(A=\begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\)

\(AQ=Q\Lambda\)

\( \begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}=\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&-0.2&0\\0&0&0.1\end{pmatrix} \)


Therefore, in the long run the numbers of trucks at locations \(1, 2\), and \(3\) settle into the ratio \(1.4:1.2:1\), that is, \(7:6:5\).
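Checking the eigenvector relation and the long-run ratio numerically (a minimal NumPy sketch; the starting fleet of 30 trucks per location is an arbitrary choice for illustration):

```python
import numpy as np

A = np.array([[0.3, 0.4, 0.5],
              [0.3, 0.4, 0.3],
              [0.4, 0.2, 0.2]])
Q = np.array([[1.4, -1.0, 0.5],
              [1.2, 0.0, -1.5],
              [1.0, 1.0, 1.0]])
Lam = np.diag([1.0, -0.2, 0.1])

print(np.allclose(A @ Q, Q @ Lam))   # True: the columns of Q are eigenvectors

# Start with any fleet and redistribute day after day
trucks = np.array([30.0, 30.0, 30.0])
for _ in range(100):
    trucks = A @ trucks
print(trucks / trucks.sum() * 18)    # ~[7. 6. 5.] out of every 18 trucks
```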


Eigenvalue Decomposition


Orthogonal projectors


P is an orthogonal projector if and only if \(P^2=P\) and \(P\) is symmetric (\(P^T=P\)).

In linear algebra, an \(Orthogonal\) \(Matrix\), or \(Orthonormal\) \(Matrix\), is a real square matrix whose columns and rows are orthonormal vectors. One way to express this is

\(Q^TQ=QQ^T=I,\)

where \(Q^T\) is the transpose of \(Q\), and \(I\) is the identity matrix.

This leads to the equivalent characterization: a matrix \(Q\) is orthogonal if its transpose is equal to its inverse:

\(Q^T=Q^{-1},\)

where \(Q^{-1}\) is the inverse of \(Q\).

An orthogonal matrix Q is necessarily invertible (with inverse \(Q^{-1}=Q^T\)), unitary \((Q^{-1}=Q^*)\), where \(Q^∗\) is the Hermitian adjoint (conjugate transpose) of \(Q\), and therefore normal \((Q^∗Q = QQ^∗)\) over the real numbers.
The determinant of any orthogonal matrix is either +1 or -1. As a linear transformation, an orthogonal matrix preserves the inner product of vectors, and therefore acts as an isometry of Euclidean space, such as a rotation, reflection or rotoreflection. In other words, it is a unitary transformation.



Orthogonal Matrix


Determinant

A determinant can be a “negative area” in the case of a 2x2 matrix, or a “negative volume” in the case of a 3x3 matrix; the sign records the relative orientation of the vectors. You may have heard of the “right hand rule”.






Why determinant of a 2 by 2 matrix is the area of a parallelogram?

Parallelogram = Rectangle - Extra Stuff.

\((c+a)(b+d)-2ad-cd-ab\)

Also interesting to note that if you swap the vectors' places then you get a negative area (the opposite of what \(bc−ad\) would produce), which is basically:

-Parallelogram = Rectangle - (2*Rectangle - Extra Stuff)

Or more concretely:

\((c+a)(b+d) - [2*(c+a)(b+d) - (2ad+cd+ab)]\)

Also it's \(ad−bc\), when simplified.


Alternatively, shear the parallelogram spanned by \((a,b)\) and \((c,d)\): subtracting \(\frac{b}{d}\) times \((c,d)\) from \((a,b)\) flattens that side onto the \(x\)-axis without changing the area, so with a little algebra we find that the base length is \(a-b\times\frac{c}{d}\)

Hence the area is just \(d(a-\frac{bc}{d}) = ad - bc\)




Determinant of a \(3×3\) matrix

Proof that the determinant of a 3×3 matrix is the volume of the parallelepiped spanned by the columns.



General Formula for the Inverse of a \(2×2\) matrix

The inverse of \(A\) is \(A^{-1}\) only when \(AA^{-1} = A^{-1}A = I\). Sometimes there is no inverse at all (exactly when \(det(A)=0\)).

\(A=\begin{bmatrix}a&b\\c&d\end{bmatrix}\)

\(det(A)=ad-bc\)

\(A^{-1}=\begin{bmatrix}a&b\\c&d\end{bmatrix}^{-1}=\frac1{det(A)}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}=\frac1{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}\)
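
As a sketch (the helper function and test matrix are my own), the formula can be coded directly and checked against numpy.linalg.inv:

```python
import numpy as np

def inverse_2x2(A):
    """Invert a 2x2 matrix via the adjugate formula; raise if the determinant is zero."""
    a, b = A[0, 0], A[0, 1]
    c, d = A[1, 0], A[1, 1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return (1.0 / det) * np.array([[d, -b],
                                   [-c, a]])

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
print(inverse_2x2(A))
print(np.allclose(inverse_2x2(A), np.linalg.inv(A)))   # True
print(np.allclose(A @ inverse_2x2(A), np.eye(2)))      # A A^{-1} = I
```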


General Formula for the Inverse of a \(3×3\) matrix

Let \(M=\begin{bmatrix}a&b&c\\d&e&f\\g&h&i\end{bmatrix}.\)

Now produce the matrix of minors

\(\begin{bmatrix}ei-fh&di-fg&dh-eg\\bi-ch&ai-cg&ah-bg\\bf-ce&af-cd&ae-bd\end{bmatrix},\)

Use the alternating law of signs to produce the matrix of cofactors \(M'\):

\(M'=\begin{bmatrix}ei-fh&fg-di&dh-eg\\ch-bi&ai-cg&bg-ah\\bf-ce&cd-af&ae-bd\end{bmatrix}\)

Then transpose \(M'\) to obtain the adjugate:

\(M'^{\,T}=\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)

\(\det M=a(ei-fh)-b(di-fg)+c(dh-eg)\)

The grand formula:

\(M^{-1}=\frac1{a(ei-fh)-b(di-fg)+c(dh-eg)}\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)
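
The same recipe, minors → cofactors → transpose → divide by the determinant, translates directly into code; this is my own sketch, checked against numpy.linalg.inv:

```python
import numpy as np

def inverse_3x3(M):
    """Invert a 3x3 matrix via the cofactor (adjugate) formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    # Transposed matrix of cofactors (the adjugate), as in the grand formula above.
    adj = np.array([[e * i - f * h, c * h - b * i, b * f - c * e],
                    [f * g - d * i, a * i - c * g, c * d - a * f],
                    [d * h - e * g, b * g - a * h, a * e - b * d]])
    return adj / det

M = np.array([[3.0, 0.0,  2.0],
              [2.0, 0.0, -2.0],
              [0.0, 1.0,  1.0]])
print(np.allclose(inverse_3x3(M), np.linalg.inv(M)))   # True
```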


Laplace expansion

Laplace expansion expresses the determinant of a matrix \(A\) recursively in terms of determinants of smaller matrices, known as its minors. The minor \(M_{i,j}\) is defined to be the determinant of the \((n - 1) × (n - 1)\)-matrix that results from \(A\) by removing the \(i\)-th row and the \(j\)-th column. The expression \({(-1)}^{i+j}M_{i,j}\) is known as a cofactor. For every \(i\), one has the equality

\(det(A)=\sum_{j=1}^n{(-1)}^{i+j}a_{i,j}M_{i,j}\)
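
Because the formula is recursive, it is natural to implement it recursively. The sketch below (my own, expanding along the first row) agrees with numpy.linalg.det:

```python
import numpy as np

def det_laplace(A):
    """Determinant via Laplace expansion along the first row."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor M_{1,j}: delete the first row and the j-th column.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_laplace(minor)
    return total

A = np.array([[0.0, 6.0, 8.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
print(det_laplace(A), np.linalg.det(A))   # both are 2 (up to rounding)
```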




Example (F)

A Population Growth Model

A population of rabbits has the following characteristics.

(i) Half of the rabbits survive their first year. Of those, half survive their second year. The maximum life span is 3 years.

(ii) During the first year, the rabbits produce no offspring. The average number of offspring is 6 during the second year and 8 during the third year.

The population now consists of 24 rabbits in the first age class, 24 in the second, and 20 in the third. How many rabbits will there be in each age class in 1 year?

\(A=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\)

\(x_1=\begin{bmatrix}24\\24\\20\end{bmatrix}\)

\(x_2=Ax_1=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}24\\24\\20\end{bmatrix}=\begin{bmatrix}304\\12\\12\end{bmatrix}\)

Finding a Stable Age Distribution Vector

\(Ax=\lambda x\Rightarrow\vert A-\lambda I\vert=0\)

\(det(A-\lambda I)=-\lambda^3+3\lambda+2=-{(\lambda+1)}^2(\lambda-2)\)

\(\lambda=2\Rightarrow x=\begin{bmatrix}16\\4\\1\end{bmatrix}\)

\(\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}16\\4\\1\end{bmatrix}=2\cdot\begin{bmatrix}16\\4\\1\end{bmatrix}=\begin{bmatrix}32\\8\\2\end{bmatrix}\)

Notice that the ratio of the three age classes is still 16 : 4 : 1, and so the percent of the population in each age class remains the same.
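
A short simulation (my own sketch) makes the convergence visible: iterating \(x_{t+1}=Ax_t\) drives the age distribution toward the \(16:4:1\) eigenvector while the total population roughly doubles each year (dominant eigenvalue \(\lambda=2\)):

```python
import numpy as np

A = np.array([[0.0, 6.0, 8.0],     # offspring from age classes 2 and 3
              [0.5, 0.0, 0.0],     # half survive their first year
              [0.0, 0.5, 0.0]])    # half of those survive their second year

x = np.array([24.0, 24.0, 20.0])   # current population by age class
for year in range(20):             # advance the model 20 years
    x = A @ x

# The age distribution settles to the eigenvector ratio 16 : 4 : 1.
print(np.round(x / x[-1], 3))      # approximately [16, 4, 1]
```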


Convolution as Toeplitz matrix

In linear algebra, a Toeplitz matrix or diagonal-constant matrix, named after Otto Toeplitz, is a matrix in which each descending diagonal from left to right is constant.

Given \(2n-1\) numbers \(a_k\), where \(k=-n+1,\;...,\;-1,\;0,\;1,\;...,\;n-1\), a Toeplitz matrix is a matrix which has constant values along its negative-sloping diagonals, i.e., a matrix of the form

\(\begin{bmatrix}a_0&a_{-1}&a_{-2}&\cdots&a_{-n+1}\\a_1&a_0&a_{-1}&\ddots&\vdots\\a_2&a_1&a_0&\ddots&a_{-2}\\\vdots&\ddots&\ddots&\ddots&a_{-1}\\a_{n-1}&\cdots&a_2&a_1&a_0\end{bmatrix}\)

Matrix equations of the form

\(\sum_{j=1}^na_{i-j}x_j=y_i\)

where \(A_{i,j}=A_{i+1,j+1}=a_{i-j}\),

can be solved with \(\mathcal O(n^2)\) operations. Typical problems modelled by Toeplitz matrices include the numerical solution of certain differential and integral equations (regularization of inverse problems), the computation of splines, time series analysis, signal and image processing, Markov chains, and queuing theory (Bini 1995).
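
To make the section title concrete, here is an illustrative sketch (my own, using SciPy's toeplitz helper): full discrete convolution with a kernel \(h\) is the same as multiplying by a Toeplitz matrix built from \(h\):

```python
import numpy as np
from scipy.linalg import toeplitz

h = np.array([1.0, 2.0, 3.0])          # convolution kernel
x = np.array([4.0, 5.0, 6.0])          # input signal
n = len(x)

# Toeplitz matrix whose first column is [h, 0, ...] and first row is [h[0], 0, ...]:
# multiplying by it performs full convolution with h.
T = toeplitz(np.r_[h, np.zeros(n - 1)], np.r_[h[0], np.zeros(n - 1)])

print(T @ x)                # [ 4. 13. 28. 27. 18.]
print(np.convolve(h, x))    # same result
```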



Matrix Multiplication

Some standard \(2\times2\) transformation matrices:

Horizontal shear with \(m = 1.25\): \(\begin{bmatrix}1&1.25\\0&1\end{bmatrix}\)

Reflection through the vertical axis: \(\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

Squeeze mapping with \(r = 3/2\): \(\begin{bmatrix}\frac32&0\\0&\frac23\end{bmatrix}\)

Scaling by a factor of \(3/2\): \(\begin{bmatrix}\frac32&0\\0&\frac32\end{bmatrix}\)

Rotation by \(\pi/6 = 30^\circ\): \(\begin{bmatrix}\cos(\frac\pi6)&-\sin(\frac\pi6)\\\sin(\frac\pi6)&\cos(\frac\pi6)\end{bmatrix}\)

If one row of the left matrix is zero, the corresponding row of the product is zero:

\(\begin{bmatrix}X&Y&Z\\0&0&0\\A&B&C\end{bmatrix}\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}=\begin{bmatrix}\alpha&\beta&\gamma\\0&0&0\\\chi&\psi&\omega\end{bmatrix}\)


If one column of the right matrix is zero, the corresponding column of the product is zero:

\(\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}\begin{bmatrix}X&0&A\\Y&0&B\\Z&0&C\end{bmatrix}=\begin{bmatrix}\alpha&0&\chi\\\beta&0&\psi\\\gamma&0&\omega\end{bmatrix}\)


Some named matrix shapes, with examples for \(n = 3\):

Diagonal matrix: \(\begin{bmatrix}a_{11}&0&0\\0&a_{22}&0\\0&0&a_{33}\end{bmatrix}\)

Lower triangular matrix: \(\begin{bmatrix}a_{11}&0&0\\a_{21}&a_{22}&0\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

Upper triangular matrix: \(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&0&a_{33}\end{bmatrix}\)

Translation by \((t_x, t_y)\), written with homogeneous coordinates:

\(\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}1&0&t_x\\0&1&t_y\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}\)
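
A tiny sketch of my own showing the same idea in code, translating the point \((2, 3)\) by \((5, -1)\):

```python
import numpy as np

def translate(tx, ty):
    """Homogeneous-coordinates matrix that shifts 2D points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

p = np.array([2.0, 3.0, 1.0])          # point (2, 3) in homogeneous form
print(translate(5.0, -1.0) @ p)        # [7. 2. 1.]  ->  point (7, 2)
```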


Matrix Stretch

\(S_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\y\end{bmatrix}\)

\(S_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\(1+\delta)y\end{bmatrix}\)

\(S_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\(1+\delta)y\end{bmatrix}\)


Matrix Shrink

\(A\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac1{1+\delta}&0\\0&\frac1{1+\delta}\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac x{1+\delta}\\\frac y{1+\delta}\end{bmatrix}\)


Matrix Reflection

\(R_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\y\end{bmatrix}\)

\(R_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-y\end{bmatrix}\)

\(R_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\-y\end{bmatrix}\)


Matrix Shear

\(T_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\y\end{bmatrix}\)

\(T_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-\delta x+y\end{bmatrix}\)

\(T_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\-\delta x+y\end{bmatrix}\)
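
These δ-parameterized transforms are easy to experiment with; a small sketch of my own (with \(\delta=0.5\) and an arbitrary point) applies a stretch, a reflection, and a shear:

```python
import numpy as np

delta = 0.5
p = np.array([2.0, 1.0])

S1 = np.array([[1 + delta, 0.0], [0.0, 1.0]])      # stretch along x
R1 = np.array([[-1.0, 0.0], [0.0, 1.0]])           # reflect across the y-axis
T1 = np.array([[1.0, delta], [0.0, 1.0]])          # horizontal shear

print(S1 @ p)        # [3.  1. ]  ->  x scaled by 1 + delta
print(R1 @ p)        # [-2.  1.]  ->  x negated
print(T1 @ p)        # [2.5 1. ]  ->  x shifted by delta * y
```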


Matrix Rotation
Counter-clockwise Rotation Matrix in 2D Derivation

Expressing \((x, y)\) in polar form we have

\(x=r\cos(\nu)\quad(1)\)

\(y=r\sin(\nu)\quad(2)\)

Similarly, expressing \((x', y')\) in polar form,

\(x'=r\cos(\nu+\theta)\)

\(y'=r\sin(\nu+\theta)\)

Expanding the brackets using the angle-addition identities we get

\(x'=r(\cos\nu\cos\theta-\sin\nu\sin\theta)=r\cos\nu\cos\theta-r\sin\nu\sin\theta\)

From (1) and (2) we have

\(x'=x\cos\theta-y\sin\theta\quad(3)\)

Likewise,

\(y'=r(\sin\nu\cos\theta+\cos\nu\sin\theta)=r\sin\nu\cos\theta+r\cos\nu\sin\theta\)

\(y'=y\cos\theta+x\sin\theta\quad(4)\)

Writing (3) and (4) with a \(2\times2\) matrix we get

\(\begin{bmatrix}x'\\y'\end{bmatrix}=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}.\)

Thus, \(\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\) is the counter-clockwise rotation matrix.


Rotation Matrix in 3D

In 3D space, a rotation can occur about the x-, y-, or z-axis. A rotation about any one of these axes is known as a basic or elementary rotation. Given below are the rotation matrices that rotate a vector through a given angle about each axis.

P (x, \(\gamma\)) = \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\). This is also known as a roll. It is defined as the counterclockwise rotation by \(\gamma\) about the x-axis.

P (y, \(\beta\)) = \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\). Such a matrix is known as a pitch. It represents the counterclockwise rotation by \(\beta\) about the y-axis.

P (z, \(\alpha\)) = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\). This rotation matrix is called a yaw, and it is the counterclockwise rotation by \(\alpha\) about the z-axis.

By convention, a positive angle θ denotes a counter-clockwise rotation; changing the signs accordingly represents a clockwise rotation. The right-hand rule states that if you curl the fingers of your right hand in the direction of increasing θ, your thumb points along the axis of rotation, perpendicular to the plane of rotation.

Now if we want to find the new coordinates (x', y', z') of a vector (x, y, z) after rotation about a particular axis, we use the formula given below:

\(\begin{bmatrix} x'\\ y'\\ z' \end{bmatrix}\) = P(x, y or z) \(\begin{bmatrix} x\\ y\\ z \end{bmatrix}\)

Suppose an object is rotated about all three axes, then such a rotation matrix will be a product of the three aforementioned rotation matrices [P (z, \(\alpha\)), P (y, \(\beta\)) and P (x, \(\gamma\))]. The general rotation matrix is represented as follows:

P = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\) \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\) \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\)

To find the coordinates of the rotated vector about all three axes we multiply the rotation matrix P with the original coordinates of the vector.
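
A sketch of my own composing the three elementary rotations into \(P=P(z,\alpha)\,P(y,\beta)\,P(x,\gamma)\) and applying it to a vector; it also confirms that the result is orthogonal:

```python
import numpy as np

def rot_x(g):   # roll: counterclockwise rotation by g about the x-axis
    return np.array([[1, 0, 0],
                     [0, np.cos(g), -np.sin(g)],
                     [0, np.sin(g),  np.cos(g)]])

def rot_y(b):   # pitch: counterclockwise rotation by b about the y-axis
    return np.array([[ np.cos(b), 0, np.sin(b)],
                     [0, 1, 0],
                     [-np.sin(b), 0, np.cos(b)]])

def rot_z(a):   # yaw: counterclockwise rotation by a about the z-axis
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a),  np.cos(a), 0],
                     [0, 0, 1]])

alpha, beta, gamma = np.pi / 2, 0.0, 0.0
P = rot_z(alpha) @ rot_y(beta) @ rot_x(gamma)

print(np.round(P @ np.array([1.0, 0.0, 0.0]), 6))   # [0. 1. 0.]: x-axis -> y-axis
print(np.allclose(P.T @ P, np.eye(3)))              # True: rotations are orthogonal
```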


Matrix Inverse

\(A^{-1}A=AA^{-1}=I\)

\(S=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\) has inverse \(S^{-1}=\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}\)

\(SS^{-1}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}\frac{1+\delta}{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)

\(T_1=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\) has inverse \(T_1^{-1}=\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}\)

\(T_1T_1^{-1}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\;\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&-\delta+\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


Matrix Projection

A square matrix \(P\) is called a projection matrix if it is equal to its square: \(P^2=P\).

Orthogonal projection

\(P=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\)

\(P\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\Rightarrow P^2\begin{bmatrix}x\\y\\z\end{bmatrix}=P\begin{bmatrix}x\\y\\0\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\)

\(P^2=P\)

Oblique projection

\(P=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\)

\(P^2=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=P\)
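
Both examples can be checked numerically; this sketch of my own confirms \(P^2=P\) for each, and that only the first one is symmetric (hence an orthogonal projector):

```python
import numpy as np

P_orth = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0]])     # orthogonal projection onto the xy-plane

alpha = 2.0
P_obl = np.array([[0.0, 0.0],
                  [alpha, 1.0]])         # oblique projection

print(np.allclose(P_orth @ P_orth, P_orth))   # True: idempotent
print(np.allclose(P_obl @ P_obl, P_obl))      # True: idempotent
print(np.allclose(P_orth, P_orth.T))          # True: symmetric, so orthogonal projector
print(np.allclose(P_obl, P_obl.T))            # False: not symmetric, so oblique
```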



Orthogonal Matrices
Circulant Matrices
Moore-Penrose Pseudoinverse



Hankel matrix



The Spectral Theorem