線性代數(Linear Algebra)

Mathematics   Yoshio   Mar 24th, 2024 at 8:00 PM



In math terms, an operation F is linear if scaling inputs scales the output, and adding inputs adds the outputs:

\(F(\alpha\overset\rightharpoonup x)=\alpha\cdot F(\overset\rightharpoonup x)\)

\(F(\overset\rightharpoonup x+\overset\rightharpoonup y)=F(\overset\rightharpoonup x)+F(\overset\rightharpoonup y)\)


[Ex] Balance \(x\,\mathrm{C_2H_5OH}+y\,\mathrm{O_2}\Rightarrow z\,\mathrm{CO_2}+w\,\mathrm{H_2O}\)

Counting C, O, H atoms on each side gives

\(x\begin{bmatrix}2\\1\\6\end{bmatrix}+y\begin{bmatrix}0\\2\\0\end{bmatrix}=z\begin{bmatrix}1\\2\\0\end{bmatrix}+w\begin{bmatrix}0\\1\\2\end{bmatrix}\)

\(\begin{bmatrix}2&0&-1&0\\1&2&-2&-1\\6&0&0&-2\end{bmatrix}\begin{bmatrix}x\\y\\z\\w\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)
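As a quick check of this setup, the null space of that coefficient matrix can be computed and scaled to integers, which recovers the balanced reaction. A minimal SymPy sketch (added for illustration, not part of the original notes):

```python
from sympy import Matrix

# Coefficient matrix of x C2H5OH + y O2 => z CO2 + w H2O
# (rows count C, O, H atoms, left side minus right side)
M = Matrix([[2, 0, -1,  0],
            [1, 2, -2, -1],
            [6, 0,  0, -2]])

ns = M.nullspace()[0]     # one free direction: [1/3, 1, 2/3, 1]
print(ns.T)
print((3 * ns).T)         # scaled to integers: [1, 3, 2, 3]
# i.e. C2H5OH + 3 O2 => 2 CO2 + 3 H2O
```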





Linear algebra provides an invaluable toolkit for computational tasks such as solving a system of linear equations or finding the inverse of a square matrix. Because these tasks usually involve a large number of calculations, the geometric interpretation is often underappreciated. Linear algebra is a genuine oddity in that this entire discipline can be understood from multiple, distinct perspectives, each with its own merits and its own way into this vast and elegant subject. A criminally overlooked perspective for new students of linear algebra is the one in which we think of matrices as a way of transforming vectors, hence offering us a well-developed toolkit for describing (linear) spatial geometry.

Vectors are special types of matrices which can be split into two categories: row vectors and column vectors. A row vector is a matrix of order \(1\times n\), which has \(1\) row and \(n\) columns, whereas a column vector is a matrix of order \(m\times1\), which has \(m\) rows and \(1\) column. Although convention differs from source to source, it is arguably most sensible to think only in terms of column vectors. There are two reasons for this: column vectors are used more often and can be related to row vectors by transposition if needed; and column vectors can be combined with matrices under matrix multiplication with the matrix on the left (which is also simply a matter of convention, but for some reason it just feels better).

\(a=\begin{pmatrix}3\\2\end{pmatrix},\;b=\begin{pmatrix}-5\\2\end{pmatrix}\)

The matrix multiplication \(AB\) is well defined so long as \(A\) has order \(m\times n\) and \(B\) has order \(n\times p\). The resulting matrix is of order \(m\times p\).

Suppose now that we were to consider the matrix products \(a'=Ma\) and \(b'=Mb\).

\(M=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}7\\3\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}1&2\\-1&3\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-1\\11\end{pmatrix}\)

\(M=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\)

\(a'=Ma=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}3\\2\end{pmatrix}=\begin{pmatrix}4\\5\end{pmatrix}\)

\(b'=Mb=\begin{pmatrix}2&-1\\1&1\end{pmatrix}\begin{pmatrix}-5\\2\end{pmatrix}=\begin{pmatrix}-12\\-3\end{pmatrix}\)
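A short NumPy sketch (added for illustration) reproduces these products and shows the "matrix as transformation" reading: the same vectors \(a\) and \(b\) are sent to different images by the two different matrices \(M\).

```python
import numpy as np

a = np.array([3, 2])
b = np.array([-5, 2])

for M in (np.array([[1, 2], [-1, 3]]),
          np.array([[2, -1], [1, 1]])):
    # M @ v applies the linear transformation M to the column vector v
    print(M @ a, M @ b)
# [7 3] [-1 11]
# [4 5] [-12 -3]
```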


Eigenvalues and Eigenvectors

The prefix eigen- is adopted from the German word eigen (cognate with the English word own) for 'proper', 'characteristic', 'own'.

\(A \vec{v} =\lambda \vec{v} \)

Notice that we multiply a square matrix by a vector and get the same result as when we multiply a scalar (just a number) by that vector.

We start by finding the eigenvalue. We know this equation must be true:

\(A \vec{v} =\lambda \vec{v} \)

Next we put in an identity matrix so we are dealing with matrix-vs-matrix:

\(A \vec{v} =\lambda {I} \vec{v} \)

Bring all to left hand side:

\(A \vec{v} -\lambda I \vec{v} = 0\)

If v is non-zero then we can (hopefully) solve for λ using just the determinant:

\(det(A - \lambda I) = 0\)

\(| A − \lambda I | = 0\)

Trace and determinant of A

The matrix A has n eigenvalues (including each according to its multiplicity). The sum of the n eigenvalues of A is the same as the trace of A (that is, the sum of the diagonal elements of A). The product of the n eigenvalues of A is the same as the determinant of A.
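These trace and determinant identities are easy to verify numerically. A minimal NumPy sketch (the matrix here is an arbitrary example, not from the notes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)

# Sum of eigenvalues = trace, product of eigenvalues = determinant
print(np.isclose(eigvals.sum(),  np.trace(A)))        # True
print(np.isclose(eigvals.prod(), np.linalg.det(A)))   # True

# Each eigenpair satisfies A v = lambda v
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))                # True
```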



Perpendicular Vector

While both involve multiplying the magnitudes of two vectors, the dot product results in a scalar quantity, which indicates magnitude but not direction, while the cross product results in a vector, which indicates magnitude and direction.


Inner product (Dot product)

Since the inner product generalizes the dot product, it is reasonable to say that two vectors are “orthogonal” (or “perpendicular”) if their inner product is zero. With this definition, \(\sin x\) and \(\cos x\) are orthogonal on the interval \([-\pi,\pi]\), since \(\int_{-\pi}^{\pi}\sin x\cos x\,dx=0\).

\(\overset\rightharpoonup A\cdot\overset\rightharpoonup B=\sum_{i=1}^n\;a_ib_i=a_1b_1+a_2b_2+\cdots+a_nb_n\)
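A small sketch (not part of the original text) evaluates both flavours of inner product: the ordinary dot product of two coordinate vectors, and a discretized version of \(\int_{-\pi}^{\pi}\sin x\cos x\,dx\), which comes out essentially zero, matching the orthogonality claim. The sample vectors are arbitrary.

```python
import numpy as np

# Ordinary dot product of coordinate vectors
a = np.array([1, 2, 3])
b = np.array([4, -5, 2])
print(np.dot(a, b))        # 4 - 10 + 6 = 0, so a and b are orthogonal

# Function inner product <f, g> = integral of f*g over [-pi, pi],
# approximated here by a simple Riemann sum
x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]
print(np.sum(np.sin(x) * np.cos(x)) * dx)   # ~0: sin and cos are orthogonal
```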


Outer product (Cross product)

A cross product of any two vectors yields a vector which is perpendicular to both vectors.

For two vectors \(\overrightarrow A\) and \(\overrightarrow B\), the vector \(\overrightarrow C\) perpendicular to both is given by

\(\overrightarrow C=\overrightarrow A\times\overrightarrow B=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\A_1&A_2&A_3\\B_1&B_2&B_3\end{vmatrix}\\=(A_2B_3-B_2A_3)\widehat i-(A_1B_3-B_1A_3)\widehat j+(A_1B_2-B_1A_2)\widehat k\)

Inserting given vectors we obtain

\(\overrightarrow C=\begin{vmatrix}\widehat i&\widehat j&\widehat k\\2&3&4\\1&2&3\end{vmatrix}\)

\(={(3\times3-2\times4)}\widehat i-{(2\times3-1\times4)}\widehat j+{(2\times2-1\times3)}\widehat k\)

\(=\widehat i-2\widehat j+\widehat k\)
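The same computation in NumPy (a quick check, not part of the original notes); the dot products confirm that the result is perpendicular to both inputs.

```python
import numpy as np

A = np.array([2, 3, 4])
B = np.array([1, 2, 3])

C = np.cross(A, B)
print(C)                            # [ 1 -2  1]
print(np.dot(C, A), np.dot(C, B))   # 0 0  -> C is perpendicular to A and B
```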


Cross product trick

\( \vec{A} \times \vec{B}=\begin{vmatrix} \hat{i}&\hat{j}&\hat{k} \\ A_x & A_y & A_z \\ B_x&B_y&B_z \end{vmatrix} \)

\( \vec{A}\times\vec{B}=(A_y B_z- A_z B_y)\hat{i}+(A_z B_x - A_x B_z) \hat{j}+(A_x B_y- A_y B_x) \hat{k} \nonumber \)



Matrix multiplication

There are two basic types of matrix multiplication: inner (dot) product and outer product. The inner product results in a matrix of reduced dimensions, the outer product results in one of expanded dimensions. A helpful mnemonic: Expand outward, contract inward.
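A tiny NumPy illustration of the mnemonic (added for clarity; the vectors are arbitrary): the inner product of two length-3 vectors contracts to a single scalar, while their outer product expands to a 3x3 matrix.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

print(np.inner(u, v))    # 32: contract inward to a scalar
print(np.outer(u, v))    # expand outward to a 3x3 matrix
# [[ 4  5  6]
#  [ 8 10 12]
#  [12 15 18]]
```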


Eigenvalue Decomposition


\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\]

\[\begin{bmatrix}\mathtt{\,\,\,\,0}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-3}\end{bmatrix}=\begin{bmatrix}\mathtt{-1}&\mathtt{-1}\\\mathtt{\,\,\,\,2}&\mathtt{\,\,\,\,1}\end{bmatrix}\begin{bmatrix}\mathtt{-2}&\mathtt{\,\,\,\,0}\\\mathtt{\,\,\,\,0}&\mathtt{-1}\end{bmatrix}\begin{bmatrix}\mathtt{\,\,\,\,1}&\mathtt{\,\,\,\,1}\\\mathtt{-2}&\mathtt{-1}\end{bmatrix}\]
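The same decomposition can be checked numerically with NumPy (a sketch; np.linalg.eig returns unit-length eigenvectors, so its \(S\) differs from the hand-picked one above by column scaling, but \(S\Lambda S^{-1}\) still reproduces \(A\)).

```python
import numpy as np

A = np.array([[ 0.0,  1.0],
              [-2.0, -3.0]])

eigvals, S = np.linalg.eig(A)
Lam = np.diag(eigvals)

print(eigvals)                                      # [-1. -2.] (order may vary)
print(np.allclose(A @ S, S @ Lam))                  # A S = S Lambda
print(np.allclose(S @ Lam @ np.linalg.inv(S), A))   # A = S Lambda S^-1
```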



Example (A)

As a real example of this, consider a simple model of the weather. Let’s say it either is sunny or it rains. If it’s sunny, there’s a 90% chance that it remains sunny, and a 10% chance that it rains the next day. If it rains, there’s a 40% chance it rains the next day, but a 60% chance of becoming sunny the next day. We can describe the time evolution of this process through a matrix equation:

\(\begin{bmatrix}p_{s,n+1}\\p_{r,n+1}\end{bmatrix}=\begin{bmatrix}0.9&0.6\\0.1&0.4\end{bmatrix}\begin{bmatrix}p_{s,n}\\p_{r,n}\end{bmatrix}\)

where \(p_{s,n}\), \(p_{r,n}\) are the probabilities that it is sunny or that it rains on day \(n\). (If you are curious, this kind of process is called a Markov process, and the matrix that I have defined is called the transition matrix. This is, by the way, not a very good model of the weather.) One way to see that this matrix does provide the intended result is to see what results from matrix multiplication:

\(p_{s,n+1}=0.9p_{s,n}+0.6p_{r,n}\)
\(p_{r,n+1}=0.1p_{s,n}+0.4p_{r,n}\)

This is just the law of total probability! To get a sense of the long run probabilities of whether it rains or not, let’s look at the eigenvectors and eigenvalues.

\(\begin{bmatrix}0.986394\\0.164399\end{bmatrix}\) with eigenvalue 1, \(\begin{bmatrix}-0.707\\0.707\end{bmatrix}\) with eigenvalue 0.3

Now, this matrix ends up having an eigenvector of eigenvalue 1 (in fact, all valid Markov transition matrices do), while the other eigenvector is both unphysical (probabilities cannot be negative) and has an eigenvalue less than 1. The eigenvector for eigenvalue 1 is only determined up to scale; rescaling it so its entries sum to 1 gives \((6/7,\;1/7)\approx(0.857,\;0.143)\). Therefore, in the long run, we expect the probability of it being sunny to be about 0.857 and the probability of it being rainy to be about 0.143 in this model. This is known as the stationary distribution. Any component of the vector in the direction of the unphysical eigenvector will wither away in the long run (its eigenvalue is 0.3, and \(0.3^k\rightarrow0\) as \(k\rightarrow\infty\)), while the one with eigenvalue 1 will survive and dominate.
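A short NumPy check (illustrative) finds the eigenvalue-1 eigenvector, rescales it to sum to 1, and compares the result with simply running the chain for many days; both give the stationary distribution \((6/7,\;1/7)\).

```python
import numpy as np

T = np.array([[0.9, 0.6],
              [0.1, 0.4]])

eigvals, eigvecs = np.linalg.eig(T)
v = eigvecs[:, np.argmax(np.isclose(eigvals, 1.0))]   # eigenvector for eigenvalue 1
print(v / v.sum())                    # [0.857... 0.142...] = (6/7, 1/7)

# Same answer by iterating the chain from any starting distribution
p = np.array([0.5, 0.5])
for _ in range(50):
    p = T @ p
print(p)                              # converges to [0.857... 0.142...]
```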

Eigenspaces

The set of all eigenvectors of T corresponding to the same eigenvalue, together with the zero vector, is called an eigenspace, or the characteristic space of T associated with that eigenvalue. If a set of eigenvectors of T forms a basis of the domain of T, then this basis is called an eigenbasis.

Eigenvectors of Symmetric Matrices Are Orthogonal

Pivot (軸元)

The pivot or pivot element is the element of a matrix, an array, or another finite set that an algorithm (e.g. Gaussian elimination, quicksort, the simplex algorithm) selects first, in order to carry out certain calculations.



Markov chain

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain (DTMC). A continuous-time process is called a continuous-time Markov chain (CTMC). It is named after the Russian mathematician Andrey Markov.

Markov chains have many applications as statistical models of real-world processes, such as studying cruise control systems in motor vehicles, queues or lines of customers arriving at an airport, currency exchange rates and animal population dynamics.


Example (B)

The Linear Algebra View of the Fibonacci Sequence


The Fibonacci sequence first appears in the book Liber Abaci (The Book of Calculation, 1202) by Fibonacci where it is used to calculate the growth of rabbit populations.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ....

Each term in the Fibonacci sequence equals the sum of the previous two. That is, if we let \(F_n\) denote the \(n\)th term in the sequence, we can write

\(F_n=F_{n-1}+F_{n-2}\)

To make this into a linear system, we need at least one more equation. Let’s use an easy one:

\(F_{n-1}=F_{n-1}+\left(0\right)\times F_{n-2}\)

Putting these together, here’s our system of linear equations for the Fibonacci sequence:

\(F_n=F_{n-1}+F_{n-2}\)
\(F_{n-1}=F_{n-1}+0\)

We can write this in matrix notation as the following:

\(\begin{bmatrix}F_n\\F_{n-1}\end{bmatrix}=\begin{bmatrix}1&1\\1&0\end{bmatrix}\begin{bmatrix}F_{n-1}\\F_{n-2}\end{bmatrix}\)

To understand the long-run behaviour of the sequence, we find the eigenvalues of this matrix from its characteristic equation. Here’s how we do that:

\(\begin{vmatrix}1-\lambda&1\\1&-\lambda\end{vmatrix}=0\)

\((1-\lambda)(-\lambda)-1=0\)

\(\lambda^2-\lambda-1=0\)

Solving this quadratic, we get the following two solutions, which are our two eigenvalues:

\(\lambda_1=\frac{1+\sqrt5}2\)
\(\lambda_2=\frac{1-\sqrt5}2\)

The next step is to find the two eigenvectors that correspond to the eigenvalues.

\(\left(A-\lambda I\right)x=0\)

Now that we have the eigenvalues and eigenvectors, we can write the \(\mathrm S\) and \(\Lambda\) matrices as follows:

\(\mathrm S=\begin{bmatrix}\frac{1+\sqrt5}2&\frac{1-\sqrt5}2\\1&1\end{bmatrix}\)

\(\mathrm\Lambda=\begin{bmatrix}\frac{1+\sqrt5}2&0\\0&\frac{1-\sqrt5}2\end{bmatrix}\)

\(\mathrm S^{-1}=\frac1{\sqrt5}\begin{bmatrix}1&\frac{\sqrt5-1}2\\-1&\frac{\sqrt5+1}2\end{bmatrix}=\begin{bmatrix}\frac1{\sqrt5}&\frac{\sqrt5-1}{2\sqrt5}\\-\frac1{\sqrt5}&\frac{\sqrt5+1}{2\sqrt5}\end{bmatrix}\)


\(F_{38}=39088169\); what is \(F_{39}\)?
\(39088169\times\frac{1+\sqrt5}2=39088169\times1.61803398875\approx63245986\)
\(39088169\div\frac{1-\sqrt5}2=39088169\div(-0.61803398875)\approx-63245986\),
\(\left|-63245986\right|=63245986\)

\(F_{57}=365435296162\); what is \(F_{56}\)?
\(365435296162\div\frac{1+\sqrt5}2=365435296162\div1.61803398875\approx225851433717\)
\(365435296162\times\frac{1-\sqrt5}2=365435296162\times(-0.61803398875)\approx-225851433717\),
\(\left|-225851433717\right|=225851433717\)
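These golden-ratio shortcuts, and the matrix form itself, can be checked in plain Python (an illustrative sketch; exact integer arithmetic is used for the matrix power, floating point for the ratio estimates).

```python
# The n-th power of [[1,1],[1,0]] holds F(n+1), F(n), F(n-1).
def fib_matrix(n):
    a, b, c, d = 1, 1, 1, 0          # the matrix [[1,1],[1,0]]
    x, y, z, w = 1, 0, 0, 1          # start from the identity
    for _ in range(n):               # repeated multiplication, exact integers
        x, y, z, w = x*a + y*c, x*b + y*d, z*a + w*c, z*b + w*d
    return x, y, w                   # F(n+1), F(n), F(n-1)

print(fib_matrix(38)[0])                 # F(39) = 63245986

phi = (1 + 5 ** 0.5) / 2                 # dominant eigenvalue
psi = (1 - 5 ** 0.5) / 2                 # the other eigenvalue
print(round(39088169 * phi))             # next term:     63245986
print(abs(round(365435296162 * psi)))    # previous term: 225851433717
```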



Every beehive has one female queen bee who lays all the eggs. If an egg is fertilized by a male bee, then the egg produces a female worker bee, who doesn’t lay any eggs herself. If an egg is not fertilized it eventually hatches into a male bee, called a drone.


The Fibonacci sequence also occurs in real populations; honeybees provide an example. A look at the pedigree (family tree) of a honeybee suggests that the ancestry counts follow the Fibonacci numbers.

In a colony of honeybees there is one distinctive female called the queen. The other females are worker bees who, unlike the queen bee, produce no eggs. Male bees do no work and are called drone bees. When the queen lays eggs, each egg may be either fertilised or unfertilised. An unfertilised egg results in a drone (male bee), which carries 100% of its genome from the queen, while a fertilised egg results in a female worker bee, which carries 50% of her genome from the queen.

Males are produced by the queen’s unfertilised eggs, so male bees only have a mother but no father. All the females are produced when the queen has mated with a male and so have two parents. Females usually end up as worker bees but some are fed with a special substance called royal jelly which makes them grow into queens ready to go off to start a new colony when the bees form a swarm and leave their home (a hive) in search of a place to build a new nest.

Vector and Linear combination

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix},\;column\;vector\)

Vector Addition

\(\underline v=\begin{bmatrix}v_1\\v_2\end{bmatrix}\;and\;\underline w=\begin{bmatrix}w_1\\w_2\end{bmatrix}\;\Rightarrow\underline v+\underline w=\begin{bmatrix}v_1+w_1\\v_2+w_2\end{bmatrix}\)

Scalar Multiplication

\(2\underline v=\begin{bmatrix}2v_1\\2v_2\end{bmatrix}\;\Rightarrow\;c\underline v=\begin{bmatrix}cv_1\\cv_2\end{bmatrix}\)

Linear Combination

\(c\underline v+d\underline w=\begin{bmatrix}cv_1+dw_1\\cv_2+dw_2\end{bmatrix},\;c,\;d\in\mathfrak R\)



derivative is linear

\(f(x)\rightarrow f'(x)\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot f'(x)\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot f'(x)+\beta\cdot g'(x)\)

integral is linear

\(f(x)\rightarrow\int_a^bf(x)dx\)

\(\alpha\cdot f(x)\rightarrow\alpha\cdot\int_a^bf(x)dx\)

\(\alpha\cdot f(x)+\beta\cdot g(x)\rightarrow\alpha\cdot\int_a^bf(x)dx+\beta\cdot\int_a^bg(x)dx\)

The Fourier Transform, Laplace transform, and Wavelet transforms are linear operators.


Linear Equations

\(x-2y=1\)

\(3x+2y=11\)

\(\begin{bmatrix}1&-2\\3&2\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

\(A\underline x=\underline b\)

where \(A\) is the coefficient matrix

row picture:

solution: x=3, y=1

column picture:

\(x\begin{bmatrix}1\\3\end{bmatrix}+y\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}\)

find the linear combination of the vectors on the left side that equals the vector on the right side.
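Both pictures can be confirmed with NumPy (a quick sketch, not part of the original notes): solve returns \((x,\;y)=(3,\;1)\), and those same coefficients combine the two columns into the right-hand side.

```python
import numpy as np

A = np.array([[1, -2],
              [3,  2]])
b = np.array([1, 11])

x, y = np.linalg.solve(A, b)
print(x, y)                         # 3.0 1.0

# Column picture: x*(first column) + y*(second column) = b
print(x * A[:, 0] + y * A[:, 1])    # [ 1. 11.]
```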







\(A\underline x=\underline b\)

row picture: three different planes intersecting at a point.

column picture: a combination of the three column vectors producing \(\underline b\).


Question: can we solve \(A\underline x=\underline b\) for every \(\underline b\) ?

Do the linear combinations of the columns fill the 3-dimensional space?

Yes for \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

No when the three column vectors lie in the same plane.


Gaussian Elimination

In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients. This method can also be used to compute the rank of a matrix, the determinant of a square matrix, and the inverse of an invertible matrix.



Key ideas: pivots, forward elimination, and back substitution.



In general, \(\begin{bmatrix}A&\vert&\underline b\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}\;U&\vert&\underline c\;\end{bmatrix}\) upper triangular matrix

\(A\underline x=\underline b\Rightarrow U\underline x=\underline c\)

In order to solve for the unknowns, the pivots cannot be zero.

When would the process break down?

(i) Sometimes rows must be exchanged, after which elimination can continue: the nonsingular case.

(ii) Sometimes no exchange helps; the equations may then be solvable or unsolvable: the singular case.
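The forward elimination / back substitution idea can be written out in a few lines of NumPy. This is a minimal teaching sketch with partial pivoting (a row exchange whenever a larger pivot is available), not production code.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve A x = b by forward elimination and back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):                        # forward elimination
        p = k + np.argmax(np.abs(A[k:, k]))       # choose the largest pivot (row exchange)
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]                 # multiplier l_ik
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):                # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])
b = np.array([5, -2, 9])
print(gauss_solve(A, b))    # [1. 1. 2.]
```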


Triangular matrix

Two particularly important types of triangular matrices are termed upper triangular and lower triangular—these are matrices that have, respectively, all 0 elements below and above the diagonal:

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(L)}=\prod_{k=1}^na_{kk}\)

\(U=\left[\begin{array}{llll}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

\(\operatorname{det}{(U)}=\prod_{k=1}^na_{kk}\)

The determinant and permanent of a triangular matrix equal the product of the diagonal entries.



Inverse Matrices

identity matrix \(I_n\)

\(I=\left[ \begin{array}{c}1\end{array} \right] , \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] ,\left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] ,\left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right]\)

\(\delta _{ij}=\left\{ \begin{array}{c} 1 \text{ if }i=j \\ 0\text{ if }i\neq j \end{array} \right.\nonumber\)

\(AI=IA=A\) for any \(n\times n\) matrix \(A\)

Def: An \(n\times n\) matrix \(A\) is invertible if there exists a matrix \(B\) such that \(BA=I\) and \(AB=I\).

\(B\) is called an inverse of \(A\).

Claim: Suppose \(A\) is invertible. Then its inverse is unique.

Proof:

Suppose \(A\) has two inverses \(B\) and \(C\)

Then \(BA=I\) and \(AC=I\)

We have \(B=BI=B(AC)=(BA)C=IC=C\)

\(B=IB=(CA)B=C(AB)=CI=C\)

Remark: The inverse of \(A\) is denoted as \(A^{-1}\)

Remark: The proof actually shows that if \(BA=AC=I\) then \(B=C\)

"left inverse" = "right inverse" = "inverse"

Claim: the inverse of \(A^{-1}\) is \(A\) itself

Proof: \(AA^{-1}=I\;and\;A^{-1}A=I\) since \(A^{-1}\) is the inverse of \(A\)

Therefore \(A\) is the inverse of \(A^{-1}\).

Claim: If \(A\) is invertible, then the one and only solution to \(A\underline x=\underline b\) is

\(\underline x=A^{-1}\underline b\)

Proof: \(A^{-1}A\underline x=A^{-1}\underline b\)

\(\Leftrightarrow I\underline x=A^{-1}\underline b\)

\(\Leftrightarrow\underline x=A^{-1}\underline b\)

Claim: Suppose there is a nonzero solution \(\underline x\;to\;A\underline x=\underline0\;then\;A\;cannot\;have\;an\;inverse\)

Proof: If \(A\) is invertible, \(\Rightarrow(A^{-1}A)\underline x=A^{-1}\underline0\)

\(\therefore\underline x=\underline0\)

Definition: A square \(n \times n\) matrix \(D\) is a Diagonal Matrix if all entries off the main diagonal are zero, that is for all \(i \neq j\), \(a_{ij} = 0\).

\[\begin{split}\boldsymbol D = \begin{bmatrix} d_1 & 0 & 0 & ... & 0 \\ 0 & d_2 & 0 & ... & 0 \\ 0 & 0 & d_3 & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & d_n \end{bmatrix}\end{split}\]

\[\begin{split}\boldsymbol D^{-1} = \begin{bmatrix} \frac{1}{d_1} & 0 & 0 & ... & 0 \\ 0 & \frac{1}{d_2} & 0 & ... & 0 \\ 0 & 0 & \frac{1}{d_3} & ... & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & ... & \frac{1}{d_n} \end{bmatrix}\end{split}\]

Claim: A diagonal matrix has an inverse provided no diagonal entries are zero.

Proof: The candidate inverse is the diagonal matrix whose diagonal entries are the reciprocals \(\frac1{d_i}\); multiplying the two matrices in either order gives \(I\).

Claim: If A and B are invertible matrices with same size, then so is (AB). \({(AB)}^{-1}=B^{-1}A^{-1}\)

Proof: \((B^{-1}A^{-1})(AB)=B^{-1}(A^{-1}A)B=B^{-1}IB=B^{-1}B=I\)

\((AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AIA^{-1}=AA^{-1}=I\)

Claim: \({(ABC)}^{-1}=C^{-1}B^{-1}A^{-1}\)


[Ex] \(E=\begin{bmatrix}1&0&0\\-4&1&0\\0&0&1\end{bmatrix},\;E^{-1}=\begin{bmatrix}1&0&0\\4&1&0\\0&0&1\end{bmatrix}\)

[Ex] Permutation Matrix

row exchange matrix \(P_{32}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix},\;P_{32}^{-1}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}=P_{32}\)
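These inverse rules are easy to sanity-check numerically (an illustrative sketch; the last two matrices are randomly generated, shifted to be safely invertible).

```python
import numpy as np

E = np.array([[1, 0, 0], [-4, 1, 0], [0, 0, 1]])
E_inv = np.array([[1, 0, 0], [4, 1, 0], [0, 0, 1]])
print(np.allclose(E @ E_inv, np.eye(3)))          # True

P32 = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])
print(np.allclose(P32 @ P32, np.eye(3)))          # True: P32 is its own inverse

# (AB)^-1 = B^-1 A^-1
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)
B = rng.standard_normal((3, 3)) + 3 * np.eye(3)
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # True
```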


What is Block Diagonal Matrix?

A matrix which is split into blocks is called a block matrix. A block diagonal matrix is a square block matrix whose main-diagonal blocks are square matrices and whose off-diagonal blocks are zero matrices: \(D_{ij}=0\) whenever \(i\neq j\).

An \(n\times n\) block diagonal matrix has the form (each \(a_{ii}\) may itself be a square block):

\(\begin{array}{l}\begin{bmatrix} a_{11} & 0 & 0 & . & . & 0\\ 0 & a_{22} & 0 & .& . &0 \\ 0& 0 & a_{33} & .& . & 0\\ .& . & . &. & .&. \\ .& . & . & . & . &. \\ 0& 0 & 0 &0 &0 & a_{nn} \end{bmatrix}\end{array} \)

Inverse of a Diagonal Matrix

The inverse of a diagonal matrix \(D\) is the diagonal matrix whose main-diagonal entries are the reciprocals of the corresponding entries of \(D\).

\(\begin{array}{l}Let\ D = \begin{bmatrix} a_{11} & 0& 0\\ 0 & a_{22} & 0\\ 0& 0 & a_{33} \end{bmatrix}\end{array} \)

Determinant of the above matrix:

\(\begin{array}{l}\left | D \right | = a_{11}\begin{vmatrix} a_{22} & 0\\ 0 & a_{33} \end{vmatrix}- 0 \begin{vmatrix} 0& 0\\ 0 & a_{33} \end{vmatrix} + 0 \begin{vmatrix} 0 & a_{22}\\ 0 & 0 \end{vmatrix}\end{array} \)

\(= a_{11}a_{22}a_{33}\)

\(\begin{array}{l}Adj D = \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(\begin{array}{l}D^{-1} = \frac{1}{\left | D \right |} adj D\end{array} \)

\(=\begin{array}{l}\frac{1}{a_{11}a_{22}a_{33}} \begin{bmatrix} a_{22}a_{33} & 0& 0\\ 0 & a_{11}a_{33} & 0\\ 0& 0 & a_{11}a_{22} \end{bmatrix}\end{array} \)

\(=\begin{array}{l}\begin{bmatrix} \frac{1}{a_{11}} &0 & 0\\ 0 & \frac{1}{a_{22}} &0 \\ 0& 0 & \frac{1}{a_{33}} \end{bmatrix}\end{array} \)

Anti Diagonal Matrix

A matrix is anti-diagonal if all of its entries are zero except (possibly) those on the diagonal running from the lower-left corner to the upper-right corner.

It is of the form:

\(\begin{array}{l}\begin{bmatrix} 0 & 0 &a_{13} \\ 0 & a_{22}&0 \\ a_{31}& 0 & 0 \end{bmatrix}\end{array} \)

Gauss-Jordan Elimination

Given \(A\), we want to find its inverse \(A^{-1}\)

\(AA^{-1}=I\)

\(\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=A^{-1}\)

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\underline{x_3}\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\underline{e_3}\end{bmatrix}\)

\(\underline{e_1}=\begin{bmatrix}1\\0\\0\end{bmatrix},\;\underline{e_2}=\begin{bmatrix}0\\1\\0\end{bmatrix},\;\underline{e_3}=\begin{bmatrix}0\\0\\1\end{bmatrix},\)

\(A\underline{x_1}=\underline{e_1}\)

\(A\underline{x_2}=\underline{e_2}\)

\(A\underline{x_3}=\underline{e_3}\)

\(A^{-1}\cdot\begin{bmatrix}A&I\end{bmatrix}=\begin{bmatrix}A^{-1}A&A^{-1}I\end{bmatrix}=\begin{bmatrix}I&A^{-1}\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\-1&2&-1&0&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&-1&2&0&0&1\end{bmatrix}\)

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&-1&\frac12&1&0\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

The steps above are the Gauss (forward elimination) phase.

\(\begin{bmatrix}2&-1&0&1&0&0\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}2&0&0&\frac32&1&\frac12\\0&\frac32&0&\frac34&\frac32&\frac34\\0&0&\frac43&\frac13&\frac23&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0&\frac34&\frac12&\frac14\\0&1&0&\frac12&1&\frac12\\0&0&1&\frac14&\frac12&\frac34\end{bmatrix}\)

The steps above are the Jordan (backward elimination) phase.

\(\therefore\underline{x_1}=\begin{bmatrix}\frac34\\\frac12\\\frac14\end{bmatrix},\;\underline{x_2}=\begin{bmatrix}\frac12\\1\\\frac12\end{bmatrix},\;\underline{x_3}=\begin{bmatrix}\frac14\\\frac12\\\frac34\end{bmatrix}\)

\(A^{-1}=\begin{bmatrix}\frac34&\frac12&\frac14\\\frac12&1&\frac12\\\frac14&\frac12&\frac34\end{bmatrix}\)
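The same \(\lbrack\;A\;\vert\;I\;\rbrack\rightarrow\lbrack\;I\;\vert\;A^{-1}\;\rbrack\) procedure can be scripted directly. A minimal NumPy sketch (no row exchanges are needed for this particular matrix, matching the hand computation above):

```python
import numpy as np

A = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])
n = A.shape[0]
M = np.hstack([A, np.eye(n)])          # the augmented matrix [A | I]

for k in range(n):
    M[k] /= M[k, k]                    # make the pivot 1
    for i in range(n):
        if i != k:
            M[i] -= M[i, k] * M[k]     # clear the rest of column k

A_inv = M[:, n:]
print(A_inv)                                   # [[0.75 0.5 0.25] [0.5 1. 0.5] [0.25 0.5 0.75]]
print(np.allclose(A_inv, np.linalg.inv(A)))    # True
```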


In general, start from \(\lbrack\;X\;\vert\;I\;\rbrack\).

Each elementary row operation is a left multiplication by some matrix \(A_i\):

\(\lbrack\;A_1X\;\vert\;A_1\;\rbrack\)

\(\lbrack\;A_n\cdots A_1X\;\vert\;A_n\cdots A_1\;\rbrack\)

Let’s use \(Y\;=\;A_n\cdots A_1\) for convenience. So we have \(\lbrack\;YX\;\vert\;Y\;\rbrack\), and when the left half has been reduced to the identity,

\(Y\;X\;=\;I\)

That can only mean that \(Y\;=\;X^{-1}\)


Def: An \(n\times n\) matrix is nonsingular if it has a full set of \(n\) (nonzero) pivots.

Claim: A matrix is invertible if and only if (iff) it is nonsingular.

[Proof]

Suppose A is nonsingular, i.e. it has a full set of n pivots.

Then by Gauss-Jordan elimination, we can find a matrix B such that \(AB=I\).

\(since\;A\underline{x_i}=\underline{e_i}\;is\;solvable\;for\;all\;i=1,\;2,\;...,\;n\)

On the other hand, Gauss-Jordan elimination is really a sequence of multiplications by elementary matrices on the left.

\((D^{-1}..E..P..E)A=I\)

\(where\;E_{ij}:\;to\;subtract\;a\;multiple\;l_{ij}\;of\;row\;j\;from\;row\;i\)

\(P_{ij}:\;to\;exchange\;rows\;i\;and\;j\)

\(D^{-1}:\;to\;divide\;all\;rows\;by\;their\;pivots\)

\(That\;is,\;there\;is\;a\;matrix\;C\;such\;that\;C=(D^{-1}..E..P..E),\;CA=I\)

\(Therefore,\;B=C=A^{-1}\;and\;A\;is\;invertible\)

If \(A\) does not have \(n\) pivots, elimination will lead to an all-zero row.

That is, there is an invertible \(M\) (the product of the elimination steps) such that a row of \(MA\) is zero.

If \(AC=I\) were possible, then \(MAC=M\).

But \(MA\) has a zero row, so \(MAC=M\) would also have a zero row.

Hence \(M\) would have a zero row, which is a contradiction since \(M\) is invertible.

Therefore \(A\) is not invertible.


From another viewpoint:

\(\lbrack A\;I\rbrack\Rightarrow\lbrack I\;A^{-1}\rbrack\) by performing elementary row operations

\(A^{-1}\lbrack A\;I\rbrack=\lbrack I\;A^{-1}\rbrack\)


Elimination=Factorization \(A=LU\)

[Ex] \(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\-2&7&2\end{bmatrix}\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&8&3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\), an upper triangular matrix

\(E_{21}=\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}=\begin{bmatrix}1&0&0\\0&1&0\\1&0&1\end{bmatrix}\)

\(E_{32}=\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\)

\(E_{21}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\)

\(E_{31}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\)

\(E_{32}^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\l_{21}&1&0\\0&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\l_{31}&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\0&l_{32}&1\end{bmatrix}\)

Inverse \(\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&0&1\end{bmatrix},\;\begin{bmatrix}1&0&0\\0&1&0\\0&-l_{32}&1\end{bmatrix}\)

\((E_{32}E_{31}E_{21})A=U\)

\({(E_{32}E_{31}E_{21})}^{-1}(E_{32}E_{31}E_{21})A={(E_{32}E_{31}E_{21})}^{-1}U\)

\(A=(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1})U\)

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\)

\(\begin{bmatrix}1&0&0\\\boxed2&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\\boxed{-1}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&\boxed{-1}&1\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&0\\\boxed2&1&0\\\boxed{-1}&\boxed{-1}&1\end{bmatrix}\)

In general,

\(E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}=\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\0&-l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\-l_{21}&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-l_{31}&-l_{32}&1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&0\\-l_{21}&1&0\\-l_{31}&-l_{32}&1\end{bmatrix}\)=L, lower triangular matrix

\(Note:\;E_{32}E_{31}E_{21}\neq\begin{bmatrix}1&0&0\\l_{21}&1&0\\l_{31}&l_{32}&1\end{bmatrix}\)

\(A=LU\) if no row exchanges are required

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}\boxed2&1&1\\0&\boxed{-8}&-2\\0&0&\boxed1\end{bmatrix}\Rightarrow2,\;-8,\;1\;are\;pivots\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}=\begin{bmatrix}\ell_{11}&0&0\\\ell_{21}&\ell_{22}&0\\\ell_{31}&\ell_{32}&\ell_{33}\end{bmatrix}\begin{bmatrix}u_{11}&u_{12}&u_{13}\\0&u_{22}&u_{23}\\0&0&u_{33}\end{bmatrix}.\)


We can further split U into

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}=\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(\therefore A=LDU\) if no row exchanges are required.

Let A be a square matrix. When a factorization A = LDU exists where
1. L is a lower triangular matrix with all main diagonal entries 1, (which records the steps of elimination.)
2. U is an upper triangular matrix with all diagonal entries 1, and
3. D is a diagonal matrix with all main diagonal entries nonzero, (with pivots on the diagonal.)
then the factorization is unique.

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&0&0\\0&-8&0\\0&0&1\end{bmatrix}\begin{bmatrix}1&\frac12&\frac{1}{2}\\0&1&\frac{1}4\\0&0&1\end{bmatrix}\)

\(=LDU\)
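A quick numerical confirmation of this \(A=LDU\) factorization (a sketch only; note that library routines such as scipy.linalg.lu use partial pivoting, so their factors generally differ from the hand elimination here):

```python
import numpy as np

A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])

L = np.array([[1, 0, 0], [2, 1, 0], [-1, -1, 1]])
D = np.diag([2, -8, 1])
U = np.array([[1, 0.5, 0.5], [0, 1, 0.25], [0, 0, 1]])

print(np.allclose(L @ D @ U, A))    # True: A = L D U
print(np.allclose(D @ U, np.array([[2, 1, 1], [0, -8, -2], [0, 0, 1]])))  # D U is the earlier U
```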


Claim:

\(If\;A=L_1D_1U_1\;and\;A=L_2D_2U_2\)

\(Then\;L_1=L_2,\;D_1=D_2,\;U_1=U_2\) (whenever the factorization exists).


Uniqueness Proof:

Assume \(LDU = L_1D_1U_1\). Objective: Prove \(X = X_1\) for each \(X = L, D, U\).

In keeping with the hint, \(\color{green}{L_1^{-1}}LDU\color{#D555D1}{U^{-1}} = \color{green}{L_1^{-1}}L_1D_1U_1\color{#D555D1}{U^{-1}} \iff \color{green}{L_1^{-1}}LD = D_1U_1\color{#D555D1}{U^{-1}}\) \(\Longrightarrow (\text{Lower triangular})D = D_1(\text{Upper triangular}).\)

\(\color{red}{\bigstar} \) So both sides are diagonal. \( \; L,U,L_1,U_1\) have diagonal \(1\)’s so \(D = D_1\). Then \(\color{green}{L_1^{-1}}L = I\) and \(U_1\color{#D555D1}{U^{-1}} = I. \blacksquare\)

First notice that \(L_1^{-1}L\) is a lower triangular matrix and has diagonal 1's, since both \(L_1^{-1}\) and \(L\) are lower triangular and have diagonal 1's.

Nonzero entries below the diagonal of \(L_1^{-1}L\) will cause nonzero entries below the diagonal of \(L_1^{-1}LD\) (since \(D\) is also diagonal). And if that happens, \(L_1^{-1}LD\) cannot equal to \(D_1U_1U^{-1}\), thus entries below the diagonal of \(L_1^{-1}L\) are all 0's.

Now we can conclude \(L_1^{-1}L=I\). Similarly, \(U_1U^{-1}=I\).




LU decomposition

Solving linear systems:

LU decomposition, short for Lower-Upper decomposition, is a matrix factorization technique used to break down a square matrix into the product of a lower triangular matrix (L) and an upper triangular matrix (U). It is commonly employed to simplify solving systems of linear equations and calculating determinants. Solving \(A\underline x=\underline b\) can then be done by first solving a system with the lower triangular matrix L and then solving a system with the upper triangular matrix U.


One Square System = Two Triangular Systems

\(A\underline x=\underline b\)

Suppose elimination requires no row exchanges

\(A=LU\)

\(\Rightarrow LU\underline x=\underline b\)

\(U\underline x=L^{-1}\underline b=\underline c\)

We have \(U\underline x=\underline c\;where\;L\underline c=\underline b\)

1. Factor A=LU by Gaussian elimination.

2. Solve \(\underline c\;from\;L\underline c=\underline b\) (forward elimination)

and then solve \(U\underline x=\underline c\) (back substitution)


[Ex]

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(A\underline x=\underline b\)

\(\begin{bmatrix}2&1&1\\4&-6&0\\-2&7&2\end{bmatrix}=\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\)

A=LU

\(\begin{bmatrix}1&0&0\\2&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-2\\9\end{bmatrix}\)

\(L\underline c=\underline b\)

\(\underline c=\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(\begin{bmatrix}2&1&1\\0&-8&-2\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}5\\-12\\2\end{bmatrix}\)

\(U\underline x=\underline c\)

\(\underline x=\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}1\\1\\2\end{bmatrix}\)
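With the factors in hand, the two triangular solves can be carried out with scipy.linalg.solve_triangular (an illustrative sketch reproducing the numbers above).

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1., 0., 0.], [2., 1., 0.], [-1., -1., 1.]])
U = np.array([[2., 1., 1.], [0., -8., -2.], [0., 0., 1.]])
b = np.array([5., -2., 9.])

c = solve_triangular(L, b, lower=True)     # forward:  L c = b
x = solve_triangular(U, c, lower=False)    # backward: U x = c
print(c)    # [  5. -12.   2.]
print(x)    # [1. 1. 2.]
```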



Complexity of Elimination

1. solve \(A\underline x=\underline b\)

\(A=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]=LU\)

\(L=\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right],\;U=\left[\begin{array}{lccc}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{array}\right]\)

[Elimination]: the 1st stage requires about \(n(n-1)\approx n^2\) operations (counting multiplications and additions).

2nd stage \((n-1)(n-2)\approx{(n-1)}^2\)

3rd stage \(\approx{(n-2)}^2\)

\(n^2+{(n-1)}^2+...+1^2=\frac13n(n+\frac12)(n+1)\)

\(\approx\frac13n^3\)


\(L\underline c=\underline b\)

\(\left[\begin{array}{lccc}a_{11}&0&\cdots&0\\a_{21}&a_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{array}\right]\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix}\)

\((n-1)+(n-2)+...+1=\frac12n(n-1)\approx\frac12n^2\)


\(U\underline x=\underline c\)

\(\left[\begin{array}{lccc}p_1&a_{12}&\cdots&a_{1n}\\0&p_2&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&p_n\end{array}\right]\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(1+2+3+...+n=\frac12n(n+1)\approx\frac12n^2\)

\(\#=\frac12n^2+\frac12n^2=n^2\ll\frac13n^3\)

\(Total\;\#=\frac13n^3\)

The counts above are for solving with a single right-hand side (one column) only.


Compute \(A^{-1}\)

As \(A^{-1}\) is an \(n\times n\) matrix, we need to solve \(n\) column systems; do not confuse this complexity with the single-column count of the previous section.

\(A\begin{bmatrix}\underline{x_1}&\underline{x_2}&\cdots&\underline{x_n}\end{bmatrix}=\begin{bmatrix}\underline{e_1}&\underline{e_2}&\cdots&\underline{e_n}\end{bmatrix}\)

\(where\;\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\Rightarrow i_{th}\;component\;=\;1\)

solve \(A\underline x=\underline b,\;\#=\frac13n^3\)

\(L\underline{c_i}=\;\underline{e_i}=\begin{bmatrix}0\\0\\1\\\vdots\\0\end{bmatrix}\)

\(\#=\frac12{(n-i+1)}^2\)

\(Since\;1\;\leq i\;\leq\;n,\;\#=\frac{n^2}2+\frac{{(n-1)}^2}2+...+\frac{2^2}2+\frac{1^2}2\approx\frac{n^3}6\)


\(U\underline{x_i}=\underline{c_i}\)

\(\#=\frac12n^2\)

\(Since\;1\;\leq i\;\leq\;n,\;\#=n\cdot\frac{n^2}2=\frac{n^3}2\)

\(Total\;\#=\frac{n^3}3+\frac{n^3}6+\frac{n^3}2=n^3\)

\(compared\;with\;\#\;for\;A^2=A\cdot A\Rightarrow n\cdot n^2=n^3\)



LU decompositions are NOT unique

\(A=LU=\begin{bmatrix}l_{11}&0&0\\l_{21}&l_{22}&0\\l_{31}&l_{32}&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&0&0\\0&l_{22}&0\\0&0&l_{33}\end{bmatrix}\begin{bmatrix}1&u_{12}&u_{13}\\0&1&u_{23}\\0&0&1\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\\frac{l_{21}}{l_{11}}&1&0\\\frac{l_{31}}{l_{11}}&\frac{l_{32}}{l_{22}}&1\end{bmatrix}\begin{bmatrix}l_{11}&l_{11}u_{12}&l_{11}u_{13}\\0&l_{22}&l_{22}u_{23}\\0&0&l_{33}\end{bmatrix}\)

LDU decomposition

If a square, invertible matrix has an LDU (factorization with all diagonal entries of L and U equal to 1), then the factorization is unique.

Transposes and Permutations

\(A=\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix},\;A^T=\begin{bmatrix}1&4&7\\2&5&8\\3&6&9\end{bmatrix}\;\rightarrow\;transpose\;of\;A\)

\(A=\begin{bmatrix}1&2&3\\0&0&4\end{bmatrix},\;A^T=\begin{bmatrix}1&0\\2&0\\3&4\end{bmatrix}\)

\({(A^T)}_{ij}=A_{ji}\)

\(claim:\;{(A+B)}^T=A^T+B^T\)

\(claim:\;{(AB)}^T=B^TA^T,\;A:n\times m,\;B:m\times l\)

Proof: \({\lbrack{(AB)}^T\rbrack}_{ij}={(AB)}_{ji}=\overset m{\underset{k=1}\sum}A_{jk}B_{ki}=\sum_{k=1}^m{(A^T)}_{kj}{(B^T)}_{ik}\)

\(=\sum_{k=1}^m{(B^T)}_{ik}{(A^T)}_{kj}={(B^TA^T)}_{ij}\)

Remark \({(ABC)}^T=C^TB^TA^T\)

\(claim:\;{(A^{-1})}^T={(A^T)}^{-1}\)

Proof: \(AA^{-1}=I\Rightarrow{(A^{-1})}^TA^T={(AA^{-1})}^T=I^T=I\)

\(A^{-1}A=I\Rightarrow A^T{(A^{-1})}^T={(A^{-1}A)}^T=I^T=I\)

\(\Rightarrow{(A^{-1})}^T={(A^T)}^{-1}\)

Def \(An\;n\times n\;matrix\;A\;is\;symmetric\;if\;A^T=A\)

Remark \(A_{ij}=A_{ji}\;if\;A\;is\;symmetric\)

\(A=\begin{bmatrix}1&2\\2&5\end{bmatrix}=A^T,\;D=\begin{bmatrix}1&0\\0&10\end{bmatrix}=D^T\)

\(claim:\;Given\;any\;matrix\;R_{n\times m},\;R_{m\times n}^TR_{n\times m}\;is\;symmetric\)

Proof: \({(R^TR)}^T=R^T{(R^T)}^T=R^TR\)

Remark \(RR^T\;is\;also\;symmetric\)

\(A=\begin{bmatrix}1&2\\2&7\end{bmatrix}\rightarrow symmetric\)

\(\begin{bmatrix}1&0\\-2&1\end{bmatrix}\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(\begin{bmatrix}1&2\\2&7\end{bmatrix}=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&2\\0&3\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\2&1\end{bmatrix}\begin{bmatrix}1&0\\0&3\end{bmatrix}\begin{bmatrix}1&2\\0&1\end{bmatrix}\)

\(A=LDU\Rightarrow LDL^T\;as\;U=L^T\)

\(claim:\;If\;a\;symmetric\;matrix\;is\;factored\;into\;LDU\;with\;no\;row\;exchanges,\;then\;U=L^T\)

\(A=LDL^T\)

Proof: \(A=LDU\Rightarrow A^T=U^TD^TL^T=U^TDL^T\)

\(Since\;A\;is\;symmetric,\;A=LDU=U^TDL^T\)

\(Recall\;that\;this\;factorization\;is\;unique.\;We\;then\;have\;U=L^T\)


Permutation Matrix

Def: A permutation matrix has the rows of the Identity I in any order.

There are six 3x3 permutation matrices.



\(\therefore\;There\;are\;n!\;permutation\;matrices\;of\;order\;n\)

\(claim:\;If\;P\;is\;a\;permutation\;matrix,\;then\;P^{-1}=P^T\)

Proof: Row \(i\) of \(P\) is a standard basis row vector \(\begin{bmatrix}0&\cdots&1&\cdots&0\end{bmatrix}\) (a single 1, the rest 0), and column \(i\) of \(P^T\) is that same vector.

\({(PP^T)}_{ii}=1\;for\;all\;i\)

\({(PP^T)}_{ij}=0\;for\;all\;i\neq j\)

\(\therefore PP^T=I\)

similarly, \(P^TP=I\)


\(Recall\;if\;no\;row\;exchanges\;are\;required,\;then\;A=LU\)

\(If\;row\;exchanges\;are\;needed,\;we\;then\;have\)

\((\cdots P_1\cdots E_2\cdots P_3\cdots E_4)A=U\)

\(\Rightarrow A=(\cdots E_4^{-1}\cdots P_3^{-1}\cdots E_2^{-1}\cdots P_1^{-1})U\)

If row exchanges are needed during elimination, we can do them in advance.

The product \(PA\) will put the rows in the right order so that no row exchanges are needed for \(PA\),

Hence \(PA=LU\)

\(\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\2&7&9\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&3&7\end{bmatrix}\Rightarrow\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}0&1&1\\1&2&1\\2&7&9\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\2&3&1\end{bmatrix}\begin{bmatrix}1&2&1\\0&1&1\\0&0&4\end{bmatrix}\)

\(PA=LU\)

If we hold row exchanges until after elimination, we then have \(A=LPU\)


Elimination with only the necessary row exchanges will produce the triangular factorization \(A=LPU\), with the (unique) permutation P in the middle. The entries in L are reordered in comparison with the more familiar \(A=PLU\) (where P is not unique).
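For comparison, scipy.linalg.lu follows the \(A=PLU\) convention with partial pivoting, so its permutation generally differs from the single exchange used in the hand computation above. A small sketch:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 1., 1.],
              [1., 2., 1.],
              [2., 7., 9.]])

P, L, U = lu(A)                      # A = P @ L @ U (partial pivoting)
print(P)                             # a permutation matrix (not unique in general)
print(np.allclose(P @ L @ U, A))     # True
```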


[Ex]

\(A=\begin{bmatrix}2&1&1\\2&1&-1\\1&3&1\end{bmatrix}\)

\(P_{23}A=\begin{bmatrix}2&1&1\\1&3&1\\2&1&-1\end{bmatrix}\)

\(P_{23}A=LU=\begin{bmatrix}1&0&0\\\frac12&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=P_{23}^{-1}LU\)


\(E_1A=\begin{bmatrix}1&0&0\\-1&1&0\\-\frac12&0&1\end{bmatrix}A=\begin{bmatrix}2&1&1\\0&0&-2\\0&\frac52&\frac12\end{bmatrix}\)

\(P_{23}E_1A=\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}=U'\)

\(E_1A=P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=E_1^{-1}P_{23}^{-1}U'=\begin{bmatrix}1&0&0\\1&1&0\\\frac12&0&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\begin{bmatrix}2&1&1\\0&\frac52&\frac12\\0&0&-2\end{bmatrix}\)

\(A=LPU\)

Vector Space and Subspaces

To have a vector space, the following eight axioms must be satisfied for every u, v and w in V, and all a and b in F.

Associativity of vector addition: u + (v + w) = (u + v) + w
Commutativity of vector addition: u + v = v + u
Identity element of vector addition: there exists a "unique" element 0 ∈ V, called the zero vector, such that v + 0 = v for all v ∈ V.
Inverse elements of vector addition: for every v ∈ V, there exists a "unique" element −v ∈ V, called the additive inverse of v, such that v + (−v) = 0.
Compatibility of scalar multiplication with field multiplication: a(bv) = (ab)v
Identity element of scalar multiplication: 1v = v, where 1 denotes the multiplicative identity in F.
Distributivity of scalar multiplication with respect to vector addition: a(u + v) = au + av
Distributivity of scalar multiplication with respect to field addition: (a + b)v = av + bv

\(\underline A=(a_1,\;a_2,\;...,\;a_n)\)

\(\underline B=(b_1,\;b_2,\;...,\;b_n)\)

\(\underline A+\underline B=(a_1+b_1,\;a_2+b_2,\;...,\;a_n+b_n)\)

Vector Space: V: a set of vectors

Two operations: vector addition & scalar multiplication.

\(\underline v\;\in\;V,\;\underline w\;\in\;V\;\Rightarrow\underline v+\underline w\;\in\;V\)

\(\underline v\;\in\;V\;\Rightarrow c\underline v\;\in\;V,\;c:field\)


Remark: \(0\cdot\underline v=\underline0\)


Remark: \(z=\{\underline0\}\)

\(\underline0+\underline0=\underline0\)

\(c\cdot\underline0=\underline0\)

\(z\;is\;a\;vector\;space.\)


Remark: \((-1)\underline v=-\underline v\)


Subspace, Column space, Null space

A subspace of \(\mathbb{R}^n\) is a subset \(V\) of \(\mathbb{R}^n\) satisfying:

  1. Non-emptiness: The zero vector is in \(V\).
  2. Closure under addition: If \(u\) and \(v\) are in \(V\text{,}\) then \(u+v\) is also in \(V\).
  3. Closure under scalar multiplication: If \(v\) is in \(V\) and \(c\) is in \(\mathbb{R}\text{,}\) then \(cv\) is also in \(V\).

Let \( U \subset V \) be a subset of a vector space \(V\) over \(F\). Then \(U\) is a subspace of \(V\) if and only if the following three conditions hold.

  1. additive identity: \( 0 \in U \);
  2. closure under addition: \(u, v \in U \Rightarrow u + v \in U\);
  3. closure under scalar multiplication: \( a \in \mathbb{F}, ~u \in U \implies au \in U \).

Spans are Subspaces and Subspaces are Spans

If \(v_1,v_2,\ldots,v_p\) are any vectors in \(\mathbb{R}^n\text{,}\) then \(\text{Span}\{v_1,v_2,\ldots,v_p\}\) is a subspace of \(\mathbb{R}^n\). Moreover, any subspace of \(\mathbb{R}^n\) can be written as a span of a set of \(p\) linearly independent vectors in \(\mathbb{R}^n\) for \(p\leq n\).


Column Space and Null Space

Let \(A\) be an \(m\times n\) matrix.

  • The column space of \(A\) is the subspace of \(\mathbb{R}^m\) spanned by the columns of \(A\). It is written \(\text{Col}(A)\).
  • The null space of \(A\) is the subspace of \(\mathbb{R}^n\) consisting of all solutions of the homogeneous equation \(Ax=0\text{:}\)

    \[ \text{Nul}(A) = \bigl\{ x \text{ in } \mathbb{R}^n\bigm| Ax=0 \bigr\}. \nonumber \]

[Ex]

\(M=\{\begin{bmatrix}a&b\\c&d\end{bmatrix}:\;a,\;b,\;c,\;d\;\in\mathfrak R\}\)

\(U=\{\begin{bmatrix}a&b\\0&d\end{bmatrix}:\;a,\;b,\;d\;\in\mathfrak R\}\)

\(If\;A,\;B\;\in U,\;A+B\;\in U\)

\(If\;A\in U,\;cA\in U\)

\(\therefore U\;is\;a\;subspace\;of\;M\)



Column Space

The column space of a matrix is the span, or all possible linear combinations, of its columns.

definition of the column space

\(C(A)=column\;space\)

\(C(A)\;is\;the\;span\;of\;the\;columns\;\{v_1,\;v_2,\;...,\;v_n\}\;of\;A\)

\(\underline b\in C(A),\;\underline b=x_1v_1+x_2v_2+...+x_nv_n\)

\(The\;system\;A\underline x=\underline b\;is\;solvable\;iff\;\underline b\in C(A)\)


claim \(If\;A\;is\;an\;m\times n\;real\;matrix,\;C(A)\;is\;a\;subspace\;of\;\mathfrak R^m\)


S = set of vectors in a vector space V (probably NOT a subspace)

SS = the set of all linear combinations of vectors in S

We call SS the "span" of S

Then SS is a subspace of V, called the subspace "spanned" by S

NullSpace

As an example, suppose we want a linear map \(A\) from \(V=\mathbb{F}^5\) to \(W=\mathbb{F}^4\) whose null space has dimension 3. The rank-nullity theorem (stated below) says

\(\mathrm{rk}(A)+\mathrm{nul}(A)=\dim(V)\Leftrightarrow{}\\\mathrm{rk}(A)+3=5{}\Leftrightarrow\\\mathrm{rk}(A)=2\)

\( A= \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \)

Take \(A=[\mathbf{e_1},\mathbf{e_2},0,0,0]\) (its columns are \(\mathbf{e_1},\mathbf{e_2}\in\mathbb{F}^4\) and three zero columns).

Let \(\mathbf{v}=(v_1,...,v_5) \in\mathbb F^5\). Then \(A \mathbf{v} = (v_1,v_2,0,0)\). The kernel of \(A\) is the subspace of \(\mathbb F^5\) spanned by \(\mathbf{e}_3,\mathbf{e}_4,\mathbf{e}_5\),

so it has dimension 3. Alternatively, we can see that \(\dim \mathbb F^5- \operatorname{rank} A=\dim (\ker A)\), and the rank is exactly 2.



The null space of an \(m\)-by-\(n\) matrix \(A\) is the collection of those vectors in \(\mathbb{R}^{n}\) that \(A\) maps to the zero vector in \(\mathbb{R}^m\). More precisely,

\[\mathcal{N}(A) = \{x \in \mathbb{R}^n | Ax = 0\} \nonumber\]


As an example, we examine the matrix \(A\)

\[A = \begin{pmatrix} {0}&{1}&{0}&{0}\\ {-1}&{0}&{1}&{0}\\ {0}&{0}&{0}&{1} \end{pmatrix} \nonumber\]

It is fairly easy to see that the null space of this matrix is:

\[\mathcal{N}(A) = \{t \begin{pmatrix} {1}\\ {0}\\ {1}\\ {0} \end{pmatrix} | t \in \mathbb{R}\} \nonumber\]

This is a line in \(\mathbb{R}^{4}\)



In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the part of the domain which is mapped to the zero vector of the co-domain; the kernel is always a linear subspace of the domain.

The nullspace of A (or called the "kernel" of A) consists of all solutions to

\(A\underline x=\underline0\)

\(N(A)=\{\underline x:A\underline x=\underline0\}\)


claim: If A is \(m\times n\), then N(A) is a subspace of \(\mathfrak R^n\)


Proof: \((i)\;If\;A\underline x=\underline0\;and\;A\underline y=\underline0,\;then\;A(\underline x+\underline y)=A\underline x+A\underline y=\underline0+\underline0=\underline0\)

\((ii)\;If\;A\underline x=\underline0,\;then\;A(c\underline x)=c(A\underline x)=c\underline0=\underline0\)


[Ex] \(A=\begin{bmatrix}1&2\\3&6\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&0\end{bmatrix}=U,\;x+2y=0\;\Rightarrow N(A)=\{c\begin{bmatrix}-2\\1\end{bmatrix}:\;c\in\mathfrak R\}\)


[Ex] \(A=\begin{bmatrix}1&2\\3&8\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&2\end{bmatrix},\;x+2y=0,\;2y=0\)

\(\therefore\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow N(A)=\underline0\)


[Ex] \(B=\begin{bmatrix}1&2\\3&8\\2&4\\6&16\end{bmatrix}=\begin{bmatrix}A\\2A\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2\\0&2\\0&0\\0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&2\\0&2\\0&0\\0&0\end{bmatrix}\Rightarrow x+2y=0,\;2y=0\)

\(\therefore\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow N(B)=\underline0\)


\(C=\begin{bmatrix}1&2&2&4\\3&8&6&16\end{bmatrix}=\begin{bmatrix}A&2A\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&2&2&4\\0&2&0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&2&0\\0&2&0&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&2&0\\0&1&0&2\end{bmatrix}=R\)

reduced row echelon form (RREF)

\(C\underline x=\underline0\;\Leftrightarrow\;R\underline x=\underline0\)

\(x_1+2x_3=0\)

\(x_2+2x_4=0\)

\(x_1,\;x_2\;pivot\;variables\)

\(x_3,\;x_4\;free\;variables\)

\(\Rightarrow\left\{\begin{array}{l}x_1=-2x_3\\x_2=-2x_4\end{array}\right.\)

\(\therefore\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_3\\-2x_4\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_3\\0\\x_3\\0\end{bmatrix}+\begin{bmatrix}0\\-2x_4\\0\\x_4\end{bmatrix}\)

\(=x_3\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}+x_4\begin{bmatrix}0\\-2\\0\\1\end{bmatrix}\)

\(N(C)=\{\underline x:\underline x=x_3\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}+x_4\begin{bmatrix}0\\-2\\0\\1\end{bmatrix},\;x_3,\;x_4\in\mathfrak R\}\)

\(How\;to\;get\;\begin{bmatrix}-2\\0\\1\\0\end{bmatrix}:\;set\;x_3=1,\;x_4=0,\;then\;x_1=-2,\;x_2=0\)

\(How\;to\;get\;\begin{bmatrix}0\\-2\\0\\1\end{bmatrix}:set\;x_3=0,\;x_4=1,\;then\;x_1=0,\;x_2=-2\)

\(A\underline x=\underline0,\;A\xrightarrow{Gaussian\;elimination}U\xrightarrow[{Produce\;zeros\;above\;pivots}]{Reduced\;row\;echelon\;form}R\)

\(Produce\;ones\;in\;the\;pivots\)

\(Solve\;R\underline x=\underline0\)
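The same pipeline, \(A\rightarrow U\rightarrow R\rightarrow\) special solutions, is available symbolically. A short SymPy sketch for the matrix \(C\) above (added for illustration):

```python
from sympy import Matrix

C = Matrix([[1, 2, 2, 4],
            [3, 8, 6, 16]])

R, pivot_cols = C.rref()
print(R)                  # Matrix([[1, 0, 2, 0], [0, 1, 0, 2]])
print(pivot_cols)         # (0, 1): x1, x2 are the pivot variables

for s in C.nullspace():   # one special solution per free variable
    print(s.T)            # [-2, 0, 1, 0] and [0, -2, 0, 1]
```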


[Ex] \(A=\begin{bmatrix}1&1&2&3\\2&2&8&10\\3&3&10&13\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&2&3\\0&0&4&4\\0&0&4&4\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&2&3\\0&0&4&4\\0&0&0&0\end{bmatrix}=U\)

\(\Rightarrow\begin{bmatrix}1&1&0&1\\0&0&4&4\\0&0&0&0\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&0&1\\0&0&1&1\\0&0&0&0\end{bmatrix}=R\)

\(x_1+x_2+x_4=0\)

\(x_3+x_4=0\)

\(x_1,\;x_3\;pivot\;variables\)

\(x_2,\;x_4\;free\;variables\)

\(x_1=-x_2-x_4\)

\(x_3=-x_4\)

\(\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-x_2-x_4\\x_2\\-x_4\\x_4\end{bmatrix}=\begin{bmatrix}-x_2\\x_2\\0\\0\end{bmatrix}+\begin{bmatrix}-x_4\\0\\-x_4\\x_4\end{bmatrix}\)

\(=x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix}\) special solution

\(\left\{\begin{array}{l}x_2=1,\;x_4=0,\Rightarrow x_1=-1,\;x_3=0\\x_2=0,\;x_4=1,\Rightarrow x_1=-1,\;x_3=-1\end{array}\right.\)

\(N(A)=\{\underline x:\underline x=x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix},\;x_2,\;x_4\in\mathfrak R\}\)

Remark: Suppose \(A\) is \(m\times n\). If there are \(r\) pivots, \(r\leq m,\;r\leq n\), there are \(n-r\) free variables.

And there are \(n-r\) special solutions.

\(N(A)\) consists of all the linear combinations of these \(n-r\) special solutions.

\(N(A)=the\;subspace\;spanned\;by\;these\;n-r\;special\;solutions\)


Rank Nullity Theorem

\[A \in \mathbb{R}^{m \times n} \Rightarrow rank(A) + nullity(A) = n\]

For example, if B is a 4 \(\times\) 3 matrix and \(rank(B) = 2\), then from the rank-nullity theorem, one can deduce that

\[rank(B) + nullity(B) = 2 + nullity(B) = 3 \Rightarrow nullity(B) = 1\]


\(If\;n\;>\;m\;(more\;columns\;than\;rows),\;i.e.,\;A\underline x=\underline0\;has\;more\;unknowns\;than\;equations,\)

\(then\;r\;\leq\;m\;<\;n\)

\(Hence\;n-r\;>\;0,\;i.e.,\;there\;must\;be\;nonzero\;solution.\)

\(Def\;The\;rank\;of\;a\;matrix\;A\;is\;the\;number\;of\;pivots.\)

\(Pivot\;columns\;and\;special\;solutions\)

\(A=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}=R\)

\(x_1+3x_2+2x_4-x_5=0\)

\(x_3+4x_4-3x_5=0\)

\(x_1=-3x_2-2x_4+x_5\)

\(x_3=-4x_4+3x_5\)

\(x_2=1,\;x_4=0,\;x_5=0\;\Rightarrow x_1=-3,\;x_3=0\)

\(x_2=0,\;x_4=1,\;x_5=0\;\Rightarrow x_1=-2,\;x_3=-4\)

\(x_2=0,\;x_4=0,\;x_5=1\;\Rightarrow x_1=1,\;x_3=3\)

\(Three\;special\;solutions:\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix},\;\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix},\;\begin{bmatrix}1\\0\\3\\0\\1\end{bmatrix}\)

\(\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.0, 0.5, 0.0}3}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}2}&\color[rgb]{0.0, 0.5, 0.0}{-1}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.0, 0.5, 0.0}4}&\color[rgb]{0.0, 0.5, 0.0}{-3}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}&{\color[rgb]{0.0, 0.5, 0.0}0}\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}\mathrm blue}:\;\mathrm{pivot}\;\mathrm{columns}.\;{\color[rgb]{0.0, 0.5, 0.0}\mathrm green}:\;\mathrm{free}\;\mathrm{columns}\)

\(A\underline x=\underline0\;\Rightarrow R\underline x=\underline0\)

\(A\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix},\;R\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(-3\begin{bmatrix}{\underline a}_1\end{bmatrix}+\begin{bmatrix}{\underline a}_2\end{bmatrix}=\underline0\)

\(\Rightarrow\underbrace{\begin{bmatrix}{\underline a}_2\end{bmatrix}}_{free\;columns}=3\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}\)

\(A\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix},\;R\begin{bmatrix}-2\\0\\-4\\1\\0\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\underbrace{\begin{bmatrix}{\underline a}_4\end{bmatrix}}_{free\;columns}=2\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}+4\underbrace{\begin{bmatrix}{\underline a}_3\end{bmatrix}}_{pivot\;columns}\)

\(similarly,\;\underbrace{\begin{bmatrix}{\underline a}_5\end{bmatrix}}_{free\;columns}=-\underbrace{\begin{bmatrix}{\underline a}_1\end{bmatrix}}_{pivot\;columns}-3\underbrace{\begin{bmatrix}{\underline a}_3\end{bmatrix}}_{pivot\;columns}\)

\(For\;R,\;we\;can\;also\;obtain\)

\(\begin{bmatrix}{\underline r}_2\end{bmatrix}=3\begin{bmatrix}{\underline r}_1\end{bmatrix}\)

\(\begin{bmatrix}{\underline r}_4\end{bmatrix}=2\begin{bmatrix}{\underline r}_1\end{bmatrix}+4\begin{bmatrix}{\underline r}_3\end{bmatrix}\)

\(\begin{bmatrix}{\underline r}_5\end{bmatrix}=-\begin{bmatrix}{\underline r}_1\end{bmatrix}-3\begin{bmatrix}{\underline r}_3\end{bmatrix}\)

Free columns are linear combinations of pivot columns, and the special solutions describe these combinations.

\(Note\;R=\;\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(N=\begin{bmatrix}-3&-2&1\\1&0&0\\0&-4&3\\0&1&0\\0&0&1\end{bmatrix},\;nullspace\;matrix\)

\(\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&3&{\color[rgb]{0.0, 0.0, 1.0}0}&2&-1\\{\color[rgb]{0.0, 0.0, 1.0}0}&0&{\color[rgb]{0.0, 0.0, 1.0}1}&4&-3\\{\color[rgb]{0.0, 0.0, 1.0}0}&0&{\color[rgb]{0.0, 0.0, 1.0}0}&0&0\end{bmatrix}\leftrightarrow\begin{bmatrix}\color[rgb]{0.0, 0.0, 1.0}{-3}&\color[rgb]{0.0, 0.0, 1.0}{-2}&{\color[rgb]{0.0, 0.0, 1.0}1}\\1&0&0\\{\color[rgb]{0.0, 0.0, 1.0}0}&\color[rgb]{0.0, 0.0, 1.0}{-4}&{\color[rgb]{0.0, 0.0, 1.0}3}\\0&1&0\\0&0&1\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}l}{\color[rgb]{0.0, 0.0, 1.0}u}{\color[rgb]{0.0, 0.0, 1.0}e}:pivot\)

\(R=\begin{bmatrix}\mathrm I&\vdots&\mathrm F\end{bmatrix}\)

\(\mathrm N=\begin{bmatrix}-\mathrm F\\\cdots\\\mathrm I\end{bmatrix}\)

\(\mathrm R=\begin{bmatrix}\mathrm I&\mathrm F\\0&0\end{bmatrix}\), where \(\mathrm I\) is \(\mathrm r\times\mathrm r\), \(\mathrm F\) is \(\mathrm r\times(\mathrm n-\mathrm r)\), and the zero block has \(\mathrm m-\mathrm r\) rows (so \(\mathrm R\) has \(\mathrm m\) rows in total).

\(\mathrm N=\begin{bmatrix}-\mathrm F\\\cdots\\\mathrm I\end{bmatrix},\;-\mathrm F:\mathrm r\times(\mathrm n-\mathrm r),\;\mathrm I:(\mathrm n-\mathrm r)\times(\mathrm n-\mathrm r)\)
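The block formula \(N=\begin{bmatrix}-F\\I\end{bmatrix}\) can be assembled mechanically from \(R\). A short NumPy sketch for the example above (pivot columns are the 1st and 3rd, i.e. indices 0 and 2):

```python
import numpy as np

R = np.array([[1, 3, 0, 2, -1],
              [0, 0, 1, 4, -3],
              [0, 0, 0, 0,  0]])

pivot_cols = [0, 2]
free_cols = [1, 3, 4]
r, n = len(pivot_cols), R.shape[1]

F = R[:r, free_cols]                        # the r x (n-r) block F
N = np.zeros((n, n - r), dtype=int)
N[pivot_cols, :] = -F                       # -F goes into the pivot rows
N[free_cols, :] = np.eye(n - r, dtype=int)  # identity into the free rows

print(N.T)                 # columns of N: [-3 1 0 0 0], [-2 0 -4 1 0], [1 0 3 0 1]
print(np.all(R @ N == 0))  # True: every column of N is a special solution
```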









complete solution to \(A\underline x=\underline b\)

\(\begin{bmatrix}1&3&0&2\\0&0&1&4\\1&3&1&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}1\\6\\7\end{bmatrix}\)

\(\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\1&3&1&6&\vdots&7\end{bmatrix}\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\0&0&1&4&\vdots&6\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&1\\0&0&1&4&\vdots&6\\0&0&0&0&\vdots&0\end{bmatrix}\)

\(x_1+3x_2+2x_4=1\)

\(x_3+4x_4=6\)

\(\underline x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-3x_2-2x_4+1\\x_2\\-4x_4+6\\x_4\end{bmatrix}\)

\(=x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}+\begin{bmatrix}1\\0\\6\\0\end{bmatrix}\)

\(what\;is\;x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}?\)

\(It\;is\;the\;general\;solution\;({\underline x}_n)\;to\;A\underline x=\underline0\)

\(what\;is\;\begin{bmatrix}1\\0\\6\\0\end{bmatrix}?\)

It is a particular solution to \(\;A\underline x=\underline b\)

\(\therefore\underline x={\underline x}_p+{\underline x}_n\)

Claim: \(If\;\;A\underline x=\underline b,\;then\;the\;complete\;solution\;\underline x={\underline x}_p+{\underline x}_n\)

\(where\;{\underline x}_p\;is\;a\;particular\;solution\;to\;\;A\underline x=\underline b\;and\;{\underline x}_n\;is\;the\;general\;solution\;to\;A\underline x=\underline0\)

Proof: \(If\;\underline x={\underline x}_p+{\underline x}_n,\;then\;A\underline x=A({\underline x}_p+{\underline x}_n)=A{\underline x}_p+A{\underline x}_n\)

\(=\underline b+\underline0=\underline b\)

\(\therefore\underline x\;is\;a\;solution\;to\;A\underline x=\underline b\)

\(If\;\underline x\;is\;a\;solution\;to\;A\underline x=\underline b\)

\(then\;A(\underline x-{\underline x}_p)=A\underline x-A{\underline x}_p=\underline b-\underline b=\underline0\)

\(\therefore\underline x-{\underline x}_p\;is\;a\;solution\;to\;A\underline x=\underline0\)

\(\Rightarrow\underline x-{\underline x}_p={\underline x}_n\;\Rightarrow\underline x={\underline x}_p+{\underline x}_n\)

\(\begin{bmatrix}1&3&0&2\\0&0&1&4\\1&3&1&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\1&3&1&6&\vdots&b_3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\0&0&0&0&\vdots&b_3-b_1-b_2\end{bmatrix}\)

\(For\;A\underline x=\underline b\;to\;be\;solvable,\;we\;must\;have\;b_3-b_1-b_2=0,\;i.e.\;b_3=b_1+b_2\)

\(If\;b_3=b_1+b_2,\;then\)

\(\begin{bmatrix}1&3&0&2&\vdots&b_1\\0&0&1&4&\vdots&b_2\\0&0&0&0&\vdots&0\end{bmatrix}\)

\(\left\{\begin{array}{l}x_1+3x_2+2x_4=b_1\\x_3+4x_4=b_2\end{array}\right.\)

\(A\;particular\;solution\;{\underline x}_p\;to\;\;A\underline x=\underline b\;can\;be\;found\;by\;setting\;x_2=x_4=0\)

\(which\;gives\;x_1=b_1,\;x_3=b_2\)

\(\therefore{\underline x}_p=\begin{bmatrix}b_1\\0\\b_2\\0\end{bmatrix}\)

\(\left\{\begin{array}{l}x_1+3x_2+2x_4=0\\x_3+4x_4=0\end{array}\right.\)

\(\Rightarrow{\underline x}_n=x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}\)

\(Therefore,\;\underline x={\underline x}_p+{\underline x}_n\)

\(=\begin{bmatrix}b_1\\0\\b_2\\0\end{bmatrix}+x_2\begin{bmatrix}-3\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-2\\0\\-4\\1\end{bmatrix}\)

\(If\;b_3=b_1+b_2\)
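A small numpy sketch confirming the structure of the complete solution (hedged: the specific \(\underline b=(1,6,7)\) is just one choice satisfying \(b_3=b_1+b_2\)):

```python
import numpy as np

A = np.array([[1, 3, 0, 2],
              [0, 0, 1, 4],
              [1, 3, 1, 6]])
b = np.array([1, 6, 7])               # any b with b3 = b1 + b2 works; this is one example
x_p = np.array([1, 0, 6, 0])          # particular solution (x2 = x4 = 0)
s1 = np.array([-3, 1, 0, 0])          # special solutions spanning N(A)
s2 = np.array([-2, 0, -4, 1])

for x2, x4 in [(0, 0), (1, 2), (-5, 3)]:
    assert np.allclose(A @ (x_p + x2 * s1 + x4 * s2), b)
print("x_p + x_n solves A x = b for every choice of x2, x4")
```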

Question: If \(A\) is a square invertible matrix, i.e., \(m=n=r\),

\(what\;are\;{\underline x}_p\;and\;{\underline x}_n?\)

\(The\;only\;particular\;solution\;to\;A\underline x=\underline b\;is\;\underline x=A^{-1}\underline b\)

\(And\;the\;only\;solution\;to\;A\underline x=\underline0\;is\;{\underline x}_n=\underline0\)

\(\therefore\underline x={\underline x}_p+{\underline x}_n=A^{-1}\underline b+\underline0=A^{-1}\underline b\)

\(\begin{bmatrix}A&\vdots&\underline b\end{bmatrix}\Rightarrow\begin{bmatrix}I&\vdots&A^{-1}\underline b\end{bmatrix}\)


[Ex] \(\begin{bmatrix}1&1\\1&2\\-2&-3\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\begin{bmatrix}1&1&\vdots&b_1\\1&2&\vdots&b_2\\-2&-3&\vdots&b_3\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&\vdots&b_1\\0&1&\vdots&b_2-b_1\\0&-1&\vdots&b_3+2b_1\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&\vdots&2b_1-b_2\\0&1&\vdots&b_2-b_1\\0&0&\vdots&b_1+b_2+b_3\end{bmatrix}\)

\(\Rightarrow For\;A\underline x=\underline b\;to\;be\;solvable,\;we\;must\;have\;b_1+b_2+b_3=0\)

\(\therefore{\underline x}_p=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix}\)

\(and\;{\underline x}_n=\begin{bmatrix}0\\0\end{bmatrix},\;(no\;free\;variables)\)

\(\Rightarrow\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix}+\begin{bmatrix}0\\0\end{bmatrix}\)

\(=\begin{bmatrix}2b_1-b_2\\b_2-b_1\end{bmatrix},\;if\;b_1+b_2+b_3=0\)


In general, if \(r=n,\;\;(m\geq n)\)

A has full column rank

\(A=\underset n{\underbrace{\left.\begin{bmatrix}?&?\\?&?\end{bmatrix}\right\}}m\;}\Rightarrow R=\begin{bmatrix}I\\0\end{bmatrix}\)

There are no free variables or free columns.

\(Also\;N(A)=\{\underline0\}\)

Therefore, if A has full column rank (r=n),

Then (i) All columns of A are pivot columns.

(ii) There are no free variables and no special solutions.

(iii) \(N(A)=\{\underline0\}\)

(iv) \(If\;A\underline x=\underline b\;has\;a\;solution\;(it\;might\;not),\;then\;it\;has\;only\;one\;solution.\)


[Ex]

\(\begin{bmatrix}1&1&1\\1&2&-1\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}3\\4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&1&1&\vdots&3\\1&2&-1&\vdots&4\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&3&\vdots&2\\0&1&-2&\vdots&1\end{bmatrix}\)

\(\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2\\1\\0\end{bmatrix}+x_3\begin{bmatrix}-3\\2\\1\end{bmatrix}\)

\(\begin{bmatrix}-3\\2\\1\end{bmatrix}:\;line\;of\;solutions\;to\;A\underline x=\underline0\)

\(\underline x={\underline x}_p+\;{\underline x}_n=\begin{bmatrix}2\\1\\0\end{bmatrix}+x_3\begin{bmatrix}-3\\2\\1\end{bmatrix},\;line\;of\;solutions\;to\;A\underline x=\underline b\)


\(If\;r=m\;(m\leq n),\;A\;has\;full\;row\;rank\)

\(A=\underset n{\underbrace{\left.\begin{bmatrix}?&?&?\\?&?&?\end{bmatrix}\right\}}m\;}\Rightarrow R=\begin{bmatrix}I&F\end{bmatrix}\)

If A has full row rank (r=m)

then (i) All rows have pivots (R has no zero rows)

(ii) \(A\underline x=\underline b\;is\;solvable\;for\;all\;\underline b\)

(iii) \(C(A)=\mathcal R^m\)

(iv) \(There\;are\;n-r=n-m\;special\;solutions\;in\;N(A)\)


Summary \(A\underline x=\underline b\)

Case I: \(r=m=n,\;(A\;invertible)\)

\(R=I,\;A\underline x=\underline b\;has\;1\;solution.\;\underline x=A^{-1}\underline b\)

Case II: \(r=m\;<\;n\;(full\;row\;rank),\;A:short\;and\;wide\)

\(R=\begin{bmatrix}I&F\end{bmatrix},\;A\underline x=\underline b\;has\;\infty\;solutions\)

Case III: \(r=n\;<\;m\;(full\;column\;rank),\;A:tall\;and\;thin\)

\(R=\begin{bmatrix}I\\0\end{bmatrix},\;A\underline x=\underline b\;has\;0\;or\;1\;solution\)

Case IV: \(r\;<\;m,\;r\;<\;n.\;(A\;not\;full\;rank)\)

\(R=\begin{bmatrix}I&F\\0&0\end{bmatrix},\;A\underline x=\underline b\;has\;0\;or\;\infty\;solutions\)
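The four cases can be read off from the rank alone. The sketch below (assuming numpy; the sample matrices are illustrative only) classifies a matrix accordingly:

```python
import numpy as np

def classify(A):
    """Classify A x = b by rank, following Cases I-IV above."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    if r == m == n:
        return "Case I: exactly 1 solution"
    if r == m < n:
        return "Case II: infinitely many solutions"
    if r == n < m:
        return "Case III: 0 or 1 solution"
    return "Case IV: 0 or infinitely many solutions"

print(classify(np.array([[1., 2.], [3., 4.]])))            # invertible, Case I
print(classify(np.array([[1., 1., 1.], [1., 2., -1.]])))   # full row rank, Case II
```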


Independence, Basis, and Dimension







\(Def:\;If\;x_1{\underline v}_1+x_2{\underline v}_2+...+x_n{\underline v}_n=\underline0\;holds\;only\;when\;x_1=x_2=...=x_n=0,\)

\(then\;the\;vectors\;{\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_n\;are\;linearly\;independent.\)

\(otherwise,\;they\;are\;linearly\;dependent.\)

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\;\begin{bmatrix}0\\1\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\0\end{bmatrix}+x_2\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\Rightarrow x_1=0,\;x_2=0\)

\(\therefore They\;are\;(linearly)\;independent.\)

[Ex] \(\begin{bmatrix}1\\2\\1\\0\end{bmatrix},\begin{bmatrix}3\\4\\0\\1\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\2\\1\\0\end{bmatrix}+x_2\begin{bmatrix}3\\4\\0\\1\end{bmatrix}=\begin{bmatrix}0\\0\\0\\0\end{bmatrix}\)

\(\Rightarrow\;x_1+3x_2=0\)

\(2x_1+4x_2=0\)

\(x_1=0\)

\(x_2=0\)

\(\Rightarrow\;x_1=x_2=0\)

\(\therefore They\;are\;independent\)

Remark: The special solutions we find in the nullspace of a matrix are independent.

[Ex] \(\begin{bmatrix}1\\2\\1\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\begin{bmatrix}3\\5\\3\end{bmatrix}\)

\(suppose\;x_1\begin{bmatrix}1\\2\\1\end{bmatrix}+x_2\begin{bmatrix}0\\1\\0\end{bmatrix}+x_3\begin{bmatrix}3\\5\\3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}1&0&3\\2&1&5\\1&0&3\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}1&0&3\\2&1&5\\1&0&3\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&3\\0&1&-1\\0&0&0\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}1&0&3\\0&1&-1\\0&0&0\end{bmatrix}=R\)

\(\therefore\underline x=x_3\begin{bmatrix}-3\\1\\1\end{bmatrix}\)

\(We\;can\;have\;\;-3\cdot\begin{bmatrix}1\\2\\1\end{bmatrix}+1\cdot\begin{bmatrix}0\\1\\0\end{bmatrix}+1\cdot\begin{bmatrix}3\\5\\3\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\)

These vectors are linearly dependent.

\(\begin{bmatrix}3\\5\\3\end{bmatrix}=3\begin{bmatrix}1\\2\\1\end{bmatrix}-\begin{bmatrix}0\\1\\0\end{bmatrix}\)

In general, the columns of A are linearly independent exactly when r=n (full column rank).

\(R=\begin{bmatrix}I\\0\end{bmatrix},\;There\;are\;n\;pivots\;and\;no\;free\;variables\)

\(Hence\;N(A)=\{\underline0\}\)
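In practice, independence of a set of column vectors can be checked by comparing the rank with the number of columns. A minimal numpy illustration, reusing the dependent vectors from the example above:

```python
import numpy as np

# Columns are independent exactly when the rank equals the number of columns.
V = np.array([[1, 0, 3],
              [2, 1, 5],
              [1, 0, 3]])
print(np.linalg.matrix_rank(V))   # 2 < 3, so the three columns are dependent
```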

Claim: \(any\;set\;of\;n\;vectors\;in\;\mathcal R^m\;must\;be\;linearly\;dependent\;if\;n\;>\;m\)

Proof \(Put\;these\;n\;vectors\;as\;columns\;of\;A\)

\(A={\begin{bmatrix}{\underline v}_1&{\underline v}_2&...&{\underline v}_n\end{bmatrix}}_{m\times n}\)

\(\Rightarrow R=\begin{bmatrix}I&F\\0&0\end{bmatrix}\)

\(There\;are\;n-r\;>\;0\;free\;variables\;and\;hence\;n-r\;special\;solutions.\)

\(Therefore,\;N(A)\;contains\;vectors\;other\;than\;\underline0.\)


Def: A set of vectors spans a vector space if their linear combinations fill the space.

Remark: The columns of a matrix span its column space.

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\;span\;\mathcal R^2\)

\(\begin{bmatrix}a\\b\end{bmatrix}=a\begin{bmatrix}1\\0\end{bmatrix}+b\begin{bmatrix}0\\1\end{bmatrix}\)

[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix},\begin{bmatrix}4\\7\end{bmatrix}\;span\;\mathcal R^2?\;Yes\)

[Ex] \(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}-1\\-1\end{bmatrix}\;span\;\mathcal R^2?\;No\)

\(\begin{bmatrix}a\\b\end{bmatrix}=x_1\begin{bmatrix}1\\1\end{bmatrix}+x_2\begin{bmatrix}-1\\-1\end{bmatrix}\;?\)

\(\begin{bmatrix}1&-1\\1&-1\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}a\\b\end{bmatrix}?\)

\(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}-1\\-1\end{bmatrix}\;only\;span\;a\;line\;in\;\mathcal R^2,\;as\;does\;\begin{bmatrix}1\\1\end{bmatrix}\;by\;itself.\)


Def: \(The\;\boldsymbol r\boldsymbol o\boldsymbol w\boldsymbol\;\boldsymbol s\boldsymbol p\boldsymbol a\boldsymbol c\boldsymbol e\;of\;an\;m\times n\;matrix\;A\;is\;the\;subspace\;of\;\mathcal R^n\;spanned\;by\;the\;rows\;of\;A.\)

\(The\;row\;space\;of\;A\;is\;\mathcal C(A^T).\;It\;is\;the\;column\;space\;of\;A^T.\)

\(The\;column\;space\;of\;A\;is\;\mathcal C(A)\)

\(The\;null\;space\;of\;A\;is\;\mathcal N(A)\)


Def: \(A\;\boldsymbol b\boldsymbol a\boldsymbol s\boldsymbol i\boldsymbol s\;for\;a\;vector\;space\;is\;a\;sequence\;of\;vectors\;satisfying\;two\;properties:\)

(1) The basis vectors are linearly independent.

(2) They span the space.


[Ex] \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^2.\)

[Ex] \(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\begin{bmatrix}0\\0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^3.\)

The standard basis \(for\;\mathcal R^n\)

[Ex] \(\begin{bmatrix}1\\0\\\vdots\\\vdots\\0\end{bmatrix}\begin{bmatrix}0\\1\\0\\\vdots\\0\end{bmatrix}\cdots\begin{bmatrix}0\\\vdots\\\vdots\\0\\1\end{bmatrix}\;constitute\;a\;basis\;for\;\mathcal R^n\)


[Ex] \(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\(Let\;A=\begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}\)

3 pivots -> columns are independent.

\(\begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\;is\;solvable\;for\;every\;\begin{bmatrix}b_1\\b_2\\b_3\end{bmatrix}\)

\(\therefore\mathcal C(A)=\mathcal R^3\)

\(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}1\\1\\0\end{bmatrix},\begin{bmatrix}1\\1\\1\end{bmatrix}\;are\;a\;basis\;for\;\mathcal R^3\)

[Ex] \(Any\;three\;independent\;vectors\;in\;\mathcal R^3\;are\;a\;basis\;for\;\mathcal R^3\)

\(Assume\;{\underline v}_1,\;{\underline v}_2,\;{\underline v}_3\;are\;independent\)

\(A=\begin{bmatrix}{\underline v}_1&{\underline v}_2&{\underline v}_3\end{bmatrix}\)

\(rank=3,\;A\;is\;invertible\)

\(A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\)

\(\mathcal C(A)=\mathcal R^3\)

\({\underline v}_1,\;{\underline v}_2,\;{\underline v}_3\;are\;a\;basis\;for\;\mathcal R^3\)

Claim: \(Any\;n\;independent\;vectors\;in\;\mathcal R^n\;are\;a\;basis\;for\;\mathcal R^n\)

Claim: \(The\;vectors\;{\underline v}_1,\;{\underline v}_2,...,\;{\underline v}_n\;are\;a\;basis\;for\;\mathcal R^n\)

\(exactly\;when\;they\;are\;the\;columns\;of\;an\;invertible\;matrix\)

Remark: \(\mathcal R^n\) has infinitely many different bases.

Claim: There is one and only one way to write any vector in a vector space as a linear combination of the basis vectors.

Proof \(Suppose\;{\underline v}=a_1{\underline v}_1+a_2{\underline v}_2+...+a_n{\underline v}_n\)

\(=b_1{\underline v}_1+b_2{\underline v}_2+...+b_n{\underline v}_n\)

\(\Rightarrow(a_1-b_1){\underline v}_1+(a_2-b_2){\underline v}_2+...+(a_n-b_n){\underline v}_n=\underline0\)

\(a_1-b_1=0,\;a_2-b_2=0,\;...,\;a_n-b_n=0\)

\(since\;{\underline v}_1,\;{\underline v}_2,...,\;{\underline v}_n\;are\;independent\)

\(\Rightarrow a_1=b_1,\;a_2=b_2,\;...,\;a_n=b_n\)


[Ex] \(A=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(\Rightarrow R=\begin{bmatrix}1&3&0&2&-1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(\Rightarrow R=\begin{bmatrix}{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.68, 0.46, 0.12}3}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}2}&\color[rgb]{0.68, 0.46, 0.12}{-1}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.0, 0.0, 1.0}1}&{\color[rgb]{0.68, 0.46, 0.12}4}&\color[rgb]{0.68, 0.46, 0.12}{-3}\\{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.0, 0.0, 1.0}0}&{\color[rgb]{0.68, 0.46, 0.12}0}&{\color[rgb]{0.68, 0.46, 0.12}0}\end{bmatrix},\;{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}l}{\color[rgb]{0.0, 0.0, 1.0}u}{\color[rgb]{0.0, 0.0, 1.0}e}:pivot\;column.\;{\color[rgb]{0.68, 0.46, 0.12}g}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}e}{\color[rgb]{0.68, 0.46, 0.12}e}{\color[rgb]{0.68, 0.46, 0.12}n}:free\;column.\)

Recall that free columns of R are linear combinations of pivot columns of R.

Pivot columns of R are independent.

\(pivot\;columns\;of\;R\;are\;a\;basis\;for\;\mathcal C(R)\)

Recall that free columns of A are linear combinations of pivot columns of A.

Pivot columns of A are independent.

\(pivot\;columns\;of\;A\;are\;a\;basis\;for\;\mathcal C(A)\)

\((A\underline x=\underline0\;\Leftrightarrow R\underline x=\underline0)\)

\(\Rightarrow\begin{bmatrix}1\\0\\1\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix}\;are\;a\;basis\;for\;\mathcal C(A)\)

\(\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix}\;are\;a\;basis\;for\;\mathcal C(R)\)

\(\mathcal C(A)\neq\mathcal C(R)\)


Dimension

Claim: If \({\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_m\) and \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are both bases for the same vector space, then \(m=n\)

Proof: Suppose \(n\;>\;m\)

We can have \({\underline w}_j=a_{1j}{\underline v}_1+a_{2j}{\underline v}_2+...+a_{mj}{\underline v}_m,\;for\;j=1,2,...,n\)

Consider \(x_1{\underline w}_1+x_2{\underline w}_2+...+x_n{\underline w}_n=\underline0\)

\(\Rightarrow x_1(a_{11}{\underline v}_1+a_{21}{\underline v}_2+...+a_{m1}{\underline v}_m)\)

\(+x_2(a_{12}{\underline v}_1+a_{22}{\underline v}_2+...+a_{m2}{\underline v}_m)\)

\(+...\)

\(+x_n(a_{1n}{\underline v}_1+a_{2n}{\underline v}_2+...+a_{mn}{\underline v}_m)=\underline0\)

\(\Rightarrow(a_{11}x_1+a_{12}x_2+...+a_{1n}x_n){\underline v}_1\)

\(+(a_{21}x_1+a_{22}x_2+...+a_{2n}x_n){\underline v}_2\)

\(+...\)

\(+(a_{m1}x_1+a_{m2}x_2+...+a_{mn}x_n){\underline v}_m=\underline0\)

Since \({\underline v}_1,\;{\underline v}_2,\;...,\;{\underline v}_m\) are independent (because they are a basis), the coefficients must satisfy

\(\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\dots&a_{mn}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}={\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}}_{m\times1}\)

Since \(n\;>\;m\), there exist nonzero solutions for \(x_1,\;x_2,\;...,\;x_n\), so \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are dependent.

This is impossible because \({\underline w}_1,\;{\underline w}_2,\;...,\;{\underline w}_n\) are a basis.

If \(m\;>\;n\), we exchange the \(\underline v's\;and\;\underline w's\) and repeat the same steps.

We will then also have a contradiction. The only way to avoid a contradiction is to have \(m=n\)

Def The dimension of a vector space is the number of vectors in every basis.


[Ex] \(dim(\mathcal R^2)=2\)

\(\begin{bmatrix}1\\0\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}\)

\(dim(\mathcal R^n)=n\)


[Ex] \(A=\begin{bmatrix}1&3&0&2&1\\0&0&1&4&-3\\1&3&1&6&-4\end{bmatrix}\)

\(R=\begin{bmatrix}1&3&0&2&1\\0&0&1&4&-3\\0&0&0&0&0\end{bmatrix}\)

\(dim(\mathcal C(A))=dim(\mathcal C(R))=2=r\)

\((note\;\mathcal C(A)\neq\mathcal C(R))\)


[Ex] M = the vector space of all 2x2 real matrices

A basis for M:

\(\begin{bmatrix}1&0\\0&0\end{bmatrix},\;\begin{bmatrix}0&1\\0&0\end{bmatrix},\;\begin{bmatrix}0&0\\1&0\end{bmatrix},\;\begin{bmatrix}0&0\\0&1\end{bmatrix}\)

dim(M)=4


[Ex] dim of the \(n\times n\) real matrix space \(=n^2\)


dim of the subspace of all \(n\times n\) upper triangular matrices \(=n+(n-1)+...+1\)

\(=\frac{n(n+1)}2\)


dim of the subspace of all \(n\times n\) diagonal matrices \(=n\)


dim of the subspace of all \(n\times n\) symmetric matrices \(=\frac{n(n+1)}2\)


Dimension of the Four subspaces


\(A\;m\times n\)

\(\begin{array}{ccc}row\;space&\mathcal C(A^T)&a\;subspace\;of\;\mathcal R^n\\column\;space&\mathcal C(A)&a\;subspace\;of\;\mathcal R^m\\nullspace&\mathcal N(A)&a\;subspace\;of\;\mathcal R^n\\left\;nullspace&\mathcal N(A^T)&a\;subspace\;of\;\mathcal R^m\end{array}\)

\(\mathcal N(A^T)=\{\underline y\;:\;A^T\underline y=\underline0\}\)

\(=\{\underline y\;:\;\underline y^TA=\underline0^T\}\)



\(A=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(R=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\0&1&0\\0&-1&1\end{bmatrix}\begin{bmatrix}1&0&0\\0&1&0\\-1&0&1\end{bmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(=\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}\)

\(=EA\)

1. Row Space

\(basis\;for\;\mathcal C(R^T):\begin{pmatrix}1&3&5&0&7\end{pmatrix},\;\begin{pmatrix}0&0&0&1&2\end{pmatrix}\)

pivot rows

\(dim(\mathcal C(R^T))=r=2\)

We can have \(EA=R\) for an invertible matrix \(E\)

\(EA=R\Leftrightarrow\;A=E^{-1}R\)

Every row of \(A\) is a linear combination of the rows of \(R\)

Every row of \(R\) is a linear combination of the rows of \(A\)

\(\mathcal C(A^T)=\mathcal C(R^T)\)

\(\therefore dim(\mathcal C(A^T))=dim(\mathcal C(R^T))=2=r\equiv rank(A^T)=rank(A)\)

2. Column Space

\(Basis\;for\;\mathcal C(R):\begin{bmatrix}1\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\end{bmatrix},\;(pivot\;columns\;of\;R)\)

\(Basis\;for\;\mathcal C(A):\begin{bmatrix}1\\0\\1\end{bmatrix},\begin{bmatrix}0\\1\\1\end{bmatrix},\;(pivot\;columns\;of\;A)\)

\(Bases\;are\;different\;for\;\mathcal C(A)\;and\;\mathcal C(R)\)

\(dim(\mathcal C(R))=dim(\mathcal C(A))=r=2=rank(A)\)

3. Nullspace

\(A\underline x=\underline0\;\Leftrightarrow\;R\underline x=\underline0\)

\(\therefore\mathcal N(A)=\mathcal N(R)\)

\(R=\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}\)

\(Basis\;for\;\mathcal N(R),\;(or\;\mathcal N(A))\)

\(\begin{bmatrix}-3\\1\\0\\0\\0\end{bmatrix},\begin{bmatrix}-5\\0\\1\\0\\0\end{bmatrix},\begin{bmatrix}-7\\0\\0\\-2\\1\end{bmatrix}\)

\(dim(\mathcal N(R))=dim(\mathcal N(A))=n-r=n-rank(A)=5-2=3\)

\(Bases\;are\;identical\;for\;\mathcal N(A)\;and\;\mathcal N(R)\)

4. Left Nullspace

\(\underline y^TR=\underline0^T\)

\(\Leftrightarrow{\begin{pmatrix}y_1&y_2&y_3\end{pmatrix}}_{1\times3}{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}}_{3\times5}={\begin{pmatrix}0&0&0&0&0\end{pmatrix}}_{1\times5}\)

\(\Rightarrow y_1\begin{pmatrix}1&3&5&0&7\end{pmatrix}\)

\(+y_2\begin{pmatrix}0&0&0&1&2\end{pmatrix}\)

\(=\begin{pmatrix}0&0&0&0&0\end{pmatrix}\)

\(\Rightarrow y_1=0,\;y_2=0\)

\(\begin{pmatrix}y_1&y_2&y_3\end{pmatrix}=y_3\begin{pmatrix}0&0&1\end{pmatrix}\)

\(Basis\;for\;\mathcal N(R^T)\;:\;\begin{pmatrix}0&0&1\end{pmatrix}\)

\(dim(\mathcal N(R^T))=3-2=1\)

In general,

\(\underline y^TR=\underline0\)

\(\Rightarrow\underline y^T=\begin{pmatrix}y_1&y_2&\cdots&y_m\end{pmatrix}=\begin{pmatrix}0&\cdots&0&y_{r+1}&\cdots&y_m\end{pmatrix}\)

\((\underbrace{0,\;...,0}_r,\underbrace1_{r+1},0,...,0)(\underbrace{0,\;...,0}_r,0,\underbrace1_{r+2},...,0)...(\underbrace{0,\;...,0}_r,0,0,...,\underbrace1_m)\)

\(for\;a\;basis\;for\;\mathcal N(R^T)\)

In other words, the last \(m-r\) rows of \(I_m\), the \(m\times m\) identity matrix, form a basis for \(\mathcal N(R^T)\)

\(\therefore dim(\mathcal N(R^T))=m-r\)

\(For\;\mathcal N(A^T),\;since\;A^T\;is\;n\times m\)

\(dim(\mathcal N(A^T))=m-rank(A^T)=m-rank(A)=m-r\)

\(Recall\;EA=R\;where\;E\;is\;invertible\)

\(\underbrace{\begin{bmatrix}1&0&0\\0&1&0\\-1&-1&1\end{bmatrix}}_E\underbrace{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}}_A=\underbrace{\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\0&0&0&0&0\end{bmatrix}}_R\)

\(\underline y^TA=\underline0^T\)

\(\Rightarrow\begin{pmatrix}-1&-1&1\end{pmatrix}\begin{bmatrix}1&3&5&0&7\\0&0&0&1&2\\1&3&5&1&9\end{bmatrix}=\begin{pmatrix}0&0&0&0&0\end{pmatrix}\)

\(Since\;dim(\mathcal N(A^T))=m-r=3-2=1\)

\(\underline y^TA=\underline0^T\Rightarrow\underline y^T=y_1\begin{pmatrix}-1&-1&1\end{pmatrix}\)

\(\begin{pmatrix}-1&-1&1\end{pmatrix}\;forms\;a\;basis\;for\;\mathcal N(A^T)\)

In general, \(EA=R\)

\({\begin{bmatrix}&&\\&E&\\\cdots&\cdots&\cdot_{m-r}\end{bmatrix}}_{m\times m}\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}=\underbrace{\begin{bmatrix}I&F\\0&{\left.0\right|}_{m-r}\end{bmatrix}}_R\)

\({\begin{bmatrix}\cdots\\\vdots\\\cdots\end{bmatrix}}_{last\;m-r\;rows\;of\;E}\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}={\begin{bmatrix}0&\cdots&0\\&&\\0&\cdots&0\end{bmatrix}}_{m-r}\)

Since \(E\) is invertible, all rows of \(E\) are independent

\(dim(\mathcal N(A^T))=m-r\)

The last \(m-r\) rows of \(E\) form a basis for \(\mathcal N(A^T)\)


Fundamental theorem of linear algebra

Here is a summary of the four subspaces associated with an \(m\times n\) matrix \(A\) of rank \(r\).

\(\begin{array}{c|cccc}&dim\;\mathcal C(A^T)&dim\;\mathcal N(A)&dim\;\mathcal C(A)&dim\;\mathcal N(A^T)\\\hline r=m=n&r&0&r&0\\r=n<m&r&0&r&m-r\\r=m<n&r&n-r&r&0\\r<m,\;r<n&r&n-r&r&m-r\end{array}\)

\(dim(C(A^T))=dim(C(R^T))=r,\;subspace\;of\;\mathcal R^n\)

\(dim(C(A))=dim(C(R))=r,\;subspace\;of\;\mathcal R^m\)

\(dim(N(A))=dim(N(R))=n-r,\;subspace\;of\;\mathcal R^n\)

\(dim(N(A^T))=dim(N(R^T))=m-r,\;subspace\;of\;\mathcal R^m\)
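A quick numerical check of these four dimensions for the example matrix used above (a minimal numpy sketch; matrix_rank computes \(r\)):

```python
import numpy as np

A = np.array([[1, 3, 5, 0, 7],
              [0, 0, 0, 1, 2],
              [1, 3, 5, 1, 9]])
m, n = A.shape
r = np.linalg.matrix_rank(A)
print("dim C(A^T) =", r)          # row space:      2
print("dim C(A)   =", r)          # column space:   2
print("dim N(A)   =", n - r)      # nullspace:      3
print("dim N(A^T) =", m - r)      # left nullspace: 1
```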



Fundamental theorem of linear algebra part I

The column space and row space both have dimension r. The nullspaces have dimensions n-r and m-r.

Orthogonality

Dot product or inner product \((over\;\mathcal R^n)\)

\(\underline v\cdot\underline w=\underline v^T\underline w={\begin{bmatrix}v_1&v_2&\cdots&v_n\end{bmatrix}}_{1\times n}{\begin{bmatrix}w_1\\w_2\\\vdots\\w_n\end{bmatrix}}_{n\times1}\)

\(=v_1w_1+v_2w_2+\cdots+v_nw_n\)

\(\underline v^T\underline v=v_1^2+v_2^2+\cdots+v_n^2\overset\Delta={\parallel\underline v\parallel}^2\)

Length of a vector \(\parallel\underline v\parallel={(\underline v^T\underline v)}^\frac12\)

Def \(Two\;vectors\;\underline v\;and\;\underline w\;are\;orthogonal\;if\;\underline v^T\underline w=0\)

[Ex] in \(\mathcal R^2,\;\begin{bmatrix}1\\-1\end{bmatrix}\;and\;\begin{bmatrix}1\\1\end{bmatrix}\;are\;orthogonal.\)

Remark \(\underline v^T\underline w=0\;\Leftrightarrow{\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2={\parallel\underline v+\underline w\parallel}^2\)

Proof \({\parallel\underline v+\underline w\parallel}^2={(\underline v+\underline w)}^T(\underline v+\underline w)\)

\(=\underline v^T\underline v+\underline w^T\underline w+\underline v^T\underline w+\underline w^T\underline v\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2+\underline v^T\underline w+\underline w^T\underline v\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2+\underline v^T\underline w+{(\underline v^T\underline w)}^T,\;\underline v^T\underline w=0\Rightarrow{(\underline v^T\underline w)}^T=0\)

\(={\parallel\underline v\parallel}^2+{\parallel\underline w\parallel}^2\)

Def \(Two\;subspaces\;V\;and\;W\;are\;orthogonal\;if\;\underline v^T\underline w=0\;for\;all\;\underline v\in V\;and\;all\;\underline w\in W\)

\(A\underline x=\begin{bmatrix}row\;1\\row\;2\\\vdots\\row\;m\end{bmatrix}\begin{bmatrix}\\\underline x\\\\\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\Leftrightarrow\begin{array}{c}(row\;1)\cdot\underline x=0\\(row\;2)\cdot\underline x=0\\\vdots\\(row\;m)\cdot\underline x=0\end{array}\)

\(The\;nullspace\;N(A)\;and\;the\;row\;space\;C(A^T)\;are\;orthogonal\;subspaces\;of\;\mathcal R^n\)

Similarly, \(A^T\underline y=\begin{bmatrix}{(column\;1)}^T\\{(column\;2)}^T\\\vdots\\{(column\;n)}^T\end{bmatrix}\begin{bmatrix}\\\underline y\\\\\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\)

\(The\;left\;nullspace\;N(A^T)\;and\;the\;column\;space\;C(A)\;are\;orthogonal\;subspaces\;of\;\mathcal R^m\)


\(N(A)\perp C(A^T)\)

\(N(A^T)\perp C(A)\)


Def: The orthogonal complement \(V^\perp\) of a subspace \(V\) contains every vector that is orthogonal to \(V\).

\(i.e.,\;V^\perp=\{\underline w:\underline w^T\underline v=0\;for\;all\;\underline v\in V\}\)

\(V^\perp\) is read as "V perp".

Remark \(V^\perp\) is also a subspace.

Claim \((C{(A^T))}^\perp=N(A)\)

\(\begin{bmatrix}\cdots\\\cdots\\\cdots\end{bmatrix}\begin{bmatrix}\underline x\end{bmatrix}=\begin{bmatrix}0\\\vdots\\0\end{bmatrix}\)

Proof \(We\;need\;to\;show\;that\;if\;\underline v^T\underline w=0\;for\;all\;\underline w\in N(A)\;then\;\underline v\in C(A^T)\)

\(If\;\underline v\not\in C(A^T),\;then\;we\;can\;add\;\underline v\;as\;an\;extra\;row\;to\;A\)

\(A'=\begin{bmatrix}A\\\underline v\end{bmatrix}\)

\(\begin{bmatrix}A\\\underline v\end{bmatrix}\begin{bmatrix}\underline x\end{bmatrix}=\begin{bmatrix}0\\\vdots\\0\end{bmatrix}\)

\(It\;then\;follows\;that\;N(A')=N(A),\;since\;\underline v^T\underline x=0\;for\;all\;\underline x\in N(A)\)

\(But\;dim(C(A^T))=r\;and\;dim(C({A'}^T))=r+1\)

\(While\;dim(N(A'))=dim(N(A))=n-r,\;which\;yields\;a\;contradiction\)


Fundamental theorem of linear algebra part II

\((C{(A^T))}^\perp=N(A)\)

\((N{(A))}^\perp=C{(A^T)}\)

Similarly

\((C{(A))}^\perp=N(A^T)\)

\((N{(A^T))}^\perp=C{(A)}\)


\(N(A)\;is\;the\;orthogonal\;complement\;of\;C(A^T)\;in\;\mathcal R^n\)

\(N(A^T)\;is\;the\;orthogonal\;complement\;of\;C(A)\;in\;\mathcal R^m\)



Claim \({\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent\;in\;\mathcal R^n\;iff\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\)

Proof \(Let\;A=\begin{bmatrix}{\underline v}_1&{\underline v}_2&\cdots&{\underline v}_n\end{bmatrix}\)

\(\Rightarrow\)

\(If\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent,\;then\;they\;form\;a\;basis\;for\;C(A)\;and\;rank(A)=n\)

\(Hence\;A\;is\;invertible\;and\;A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\in\mathcal R^n\)

\(Then\;C(A)=\mathcal R^n\)

\(Therefore\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\)

\(\Leftarrow\)

\(If\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;span\;\mathcal R^n\;then\;A\underline x=\underline b\;is\;solvable\;for\;every\;\underline b\in\mathcal R^n\)

\(Hence\;elimination\;produces\;no\;zero\;rows,\;there\;are\;n\;pivots\;and\;thus\;rank(A)=n\)

\(The\;nullspace\;N(A)=\{\underline0\},\;which\;implies\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_n\;are\;independent.\)

Remark \(Any\;n\;independent\;vectors\;form\;a\;basis\;in\;\mathcal R^n.\;Also\;any\;n\;vectors\;that\;span\;\mathcal R^n\;are\;a\;basis\;for\;\mathcal R^n.\)


Claim \(If\;V\perp W,\;then\;V\cap W=\{\underline0\}\)

\(Suppose\;\underline v\in V\cap W,\;then\;\;\underline v\in V\;and\;\;\underline v\in W\)

\(We\;have\;{\parallel\underline v\parallel}^2=\underline v^T\underline v=0,\;which\;implies\;\underline v=\underline0\)

Claim \(For\;any\;vector\;\underline x\in\mathcal R^n,\;we\;can\;have\;\underline x={\underline x}_r+{\underline x}_n\)

\(when\;{\underline x}_r\in C(A^T)\;and\;{\underline x}_n\in N(A)\)

Proof \(Let\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r\;be\;a\;basis\;for\;C(A^T)\;and\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;be\;a\;basis\;for\;N(A)\)

\(Suppose\;a_1{\underline v}_1+a_2{\underline v}_2+\cdots+a_r{\underline v}_r\;\;+\;\;b_1{\underline w}_1+b_2{\underline w}_2+\cdots+b_{n-r}{\underline w}_{n-r}=\underline0\)

\(Let\;\underline u=a_1{\underline v}_1+a_2{\underline v}_2+\cdots+a_r{\underline v}_r\)

\(=-b_1{\underline w}_1-b_2{\underline w}_2-\cdots-b_{n-r}{\underline w}_{n-r}\)

\(Then\;\underline u\in C(A^T)\;and\;\underline u\in N(A)\)

\(Since\;C(A^T)\perp N(A),\;we\;must\;have\;\underline u=\underline0\)

\(Since\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r\;are\;a\;basis\;for\;C(A^T)\;and\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;are\;a\;basis\;for\;N(A)\)

\(a_1=a_2=\cdots=a_r\;\;=\;\;b_1=b_2=\cdots=b_{n-r}\;\;=\;\;0\)

\(Therefore,\;{\underline v}_1,\;{\underline v}_2,\;\cdots,\;{\underline v}_r,\;\;{\underline w}_1,\;{\underline w}_2,\;\cdots,\;{\underline w}_{n-r}\;are\;independent\;and\;hence\;form\;a\;basis\;for\;\mathcal R^n\)

\(For\;every\;\underline x\in\mathcal R^n,\;we\;can\;have\;\underline x=c_1{\underline v}_1+c_2{\underline v}_2+\cdots+c_r{\underline v}_r+c_{r+1}{\underline w}_1+c_{r+2}{\underline w}_2+\cdots+c_n{\underline w}_{n-r}\)

\(={\underline x}_r+{\underline x}_n\)

\(where\;{\underline x}_r\in C(A^T)\;and\;{\underline x}_n\in N(A)\)

Remark \(The\;decomposition\;\underline x={\underline x}_r+{\underline x}_n\;is\;unique\)

Remark \(For\;every\;\underline x\in\mathcal R^m,\;\underline x={\underline x}_c+{\underline x}_{ln}\)

\(where\;{\underline x}_c\in C(A)\;and\;{\underline x}_{ln}\in N(A^T)\)

\(For\;\underline x\in\mathcal R^n\)

\(A\underline x=A({\underline x}_r+{\underline x}_n)\)

\(=A{\underline x}_r+\underline0\)

\(=A{\underline x}_r\)

\(Via\;A,\)

\({\underline x}_n\;goes\;to\;\underline0\;\because\;A{\underline x}_n=\underline0\)

\({\underline x}_r\;goes\;to\;the\;column\;space\;\because A{\underline x}_r=A\underline x\)




Claim: Every \(\underline b\) in the column space comes from one and only one vector in the row space.

Proof: \(Let\;\underline b=A{\underline x}_r=A\underline x'_r\in C(A)\;where\;{\underline x}_r\;and\;\underline x'_r\in C(A^T)\)

\(\Rightarrow A(\;{\underline x}_r-\underline x'_r)=\underline0\)

\(\Rightarrow{\underline x}_r-\underline x'_r\in N(A)\)

\(Since\;also\;{\underline x}_r-\underline x'_r\in C(A^T)\;and\;C(A^T)\cap N(A)=\{\underline0\},\;we\;have\;{\underline x}_r-\underline x'_r=\underline0,\;yielding\;{\underline x}_r=\underline x'_r\)

[Ex]

\(A=\begin{bmatrix}1&2\\3&6\end{bmatrix},\;we\;have\;\underline x=\begin{bmatrix}4\\3\end{bmatrix}=\underbrace{\begin{bmatrix}2\\4\end{bmatrix}}_{{\underline x}_r}+\underbrace{\begin{bmatrix}2\\-1\end{bmatrix}}_{{\underline x}_n}\)
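For this 2×2 example, the split can be computed by projecting \(\underline x\) onto the row space, which is spanned by \((1,2)\); a minimal numpy sketch:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 6.]])
x = np.array([4., 3.])
a = np.array([1., 2.])                 # spans the row space C(A^T)
x_r = (a @ x) / (a @ a) * a            # component in the row space
x_n = x - x_r                          # remainder lies in the nullspace
print(x_r, x_n)                        # [2. 4.] [ 2. -1.]
print(A @ x_n)                         # [0. 0.], so x_n is indeed in N(A)
```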

Projections

\(\underline b=\begin{bmatrix}2\\3\\4\end{bmatrix}\)

\(what\;are\;the\;projections\;of\;\underline b\;onto\;the\;z\;axis\;and\;the\;xy\;plane?\)

\({\underline p}_1=\begin{bmatrix}0\\0\\4\end{bmatrix},\;{\underline p}_2=\begin{bmatrix}2\\3\\0\end{bmatrix}\)

\({\underline p}_1={\underline P}_1\underline b=\begin{bmatrix}0&0&0\\0&0&0\\0&0&1\end{bmatrix}\begin{bmatrix}2\\3\\4\end{bmatrix}=\begin{bmatrix}0\\0\\4\end{bmatrix}\)

\({\underline p}_2={\underline P}_2\underline b=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\begin{bmatrix}2\\3\\4\end{bmatrix}=\begin{bmatrix}2\\3\\0\end{bmatrix}\)

\(For\;vectors\;{\underline p}_1+{\underline p}_2=\underline b\)

\(\begin{bmatrix}0\\0\\4\end{bmatrix}+\begin{bmatrix}2\\3\\0\end{bmatrix}=\begin{bmatrix}2\\3\\4\end{bmatrix}\)

\(For\;matrices,\;{\underline P}_1+{\underline P}_2=I\)


Projections onto a line


\(We\;want\;to\;find\;the\;projection\;of\;\underline b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\;onto\;the\;line\;in\;the\;direction\;of\;\underline a=\begin{bmatrix}a_1\\a_2\\\vdots\\a_m\end{bmatrix}\)

\(Let\;\underline p=\widehat x\underline a\)

\(then\;\underline e=\underline b-\underline p=\underline b-\widehat x\underline a,\;\underline e\;and\;\underline a\;should\;be\;orthogonal\)

\(\Rightarrow\underline a^T\underline e=\underline a^T(\underline b-\widehat x\underline a)=0\)

\(\Rightarrow\underline a^T\underline b-\widehat x\underline a^T\underline a=0\)

\(\Rightarrow\widehat x=\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\therefore\underline p=\widehat x\underline a=(\frac{\underline a^T\underline b}{\underline a^T\underline a})\underline a\)

\(If\;\underline b=\underline a,\;then\;\widehat x=1\;and\;\underline p=\underline a\)

\(If\;\underline b\;and\;\underline a\;are\;orthogonal,\;then\;\;\widehat x=0\;and\;\underline p=\underline0\)

\(\underline p=\widehat x\underline a=\underline a\widehat x=\underline a(\frac{\underline a^T\underline b}{\underline a^T\underline a})=(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})\underline b=\underline P\;\underline b\)

\(where\;\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}\;is\;the\;projection\;matrix,\;\underline a\;\underline a^T\rightarrow m\times m\)

\(\underline e=\underline b-\underline p=\underline b-\underline P\;\underline b=(I-\underline P)\underline b\)


[Ex] \(\underline b=\begin{bmatrix}1\\1\\1\end{bmatrix},\;\underline a=\begin{bmatrix}1\\2\\2\end{bmatrix}\)

\(\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}=\frac1{1^2+2^2+2^2}\begin{bmatrix}1\\2\\2\end{bmatrix}\begin{bmatrix}1&2&2\end{bmatrix}\)

\(=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\)

\(\underline p=\underline P\;\underline b=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\begin{bmatrix}1\\1\\1\end{bmatrix}=\frac19\begin{bmatrix}5\\10\\10\end{bmatrix}=\begin{bmatrix}\frac59\\\frac{10}9\\\frac{10}9\end{bmatrix}\)

\(\underline e=\underline b-\underline p\)

\(=\begin{bmatrix}1\\1\\1\end{bmatrix}-\begin{bmatrix}\frac59\\\frac{10}9\\\frac{10}9\end{bmatrix}=\begin{bmatrix}\frac49\\-\frac19\\-\frac19\end{bmatrix}\)

\(\underline a^T\underline e=\begin{bmatrix}1&2&2\end{bmatrix}\begin{bmatrix}\frac49\\-\frac19\\-\frac19\end{bmatrix}=\frac49-\frac29-\frac29=0\)



Note that \(\underline P^2\)

\(=\underline P\;\underline P=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}\)

\(=\frac1{81}\begin{bmatrix}9&18&18\\18&36&36\\18&36&36\end{bmatrix}=\frac19\begin{bmatrix}1&2&2\\2&4&4\\2&4&4\end{bmatrix}=\underline P\)

(since projecting a second time does not change anything.)

In general, \(\underline P^2=(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})(\frac{\underline a\;\underline a^T}{\underline a^T\underline a})=\frac{\underline a(\underline a^T\underline a)\underline a^T}{{(\underline a^T\underline a)}^2}=\frac{(\cancel{\underline a^T\underline a})\underline a\;\underline a^T}{{(\underline a^T\underline a)}^\cancel2}\)

\(=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}=\underline P\)
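The rank-one projection example above is easy to replicate numerically (a minimal numpy sketch, not part of the derivation):

```python
import numpy as np

a = np.array([[1.], [2.], [2.]])
b = np.array([[1.], [1.], [1.]])
P = (a @ a.T) / (a.T @ a)        # P = a a^T / (a^T a)
p = P @ b
e = b - p
print(p.ravel())                 # [5/9, 10/9, 10/9]
print((a.T @ e).item())          # 0: the error is orthogonal to a
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```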


Projection onto a subspace


\(Start\;with\;n\;independent\;vectors\;{\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n\;in\;\mathcal R^m.\)

\(We\;want\;to\;find\;\underline p={\widehat x}_1{\underline a}_1+{\widehat x}_2{\underline a}_2+\cdots+{\widehat x}_n{\underline a}_n\;as\;the\;projection\;of\;a\;given\;vector\;\underline b\)

\(Let\;A={\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}}_{m\times n}\)

\(then\;\;\underline p=\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\\\vdots\\{\widehat x}_n\end{bmatrix}=A\underline{\widehat x}\)

\(The\;error\;vector\;\underline b-A\underline{\widehat x}\;should\;be\;orthogonal\;to\;the\;subspace\;spanned\;by\;{\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n.\;(the\;column\;space\;of\;A)\)

\(\begin{array}{c}\underline a_1^T(\underline b-A\underline{\widehat x})=0\\\underline a_2^T(\underline b-A\underline{\widehat x})=0\\\vdots\\\underline a_n^T(\underline b-A\underline{\widehat x})=0\end{array}\)

\(\Rightarrow\begin{bmatrix}\underline a_1^T\\\underline a_2^T\\\vdots\\\underline a_n^T\end{bmatrix}\begin{bmatrix}\underline b-A\underline{\widehat x}\end{bmatrix}=\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix}\)

\(\Rightarrow A^T(\underline b-A\underline{\widehat x})=\underline0\)

\(\Rightarrow A^T\underline b-A^TA\underline{\widehat x}=\underline0\)

\(\Rightarrow\underbrace{(A^TA)}_{n\times n}\;\underbrace{\underline{\widehat x}}_{n\times1}=\underbrace{A^T\underline b}_{n\times1}\)

The matrix \(A^TA\) is square and symmetric, and it is invertible iff \({\underline a}_1,\;\;{\underline a}_2,\;\cdots\;{\underline a}_n\) are independent.

\(\Rightarrow\underline{\widehat x}={(A^TA)}^{-1}A^T\underline b\)

\(\therefore\underline p=A\underline{\widehat x}={\color[rgb]{0.0, 0.0, 1.0}(}A{(A^TA)}^{-1}A^T{\color[rgb]{0.0, 0.0, 1.0})}\underline b\)

The projection matrix \(\underline P=A{(A^TA)}^{-1}A^T\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline{\widehat x}}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}{\color[rgb]{0.9, 0.76, 0.02}\underline b}\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline p}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}\underline{\widehat x}}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}{\color[rgb]{0.9, 0.76, 0.02}\underline b}\)

\({\color[rgb]{0.9, 0.76, 0.02}\underline P}{\color[rgb]{0.9, 0.76, 0.02}=}{\color[rgb]{0.9, 0.76, 0.02}A}{\color[rgb]{0.9, 0.76, 0.02}{(A^TA)}}^{\color[rgb]{0.9, 0.76, 0.02}{-1}}{\color[rgb]{0.9, 0.76, 0.02}A}^{\color[rgb]{0.9, 0.76, 0.02}T}\)

compared with the case n=1,

\(\widehat x=\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\underline p=\underline a\frac{\underline a^T\underline b}{\underline a^T\underline a}\)

\(\underline P=\frac{\underline a\;\underline a^T}{\underline a^T\underline a}\)

Note \(\underline{\mathrm P}^2=\mathrm A{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}\cancel{\mathrm A^{\mathrm T}\mathrm A}\cancel{{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}}\mathrm A^{\mathrm T}\)

\(=\mathrm A{(\mathrm A^{\mathrm T}\mathrm A)}^{-1}\mathrm A^{\mathrm T}=\underline{\mathrm P}\)


[Ex] \(A=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\;and\;\underline b=\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(A^TA=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}=\begin{bmatrix}3&3\\3&5\end{bmatrix}\)

\(A^T\underline b=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(A^TA\widehat x=A^T\underline b\)

\(\Rightarrow\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}{\widehat x}_1\\{\widehat x}_2\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

\(\underline p=A\underline{\widehat x}=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}5\\-3\end{bmatrix}=\begin{bmatrix}5\\2\\-1\end{bmatrix}\)

\(\underline e=\underline b-\underline p=\begin{bmatrix}6\\0\\0\end{bmatrix}-\begin{bmatrix}5\\2\\-1\end{bmatrix}=\begin{bmatrix}1\\-2\\1\end{bmatrix}\)

\(A^T\underline e=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1\\-2\\1\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}\)

\(\underline P=A{(A^TA)}^{-1}A^T\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}{(\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix})}^{-1}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}{(\begin{bmatrix}3&3\\3&5\end{bmatrix})}^{-1}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}\frac56&-\frac12\\-\frac12&\frac12\end{bmatrix}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(=\frac16\begin{bmatrix}5&-3\\3&0\\-1&3\end{bmatrix}\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\)

\(\underline P=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}\)

\(\underline P^2=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}=\frac16\begin{bmatrix}5&2&-1\\2&2&2\\-1&2&5\end{bmatrix}=\underline P\)
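The same example, done with numpy (a sketch; np.linalg.solve applies the normal equations directly):

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])
b = np.array([6., 0., 0.])
x_hat = np.linalg.solve(A.T @ A, A.T @ b)      # A^T A x_hat = A^T b
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(x_hat)                                   # [ 5. -3.]
print(A @ x_hat)                               # p = [ 5.  2. -1.]
print(np.allclose(P @ P, P))                   # True
```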


Claim \(rank(A^TA)=rank(AA^T)=rank(A)\)

Proof:

We first show that \(A^TA\) has the same nullspace as \(A\),

\(i.e.,\;A^TA\underline x=\underline0\;\Leftrightarrow\;A\underline x=\underline0\)

\(''\Leftarrow''\;A\underline x=\underline0\;\Rightarrow A^TA\underline x=A^T\underline0\Rightarrow A^TA\underline x=\underline0\)

\(''\Rightarrow''\;A^TA\underline x=\underline0\)

\(\Rightarrow\underline x^TA^TA\underline x=\underline x^T\underline0\;=0\)

\(\Rightarrow{(A\underline x)}^TA\underline x=0\)

\(\Rightarrow\left\|A\underline x\right\|^2=0\)

\(\Rightarrow A\underline x=\underline0\)

\(Therefore,\;N(A^TA)=N(A)\;which\;implies,\)

\(n-rank(A^TA)=n-rank(A)\)

\(\Rightarrow rank(A^TA)=rank(A)\)

\(Similarly,\;replacing\;A\;with\;A^T,\;we\;get\;rank(AA^T)=rank(A^T)=rank(A)\)
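A one-line numerical confirmation of the rank identities (a minimal sketch with numpy, using an example matrix from earlier):

```python
import numpy as np

A = np.array([[1., 3., 0., 2., -1.],
              [0., 0., 1., 4., -3.],
              [1., 3., 1., 6., -4.]])
print(np.linalg.matrix_rank(A),
      np.linalg.matrix_rank(A.T @ A),
      np.linalg.matrix_rank(A @ A.T))   # all three equal 2
```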



Remark 1: \(A^TA\;is\;invertible\;iff\;A\;has\;independent\;columns.\)

Proof: "\(\Rightarrow\)" \(A\) has independent columns,

\(\Rightarrow rank(A)=n\)

\(\Rightarrow rank(A^TA)=n\)

\(\Rightarrow A^TA\;is\;invertible\)

"\(\Leftarrow\)"

\(A^TA\;is\;invertible\)

\(\Rightarrow rank(A^TA)=n\)

\(\Rightarrow rank(A)=n\)

\(\Rightarrow A\;has\;independent\;columns\)


Remark 2: \(When\;A\;has\;independent\;columns,\;A^TA\;is\;square,\;symmetric,\;and\;invertible.\)

Least Squares Approximations

\(It\;often\;happens\;that\;A\underline x=\underline b\;has\;no\;solution.\)

\(\Rightarrow We\;cannot\;always\;get\;the\;error\;\underline e\;=\;\underline b-A\underline x\;down\;to\;zero.\)

\(In\;this\;case,\;we\;want\;to\;make\;the\;length\;of\;\underline e,\;or\;\left\|\underline e\right\|^2,\;as\;small\;as\;possible.\)

\(\Rightarrow\;least\;squares\;solution\)


[Ex] Find the closest line to the points (0,6), (1,0), and (2,0).



\(b=C+Dt\)

\(\begin{array}{c}C+D\cdot0=6\\C+D\cdot1=0\\C+D\cdot2=0\end{array}\)

\(A=\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix},\;\underline x=\begin{bmatrix}C\\D\end{bmatrix},\;\underline b=\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(A\underline x=\underline b\;is\;not\;solvable.\;We\;want\;to\;make\;\left\|\underline b-A\underline x\right\|^2\;as\;small\;as\;possible.\;How?\)

\(By\;geometry,\;the\;best\;fit\;is\;\widehat{\underline x}\;such\;that\;\underline p=A\widehat{\underline x}\;is\;the\;projection\;of\;\underline b\;onto\;\mathcal C(A)\)



\(A^T(\underline b-A\widehat{\underline x})=\underline0\)

\(\Rightarrow A^TA\widehat{\underline x}=A^T\underline b\)

\(\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}1&0\\1&1\\1&2\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}1&1&1\\0&1&2\end{bmatrix}\begin{bmatrix}6\\0\\0\end{bmatrix}\)

\(\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(\widehat{\underline x}=\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

Therefore, the best line for the 3 points is b=5-3t.

Recall \(\underline x={\underline x}_r+{\underline x}_n,\;where\;\underline{\;x}\in\mathcal R^n,\;{\underline x}_r\in\mathcal C(A^T),\;{\underline x}_n\in\mathcal N(A)\)

By linear algebra, \(\underline b=\underline p+\underline e,\;where\;\underline b\in\mathcal R^m,\;\underline p\in\mathcal C(A),\;\underline e\in\mathcal N(A^T),\;and\;\underline p^T\underline e=0\)

\(We\;know\;that\;A\underline x=\underline b=\underline p+\underline e\;is\;impossible\;and\;A\widehat{\underline x}=\underline p\;is\;solvable.\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|A\underline x-\underline p-\underline e\right\|^2\)

\(={(A\underline x-\underline p-\underline e)}^T(A\underline x-\underline p-\underline e)\)

\(=\left\|A\underline x-\underline p\right\|^2+\left\|\underline e\right\|^2-{(A\underline x-\underline p)}^T\underline e-\underline e^T(A\underline x-\underline p)\)

\({(A\underline x-\underline p)}^T\underline e=0\;since\;(A\underline x-\underline p)\in\mathcal C(A)\;and\;\underline e\in\mathcal N(A^T)\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|A\underline x-\underline p\right\|^2+\left\|\underline e\right\|^2\)

\(\geq\left\|\underline e\right\|^2\)

\(\therefore\left\|A\underline x-\underline b\right\|^2\;is\;minimized\;when\;\underline x=\widehat{\underline x}\)

\(i.e.,\;\underset{\underline x}{min}\left\|A\underline x-\underline b\right\|^2=\left\|A\widehat{\underline x}-\underline b\right\|^2\)

\(=\left\|A\widehat{\underline x}-\underline p\right\|^2+\left\|\underline e\right\|^2=\left\|\underline e\right\|^2\)

\(where\;A^TA\widehat{\underline x}=A^T\underline b\;and\;\underline p=A\widehat{\underline x}\)



\(By\;calculus,\;E=\left\|A\underline x-\underline b\right\|^2\)

\(={(C+D\cdot0-6)}^2+{(C+D\cdot1)}^2+{(C+D\cdot2)}^2\)

\(\frac{\partial E}{\partial C}=2(C-6)+2(C+D)+2(C+2D)=0\)

\(\frac{\partial E}{\partial D}=0+2(C+D)+2(C+2D)\cdot2=0\)

\(\Rightarrow3C+3D=6\)

\(3C+5D=0\)

\(\Rightarrow\begin{bmatrix}3&3\\3&5\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}6\\0\end{bmatrix}\)

\(A^TA\widehat{\underline x}=A^T\underline b\)

\(\therefore\widehat{\underline x}=\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}5\\-3\end{bmatrix}\)

\(The\;partial\;derivatives\;of\;\left\|A\underline x-\underline b\right\|^2\;are\;zero\;when\;A^TA\widehat{\underline x}=A^T\underline b.\)

\(b=5-3t\;is\;best\;fit.\)

\(At\;t=0,1,2,\;this\;line\;goes\;through\;5,\;2,\;-1;\;it\;cannot\;go\;through\;6,\;0,\;0.\)

\(The\;errors\;are\;1,\;-2,\;1,\;which\;form\;the\;vector\;\underline e.\)

\(\underset{\underline x}{min}\left\|A\underline x-\underline b\right\|^2=\left\|\underline e\right\|^2=6\)

\(When\;A\underline x=\underline b\;is\;not\;solvable,\;the\;least\;squares\;solution\;\widehat{\underline x}\;satisfies\)

\(A^TA\widehat{\underline x}=A^T\underline b\)


Fitting a straight line

\(Problem:\;Fit\;heights\;b_1,b_2,\cdots,b_m\;at\;times\;t_1,t_2,\cdots,t_m\;by\;a\;straight\;line\;C+Dt\)

solution: \(\left\{\begin{array}{l}C+Dt_1=b_1\\C+Dt_2=b_2\\\vdots\\C+Dt_m=b_m\end{array}\right.\)

\(\begin{bmatrix}1&t_1\\1&t_2\\\vdots&\vdots\\1&t_m\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\)

\(A\underline x=\underline b\)

\(This\;is\;generally\;not\;solvable.\;Best\;fit:\;A^TA\widehat{\underline x}=A^T\underline b\)

\(A^TA=\begin{bmatrix}1&1&\cdots&1\\t_1&t_2&\cdots&t_m\end{bmatrix}\begin{bmatrix}1&t_1\\1&t_2\\\vdots&\vdots\\1&t_m\end{bmatrix}=\begin{bmatrix}m&\underset i{\sum t_i}\\\underset i{\sum t_i}&\underset i{\sum t_i^2}\end{bmatrix}\)

\(A^T\underline b=\begin{bmatrix}1&1&\cdots&1\\t_1&t_2&\cdots&t_m\end{bmatrix}\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}=\begin{bmatrix}\underset i{\sum b_i}\\\underset i{\sum t_ib_i}\end{bmatrix}\)

\(solve\;\begin{bmatrix}m&\underset i{\sum t_i}\\\underset i{\sum t_i}&\underset i{\sum t_i^2}\end{bmatrix}\begin{bmatrix}C\\D\end{bmatrix}=\begin{bmatrix}\underset i{\sum b_i}\\\underset i{\sum t_ib_i}\end{bmatrix}\)

\(for\;C\;and\;D,\;the\;best\;\widehat{\underline x}=(C,D)\;minimizes\;\left\|A\underline x-\underline b\right\|^2\)

\(\left\|A\underline x-\underline b\right\|^2=\left\|\underline e\right\|^2=\sum_{i=1}^m{(C+Dt_i-b_i)}^2\)
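For concreteness, a small numpy sketch fitting a line through hypothetical data (the values of \(t_i\) and \(b_i\) below are made up for illustration):

```python
import numpy as np

t = np.array([0., 1., 2., 3.])                 # hypothetical times
b = np.array([6., 0., 0., 2.])                 # hypothetical heights
A = np.column_stack([np.ones_like(t), t])      # columns: all ones, and t
C, D = np.linalg.solve(A.T @ A, A.T @ b)       # normal equations A^T A x_hat = A^T b
print(C, D)
print(np.linalg.lstsq(A, b, rcond=None)[0])    # numpy's least squares gives the same C, D
```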


Fitting a Parabola

\(Problem:\;Fit\;heights\;b_1,b_2,\cdots,b_m\;at\;t_1,t_2,\cdots,t_m\;by\;a\;parabola\;C+Dt+Et^2\)

Solution: \(\left\{\begin{array}{l}C+Dt_1+Et_1^2=b_1\\C+Dt_2+Et_2^2=b_2\\\vdots\\C+Dt_m+Et_m^2=b_m\end{array}\right.\)

\(A=\begin{bmatrix}1&t_1&t_1^2\\1&t_2&t_2^2\\\vdots&\vdots&\vdots\\1&t_m&t_m^2\end{bmatrix},\;\underline x=\begin{bmatrix}C\\D\\E\end{bmatrix},\;\underline b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\)

The same least squares approach works for other fitting functions, e.g. \(C+Dt_m+E\cdot e^{t_m}\;or\;C+Dt_m+E\cdot\log(t_m)\); only the columns of \(A\) change.

Orthogonal Bases and Gram-Schmidt


Def \(The\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;orthogonal\;if\;\underline q_i^T{\underline q}_j=0\;whenever\;i\neq j\)

Claim \(If\;nonzero\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;orthogonal,\;then\;they\;are\;independent.\)

Proof \(consider\;x_1{\underline q}_1+x_2{\underline q}_2+\;\dots+x_n{\underline q}_n=\underline0\)

\(\Rightarrow\underline q_1^T(x_1{\underline q}_1+x_2{\underline q}_2+\;\dots+x_n{\underline q}_n)=x_1\underline q_1^T{\underline q}_1+\underbrace{x_2\underline q_1^T{\underline q}_2}_{=0}+\;\dots+\underbrace{x_n\underline q_1^T{\underline q}_n}_{=0}\)

\(=\underline q_1^T\underline0=0\)

\(\Rightarrow x_1\underline q_1^T{\underline q}_1=0\)

\(\Rightarrow x_1\left\|{\underline q}_1\right\|^2=0\)

\(\left\|{\underline q}_1\right\|^2\neq0\;\Rightarrow\;x_1=0\)

\(Similarly,\;x_2=x_3=\dots=x_n=0\)


Def \(The\;vectors\;{\underline q}_1,{\underline q}_2,\dots{\underline q}_n\;are\;{\color[rgb]{0.0, 0.0, 1.0}\boldsymbol{orthonormal}}\;if\)

\(\underline q_i^T{\underline q}_j=\left\{\begin{array}{l}0,\;if\;i\neq j\\1,\;if\;i=j\end{array}\right.\)

\(A\;matrix\;Q_{m\times n}\;with\;orthonormal\;columns\;satisfies\)

\(Q^TQ={\begin{bmatrix}\underline q_1^T\\\underline q_2^T\\\vdots\\\underline q_n^T\end{bmatrix}}_{n\times m}{\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}\)

\(={\begin{bmatrix}1&0&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{bmatrix}}_{n\times n}=I\)

\(When\;Q\;is\;a\;square\;matrix\;(m=n)\;Q^TQ=I\;means\;that\;Q^T=Q^{-1}.\)

\(In\;this\;case,\;Q\;is\;called\;an\;{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}t}{\color[rgb]{0.68, 0.46, 0.12}h}{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}g}{\color[rgb]{0.68, 0.46, 0.12}o}{\color[rgb]{0.68, 0.46, 0.12}n}{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}l}{\color[rgb]{0.68, 0.46, 0.12}\;}{\color[rgb]{0.68, 0.46, 0.12}m}{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}t}{\color[rgb]{0.68, 0.46, 0.12}r}{\color[rgb]{0.68, 0.46, 0.12}i}{\color[rgb]{0.68, 0.46, 0.12}x}.\)


[Ex] (Rotation matrix) \(Q=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\)

\(Q\begin{bmatrix}1\\0\end{bmatrix}=\begin{bmatrix}\cos\theta\\\sin\theta\end{bmatrix}\)

\(Q\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}-\sin\theta\\\cos\theta\end{bmatrix}\)



\(Q^{-1}=Q^T=\begin{bmatrix}\cos\theta&\sin\theta\\-\sin\theta&\cos\theta\end{bmatrix}\)

\(Q^TQ=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


reflection across the y axis, \(Q_1=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

\(Q_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\y\end{bmatrix}\)

\({Q_1}^{-1}={Q_1}^T=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

\({Q_1}^T{Q_1}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


reflection across \(45^\circ\)-line \(Q_2=\begin{bmatrix}0&1\\1&0\end{bmatrix}\)

\(Q_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}y\\x\end{bmatrix}\)

\(Q_2^TQ_2=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)

\(Q_2^{-1}=Q_2^T=\begin{bmatrix}0&1\\1&0\end{bmatrix}=Q_2\)


Claim \(If\;Q\;has\;orthonormal\;columns.\;i.e.,\;Q^TQ=I,\;then\)

(i) \(\left\|Q\underline x\right\|=\left\|\underline x\right\|\)

(ii) \({(Q\underline x)}^T(Q\underline y)=\underline x^T\underline y\)

Proof \((i)\;\left\|Q\underline x\right\|^2=(Q\underline x{)^T(Q\underline x)}\)

\(=(\underline x^TQ^T)(Q\underline x)\)

\(=\underline x^T(Q^TQ)\underline x\)

\(=\underline x^TI\underline x\)

\(=\underline x^T\underline x\)

\(=\left\|\underline x\right\|^2\)

Proof: \((ii)\;(Q\underline x{)^T(Q\underline y)}\)

\(=(\underline x^TQ^T)(Q\underline y)\)

\(=\underline x^T(Q^TQ)\underline y\)

\(=\underline x^TI\underline y\)

\(=\underline x^T\underline y\)


Projection using Orthonormal Bases

Suppose the basis vectors are orthonormal, \(Q=\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}\;with\;Q^TQ=I\)

\(The\;least\;squares\;solution\;of\;Q\underline x=\underline b\)

\(Q^TQ\widehat{\underline x}=Q^T\underline b\)

\(\Rightarrow I\widehat{\underline x}=Q^T\underline b\)

\(\Rightarrow\widehat{\underline x}=Q^T\underline b\)

\(Q^T(\underline b-Q\widehat{\underline x})=\underline0\)

The projection vector,

\(\left\{\begin{array}{l}\underline p=Q\widehat{\underline x}\\\widehat{\underline x}=Q^T\underline b\end{array}\right.\Rightarrow\underline p=Q\widehat{\underline x}=QQ^T\underline b\)

\(={\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}{\begin{bmatrix}\underline q_1^T\\\underline q_2^T\\\vdots\\\underline q_n^T\end{bmatrix}}_{n\times m}{\underline b}_{m\times1}\)

\(={\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}}_{m\times n}{\begin{bmatrix}\underline q_1^T\underline b\\\underline q_2^T\underline b\\\vdots\\\underline q_n^T\underline b\end{bmatrix}}_{n\times1}\)

\(={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)

recall: projection vector,

\(\underline p=\frac{\underline a^T\underline b}{\underline a^T\underline a}\underline a=\frac{\underline a^T\underline b}{\left\|\underline a\right\|^2}\underline a\)

\(={(\frac{\underline a}{\left\|\underline a\right\|})}^T\underline b\frac{\underline a}{\left\|\underline a\right\|}\)

\(=(\underline q^T\underline b)\underline q\)

\(\underline q^T\underline b=\left\|\underline q\right\|\left\|\underline b\right\|\cos\theta=\left\|\underline b\right\|\cos\theta\)

\({\underline q}_1,\;{\underline q}_2,\;\cdots,\;{\underline q}_n\;orthonormal\)

\(\underline p={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)

The projection matrix \(\underline P=QQ^T\)

\(When\;Q\;is\;a\;square\;matrix\;(m=n),\;the\;subspace\;(\mathcal C(Q))\;is\;the\;whole\;vector\;space\;\mathcal R^n\;and\;Q^T=Q^{-1}\)

\(\underline{\widehat x}=Q^T\underline b=Q^{-1}\underline b\)

\(which\;is\;the\;exact\;solution\;to\;Q\underline x=\underline b\)

\(In\;this\;case,\;\underline P=QQ^T=QQ^{-1}=I\)

\(and\;the\;projection\;of\;\underline b\;is\;\underline b\;itself.\;i.e.,\;\underline p=\underline b\)

\(Therefore,\;\underline b={\underline q}_1(\underline q_1^T\underline b)+{\underline q}_2(\underline q_2^T\underline b)+\cdots+{\underline q}_n(\underline q_n^T\underline b)\)
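A short numpy check of the expansion \(\underline b=\sum_i{\underline q}_i(\underline q_i^T\underline b)\) for a square orthogonal matrix (the rotation matrix is used here only as a convenient example):

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthonormal columns, Q^T Q = I
b = np.array([2., 5.])
coeffs = Q.T @ b                                  # the numbers q_i^T b
print(np.allclose(Q @ coeffs, b))                 # True: b = sum of q_i (q_i^T b)
```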


Gram–Schmidt (Orthogonalization) process

\(Given\;n\;independent\;vector\;{\underline a}_1,\;{\underline a}_2,\;\cdots,\;{\underline a}_n\)

\(we\;want\;to\;find\;n\;orthonormal\;vectors\;{\underline q}_1,\;{\underline q}_2,\;\cdots,\;{\underline q}_n\;with\;the\;same\;span.\)

1. \({\underline A}_1={\underline a}_1,\;then\;{\underline q}_1=\frac{{\underline A}_1}{\left\|{\underline A}_1\right\|}\)

2. \({\underline A}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1,\;then\;{\underline q}_2=\frac{{\underline A}_2}{\left\|{\underline A}_2\right\|}\)

3. \({\underline A}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2,\;then\;{\underline q}_3=\frac{{\underline A}_3}{\left\|{\underline A}_3\right\|}\)

In general, \({\underline A}_i={\underline a}_i-\sum_{j=1}^{i-1}(\underline q_j^T{\underline a}_i){\underline q}_j,\;{\underline q}_i=\frac{{\underline A}_i}{\left\|{\underline A}_i\right\|},\;for\;i=1,2,3,\dots,n\)



We now come to a fundamentally important algorithm, which is called the Gram-Schmidt orthogonalization procedure. This algorithm makes it possible to construct, for each list of linearly independent vectors (resp. basis), a corresponding orthonormal list (resp. orthonormal basis).

If \((v_1,\ldots,v_m) \) is a list of linearly independent vectors in \(V\), then there exists an orthonormal list \((e_1,\ldots,e_m) \) such that

\[ span(v_1,\ldots,v_k) = span(e_1,\ldots,e_k), \quad \text{for all \(k=1,\ldots,m\).} \]

The proof is constructive, that is, we will actually construct vectors \(e_1,\ldots,e_m \) having the desired properties. Since \((v_1,\ldots,v_m) \) is linearly independent, \(v_k\neq 0 \) for each \(k=1,2,\ldots,m\). Set \(e_1=\frac{v_1}{\left\|v_1\right\|}\). Then \(e_{1} \) is a vector of norm 1 and satisfies Equation above for \(k=1\). Next, set

\[e_2=\frac{v_2-\left\langle v_2,e_1\right\rangle e_1}{\left\|v_2-\left\langle v_2,e_1\right\rangle e_1\right\|}.\]

This is, in fact, the normalized version of the orthogonal decomposition Equation. I.e.,

\[w=v_2-\left\langle v_2,e_1\right\rangle e_1,\]

where \(w\bot e_1\). Note that \(\left\|e_2\right\|=1\) and \(span(e_1,e_2)=span(v_1,v_2)\).

Now, suppose that \(e_1,\ldots,e_{k-1} \) have been constructed such that \((e_1,\ldots,e_{k-1})\) is an orthonormal list and \(span(v_1,\ldots,v_{k-1}) = span(e_1,\ldots,e_{k-1})\). Then define
\[e_k=\frac{v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|}\]

Since \((v_1,\ldots,v_k) \) is linearly independent, we know that \(v_k\not\in span(v_1,\ldots,v_{k-1})\). Hence, we also know that \(v_k\not\in span(e_1,\ldots,e_{k-1})\). It follows that the norm in the definition of \(e_k \) is not zero, and so \(e_k \) is well-defined (i.e., we are not dividing by zero). Note that a vector divided by its norm has norm 1 so that \(\left\|e_k\right\|=1\). Furthermore,

\[\begin{array}{rl}\left\langle e_k,e_i\right\rangle&=\left\langle\frac{v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|},e_i\right\rangle\end{array}\]

\[=\frac{\left\langle v_k,e_i\right\rangle-\left\langle v_k,e_i\right\rangle}{\left\|v_k-\left\langle v_k,e_1\right\rangle e_1-\left\langle v_k,e_2\right\rangle e_2-\cdots-\left\langle v_k,e_{k-1}\right\rangle e_{k-1}\right\|}=0\]

for each \(1\le i<k\). Hence, \((e_1,\ldots,e_k) \) is orthonormal.


[Ex 1] \({\underline a}_1=\begin{bmatrix}1\\-1\\0\end{bmatrix},\;{\underline a}_2=\begin{bmatrix}2\\0\\-2\end{bmatrix},\;{\underline a}_3=\begin{bmatrix}3\\-3\\3\end{bmatrix}\)

\({\underline A}_1={\underline a}_1=\begin{bmatrix}1\\-1\\0\end{bmatrix}\)

\({\underline q}_1=\frac{{\underline A}_1}{\left\|{\underline A}_1\right\|}=\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}\)

\({\underline A}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1=\begin{bmatrix}2\\0\\-2\end{bmatrix}-\frac2{\sqrt2}\cdot\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}=\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\({\underline q}_2=\frac{{\underline A}_2}{\left\|{\underline A}_2\right\|}=\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\({\underline A}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2\)

\(=\begin{bmatrix}3\\-3\\3\end{bmatrix}-\frac6{\sqrt2}\cdot\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix}-\frac{-6}{\sqrt6}\cdot\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix}\)

\(=\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\({\underline q}_3=\frac{{\underline A}_3}{\left\|{\underline A}_3\right\|}=\frac1{\sqrt3}\begin{bmatrix}1\\1\\1\end{bmatrix}\)

\(\therefore{\underline q}_1=\frac1{\sqrt2}\begin{bmatrix}1\\-1\\0\end{bmatrix},\;{\underline q}_2=\frac1{\sqrt6}\begin{bmatrix}1\\1\\-2\end{bmatrix},\;{\underline q}_3=\frac1{\sqrt3}\begin{bmatrix}1\\1\\1\end{bmatrix}\)
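The steps above translate directly into code. Below is a small, illustrative NumPy implementation of classical Gram–Schmidt (a sketch, not a numerically robust production routine); running it on \({\underline a}_1,{\underline a}_2,{\underline a}_3\) from Ex 1 reproduces \({\underline q}_1,{\underline q}_2,{\underline q}_3\).

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize a list of independent vectors."""
    qs = []
    for a in vectors:
        A = np.asarray(a, dtype=float).copy()
        for q in qs:                      # subtract components along earlier q's
            A = A - (q @ a) * q
        qs.append(A / np.linalg.norm(A))
    return np.column_stack(qs)

a1 = np.array([1.0, -1.0, 0.0])
a2 = np.array([2.0,  0.0, -2.0])
a3 = np.array([3.0, -3.0, 3.0])
Q = gram_schmidt([a1, a2, a3])

print(np.round(Q, 4))                    # columns are q1, q2, q3 from Ex 1
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: the columns are orthonormal
```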


[Ex 2]

Let \(u_1=\begin{bmatrix}1\\0\\-1\\\end{bmatrix} ,u_2=\begin{bmatrix}2\\-1\\0\\\end{bmatrix} ,u_3=\begin{bmatrix}1\\2\\1\\\end{bmatrix}\).

To find the required orthonormal basis \(\{w_1,w_2,w_3\}\),

first we have \(w_1=\frac{u_1}{\Arrowvert u_1\Arrowvert}=\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}.\)

Second, find \(u_2-(w_1\cdot u_2)w_1\) as follows: \(u_2-(w_1\cdot u_2)w_1=\begin{bmatrix}2\\-1\\0\end{bmatrix}-\sqrt2\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}=\begin{bmatrix}1\\-1\\1\end{bmatrix}.\)

By taking the dot product, you can see that \(w_1\) is orthogonal to the above vector: \(w_1\cdot[u_2-(w_1\cdot u_2)w_1]=w_1\cdot u_2-(w_1\cdot u_2)w_1\cdot w_1=0\) since \(w_1\) is a unit vector.

So we can take \(w_2=\frac{u_2-(w_1\cdot u_2)w_1}{\Arrowvert u_2-(w_1\cdot u_2)w_1\Arrowvert}=\begin{bmatrix}\frac1{\sqrt3}\\-\frac1{\sqrt3}\\\frac1{\sqrt3}\end{bmatrix}.\)

Finally, find \(u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2\) as follows:

\(u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2=\begin{bmatrix}1\\2\\1\end{bmatrix}-0\cdot\begin{bmatrix}\frac1{\sqrt2}\\0\\-\frac1{\sqrt2}\end{bmatrix}-0\cdot\begin{bmatrix}\frac1{\sqrt3}\\-\frac1{\sqrt3}\\\frac1{\sqrt3}\end{bmatrix}=\begin{bmatrix}1\\2\\1\end{bmatrix}.\)

By taking the dot product, you can again see that \(w_1\) and \(w_2\) are orthogonal to the above vector.

So we can take \(w_3=\frac{u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2}{\Arrowvert u_3-(w_1\cdot u_3)w_1-(w_2\cdot u_3)w_2\Arrowvert}=\begin{bmatrix}\frac1{\sqrt6}\\\frac2{\sqrt6}\\\frac1{\sqrt6}\end{bmatrix}.\)

\(w_1=\frac1{\sqrt2}\begin{bmatrix}1\\0\\-1\end{bmatrix},\;w_2=\frac1{\sqrt3}\begin{bmatrix}1\\-1\\1\end{bmatrix},\;w_3=\frac1{\sqrt6}\begin{bmatrix}1\\2\\1\end{bmatrix}\)


The Factorization A=QR


QR-Factorization

One of the main virtues of orthogonal matrices is that they can be easily inverted—the transpose is the inverse. This fact, combined with the factorization theorem in this section, provides a useful way to simplify many matrix calculations (for example, in least squares approximation).



Proposition

Let \(A\) be a \(K\times L\) matrix. If the columns of \(A\) are linearly independent, then \(A\) can be factorized as,

\(A=QR\)

where \(Q\) is a \(K\times L\) matrix whose columns form an orthonormal set, and \(R\) is an \(L\times L\) upper triangular matrix whose diagonal entries are strictly positive.


Proposition

Under the assumptions of the previous proposition, the \(QR\) decomposition is unique, that is, the matrices \(Q\) and \(R\) satisfying the stated properties are unique.


\(Given\;independent\;vectors\;{\underline a}_1,{\underline a}_2,{\underline a}_3,\;Gram-Schmidt\;constructs\;orthonormal\;{\underline q}_1,{\underline q}_2,{\underline q}_3\;(via\;the\;intermediate\;vectors\;{\underline A}_1,{\underline A}_2,{\underline A}_3)\)

\({\underline a}_1,{\underline A}_1\;and\;{\underline q}_1\;span\;the\;same\;subspace.\)

\({\underline a}_1,{\underline a}_2\;and\;{\underline A}_1,{\underline A}_2\;and\;{\underline q}_1,{\underline q}_2\;span\;the\;same\;subspace.\)

\({\underline a}_1,{\underline a}_2,{\underline a}_3\;and\;{\underline A}_1,{\underline A}_2,{\underline A}_3\;and\;{\underline q}_1,{\underline q}_2,{\underline q}_3\;span\;the\;same\;subspace.\)

\(Therefore,\;we\;can\;have\)

\({\underline a}_1=(\underline q_1^T{\underline a}_1){\underline q}_1\)

\({\underline a}_2=(\underline q_1^T{\underline a}_2){\underline q}_1+(\underline q_2^T{\underline a}_2){\underline q}_2\)

\({\underline a}_3=(\underline q_1^T{\underline a}_3){\underline q}_1+(\underline q_2^T{\underline a}_3){\underline q}_2+(\underline q_3^T{\underline a}_3){\underline q}_3\)

\(\underbrace{{\begin{bmatrix}{\underline a}_1&{\underline a}_2&{\underline a}_3\end{bmatrix}}_{m\times3}}_A=\underbrace{{\begin{bmatrix}{\underline q}_1&{\underline q}_2&{\underline q}_3\end{bmatrix}}_{m\times3}}_{Q,\;orthonormal\;columns}\underbrace{{\begin{bmatrix}\underline q_1^T{\underline a}_1&\underline q_1^T{\underline a}_2&\underline q_1^T{\underline a}_3\\0&\underline q_2^T{\underline a}_2&\underline q_2^T{\underline a}_3\\0&0&\underline q_3^T{\underline a}_3\end{bmatrix}}_{3\times3}}_{R,\;upper\;triangular}\)

\(In\;general,\;from\;independent\;vectors\;{\underline a}_1,{\underline a}_2,\dots,{\underline a}_n,\;Gram-Schmidt\;constructs\;{\underline q}_1,{\underline q}_2,\dots,{\underline q}_n\)

\(We\;can\;have\;A_{m\times n}=Q_{m\times n}R_{n\times n}\)

\(A_{m\times n}=\begin{bmatrix}{\underline a}_1&{\underline a}_2&\cdots&{\underline a}_n\end{bmatrix}\)

\(Q_{m\times n}=\begin{bmatrix}{\underline q}_1&{\underline q}_2&\cdots&{\underline q}_n\end{bmatrix}\)

\(R_{n\times n}=upper\;triangular\)

\(A=QR\Rightarrow\)

\(Q^TA=Q^TQR=IR=R\)

\(Note\;{\underline A}_1=\left\|{\underline A}_1\right\|{\underline q}_1={\underline a}_1\)

\(\Rightarrow{\underline a}_1=\left\|{\underline A}_1\right\|{\underline q}_1\)

\(\therefore\underline q_1^T{\underline a}_1=\left\|{\underline A}_1\right\|\)

\({\underline A}_2=\left\|{\underline A}_2\right\|{\underline q}_2={\underline a}_2-(\underline q_1^T{\underline a}_2){\underline q}_1\)

\(\Rightarrow{\underline a}_2=(\underline q_1^T{\underline a}_2){\underline q}_1+\left\|{\underline A}_2\right\|{\underline q}_2\)

\(\therefore\underline q_2^T{\underline a}_2=\left\|{\underline A}_2\right\|\)

\({\underline A}_3=\left\|{\underline A}_3\right\|{\underline q}_3={\underline a}_3-(\underline q_1^T{\underline a}_3){\underline q}_1-(\underline q_2^T{\underline a}_3){\underline q}_2\)

\(\Rightarrow{\underline a}_3=(\underline q_1^T{\underline a}_3){\underline q}_1+(\underline q_2^T{\underline a}_3){\underline q}_2+\left\|{\underline A}_3\right\|{\underline q}_3\)

\(\therefore\underline q_3^T{\underline a}_3=\left\|{\underline A}_3\right\|>0\)

\(Therefore,\;the\;diagonal\;elements\;of\;R\;are,\)

\(\underline q_1^T{\underline a}_1=\left\|{\underline A}_1\right\|>0,\;\underline q_2^T{\underline a}_2=\left\|{\underline A}_2\right\|>0,\underline q_3^T{\underline a}_3=\left\|{\underline A}_3\right\|>0,\dots,\underline q_n^T{\underline a}_n=\left\|{\underline A}_n\right\|>0\)

R is upper triangular with positive diagonal elements \(\Rightarrow\)R is invertible (n nonzero pivots)

\(The\;least\;squares\;solution\;to\;A\underline x=\underline b\;is\;\underline{\widehat x}\;satisfying\;A^TA\underline{\widehat x}=A^T\underline b\)

\(\Rightarrow{(QR)}^T(QR)\underline{\widehat x}={(QR)}^T\underline b\)

\(\Rightarrow R^TQ^TQR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow R^TIR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow R^TR\underline{\widehat x}=R^TQ^T\underline b\)

\(\Rightarrow{(R^T)}^{-1}R^TR\underline{\widehat x}={(R^T)}^{-1}R^TQ^T\underline b\)

\(\Rightarrow R\underline{\widehat x}=Q^T\underline b\;(can\;be\;solved\;by\;back\;substitution)\)

\(or\;\underline{\widehat x}=R^{-1}Q^T\underline b\)
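Here is a hedged NumPy/SciPy sketch of this least-squares recipe (the matrix \(A\) and vector \(\underline b\) are made-up data): factor \(A=QR\), then solve \(R\widehat{\underline x}=Q^T\underline b\) by back substitution and compare with the normal-equations solution.

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])    # made-up overdetermined system
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)                    # reduced QR: Q is 3x2, R is 2x2 upper triangular
x_hat = solve_triangular(R, Q.T @ b)      # back substitution for R x_hat = Q^T b

x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations, for comparison
print(x_hat, np.allclose(x_hat, x_normal))     # same least-squares solution
```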



Inner Product Space

\(\underline v^T\underline w\overset\triangle=v_1w_1+v_2w_2+\cdots+v_nw_n\)

Def: \(An\;inner\;product\;on\;a\;vector\;space\;V\;is\;a\;function\;that\;assigns\;to\;every\;ordered\;pair\;of\;vectors\;\underline v\;and\;\underline w\;in\;V,\)

\(a\;scalar\;\left\langle\underline v,\underline w\right\rangle\;such\;that\;for\;all\;\underline u,\underline v,\underline w\;in\;V\;and\;all\;scalars\;c,\;the\;following\;hold:\)

(1) \(\left\langle\underline u+\underline v,\underline w\right\rangle=\left\langle\underline u,\underline w\right\rangle+\left\langle\underline v,\underline w\right\rangle\)

(2) \(\left\langle c\underline v,\underline w\right\rangle=c\left\langle\underline v,\underline w\right\rangle\)

(3) \(\left\langle\underline v,\underline w\right\rangle=\left\langle\underline w,\underline v\right\rangle\)

(4) \(\left\langle\underline v,\underline v\right\rangle\geq0\;with\;equality\;iff\;\underline v=\underline0\)

A vector space with an inner product is called an inner product space.

Remark: This definition is for real scalars.

[Ex] \(V=\mathcal R^n\)

\(\left\langle\underline v,\underline w\right\rangle\overset\triangle=\underline v^T\underline w=v_1w_1+v_2w_2+\cdots+v_nw_n\;is\;an\;inner\;product.\)

\(\left\langle\underline u+\underline v,\underline w\right\rangle=\left\langle\underline u,\underline w\right\rangle+\left\langle\underline v,\underline w\right\rangle\Rightarrow{(\underline u+\underline v)}^T\underline w=\underline u^T\underline w+\underline v^T\underline w\)

\(\left\langle\underline v,\underline w\right\rangle=\left\langle\underline w,\underline v\right\rangle\Rightarrow\underline v^T\underline w=\underline w^T\underline v\)

[Ex] \(V=C\lbrack a,b\rbrack\)

\(=the\;vector\;space\;of\;all\;real-valued\;continuous\;functions\;on\;\lbrack a,b\rbrack\)

\(For\;f,g\in V,\;define\;\left\langle f,g\right\rangle=\int_a^bf(t)g(t)\operatorname dt\)

\(\left\langle f_1+f_2,g\right\rangle=\left\langle f_1,g\right\rangle+\left\langle f_2,g\right\rangle\)

\(\int_a^b\lbrack f_1(t)+f_2(t)\rbrack g(t)\operatorname dt=\int_a^bf_1(t)g(t)\operatorname dt+\int_a^bf_2(t)g(t)\operatorname dt\)

(1) and (2) are immediate and (3) is trivial.

For (4) \(\left\langle f,f\right\rangle=\int_a^bf^2(t)\operatorname dt\geq0\)

\(If\;f\neq0,\;then\;f^2\;is\;bounded\;away\;from\;zero\;on\;some\;interval\;of\;\lbrack a,b\rbrack\;(since\;f\;is\;continuous)\)

\(and\;hence\;\left\langle f,f\right\rangle=\int_a^bf^2(t)\operatorname dt>0\)


[Ex] \(Consider\;\mathcal C\lbrack0,2\mathrm\pi\rbrack,\;\left\langle\mathrm f,\mathrm g\right\rangle=\int_0^{2\mathrm\pi}\mathrm f(\mathrm t)\mathrm g(\mathrm t)\operatorname d\mathrm t\)

\(consider\;1,\;\cos t,\;\sin t,\;\cos2t,\;\sin2t,\dots,\cos Nt,\;\sin Nt\)

\(\left\langle1,\cos nt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\operatorname d\mathrm t=0\)

\(\left\langle1,\sin nt\right\rangle=\int_0^{2\mathrm\pi}\sin nt\operatorname d\mathrm t=0\)

\(\left\langle\cos nt,\sin mt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\sin mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\sin(n+m)t-\sin(n-m)t\rbrack dt=0,\;1\leq n,m\leq N\)

\(\left\langle\cos nt,\cos mt\right\rangle=\int_0^{2\mathrm\pi}\cos nt\cos mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\cos(n-m)t+\cos(n+m)t\rbrack dt=0,\;1\leq n\neq m\leq N\)

\(\left\langle\sin nt,\sin mt\right\rangle=\int_0^{2\mathrm\pi}\sin nt\sin mt\operatorname d\mathrm t=\int_0^{2\mathrm\pi}\frac12\lbrack\cos(n-m)t-\cos(n+m)t\rbrack dt=0,\;1\leq n\neq m\leq N\)

So they are mutually orthogonal.

Also \(\left\langle1,1\right\rangle=\int_0^{2\mathrm\pi}1^2\operatorname d\mathrm t=2\mathrm\pi\)

\(\left\langle\cos nt,\cos nt\right\rangle=\int_0^{2\mathrm\pi}\cos^2nt\operatorname d\mathrm t=\mathrm\pi\)

\(\cos(2x)=2\cos^2(x)-1\)

\(\therefore S_N=\{\frac1{\sqrt{2\mathrm\pi}},\frac1{\sqrt{\mathrm\pi}}\cos t,\frac1{\sqrt{\mathrm\pi}}\sin t,\frac1{\sqrt{\mathrm\pi}}\cos2t,\frac1{\sqrt{\mathrm\pi}}\sin2t,\dots,\frac1{\sqrt{\mathrm\pi}}\cos Nt,\frac1{\sqrt{\mathrm\pi}}\sin Nt\}\;is\;an\;orthonormal\;set.\)

\(The\;projection\;of\;f(t)\;onto\;the\;subspace\;spanned\;by\;S_N\;is,\)

\(f_N(t)=A_0\cdot\frac1{\sqrt{2\mathrm\pi}}+\sum_{n=1}^N(A_n\frac1{\sqrt{\mathrm\pi}}\cos nt+B_n\frac1{\sqrt{\mathrm\pi}}\sin nt)\)

\(where\;A_0=\left\langle\frac1{\sqrt{2\mathrm\pi}},f(t)\right\rangle=\frac1{\sqrt{2\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(A_n=\left\langle\frac1{\sqrt{\mathrm\pi}}\cos nt,f(t)\right\rangle=\frac1{\sqrt{\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t\)

\(B_n=\left\langle\frac1{\sqrt{\mathrm\pi}}\sin nt,f(t)\right\rangle=\frac1{\sqrt{\mathrm\pi}}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t\)

\(or\;f_N(t)=a_0+\sum_{n=1}^N(a_n\cos nt+b_n\sin nt)\)

\(where\;a_0=\frac1{2\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(a_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t\)

\(b_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t\)

\(Let\;N\rightarrow\infty,\;we\;can\;have\)

\(f(t)=\frac{A_0}{\sqrt{2\mathrm\pi}}+\sum_{n=1}^\infty(\frac{A_n}{\sqrt{\mathrm\pi}}\cos nt+\frac{B_n}{\sqrt{\mathrm\pi}}\sin nt)\)

\(=a_0+\sum_{n=1}^\infty(a_n\cos nt+b_n\sin nt)\)

\(where\;a_0=\frac1{2\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\operatorname d\mathrm t\)

\(a_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\cos nt\operatorname d\mathrm t,\;\mathrm n=1,2,3,\dots\)

\(b_n=\frac1{\mathrm\pi}\int_0^{2\mathrm\pi}f(t)\sin nt\operatorname d\mathrm t,\;\mathrm n=1,2,3,\dots\)

This is the Fourier series
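As an illustration, the coefficient formulas can be evaluated numerically. The sketch below (the square-wave \(f\) is just an example choice) approximates \(a_0,a_n,b_n\) with simple Riemann sums and builds the partial sum \(f_N(t)\).

```python
import numpy as np

f = lambda t: np.sign(np.sin(t))               # example function: a square wave
t = np.linspace(0.0, 2*np.pi, 4000, endpoint=False)
dt = 2*np.pi / len(t)

N = 5
a0 = np.sum(f(t)) * dt / (2*np.pi)
a = [np.sum(f(t)*np.cos(n*t)) * dt / np.pi for n in range(1, N+1)]
b = [np.sum(f(t)*np.sin(n*t)) * dt / np.pi for n in range(1, N+1)]

print(np.round(b, 3))    # approx [1.273, 0, 0.424, 0, 0.255], i.e. 4/(n*pi) for odd n
# Partial sum f_N(t) built from the projection coefficients:
fN = a0 + sum(a[n-1]*np.cos(n*t) + b[n-1]*np.sin(n*t) for n in range(1, N+1))
```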







Note \(\left\langle f,f\right\rangle=\left\|f\right\|^2\)

\(=\int_0^{2\mathrm\pi}f^2(t)dt\)

\(=A_0^2+\sum_{n=1}^\infty(A_n^2+B_n^2)\)

\(=2\mathrm\pi a_0^2+\mathrm\pi\sum_{n=1}^\infty(a_n^2+b_n^2)\)

Determinants


In mathematics, the determinant is a scalar-valued function of the entries of a square matrix. The determinant of a matrix A is commonly denoted det(A), det A, or |A|. Its value characterizes some properties of the matrix and the linear map represented, on a given basis, by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the corresponding linear map is an isomorphism.

[Ex] \(det\;\begin{bmatrix}a&b\\c&d\end{bmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}=ad-bc\)

Def: \(The\;determinant\;of\;an\;n\times n\;(real)\;matrix\;A,\;denoted\;by\;det(A)\;or\;\left|A\right|,\)

\(is\;a\;real\;number\;defined\;by\;the\;following\;three\;rules:\)

1. det I=1

2. The determinant changes sign when two rows are exchanged.

3. The determinant is a linear function of each row separately.

[Ex] \(det\;\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}=-det\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=-1\)

\(if\;\underline P\;is\;a\;permutation\;matrix,\;then\;det\;\underline P=\pm1\)

\((+1\;if\;an\;even\;number\;of\;row\;exchanges\;of\;I\;gives\;\underline P;\;-1\;for\;an\;odd\;number\;of\;row\;exchanges.)\)

[Ex] \(\begin{vmatrix}ta&tb\\c&d\end{vmatrix}=t\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

\(\begin{vmatrix}a+a'&b+b'\\c&d\end{vmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}+\begin{vmatrix}a'&b'\\c&d\end{vmatrix}\)

\(\begin{vmatrix}sa+ta'&sb+tb'\\c&d\end{vmatrix}=s\begin{vmatrix}a&b\\c&d\end{vmatrix}+t\begin{vmatrix}a'&b'\\c&d\end{vmatrix}\)

[Ex] \(\begin{vmatrix}2&0\\0&2\end{vmatrix}=2\begin{vmatrix}1&0\\0&2\end{vmatrix}=2^2\begin{vmatrix}1&0\\0&1\end{vmatrix}=4\)

\(\therefore det(tI)=t^n\)

We can then derive the following rules 4-10.

4. If two rows are equal, then \(det\;A=0\)

[Ex] \(\begin{vmatrix}a&b\\a&b\end{vmatrix}=0\)

Proof: Exchanging the two equal rows does not change A, but by rule 2 it changes the sign of the determinant.

\(det\;A=-det\;A,\;therefore\;det\;A=0\)

5. Subtracting a multiple of one row from another row leaves \(det\;A\) unchanged.

[Ex] \(\begin{vmatrix}a&b\\c-la&d-lb\end{vmatrix}=\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

Proof: \(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}-la_{i1}&a_{j2}-la_{i2}&\vdots&a_{jn}-la_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\)

\(=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}&a_{j2}&\vdots&a_{jn}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}-l\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\) (by rule 3)

\(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}=0\) (by rule 4)

\(\therefore=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{j1}&a_{j2}&\vdots&a_{jn}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\)

Remark \(Suppose\;A\rightarrow U,\;elimination\;(without\;row\;exchange)\)

\(Then\;det\;A=det\;U\)

6. A matrix with a row of zeros has \(det\;A=0\)

[Ex] \(\begin{vmatrix}0&0\\c&d\end{vmatrix}=\begin{vmatrix}a&b\\0&0\end{vmatrix}=0\)

Proof: \(\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\0&0&\vdots&0\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}=\begin{vmatrix}\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\\a_{i1}&a_{i2}&\vdots&a_{in}\\\cdots&\cdots&\vdots&\cdots\end{vmatrix}\) (by rule 5)

add row i to the zero row

\(=0\) (by rule 4)

7. Determinant of a triangular matrix: If A is triangular, then \(det\;A=a_{11}a_{22}\cdots a_{nn}\) = product of diagonal entries.

[Ex] \(\begin{vmatrix}a&b\\0&d\end{vmatrix}=\begin{vmatrix}a&0\\c&d\end{vmatrix}=ad\)

Proof case 1: Suppose all diagonal entries are nonzero,

\(A=\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}\\0&a_{22}&\dots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\0&0&\dots&a_{nn}\end{bmatrix}\;or\;\begin{bmatrix}a_{11}&0&\dots&0\\a_{21}&a_{22}&\dots&0\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{bmatrix}\)

Then elimination steps can reduce \(A\) to a diagonal matrix \(D\).

\(Then\;det\;A=det\;D\;by\;rule\;5,\;and\;det\;D=a_{11}a_{22}\cdots a_{nn}\;by\;rule\;3\;(factor\;out\;each\;diagonal\;entry)\;and\;rule\;1\)

case 2: Suppose a diagonal entry is zero. Then the triangular A is singular. Elimination produces a zero row.

By rule 5, the determinant is unchanged and by rule 6, a zero row means det A=0.

\(which\;equals\;a_{11}a_{22}\cdots a_{nn},\;since\;one\;of\;the\;diagonal\;entries\;is\;zero\)

8. Determinant of a (non)singular matrix: \(If\;A\;is\;singular,\;then\;det\;A=0.\;If\;A\;is\;invertible,\;then\;det\;A\neq0\)

\(\begin{bmatrix}a&b\\c&d\end{bmatrix}\;is\;singular\;iff\;\begin{vmatrix}a&b\\c&d\end{vmatrix}=0\)

Proof \(A\rightarrow U\;(upper\;triangular)\)

elimination & row exchanges

\(By\;rule\;2\;and\;rule\;5,\;det\;A=\pm det\;U\;(\pm\;depends\;on\;the\;number\;of\;row\;exchanges)\)

\(If\;A\;is\;singular,\;then\;U\;has\;a\;zero\;row.\;By\;rule\;6,\;det\;U=0,\;which\;gives\;det\;A=0\)

\(If\;A\;is\;invertible,\;then\;U\;has\;the\;nonzero\;pivots\;along\;its\;diagonal.\)

\(Then\;det\;A=\pm det\;U=\pm p_1p_2\cdots p_n\neq0\) by rule 7

Remark \(det\;A=\pm p_1p_2\cdots p_n\)

9. \(\left|AB\right|=\left|A\right|\left|B\right|\)

Proof: consider the case \(\left|B\right|=0\). Then B is singular (by rule 8).

It implies AB is singular.

\((B\;is\;singular\;\Rightarrow B\underline x=\underline0\;for\;some\;\underline x\neq\underline0)\)

\(\Rightarrow AB\underline x=\underline0\;for\;some\;\underline x\neq\underline0\)

\(\Rightarrow AB\;is\;singular\)

\(Therefore\;\left|AB\right|=0\;and\;\left|A\right|\left|B\right|=0\)

\(When\;\left|B\right|\neq0\;consider\;d(A)=\frac{\left|AB\right|}{\left|B\right|}\)

\(We\;now\;show\;that\;d(A)\;satisfies\;rules\;1,2,3.\;Hence\;d(A)=\left|A\right|\)

\(1.\;If\;A=I\;then\;d(A)=\left|IB\right|/\left|B\right|=\left|B\right|/\left|B\right|=1\)

\(2.\;A=\begin{bmatrix}{\underline a}_1\\{\underline a}_2\\\vdots\\{\underline a}_n\end{bmatrix},\;AB=\begin{bmatrix}{\underline a}_1\\{\underline a}_2\\\vdots\\{\underline a}_n\end{bmatrix}B=\begin{bmatrix}{\underline a}_1B\\{\underline a}_2B\\\vdots\\{\underline a}_nB\end{bmatrix}\)

\(When\;two\;rows\;of\;A\;are\;exchanged,\;so\;are\;the\;same\;two\;rows\;of\;AB.\)

\(Therefore,\;\left|AB\right|\;changes\;sign\;and\;so\;does\;the\;ratio\;\frac{\left|AB\right|}{\left|B\right|}\)

\(3.\;Add\;row\;i\;of\;A\;to\;row\;i\;of\;A'.\;Then\;row\;i\;of\;AB\;adds\;to\;row\;i\;of\;A'B.\)

\(By\;rule\;3,\;determinants\;add.\;After\;dividing\;by\;\left|B\right|,\;the\;ratios\;add,\;as\;desired.\)

10. \(\left|A^T\right|=\left|A\right|\)

[Ex] \(\begin{vmatrix}a&b\\c&d\end{vmatrix}=\begin{vmatrix}a&c\\b&d\end{vmatrix}\)

Proof: when \(A\) is singular, \(A^T\) is singular.

\(det\;A=0=det\;A^T\;(by\;rule\;8)\)

\(Otherwise,\;we\;can\;have\;PA=LU\)

\(\Rightarrow A^TP^T=U^TL^T\)

by rule 9, \(\left|P\right|\left|A\right|=\left|L\right|\left|U\right|\)

\(\left|A^T\right|\left|P^T\right|=\left|U^T\right|\left|L^T\right|\)

\(first,\;\left|L\right|=\left|L^T\right|=1\;since\;both\;have\;1's\;on\;the\;diagonal\;(by\;rule\;7)\)

\(second,\;\left|U\right|=\left|U^T\right|\;since\;both\;triangular\;matrices\;have\;the\;same\;diagonal\;(by\;rule\;7)\)

\(third,\;PP^T=I,\;(P^{-1}=P^T)\)

\(\Rightarrow\left|P\right|\left|P^T\right|=1,\;(by\;rule\;1\;and\;rule\;9)\)

\(\Rightarrow\left|P\right|\;and\;\left|P^T\right|\;are\;both\;+1\;or\;both\;-1,\;(by\;rule\;1\;and\;rule\;2)\)

\(\Rightarrow\left|P\right|\;=\;\left|P^T\right|,\;therefore\;\left|A\right|\;=\;\left|A^T\right|\)

Remark Every rule for the rows can apply to the columns \((since\;\left|A\right|\;=\;\left|A^T\right|)\)




Theoretical Development

Components
  • Rule of Null Product: The determinant of a square matrix is zero if the matrix has a zero row (or column).

  • Row (or Column) Interchange: If two rows (or columns) of a matrix are interchanged, the absolute value of the determinant is unchanged but its sign flips, so the determinant becomes the opposite of the original value.

  • Multiplication of a Single Row (or Column) by a Scalar: If a row (or column) of a matrix is multiplied by a scalar k, the determinant is multiplied by k.

  • Addition of a Multiple of One Row (or Column) to Another Row (or Column): If a row (or column) is replaced by the sum of that row (or column) and a multiple of another row (or column), the determinant is unchanged.

Permutations and cofactors


The Pivot Formula

Recall for an \(n\times n\) matrix \(A\), we have \(PA=LU\)

\(Then\;det(P)det(A)=det(L)det(U)\) by rule 9

\(\Rightarrow\pm det(A)=1\cdot(d_1d_2\cdots d_n)\)

\(\Rightarrow det(A)=\pm d_1d_2\cdots d_n\)

\(If\;there\;are\;fewer\;than\;n\;pivots,\;det(A)=0\;and\;A\;is\;singular.\)

Observe that the first pivot depends only on the left corner of the original matrix A,

\(\begin{bmatrix}a_{11}&a_{12}&\cdots&\cdots\\a_{21}&a_{22}&\cdots&\cdots\\\vdots&\vdots&\ddots&\\\vdots&\vdots&&\ddots\end{bmatrix}\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}&\cdots\\a_{21}&a_{22}&a_{23}&\cdots\\a_{31}&a_{32}&a_{33}&\cdots\\\vdots&\vdots&\vdots&\ddots\end{bmatrix}\)

In general, the first k pivots come from the \(k\times k\) submatrix \(A_k\) in the upper left corner of \(A\),

provided no row exchanges are needed. Hence \(det(A_k)=d_1d_2\cdots d_k,\;for\;k=1,2,\cdots,n\)

Therefore, the \(k_{th}\) pivot is,

\(d_k=\frac{d_1d_2\cdots d_k}{d_1d_2\cdots d_{k-1}}=\frac{det(A_k)}{det(A_{k-1})}\)

(assuming no row exchanges)

Note that we do not need row exchanges when all these corner submatrices \(A_k\;have\;det(A_k)\neq0\)
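A small NumPy check of the pivot formula (the matrix here is an arbitrary example needing no row exchanges): the ratios \(det(A_k)/det(A_{k-1})\) reproduce the pivots from elimination, and their product reproduces \(det(A)\).

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])    # example matrix, no row exchanges needed

dets = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]          # det(A_1), det(A_2), det(A_3)
pivots = [dets[0]] + [dets[k]/dets[k-1] for k in range(1, 3)]   # d_k = det(A_k)/det(A_{k-1})
print(np.round(pivots, 4))                                      # [2.0, 1.5, 1.3333]
print(np.isclose(np.prod(pivots), np.linalg.det(A)))            # True: det A = d1 d2 d3
```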



The Big Formula

\(\begin{vmatrix}a&b\\c&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\c&d\end{vmatrix}+\begin{vmatrix}0&b\\c&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\c&0\end{vmatrix}+\begin{vmatrix}a&0\\0&d\end{vmatrix}+\begin{vmatrix}0&b\\c&0\end{vmatrix}+\begin{vmatrix}0&b\\0&d\end{vmatrix}\)

\(=\begin{vmatrix}a&0\\0&d\end{vmatrix}+\begin{vmatrix}0&b\\c&0\end{vmatrix}\)

\(=ad\begin{vmatrix}1&0\\0&1\end{vmatrix}+bc\begin{vmatrix}0&1\\1&0\end{vmatrix}\)

\(=ad-bc\)

\(det\;A=\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(we\;can\;split\;a\;row\;(\begin{array}{ccc}a_{11}&a_{12}&a_{13}\end{array})\;into\;3\;rows\)

\((\begin{array}{ccc}a_{11}&0&0\end{array})+(\begin{array}{ccc}0&a_{12}&0\end{array})+(\begin{array}{ccc}0&0&a_{13}\end{array})\)

using linearity of each row, det A splits into \(3^3=27\) simple determinants.

Yet if a column choice is repeated, the corresponding simple determinant is zero.

e.g., \(\begin{vmatrix}a_{11}&0&0\\a_{21}&0&0\\0&0&a_{33}\end{vmatrix}=0\)

Hence there are only 3!=6 simple determinants left.

\(\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(=\begin{vmatrix}a_{11}&0&0\\0&a_{22}&0\\0&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\0&0&a_{23}\\a_{31}&0&0\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\a_{21}&0&0\\0&a_{32}&0\end{vmatrix}\)

\(+\begin{vmatrix}a_{11}&0&0\\0&0&a_{23}\\0&a_{32}&0\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\a_{21}&0&0\\0&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\0&a_{22}&0\\a_{31}&0&0\end{vmatrix}\)

\(=a_{11}a_{22}a_{33}\begin{vmatrix}1&0&0\\0&1&0\\0&0&1\end{vmatrix}+a_{12}a_{23}a_{31}\begin{vmatrix}0&1&0\\0&0&1\\1&0&0\end{vmatrix}+a_{13}a_{21}a_{32}\begin{vmatrix}0&0&1\\1&0&0\\0&1&0\end{vmatrix}\)

\(+a_{11}a_{23}a_{32}\begin{vmatrix}1&0&0\\0&0&1\\0&1&0\end{vmatrix}+a_{12}a_{21}a_{33}\begin{vmatrix}0&1&0\\1&0&0\\0&0&1\end{vmatrix}+a_{13}a_{22}a_{31}\begin{vmatrix}0&0&1\\0&1&0\\1&0&0\end{vmatrix}\)

\(=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}\)

\(+(-a_{11}a_{23}a_{32})+(-a_{12}a_{21}a_{33})+(-a_{13}a_{22}a_{31})\)

\(In\;general,\;for\;an\;n\times n\;matrix\;A,\)

\(det\;A=\sum_{\sigma=(\alpha,\beta,\cdots,\omega)}(det\;P_\sigma)a_{1\alpha}a_{2\beta}\cdots a_{n\omega}\)

\(where\;\sigma=(\alpha,\beta,\cdots,\omega)\;is\;a\;permutation\;of\;(1,2,\cdots,n)\)

\(and\;P_\sigma\;is\;the\;corresponding\;permutation\;matrix\;of\;\sigma\)

\(\sigma=(2\;1\;3)\)

\(P_\sigma=\begin{bmatrix}0&1&0\\1&0&0\\0&0&1\end{bmatrix}\)

The summation is over all possible permutations of \((1,2,\cdots,n)\)

\(\therefore There\;are\;n!\;terms\;in\;det\;A.\)

[Ex] \(n=4,\;\sigma=(1,4,2,3)\)

The corresponding term is \((det\;P_\sigma)\;a_{11}a_{24}a_{32}a_{43}\)

and \(det\;P_\sigma=\begin{vmatrix}1&0&0&0\\0&0&0&1\\0&1&0&0\\0&0&1&0\end{vmatrix}=1\)

[Ex] \(\begin{vmatrix}1&0&a&0\\0&1&b&0\\0&0&c&0\\0&0&d&1\end{vmatrix}=\begin{vmatrix}1&0&0&0\\0&1&0&0\\0&0&c&0\\0&0&0&1\end{vmatrix}=c\begin{vmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{vmatrix}=c\)

[Ex] \(\begin{vmatrix}0&\boxed1&0&0\\1&0&\cancel1&0\\0&\cancel1&0&1\\0&0&\boxed1&0\end{vmatrix}=\begin{vmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{vmatrix}=1\)

Cofactor Formula

\(\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}=a_{11}(a_{22}a_{33}-a_{23}a_{32})+a_{12}(a_{23}a_{31}-a_{21}a_{33})+a_{13}(a_{21}a_{32}-a_{22}a_{31})\)

\(=\begin{vmatrix}a_{11}&0&0\\0&a_{22}&a_{23}\\0&a_{32}&a_{33}\end{vmatrix}+\begin{vmatrix}0&a_{12}&0\\a_{21}&0&a_{23}\\a_{31}&0&a_{33}\end{vmatrix}+\begin{vmatrix}0&0&a_{13}\\a_{21}&a_{22}&0\\a_{31}&a_{32}&0\end{vmatrix}\)

\(=a_{11}\cdot(+detM_{11})+a_{12}\cdot(-detM_{12})+a_{13}\cdot(+detM_{13})\)

\(\triangleq a_{11}C_{11}+a_{12}C_{12}+a_{13}C_{13}\)

\(where\;C_{1j}={(-1)}^{1+j}\cdot detM_{1j}\)

\(In\;general,\;along\;row\;i\;of\;an\;n\times n\;matrix\;A,\)

\(det\;A=a_{i1}C_{i1}+a_{i2}C_{i2}+\cdots+a_{in}C_{in}\)

\(where\;C_{ij}={(-1)}^{i+j}\cdot detM_{ij}\;and\;M_{ij}\;is\;the\;(n-1)\times(n-1)\;submatrix\;of\;A\)

\(with\;row\;i\;and\;column\;j\;removed.\)


\(det\;A=a_{i1}C_{i1}+a_{i2}C_{i2}+\cdots+a_{in}C_{in}\)

\(where\;{\color[rgb]{0.0, 0.0, 1.0}c}{\color[rgb]{0.0, 0.0, 1.0}o}{\color[rgb]{0.0, 0.0, 1.0}f}{\color[rgb]{0.0, 0.0, 1.0}a}{\color[rgb]{0.0, 0.0, 1.0}c}{\color[rgb]{0.0, 0.0, 1.0}t}{\color[rgb]{0.0, 0.0, 1.0}o}{\color[rgb]{0.0, 0.0, 1.0}r}\;C_{ij}={(-1)}^{i+j}\cdot detM_{ij}\)


[Ex] \(M_{43}\) is the submatrix obtained from \(A\) by deleting row 4 and column 3 (the row and column through \(a_{43}\)).

\(sign{(-1)}^{i+j}\Rightarrow\begin{bmatrix}+&-&+&-\\-&+&-&+\\+&-&+&-\\-&+&-&+\end{bmatrix}\)


\(since\;det\;A=det\;A^T\)

\(det\;A=a_{1j}C_{1j}+a_{2j}C_{2j}+\cdots+a_{nj}C_{nj}\)

cofactors down column \(j\)

[Ex] \(\begin{vmatrix}2&-1&0&0\\-1&2&-1&0\\0&-1&2&-1\\0&0&-1&2\end{vmatrix}=2\begin{vmatrix}2&-1&0\\-1&2&-1\\0&-1&2\end{vmatrix}-(-1)\begin{vmatrix}-1&-1&0\\0&2&-1\\0&-1&2\end{vmatrix}\)

\(=2\begin{vmatrix}2&-1&0\\-1&2&-1\\0&-1&2\end{vmatrix}-\begin{vmatrix}2&-1\\-1&2\end{vmatrix}\)

\(Let\;A_n=\begin{bmatrix}2&-1&0&\cdots&0\\-1&2&\ddots&\ddots&\vdots\\0&\ddots&\ddots&\ddots&0\\\vdots&\ddots&\ddots&\ddots&-1\\0&\cdots&0&-1&2\end{bmatrix}\)

\(and\;D_n=det\;A_n\)

\(\therefore D_4=2D_3-D_2\)

In general,

\(D_n=2D_{n-1}-D_{n-2}\;for\;n\geq3\) finite difference equation

\(D_1=2\;and\;D_2=4-1=3\)

\(D_3=2D_2-D_1=2\cdot3-2=4\)

\(D_4=2D_3-D_2=2\cdot4-3=5\)

In general, guess \(D_n=n+1\)

Then \(D_1=1+1=2\)

\(D_2=2+1=3\)

\(D_3=3+1=4\)

\(D_n=2D_{n-1}-D_{n-2}=n+1\)

\(=2\lbrack(n-1)+1\rbrack-\lbrack(n-2)+1\rbrack=2n-n+1=n+1\)
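A short numerical check of the recurrence and the guess \(D_n=n+1\), building \(A_n\) explicitly in NumPy:

```python
import numpy as np

def A_n(n):
    """The -1, 2, -1 tridiagonal matrix of size n."""
    return 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

for n in range(1, 8):
    print(n, round(np.linalg.det(A_n(n))), n + 1)   # D_n equals n + 1
```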


Cramer's Rule, Inverses, and Volumes

Cramer's Rule, try to solve \(A\underline x=\underline b\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}\;\\\underline b\\\;\end{bmatrix}\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}x_1&0&0\\x_2&1&0\\x_3&0&1\end{bmatrix}=\begin{bmatrix}\;&a_{12}&a_{13}\\\underline b&a_{22}&a_{23}\\\;&a_{32}&a_{33}\end{bmatrix}\)

\(\Rightarrow det\;A\cdot x_1=det\;B_1\)

\(\Rightarrow x_1=\frac{det\;B_1}{det\;A}\)

\(\begin{bmatrix}&&\\&A&\\&&\end{bmatrix}\begin{bmatrix}1&x_1&0\\0&x_2&0\\0&x_3&1\end{bmatrix}=\begin{bmatrix}a_{11}&b_1&a_{13}\\a_{21}&b_2&a_{23}\\a_{31}&b_3&a_{33}\end{bmatrix}\)

\(\Rightarrow det\;A\cdot x_2=det\;B_2\)

\(\Rightarrow x_2=\frac{det\;B_2}{det\;A}\)

\(If\;det\;A\neq0,\;A\underline x=\underline b\;is\;solved\;by\;x_1=\frac{det\;B_1}{det\;A},\;x_2=\frac{det\;B_2}{det\;A},\cdots,x_n=\frac{det\;B_n}{det\;A}\)

\(where\;B_j\;has\;the\;jth\;column\;of\;A\;replaced\;by\;\underline b\)



\(\begin{bmatrix}3&4\\5&6\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}2\\4\end{bmatrix}\)

\(x_1=\frac{\begin{vmatrix}2&4\\4&6\end{vmatrix}}{\begin{vmatrix}3&4\\5&6\end{vmatrix}}=\frac{12-16}{18-20}=\frac{-4}{-2}=2\)

\(x_2=\frac{\begin{vmatrix}3&2\\5&4\end{vmatrix}}{\begin{vmatrix}3&4\\5&6\end{vmatrix}}=\frac{12-10}{18-20}=\frac2{-2}=-1\)
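Cramer's rule is easy to state in code. Below is a small illustrative sketch (fine for tiny systems, far too slow for large ones) that reproduces the \(2\times2\) example above.

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b by Cramer's rule (only sensible for small n)."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b              # B_j: column j of A replaced by b
        x[j] = np.linalg.det(Bj) / detA
    return x

A = np.array([[3.0, 4.0], [5.0, 6.0]])
b = np.array([2.0, 4.0])
print(cramer(A, b))               # [ 2. -1.], matching x1 = 2, x2 = -1 above
```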


Inverse

\(AA^{-1}=I\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\begin{bmatrix}x_{11}&x_{12}&x_{13}\\x_{21}&x_{22}&x_{23}\\x_{31}&x_{32}&x_{33}\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\)

\(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\begin{bmatrix}x_{11}\\x_{21}\\x_{31}\end{bmatrix}=\begin{bmatrix}1\\0\\0\end{bmatrix}\)

\(x_{11}=\frac{\begin{vmatrix}1&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&a_{32}&a_{33}\end{vmatrix}}{det\;A}=\frac{C_{11}}{det\;A}=\frac{\begin{vmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{vmatrix}}{det\;A}\)

\(x_{21}=\frac{\begin{vmatrix}a_{11}&1&a_{13}\\a_{21}&0&a_{23}\\a_{31}&0&a_{33}\end{vmatrix}}{det\;A}=\frac{C_{12}}{det\;A}=\frac{-\begin{vmatrix}a_{21}&a_{23}\\a_{31}&a_{33}\end{vmatrix}}{det\;A}\)

\(x_{31}=\frac{\begin{vmatrix}a_{11}&a_{12}&1\\a_{21}&a_{22}&0\\a_{31}&a_{32}&0\end{vmatrix}}{det\;A}=\frac{C_{13}}{det\;A}=\frac{\begin{vmatrix}a_{21}&a_{22}\\a_{31}&a_{32}\end{vmatrix}}{det\;A}\)

\({(A^{-1})}_{ij}=\frac{C_{ji}}{det\;A}\)

\(and\;A^{-1}=\frac{C^T}{det\;A}\)

\(where\;C_{ij}={(-1)}^{i+j}\;det\;M_{ij}\)

\(and\;\boldsymbol C=\lbrack C_{ij}\rbrack\;cofactor\;matrix\)

Inverse Formula





[Ex] \(A=\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix}\)

\({(A^{-1})}_{11}=\frac{C_{11}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac{\begin{vmatrix}1&0\\1&1\end{vmatrix}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac11=1\)

\({(A^{-1})}_{12}=\frac{C_{21}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=\frac{-\begin{vmatrix}0&0\\1&1\end{vmatrix}}{\begin{vmatrix}1&0&0\\1&1&0\\1&1&1\end{vmatrix}}=0\)

\(\therefore A^{-1}=\begin{bmatrix}1&0&0\\-1&1&0\\0&-1&1\end{bmatrix}=\frac{C^T}{det\;A}\)
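The cofactor formula \(A^{-1}=\frac{C^T}{det\;A}\) can also be checked numerically. The sketch below builds the cofactor matrix entry by entry (only sensible for small matrices) and compares the result with np.linalg.inv on the example above.

```python
import numpy as np

def cofactor_inverse(A):
    """Inverse via A^{-1} = C^T / det(A), with C_ij = (-1)^(i+j) det(M_ij)."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            M = np.delete(np.delete(A, i, axis=0), j, axis=1)   # remove row i, column j
            C[i, j] = (-1) ** (i + j) * np.linalg.det(M)
    return C.T / np.linalg.det(A)

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
print(np.round(cofactor_inverse(A)))                        # matches the A^{-1} above
print(np.allclose(cofactor_inverse(A), np.linalg.inv(A)))   # True
```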


Volume

The area of the parallelogram with edge vectors \((x_1,\;y_1)\) and \((x_2,\;y_2)\) is positive if the rotation from \((x_1,\;y_1)\) to \((x_2,\;y_2)\) is counter-clockwise.

claim \(Area\;=det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}\)

Proof: we show that the area satisfies the three rules of determinants.

(1) \(A=\begin{bmatrix}1&0\\0&1\end{bmatrix},\;area=1\cdot1=1\;and\;det\begin{vmatrix}1&0\\0&1\end{vmatrix}=1\)

(2) When the rows are exchanged, the absolute area remains unchanged but the sign changes.

\(det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}=-det\begin{bmatrix}x_2&y_2\\x_1&y_1\end{bmatrix}\)

(3-a) If row 1 is multiplied by t, the area is also multiplied by t.

\(det\begin{bmatrix}tx_1&ty_1\\x_2&y_2\end{bmatrix}=t\cdot det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}\)

(3-b) Suppose row \((x_1,y_1)\) is replaced by \((x_1,y_1)+(x_1',y_1')\). The parallelogram with edges \((x_1,y_1)+(x_1',y_1')\) and \((x_2,y_2)\) has area equal to the sum of the areas of the parallelograms with edges \((x_1,y_1),(x_2,y_2)\) and \((x_1',y_1'),(x_2,y_2)\), so the corresponding areas add.

\(det\begin{bmatrix}x_1+x_1'&y_1+y_1'\\x_2&y_2\end{bmatrix}=det\begin{bmatrix}x_1&y_1\\x_2&y_2\end{bmatrix}+det\begin{bmatrix}x_1'&y_1'\\x_2&y_2\end{bmatrix}\)

Similarly, in three dimensions the value of the volume is positive for a right-handed triple of edge vectors and negative for a left-handed triple.

Claim \(Volume=det\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

Proof: The volume satisfies the three rules of determinants.

Matrix

Singular Value Decomposition



Example (C) Google PageRank

Let's assume the Web contains 6 pages only. Page 2 provides the links connecting to page 3 and page 4. The links between these and the other pages in this simple web are summarised in this diagram.



Their task was to find the "most important" page for a particular search query. The higher priority page would return at the top of the search result.

This is Google's use of eigenvalues and eigenvectors. Considering Page 1, it has 4 outgoing links (to pages 2, 4, 5, and 6). So in the first column of our "links matrix", we place the value \(\frac14\) in each of rows 2, 4, 5 and 6.

\(A=\begin{bmatrix}0&0&0&0&\frac12&0\\\frac14&0&0&0&0&0\\0&\frac12&0&0&0&0\\\frac14&\frac12&0&0&\frac12&0\\\frac14&0&1&1&0&1\\\frac14&0&0&0&0&0\end{bmatrix}\)

Next, to find the eigenvalues. We have

\(\left|A-\lambda I\right|=\begin{vmatrix}-\lambda&0&0&0&\frac12&0\\\frac14&-\lambda&0&0&0&0\\0&\frac12&-\lambda&0&0&0\\\frac14&\frac12&0&-\lambda&\frac12&0\\\frac14&0&1&1&-\lambda&1\\\frac14&0&0&0&0&-\lambda\end{vmatrix}\)

\(=\lambda^6-\frac{5\lambda^4}8-\frac{\lambda^3}4-\frac{\lambda^2}8\)

This expression is zero for \(\lambda=-0.72031,\;-0.13985,\;\pm0.39240j,\;0,\;1\).

We can only use non-negative real values of \(\lambda\), so the relevant eigenvalue is \(\lambda=1\).

We find the corresponding eigenvector is: \(V_1=\begin{bmatrix}4&1&0.5&5.5&8&1\end{bmatrix}^T\)

As Page 5 has the highest PageRank, we conclude it is the most "important", and it will appear at the top of the search results.

We often normalize this vector so the sum of its elements is 1. We just add up the amounts and divide each amount by that total. The result is the normalized vector P, for "PageRank".

\(P=\begin{bmatrix}0.2&0.05&0.025&0.275&0.4&0.05\end{bmatrix}^T\)
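A minimal NumPy sketch of this computation, using the \(6\times6\) link matrix above: take the eigenvector for the eigenvalue closest to 1 and normalize it so its entries sum to 1.

```python
import numpy as np

A = np.array([
    [0,    0,   0, 0, 0.5, 0],
    [0.25, 0,   0, 0, 0,   0],
    [0,    0.5, 0, 0, 0,   0],
    [0.25, 0.5, 0, 0, 0.5, 0],
    [0.25, 0,   1, 1, 0,   1],
    [0.25, 0,   0, 0, 0,   0],
])

vals, vecs = np.linalg.eig(A)
k = np.argmin(np.abs(vals - 1.0))        # pick the eigenvalue closest to 1
v = np.real(vecs[:, k])
P = v / v.sum()                          # normalize so the entries sum to 1
print(np.round(P, 3))   # approx [0.2, 0.05, 0.025, 0.275, 0.4, 0.05]
```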

Visualization of Singular Value decomposition of a Symmetric Matrix

The Singular Value Decomposition of a matrix A satisfies

\(\mathbf A = \mathbf U \mathbf \Sigma \mathbf V^\top\) The visualization of it would look like



But when \(\mathbf A\) is symmetric we can do:

\( \begin{align*} \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf U\mathbf \Sigma\mathbf V^\top)^\top\\ \mathbf A\mathbf A^\top&=(\mathbf U\mathbf \Sigma\mathbf V^\top)(\mathbf V\mathbf \Sigma\mathbf U^\top) \end{align*} \)

and since \(\mathbf V\) is an orthogonal matrix (\(\mathbf V^\top \mathbf V=\mathbf I\)), we have:

\(\mathbf A\mathbf A^\top=\mathbf U\mathbf \Sigma^2 \mathbf U^\top\)



singular value decomposition



Multilinear singular value decomposition and low multilinear rank approximation

\[{\left[\begin{array}{cc}{\mathbf A}_{11}&{\mathbf A}_{12}\\{\mathbf A}_{21}&{\mathbf A}_{22}\end{array}\right]}{\left[\begin{array}{c}{\mathbf x}_1\\{\mathbf x}_2\end{array}\right]}={\left[\begin{array}{c}{\mathbf b}_1\\{\mathbf b}_2\end{array}\right]}\]

\[\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\]



tricks for computing eigenvalues



Matrix 2x2

\(\begin{bmatrix}a&b\\c&d\end{bmatrix}\xrightarrow{quick}\lambda_1,\;\lambda_2\)

1) \(\frac12tr\left(\begin{bmatrix}\boxed a&b\\c&\boxed d\end{bmatrix}\right)=\frac{a+d}2=\frac{\lambda_1+\lambda_2}2=m\;(mean)\)

2) \(det\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=ad-bc=\lambda_1\lambda_2=p\;(product)\)

Write \(\lambda_1,\;\lambda_2=m\pm\delta\). Then \(p=\lambda_1\lambda_2=(m+\delta)(m-\delta)=m^2-\delta^2\)

\(\Rightarrow\delta^2\;=\;m^2\;-\;p\)

3) \(\lambda_1,\;\lambda_2\;=\;m\pm\sqrt{m^2-p}\;\)


Matrix 3x3

\(A=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)

\(\lambda^3-\beta_1\lambda^2+\beta_2\lambda-\beta_3=0\)

\(\beta_1=Trace\;of\;Matrix\;=\;Sum\;of\;diagonal\;elements\)

\(\beta_1=a_{11}+a_{22}+a_{33}\)

\(\beta_2=Sum\;of\;Minors\;of\;diagonal\;elements\)

\(\begin{vmatrix}\left(a_{11}\right)&a_{12}&a_{13}\\a_{21}&\left(a_{22}\right)&a_{23}\\a_{31}&a_{32}&\left(a_{33}\right)\end{vmatrix}\)


\(\beta_2=\begin{vmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{vmatrix}+\begin{vmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{vmatrix}+\begin{vmatrix}a_{11}&a_{13}\\a_{31}&a_{33}\end{vmatrix}\)

\(\beta_3=det(A)\)

\(\beta_3=\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix}\)

\(det(A)=\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}\)

(Rule of Sarrus) \(det(A)\) can also be read off from the diagonals: the three "down-right" diagonal products (wrapping around the columns) enter with a plus sign and the three "down-left" diagonal products with a minus sign,

\(det(A)=\underbrace{a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}}_{i+j+k}-\underbrace{(a_{13}a_{22}a_{31}+a_{11}a_{23}a_{32}+a_{12}a_{21}a_{33})}_{x+y+z}\)


Example (D)

Humans and Zombies

Let me give a toy example. Suppose that I have a population of humans and zombies. We’ll check in on them once every few days (what’s the worst that can happen?). Suppose that every human has an 80% chance of surviving those few days, and a 20% chance of turning into a zombie. Thankfully, the humans aren’t completely defenseless—they have found a cure! Unfortunately, it’s very difficult to apply the cure, so every zombie has a 99% chance of remaining a zombie, and a 1% chance of actually being cured. Let’s write this information down in matrix form:

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\)

Here is why this is useful: if I write the number of humans \(H\) and the number of zombies \(Z\) as a vector \((H, Z)\), then multiplying by the above matrix will exactly tell me how many humans and how many zombies I should expect in a couple of days. For instance, suppose that I start with one zombie and 99 humans, and check back in two days. Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}79.21\\20.79\end{pmatrix}\)

so I’m expecting that there should be about 79 humans and 21 zombies at this point. What if I check back in another two days? Well,

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\begin{pmatrix}79.21\\20.79\end{pmatrix}\approx\begin{pmatrix}63.58\\36.42\end{pmatrix}\)

so I’m expecting that there should be about 64 humans and 36 zombies. I can keep going: after six days, there will be 51 humans and 49 zombies; after eight days, there will be 41 humans and 59 zombies; after ten days, there will be 34 humans and 66 zombies…

This is starting to look worrisome. Are the humans going to be wiped out? It’s certainly possible, but we can use some techniques from linear algebra and calculus to figure out what will happen in the long term. To wit, what we have shown is that the number of humans and zombies after \(2n\) days will be

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n\begin{pmatrix}99\\1\end{pmatrix}=\begin{pmatrix}H\\Z\end{pmatrix}\)

On the other hand, one can work out (look up Jordan decomposition if you are interested—you may need to brush up on some linear algebra first) that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}\approx\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

from which it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n=\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^n\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}\)

but since

\(\begin{pmatrix}1&0\\0&0.79\end{pmatrix}^n\rightarrow\begin{pmatrix}1&0\\0&0\end{pmatrix}\)

as \(n→∞\) (that is, as \(n\) gets very large, the matrix that we multiply by itself over and over again gets closer and closer to the matrix on the right hand side), it follows that

\(\begin{pmatrix}0.8&0.01\\0.2&0.99\end{pmatrix}^n\rightarrow\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}\begin{pmatrix}1&0\\0&0\end{pmatrix}\begin{pmatrix}-0.05&-0.71\\-1.00&0.71\end{pmatrix}^{-1}=\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\)

Thus, if we start with 99 humans and 1 zombie, the expectation is that after a long period of time, we’ll have

\(\begin{pmatrix}0.05&0.05\\0.95&0.95\end{pmatrix}\begin{pmatrix}99\\1\end{pmatrix}\approx\begin{pmatrix}4.76\\95.24\end{pmatrix}\)

—that is, about 5 humans and 95 zombies. The good news is that the humans survive; the bad news is that they only just barely survive.

Naturally, zombies aren’t very “real life,” but the spread of genes through a population is, and Markov chains can be applied to that perfectly well. Figuring out how to weigh different websites based on how many other sites link to them can also be modeled as a Markov process—this is the fundamental observation behind PageRank, Google’s algorithm for figuring out what websites it should display first. Predictive text works on basically the same principle. In fact, I strongly recommend looking at the Wikipedia page for applications of Markov chains, because there are a lot of them, coming from all sorts of different disciplines.
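For concreteness, here is a small NumPy sketch of the zombie Markov chain above: it iterates the transition matrix many times and also recovers the same limit from the eigenvector for \(\lambda=1\).

```python
import numpy as np

M = np.array([[0.8, 0.01],
              [0.2, 0.99]])
v = np.array([99.0, 1.0])

for _ in range(500):          # many two-day steps
    v = M @ v
print(np.round(v, 2))         # approx [4.76, 95.24]

vals, vecs = np.linalg.eig(M)
w = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
print(np.round(100 * w / w.sum(), 2))   # same limit, from the eigenvector for lambda = 1
```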


Zombie outbreak

Let \(y_t, z_t\) represent the numbers of humans and zombies (respectively) \(t\) months after an initial outbreak. Suppose the outbreak obeys the predator-prey model

\(\begin{bmatrix}y_{t+1}\\z_{t+1}\end{bmatrix}=\begin{bmatrix}1.02&-0.1\\0.08&0.83\end{bmatrix}\begin{bmatrix}y_t\\z_t\end{bmatrix}\)

Prove that any number of zombies will eventually destroy humanity

Before we get to the solution, it's probably important to say a bit more about what this “predator-prey model” means. This matrix equation is equivalent to a system of two linear equations:

\(y_{t+1}=1.02y_t-0.1z_t\)

\(z_{t+1}=0.08y_t+0.83z_t\)

The first equation says that the human population \((y_t)\) would grow by 2% per month in the absence of zombies, but the zombies kill off one-tenth of their own number in humans each month. The second equation says the zombie population decreases in the absence of humans (they require humans as prey) but that most of the humans they kill turn into more zombies. Details and exact numbers aside, this is a reasonable approximation of how one could model a predator-prey relationship. We could replace humans and zombies with the more traditional rabbits and foxes or whatever. The math doesn't care.

Solution outline: use eigenvalues! You can calculate them manually or by calculator, but they are approximately the following:

\(\lambda_1\approx.957,\;\lambda_2\approx.893\)

According to the theorem above, this means that regardless of the initial populations, we have a general solution of the form

\(\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1\left(.957\right)^tv_1+c_2\left(.893\right)^tv_2\)

Since both eigenvalues are less than 1, this population vector converges to zero as t increases. That is, humanity slowly perishes (and the zombies with them).

We could even be a bit more precise about how they perish — in the long run, the second term is asymptotically smaller in magnitude than the first. We can say that

\(\lim_{t\rightarrow\infty}\frac1{\lambda_1^t}\begin{bmatrix}y_t\\z_t\end{bmatrix}=c_1v_1\)

so the population decays exponentially along the \(λ_1\) eigenvector \(v_1\), which in this case represents a ratio of approximately 846 humans to 533 zombies. So the leading eigenvalue tells you the rate of growth/decay, and the leading eigenvector tells you the limiting population ratios as they grow or decay.

I could go on or do a more complex example, but hopefully this conveys what eigenvalues and eigenvectors tell you at least for the predator/prey linear dynamical system model.
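A short NumPy sketch of this eigen-analysis (the initial populations below are made up): both eigenvalues come out below 1, the leading eigenvector gives the roughly 1.59 : 1 human-to-zombie ratio (about 846 : 533) quoted above, and iterating the model drives any starting population toward zero.

```python
import numpy as np

A = np.array([[1.02, -0.10],
              [0.08,  0.83]])
vals, vecs = np.linalg.eig(A)
print(np.round(vals, 3))                 # approx [0.957, 0.893]

v1 = vecs[:, np.argmax(vals)]            # leading eigenvector
print(np.round(v1 / v1[1], 3))           # human:zombie ratio approx 1.587 : 1

x = np.array([1000.0, 10.0])             # made-up initial populations
for _ in range(200):
    x = A @ x
print(x)                                 # both populations have essentially vanished
```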

In general, eigenvectors and eigenvalues are crucial for determining the long-term behavior of all kinds of models. A partial list of applications includes the following:

Population Dynamics (biology/epidemiology)

Stabilizing a system e.g. antilock brakes (control theory/engineering)

Finding resonant frequencies (engineering, physics)

Ranking internet pages by importance (computer science)

Principal component analysis (statistics)

Finding stable axes of rotation (physics)

Example (E)

Truck Rental Company

A truck rental company has locations all over Vancouver, where you can rent moving trucks. You can return them to any other location. For simplicity, pretend that there are three locations, and that every customer returns their truck the next day. Let \(v_t\) be the vector whose entries \(x_t, y_t, z_t\) are the number of trucks in locations \(1, 2\), and \(3\), respectively. Let \(A\) be the matrix whose \(i, j\) -entry is the probability that a customer renting a truck from location \(j\) returns it to location \(i\). For example, the matrix

\(A=\begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\)

\(AQ=Q\Lambda\)

\( \begin{pmatrix}.3&.4&.5\\.3&.4&.3\\.4&.2&.2\end{pmatrix}\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}=\begin{pmatrix}1.4&-1&0.5\\1.2&0&-1.5\\1&1&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&-0.2&0\\0&0&0.1\end{pmatrix} \)


Therefore, in the long run the numbers of trucks at locations \(1, 2\), and \(3\) settle into the ratio \(1.4:1.2:1\), that is, \(7:6:5\).


Eigenvalue Decomposition


Orthogonal projectors


P is an orthogonal projector if and only if \(P^2=P\) and \(P\) is symmetric \((P^T=P)\).

In linear algebra, an \(Orthogonal\) \(Matrix\), or \(Orthonormal\) \(Matrix\), is a real square matrix whose columns and rows are orthonormal vectors. One way to express this is

\(Q^TQ=QQ^T=I,\)

where \(Q^T\) is the transpose of \(Q\), and \(I\) is the identity matrix.

This leads to the equivalent characterization: a matrix \(Q\) is orthogonal if its transpose is equal to its inverse:

\(Q^T=Q^{-1},\)

where \(Q^{-1}\) is the inverse of \(Q\).

An orthogonal matrix Q is necessarily invertible (with inverse \(Q^{-1}=Q^T\)), unitary \((Q^{-1}=Q^*)\), where \(Q^∗\) is the Hermitian adjoint (conjugate transpose) of \(Q\), and therefore normal \((Q^∗Q = QQ^∗)\) over the real numbers.
The determinant of any orthogonal matrix is either +1 or -1. As a linear transformation, an orthogonal matrix preserves the inner product of vectors, and therefore acts as an isometry of Euclidean space, such as a rotation, reflection or rotoreflection. In other words, it is a unitary transformation.



Orthogonal Matrix


Determinant

A determinant can be a “negative area” in the case of a 2x2 matrix or a “negative volume” in the case of a 3x3 matrix. The sign is related to the relative orientation of the vectors. You may have heard of the “right hand rule”.






Why determinant of a 2 by 2 matrix is the area of a parallelogram?

Parallelogram = Rectangle - Extra Stuff.

\((c+a)(b+d)-2ad-cd-ab\)

Also interesting to note that if you swap vectors places then you get a negative(opposite of what \(bc−ad\) would produce) area, which is basically:

-Parallelogram = Rectangle - (2*Rectangle - Extra Stuff)

Or more concretely:

\((c+a)(b+d) - [2*(c+a)(b+d) - (2ad+cd+ab)]\)

Also it's \(ad−bc\), when simplified.


so with a little algebra (shearing the parallelogram, which preserves its area) we find that the base length is \(a-\frac{bc}{d}\) and the height is \(d\)

Hence the area is just \(d(a-\frac{bc}{d}) = ad - bc\)




Determinant of a \(3×3\) matrix

Proof that the determinant of a 3×3 matrix is the volume of the parallelepiped spanned by the columns.



General Formula for the Inverse of a \(2×2\) matrix

The inverse of \(A\) is \(A^{-1}\) only when:\(AA^{-1} = A^{-1}A = I\). Sometimes there is no inverse at all.

\(A=\begin{bmatrix}a&b\\c&d\end{bmatrix}\)

\(det(A)=ad-bc\)

\(A^{-1}=\begin{bmatrix}a&b\\c&d\end{bmatrix}^{-1}=\frac1{det(A)}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}=\frac1{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}\)


General Formula for the Inverse of a \(3×3\) matrix

\(let M=\begin{bmatrix}a&b&c\\d&e&f\\g&h&i\end{bmatrix},\)

Now produce the matrix of minors

\(\begin{bmatrix}ei-fh&di-fg&dh-eg\\bi-ch&ai-cg&ah-bg\\bf-ce&af-cd&ae-bd\end{bmatrix},\)

Use the alternating law of signs to produce the matrix of cofactors \(M^\backprime \)

\(M^\backprime =\begin{bmatrix}ei-fh&fg-di&dh-eg\\ch-bi&ai-cg&bg-ah\\bf-ce&cd-af&ae-bd\end{bmatrix}\)

Transpose\(M^\backprime\)

\(M^{\backprime T}=\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)

\(det\;M=a(ei-fh)-b(di-fg)+c(dh-eg)\)

The grand formula:

\(M^{-1}=\frac1{a(ei-fh)-b(di-fg)+c(dh-eg)}\begin{bmatrix}ei-fh&ch-bi&bf-ce\\fg-di&ai-cg&cd-af\\dh-eg&bg-ah&ae-bd\end{bmatrix}\)


Laplace expansion

Laplace expansion expresses the determinant of a matrix \(A\) recursively in terms of determinants of smaller matrices, known as its minors. The minor \(M_{i,j}\) is defined to be the determinant of the \((n - 1) × (n - 1)\)-matrix that results from \(A\) by removing the \(i\)-th row and the \(j\)-th column. The expression \({(-1)}^{i+j}M_{i,j}\) is known as a cofactor. For every \(i\), one has the equality

\(det(A)=\sum_{j=1}^n{(-1)}^{i+j}a_{i,j}M_{i,j}\)




Example (F)

A Population Growth Model

A population of rabbits has the following characteristics.

(i) Half of the rabbits survive their first year. Of those, half survive their second year. The maximum life span is 3 years.

(ii) During the first year, the rabbits produce no offspring. The average number of offspring is 6 during the second year and 8 during the third year.

The population now consists of 24 rabbits in the first age class, 24 in the second, and 20 in the third. How many rabbits will there be in each age class in 1 year?

\(A=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\)

\(x_1=\begin{bmatrix}24\\24\\20\end{bmatrix}\)

\(x_2=Ax_1=\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}24\\24\\20\end{bmatrix}=\begin{bmatrix}304\\12\\12\end{bmatrix}\)

Finding a Stable Age Distribution Vector

\(Ax=\lambda x\Rightarrow\vert A-\lambda I\vert=0\)

\(det(A-\lambda I)=-\lambda^3+3\lambda+2=-{(\lambda+1)}^2(\lambda-2)\)

\(\lambda=2\Rightarrow x=\begin{bmatrix}16\\4\\1\end{bmatrix}\)

\(\begin{bmatrix}0&6&8\\0.5&0&0\\0&0.5&0\end{bmatrix}\begin{bmatrix}16\\4\\1\end{bmatrix}=2\cdot\begin{bmatrix}16\\4\\1\end{bmatrix}=\begin{bmatrix}32\\8\\2\end{bmatrix}\)

Notice that the ratio of the three age classes is still 16 : 4 : 1, and so the percent of the population in each age class remains the same.
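A small NumPy sketch of this age-structured model: iterating \(x_{k+1}=Ax_k\) shows the age distribution settling into the stable ratio 16 : 4 : 1 while the total population grows by a factor approaching \(\lambda=2\) each year.

```python
import numpy as np

A = np.array([[0.0, 6.0, 8.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
x = np.array([24.0, 24.0, 20.0])

for year in range(10):
    x_new = A @ x
    growth = x_new.sum() / x.sum()      # total-population growth factor this year
    x = x_new

print(np.round(x / x[2], 2))            # age ratios approach the stable 16 : 4 : 1
print(round(growth, 3))                 # growth factor approaches lambda = 2
```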


Convolution as Toeplitz matrix

In linear algebra, a Toeplitz matrix or diagonal-constant matrix, named after Otto Toeplitz, is a matrix in which each descending diagonal from left to right is constant.

Given \(2n-1\) numbers \(a_k\) where \(k=-n+1,\;...,\;-1,\;0,\;1,\;...,\;n-1\), a Toeplitz matrix is a matrix which has constant values along negative-sloping diagonals, i.e., a matrix of the form

\(\begin{bmatrix}a_0&a_{-1}&a_{-2}&\cdots&a_{-n+1}\\a_1&a_0&a_{-1}&\ddots&\vdots\\a_2&a_1&a_0&\ddots&a_{-2}\\\vdots&\ddots&\ddots&\ddots&a_{-1}\\a_{n-1}&\cdots&a_2&a_1&a_0\end{bmatrix}\)

Matrix equations of the form

\(\sum_{j=1}^na_{i-j}x_j=y_i\)

\(A_{i,j}=A_{i+1,j+1}=a_{i-j}\)

can be solved with \(\mathcal O(n^2)\) operations. Typical problems modelled by Toeplitz matrices include the numerical solution of certain differential and integral equations (regularization of inverse problems), the computation of splines, time series analysis, signal and image processing, Markov chains, and queuing theory (Bini 1995).
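As a sketch of "convolution as a Toeplitz matrix" (the kernel and signal below are made-up examples), one can build a banded Toeplitz matrix from a kernel with scipy.linalg.toeplitz and check that multiplying by it reproduces np.convolve.

```python
import numpy as np
from scipy.linalg import toeplitz

kernel = np.array([1.0, 2.0, 3.0])      # example kernel (the a_k values)
x = np.array([4.0, 5.0, 6.0, 7.0])      # example signal

n, m = len(x), len(kernel)
# (n + m - 1) x n Toeplitz matrix whose columns are shifted copies of the kernel
first_col = np.r_[kernel, np.zeros(n - 1)]
first_row = np.r_[kernel[0], np.zeros(n - 1)]
T = toeplitz(first_col, first_row)

print(T @ x)
print(np.convolve(x, kernel))           # same result: full convolution of x with the kernel
```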



Matrix multiplication

Horizontal shear with m = 1.25: \(\begin{bmatrix}1&1.25\\0&1\end{bmatrix}\)

Reflection through the vertical axis: \(\begin{bmatrix}-1&0\\0&1\end{bmatrix}\)

Squeeze mapping with r = 3/2: \(\begin{bmatrix}\frac32&0\\0&\frac23\end{bmatrix}\)

Scaling by a factor of 3/2: \(\begin{bmatrix}\frac32&0\\0&\frac32\end{bmatrix}\)

Rotation by π/6 = 30°: \(\begin{bmatrix}\cos(\frac\pi6)&-\sin(\frac\pi6)\\\sin(\frac\pi6)&\cos(\frac\pi6)\end{bmatrix}\)

If one row of the left matrix is zero, the corresponding row of the product is zero:

\(\begin{bmatrix}X&Y&Z\\0&0&0\\A&B&C\end{bmatrix}\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}=\begin{bmatrix}\alpha&\beta&\gamma\\0&0&0\\\chi&\psi&\omega\end{bmatrix}\)


If one column of the right matrix is zero, the corresponding column of the product is zero:

\(\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}\begin{bmatrix}X&0&A\\Y&0&B\\Z&0&C\end{bmatrix}=\begin{bmatrix}\alpha&0&\chi\\\beta&0&\psi\\\gamma&0&\omega\end{bmatrix}\)


Examples with n = 3:
Diagonal matrix \(\begin{bmatrix}a_{11}&0&0\\0&a_{22}&0\\0&0&a_{33}\end{bmatrix}\)
Lower triangular matrix \(\begin{bmatrix}a_{11}&0&0\\a_{21}&a_{22}&0\\a_{31}&a_{32}&a_{33}\end{bmatrix}\)
Upper triangular matrix \(\begin{bmatrix}a_{11}&a_{12}&a_{13}\\0&a_{22}&a_{23}\\0&0&a_{33}\end{bmatrix}\)

Translation by \((t_x,t_y)\) in homogeneous coordinates:

\(\begin{bmatrix}x'\\y'\\1\end{bmatrix}=\begin{bmatrix}1&0&t_x\\0&1&t_y\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}\)


Matrix Stretch

\(S_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\y\end{bmatrix}\)

\(S_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\(1+\delta)y\end{bmatrix}\)

\(S_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1+\delta&0\\0&1+\delta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}(1+\delta)x\\(1+\delta)y\end{bmatrix}\)


Matrix Shrink

\(A\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac1{1+\delta}&0\\0&\frac1{1+\delta}\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}\frac x{1+\delta}\\\frac y{1+\delta}\end{bmatrix}\)


Matrix Reflection

\(R_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\y\end{bmatrix}\)

\(R_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-y\end{bmatrix}\)

\(R_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}-x\\-y\end{bmatrix}\)


Matrix Shear

\(T_1\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\y\end{bmatrix}\)

\(T_2\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&0\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x\\-\delta x+y\end{bmatrix}\)

\(T_{12}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}1&\delta\\-\delta&1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}x+\delta y\\-\delta x+y\end{bmatrix}\)
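A minimal sketch, assuming NumPy, applying the stretch, reflection, and shear matrices above to a sample point with \(\delta=0.25\):

```python
import numpy as np

delta = 0.25
p = np.array([2.0, 1.0])

S1  = np.array([[1 + delta, 0], [0, 1]])    # stretch along x
R1  = np.array([[-1, 0], [0, 1]])           # reflection across the y-axis
T1  = np.array([[1, delta], [0, 1]])        # horizontal shear
T12 = np.array([[1, delta], [-delta, 1]])   # combined shear

# Expected: S1 -> (2.5, 1), R1 -> (-2, 1), T1 -> (2.25, 1), T12 -> (2.25, 0.5)
for name, M in [("S1", S1), ("R1", R1), ("T1", T1), ("T12", T12)]:
    print(name, M @ p)
```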


Matrix Rotation
Counter-clockwise Rotation Matrix in 2D Derivation

Expressing \((x, y)\) in polar form we have

\(x=r\cos\nu\;\;(1)\)

\(y=r\sin\nu\;\;(2)\)

Similarly, expressing \((x', y')\) in polar form,

\(x'=r\cos(\nu+\theta)\)

\(y'=r\sin(\nu+\theta)\)

Expanding the brackets using the angle-sum identities we get

\(x'=r(\cos\nu\cos\theta-\sin\nu\sin\theta)=r\cos\nu\cos\theta-r\sin\nu\sin\theta\)

From (1) and (2) we have

\(x'=x\cos\theta-y\sin\theta\;\;(3)\)

Likewise,

\(y'=r(\sin\nu\cos\theta+\cos\nu\sin\theta)=r\sin\nu\cos\theta+r\cos\nu\sin\theta\)

\(y'=y\cos\theta+x\sin\theta\;\;(4)\)

Writing (3) and (4) with a 2 × 2 matrix gives

\(\begin{bmatrix}x'\\y'\end{bmatrix}=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\)

Thus \(\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\) is the counter-clockwise rotation matrix.


Rotation Matrix in 3D

In 3D space, rotation can occur about the x, y, or z-axis. A rotation about a single one of these axes is known as a basic or elementary rotation. Given below are the rotation matrices that rotate a vector through an angle about a particular axis.

P (x, \(\gamma\)) = \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\). This is also known as a roll. It is defined as the counterclockwise rotation of \(\gamma\) about the x axis.

P (y, \(\beta\)) = \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\). Such a matrix is known as a pitch. Here, it represents the counterclockwise rotation of \(\beta\) about the y axis.

P (z, \(\alpha\)) = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\). This rotation matrix is called a yaw and it is the counterclockwise rotation of \(\alpha\) about the z axis.

According to convention, a positive angle θ denotes a counter-clockwise rotation; changing the signs accordingly represents clockwise rotations. The right-hand rule relates the two: curl the fingers of your right hand in the direction of increasing θ, and your thumb points along the axis of rotation, perpendicular to the plane of rotation.

Now if we want to find the new coordinates (x', y', z') of a vector \((x, y, z)\) after rotation about a particular axis, we follow the formula given below:

\(\begin{bmatrix} x'\\ y'\\ z' \end{bmatrix}\) = P(x, y or z) \(\begin{bmatrix} x\\ y\\ z \end{bmatrix}\)

Suppose an object is rotated about all three axes, then such a rotation matrix will be a product of the three aforementioned rotation matrices [P (z, \(\alpha\)), P (y, \(\beta\)) and P (x, \(\gamma\))]. The general rotation matrix is represented as follows:

P = \(\begin{bmatrix} \cos\alpha & -\sin\alpha &0 \\ \sin\alpha & \cos\alpha & 0 \\ 0& 0 & 1 \end{bmatrix}\) \(\begin{bmatrix} \cos\beta & 0 & \sin\beta\\ 0 &1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}\) \(\begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\gamma & -\sin\gamma \\ 0& \sin\gamma & \cos\gamma \end{bmatrix}\)

To find the coordinates of the rotated vector about all three axes we multiply the rotation matrix P with the original coordinates of the vector.
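A minimal sketch, assuming NumPy, that builds the three elementary rotations and composes them as \(P=P(z,\alpha)\,P(y,\beta)\,P(x,\gamma)\); the product should be orthogonal with determinant 1:

```python
import numpy as np

def Rx(g):  # roll: counterclockwise rotation by g about the x axis
    return np.array([[1, 0, 0],
                     [0, np.cos(g), -np.sin(g)],
                     [0, np.sin(g),  np.cos(g)]])

def Ry(b):  # pitch: counterclockwise rotation by b about the y axis
    return np.array([[ np.cos(b), 0, np.sin(b)],
                     [ 0,         1, 0        ],
                     [-np.sin(b), 0, np.cos(b)]])

def Rz(a):  # yaw: counterclockwise rotation by a about the z axis
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a),  np.cos(a), 0],
                     [0,          0,         1]])

alpha, beta, gamma = np.pi / 6, np.pi / 4, np.pi / 3
P = Rz(alpha) @ Ry(beta) @ Rx(gamma)

print(np.allclose(P @ P.T, np.eye(3)))    # True: rotations are orthogonal
print(np.isclose(np.linalg.det(P), 1.0))  # True
print(P @ np.array([1.0, 0.0, 0.0]))      # a rotated unit vector
```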


Matrix Inverse

\(A^{-1}A=AA^{-1}=I\)

\(S=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix},\;its\;inverse\;S^{-1}=\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}\)

\(SS^{-1}=\begin{bmatrix}1+\delta&0\\0&1\end{bmatrix}\begin{bmatrix}\frac1{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}\frac{1+\delta}{1+\delta}&0\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)

\(T_1=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\;has\;inverse\;T_1^{-1}=\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}\)

\(T_1T_1^{-1}=\begin{bmatrix}1&\delta\\0&1\end{bmatrix}\;\begin{bmatrix}1&-\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&-\delta+\delta\\0&1\end{bmatrix}=\begin{bmatrix}1&0\\0&1\end{bmatrix}\)


Matrix Projection

A square matrix \(P\) is called a projection matrix if it is equal to its square \(P^2=P\)

Orthogonal projection

\(P=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\)

\(P\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\Rightarrow P^2\begin{bmatrix}x\\y\\z\end{bmatrix}=P\begin{bmatrix}x\\y\\0\end{bmatrix}=\begin{bmatrix}x\\y\\0\end{bmatrix}\)

\(P^2=P\)

Oblique projection

\(P=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\)

\(P^2=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=\begin{bmatrix}0&0\\\alpha&1\end{bmatrix}=P\)




Orthogonal Matrices
Circulant Matrices
Hankel matrix
Moore-Penrose Pseudoinverse



Eigenvalues and Eigenvectors

Introduction to Eigenvalues

[Ex] In a certain town, 30% of the married women get divorced each year and 20% of the single women get married each year. Suppose there are initially 8,000 married women and 2,000 single women, and the total number of women remains constant.

\(Let\;{\underline w}_0=\begin{bmatrix}8000\\2000\end{bmatrix}\)

\({\underline w}_1=\underbrace{\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}}_A\underbrace{\begin{bmatrix}8000\\2000\end{bmatrix}}_{{\underline w}_0}=\underbrace{\begin{bmatrix}6000\\4000\end{bmatrix}}_{{\underline w}_1}\)

\({\underline w}_2=A{\underline w}_1=\underbrace{\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}}_A\underbrace{\begin{bmatrix}6000\\4000\end{bmatrix}}_{{\underline w}_1}=\begin{bmatrix}5000\\5000\end{bmatrix}\)

\(=A(A{\underline w}_0)=A^2{\underline w}_0\)

After n years, \({\underline w}_n=A^n{\underline w}_0\)

\({\underline w}_{10}=\begin{bmatrix}4004\\5996\end{bmatrix},\cdots,\;{\underline w}_{20}=\begin{bmatrix}4000\\6000\end{bmatrix}\)

\({\underline w}_{100}=\begin{bmatrix}4000\\6000\end{bmatrix}\leftarrow steady\;state\)

\(If\;\underline w=\begin{bmatrix}4000\\6000\end{bmatrix},\;A\;\underline w=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}4000\\6000\end{bmatrix}=\begin{bmatrix}4000\\6000\end{bmatrix}=\underline w\)

\(Suppose\;{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix}\)

\(Then\;A{\underline x}_1=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}2\\3\end{bmatrix}={\underline x}_1\)

\({\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(Then\;A{\underline x}_2=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}-1\\1\end{bmatrix}=\begin{bmatrix}-\frac12\\\frac12\end{bmatrix}=\frac12\begin{bmatrix}-1\\1\end{bmatrix}=\frac12{\underline x}_2\)

\(We\;can\;have\;{\underline w}_0=\begin{bmatrix}8000\\2000\end{bmatrix}=2000\begin{bmatrix}2\\3\end{bmatrix}-4000\begin{bmatrix}-1\\1\end{bmatrix}\)

\(=2000{\underline x}_1-4000{\underline x}_2\)

\(It\;follows\;that\;{\underline w}_1=A{\underline w}_0=A(2000{\underline x}_1-4000{\underline x}_2)\)

\(=2000A{\underline x}_1-4000A{\underline x}_2\)

\(=2000{\underline x}_1-4000(\frac12{\underline x}_2)\)

\({\underline w}_2=A{\underline w}_1=A\lbrack2000{\underline x}_1-4000(\frac12{\underline x}_2)\rbrack\)

\(=2000{\underline x}_1-4000{(\frac12)}^2{\underline x}_2\)

In general, \({\underline w}_n=A^n{\underline w}_0=2000{\underline x}_1-4000{(\frac12)}^n{\underline x}_2\)

\(Let\;n\rightarrow\infty,\;{\underline w}_n\rightarrow2000{\underline x}_1=2000\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}4000\\6000\end{bmatrix}\)
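Both the iteration \({\underline w}_n=A^n{\underline w}_0\) and its limit can be checked numerically; a small sketch assuming NumPy:

```python
import numpy as np

A = np.array([[0.7, 0.2],
              [0.3, 0.8]])
w = np.array([8000.0, 2000.0])

for n in range(100):
    w = A @ w
print(w)  # approaches [4000. 6000.], the steady state

# The steady state is the eigenvector for eigenvalue 1, scaled so that
# the entries sum to the total of 10000 women.
vals, vecs = np.linalg.eig(A)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
print(v / v.sum() * 10000)  # [4000. 6000.]
```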


Def: \(For\;an\;n\times n\;matrix\;A,\;if\;A\underline x=\lambda\underline x\;for\;a\;nonzero\;(n\times1)\;vector\;\underline x\)

\(then\;\lambda\;is\;called\;an\;eigenvalue\;of\;A\)

\(and\;\underline x\;is\;associated\;eigenvector.\)



\(A\underline x=\lambda\underline x\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow A\underline x-\lambda\underline x=\underline0\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow(A-\lambda I)\underline x=\underline0\;for\;a\;nonzero\;\underline x\)

\(\Leftrightarrow N(A-\lambda I)\neq\{\underline0\}\)

\(\Leftrightarrow A-\lambda I\;is\;singular\)

\(\Leftrightarrow det(A-\lambda I)=0\)


[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(det(A-\lambda I)=\begin{vmatrix}0.7-\lambda&0.2\\0.3&0.8-\lambda\end{vmatrix}\)

\(=\lambda^2-1.5\lambda+0.56-0.06\)

\(=\lambda^2-\frac32\lambda+\frac12\)

(In general, \(det(A-\lambda I)\) is a polynomial in \(\lambda\) of the form \(a_n\lambda^n+a_{n-1}\lambda^{n-1}+\cdots+a_1\lambda+a_0\).)

\(=(\lambda-1)(\lambda-\frac12)=0\)

\(\Rightarrow\lambda=1,\;\frac12\)

\((A-\lambda I)\underline x=\underline0\)

\(For\;\lambda=1,\;A-\lambda I=\begin{bmatrix}-0.3&0.2\\0.3&-0.2\end{bmatrix}\)

\(\therefore{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix},\;(or\;any\;multiple\;of\;{\underline x}_1)\)

\(For\;\lambda=\frac12,\;A-\lambda I=\begin{bmatrix}0.2&0.2\\0.3&0.3\end{bmatrix}\)

\({\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(A{\underline x}_1=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}2\\3\end{bmatrix}=\begin{bmatrix}2\\3\end{bmatrix}={\underline x}_1\)

\(A{\underline x}_2=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\begin{bmatrix}-1\\1\end{bmatrix}=\frac12\begin{bmatrix}-1\\1\end{bmatrix}=\frac12{\underline x}_2\)

[Ex] \(A=\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix}\)

\(A^2=A,\;A^T=A,\;A\;is\;a\;projection\;matrix\)

\(det(A-\lambda I)=\begin{vmatrix}\frac12-\lambda&\frac12\\\frac12&\frac12-\lambda\end{vmatrix}\)

\(=\lambda^2-\lambda+\frac14-\frac14\)

\(=\lambda^2-\lambda=0\)

\(\Rightarrow\lambda=1,\;0\)

\(For\;\lambda=1,\;\begin{bmatrix}-\frac12&\frac12\\\frac12&-\frac12\end{bmatrix}{\underline x}_1=\begin{bmatrix}0\\0\end{bmatrix}\)

\({\underline x}_1=\begin{bmatrix}1\\1\end{bmatrix}\)

\(For\;\underbrace{\lambda=0}_{A\;is\;singular},\;\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix}{\underline x}_2=\begin{bmatrix}0\\0\end{bmatrix}\)

\({\underline x}_2=\begin{bmatrix}1\\-1\end{bmatrix}\)

\(A\underline x=\underbrace\lambda_{=0}\underline x=\underline0\)

\(\underbrace{det(A-\lambda I)}_{characteristic\;polynomial}=0\)

\((A-\lambda I)\underline x=\underline0\)

[Ex] \(Q=\begin{bmatrix}0&1\\-1&0\end{bmatrix}\)

\(Q\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}0&1\\-1&0\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}y\\-x\end{bmatrix}\)

\(-90^\circ\;rotation\)

\(det\;(Q-\lambda I)=\begin{vmatrix}-\lambda&1\\-1&-\lambda\end{vmatrix}\)

\(=\lambda^2+1=0\)

\(\Rightarrow\lambda=\pm i\)

No real eigenvalues. Hence there are no real eigenvectors.

\(Q\underline x=\lambda\underline x\)

\(Since\;Q\;is\;a\;rotation\;by\;-90^\circ,\;no\;vector\;Q\underline x\;stays\;in\;the\;same\;direction\;as\;\underline x\)

\(If\;A\underline x=\lambda\underline x,\;then\;A^2\underline x=\lambda A\underline x=\lambda^2\underline x\)

\(Therefore,\;if\;\lambda\;is\;an\;eigenvalue\;of\;A,\;then\;A^2\;has\;an\;eigenvalue\;of\;\lambda^2.\)

\(And\;the\;associated\;eigenvector\;remains\;the\;same.\)

[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(\lambda_1=1,\;\lambda_2=\frac12\)

\(\lambda_1+\lambda_2=\frac32\)

\(\lambda_1\lambda_2=\frac12\)

\(det\;A=0.56-0.06=\frac12\)

\(trace(A)=0.7+0.8=\frac32\)

[Ex] \(A=\begin{bmatrix}\frac12&\frac12\\\frac12&\frac12\end{bmatrix},\;\lambda_1=1,\;\lambda_2=0\)

\(\lambda_1+\lambda_2=1\)

\(\lambda_1\lambda_2=0\)

\(det\;A=0\)

\(trace(A)=\frac12+\frac12=1\)

Claim \(\lambda_1\lambda_2\cdots\lambda_n=det\;A\)

Remark \(det\;A=\pm p_1p_2\cdots p_n\), where the \(p_i\) are the pivots of \(A\)

Proof \(det\;(A-\lambda I)=\begin{vmatrix}a_{11}-\lambda&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}-\lambda&\cdots&\vdots\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&\cdots&\cdots&a_{nn}-\lambda\end{vmatrix}\)

\(det(A-\lambda I)=(\lambda_1-\lambda)(\lambda_2-\lambda)\cdots(\lambda_n-\lambda)\)

\(Putting\;\lambda=0\;gives\;det\;A=\lambda_1\lambda_2\cdots\lambda_n\)

Claim \(\lambda_1+\lambda_2+\cdots+\lambda_n=a_{11}+a_{22}+\cdots+a_{nn}\triangleq trace(A)\)

\(compare\;the\;coefficients\;of\;\lambda^{n-1}\;on\;both\;sides.\)

\(The\;only\;term\;in\;det(A-\lambda I)\;containing\;the\;\lambda^{n-1}\;is\;(a_{11}-\lambda)(a_{22}-\lambda)\cdots(a_{nn}-\lambda)\)

\(We\;can\;have\;{(-1)}^{n-1}(a_{11}+a_{22}+\cdots+a_{nn})=\;{(-1)}^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n)\;which\;gives\;the\;desired\;result.\)

\(\begin{bmatrix}\cdots&\cdots&\cdots\\\cdots&\cdots&\cdots\\\cdots&\cdots&\cdots\end{bmatrix}\Rightarrow\begin{bmatrix}\ddots&&\\&\ddots&\\&&\ddots\end{bmatrix}\)

\(\begin{bmatrix}1&&\\&2&\\&&3\end{bmatrix}^2=\begin{bmatrix}1^2&&\\&2^2&\\&&3^2\end{bmatrix}\)

\(S^{-1}AS=\Lambda\)

\(=\begin{bmatrix}\lambda_1&0&0&0\\0&\lambda_2&0&0\\0&0&\ddots&0\\0&0&0&\lambda_n\end{bmatrix}\)

\(\Rightarrow A=S\Lambda S^{-1}\)

\(A^n=\underbrace{(S\Lambda S^{-1})(S\Lambda S^{-1})\cdots(S\Lambda S^{-1})}_n\)

\(=S\Lambda^nS^{-1}\)

Eigenvalue & Eigenvectors example

There are three viruses in a lab

\(A=\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\)

\(\lambda=-1,\;2,\;5\)

\(\lambda_1=-1,\;\overset\rightharpoonup v=\begin{bmatrix}-11\\7\\1\end{bmatrix}\)

\(\lambda_2=2,\;\overset\rightharpoonup v=\begin{bmatrix}-1\\-1\\2\end{bmatrix}\)

\(\lambda_3=5,\;\overset\rightharpoonup v=\begin{bmatrix}1\\1\\1\end{bmatrix}\)


\(A=\begin{bmatrix}-11&-1&1\\7&-1&1\\1&2&1\end{bmatrix}\begin{bmatrix}-1&0&0\\0&2&0\\0&0&5\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\-\frac19&-\frac29&\frac13\\\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\)


\(\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}2024\\2022\\1996\end{bmatrix}\)

\(\begin{bmatrix}-11&-1&1\\\:\:\:7&-1&1\\\:\:\:1&2&1\end{bmatrix}\begin{bmatrix}\left(-1\right)^4&0&0\\\:\:0&2^4&0\\\:\:0&0&5^4\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\\:\:\:\frac{-1}9&\frac{-2}9&\frac13\\\:\:\:\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\begin{bmatrix}5\\\:3\\\:2\end{bmatrix}=\begin{bmatrix}2024\\2022\\1996\end{bmatrix}\)


\(\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}1&3&1\\2&2&1\\1&1&3\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}406\\408\\394\end{bmatrix}\)

\(\begin{bmatrix}-11&-1&1\\7&-1&1\\1&2&1\end{bmatrix}\begin{bmatrix}\left(-1\right)^3&0&0\\0&2^3&0\\0&0&5^3\end{bmatrix}\begin{bmatrix}-\frac1{18}&\frac1{18}&0\\\frac{-1}9&\frac{-2}9&\frac13\\\frac5{18}&\frac7{18}&\frac13\end{bmatrix}\begin{bmatrix}5\\3\\2\end{bmatrix}=\begin{bmatrix}406\\408\\394\end{bmatrix}\)
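Both computations are easy to reproduce numerically; a sketch assuming NumPy, checking the \(A^4\) case via `numpy.linalg.matrix_power` and via \(S\Lambda^4S^{-1}\):

```python
import numpy as np

A = np.array([[1, 3, 1],
              [2, 2, 1],
              [1, 1, 3]], dtype=float)
x = np.array([5.0, 3.0, 2.0])

S   = np.array([[-11, -1, 1],
                [  7, -1, 1],
                [  1,  2, 1]], dtype=float)
Lam = np.diag([-1.0, 2.0, 5.0])

direct  = np.linalg.matrix_power(A, 4) @ x
via_eig = S @ np.linalg.matrix_power(Lam, 4) @ np.linalg.inv(S) @ x
print(direct)   # [2024. 2022. 1996.]
print(via_eig)  # the same, up to rounding
```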


\(Find\;eigenvalues\;for\:\begin{bmatrix}4&4&2&3&-2\\\:0&1&-2&-2&2\\\:6&12&11&2&-4\\\:9&20&10&10&-6\\\:15&28&14&5&-3\end{bmatrix}:\)

\(\lambda=3\;with\;multiplicity\;of\;2,\;\lambda=5\;with\;multiplicity\;of\;2,\;\lambda=7\)

\(Eigenvectors\;for\;\lambda=3:\begin{bmatrix}4\\-3\\1\\2\\0\end{bmatrix},\begin{bmatrix}-2\\1\\1\\0\\2\end{bmatrix}\)

\(Eigenvectors\;for\;\lambda=5:\begin{bmatrix}-2\\2\\0\\-2\\2\end{bmatrix},\begin{bmatrix}0\\-1\\2\\0\\0\end{bmatrix}\)

\(Eigenvector\;for\;\lambda=7:\begin{bmatrix}1\\0\\0\\3\\3\end{bmatrix}\)



Diagonalizing a Matrix

\(Suppose\;an\;n\times n\;matrix\;A\;has\;n\;independent\;eigenvectors\;{\underline x}_1,{\underline x}_2,\cdots,{\underline x}_n.\)

\(\left\{\begin{array}{l}A{\underline x}_1=\lambda_1{\underline x}_1\\A{\underline x}_2=\lambda_2{\underline x}_2\\\vdots\\A{\underline x}_n=\lambda_n{\underline x}_n\end{array}\right.\)

\(A{\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}}_{n\times n}=\begin{bmatrix}\lambda_1{\underline x}_1&\lambda_2{\underline x}_2&\cdots&\lambda_n{\underline x}_n\end{bmatrix}\)

\(={\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}}_{n\times n}{\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}}_{n\times n}\)

\(=S\Lambda\)

\(\Rightarrow AS=S\Lambda\)

\(\Rightarrow S^{-1}AS=\Lambda\)

\(or\;A=S\Lambda S^{-1}\)

\(Then\;A^2=AA=(S\Lambda\cancel{S^{-1}})(\cancel S\Lambda S^{-1})=S\Lambda^2S^{-1}\)

\(Similarly\;A^k=AA\cdots A=(S\Lambda\cancel{S^{-1}})(\cancel S\Lambda\cancel{S^{-1}})\cdots(\cancel S\Lambda S^{-1})=S\Lambda^kS^{-1}\)


[Ex] \(A=\begin{bmatrix}0.7&0.2\\0.3&0.8\end{bmatrix}\)

\(\lambda_1=1,\;{\underline x}_1=\begin{bmatrix}2\\3\end{bmatrix}\)

\(\lambda_2=\frac12,\;{\underline x}_2=\begin{bmatrix}-1\\1\end{bmatrix}\)

\(S=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\)

\(S^{-1}=\frac15\begin{bmatrix}1&-3\\1&2\end{bmatrix}^T=\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(A^k=S\Lambda^kS^{-1}=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&\frac12\end{bmatrix}^k\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1^k&0\\0&{(\frac12)}^k\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&{(\frac12)}^k\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}\)

\(=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5\cdot{(\frac12)}^k&\frac25\cdot{(\frac12)}^k\end{bmatrix}\)

\(=\begin{bmatrix}\frac25+\frac35{(\frac12)}^k&\frac25-\frac25{(\frac12)}^k\\\frac35-\frac35{(\frac12)}^k&\frac35+\frac25{(\frac12)}^k\end{bmatrix}\)

\(As\;k\rightarrow\infty,\;{(\frac12)}^k\rightarrow0\)

\(A^k=\begin{bmatrix}2&-1\\3&1\end{bmatrix}\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}\frac15&\frac15\\\frac{-3}5&\frac25\end{bmatrix}=\begin{bmatrix}\frac25&\frac25\\\frac35&\frac35\end{bmatrix}\)

\(\begin{bmatrix}\frac25&\frac25\\\frac35&\frac35\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix}=\begin{bmatrix}\frac25(a+b)\\\frac35(a+b)\end{bmatrix}\)

Claim: If \(\lambda_1,\;\lambda_2,\;\cdots,\;\lambda_n\) are all distinct, then \({\underline x}_1,\;{\underline x}_2,\;\dots,\;{\underline x}_n\) are linearly independent.

Proof: Suppose \(c_1{\underline x}_1+c_2{\underline x}_2=\underline0\)

\(\Rightarrow A(c_1{\underline x}_1+c_2{\underline x}_2)=A\underline0=\underline0\)

\(\Rightarrow c_1\lambda_1{\underline x}_1+c_2\lambda_2{\underline x}_2=\underline0\)

\(Multiplying\;c_1{\underline x}_1+c_2{\underline x}_2=\underline0\;by\;\lambda_2\;also\;gives\;c_1\lambda_2{\underline x}_1+c_2\lambda_2{\underline x}_2=\underline0\)

\(\Rightarrow c_1(\lambda_1-\lambda_2){\underline x}_1=\underline0\)

\(since\;\lambda_1\neq\lambda_2\;and\;{\underline x}_1\neq\underline0,\;c_1=0\)

\(similarly,\;c_2=0,\;therefore\;{\underline x}_1\;and\;{\underline x}_2\;are\;independent.\)

\(proof\;extends\;directly\;to\;n\;eigenvectors\)

\(suppose\;c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n=\underline0\)

\(\Rightarrow c_1\lambda_1{\underline x}_1+c_2\lambda_2{\underline x}_2+\cdots+c_n\lambda_n{\underline x}_n=\underline0\)

\(multiplying\;by\;(A-\lambda_nI)\;removes\;{\underline x}_n\;first,\;then\;(A-\lambda_{n-1}I)\;removes\;{\underline x}_{n-1},\;and\;eventually\;only\;{\underline x}_1\;is\;left:\)

\((\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_n)c_1{\underline x}_1=\underline0\)

\(which\;forces\;c_1=0\)

\(similarly,\;every\;c_i=0\)

Remark An \(n\times n\) matrix that has n different eigenvalues (no repeated eigenvalues) must be diagonalizable.

Question: When does \(A^k\) approach the zero matrix as \(k\rightarrow\infty\)?

Answer: When all \(\left|\lambda_i\right|<1\)

\(A^k=S\Lambda^kS^{-1}=S\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}S^{-1}\)


Diagonalizable Matrices

Claim An \(n\times n\) matrix A is diagonalizable iff A has n linearly independent eigenvectors.

Proof \(''\Leftarrow''\;Proved\;previously\)

\(''\Rightarrow''\;If\;A\;is\;diagonalizable,\;then\;there\;exists\;an\;invertible\;matrix\;S\)

\(and\;a\;diagonal\;matrix\;\Lambda\;such\;that\;S^{-1}AS=\Lambda\)

\(S=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\)

\(\Lambda=\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\)

\(Then\;AS=S\Lambda\)

\(\Rightarrow A\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\)

\(\Rightarrow A{\underline x}_i=\lambda_i{\underline x}_i,\;for\;i=1,2,\cdots,n.\)

\(We\;can\;obtain\;that\;\lambda_i\;is\;an\;eigenvalue\;and\;{\underline x}_i\;is\;the\;associated\;eigenvector\;for\;i=1,2,\cdots,n.\)

\(Since\;S\;is\;invertible,\;{\underline x}_1,\;{\underline x}_2,\;\dots,\;{\underline x}_n\;are\;independent\)

Suppose \(\lambda\) is an eigenvalue.

(1) Algebraic Multiplicity (AM) counts the repetitions of \(\lambda\) among the eigenvalues.

Look at all the roots of \(det(A-\lambda I)\)

(2) Geometric Multiplicity (GM) is the dimension of \(\mathcal N(A-\lambda I)\)

\(\mathcal N(A-\lambda I)\) is called the eigenspace corresponding to \(\lambda\)

Claim: For every distinct eigenvalue, \(GM\leq AM\)

Claim: \(suppose\;A\;is\;n\times n\;and\;A\;has\;k\;distinct\;eigenvalues\)

\(A\;is\;diagonalizable\;iff\)

(1) \(\sum_{i=1}^kAM_i=n\)

(2) \(For\;every\;distinct\;eigenvalue\;\lambda_i,\;GM_i=AM_i\)


[Ex] \(A=\begin{bmatrix}2&0&0\\0&4&0\\1&0&2\end{bmatrix},\;B=\begin{bmatrix}2&0&0\\-1&4&0\\-3&6&2\end{bmatrix}\)

\(det(A-\lambda I)=(4-\lambda){(2-\lambda)}^2=det(B-\lambda I)\)

\(\lambda=4,\;2,\;2\)

\(For\;A,\;\lambda_1=4,\;A-\lambda_1I=\begin{bmatrix}-2&0&0\\0&0&0\\1&0&-2\end{bmatrix}\)

\({\underline x}_1=\begin{bmatrix}0\\1\\0\end{bmatrix},\;GM_1=AM_1=1\)

\(\lambda_2=2,\;A-\lambda_2I=\begin{bmatrix}0&0&0\\0&2&0\\1&0&0\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}0\\0\\1\end{bmatrix}\)

\(GM_2=1<AM_2=2\)

\(\therefore A\;is\;not\;diagonalizable\)

\(For\;B,\;\lambda_1=4,\;B-\lambda_1I=\begin{bmatrix}-2&0&0\\-1&0&0\\-3&6&-2\end{bmatrix},\;{\underline x}_1=\begin{bmatrix}0\\1\\3\end{bmatrix}\)

\(GM_1=1=AM_1\)

\(\lambda_2=2,\;B-\lambda_2I=\begin{bmatrix}0&0&0\\-1&2&0\\-3&6&0\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}0\\0\\1\end{bmatrix},\;\begin{bmatrix}2\\1\\0\end{bmatrix}\)

\(GM_2=2=AM_2\)

\(\therefore B\;is\;diagonalizable\)
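Numerically, the difference shows up as the nullity of \(M-2I\); a sketch assuming NumPy, where `geometric_multiplicity` is an illustrative helper:

```python
import numpy as np

A = np.array([[2, 0, 0],
              [0, 4, 0],
              [1, 0, 2]], dtype=float)
B = np.array([[ 2, 0, 0],
              [-1, 4, 0],
              [-3, 6, 2]], dtype=float)

def geometric_multiplicity(M, lam):
    """dim N(M - lam*I) = n - rank(M - lam*I)."""
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n))

print(geometric_multiplicity(A, 2))  # 1 < AM = 2, so A is not diagonalizable
print(geometric_multiplicity(B, 2))  # 2 = AM = 2, so B is diagonalizable
```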


Solving Difference Equation

\(F_{k+2}=F_{k+1}+F_k,\;k\geq0\)

\(F_0=0,\;F_1=1\)

\(F_2=F_1+F_0=1\)

\(F_3=F_2+F_1=1+1=2\)

\(0,\;1,\;1,\;2,\;3,\;5,\;8,\;13,\;\cdots\)

Fibonacci numbers (sequence)


\(We\;want\;to\;find\;F_k\;for\;general\;k.\)

\(Let\;{\underline u}_k=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(The\;relation\;between\;{\underline u}_{k+1}=\begin{bmatrix}F_{k+2}\\F_{k+1}\end{bmatrix}\;and\;{\underline u}_k=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\;can\;be\;found\;as\)

\({\underline u}_{k+1}=\begin{bmatrix}F_{k+2}\\F_{k+1}\end{bmatrix}=\begin{bmatrix}F_{k+1}+F_k\\F_k\end{bmatrix}\)

\(=\underbrace{\begin{bmatrix}1&1\\1&0\end{bmatrix}}_A\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(=A{\underline u}_k\)

\(Then\;{\underline u}_k=A{\underline u}_{k-1}\)

\(=AA{\underline u}_{k-2}\)

\(=A^2{\underline u}_{k-2}\)

\(=\cdots\)

\(=A^k{\underline u}_0\)

\(det(A-\lambda I)=\begin{vmatrix}1-\lambda&1\\1&-\lambda\end{vmatrix}\)

\(=\lambda^2-\lambda-1=0\)

\(\therefore\lambda=\frac{1\pm\sqrt5}2,\;\lambda_1=\frac{1+\sqrt5}2,\;\lambda_2=\frac{1-\sqrt5}2\)

\(For\;\lambda=\lambda_1,\)

\(A-\lambda_1I=\begin{bmatrix}\frac{1-\sqrt5}2&1\\1&-\frac{1+\sqrt5}2\end{bmatrix}\)

\(we\;can\;have\;{\underline x}_1=\begin{bmatrix}\frac{1+\sqrt5}2\\1\end{bmatrix}=\begin{bmatrix}\lambda_1\\1\end{bmatrix}\)

\(For\;\lambda=\lambda_2,\)

\(A-\lambda_2I=\begin{bmatrix}\frac{1+\sqrt5}2&1\\1&-\frac{1-\sqrt5}2\end{bmatrix}\)

\(we\;can\;have\;{\underline x}_2=\begin{bmatrix}\frac{1-\sqrt5}2\\1\end{bmatrix}=\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(Therefore\;A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}\begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix}\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}^{-1}\)

\(1.\;Write\;{\underline u}_0\;as\;a\;linear\;combination\;of\;{\underline x}_1\;and\;{\underline x}_2\)

\({\underline u}_0=\begin{bmatrix}F_1\\F_0\end{bmatrix}=\begin{bmatrix}1\\0\end{bmatrix}=c_1\begin{bmatrix}\lambda_1\\1\end{bmatrix}+c_2\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\({\underline u}_0=S\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(\Rightarrow\begin{bmatrix}c_1\\c_2\end{bmatrix}=S^{-1}{\underline u}_0\)

\(=\begin{bmatrix}\lambda_1&\lambda_2\\1&1\end{bmatrix}^{-1}\begin{bmatrix}1\\0\end{bmatrix}\)

\(=\frac{\begin{bmatrix}1&-\lambda_2\\-1&\lambda_1\end{bmatrix}}{\lambda_1-\lambda_2}\begin{bmatrix}1\\0\end{bmatrix}\)

\(=\frac1{\lambda_1-\lambda_2}\begin{bmatrix}1\\-1\end{bmatrix}\)

\({\underline u}_0=S\begin{bmatrix}c_1\\c_2\end{bmatrix}=\begin{bmatrix}{\underline x}_1&{\underline x}_2\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2\end{bmatrix}\frac1{\lambda_1-\lambda_2}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\therefore{\underline u}_0=\frac1{\lambda_1-\lambda_2}({\underline x}_1-{\underline x}_2)\)

\({\underline u}_k=A^k{\underline u}_0\)

\(=A^k\frac1{\lambda_1-\lambda_2}({\underline x}_1-{\underline x}_2)\)

\(=\frac{\lambda_1^k}{\lambda_1-\lambda_2}{\underline x}_1-\frac{\lambda_2^k}{\lambda_1-\lambda_2}{\underline x}_2\)

\(\therefore{\underline u}_k=\frac{\lambda_1^k}{\lambda_1-\lambda_2}\begin{bmatrix}\lambda_1\\1\end{bmatrix}-\frac{\lambda_2^k}{\lambda_1-\lambda_2}\begin{bmatrix}\lambda_2\\1\end{bmatrix}\)

\(=\begin{bmatrix}F_{k+1}\\F_k\end{bmatrix}\)

\(\Rightarrow F_k=\frac1{\lambda_1-\lambda_2}(\lambda_1^k-\lambda_2^k)\)

\[F_k=\frac1{\sqrt5}\lbrack{(\frac{1+\sqrt5}2)}^k-{(\frac{1-\sqrt5}2)}^k\rbrack,\;for\;k\geq0,\]

\(F_{100}=\frac1{\sqrt5}\lbrack{(\frac{1+\sqrt5}2)}^{100}-{(\frac{1-\sqrt5}2)}^{100}\rbrack\)

\(\approx3.54\cdot10^{20}\)

\(For\;large\;k,\;F_k\;is\;the\;nearest\;integer\;to\;\frac1{\sqrt5}{(\frac{1+\sqrt5}2)}^k\)

\(\frac{F_{k+1}}{F_k}\;is\;very\;close\;to\;\frac{1+\sqrt5}2\approx1.618,\;when\;k\;is\;large\)

golden ratio or golden mean
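A short check, assuming NumPy, that the closed form reproduces the recurrence and that the ratio of consecutive terms approaches the golden ratio:

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2   # lambda_1
psi = (1 - np.sqrt(5)) / 2   # lambda_2

def fib_closed(k):
    return (phi**k - psi**k) / np.sqrt(5)

# Build the first 20 Fibonacci numbers from the recurrence F_{k+2} = F_{k+1} + F_k
F = [0, 1]
for _ in range(18):
    F.append(F[-1] + F[-2])

print([int(round(fib_closed(k))) for k in range(20)] == F)  # True
print(F[11] / F[10], phi)  # consecutive ratio ~ 1.618, the golden ratio
```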



Fast matrix multiplication algorithm for large matrices



Strassen algorithm
Matrix multiplication tensor and algorithms


Figure: a, Tensor \({\mathcal T}_2\) representing the multiplication of two 2 × 2 matrices. Tensor entries equal to 1 are depicted in purple, and 0 entries are semi-transparent. The tensor specifies which entries from the input matrices to read, and where to write the result. For example, as c1 = a1b1 + a2b3, tensor entries located at (a1, b1, c1) and (a2, b3, c1) are set to 1. b, Strassen's algorithm for multiplying 2 × 2 matrices using 7 multiplications. c, Strassen's algorithm in tensor factor representation. The stacked factors U, V and W (green, purple and yellow, respectively) provide a rank-7 decomposition of \({\mathcal T}_2\). The correspondence between arithmetic operations (b) and factors (c) is shown by using the aforementioned colours.






Strassen's algorithm multiplies two 2 × 2 block matrices using only 7 multiplications (one for each \(M_{k}\)) instead of 8.

The same trick appears when multiplying two two-digit numbers: the cross term \(ad+bc\) can be obtained from a single extra multiplication.

\(({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}{\color[rgb]{0.0, 0.0, 1.0}d})\Rightarrow({\color[rgb]{0.68, 0.46, 0.12}a}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}d})\)

\(={\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}\cdot10^2+\underbrace{({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}d}+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.68, 0.46, 0.12}c})}_{=({\color[rgb]{0.68, 0.46, 0.12}a}+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}+{\color[rgb]{0.0, 0.0, 1.0}d})-{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}-{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}}\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

\(={\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}\cdot10^2+({\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.0, 0.0, 1.0}d}+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.68, 0.46, 0.12}c})\cdot10+{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

\(({\color[rgb]{0.68, 0.46, 0.12}a}+{\color[rgb]{0.0, 0.0, 1.0}b})({\color[rgb]{0.68, 0.46, 0.12}c}+{\color[rgb]{0.0, 0.0, 1.0}d})-{\color[rgb]{0.68, 0.46, 0.12}a}{\color[rgb]{0.68, 0.46, 0.12}c}-{\color[rgb]{0.0, 0.0, 1.0}b}{\color[rgb]{0.0, 0.0, 1.0}d}\)

Normal Algorithm: \(n^{3}\)

Volker Strassen (1969): \(n^{2.807}\)

Arnold Schönhage (1981): \(n^{2.522}\)

Alman, Williams (2020): \(n^{2.3728596}\)
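A compact sketch of Strassen's recursion for square matrices whose side is a power of two, assuming NumPy; it falls back to ordinary multiplication below a cutoff and is meant to be illustrative rather than fast.

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Multiply two n x n matrices (n a power of two) with 7 recursive products."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]

    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)

    C = np.empty((n, n))
    C[:m, :m] = M1 + M4 - M5 + M7
    C[:m, m:] = M3 + M5
    C[m:, :m] = M2 + M4
    C[m:, m:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
print(np.allclose(strassen(A, B), A @ B))  # True
```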

Solving Difference Equations

In general, \(G_{k+n}=a_1G_{k+n-1}+a_2G_{k+n-2}+\cdots+a_nG_k\)

nth-order linear homogeneous difference equation with constant coefficients and initial conditions

\(G_{n-1},\;G_{n-2},\;\cdots,\;G_1,\;G_0\)

\(S_n=1^2+2^2+\cdots+n^2\)

\(S_n-S_{n-1}=n^2,\;S_1=1\)

\(Let\;{\underline u}_k=\begin{bmatrix}G_{k+n-1}\\\vdots\\G_{k+1}\\G_k\end{bmatrix}\)

\(Then\;{\underline u}_{k+1}=\begin{bmatrix}G_{k+n}\\\vdots\\G_{k+2}\\G_{k+1}\end{bmatrix}=\begin{bmatrix}a_1&a_2&\cdots&a_{n-1}&a_n\\1&0&\cdots&0&0\\0&1&\ddots&\vdots&0\\\vdots&\ddots&\ddots&0&\vdots\\0&\cdots&0&1&0\end{bmatrix}\begin{bmatrix}G_{k+n-1}\\\vdots\\G_{k+1}\\G_k\end{bmatrix}\)

\(=A{\underline u}_k,\;(A\;is\;n\times n)\)

[Ex] \(G_{k+3}=G_{k+2}+G_{k+1}+2G_k\)

\(Then\;{\underline u}_{k+1}=\begin{bmatrix}G_{k+3}\\G_{k+2}\\G_{k+1}\end{bmatrix}=\begin{bmatrix}1&1&2\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}G_{k+2}\\G_{k+1}\\G_k\end{bmatrix}\)

\(1.\;Diagonalize\;A\;(if\;A\;is\;diagonalizable)\)

\(A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}^{-1}\)

\(2.\;write\;{\underline u}_0=\begin{bmatrix}G_{n-1}\\G_{n-2}\\\vdots\\G_0\end{bmatrix}=c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n,\;where\;\underline c=S^{-1}{\underline u}_0\)

\(c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n=S\underline c\)

\(3.\;{\underline u}_k=A^k{\underline u}_0\)

\(=A^k(c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n)\)

\(=c_1\lambda_1^k{\underline x}_1+c_2\lambda_2^k{\underline x}_2+\cdots+c_n\lambda_n^k{\underline x}_n\)

\(or\;{\underline u}_k=A^k{\underline u}_0\)

\(=(S\Lambda^kS^{-1}){\underline u}_0\)

\(=S\Lambda^k\underline c\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(4.\;G_k\;can\;be\;found\;as\;the\;nth\;component\;of\;{\underline u}_k\)
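A sketch of these four steps, assuming NumPy, applied to the example \(G_{k+3}=G_{k+2}+G_{k+1}+2G_k\) with some hypothetical initial values \(G_2,G_1,G_0\):

```python
import numpy as np

# Companion matrix of G_{k+3} = G_{k+2} + G_{k+1} + 2 G_k
A = np.array([[1.0, 1.0, 2.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

u0 = np.array([2.0, 1.0, 0.0])   # hypothetical initial conditions [G_2, G_1, G_0]

# Step 1: diagonalize A.  Step 2: c = S^{-1} u0.  Step 3: u_k = S Lambda^k c.
lam, S = np.linalg.eig(A)
c = np.linalg.solve(S, u0)

k = 10
u_k = (S * lam**k) @ c           # same as S @ diag(lam**k) @ c
print(u_k.real)                  # [G_{k+2}, G_{k+1}, G_k]

# Step 4 check: direct iteration gives the same vector.
u = u0.copy()
for _ in range(k):
    u = A @ u
print(u)
```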


\({\underline u}_{k+1}=A{\underline u}_k\;with\;{\underline u}_0\)

1. \(A=S\Lambda S^{-1}\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n\end{bmatrix}\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}^{-1}\)

2. \({\underline u}_0=c_1{\underline x}_1+c_2{\underline x}_2+\cdots+c_n{\underline x}_n\)

\({\underline u}_0=S\underline c\;\Rightarrow\;\underline c=S^{-1}{\underline u}_0\)

3. \({\underline u}_k=A^k{\underline u}_0\)

\(=S\Lambda^kS^{-1}{\underline u}_0\)

\(=S\Lambda^k\underline c\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k&0&\cdots&0\\0&\lambda_2^k&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_n^k\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(=c_1\lambda_1^k{\underline x}_1+c_2\lambda_2^k{\underline x}_2+\cdots+c_n\lambda_n^k{\underline x}_n\)

Solving Differential Equations

\(\begin{bmatrix}\frac{dy}{dt}\\\frac{dz}{dt}\end{bmatrix}=\frac{d\underline u}{dt}=A\underline u\)

\(=\begin{bmatrix}-2&1\\1&-2\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}\)

\(For\;t=0,\;{\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}\)

\(\left\{\begin{array}{l}\frac{dy}{dt}=-2y+z\\\frac{dz}{dt}=y-2z\end{array}\right.,\;y_0=2,\;z_0=0\)

\(suppose\;\lambda_1,\;\lambda_2\;are\;eigenvalues\;of\;A\;and\;{\underline x}_1,\;{\underline x}_2\;are\;the\;associated\;eigenvectors.\)

\(Then\;\frac d{dt}(e^{\lambda_1t}{\underline x}_1)=\lambda_1e^{\lambda_1t}{\underline x}_1=e^{\lambda_1t}(\lambda_1{\underline x}_1)\)

\(=e^{\lambda_1t}A{\underline x}_1\)

\(=A(e^{\lambda_1t}{\underline x}_1)\)

\(and,\;similarly,\;\frac d{dt}(e^{\lambda_2t}{\underline x}_2)=A(e^{\lambda_2t}{\underline x}_2)\)

\(Both\;e^{\lambda_1t}{\underline x}_1\;and\;e^{\lambda_2t}{\underline x}_2\;are\;solutions\;to\;\frac{d\underline u}{dt}=A\underline u\)

\(\Rightarrow\underline u=c_1e^{\lambda_1t}{\underline x}_1+c_2e^{\lambda_2t}{\underline x}_2\)


\(det(A-\lambda I)=\begin{vmatrix}-2-\lambda&1\\1&-2-\lambda\end{vmatrix}=0\)

\(\Rightarrow\lambda^2+4\lambda+3=0\)

\(\Rightarrow(\lambda+1)(\lambda+3)=0\)

\(\Rightarrow\lambda=-1,-3\)

\(For\;\lambda_1=-1,\;A-\lambda_1I=\begin{bmatrix}-1&1\\1&-1\end{bmatrix},\;{\underline x}_1=\begin{bmatrix}1\\1\end{bmatrix}\)

\(For\;\lambda_2=-3,\;A-\lambda_2I=\begin{bmatrix}1&1\\1&1\end{bmatrix},\;{\underline x}_2=\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\therefore\underline u=c_1e^{-t}\begin{bmatrix}1\\1\end{bmatrix}+c_2e^{-3t}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(=\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(If\;t=0,\;{\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}=c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}\)

\(\Rightarrow c_1=c_2=1\)

\(\therefore\underline u=e^{-t}\begin{bmatrix}1\\1\end{bmatrix}+e^{-3t}\begin{bmatrix}1\\-1\end{bmatrix}\)

\(=\begin{bmatrix}e^{-t}+e^{-3t}\\e^{-t}-e^{-3t}\end{bmatrix}\)

\({\underline u}_0=\begin{bmatrix}2\\0\end{bmatrix}=c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\underbrace{\begin{bmatrix}1&1\\1&-1\end{bmatrix}}_{=S}\begin{bmatrix}c_1\\c_2\end{bmatrix}\)

\(\underline c=S^{-1}{\underline u}_0\)

\(\underline u=\begin{bmatrix}e^{-t}+e^{-3t}\\e^{-t}-e^{-3t}\end{bmatrix}\;(=S\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}\underline c)\)

\(=S\begin{bmatrix}e^{-t}&0\\0&e^{-3t}\end{bmatrix}S^{-1}{\underline u}_0\)
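The closed-form solution can be compared with the matrix exponential; a sketch assuming NumPy and SciPy's `scipy.linalg.expm`:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-2.0,  1.0],
              [ 1.0, -2.0]])
u0 = np.array([2.0, 0.0])

def u_closed(t):
    # u(t) = e^{-t} (1, 1) + e^{-3t} (1, -1)
    return np.exp(-t) * np.array([1.0, 1.0]) + np.exp(-3 * t) * np.array([1.0, -1.0])

for t in [0.0, 0.5, 2.0]:
    print(t, expm(A * t) @ u0, u_closed(t))  # the two results agree
```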


Recall \(e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots\)

\(For\;an\;n\times n\;matrix\;A,\;define\)

\(e^{At}\triangleq I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\)

\(Then\;\frac d{dt}e^{At}=A+A^2t+\frac1{2!}A^3t^2+\frac1{3!}A^4t^3+\cdots\)

\(=A\lbrack I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\rbrack\)

\(=Ae^{At}\)

It can also be shown that \((e^{As})(e^{At})=e^{A(s+t)}\)

\((e^{At})(e^{-At})=I\)

\(e^{A0}=I+A\cdot0+\frac{{(A\cdot0)}^2}{2!}+\frac{{(A\cdot0)}^3}{3!}+\cdots\)

\(=I\)

\(Suppose\;\underline u=e^{At}\underline c\)

\(Then\;\frac{d\underline u}{dt}=A(e^{At}\underline c)=A\underline u\)

\(Hence\;\underline u=e^{At}\underline c\;solves\;\frac{d\underline u}{dt}=A\underline u\)

\(For\;t=0,\;\underline u={\underline u}_0=I\underline c=\underline c\)

\(\therefore\;\underline u=e^{At}{\underline u}_0\)

\(\frac{d\underline u}{dt}=A\underline u\;with\;{\underline u}_0\;when\;t=0\)

\(\therefore\;\underline u=e^{At}{\underline u}_0\)

\(If\;A\;is\;diagonalizable,\;A=S\Lambda S^{-1}\)

\(A^k=S\Lambda^kS^{-1}\)

\(e^{At}=I+At+\frac{{(At)}^2}{2!}+\frac{{(At)}^3}{3!}+\cdots\)

\(=I+S\Lambda S^{-1}t+\frac{S\Lambda^2S^{-1}}{2!}t^2+\frac{S\Lambda^3S^{-1}}{3!}t^3+\cdots\)

\(=S\lbrack I+\Lambda t+\frac{\Lambda^2t^2}{2!}+\frac{\Lambda^3t^3}{3!}+\cdots\rbrack S^{-1}\)

\(=Se^{\Lambda t}S^{-1}\)

Therefore, \(\underline u=e^{At}{\underline u}_0=Se^{\Lambda t}S^{-1}{\underline u}_0\)

We can have \(e^{\Lambda t}=I+\Lambda t+\frac{{(\Lambda t)}^2}{2!}+\frac{{(\Lambda t)}^3}{3!}+\cdots\)

\(=\begin{bmatrix}1&0&\cdots&0\\0&1&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&1\end{bmatrix}+\begin{bmatrix}\lambda_1t&0&\cdots&0\\0&\lambda_2t&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\lambda_nt\end{bmatrix}\)

\(+\begin{bmatrix}\frac{\lambda_1^2t^2}{2!}&0&\cdots&0\\0&\frac{\lambda_2^2t^2}{2!}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\frac{\lambda_n^2t^2}{2!}\end{bmatrix}+\begin{bmatrix}\frac{\lambda_1^3t^3}{3!}&0&\cdots&0\\0&\frac{\lambda_2^3t^3}{3!}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&\frac{\lambda_n^3t^3}{3!}\end{bmatrix}+\cdots\)

\(=\begin{bmatrix}e^{\lambda_1t}&0&\cdots&0\\0&e^{\lambda_2t}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&e^{\lambda_nt}\end{bmatrix}\)

\(\therefore\underline u=Se^{\Lambda t}S^{-1}{\underline u}_0\)

\(=\begin{bmatrix}{\underline x}_1&{\underline x}_2&\cdots&{\underline x}_n\end{bmatrix}\begin{bmatrix}e^{\lambda_1t}&0&\cdots&0\\0&e^{\lambda_2t}&\ddots&\vdots\\\vdots&\ddots&\ddots&0\\0&\cdots&0&e^{\lambda_nt}\end{bmatrix}\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\)

\(=c_1e^{\lambda_1t}{\underline x}_1+c_2e^{\lambda_2t}{\underline x}_2+\cdots+c_ne^{\lambda_nt}{\underline x}_n\)

\(where\;\underline c=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}=S^{-1}{\underline u}_0\)

\(\frac{d\underline u}{dt}=A\underline u\;with\;{\underline u}_0\)


[Ex] \(y'''-3y''+2y'=0\)

Introduce \(v=y'\;and\;w=y''\)

\(\underline u=\begin{bmatrix}w\\v\\y\end{bmatrix}\)

\(w'=y'''=3y''-2y'=3w-2v\)

\(v'=y''=w\)

\(y'=v\)

\(\Rightarrow\frac{d\underline u}{dt}=\begin{bmatrix}w'\\v'\\y'\end{bmatrix}=\begin{bmatrix}3&-2&0\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}w\\v\\y\end{bmatrix}\)

Stability of Differential Equations

\(If\;\lambda=a+ib,\;where\;i=\sqrt{-1}\)

\(e^{\lambda t}=e^{at}e^{ibt}\)

\(=e^{at}(\cos(bt)+i\sin(bt))\)

\(and\;\left|e^{\lambda t}\right|=e^{at}\)

The differential equation \(\frac{d\underline u}{dt}=A\underline u\) is

\(stable,\;with\;e^{At}\rightarrow0\;(as\;t\rightarrow\infty),\;whenever\;all\;\mathcal R(\lambda_i)<0\)

\(neutrally\;stable\;when\;all\;\mathcal R(\lambda_i)\leq0\)

\(and\;some\;\mathcal R(\lambda_i)=0\)

\(unstable,\;with\;e^{At}\;unbounded,\;if\;any\;eigenvalue\;has\;\mathcal R(\lambda_i)>0\)

stochastic matrix

A stochastic matrix is a square matrix whose entries are non-negative and represent probabilities. In a row-stochastic matrix each row sums to 1; in a column-stochastic matrix each column sums to 1. (The first example below is column-stochastic, the second row-stochastic.)

\(P=\begin{bmatrix}P_{11}&P_{12}&\dots&P_{1j}&\dots&P_{1n}\\P_{21}&P_{22}&\dots&P_{2j}&\dots&P_{2n}\\\vdots&\vdots&\ddots&\vdots&\ddots&\vdots\\P_{i1}&P_{i2}&\dots&P_{ij}&\dots&P_{in}\\\vdots&\vdots&\ddots&\vdots&\ddots&\vdots\\P_{n1}&P_{n2}&\dots&P_{nj}&\dots&P_{nn}\end{bmatrix}\)

\(\sum_{j=1}^nP_{i,j}=1\)



\(A=\begin{bmatrix}0.6&0.2&0.4\\0.3&0.3&0.1\\0.1&0.5&0.5\end{bmatrix}\)

\(v=\begin{bmatrix}\frac{15}{11}\\\frac8{11}\\1\end{bmatrix},\;\lambda_1=1\)

\(v=\begin{bmatrix}\frac{-1+i}2\\\frac{-1-i}2\\1\end{bmatrix},\;\lambda_2=\frac{1-i}5\)

\(v=\begin{bmatrix}\frac{-1-i}2\\\frac{-1+i}2\\1\end{bmatrix},\;\lambda_3=\frac{1+i}5\)


\(A=\begin{bmatrix}0.2&0.5&0.3\\0.6&0.1&0.3\\0.2&0.4&0.4\end{bmatrix}\)

\(v=\begin{bmatrix}1\\1\\1\end{bmatrix},\;\lambda_1=1\)

\(v=\begin{bmatrix}2\\-3\\1\end{bmatrix},\;\lambda_2=-0.4\)

\(v=\begin{bmatrix}-0.5\\-0.5\\1\end{bmatrix},\;\lambda_3=0.1\)
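In both cases the eigenvalue \(\lambda=1\) carries the long-run behaviour. A sketch, assuming NumPy, that extracts the \(\lambda=1\) eigenvector of the first (column-stochastic) example and normalizes it into a stationary probability vector:

```python
import numpy as np

A = np.array([[0.6, 0.2, 0.4],
              [0.3, 0.3, 0.1],
              [0.1, 0.5, 0.5]])   # each column sums to 1

vals, vecs = np.linalg.eig(A)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
pi = v / v.sum()                   # normalize into a probability vector
print(pi)                          # approx [0.441, 0.235, 0.324] = (15, 8, 11)/34
print(np.allclose(A @ pi, pi))     # True: A pi = pi
```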




PageRank example: four web pages \(A,B,C,D\) link to one another, and each page passes its importance equally to the pages it links to.

  • Page \(\color{Red}A\) has \(3\) links, so it passes \(\frac 13\) of its importance to pages \(B,C,D\).
  • Page \(\color{blue}B\) has \(2\) links, so it passes \(\frac 12\) of its importance to pages \(C,D\).
  • Page \(\color{Green}C\) has one link, so it passes all of its importance to page \(A\).
  • Page \(\color{Purple}D\) has \(2\) links, so it passes \(\frac 12\) of its importance to pages \(A,C\).

In terms of matrices, if \(v = (a,b,c,d)\) is the vector containing the ranks \(a,b,c,d\) of the pages \(A,B,C,D\text{,}\) then

\[\left(\begin{array}{cccc}\color{Red}{0}&\color{blue}{0}&\color{Green}{1}&\color{Purple}{\frac{1}{2}} \\ \color{Red}{\frac{1}{3}}&\color{blue}{0}&\color{Green}{0}&\color{Purple}{0} \\ \color{Red}{\frac{1}{3}}&\color{blue}{\frac{1}{2}}&\color{Green}{0}&\color{Purple}{\frac{1}{2}} \\ \color{Red}{\frac{1}{3}}&\color{blue}{\frac{1}{2}}&\color{Green}{0}&\color{Purple}{0}\end{array}\right)\left(\begin{array}{c}a\\b\\c\\d\end{array}\right)=\left(\begin{array}{c}c+\frac{1}{2}d\\ \frac{1}{3}a\\ \frac{1}{3}a+\frac{1}{2}b+\frac{1}{2}d\\ \frac{1}{3}a+\frac{1}{2}b\end{array}\right)=\left(\begin{array}{c}a\\b\\c\\d\end{array}\right).\]

The matrix on the left is the importance matrix, and the final equality expresses the importance rule.
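The ranks are therefore the entries of the eigenvector of the importance matrix for eigenvalue 1, normalized to sum to 1; a sketch assuming NumPy:

```python
import numpy as np

# Importance matrix for pages A, B, C, D (column j lists how page j
# distributes its importance).
M = np.array([[0,   0,   1, 1/2],
              [1/3, 0,   0, 0  ],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0  ]])

vals, vecs = np.linalg.eig(M)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
ranks = v / v.sum()
print(dict(zip("ABCD", np.round(ranks, 3))))
# approx A: 0.387, B: 0.129, C: 0.290, D: 0.194
```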

Example: Cat and mouse



Quaternion spin



