10.018 Modelling Space and Systems - Linear Algebra Notes (Weeks 8-13)

Topic 1: Systems of Linear Equations

Augmented Matrices and Elementary Row Operations

A system of linear equations can be represented compactly as an augmented matrix \([A|\mathbf{b}]\), where \(A\) is the coefficient matrix and \(\mathbf{b}\) is the right-hand side vector.

Elementary Row Operations (EROs) are the three operations that preserve the solution set:

Swap: Interchange two rows \(R_i \leftrightarrow R_j\)
Scale: Multiply a row by a nonzero constant \(kR_i \to R_i\), where \(k \neq 0\)
Add: Add a multiple of one row to another \(R_i + kR_j \to R_i\)

Row Echelon Form (REF) and Reduced Row Echelon Form (RREF)

A matrix is in Row Echelon Form (REF) if:

All zero rows are at the bottom
The leading (first nonzero) entry of each nonzero row is to the right of the leading entry of the row above it
All entries below a leading entry are zero

A matrix is in Reduced Row Echelon Form (RREF) if additionally:

Each leading entry equals 1 (called a pivot)
Each pivot is the only nonzero entry in its column

Gaussian and Gauss-Jordan Elimination

Gaussian Elimination: Use EROs to reduce the augmented matrix to REF, then use back-substitution to find the solution.

Gauss-Jordan Elimination: Continue reducing from REF to RREF. The solution can be read directly without back-substitution.
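As a rough illustration of Gauss-Jordan elimination, the Python sketch below reduces a matrix to RREF using exact fractions. The function name `rref` and the list-of-rows representation are choices made for this example only. It clears entries above and below each pivot in one pass instead of stopping at REF first.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        r = next((i for i in range(pivot_row, n_rows) if m[i][col] != 0), None)
        if r is None:
            continue  # no pivot in this column
        m[pivot_row], m[r] = m[r], m[pivot_row]          # Swap
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]      # Scale so the pivot is 1
        for i in range(n_rows):                           # Add multiples of the pivot row
            if i != pivot_row and m[i][col] != 0:
                k = m[i][col]
                m[i] = [a - k * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return m

# Augmented matrix for x + y = 3, x - y = 1.
reduced = rref([[1, 1, 3], [1, -1, 1]])
print(reduced)
```

In the reduced matrix the last column can be read off directly as \(x = 2\), \(y = 1\).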

Classification of Solutions

After row reduction, a system is:

Inconsistent (no solution): if a row of the form \([0 \; 0 \; \cdots \; 0 \;|\; c]\) with \(c \neq 0\) appears
Consistent with a unique solution: if the system is consistent and every column except the last (the right-hand side) has a pivot
Consistent with infinitely many solutions: if consistent and there are free variables (columns without pivots)
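The three cases above can be sketched in Python as follows. The function `classify` is a hypothetical helper and assumes its input is an augmented matrix that is already in RREF; it does not perform the row reduction itself.

```python
def classify(rref_aug):
    """Classify a linear system from its augmented matrix, assumed to be in RREF."""
    n_vars = len(rref_aug[0]) - 1
    pivots = 0
    for row in rref_aug:
        coeffs, rhs = row[:-1], row[-1]
        if all(c == 0 for c in coeffs):
            if rhs != 0:
                return "no solution"      # row of the form [0 ... 0 | c] with c != 0
            continue                       # zero row: carries no information
        pivots += 1                        # each remaining nonzero row holds one pivot
    if pivots == n_vars:
        return "unique solution"
    return "infinitely many solutions"     # consistent, with free variables

print(classify([[1, 0, 2], [0, 1, 1]]))
print(classify([[1, 2, 0, 1], [0, 0, 1, 3]]))
print(classify([[1, 0, 0], [0, 0, 1]]))
```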

Parametric Systems

When the system contains a parameter (e.g., \(a\)), row reduce and examine the pivots to classify solutions based on the parameter value.

Method for parametric systems: Row reduce the augmented matrix. Examine the last meaningful row, which typically takes the form \([0 \; 0 \; \cdots \; f(a) \;|\; g(a)]\):

No solution: \(f(a) = 0\) and \(g(a) \neq 0\)
Unique solution: \(f(a) \neq 0\)
Infinitely many solutions: \(f(a) = 0\) and \(g(a) = 0\)

Homogeneous Systems

A homogeneous system \(A\mathbf{x} = \mathbf{0}\) is always consistent (since \(\mathbf{x} = \mathbf{0}\) is always a solution, called the trivial solution).

If a homogeneous system has more unknowns than equations (\(n > m\)), then it has infinitely many solutions: there are at most \(m\) pivots, so at least \(n - m\) variables are free.

General Solution Structure

The general solution of \(A\mathbf{x} = \mathbf{b}\) can be written as: \[ \mathbf{x} = \mathbf{x}_p + \mathbf{x}_h \] where \(\mathbf{x}_p\) is any particular solution of \(A\mathbf{x} = \mathbf{b}\), and \(\mathbf{x}_h\) is the general solution of the homogeneous system \(A\mathbf{x} = \mathbf{0}\).

Parametric system: For what values of \(a\) does the following system have no solution, a unique solution, or infinitely many solutions? \[ x + y + z = -1 \] \[ x + 2y + az = 2a \] \[ x + ay + 2z = -2 \] Solution: Form the augmented matrix and row reduce: \[ \begin{bmatrix} 1 & 1 & 1 & | & -1 \\ 1 & 2 & a & | & 2a \\ 1 & a & 2 & | & -2 \end{bmatrix} \] \(R_2 - R_1 \to R_2\), \(R_3 - R_1 \to R_3\): \[ \begin{bmatrix} 1 & 1 & 1 & | & -1 \\ 0 & 1 & a-1 & | & 2a+1 \\ 0 & a-1 & 1 & | & -1 \end{bmatrix} \] \(R_3 - (a-1)R_2 \to R_3\): \[ \begin{bmatrix} 1 & 1 & 1 & | & -1 \\ 0 & 1 & a-1 & | & 2a+1 \\ 0 & 0 & 1-(a-1)^2 & | & -1-(a-1)(2a+1) \end{bmatrix} \] Simplify the last row: the pivot entry is \(1 - (a-1)^2 = -(a^2 - 2a) = -a(a-2)\) and the RHS is \(-1 - (2a^2 - a - 1) = -2a^2 + a = -a(2a-1)\). Analysis:

If \(a \neq 0\) and \(a \neq 2\): pivot \(\neq 0\), so unique solution
If \(a = 0\): pivot \(= 0\), RHS \(= -0(2(0)-1) = 0\), so infinitely many solutions
If \(a = 2\): pivot \(= 0\), RHS \(= -2(2(2)-1) = -2(3) = -6 \neq 0\), so no solution
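The case analysis for this example can be checked numerically with a short Python sketch. The function name is made up for illustration; it evaluates the unsimplified last-row entries, confirms they match the factorised forms, and then applies the three rules.

```python
def classify_by_parameter(a):
    """Classify the example system for a given value of the parameter a."""
    pivot = 1 - (a - 1) ** 2               # f(a): last-row pivot entry
    rhs = -1 - (a - 1) * (2 * a + 1)       # g(a): last-row right-hand side
    # The factorised forms used in the analysis should agree with these.
    assert pivot == -a * (a - 2)
    assert rhs == -a * (2 * a - 1)
    if pivot != 0:
        return "unique solution"
    return "infinitely many solutions" if rhs == 0 else "no solution"

results = {a: classify_by_parameter(a) for a in (-1, 0, 1, 2, 3)}
print(results)
```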

Topic 2: Matrix Operations and Inverses

Matrix Arithmetic

For matrices of compatible dimensions:

Addition: \((A+B)_{ij} = a_{ij} + b_{ij}\) (same dimensions required)
Scalar multiplication: \((cA)_{ij} = c \cdot a_{ij}\)
Matrix multiplication: \((AB)_{ij} = \sum_{k} a_{ik} b_{kj}\) (requires cols(A) = rows(B))
Transpose: \((A^T)_{ij} = a_{ji}\)

Key properties: In general \(AB \neq BA\). Also \((AB)^T = B^T A^T\) (order reverses).
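A small Python sketch can make both properties concrete. The helpers `matmul` and `transpose` are written here for illustration and the two matrices are arbitrary choices.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("cols(A) must equal rows(B)")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)   # [[2, 1], [4, 3]]: columns of A swapped
BA = matmul(B, A)   # [[3, 4], [1, 2]]: rows of A swapped
print(AB != BA)
print(transpose(AB) == matmul(transpose(B), transpose(A)))
```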

Symmetric Matrices

A square matrix \(A\) is symmetric if \(A = A^T\).

If \(A, B\) are symmetric, then \(A + B\) is symmetric.
If \(A\) is invertible and symmetric, then \(A^{-1}\) is also symmetric.

Proof of second claim: \((A^{-1})^T = (A^T)^{-1} = A^{-1}\) since \(A^T = A\).

Matrix Inverse

A square matrix \(A\) is invertible if there exists a matrix \(A^{-1}\) such that \(AA^{-1} = A^{-1}A = I\). The inverse is unique.

Properties of inverses:

\((A^{-1})^{-1} = A\)
\((AB)^{-1} = B^{-1}A^{-1}\) (order reverses!)
\((A^T)^{-1} = (A^{-1})^T\)
\((cA)^{-1} = \frac{1}{c}A^{-1}\) for any scalar \(c \neq 0\)

2x2 Inverse Formula

For \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), if \(ad - bc \neq 0\): \[ A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]
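The formula translates almost directly into code. The sketch below uses Python's `fractions.Fraction` for exact arithmetic; the function name `inverse_2x2` is an illustrative choice.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Invert a 2x2 matrix [[a, b], [c, d]] using the ad - bc formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible (ad - bc = 0)")
    return [[d / det, -b / det], [-c / det, a / det]]

inv = inverse_2x2([[1, 1], [1, -1]])
print(inv)   # entries 1/2, 1/2, 1/2, -1/2
```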

Solving Matrix Equations

To solve equations like \(AXB = C\) for \(X\):

Left-multiply by \(A^{-1}\): \(XB = A^{-1}C\)
Right-multiply by \(B^{-1}\): \(X = A^{-1}CB^{-1}\)

Be careful with order! You cannot write \(X = A^{-1}B^{-1}C\) because matrix multiplication is not commutative.
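To see why the order matters, the following Python sketch solves \(AXB = C\) for hand-picked 2x2 matrices (chosen for this example, not taken from the course) and compares the correct product with the incorrectly ordered one.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 2], [0, 1]]
C = [[4, 11], [3, 9]]

X = matmul(matmul(inverse_2x2(A), C), inverse_2x2(B))       # A^{-1} C B^{-1}
wrong = matmul(matmul(inverse_2x2(A), inverse_2x2(B)), C)   # A^{-1} B^{-1} C

print(X)
print(matmul(matmul(A, X), B) == C)       # the correct X satisfies AXB = C
print(matmul(matmul(A, wrong), B) == C)   # the wrongly ordered product does not
```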

Elementary Matrices

An elementary matrix is obtained by performing exactly one ERO on the identity matrix \(I\). There are three types:

Type I (Swap): \(E\) swaps rows \(i\) and \(j\) of \(I\)
Type II (Scale): \(E\) multiplies row \(i\) of \(I\) by \(k \neq 0\)
Type III (Add): \(E\) adds \(k\) times row \(j\) to row \(i\) in \(I\)

Key property: Left-multiplying \(A\) by elementary matrix \(E\) performs the corresponding ERO on \(A\). That is, \(EA\) equals the result of applying that ERO to \(A\).

Inverses of elementary matrices:

Swap: \(E^{-1} = E\) (swap again undoes it)
Scale by \(k\): \(E^{-1}\) scales by \(1/k\)
Add \(k \times R_j\) to \(R_i\): \(E^{-1}\) adds \(-k \times R_j\) to \(R_i\)
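The three types of elementary matrix and their inverses can be built and tested with a few lines of Python. The constructor names below are hypothetical, and row indices are 0-based, so "row 1" in the code is \(R_2\) in the notes.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap_matrix(n, i, j):
    """Type I: identity with rows i and j interchanged."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale_matrix(n, i, k):
    """Type II: identity with row i multiplied by k (k must be nonzero)."""
    E = identity(n)
    E[i][i] = k
    return E

def add_matrix(n, i, j, k):
    """Type III: identity with k times row j added to row i (i != j)."""
    E = identity(n)
    E[i][j] = k
    return E

def matmul(A, B):
    return [[sum(A[r][t] * B[t][c] for t in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

A = [[1, 2], [3, 4]]
E = add_matrix(2, 1, 0, -3)        # R_2 - 3 R_1 -> R_2
EA = matmul(E, A)                  # same as applying that ERO to A
E_inv = add_matrix(2, 1, 0, 3)     # the inverse adds +3 R_1 back to R_2

print(EA)
print(matmul(E_inv, E) == identity(2))
```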

Expressing A as a Product of Elementary Matrices

If \(E_k \cdots E_2 E_1 A = I\), then: \[ A^{-1} = E_k \cdots E_2 E_1 \] \[ A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} \]

Gauss-Jordan Method for Finding Inverses

To find \(A^{-1}\):

Form the augmented matrix \([A \;|\; I]\)
Row reduce until the left half becomes \(I\)
The right half is \(A^{-1}\): \([A \;|\; I] \to [I \;|\; A^{-1}]\)

If you cannot reduce the left half to \(I\), then \(A\) is not invertible.
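A minimal Python sketch of this procedure is given below, assuming a square input stored as a list of rows. The function name `inverse` and the convention of returning `None` for a non-invertible matrix are choices made for this example.

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan on [A | I]; returns A^{-1}, or None if A is not invertible."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        # Find a usable pivot in this column at or below the diagonal.
        r = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if r is None:
            return None                      # left half cannot be reduced to I
        aug[col], aug[r] = aug[r], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                k = aug[i][col]
                aug[i] = [a - k * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]          # right half is now A^{-1}

good = inverse([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
bad = inverse([[1, 2, 1], [2, 5, 3], [1, 3, 2]])
print(good)
print(bad)
```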

Fundamental Theorem of Invertible Matrices (Part 1)

For an \(n \times n\) matrix \(A\), the following are equivalent:

\(A\) is invertible
\(A\mathbf{x} = \mathbf{b}\) has a unique solution for every \(\mathbf{b} \in \mathbb{R}^n\)
\(A\mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\)
The RREF of \(A\) is \(I_n\)
\(A\) can be expressed as a product of elementary matrices

Find the inverse of \(A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 2 \end{bmatrix}\) using Gauss-Jordan elimination. Solution: Form \([A|I]\) and row reduce: \[ \begin{bmatrix} 1 & 2 & 1 & | & 1 & 0 & 0 \\ 2 & 5 & 3 & | & 0 & 1 & 0 \\ 1 & 3 & 2 & | & 0 & 0 & 1 \end{bmatrix} \] \(R_2 - 2R_1\), \(R_3 - R_1\): \[ \begin{bmatrix} 1 & 2 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & -2 & 1 & 0 \\ 0 & 1 & 1 & | & -1 & 0 & 1 \end{bmatrix} \] \(R_3 - R_2\): \[ \begin{bmatrix} 1 & 2 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & -2 & 1 & 0 \\ 0 & 0 & 0 & | & 1 & -1 & 1 \end{bmatrix} \] The left half of the third row is all zeros, so the left side can never be reduced to \(I\), and \(A\) is not invertible.

Alternative example: \(A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}\) \[ [A|I] = \begin{bmatrix} 1 & 0 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & 0 & 1 & 0 \\ 1 & 1 & 0 & | & 0 & 0 & 1 \end{bmatrix} \] \(R_3 - R_1\): \[ \begin{bmatrix} 1 & 0 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & 0 & 1 & 0 \\ 0 & 1 & -1 & | & -1 & 0 & 1 \end{bmatrix} \] \(R_3 - R_2\): \[ \begin{bmatrix} 1 & 0 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & 0 & 1 & 0 \\ 0 & 0 & -2 & | & -1 & -1 & 1 \end{bmatrix} \] \(R_3 \div (-2)\): \[ \begin{bmatrix} 1 & 0 & 1 & | & 1 & 0 & 0 \\ 0 & 1 & 1 & | & 0 & 1 & 0 \\ 0 & 0 & 1 & | & 1/2 & 1/2 & -1/2 \end{bmatrix} \] \(R_1 - R_3\), \(R_2 - R_3\): \[ \begin{bmatrix} 1 & 0 & 0 & | & 1/2 & -1/2 & 1/2 \\ 0 & 1 & 0 & | & -1/2 & 1/2 & 1/2 \\ 0 & 0 & 1 & | & 1/2 & 1/2 & -1/2 \end{bmatrix} \] Therefore: \(A^{-1} = \frac{1}{2}\begin{bmatrix} 1 & -1 & 1 \\ -1 & 1 & 1 \\ 1 & 1 & -1 \end{bmatrix}\)

Topic 3: Determinants, Trace, and Subspaces

Cofactor Expansion

For an \(n \times n\) matrix \(A\), the minor \(M_{ij}\) is the determinant of the \((n-1)\times(n-1)\) submatrix obtained by deleting row \(i\) and column \(j\). The cofactor is \(C_{ij} = (-1)^{i+j} M_{ij}\).

The determinant can be computed by expanding along any row \(i\) or column \(j\): \[ \det(A) = \sum_{j=1}^{n} a_{ij} C_{ij} = \sum_{i=1}^{n} a_{ij} C_{ij} \] Tip: Choose the row or column with the most zeros for efficiency.
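Cofactor expansion along the first row can be written as a short recursive function. This is only a sketch: the name `det_cofactor` is illustrative, and the method is practical only for small matrices because the work grows factorially with \(n\).

```python
def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        if A[0][j] == 0:
            continue                      # zero entries contribute nothing
        # Minor: delete the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # With 0-based j, the cofactor sign for the first row is (-1)^j.
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

print(det_cofactor([[2, 1, 3], [4, 1, 4], [2, 0, 1]]))
print(det_cofactor([[1, 0, 1], [0, 1, 1], [1, 1, 0]]))
```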

Properties of Determinants

Effects of EROs on determinants:

Operation	Effect on det
Swap two rows	\(\det \to -\det\)
Scale a row by \(k\)	\(\det \to k \cdot \det\)
Add multiple of one row to another	\(\det\) unchanged

Key determinant identities:

Triangular matrix: \(\det(A) = \) product of diagonal entries
\(\det(AB) = \det(A)\det(B)\)
\(\det(A^{-1}) = 1/\det(A)\)
\(\det(cA) = c^n \det(A)\) for \(n \times n\) matrix \(A\)
\(\det(A^T) = \det(A)\)

Algorithm: Computing Determinants via Row Reduction

Row reduce \(A\) to upper triangular form (REF), tracking all swaps and scalings.
Compute: \[\det(A) = (-1)^s \cdot \frac{1}{\text{product of scaling constants}} \cdot \text{product of diagonal entries of REF}\] where \(s\) = number of row swaps performed.
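The algorithm above might be sketched in Python as follows. This version uses only swaps and row additions, so there are no scaling constants to track and the correction factor reduces to \((-1)^s\). The function name is an assumption for this example.

```python
from fractions import Fraction

def det_row_reduction(A):
    """Determinant via reduction to upper triangular form (swaps and adds only)."""
    m = [[Fraction(x) for x in row] for row in A]
    n = len(m)
    swaps = 0
    for col in range(n):
        r = next((i for i in range(col, n) if m[i][col] != 0), None)
        if r is None:
            return Fraction(0)            # a zero will sit on the diagonal
        if r != col:
            m[col], m[r] = m[r], m[col]
            swaps += 1                    # each swap flips the sign
        for i in range(col + 1, n):
            k = m[i][col] / m[col][col]
            m[i] = [a - k * b for a, b in zip(m[i], m[col])]   # det unchanged
    product = Fraction(1)
    for i in range(n):
        product *= m[i][i]
    return (-1) ** swaps * product

print(det_row_reduction([[2, 1, 3], [4, 1, 4], [2, 0, 1]]))
print(det_row_reduction([[0, 1], [1, 0]]))
```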

\(A\) is invertible \(\iff\) \(\det(A) \neq 0\).

Trace

The trace of a square matrix \(A\) is the sum of its diagonal entries: \[ \operatorname{tr}(A) = \sum_{i=1}^n a_{ii} \]

Properties:

\(\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)\)
\(\operatorname{tr}(cA) = c \cdot \operatorname{tr}(A)\)
\(\operatorname{tr}(AB) = \operatorname{tr}(BA)\) (cyclic property)

Prove: There are no \(2 \times 2\) matrices \(A, B\) such that \(AB - BA = I\). Proof: Suppose \(AB - BA = I\). Take the trace of both sides: \[ \operatorname{tr}(AB - BA) = \operatorname{tr}(I) \] \[ \operatorname{tr}(AB) - \operatorname{tr}(BA) = 2 \] But \(\operatorname{tr}(AB) = \operatorname{tr}(BA)\), so the left side is 0. This gives \(0 = 2\), a contradiction. Therefore no such matrices exist. (This argument works for any \(n \times n\) case: \(0 = n\), contradiction.)

Linear Combination and Span

A linear combination of vectors \(\mathbf{v}_1, \ldots, \mathbf{v}_k\) is any vector of the form \(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k\) where \(c_i \in \mathbb{R}\).

The span of a set of vectors is the set of all linear combinations: \(\operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} = \{c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k : c_i \in \mathbb{R}\}\).

Linear Independence

Vectors \(\mathbf{v}_1, \ldots, \mathbf{v}_k\) are linearly independent if the only solution to \(c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}\) is \(c_1 = c_2 = \cdots = c_k = 0\).

Otherwise they are linearly dependent (at least one vector can be written as a linear combination of the others).

Subspaces

A nonempty subset \(W\) of \(\mathbb{R}^n\) is a subspace if:

\(\mathbf{0} \in W\) (contains the zero vector)
If \(\mathbf{u}, \mathbf{v} \in W\), then \(\mathbf{u} + \mathbf{v} \in W\) (closed under addition)
If \(\mathbf{u} \in W\) and \(c \in \mathbb{R}\), then \(c\mathbf{u} \in W\) (closed under scalar multiplication)

Examples of subspaces: Planes through the origin, lines through the origin, \(\{\mathbf{0}\}\), \(\mathbb{R}^n\) itself.

Non-subspaces: Sets not containing \(\mathbf{0}\) (e.g., \(\{(x,y): x + y = 1\}\)); sets that are not closed under addition or scalar multiplication (e.g., the disc \(\{(x,y): x^2 + y^2 \leq 1\}\), which contains \((1,0)\) but not \(2(1,0) = (2,0)\)).

Compute \(\det(A)\) for \(A = \begin{bmatrix} 2 & 1 & 3 \\ 4 & 1 & 4 \\ 2 & 0 & 1 \end{bmatrix}\) using row reduction. Solution: \(R_2 - 2R_1 \to R_2\), \(R_3 - R_1 \to R_3\): \[ \begin{bmatrix} 2 & 1 & 3 \\ 0 & -1 & -2 \\ 0 & -1 & -2 \end{bmatrix} \] \(R_3 - R_2 \to R_3\): \[ \begin{bmatrix} 2 & 1 & 3 \\ 0 & -1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \] No swaps or scalings were used. Product of diagonal = \(2 \times (-1) \times 0 = 0\). Therefore \(\det(A) = 0\), and \(A\) is not invertible.

Topic 4: Basis, Rank, and Change of Basis

Basis and Dimension

A basis for a subspace \(W\) is a set of vectors that is both:

Spanning: every vector in \(W\) can be written as a linear combination of the basis vectors
Linearly independent

The dimension of \(W\) is the number of vectors in any basis for \(W\).

The standard basis for \(\mathbb{R}^n\) is \(\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}\) where \(\mathbf{e}_i\) has 1 in position \(i\) and 0 elsewhere.

Row Space, Column Space, and Null Space

For an \(m \times n\) matrix \(A\):

Row space: \(\operatorname{row}(A) = \operatorname{span}\{\text{rows of } A\} \subseteq \mathbb{R}^n\)
Column space: \(\operatorname{col}(A) = \operatorname{span}\{\text{columns of } A\} \subseteq \mathbb{R}^m\)
Null space: \(\operatorname{null}(A) = \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\} \subseteq \mathbb{R}^n\)

Finding Bases

Given matrix \(A\), compute its RREF:

Basis for row space: The nonzero rows of the RREF.
Basis for column space: The columns of the original matrix \(A\) corresponding to the pivot columns of the RREF.
Basis for null space: Solve \(A\mathbf{x} = \mathbf{0}\). Set each free variable = 1 (others = 0) one at a time; the resulting vectors form a basis.

Important: For the column space, use columns of the ORIGINAL matrix, not the RREF! Row reduction preserves the row space but generally changes the column space.
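The three recipes can be combined into one Python sketch. The helper names are hypothetical; column indices in the code are 0-based, so pivot columns `[0, 2]` correspond to columns 1 and 3 in the notes.

```python
from fractions import Fraction

def rref_with_pivots(rows):
    """Return (RREF, list of pivot column indices), using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        r0 = len(pivots)                              # row where the next pivot goes
        if r0 == len(m):
            break
        r = next((i for i in range(r0, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue                                  # free column
        m[r0], m[r] = m[r], m[r0]
        p = m[r0][col]
        m[r0] = [x / p for x in m[r0]]
        for i in range(len(m)):
            if i != r0 and m[i][col] != 0:
                k = m[i][col]
                m[i] = [a - k * b for a, b in zip(m[i], m[r0])]
        pivots.append(col)
    return m, pivots

def fundamental_bases(A):
    R, pivots = rref_with_pivots(A)
    n = len(A[0])
    row_basis = [row for row in R if any(x != 0 for x in row)]
    col_basis = [[row[j] for row in A] for j in pivots]   # columns of the ORIGINAL A
    null_basis = []
    for f in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                 # this free variable = 1, the others = 0
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]                # pivot variable read from row r of the RREF
        null_basis.append(v)
    return row_basis, col_basis, null_basis

A = [[1, 2, 0, 1], [2, 4, 1, 3], [3, 6, 1, 4]]
row_basis, col_basis, null_basis = fundamental_bases(A)
print(row_basis, col_basis, null_basis, sep="\n")
```

The rank is `len(col_basis)` and the nullity is `len(null_basis)`; their sum equals the number of columns of \(A\).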

Rank and Nullity

Rank: \(\operatorname{rank}(A) = \dim(\operatorname{row}(A)) = \dim(\operatorname{col}(A)) = \) number of pivots in RREF
Nullity: \(\operatorname{nullity}(A) = \dim(\operatorname{null}(A)) = \) number of free variables

Rank-Nullity Theorem: For an \(m \times n\) matrix \(A\): \[ \operatorname{rank}(A) + \operatorname{nullity}(A) = n \] where \(n\) is the number of columns.

Orthogonal and Orthonormal Sets

A set of vectors \(\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}\) is orthogonal if \(\mathbf{v}_i \cdot \mathbf{v}_j = 0\) for all \(i \neq j\).

It is orthonormal if additionally each vector has unit length: \(\|\mathbf{v}_i\| = 1\) for all \(i\).

An orthogonal set of nonzero vectors is linearly independent.

A matrix \(Q\) has orthonormal columns \(\iff\) \(Q^T Q = I\).

Change of Basis

Let \(B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) be a basis for \(\mathbb{R}^n\), and let \(P_B = [\mathbf{b}_1 | \cdots | \mathbf{b}_n]\) be the matrix whose columns are the basis vectors.

If \([\mathbf{x}]_B\) denotes the coordinate vector of \(\mathbf{x}\) with respect to basis \(B\), then: \[ \mathbf{x} = P_B [\mathbf{x}]_B \quad \Longrightarrow \quad [\mathbf{x}]_B = P_B^{-1} \mathbf{x} \] To convert between two bases \(A\) and \(B\): \[ [\mathbf{x}]_B = P_B^{-1} P_A [\mathbf{x}]_A \]

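For an orthogonal basis \(\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}\), no matrix inverse is needed: the coordinates of a vector \(\mathbf{u} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n\) are \[ c_i = \frac{\mathbf{u} \cdot \mathbf{v}_i}{\mathbf{v}_i \cdot \mathbf{v}_i} \]

For an orthonormal basis this simplifies further to \(c_i = \mathbf{u} \cdot \mathbf{v}_i\), since \(\mathbf{v}_i \cdot \mathbf{v}_i = 1\).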
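The dot-product formula for coordinates in an orthogonal basis can be tried out with a short Python sketch. The function name `coords_orthogonal` is illustrative, and integer entries are assumed so that `Fraction` can keep the results exact.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coords_orthogonal(u, basis):
    """Coordinates of u in an orthogonal basis, c_i = (u . v_i) / (v_i . v_i)."""
    for i in range(len(basis)):
        for j in range(i + 1, len(basis)):
            if dot(basis[i], basis[j]) != 0:
                raise ValueError("basis vectors are not mutually orthogonal")
    return [Fraction(dot(u, v), dot(v, v)) for v in basis]

coords = coords_orthogonal([3, 1], [[1, 1], [1, -1]])
print(coords)   # coordinates 2 and 1
```

Since \((1,1)\) and \((1,-1)\) are orthogonal, this gives the same coordinates as multiplying by \(P_B^{-1}\), without computing an inverse.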

Find the rank, nullity, and bases for all fundamental subspaces of: \[ A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 2 & 4 & 1 & 3 \\ 3 & 6 & 1 & 4 \end{bmatrix} \]

Solution: Row reduce to RREF: \[ R_2 - 2R_1,\; R_3 - 3R_1: \begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix} \] \[ R_3 - R_2: \begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

Pivots are in columns 1 and 3. Free variables: \(x_2, x_4\). Rank = 2, Nullity = 4 - 2 = 2. Check: rank + nullity = 2 + 2 = 4 = n.

Basis for row space: \(\{(1,2,0,1),\; (0,0,1,1)\}\) (nonzero rows of RREF)
Basis for column space: Columns 1 and 3 of original \(A\): \(\left\{\begin{pmatrix}1\\2\\3\end{pmatrix}, \begin{pmatrix}0\\1\\1\end{pmatrix}\right\}\)
Basis for null space: Solve \(A\mathbf{x}=\mathbf{0}\). From RREF: \(x_1 = -2x_2 - x_4\), \(x_3 = -x_4\). Set \(x_2=1, x_4=0\): \((-2,1,0,0)\). Set \(x_2=0, x_4=1\): \((-1,0,-1,1)\). Basis: \(\{(-2,1,0,0),\; (-1,0,-1,1)\}\)

Change of basis: Let \(B = \left\{\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}1\\-1\end{pmatrix}\right\}\). Find \([\mathbf{x}]_B\) for \(\mathbf{x} = \begin{pmatrix}3\\1\end{pmatrix}\). Solution: \(P_B = \begin{bmatrix}1&1\\1&-1\end{bmatrix}\), \(P_B^{-1} = \frac{1}{-2}\begin{bmatrix}-1&-1\\-1&1\end{bmatrix} = \begin{bmatrix}1/2&1/2\\1/2&-1/2\end{bmatrix}\) \[[\mathbf{x}]_B = P_B^{-1}\mathbf{x} = \begin{bmatrix}1/2&1/2\\1/2&-1/2\end{bmatrix}\begin{pmatrix}3\\1\end{pmatrix} = \begin{pmatrix}2\\1\end{pmatrix}\] Check: \(2\begin{pmatrix}1\\1\end{pmatrix} + 1\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}3\\1\end{pmatrix}\). Correct.

Topic 5: Eigenvalues and Diagonalisation

Eigenvalues and Eigenvectors

A scalar \(\lambda\) is an eigenvalue of a square matrix \(A\) if there exists a nonzero vector \(\mathbf{v}\) such that: \[ A\mathbf{v} = \lambda\mathbf{v} \] The nonzero vector \(\mathbf{v}\) is called an eigenvector corresponding to \(\lambda\).

Finding Eigenvalues

Form the characteristic equation: \(\det(A - \lambda I) = 0\) (the left-hand side is the characteristic polynomial)
Solve for \(\lambda\) (the eigenvalues)
For each eigenvalue \(\lambda\), find the eigenspace \(E_\lambda = \operatorname{null}(A - \lambda I)\) by solving \((A - \lambda I)\mathbf{x} = \mathbf{0}\)

For a 2x2 matrix \(A\), the characteristic equation is: \[ \lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0 \]
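For the 2x2 case, the trace-determinant form can be solved with the quadratic formula. The Python sketch below is restricted to real eigenvalues, which is an assumption of this example; the function name is illustrative.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from lambda^2 - tr(A) lambda + det(A) = 0."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = tr * tr - 4 * det              # discriminant of the quadratic
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch handles real roots only")
    root = math.sqrt(disc)
    return (tr + root) / 2, (tr - root) / 2

print(eigenvalues_2x2([[4, 1], [2, 3]]))   # (5.0, 2.0)
```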

Multiplicities

Algebraic multiplicity of \(\lambda\): the number of times \(\lambda\) appears as a root of the characteristic polynomial
Geometric multiplicity of \(\lambda\): \(\dim(E_\lambda) = \dim(\operatorname{null}(A - \lambda I))\)

For every eigenvalue: \(1 \leq \text{geometric multiplicity} \leq \text{algebraic multiplicity}\).

Useful Properties

Triangular matrices: eigenvalues = diagonal entries
\(\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n\) (product of all eigenvalues, with multiplicity)
\(\operatorname{tr}(A) = \lambda_1 + \lambda_2 + \cdots + \lambda_n\) (sum of all eigenvalues)
If \(A\mathbf{v} = \lambda\mathbf{v}\), then \(A^k\mathbf{v} = \lambda^k\mathbf{v}\)

Diagonalisation

A matrix \(A\) is diagonalisable if there exists an invertible matrix \(P\) and diagonal matrix \(D\) such that: \[ A = PDP^{-1} \] where \(P = [\mathbf{v}_1 | \mathbf{v}_2 | \cdots | \mathbf{v}_n]\) (columns are eigenvectors) and \(D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\).

\(A\) is diagonalisable \(\iff\) for every eigenvalue, geometric multiplicity = algebraic multiplicity \(\iff\) \(A\) has \(n\) linearly independent eigenvectors.

Matrix Powers via Diagonalisation

If \(A = PDP^{-1}\), then:

\[ A^k = PD^kP^{-1} \]

where \(D^k = \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k)\) is trivial to compute.
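As a sanity check of \(A^k = PD^kP^{-1}\), the Python sketch below compares it with repeated multiplication for the matrix \(A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}\), whose eigenvalues are 5 and 2 with eigenvectors \((1,1)\) and \((1,-2)\). The helper names are made up for this example.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def power_direct(A, k):
    """A^k by multiplying k times, starting from the identity."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def power_diagonalised(P, eigenvalues, P_inv, k):
    """A^k = P D^k P^{-1}, where D^k only needs powers of the eigenvalues."""
    n = len(eigenvalues)
    Dk = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(P, Dk), P_inv)

A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]                     # columns are the eigenvectors
P_inv = [[Fraction(2, 3), Fraction(1, 3)], [Fraction(1, 3), Fraction(-1, 3)]]

agree = all(power_direct(A, k) == power_diagonalised(P, [5, 2], P_inv, k)
            for k in range(6))
print(agree)
```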

Markov Chains

A transition matrix \(T\) has columns summing to 1 (column-stochastic). The state after \(k\) steps: \(\mathbf{x}_k = T^k \mathbf{x}_0\).

The steady-state vector \(\mathbf{q}\) satisfies \(T\mathbf{q} = \mathbf{q}\) (eigenvector for \(\lambda = 1\)), normalised so entries sum to 1. Every column-stochastic matrix has \(\lambda = 1\) as an eigenvalue.
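One way to see the steady state emerge is to apply \(T\) repeatedly to a starting distribution. The Python sketch below does this for a 2x2 column-stochastic matrix with floating-point entries. It assumes the chain actually converges (true for this matrix, since all entries are positive), and the helper names are illustrative.

```python
def step(T, x):
    """One step of the chain: x_{k+1} = T x_k."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def iterate(T, x0, steps):
    x = list(x0)
    for _ in range(steps):
        x = step(T, x)
    return x

T = [[0.7, 0.4],
     [0.3, 0.6]]                  # each column sums to 1
q = iterate(T, [1.0, 0.0], 60)    # start entirely in state 1
print(q)                          # close to [4/7, 3/7]
```

Starting from `[0.0, 1.0]` instead should lead to the same limit, which is what makes the steady state independent of \(\mathbf{x}_0\) here.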

Diagonalise \(A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}\) and compute \(A^k\). Solution: Characteristic polynomial: \(\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = \lambda^2 - 7\lambda + 10 = (\lambda-5)(\lambda-2) = 0\) Eigenvalues: \(\lambda_1 = 5\), \(\lambda_2 = 2\). For \(\lambda_1 = 5\): \((A - 5I)\mathbf{v} = \mathbf{0}\): \[\begin{bmatrix}-1&1\\2&-2\end{bmatrix}\mathbf{v}=\mathbf{0} \implies \mathbf{v}_1 = \begin{pmatrix}1\\1\end{pmatrix}\] For \(\lambda_2 = 2\): \((A - 2I)\mathbf{v} = \mathbf{0}\): \[\begin{bmatrix}2&1\\2&1\end{bmatrix}\mathbf{v}=\mathbf{0} \implies \mathbf{v}_2 = \begin{pmatrix}1\\-2\end{pmatrix}\] So \(P = \begin{bmatrix}1&1\\1&-2\end{bmatrix}\), \(D = \begin{bmatrix}5&0\\0&2\end{bmatrix}\). \(P^{-1} = \frac{1}{-3}\begin{bmatrix}-2&-1\\-1&1\end{bmatrix} = \begin{bmatrix}2/3&1/3\\1/3&-1/3\end{bmatrix}\) \[A^k = PD^kP^{-1} = \begin{bmatrix}1&1\\1&-2\end{bmatrix}\begin{bmatrix}5^k&0\\0&2^k\end{bmatrix}\begin{bmatrix}2/3&1/3\\1/3&-1/3\end{bmatrix}\] \[= \frac{1}{3}\begin{bmatrix}2\cdot5^k + 2^k & 5^k - 2^k \\ 2\cdot5^k - 2\cdot2^k & 5^k + 2\cdot2^k\end{bmatrix}\]

Markov Chain: A system has transition matrix \(T = \begin{bmatrix}0.7 & 0.4 \\ 0.3 & 0.6\end{bmatrix}\). Find the steady state. Solution: Solve \((T - I)\mathbf{q} = \mathbf{0}\): \[\begin{bmatrix}-0.3&0.4\\0.3&-0.4\end{bmatrix}\begin{pmatrix}q_1\\q_2\end{pmatrix} = \mathbf{0}\] From the first row: \(-0.3q_1 + 0.4q_2 = 0 \implies q_1 = \frac{4}{3}q_2\). With constraint \(q_1 + q_2 = 1\): \(\frac{4}{3}q_2 + q_2 = 1 \implies \frac{7}{3}q_2 = 1 \implies q_2 = \frac{3}{7}\), \(q_1 = \frac{4}{7}\). Steady state: \(\mathbf{q} = \begin{pmatrix}4/7 \\ 3/7\end{pmatrix}\).

Topic 6: Linear Transformations and Projections

Definition of Linear Transformation

A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation if for all \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) and \(c \in \mathbb{R}\):

\(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})\)
\(T(c\mathbf{u}) = cT(\mathbf{u})\)

Quick test: If \(T(\mathbf{0}) \neq \mathbf{0}\), then \(T\) is NOT linear.

Transformation Matrix

Every linear transformation \(T: \mathbb{R}^n \to \mathbb{R}^m\) can be represented as matrix multiplication: \[ T(\mathbf{x}) = A\mathbf{x} \] where the standard matrix is: \[ A = [T(\mathbf{e}_1) | T(\mathbf{e}_2) | \cdots | T(\mathbf{e}_n)] \]

Finding [T] from Input-Output Pairs

If you know \(T(\mathbf{u}_1) = \mathbf{w}_1, \ldots, T(\mathbf{u}_n) = \mathbf{w}_n\) where \(\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}\) forms a basis, then: \[ [T] \cdot [\mathbf{u}_1 | \cdots | \mathbf{u}_n] = [\mathbf{w}_1 | \cdots | \mathbf{w}_n] \] \[ [T] = [\mathbf{w}_1 | \cdots | \mathbf{w}_n][\mathbf{u}_1 | \cdots | \mathbf{u}_n]^{-1} \]
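The formula \([T] = [\mathbf{w}_1 | \cdots | \mathbf{w}_n][\mathbf{u}_1 | \cdots | \mathbf{u}_n]^{-1}\) can be tried on a 2x2 case in Python. The input-output pairs below are invented for illustration: \(T(1,1) = (3,7)\) and \(T(1,-1) = (-1,-1)\).

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("the input vectors do not form a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

def from_columns(cols):
    """Build a matrix (list of rows) whose columns are the given vectors."""
    return [list(row) for row in zip(*cols)]

U = from_columns([[1, 1], [1, -1]])     # inputs u_1, u_2 as columns
W = from_columns([[3, 7], [-1, -1]])    # outputs w_1, w_2 as columns
T = matmul(W, inverse_2x2(U))
print(T)
```

Multiplying the recovered matrix by each \(\mathbf{u}_i\) should reproduce the corresponding \(\mathbf{w}_i\), which is a quick way to check the result.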

Composition of Transformations

If \(T\) has matrix \(A\) and \(S\) has matrix \(B\), then the composite \(S \circ T\) has matrix \(BA\) (right to left, matching function composition order).

Standard Transformations in \(\mathbb{R}^2\)

Transformation	Matrix
Rotation CCW by \(\theta\)	\(\begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}\)
Reflection across \(x\)-axis	\(\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}\)
Reflection across \(y = x\)	\(\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}\)
Reflection across line at angle \(\theta\) to \(x\)-axis	\(\begin{bmatrix}\cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta\end{bmatrix}\)
Scaling by \(a\) (horiz.) and \(b\) (vert.)	\(\begin{bmatrix}a & 0 \\ 0 & b\end{bmatrix}\)
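The rotation and reflection entries of the table, together with the composition rule (the matrix of \(S \circ T\) is \(BA\)), can be explored with the Python sketch below. The function names are illustrative, and results are floating-point, so values such as \(\cos(\pi/2)\) come out as tiny nonzero numbers, not exact zeros.

```python
import math

def rotation(theta):
    """Counter-clockwise rotation by theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    """Reflection across the line through the origin at angle theta to the x-axis."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = rotation(math.pi / 2)       # T: rotate CCW by 90 degrees
B = reflection(0)               # S: reflect across the x-axis
composite = matmul(B, A)        # matrix of S o T (apply T first, then S)
image = apply(composite, [1, 0])
print(image)                    # approximately [0, -1]
```

Here \((1,0)\) is first rotated to \((0,1)\) and then reflected to \((0,-1)\); swapping the order to `matmul(A, B)` would send \((1,0)\) to \((0,1)\) instead.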

Projection

The projection matrix onto the line in the direction of \(\mathbf{a}\) is: \[ P = \frac{\mathbf{a}\mathbf{a}^T}{\mathbf{a}^T\mathbf{a}} \] The projection of \(\mathbf{v}\) onto the line: \(\operatorname{proj}_{\mathbf{a}} \mathbf{v} = P\mathbf{v} = \frac{\mathbf{a} \cdot \mathbf{v}}{\mathbf{a} \cdot \mathbf{a}}\mathbf{a}\)

The distance from \(\mathbf{v}\) to the line: \(\|\mathbf{v} - P\mathbf{v}\|\)

Projection matrices are idempotent: \(P^2 = P\). Applying the projection twice gives the same result as applying it once.
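A short Python sketch can confirm these facts for the line spanned by \(\mathbf{a} = (1, 2)\). The helper names are illustrative, and integer entries are assumed so that `Fraction` keeps the projection matrix exact.

```python
from fractions import Fraction
import math

def projection_matrix(a):
    """P = a a^T / (a^T a) for a nonzero vector a with integer entries."""
    aa = sum(x * x for x in a)
    return [[Fraction(ai * aj, aa) for aj in a] for ai in a]

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

P = projection_matrix([1, 2])
v = [3, 4]
Pv = apply(P, v)                                   # projection of v onto the line
residual = [vi - pvi for vi, pvi in zip(v, Pv)]    # v - Pv
distance = math.sqrt(sum(r * r for r in residual))

print(Pv)
print(matmul(P, P) == P)    # idempotent
print(distance)
```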

Range and Kernel

For a linear transformation \(T(\mathbf{x}) = A\mathbf{x}\):

Range (image) = \(\operatorname{col}(A)\) = set of all possible outputs
Kernel (null space) = \(\operatorname{null}(A)\) = set of all inputs mapped to \(\mathbf{0}\)

Least Squares

When \(A\mathbf{x} = \mathbf{b}\) has no exact solution (inconsistent system), the least squares solution \(\hat{\mathbf{x}}\) minimises \(\|A\mathbf{x} - \mathbf{b}\|^2\) and satisfies the normal equations: \(A^T A \hat{\mathbf{x}} = A^T \mathbf{b}\). If the columns of \(A\) are linearly independent, then \(A^T A\) is invertible and the solution is unique: \[ \hat{\mathbf{x}} = (A^T A)^{-1} A^T \mathbf{b} \]
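For fitting a line \(y = c_0 + c_1 x\), the matrix \(A\) has a column of ones and a column of \(x\)-values, so the normal equations form a 2x2 system that can be solved with the 2x2 inverse formula. The Python sketch below does this with exact fractions; the function name `fit_line` is an assumption for this example.

```python
from fractions import Fraction

def fit_line(points):
    """Least squares line y = c0 + c1 x via the 2x2 normal equations."""
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]
    n = len(points)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # A^T A = [[n, sx], [sx, sxx]] and A^T b = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("A^T A is singular (all x-values are equal)")
    c0 = (sxx * sy - sx * sxy) / det
    c1 = (n * sxy - sx * sy) / det
    return c0, c1

c0, c1 = fit_line([(1, 2), (2, 3), (3, 6)])
print(c0, c1)   # -1/3 2
```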

Projection and distance: Find the projection of \(\mathbf{v} = \begin{pmatrix}3\\4\end{pmatrix}\) onto the line spanned by \(\mathbf{a} = \begin{pmatrix}1\\2\end{pmatrix}\), and the distance from \(\mathbf{v}\) to this line. Solution: \[\operatorname{proj}_{\mathbf{a}} \mathbf{v} = \frac{\mathbf{a} \cdot \mathbf{v}}{\mathbf{a} \cdot \mathbf{a}}\mathbf{a} = \frac{3(1)+4(2)}{1^2+2^2}\begin{pmatrix}1\\2\end{pmatrix} = \frac{11}{5}\begin{pmatrix}1\\2\end{pmatrix} = \begin{pmatrix}11/5\\22/5\end{pmatrix}\] The projection matrix: \(P = \frac{1}{5}\begin{bmatrix}1&2\\2&4\end{bmatrix}\). Verify: \(P^2 = P\). Distance: \[\mathbf{v} - P\mathbf{v} = \begin{pmatrix}3 - 11/5\\4 - 22/5\end{pmatrix} = \begin{pmatrix}4/5\\-2/5\end{pmatrix}\] \[\|\mathbf{v} - P\mathbf{v}\| = \sqrt{(4/5)^2 + (-2/5)^2} = \sqrt{16/25 + 4/25} = \sqrt{20/25} = \frac{2\sqrt{5}}{5}\]

Least squares regression: Fit a line \(y = c_0 + c_1 x\) to the points \((1, 2), (2, 3), (3, 6)\). Solution: Set up the system \(A\mathbf{c} = \mathbf{b}\): \[ A = \begin{bmatrix}1&1\\1&2\\1&3\end{bmatrix}, \quad \mathbf{b} = \begin{pmatrix}2\\3\\6\end{pmatrix} \] Compute normal equations: \[A^T A = \begin{bmatrix}3&6\\6&14\end{bmatrix}, \quad A^T\mathbf{b} = \begin{bmatrix}1&1&1\\1&2&3\end{bmatrix}\begin{pmatrix}2\\3\\6\end{pmatrix} = \begin{pmatrix}11\\26\end{pmatrix}\] Solve \(A^TA\hat{\mathbf{c}} = A^T\mathbf{b}\): \[\begin{bmatrix}3&6\\6&14\end{bmatrix}\begin{pmatrix}c_0\\c_1\end{pmatrix} = \begin{pmatrix}11\\26\end{pmatrix}\] \(\det(A^TA) = 42 - 36 = 6\) \[\hat{\mathbf{c}} = \frac{1}{6}\begin{bmatrix}14&-6\\-6&3\end{bmatrix}\begin{pmatrix}11\\26\end{pmatrix} = \frac{1}{6}\begin{pmatrix}154-156\\-66+78\end{pmatrix} = \frac{1}{6}\begin{pmatrix}-2\\12\end{pmatrix} = \begin{pmatrix}-1/3\\2\end{pmatrix}\] The best-fit line is \(y = -\frac{1}{3} + 2x\).