A Primer on Linear Algebra
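An \(m \times n\) matrix is given by
$$ A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n}\\ A_{21} & A_{22} & \cdots & A_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{bmatrix} $$
where \(m\) is the number of rows and \(n\) is the number of columns.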
A vector is a matrix with a single column. For example,
$$ v = \begin{bmatrix} v_{1}\\ v_{2}\\ v_{3}\\ v_{4}\\ v_{5} \end{bmatrix} $$
is a vector with \(5\) rows.
Notations
- \(A_{ij}\) refers to the element in the \(i\)th row and \(j\)th column.
- In general, matrices are 1-indexed, both in math and in Matlab.
- \(v_{i}\) refers to the element in the \(i\)th row of a vector.
- A vector with \(n\) rows is considered an \(n\)-dimensional vector.
- Matrices are denoted in uppercase, and vectors and scalars in lowercase.
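The 1-based notation needs a small adjustment in languages that index from 0. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the helper names `element` and `component` are made up for illustration.

```python
# A 2x3 matrix stored as a list of rows (an assumed representation).
A = [[1, 2, 3],
     [4, 5, 6]]
v = [7, 8, 9]  # a 3-dimensional vector


def element(matrix, i, j):
    """Return A_ij using the 1-based indices of the math notation."""
    return matrix[i - 1][j - 1]


def component(vector, i):
    """Return v_i using a 1-based index."""
    return vector[i - 1]


a_12 = element(A, 1, 2)  # element in row 1, column 2
v_3 = component(v, 3)    # third component of v
```

In Matlab or Octave no such shift is needed, since indexing there already starts at 1.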
Matrix operations
Matrix addition and subtraction
You cannot add a scalar to a matrix. You can, however, add two matrices, provided they have the same dimensions: you add the elements at corresponding positions.
The same applies for subtraction.
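As a rough illustration of element-wise addition and subtraction, here is a sketch in plain Python. The list-of-rows representation and the function names `add` and `subtract` are assumptions made for this example; a numerical library would normally do this for you.

```python
def same_shape(A, B):
    """True if A and B have the same number of rows and columns."""
    return len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))


def add(A, B):
    """Element-wise sum of two matrices of identical dimensions."""
    if not same_shape(A, B):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def subtract(A, B):
    """Element-wise difference of two matrices of identical dimensions."""
    if not same_shape(A, B):
        raise ValueError("matrices must have the same dimensions")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
total = add(A, B)            # [[6, 8], [10, 12]]
difference = subtract(B, A)  # [[4, 4], [4, 4]]
```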
Matrix and scalar multiplication and division
You can multiply a matrix by a scalar; there are no restrictions on the dimensions. You multiply or divide each element by the same scalar.
Division is similar.
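A small sketch of scalar multiplication and division in plain Python follows. The names `scale` and `divide` are illustrative, and matrices are again assumed to be lists of rows.

```python
def scale(A, s):
    """Multiply every element of A by the scalar s."""
    return [[s * a for a in row] for row in A]


def divide(A, s):
    """Divide every element of A by the scalar s."""
    return [[a / s for a in row] for row in A]


A = [[2, 4], [6, 8]]
tripled = scale(A, 3)  # [[6, 12], [18, 24]]
halved = divide(A, 2)  # [[1.0, 2.0], [3.0, 4.0]]
```

The result always has the same dimensions as the input matrix.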
Matrix and vector multiplication
Solving linear equations as matrix operations
For efficiency, you can represent the evaluation of linear equations as matrix operations. For instance, consider the hypothesis function \(h_{\theta}(x_{i}) = -40 + 0.45x_{i}\). To compute the hypothesis for \(n = 5\) different values of \(x_{i}\) (34, 56, 21, 11, 10), you can represent the calculation as a matrix and vector multiplication:
$$ \begin{bmatrix} 1& 34\\ 1& 56\\ 1& 21\\ 1& 11\\ 1& 10 \end{bmatrix} \times \begin{bmatrix} -40\\ 0.45 \end{bmatrix} = \begin{bmatrix} -24.7\\ -14.8\\ -30.55\\ -35.05\\ -35.5 \end{bmatrix} $$
Each row of the result is \(1 \times (-40) + x_{i} \times 0.45\). Such a matrix computation is typically much faster than an explicit loop when it is carried out by an optimized linear algebra library. This holds for most languages, including Java, C++, Octave, and Python.
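The hypothesis calculation above can be sketched in plain Python as follows. This is only meant to show the arithmetic: the helper `mat_vec` is written for this example and loops internally, so it does not demonstrate the speed advantage that an optimized library would give.

```python
def mat_vec(A, v):
    """Multiply an m x n matrix A by an n-dimensional vector v."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("number of columns of A must equal the length of v")
    # Each output entry is the dot product of one row of A with v.
    return [sum(a * x for a, x in zip(row, v)) for row in A]


xs = [34, 56, 21, 11, 10]
X = [[1, x] for x in xs]  # leading column of ones multiplies the intercept
theta = [-40, 0.45]       # intercept and slope of the hypothesis

predictions = mat_vec(X, theta)
for x, p in zip(xs, predictions):
    print(f"h({x}) = {p:.2f}")
```

The column of ones is what lets the constant term \(-40\) be handled by the same multiplication as the slope.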
Matrix x matrix multiplication
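To multiply two matrices, the number of columns of the first must match the number of rows of the second: an \(m \times n\) matrix times an \(n \times p\) matrix gives an \(m \times p\) matrix.
Extending the previous example, suppose you want to calculate the predictions for 3 different hypothesis functions. You can represent that problem as a single matrix x matrix multiplication: each column of the second matrix holds the parameters of one hypothesis, and each column of the result holds that hypothesis's predictions.
Representing these as matrix operations lets numerical libraries use vectorized and parallel routines, which can give substantial speedups.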
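To make the three-hypothesis case concrete, here is a sketch in plain Python. Only the first parameter pair, \((-40, 0.45)\), comes from the example above; the other two columns of `Theta` are made-up values chosen for illustration, and `mat_mul` is a helper written for this example.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B, giving m x p."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


X = [[1, x] for x in [34, 56, 21, 11, 10]]  # 5 x 2 input matrix

# 2 x 3 parameter matrix: one column per hypothesis.
# Columns 2 and 3 are hypothetical parameters, not from the text.
Theta = [[-40, 200, -150],
         [0.45, 0.1, 0.4]]

P = mat_mul(X, Theta)  # 5 x 3: row i, column j is hypothesis j applied to x_i
```

A 5 x 2 matrix times a 2 x 3 matrix yields a 5 x 3 result, matching the dimension rule stated above.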
Properties of matrix multiplications
- Matrix multiplication is not commutative in general: \(A\times B \ne B\times A\)
- Matrix multiplication is associative: \((A \times B) \times C = A \times (B \times C)\)
- The identity matrix \(I\) has ones on the diagonal and zeros everywhere else. When it is sized to match \(A\), \(A \times I = I \times A = A\)
- The identity matrix is always a square matrix.
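These properties can be checked numerically on small examples. The sketch below uses plain Python with arbitrarily chosen 2 x 2 integer matrices; the helpers `mat_mul` and `identity` are written for this illustration. A check on one example does not prove a property, but a single counterexample is enough to show that commutativity fails.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def identity(n):
    """n x n matrix with ones on the diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
I = identity(2)

AB = mat_mul(A, B)  # [[2, 1], [4, 3]]
BA = mat_mul(B, A)  # [[3, 4], [1, 2]]
not_commutative = AB != BA

associative = mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
identity_holds = mat_mul(A, I) == A and mat_mul(I, A) == A
```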
Matrix inverse
A matrix \(A^{-1}\) is said to be the inverse of a matrix \(A\) if multiplying the two gives the identity matrix: \(A \times A^{-1} = A^{-1} \times A = I\).
Only certain square matrices have inverses. We typically compute the inverse using software.
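As a small illustration, the sketch below inverts a 2 x 2 integer matrix with the standard closed-form formula and verifies that the product is the identity. It is limited to the 2 x 2 case and uses exact fractions to avoid rounding; the function names and example matrices are assumptions for this example, and for larger matrices a library routine is the usual choice.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Invert a 2x2 integer matrix [[a, b], [c, d]] with exact fractions."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and has no inverse")
    f = Fraction(1, det)
    return [[d * f, -b * f], [-c * f, a * f]]


def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[4, 7], [2, 6]]
A_inv = inverse_2x2(A)
product = mat_mul(A, A_inv)  # equals the 2x2 identity matrix

# A square matrix without an inverse: the second row is twice the first.
singular = [[1, 2], [2, 4]]
try:
    inverse_2x2(singular)
    has_inverse = True
except ValueError:
    has_inverse = False
```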
Matrix transpose
The transpose of a matrix is obtained by swapping its rows and columns. For a matrix \(A\), matrix \(B\) is said to be its transpose, \(A^{T} = B\), if \(B_{ij} = A_{ji}\). In other words, \(A_{ij} = (A^{T})_{ji}\).
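A short sketch of the transpose in plain Python, assuming the list-of-rows representation; `transpose` is a helper written for this example.

```python
def transpose(A):
    """Return the transpose: rows of A become columns of the result."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6]]    # 2 x 3
B = transpose(A)   # 3 x 2: [[1, 4], [2, 5], [3, 6]]

# Check the defining relation B_ij = A_ji for every position.
relation_holds = all(B[i][j] == A[j][i]
                     for i in range(len(B)) for j in range(len(B[0])))
```

Transposing an \(m \times n\) matrix gives an \(n \times m\) matrix, and transposing twice returns the original.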