Why studying Linear Algebra is important for Machine Learning and where to start

Linear algebra is the foundation of machine learning, from the notation used to describe the algorithms to the implementation of the algorithms themselves. Here we explain the importance of linear algebra for machine learning.

introduction to linear algebra

What do you see when you look at this image? Well, it's a cute dog playing with a ball. Easy, right? This seemingly obvious observation is not an easy task for a computer. How does a computer that processes everything in 0s and 1s even store this image?

The answer is matrices!


A digital image is made up of small indivisible units called pixels. Each pixel of an image is represented by an intensity value. Therefore, an image is essentially a matrix whose elements are the intensity values of each individual pixel.

Expanding, compressing, cropping, or otherwise transforming such an image most likely involves linear algebra.
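To make the "image as a matrix" idea concrete, here is a minimal sketch using NumPy. The 4 x 4 pixel values are made up for illustration and do not come from the dog image; the variable names are also only illustrative.

```python
import numpy as np

# A hypothetical 4x4 grayscale image: each entry is one pixel's intensity.
image = np.array([
    [ 0, 50, 100, 150],
    [10, 60, 110, 160],
    [20, 70, 120, 170],
    [30, 80, 130, 180],
], dtype=np.uint8)

# Cropping selects a sub-matrix (rows 1-2 and columns 1-2).
cropped = image[1:3, 1:3]

# Flipping left-to-right reverses the column order.
flipped = image[:, ::-1]

# The same flip written as a matrix product with the "exchange" matrix J,
# which is the identity matrix with its rows reversed.
J = np.eye(4, dtype=int)[::-1]
flipped_by_product = image @ J

# Transposing swaps rows and columns.
transposed = image.T

print(cropped)
print(np.array_equal(flipped, flipped_by_product))
```

The flip shows up twice on purpose: once as plain indexing and once as a matrix multiplication, to suggest how an image operation can be phrased in linear algebra terms.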

So what is linear algebra?

Linear algebra is a branch of mathematics that deals with linear equations and linear functions, which are represented through matrices and vectors. In simpler words, linear algebra helps you understand geometric concepts such as lines and planes in higher dimensions and perform mathematical operations on them.

It can be thought of as an extension of algebra into an arbitrary number of dimensions. Rather than working with scalars, it works with matrices and vectors.
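As a small illustration of moving from scalars to matrices and vectors, the sketch below writes a system of two linear equations as a single matrix equation A x = b and solves it with NumPy. The equations themselves are made up for this example.

```python
import numpy as np

# The system
#   2x +  y = 5
#    x + 3y = 10
# written as A @ x = b, where A holds the coefficients.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Solve for the unknown vector (x, y) in one call.
solution = np.linalg.solve(A, b)

print(solution)      # approximately [1. 3.]
print(A @ solution)  # reproduces b up to floating-point error
```

The same call works unchanged for a system with many more unknowns, which is the sense in which linear algebra extends ordinary algebra to more dimensions.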

Why do I need to study linear algebra for machine learning?

Linear algebra is the building block of machine learning and deep learning. Understanding ML concepts at the vector and matrix level deepens your understanding of a particular ML problem and widens your perspective on it.

Here are a few applications of linear algebra in machine learning:

  1. Vectorization:

    Suppose, for example, that you are estimating the price of houses in a neighborhood. After observing the dataset, you come up with the equation:

    Price = 116890 + 9120 * (Size of the house in square feet)

    Now suppose you need to calculate the price of 100 houses:

    116890 + 9120 * (3150) = ₹ 2,88,44,890
    116890 + 9120 * (5600) = ₹ 5,11,88,890
    116890 + 9120 * (1230) = ₹ 1,13,34,490
    116890 + 9120 * (4140) = ₹ 3,78,73,690
    116890 + 9120 * (2560) = ₹ 2,34,64,090
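    And so on.

    These computations can be performed with a for-loop over 100 iterations. However, as the volume of data grows, processing one scalar value at a time becomes inefficient.
    Standard for-loop algorithms can often be reformulated as matrix equations, which provides significant gains in computational efficiency. Here, the 100 houses form a 100 x 2 matrix (a column of ones and a column of sizes), which is multiplied by a 2 x 1 vector holding the intercept 116890 and the slope 9120 to give all 100 prices at once:

    mat(100x2) . mat(2x1) = mat(100x1)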

    Linear algebra provides the first steps into vectorization and offers a deeper way of thinking about parallelizing computations. Even the most rudimentary machine learning models, such as linear regression, use these techniques. Such methods are also used in Python libraries such as NumPy, SciPy, scikit-learn, pandas, and TensorFlow.
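The loop and the matrix form of the house-price calculation can be compared side by side. This is a minimal sketch that uses only the five house sizes listed above rather than all 100, so the matrix here is 5 x 2 instead of 100 x 2; the variable names are illustrative.

```python
import numpy as np

intercept, slope = 116890, 9120
sizes = np.array([3150, 5600, 1230, 4140, 2560])  # square feet, from the text

# Scalar approach: one multiplication and one addition per loop iteration.
loop_prices = []
for size in sizes:
    loop_prices.append(intercept + slope * int(size))

# Vectorized approach: build a matrix with a column of ones and a column
# of sizes, then multiply it by the (2x1) parameter vector.
X = np.column_stack([np.ones_like(sizes), sizes])  # shape (5, 2)
theta = np.array([[intercept], [slope]])           # shape (2, 1)
prices = X @ theta                                 # shape (5, 1)

print(prices.ravel())
print(loop_prices == prices.ravel().tolist())
```

Both approaches give the same prices. The difference is that the matrix product hands the whole computation to NumPy's optimized routines instead of the Python interpreter, which tends to matter more as the number of rows grows.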

  2. Dimensionality Reduction:

    Consider a dataset of training samples, each with n features, used to predict a certain target variable. A useful geometric view of this dataset is to treat the training samples as points in an n-dimensional feature space.

    If n is very large, the volume of the feature space is very large too. This means that the available points often form a small and non-representative sample of that space. Too many input variables can dramatically degrade the performance of machine learning algorithms. This is generally referred to as the “curse of dimensionality.”

    One method of tackling the “curse of dimensionality” uses matrix factorization. Here, a matrix is expressed as a product of simpler, structured matrices with useful computational properties. Matrix decomposition techniques include Lower-Upper (LU) decomposition, QR decomposition, eigendecomposition, and Singular Value Decomposition (SVD). They are an intrinsic component of dimensionality reduction algorithms such as Principal Component Analysis (PCA).
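The link between matrix decomposition and dimensionality reduction can be sketched with NumPy's SVD. This is one common way to compute PCA, not the only one, and the dataset below is synthetic: three features, where the third is almost a linear combination of the first two, so two dimensions should capture nearly all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 3 features.
base = rng.normal(size=(200, 2))
third = base @ np.array([0.5, -0.3]) + 0.01 * rng.normal(size=200)
X = np.column_stack([base, third])  # shape (200, 3)

# PCA works on mean-centered data.
X_centered = X - X.mean(axis=0)

# SVD factorizes the data matrix; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance captured by each principal direction.
explained = S**2 / np.sum(S**2)

# Keep the top k directions and project the data onto them.
k = 2
X_reduced = X_centered @ Vt[:k].T   # shape (200, 2)

# Map back to 3 dimensions to see how little information was lost.
X_restored = X_reduced @ Vt[:k]

print(explained)
print(np.abs(X_restored - X_centered).max())
```

In practice a library routine such as scikit-learn's `PCA` would usually be used instead; the point here is only that the reduction comes down to a decomposition followed by a matrix product.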

  3. Computer Vision:

    As explained in the earlier example, an image is stored as a matrix. In an 8-bit image, each pixel has an intensity value that ranges from 0 to 255, where 0 represents a black pixel and 255 represents a white pixel. An m x n grayscale image can therefore be represented as a 2D matrix with m rows and n columns whose cells contain the respective pixel intensity values. A color image, on the other hand, is generally stored in the RGB system and can be thought of as three 2D matrices, one each for the R, G, and B channels. In practice, rather than handling three separate matrices, the channels are stacked into a single 3D tensor of shape m x n x 3.
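The tensor representation of a color image can be sketched as a NumPy array with three axes. The tiny 2 x 3 image and its pixel values are made up for illustration, and the plain channel average used for the grayscale conversion is a simplifying assumption rather than a standard formula.

```python
import numpy as np

# A hypothetical 2x3 RGB image: axis 0 is rows, axis 1 is columns,
# axis 2 is the color channel (R, G, B). All pixels start out black.
height, width = 2, 3
rgb = np.zeros((height, width, 3), dtype=np.uint8)

rgb[0, 0] = [255, 0, 0]      # a pure red pixel
rgb[0, 1] = [255, 255, 255]  # a white pixel

# Each channel on its own is an ordinary 2D matrix.
red = rgb[:, :, 0]
green = rgb[:, :, 1]
blue = rgb[:, :, 2]

# A naive grayscale conversion: average the three channels per pixel.
gray = rgb.mean(axis=2)

print(rgb.shape)   # (2, 3, 3)
print(red.shape)   # (2, 3)
print(gray)
```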

It can’t be emphasized enough how fundamental linear algebra is to machine learning. For those of you who want a deeper understanding of the inner workings of machine learning and deep learning algorithms, linear algebra is essential.

Okay, I guess I do need it. Where do I begin?

Linear Algebra - Khan Academy

Khan Academy offers practice exercises, instructional videos, and a personalized learning dashboard that empower learners to study at their own pace in and outside of the classroom. Khan Academy's linear algebra lecture series provides a thorough introduction to linear algebra from the basics of vectors and matrices to projections into lower-dimensional spaces, eigenvectors, and eigenvalues. If you are new to all of these concepts, this is a good place to start!

Essence of linear algebra - 3Blue1Brown

The Essence of linear algebra playlist contains 14 video lectures by Grant Sanderson. The lectures give a geometric understanding of the subject with good visualizations. The lectures cover vectors, linear combinations, matrices, determinants, inverse matrices, systems of linear equations, dot products, cross products, transformations, eigenvalues, and eigenvectors. If you are a visual learner, these videos are for you! If you are already familiar with the fundamentals of linear algebra, these videos can help you brush up on your basics.

Linear Algebra | Mathematics - MIT OpenCourseWare

This course parallels the combination of theory and applications in Professor Strang’s textbook Introduction to Linear Algebra. Dr. Gilbert Strang picks out four key applications in the book: Graphs and Networks; Systems of Differential Equations; Least Squares and Projections; and Fourier Series and the Fast Fourier Transform.

Introduction to Linear Algebra - Gilbert Strang

This leading textbook gives a clear introduction to the subject of linear algebra. Unlike most other linear algebra textbooks, it does not rely on repetitive drills and instead shows the beauty and variety of the subject. The book is suited to self-study, is loaded with examples and graphics, and provides a wide array of probing problems with accompanying solutions.

Linear Algebra and Its Applications - Gilbert Strang

Strang's Linear Algebra and Its Applications contains a wealth of applications and examples of how to use linear algebra in science and engineering. This is a great reference book for anyone who has already taken an introductory course in linear algebra.

Linear Algebra: Gateway to Mathematics - Robert Messer

This text combines the many simple and elegant results of elementary linear algebra with some powerful computational techniques to demonstrate that theoretical mathematics need not be difficult or mysterious. The book is written for a second course in linear algebra.

We hope this article showed you why linear algebra matters for machine learning. Like this article? Learn about other machine learning prerequisites: probability, calculus, and Git and Anaconda.
