Motivation

Learning the theoretical background for data science or machine learning can be a daunting experience, as it involves multiple fields of mathematics and a long list of online resources.

In this piece, my goal is to suggest resources for building the mathematical background you need to get up and running in practical or research data science work. These suggestions come from my own experience in the field and from keeping up with the latest resources recommended by the community.

However, if you are a beginner in machine learning and looking to get a job in the industry, I don’t recommend studying all the math before you start doing practical work. This bottom-up approach is counterproductive, and you’ll get discouraged because you started with the theory (dull?) before the practice (fun!).

My advice is to do it the other way around (the top-down approach): learn how to code, learn how to use the PyData stack (Pandas, sklearn, Keras, etc.), and get your hands dirty building real-world projects with the help of library documentation and YouTube/Medium tutorials. THEN you’ll start to see the bigger picture and notice the theoretical background you lack to understand how those algorithms work. At that moment, studying math will make much more sense to you!

Here’s an article by the fantastic fast.ai team supporting the top-down learning approach.

And another one by Jason Brownlee on his gold mine of a blog, “Machine Learning Mastery”.

Resources

I will divide the resources into three sections (Linear Algebra, Calculus, Statistics and Probability). The resources are listed in no particular order and are a mix of video tutorials, books, blogs, and online courses.

Linear Algebra

Used in machine learning (& deep learning) to understand how algorithms work under the hood. It’s all about vector/matrix/tensor operations; no black magic is involved!
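As a small illustration of the "no black magic" point, here is a minimal sketch of a single linear (dense) layer computed as a matrix-vector product plus a bias. It is written in plain Python so that every multiplication stays visible. The function names `matvec` and `dense_layer` and all the numbers are made up for this example and do not come from any particular library.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def dense_layer(W, b, x):
    """Compute y = W x + b, the core operation of a linear layer."""
    return [y_i + b_i for y_i, b_i in zip(matvec(W, x), b)]


# Hypothetical weights (3 outputs, 2 inputs), bias, and input vector.
W = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 0.5]]
b = [0.5, 0.0, -1.0]
x = [2.0, 1.0]

y = dense_layer(W, b, x)
print(y)  # [4.5, -1.0, 5.5]
```

Libraries in the PyData stack perform the same arithmetic on whole arrays at once, which is why reading their code gets easier once matrix shapes and products feel familiar.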

Calculus

Used in machine learning (& deep learning) to formulate and optimize the functions used to train algorithms to reach their objective, known as loss/cost/objective functions.
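To see where the calculus comes in, here is a minimal sketch of gradient descent on a mean squared error loss for a one-parameter model, y = w * x. The derivative of the loss with respect to w says which way to nudge w. The data, the learning rate, and the names `mse_loss` and `mse_grad` are assumptions chosen for illustration only.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the loss with respect to w: mean of 2 * x * (w * x - y)."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data generated by y = 2 * x, so the best w is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0               # starting guess
learning_rate = 0.05  # step size, picked by hand for this toy problem

for _ in range(200):
    # Step against the gradient to reduce the loss.
    w -= learning_rate * mse_grad(w, xs, ys)

print(round(w, 4))  # 2.0
```

Deep learning frameworks compute these derivatives automatically, but the update rule they apply has the same basic shape as the loop above.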

Statistics and Probability

Used in data science to analyze and visualize data, to discover (infer) helpful insights.
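As a minimal sketch of moving from describing data to inferring something from it, the snippet below summarizes a tiny made-up sample with Python's built-in `statistics` module and then builds a rough 95% interval for the mean. The interval uses a normal approximation (the 1.96 factor), which is an assumption and is crude for a sample this small. The variable names and numbers are hypothetical.

```python
import math
import statistics

# Made-up sample of heights in centimetres.
heights_cm = [160, 165, 170, 175, 180]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(heights_cm)
median = statistics.median(heights_cm)
stdev = statistics.stdev(heights_cm)  # sample standard deviation (n - 1)

# Inference: estimate how uncertain the sample mean is.
std_error = stdev / math.sqrt(len(heights_cm))
low = mean - 1.96 * std_error   # normal approximation, an assumption here
high = mean + 1.96 * std_error

print(mean, median, round(stdev, 2))
print(round(low, 1), round(high, 1))
```

Pandas offers the same summaries through methods such as `describe()` on a DataFrame, so the statistical ideas carry over directly to real datasets.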

Bonus materials

So, that was me giving away my carefully curated math bookmarks folder for the common good! I hope it helps you expand your machine learning knowledge and fight your fear of discovering what’s happening behind the scenes of your sklearn/Keras/pandas import statements.

Your contributions are very welcome, whether by reviewing one of the listed resources or by adding awesome new ones.