The Undeserving One

In the recent Machine Learning (ML) uprising, supervised and unsupervised learning algorithms, such as classification with deep learning and clustering with k-means, have received most of the spotlight. While these algorithms collect flattering praise from an enthusiastic community, something equally powerful and elegant sits calmly and quietly in a dark corner. Its name is Monte Carlo: the forgotten and underappreciated hero of atomic physics, modern finance, and biomedical research, as well as gambling (or a villain, depending on your opinion of these matters).

Note: For brevity, I will refer to supervised and unsupervised learning methods as “ML algorithms” and Monte Carlo methods as “Simulation.”

A Short History

Stanislaw Ulam, Enrico Fermi, and John von Neumann, the geniuses at Los Alamos, invented, improved, and popularized the Monte Carlo method in the 1940s for a not-so-noble cause (hint: it’s not the bomb). Watch the video to find out more.

A Short History of Monte Carlo Simulation (YouTube)

What is Monte Carlo Simulation?

If I were to summarize Monte Carlo simulation in one sentence, it would be this: fake it a billion times until we kind of know what reality looks like.

On a more technical (and serious) level, the goal of the Monte Carlo method is to approximate the expectation of an outcome given various inputs, uncertainty, and system dynamics. This video walks through some high-level mathematics for those who are interested.

Monte Carlo Approximation, YouTube
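To make "approximating an expectation by faking it many times" concrete, here is a minimal sketch in Python. It uses the classic textbook example (not from the article itself) of estimating π: sample random points in the unit square and measure the fraction that lands inside the quarter circle, whose area is π/4.

```python
import random

def estimate_pi(n_samples: int = 1_000_000, seed: int = 42) -> float:
    """Monte Carlo estimate of pi.

    Draw points uniformly from the unit square; the fraction falling
    inside the quarter circle (x^2 + y^2 <= 1) approximates pi/4.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / n_samples

print(estimate_pi())  # approaches 3.14159... as n_samples grows
```

The same pattern, sample inputs, push them through the system, and average the outcomes, generalizes to any expectation: only the sampling distribution and the function being averaged change.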

Why use Simulation?

If I were to highlight one (oversimplified) advantage of Simulation over ML algorithms, it would be this: exploration. We use Simulation to understand the inner workings of a system at any scale (e.g., the world, a community, a company, a team, a person, a fleet, a car, a wheel, an atom).

By re-creating a system virtually, we can calculate and analyze hypothetical results without actually changing the world or waiting for real events to happen. In other words, Simulation allows us to ask bold questions and develop tactics to manage various future outcomes with little risk or investment.
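As a toy illustration of asking a what-if question without touching the real world, consider a hypothetical shop deciding how much stock to order each day. The demand model below (a rounded Gaussian with mean 20) is an assumption invented for this sketch, not something from the article; the point is that we can compare two ordering policies purely in simulation.

```python
import random

def stockout_rate(daily_order: int, n_days: int = 10_000, seed: int = 0) -> float:
    """Estimate the fraction of days the shop runs out of stock.

    Assumes (hypothetically) that daily demand is a rounded Gaussian
    with mean 20 and standard deviation 4.5, truncated at zero.
    """
    rng = random.Random(seed)  # fixed seed so both policies see the same demand
    stockouts = 0
    for _ in range(n_days):
        demand = max(0, round(rng.gauss(20, 4.5)))
        if demand > daily_order:
            stockouts += 1
    return stockouts / n_days

# What-if question: how much does ordering 25 vs. 30 units per day
# change the risk of running out?
print(stockout_rate(daily_order=25))
print(stockout_rate(daily_order=30))
```

No warehouse data is required here, only a model of the process; that trade-off is exactly the one discussed in the next section.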

When to use Simulation, instead of ML?

According to Benjamin Schumann, a well-known simulation expert, Simulation is process-driven while ML is data-centric. To produce a good Simulation, we need to understand the process and underlying principles of a system. In contrast, we can often create reasonably good predictions with ML using only data from a data warehouse and some out-of-the-box algorithms.

In other words, creating a good Simulation is often more expensive, both financially and cognitively. Why, then, would we ever use Simulation?

Well, consider three simple questions:

Do you have data in a data warehouse to represent the business problem?

Do you have enough of these data — quantity- and quality-wise — to build a good ML model?

Is prediction more important than exploration (e.g., asking what-if questions and developing tactics to support business decisions)?

If you answered “No” to any of these questions, you should consider using Simulation instead of ML algorithms.