I recently joined Stanford University as Part-Time/Adjunct Faculty. This quarter, I am preparing a course on "Reinforcement Learning for Finance" that I will teach in the Winter quarter. I am also spending some time this quarter with graduate students who are overwhelmed by the rapid changes in the industry (driven by the hype around Artificial Intelligence, Machine Learning, and Data Science). The key question they ask me is how they should optimize their time at Stanford to prepare for the industry, bearing in mind that the next 5-10 years will see further rapid changes in this space. Having pondered this topic over the past few weeks while advising my students (particularly those specializing in Mathematical and Computational Finance), I thought I should write about it to help a wider community of people in the space of "Data Science" (students as well as those new to the industry). The goal here is to identify the relevant topics that will help this population not just with the current market situation but also in preparing for a robust career, bearing in mind how the world is likely to change over the next few decades.

You will notice that I have put "Data Science" in quotation marks. The reason is that my typical advice to aspiring "Data Scientists" is to zoom out a bit and acquire slightly broader skills in what I call "Applied Mathematics and Computation". My definition of "Applied Mathematics and Computation" includes some mathematical/theoretical foundations, some algorithmic/computational skills, and some data/statistical skills. The idea is to future-proof yourself by going a bit broader than the topics that are in vogue today (some of which are unlikely to be popular in, say, 5-10 years). My advice is biased toward foundational strengths that will stand the test of time, but I do make some effort to cater to current hot topics like Deep Learning and Reinforcement Learning. Below I have laid out the key topics, roughly at the level of advanced undergraduate or Master's courses. There will be some overlap of content across typical courses on these topics, and it can be a bit tricky to determine the best sequence of courses. However, I have made some effort to lay them out in a sensible sequence based on typical prerequisite considerations.

Linear Algebra

Partial Differential Equations

Measure-Theoretic Probability

Convex Optimization

Statistical Inference

Discrete Mathematics

Scientific Computing/Numerical Analysis

Data Structures and Algorithms

Software Design Paradigms in Python and C++

Stochastic Calculus

Stochastic Optimization

Managing/Analyzing Large Data Sets

Parallel/Distributed Computing

Deep Learning

Reinforcement Learning

One domain/practical project course focused on modeling/algorithms

One domain/practical project course focused on big data

This is a lot of content, and some people might need to cover prerequisites that I have not listed here. If you are a Master's student, try to stay for 2 years at your university so you can actually get the above coverage. If you are a newcomer to the industry, you will have to pace yourself, as it can be hard to balance your job with learning these topics - you will want to give yourself a few years to get through them (hopefully you have already covered some of them while in school).

It's important to recognize that there are no shortcuts to success. Some of these topics might seem "ancient" and too theoretical, but if you want to do well in "Data Science" (and really in the broader space of "Applied Mathematics and Computation"), you will need to develop an adequate foundational understanding across both Mathematics and Computer Science. Hence the demanding structure of my prescribed topics.

Finally, I want to emphasize two points that will go a long way in developing a deep and sustained understanding of these topics.

For each theoretical/abstract topic/concept you learn, remember to combine the rigor/notation/equations with a geometric/visual/intuitive understanding. Mathematics is as much about abstraction as about visualization.

For each new concept you learn, quickly code up the idea or algorithm in say Python - this not only helps etch the concept in your head, it's also joyful to experience the theoretical beauty of mathematics brought to life in your code.
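
As a small illustration of what coding up a concept can look like, here is a minimal sketch in Python. The choice of concept (power iteration for the dominant eigenvector, from Linear Algebra) is my own example for this purpose, and the function names and the 2x2 matrix are hypothetical. The geometric picture is that repeatedly applying a matrix to a vector stretches it toward the direction of the largest eigenvalue.

```python
import math


def mat_vec(a, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(a, v0, num_steps=50):
    """Estimate the dominant eigenvalue and eigenvector of a symmetric matrix.

    Assumes the starting vector v0 has a non-zero component along the
    dominant eigenvector, and that the largest eigenvalue is strictly
    larger in magnitude than the others.
    """
    v = normalize(v0)
    for _ in range(num_steps):
        # Each multiplication amplifies the dominant direction the most.
        v = normalize(mat_vec(a, v))
    # Rayleigh quotient v^T A v (v has unit length) estimates the eigenvalue.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(a, v)))
    return eigenvalue, v


# Illustrative matrix: eigenvalues are 3 and 1, and the dominant
# eigenvector points along the diagonal direction (1, 1).
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A, [1.0, 0.0])
print(eigenvalue, eigenvector)
```

Plotting the intermediate vectors as they rotate toward the diagonal is a natural next step for building the visual intuition alongside the equations.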

Good luck, and remember to have fun while you learn! As always, I am happy to further advise those of you I get to meet/work with at Stanford/in the Bay Area.