Welcome to Profitable Python. I’m Ben McNeill, and today we talk to Alexey Grigorev.

Alexey lives in Berlin with his wife and son. He’s a software engineer with a focus on machine learning. He works at OLX Group as a Lead Data Scientist. Alexey is a Kaggle master and he wrote a couple of books. One of them is “Mastering Java for Data Science” and now he’s working on another one — “Machine Learning Bookcamp”. That’s not a typo, it’s called “Bookcamp”.

Alexey, welcome!

Hi Ben, it’s a pleasure to be here.

Glad to have you. Can you tell me more about the book you’re working on?

Yes! The idea is to learn machine learning by doing projects. Every chapter is a new project: you get a dataset, prepare it, and build a model. We also look inside each model to see how it makes its predictions and try to understand it. It’s pretty hands-on, and we keep the theory to a minimum. It’s not always possible to avoid it completely, which is why there are still formulas here and there, but the focus is on coding.

Illustration from the book: the input to an ML algorithm is features and target, and the output is a model

The target reader is somebody who can already code, like a software engineer, or a student who already has experience in coding. As you go through the book, you build a portfolio of projects, and in the end, you know enough to get a job as an ML engineer and continue learning at work.

The book is still in progress, but it’s already possible to have a look inside and decide if you like it or not. There are already three chapters and one appendix. In total, there will be ten chapters, so it’s 30% ready now.

What triggered your idea for writing this book?

I wrote another book some time ago — it was about Java for data science. I really liked it, but it turned out to be a pretty niche topic: Java is not that popular for ML. Today, if you want to do ML, you use Python, not Java. So, I thought, I still have things that I want to share with the world, but the world is not interested in Java, so let’s write another book — this time using Python.

The idea of learning by doing projects is something I picked up from Kaggle, a website that hosts data science competitions. On Kaggle, a company prepares a dataset and then asks the community to train the best possible model for it.

It was only on Kaggle that I really learned ML. I had spent years at university learning theory, but it wasn’t helpful without proper practice. And, to be honest, that theory wasn’t really needed for the competitions: participating is a very hands-on activity that requires a lot of coding. You generate new features, throw them into a library, tweak the parameters, see the results, and keep repeating until the competition is over.

I took part in more than 10 competitions, and each one of them was extremely useful

This made me realize that the best way of learning ML is not studying it, but doing it: each Kaggle competition is a project, and by doing these projects I was able to actually learn ML. To me, it was more helpful than watching how equations are solved on a blackboard.

That’s why I think the project-based approach is the best way for software engineers and other people who can already code, and that’s why I decided to write this book.

Cool. I’m excited to see how that progresses. Can you tell us about your background and how you became a machine learning expert?

I’m originally from a small city in the Far East of Russia. There’s a small university there, and the things I studied were a bit outdated: I was learning to use Delphi and things like that. In 2006, that was all the organizations in my region needed. Still, we learned some fundamental things, like databases and automating business processes.

That was fun, but the subjects I really enjoyed were math and stats. Unfortunately, there was no way to apply these skills: everybody just needed a database.

Eventually, I moved to a bigger city in central Russia. Companies there didn’t need Delphi, they needed Java and they needed web services. So I switched my focus to Java and worked as a Java developer for a while.

In 2012 Coursera became popular. One of the courses there was machine learning. I happened to watch that course and it changed my life: Java and databases were fun, but ML got me, so I decided to go that way.

The famous Machine Learning course by Andrew Ng

Around that time, data science positions also started to appear on the market, though in small numbers. Back then I lived in Krakow, Poland, and there were a couple of open positions, but all of them required a PhD and five years of Hadoop experience. The companies didn’t understand what they wanted to get from data science; they had just heard about ML and thought, “let’s hire somebody with a PhD and let them figure out what we want”.

Data science interviews in 2013 (Photo by Van Tay Media on Unsplash)

It was a difficult time to switch from Java to data science. That’s why I decided to get some additional education and do a master’s.

After graduating, I suddenly discovered that companies didn’t need a PhD anymore. The market had figured out what it needed from data scientists: a PhD is a nice-to-have, but not a must. And since then the demand for data science and ML has only been growing.

Google trends: the interest in data science and machine learning is growing steadily (link)

When I started working as a full-time data scientist, I found out that having a background in software engineering is very helpful. Training a model alone is not enough: you need to go to production, and this is something many data scientists struggle with. They know how to read papers and implement the algorithms from them, but going to production is a different thing. That’s why some exposure to software engineering helps data scientists make use of their models and solve business problems faster.

A software engineer is doing data science in production (Photo by Alora Griffiths on Unsplash)

I also noticed that many companies saw that and started looking for these kinds of people — software engineers who know ML, maybe not in-depth, but enough to actually train a model and then make a business impact.

That’s interesting, and a little counterintuitive.

Yes. I tried to go deep into the math, but in the end, it wasn’t needed. Even though it’s fun to study, the job I and many of my colleagues do is different. We don’t solve math equations. We spend months preparing a dataset, then take a library, wait ten minutes for training to finish, get a model, and then spend months more productionizing the model.

Training the model itself doesn’t take a lot of time, yet the focus of many university programs and online courses is on the mathematics behind ML. Yes, it’s still needed: as a data scientist, you sometimes need to understand what’s going on inside the libraries, to the same extent that a software engineer needs to know how TCP/IP works. However, not many software engineers need to go deep down the network stack daily. Maybe once in five years, when there’s a problem, but day-to-day work typically doesn’t require that.

OSI Model. Quite useful for software engineers, but maybe not needed for the day-to-day work (source: https://commons.wikimedia.org/wiki/File:Osi-model.png)

So, you just need to have broad knowledge and good exposure to different techniques. Knowing the details is good, but not always necessary. All you need is the foundations — to be able to notice if things aren’t right. But when you face a problem, then you need to go deeper.

I did it the other way around and tried to understand everything in advance. Instead, I should have focused on hands-on skills and dug deeper only when required.

I noticed that on your CV you mention “rapid prototyping skills”. How do you use them?

For software engineering projects, especially for ML, it’s important to validate ideas as fast as possible.

Often you can validate an idea without even coding anything.

Imagine that we want to build a system that determines if an image is good or not: it’s properly framed, in focus and has good light exposure. Instead of investing time in a model, we ourselves can just look at the pictures and decide if the quality is good or not. If it’s not, then we send emails to the sellers and see how they react. If their reaction is positive and it’s something they need, then we invest time into building an actual model.

Me during rapid prototyping (Image: old Soviet cartoon https://www.youtube.com/watch?v=_39zAeiNXxo — in Russian with English subtitles)

When it comes to prototyping, you can quickly build something, e.g. a simple Flask app in Python that initially might not even follow the best engineering standards. The focus is on speed because you want to validate the idea first before investing time into it. Then you present this to the users, or the stakeholders within the company, and see if they’re interested in this project.
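As a sketch of what such a throwaway prototype might look like, here is a minimal Flask app. The endpoint name and the scoring rule are invented for illustration; the point is that a crude heuristic behind an HTTP endpoint is enough to put something in front of stakeholders:

```python
# A minimal prototype service: wrap a crude placeholder "model" in a
# Flask endpoint so stakeholders can try it out. The scoring rule is
# a made-up heuristic, not a real trained model.
from flask import Flask, request, jsonify

app = Flask(__name__)

def score_image(width, height):
    # Placeholder rule: call an image "good" if it's large enough
    # and roughly landscape-oriented.
    return width >= 800 and 1.2 <= width / height <= 2.0

@app.route("/quality")
def quality():
    width = int(request.args.get("width", 0))
    height = int(request.args.get("height", 1))
    return jsonify({"good": score_image(width, height)})

if __name__ == "__main__":
    app.run(port=5000)
```

A prototype like this can be demoed the same day; if the idea survives the demo, the heuristic inside `score_image` is what you later replace with an actual model.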

Moving fast also means failing fast and learning fast: often the decision-makers will say “this is not what we meant, we don’t need this thing”. It’s very good to get that feedback early: spending half a year only to find out the project is not needed is quite frustrating.

How do you become an expert at rapid prototyping?

Time-box your task: give it, say, only five days, and then do as much as possible during those days. Move fast, but don’t let it take more time than planned.

If you only have one week, you start thinking about the most important things to focus on. Then within a week, you have a working system and can demo it.

How do you convince somebody to hire you as a machine learning professional without much effort?

Well, first of all, buy my book (laughs).

But, seriously, do projects.

However, to get a job, you may also need to get noticed. If you have an idea, find or collect a dataset, implement the idea, and train a model. But don’t stop there: put the code on GitHub, write a blog post about it, and share it on social media. If you do this for ten projects, you’ll get noticed for sure.

What should machine learning newcomers avoid?

In the beginning, try to avoid getting too deep into theory. For example, some algorithms, like SVM, require a strong mathematical foundation to understand: two years of calculus and then a year of convex optimization on top of that. This is quite serious, and you might think, “I don’t understand anything, so I’m quitting”. And people give up. Don’t. Don’t let these equations scare you.

A data science newcomer trying to understand SVM (Photo by Tom Pumford on Unsplash)

The ML libraries hide this complexity. You can use them without being afraid of what’s inside — just like a software engineer can create a web service without knowing how TCP/IP works. Of course, knowing the foundations is important, but as a newcomer, you should focus on learning fast rather than going deep.

Scikit-Learn: a great library for doing machine learning (scikit-learn.org)

So stay at the right level of abstraction: for a beginner, that might be a library you treat as a black box. When you need to, go deeper. Don’t try to peel the onion all at once.
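To make the black-box idea concrete, here is what using Scikit-Learn looks like without touching the math inside. The tiny dataset is synthetic, just to show the interface:

```python
# Treating an ML library as a black box: fit, predict, done.
# The dataset is synthetic and only illustrates the interface.
from sklearn.linear_model import LogisticRegression

# Features: [engine_size_liters, age_years]; target: 1 = expensive, 0 = cheap
X = [[2.0, 1], [1.0, 10], [3.0, 2], [1.2, 8]]
y = [1, 0, 1, 0]

model = LogisticRegression()
model.fit(X, y)          # all the optimization math happens in here

prediction = model.predict([[2.5, 3]])
print(prediction)
```

The entire mathematical machinery of logistic regression hides behind `fit` and `predict`; you can be productive with this interface long before you understand convex optimization.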

What are the key learnings from running ML models in production?

Go as fast as possible and try to deploy a model as quickly as you can. One of the most common reasons projects fail is spending too much time on making the perfect model. Don’t. Start with a simple baseline, deploy it to production, and see how people react to it.

Don’t make it overly complex. For example, if you want to build a price suggestion model, start with the average price per category. In the case of a car, it can be as simple as the average per make and model. Don’t start with a neural network or something equally complex.
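A per-group average baseline like this fits in a few lines of plain Python. The listings data here is made up for illustration; a real version would read from your database:

```python
# Baseline price suggestion: average price per (make, model) group.
# Toy data for illustration; a real version would query your listings.
from collections import defaultdict

listings = [
    ("Toyota", "Corolla", 8200),
    ("Toyota", "Corolla", 7800),
    ("BMW", "320i", 15500),
    ("BMW", "320i", 14500),
]

# Accumulate [total_price, count] per (make, model) pair.
totals = defaultdict(lambda: [0.0, 0])
for make, model, price in listings:
    totals[(make, model)][0] += price
    totals[(make, model)][1] += 1

def suggest_price(make, model):
    total, count = totals[(make, model)]
    return total / count

print(suggest_price("BMW", "320i"))  # 15000.0
```

A baseline like this can go to production in a day, gives you a number to beat, and tells you whether users care about price suggestions at all before you invest in anything fancier.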

If you invest a lot of time in a system, it’s hard to let it go. It’s known as the IKEA effect: if you build something yourself, you love it, even if for others it’s nothing special. Imagine you’re building a chair, you spend three hours building it and you finally made it — it’s not falling apart and you can sit on it. You love this chair, but for others, it may look pretty ugly. The same thing happens with software: if people build something for a long time, they have trouble letting it go.

This is what happens when I build a chair myself (Photo by William Warby on Unsplash)

I have this problem myself, so just being aware of it is already helpful. At some point, it doesn’t make sense to continue working on a project, and you just need to stop.

But if at the beginning you spend only one week and it doesn’t solve the problem — you stop working on it. You learn from the feedback early and move on.

Best advice you’ve ever received?

“Math is not as important as you think, but solving business problems is more important than it seems”. If your program is just a bunch of “if” statements, but it solves a problem — it’s good. So start with simple heuristics, and if you see that it’s helpful, move on to do more complex things.
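A “bunch of if statements” that solves a business problem might look like this hypothetical scam-listing filter for a classifieds site. All the thresholds and rules are invented for illustration:

```python
# A heuristic "model" that is literally a bunch of if statements:
# flag suspicious classifieds listings. Thresholds are invented.
def is_suspicious(price, category_avg_price, description):
    if price < 0.2 * category_avg_price:
        return True  # far too cheap for its category
    if len(description) < 10:
        return True  # almost no description
    if "wire transfer only" in description.lower():
        return True  # common scam phrase
    return False

print(is_suspicious(500, 10000, "Great car, wire transfer only"))  # True
```

If rules like these catch most of the problem, that is a win; a more complex model is only justified once the simple heuristics stop being helpful.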

Nobody cares what’s inside your service as long as it solves business problems

How do you decide what speaking opportunities to take on?

I will take on pretty much anything because I don’t have that many speaking opportunities at the moment. I’m still learning how to make my proposals sound exciting, so the conference organizers accept them and let me speak at their events.