A founding member of Google Brain and the mind behind AutoML, Quoc Le is an AI natural: he loves machine learning and loves automating things.

In 2011, as a PhD student at Stanford University, Le used millions of YouTube thumbnails to develop an unsupervised learning system that recognized cats. In 2014, he pushed machine translation performance with deep learning techniques, building an end-to-end system that automatically converted words and documents into vector representations and laying the groundwork for Google’s subsequent breakthroughs in neural machine translation.

Quoc Le

Since 2014, Le has set his sights on automated machine learning (AutoML). Building machine learning models essentially requires repetitive manual tuning: Researchers try different architectures and hyperparameters on an initial model, evaluate its performance on a dataset, make changes, and repeat the process until the model is good enough.

Le sees this as a simple trial-and-error problem that can be solved by machine learning.

In 2016, Le teamed up with a Google resident and published the seminal paper Neural Architecture Search with Reinforcement Learning. The core idea resembles building blocks: The machine picks the components it needs from a defined search space to assemble a neural network, then improves the design’s accuracy through trial and error — that is, reinforcement learning. The results were promising: the machine-generated models matched the best human-designed models.
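To make the idea concrete, here is a toy sketch of that search loop. Everything in it is illustrative: the three-decision search space, the reward function (a stand-in for actually training and validating a child network), and the bandit-style preference update are simplified assumptions, not the paper’s method, which trains an RNN controller with policy-gradient reinforcement learning.

```python
import random

# Hypothetical toy search space: each architecture is one choice per decision.
SEARCH_SPACE = {
    "filter_size": [1, 3, 5],
    "num_layers": [2, 4, 8],
    "skip": [False, True],
}

def sample_architecture(prefs):
    """Sample one option per decision, weighted by learned preferences."""
    return {
        key: random.choices(options, weights=prefs[key])[0]
        for key, options in SEARCH_SPACE.items()
    }

def evaluate(arch):
    """Stand-in for training a child network: a made-up reward that
    happens to favor 3x3 filters, deeper stacks, and skip connections."""
    reward = {1: 0.2, 3: 0.5, 5: 0.3}[arch["filter_size"]]
    reward += {2: 0.1, 4: 0.2, 8: 0.3}[arch["num_layers"]]
    reward += 0.2 if arch["skip"] else 0.0
    return reward

def search(steps=500, lr=0.1, seed=0):
    random.seed(seed)
    # Start with uniform preferences over each decision.
    prefs = {k: [1.0] * len(v) for k, v in SEARCH_SPACE.items()}
    baseline = 0.0  # moving-average reward, to reduce update variance
    for _ in range(steps):
        arch = sample_architecture(prefs)
        reward = evaluate(arch)
        # Reinforce the sampled choices in proportion to (reward - baseline).
        for key, options in SEARCH_SPACE.items():
            i = options.index(arch[key])
            prefs[key][i] = max(1e-3, prefs[key][i] + lr * (reward - baseline))
        baseline = 0.9 * baseline + 0.1 * reward
    # Return the currently most-preferred option for each decision.
    return {k: SEARCH_SPACE[k][prefs[k].index(max(prefs[k]))] for k in SEARCH_SPACE}

best = search()
print(best)
```

The essential shape matches the paper: a controller proposes architectures, each proposal is scored, and the score feeds back to bias future proposals toward what worked.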

Le’s research contributed to the creation of Google Cloud AutoML, a suite of tools that enables developers with limited machine learning expertise to train high-quality models. Unsurprisingly, AutoML quickly became a popular topic, with tech giants and startups alike following in Google’s footsteps and betting on the new tech.

Google Cloud released AutoML Vision earlier this year, followed by AutoML Translation and AutoML Natural Language.

Synced recently spoke with Le. In a wide-ranging video interview, the unassuming 36-year-old Vietnamese AI expert spoke about his inspiration, the technology behind AutoML, the road ahead, and AutoML’s important new role in the machine learning field. Read on for insight into the man behind so many transformative technologies. The interview has been edited for brevity and clarity.

At the upcoming AI Frontiers Conference on Nov 9 in San Jose, California, Quoc Le will give a talk on “Using Machine Learning to Automate Machine Learning,” with a special focus on Neural Architecture Search and AutoAugment.

The Inspiration

When did you start thinking about designing a new neural architecture search and what inspired you?

It goes back to around 2014 and happened gradually over time. I’m an engineer in machine learning. When you work with neural networks all the time, what you realize is that a lot of them require manual tuning of what people call hyperparameters — the number of layers in the neural network, the learning rate, and what type of layers go into these networks. AI researchers tend to start with some principles, and then over time the principles kind of break loose and they try different things. I followed some of the developments in the ImageNet competitions and I saw the development of Inception networks at Google.

I started thinking but wasn’t clear on what I wanted to do. I like convolutional networks, but I don’t like the fact that the weights in a convolutional network are not shared with each other. So I thought that maybe I should develop a mechanism to actually learn how to share weights in a neural network.

As I moved along I gained more and more intuition about this and then I looked into what to do. What researchers do is they take a bunch of existing building blocks, and then they try them out. They see some accuracy improvement. And then they say, “Okay, maybe I just introduced a good idea. How about keeping the good things I just introduced but replacing the old things with something new?” So they continue in that process — and an expert in this area could try hundreds of architectures.

Around 2016, I began thinking that if a process requires so much trial and error, we should be applying machine learning to it, because machine learning itself is also based on trial and error. If you think about reinforcement learning and the way a machine learned to play the game of Go, it is basically trial and error.

I worked out how much real compute I would need to do this. My thinking was a human might require a hundred networks because humans already have a lot of intuition and a lot of training. If you use an algorithm to do this, you might be one or two orders of magnitude slower. I thought actually that to be one or two orders of magnitude slower wasn’t too bad, as we already had sufficient compute power to do it. So I decided to start the project with a resident (Barret Zoph, who is now a Google Brain researcher).
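That estimate works out roughly as follows (the figures are the ones Le cites, not measured numbers):

```python
# Back-of-envelope estimate from the interview: a human expert might try
# on the order of a hundred architectures; an automated search, lacking
# human intuition, could be one to two orders of magnitude slower.
human_trials = 100
slowdown_low, slowdown_high = 10, 100  # one to two orders of magnitude

print(human_trials * slowdown_low, human_trials * slowdown_high)  # 1000 10000
```

Training a few thousand to ten thousand candidate networks was already within reach of Google’s compute at the time, which is what made the project feasible.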

I didn’t expect that it would be so successful. I thought that the best we could do was maybe 80 percent of human performance, and that would be a success. But the resident was so good that he was actually able to match human performance.

A lot of people said to me: “You spent so many resources to just match human level?” But what I saw from that experiment was that automated machine learning was now possible. It was just a matter of scale. So if you scale more, you get a better result. We continued into the second project and we scaled even more and worked on ImageNet, and then the results started to become really promising.

Can you tell us about Jeff Dean’s involvement?

Well, he was very supportive. Actually I want to credit Jeff Dean for his help in the inception of the idea.

I had lunch with Jeff in 2014 and he shared a very similar intuition. He suggested that if you looked closely at what researchers were doing in deep learning at that time, they were spending a lot of time tuning architectures and hyperparameters and so on. We thought there must be a way to automate this process. Jeff likes scaling and automating the difficult stuff that most tech people don’t like to do. Jeff encouraged me and I finally decided to do it.

Head of Google AI Jeff Dean

How different is neural architecture search from your previous research?

It’s different from what I did before in computer vision. The journey came from a thought and grew over time. I also had some wrong ideas. For example, I wanted to automate and rebuild the convolution, but that was the wrong intuition. Maybe I should have accepted the convolution and then used the convolution to build something else? It was a learning process for me, but it wasn’t too bad.

The Technology

What sort of components does a researcher or engineer need to build a neural network model?

It does vary a little bit amongst applications, so let’s narrow it down to computer vision first — and even within computer vision, there’s a lot of stuff going on. Typically in a convolutional network you have an input which is the image, and then you have a convolutional layer, and then a pooling layer, and then batch normalization. And then there’s an activation function, and then you decide whether to make a skip connection to a new layer, and things like that.

Within the convolutional blocks, you have many additional decisions. For example, in the convolution you must decide the size of the filter: Is it 1×1? 3×3? 5×5? You also have to decide on pooling and batch norm. For the skip connection, you can connect layer one to layer ten or layer one to layer two. So there are a lot of decisions to be made and a huge number of possible architectures. There could be a trillion possibilities, but humans have only explored a tiny fraction of what’s possible.
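A back-of-envelope count shows how quickly these decisions compound. The specific menu below (filter sizes, pooling, batch norm, a binary activation choice, and pairwise skip connections across ten layers) is an illustrative assumption, not an actual search space from the paper:

```python
# Count architectures in a hypothetical ten-layer search space.
filter_sizes = 3   # 1x1, 3x3, or 5x5 convolution
use_pooling = 2    # pool after the layer, or not
use_batchnorm = 2  # batch norm, or not
activations = 2    # e.g. ReLU vs. tanh (assumed binary choice)
choices_per_layer = filter_sizes * use_pooling * use_batchnorm * activations  # 24

layers = 10
# Each layer may take a skip connection from any earlier layer:
# one on/off bit per (earlier, later) pair of layers.
skip_patterns = 2 ** (layers * (layers - 1) // 2)

total = choices_per_layer ** layers * skip_patterns
print(f"{total:.2e} possible architectures")
```

Even this modest menu yields on the order of 10^27 combinations — comfortably past the “trillion possibilities” Le mentions, and far more than any human team could try by hand.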