We recently caught up with Ryan Adams - Assistant Professor of Computer Science at the Harvard School of Engineering and Applied Sciences and leader of the HIPS (Harvard Intelligent Probabilistic Systems) group - to learn more about the research underway at HIPS and his recent work putting powerful probabilistic reasoning algorithms in the hands of bioengineers...

Hi Ryan, first of all, thank you for the interview. Let's start with your background.

Q - What is your 30 second bio?

A - I grew up in Texas, where my family has a ranch. I went to MIT for EECS and spent some time at NASA and in industry. I got my PhD in Physics at Cambridge University as a Gates Cambridge Scholar. I spent two years as a CIFAR Junior Research Fellow at the University of Toronto. I joined Harvard in the School of Engineering and Applied Sciences two and a half years ago as an assistant professor.



Q - How did you get interested in Machine Learning?

A - I was deeply interested in Artificial Intelligence, and as an undergrad received the excellent advice to work with Leslie Kaelbling, a premier researcher in the field and professor at MIT.



Q - What was the first data set you remember working with? What did you do with it?

A - As an undergrad, I spent some time doing financial modeling with neural networks.





Ryan, an impressive and interesting background - thank you for sharing. Next, let's talk more about Intelligent Probabilistic Systems and your work at HIPS.



Q - What excites you most about your work at HIPS?

A - I'm most excited about my fantastic students and collaborators, and the range of science that we can all do together. We pursue a lot of different research interests in my group. I'm excited about our new theoretical and methodological developments in areas such as Markov chain Monte Carlo, Bayesian optimization, Bayesian non-parametrics, and deep learning. I'm also excited about our collaborations in astronomy, chemistry, neuroscience, and genetics.



Q - What are the biggest areas of opportunity / questions you want to tackle?

A - There are several big questions that I'd like to make progress on in the near future:

How do we scale up computations for Bayesian inference, to reason under uncertainty even when data sets become large?

How do we perform optimizations over complicated structured discrete objects?

How can we automatically discover hierarchical modularity in data?

Q - What is the most interesting model / tool / computational structure you have developed thus far?

A - Of the work we've done recently, our stuff on practical Bayesian optimization is the hottest, I think. We're actually working on a startup based on this technology as well. (Keep an eye on whetlab.com for developments.)



Q - What problem does it solve?

A - It optimizes difficult functions, but in particular, it can automatically tune other machine learning algorithms. It's had big success in tuning deep learning procedures without human intervention. Our open source software tool (called "Spearmint") has enabled non-experts to apply machine learning to novel domains.



Q - How does it work?

A - It uses relatively sophisticated Bayesian modeling and inference for Gaussian process function models to make recommendations on what function evaluations to try. The idea is to use information theory to make good decisions about optimization.
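Editor Note - For readers who would like to see the general shape of this idea in code, here is a minimal sketch of Bayesian optimization with a Gaussian process in one dimension. It is an illustration under several assumptions and is not Spearmint's code or API: the kernel hyperparameters are fixed rather than marginalized over, candidates come from a simple grid, and the acquisition rule is expected improvement (a common choice, swapped in here for simplicity in place of an information-theoretic criterion). All function and parameter names are hypothetical.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    chol = np.linalg.cholesky(k_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_oq.T @ alpha
    v = np.linalg.solve(chol, k_oq)
    # Prior variance is 1.0 (the default kernel variance above).
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement over the best value seen so far (minimization)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def bayes_opt(objective, n_init=3, n_iter=15, seed=0):
    """Minimize objective on [0, 1] by repeatedly evaluating the point
    with the highest expected improvement under the GP model."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)  # candidate evaluation points
    x_obs = rng.uniform(0.0, 1.0, size=n_init)
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        scores = expected_improvement(mean, std, y_obs.min())
        x_next = grid[np.argmax(scores)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best]


if __name__ == "__main__":
    # Toy objective standing in for an expensive function to tune.
    x_best, y_best = bayes_opt(lambda x: (x - 0.3) ** 2)
    print(x_best, y_best)
```

In a real tuning problem the objective would be something expensive, such as the validation error of a model trained with a given setting, which is why it pays to let the model choose each evaluation carefully.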



Q - What has been the most surprising insight it has generated?

A - It's a case where marginalizing over uncertainty in a probabilistic model really gives a huge win. It turns out that humans are pretty bad at these problems in more than a couple of dimensions, but that machine learning algorithms can often do a great job.





Very interesting - look forward to hearing more about Whetlab in the near future! Let's talk about your recent work with Wyss Institute for Biologically Inspired Engineering, which has shown how AI algorithms could be implemented using chemical reactions...



Q - What question / problem were you trying to solve?

A - We were initially working on distributed inference algorithms for robotics, but we realized that chemical reactions mapped much better onto inference problems. We then focused on figuring out how chemical reaction networks could be used to implement the belief propagation algorithm.
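Editor Note - For readers unfamiliar with belief propagation, the sketch below shows what the algorithm computes when run on an ordinary digital computer: the marginal distribution of each variable in a small chain-structured graphical model, obtained by passing local messages between neighbours. It does not model chemical reactions and is not the construction from Ryan's paper; the function names and the toy potentials are hypothetical, chosen only for illustration. A brute-force enumeration is included as a check.

```python
import itertools
import numpy as np


def chain_marginals(unary, pairwise):
    """Sum-product belief propagation on a chain.

    unary: list of n arrays of shape (k,), one potential per node.
    pairwise: list of n - 1 arrays of shape (k, k); pairwise[i][a, b]
        couples state a of node i with state b of node i + 1.
    Returns a list of n normalized marginals.
    """
    n = len(unary)
    fwd = [np.ones_like(unary[0], dtype=float) for _ in range(n)]
    bwd = [np.ones_like(unary[0], dtype=float) for _ in range(n)]
    # Forward pass: message arriving at node i from node i - 1.
    for i in range(1, n):
        msg = (unary[i - 1] * fwd[i - 1]) @ pairwise[i - 1]
        fwd[i] = msg / msg.sum()
    # Backward pass: message arriving at node i from node i + 1.
    for i in range(n - 2, -1, -1):
        msg = pairwise[i] @ (unary[i + 1] * bwd[i + 1])
        bwd[i] = msg / msg.sum()
    beliefs = [unary[i] * fwd[i] * bwd[i] for i in range(n)]
    return [b / b.sum() for b in beliefs]


def brute_force_marginals(unary, pairwise):
    """Same marginals by summing over every joint state (exponential cost)."""
    n, k = len(unary), len(unary[0])
    marg = [np.zeros(k) for _ in range(n)]
    for states in itertools.product(range(k), repeat=n):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[i][states[i], states[i + 1]]
        for i, s in enumerate(states):
            marg[i][s] += weight
    return [m / m.sum() for m in marg]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    unary = [rng.uniform(0.5, 2.0, size=3) for _ in range(4)]
    pairwise = [rng.uniform(0.5, 2.0, size=(3, 3)) for _ in range(3)]
    for bp, exact in zip(chain_marginals(unary, pairwise),
                         brute_force_marginals(unary, pairwise)):
        print(np.round(bp, 4), np.round(exact, 4))
```

Each message depends only on a node's neighbours, which is what makes the algorithm a natural candidate for distributed implementations of the kind discussed here.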



Q - How is AI/Machine Learning helping?

A - It's not that AI/ML are helping, it's that in the longer term we're hoping these algorithms will be useful for synthetic biology.



Q - Got it, so what answers/insights did you uncover?

A - We showed that these important classes of computations can be performed without needing a digital computer. In particular, chemical reactions turn out to be a very natural substrate for graphical model computation.



Q - What are the next steps?

A - In addition to working with experimentalists to try to implement these ideas in vitro, we have several theoretical directions we want to pursue. For example, these algorithms should lead to improved error correction in synthetic biology implementations.



Editor Note - If you are interested in more details on this research, Ryan's paper on Message Passing Inference with Chemical Reaction Networks is very insightful; and recent press coverage, such as this Phys.org report, provides a little more color ... Now, back to the interview!



Finally, let's talk a little about helpful resources and where you think your field is headed...



Q - What publications, websites, blogs, conferences and/or books are helpful to your work?

A - The main ML publication venues that I read and contribute to are:

Conferences: NIPS, ICML, UAI, AISTATS

ML Journals: JMLR, IEEE TPAMI, Neural Computation, Machine Learning

Stats Journals: JASA, Annals of Statistics, Journal of the Royal Statistical Society, Biometrika, Bayesian Analysis, Statistical Science, etc.

Blogs: Andrew Gelman, Radford Neal, Larry Wasserman, Yisong Yue, John Langford, Paul Mineiro, Yaroslav Bulatov, Il 'Memming' Park, Hal Daume, Danny Tarlow, Christian Robert, and others I'm sure I've forgotten.

Q - What does the future of Machine Learning look like?

A - I think machine learning will continue to merge with statistics, as ML researchers come to appreciate the statistical approach to these problems, and statisticians realize that they need to have a greater focus on algorithms and computation.



Q - What is something a smallish number of people know about that you think will be huge in the future?

A - I think some of the recent work on the generalized method of moments (and tensor decomposition) is very interesting. I also think that the area of Bayesian optimization is going to get bigger, as we figure out how to tackle harder and harder problems. People are also beginning to understand better the behavior of approximate inference algorithms, which will become a bigger deal I expect.



Q - Any words of wisdom for Machine Learning students or practitioners starting out?

A - Go deep. Learn all the math you can. Ignore artificial boundaries between institutions and disciplines. Work with the best people you can. Be wary of training in "data science" where you just learn to use other people's tools. To innovate, you have to learn how to build this stuff yourself.





Ryan - Thank you ever so much for your time! Really enjoyed learning more about the work you are doing at HIPS and where it could go next.

HIPS can be found online at http://hips.seas.harvard.edu and Ryan Adams at http://www.seas.harvard.edu/directory/rpa.



Readers, thanks for joining us!



If you enjoyed this interview and want to learn more about what it takes to become a data scientist, what skills you need, and what type of work is currently being done in the field, then check out Data Scientists at Work - a collection of 16 interviews with some of the world's most influential and innovative data scientists, who each address all the above and more! :)