Long Beach was really pretty at night.

I recently had the opportunity to attend the Neural Information Processing Systems conference (NIPS). I'm glad that I attended because I learned a lot and made a lot of really good connections (and as a bonus I was able to see Southern California, which I had never seen before, and escape the harsh Maine winter). Several overarching themes and techniques seemed to come up throughout the conference and the workshops, and several applications of ML consistently received a lot of attention. Since one could easily write a dozen articles on NIPS, in this article I'm going to briefly survey what I thought constituted these overarching themes. Later, I will write more detailed articles on the ones that I personally explored at NIPS and leave the others to people who attended them. I have made related links available at the bottom of the page so that you can explore the topics at your leisure.

(Note that NIPS is such a large conference that I could easily be missing something, and to a certain extent these themes may reflect my own experiences and the people I hung out with. Nonetheless, I tried my best to capture all the ideas that seemed prominent across "disciplines" at NIPS.)

Additionally, before getting started, it seems proper to mention that while the conference thankfully was not affected by the fires, many people in Southern California were not so fortunate. So please donate to the United Way Campaign relief fund or one of the other charities if you get a chance and are able to.

Bayesian Deep Learning (and Deep Bayesian Learning)

Applying Bayesian techniques to deep learning was a huge topic at NIPS. On Thursday Yee Whye Teh gave a keynote on Bayesian Deep Learning and Deep Bayesian Learning. He described how synchronization works in distributed Bayesian deep learning, explaining that the server essentially maintains a posterior distribution over the parameters rather than an authoritative copy of them. He then tied this idea to the problem of overcoming catastrophic forgetting in neural networks and to elastic weight consolidation. In the second part of the talk he described how deep learning techniques could improve Bayesian learning: DL can help overcome some of the rigidity of Bayesian models and make inference in them more scalable. In addition to the keynote there were several spotlights on Bayesian techniques, one of my personal favorites being the one on Bayesian GANs.
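
Since elastic weight consolidation came up in the talk, here is a minimal NumPy sketch of the EWC penalty from Kirkpatrick et al. to make the idea concrete: weights that mattered for a previous task (high Fisher information) are pulled back toward their old values while the network trains on a new one. The function and variable names are my own, and `lam` is a hypothetical penalty strength, not a value from the talk.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """Quadratic EWC penalty (sketch).

    theta      -- current parameter vector, being trained on the new task
    theta_star -- parameters learned on the previous task
    fisher     -- diagonal Fisher information estimated at theta_star
    lam        -- penalty strength (hypothetical value, tuned in practice)
    """
    return (lam / 2.0) * np.sum(fisher * (theta - theta_star) ** 2)

# Training on the new task then minimizes something like:
#   total_loss = new_task_loss(theta) + ewc_penalty(theta, theta_star, fisher)
# so parameters the old task relied on resist being overwritten.
```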

I also saw Bayesian techniques come up several times during the meta-learning symposium. Furthermore, Bayesian techniques hold the record (at least for this year) with four directly related workshops, and all around the building you could hear people saying "Bayesian" in reference to some technique or model. Finally, using Bayesian methods in deep learning has the potential to help with the issue of interpretability, or at least to provide more "nuanced" decisions. This brings us to our next trend.

Rigor, interpretability, stability/safety, and theory

This trend should come as no surprise. By now you have probably all seen Ali Rahimi's test-of-time talk and the discussion it generated (if you haven't, links to the presentations are below). It is important to note that these four subjects are similar but not exactly the same. I chose to group them together under the broad idea that they all relate to providing an explanation, whether technical or non-technical, theoretical or experimental, empirical or qualitative, for why a model behaves the way it does (or, alternatively, to guaranteeing the overall stability of a model). For instance, one could demonstrate rigor through experimental results rather than theory. Additionally, interpretability can mean different things to different people: what an ML researcher considers a rational explanation for a model's decision might not be convincing enough for a doctor to base a diagnosis on (or for a patient to accept). I will not weigh in too much on the ensuing debate over alchemy versus engineering, except to say that it definitely generated a lot of discussion and strong opinions on both sides (links below).

I will say that whether or not you think interpretability or 'rigor' is essential for research, it is important on a practical level for applications. How can we expect Congress to approve self-driving cars if we cannot explain the decisions they make and the reasons why? Or how can we expect a doctor to perform surgery just because a model thinks there is a high probability of someone developing cancer? Finally, the study of interpretability will greatly help with debugging neural networks. For me at least, there are many times when I write code and it runs fine, but then the network won't converge or gives me a totally unexpected result. With that said, there were many good presentations and interesting works at the interpretability workshop and symposium that I urge you to check out.

Geometric Deep Learning and the Graph CNN

A lot of data is naturally best represented by a graph-like structure: a social network, a route between multiple cities, or a chemical compound, for instance. However, traditional neural networks do not handle this type of structured data well. At NIPS 2017 the graph neural network, or GNN, was featured prominently. On the first day of NIPS, Michael Bronstein, Joan Bruna, Arthur Szlam, and Yann LeCun hosted a tutorial on geometric deep learning on graphs and manifolds. In the tutorial they explained the theory behind the GNN model and several of its applications. The GNN was also found in several NIPS papers and around many of the workshops, and the topic came up many times in informal discussions as well. Personally, I think being able to "maintain" graph or manifold structure "within" a neural network is a big step forward that could have applications in a variety of different fields.
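
To give a flavor of what these models look like, here is a minimal NumPy sketch of one widely used graph-convolution layer (the propagation rule of Kipf and Welling's GCN). It is just one formulation from this line of work, not necessarily the exact model covered in the tutorial, and the toy graph below is made up for illustration.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: aggregate neighbor features, then transform.

    A -- adjacency matrix of the graph, shape (n, n)
    H -- node features from the previous layer, shape (n, d_in)
    W -- learnable weight matrix, shape (d_in, d_out)
    """
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt   # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)     # linear transform + ReLU

# Toy graph: three nodes in a chain (0-1, 1-2).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.randn(3, 2)                      # 2 input features per node
W = np.random.randn(2, 4)                      # project to 4 output features
print(gcn_layer(A, H, W).shape)                # (3, 4)
```

Stacking a few such layers lets each node's representation depend on progressively larger neighborhoods, which is what lets the network "see" the graph structure.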

Generative Adversarial Networks

GANs were quite popular all around NIPS this year. Perhaps not quite as hot as last year, they could nonetheless be found throughout the main conference and in almost every workshop. In the main conference we had "Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis," "VEEGAN," "f-GANs," "Dualing GANs," and "Improved Training of Wasserstein GANs," to name a few. Despite not having a dedicated workshop, they still popped up all over the place on Friday and Saturday. For instance, the creativity workshop was almost entirely made up of GANs generating art, music, and speech. Several presenters even mentioned GANs in the ML4H and MedNIPS workshops. Apparently there are several applications in the medical field, such as augmenting training data, generating synthetic data (that is not subject to HIPAA compliance), and reconstructing or changing the modality of images (e.g., MRI to CT).
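
For readers who have not run into GANs before, the original formulation (Goodfellow et al., 2014) pits a generator G against a discriminator D in a minimax game:

```latex
\min_G \max_D \, V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\!\left[\log\left(1 - D(G(z))\right)\right]
```

D learns to tell real samples from generated ones while G learns to fool it; the variants listed above largely differ in which divergence they optimize or how they stabilize this game (Wasserstein GANs, for instance, replace the log-loss with an earth-mover distance).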

Reinforcement Learning (particularly deep reinforcement learning)

Reinforcement learning continued to be a hot subject at NIPS. The main conference had a whole track dedicated to RL, a tutorial on using RL with people, and a workshop on deep RL. But what was even more significant was the number of times RL came up in other workshops and discussions. For instance, several papers at the ML4H workshop discussed using RL for guiding sepsis treatment and for malaria likelihood prediction.

This conference made it clear (at least to me) that RL is no longer something used solely on computer and board games with well-defined rules; researchers now actively apply RL to real-world problems, from sepsis treatment to chatbots to robotics.

Meta-learning

Meta-learning was also a fairly large topic at NIPS. Meta-learning is essentially the art of learning to learn, or of optimizing the optimization algorithm itself. If this seems confusing, you are not alone: during the panel discussion there was a long back and forth about what exactly meta-learning is. In the context of this conference, at least, meta-learning seemed to consist mainly of using algorithms to find the optimal hyper-parameters of a model and/or the optimal structure of a network.

The basic idea is that a lot of time (and often money) is wasted manually testing various hyper-parameter configurations and different network structures, and many times the truly optimal structure might never be found. What if we had an algorithm that could learn to optimize the network to provide the best result? Several presenters explored this question in the symposium on Thursday evening, and the discussion continued in the workshop on Friday. Many people also brought up meta-learning in conversation by asking presenters whether they automatically tuned their hyper-parameters.
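
As a toy point of reference, here is the naive baseline that meta-learning methods aim to beat: blind random search over hyper-parameters, in plain Python. `train_and_score` is a hypothetical callable (train a model with the given configuration, return a validation score), and the search ranges are made up for the example; the approaches discussed at NIPS essentially replace this loop with a learner that proposes increasingly promising configurations.

```python
import random

def random_search(train_and_score, n_trials=20, seed=0):
    """Sample random hyper-parameter configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-5, -1),       # log-uniform learning rate
            "layers": rng.randint(1, 4),           # network depth
            "hidden": rng.choice([64, 128, 256]),  # width of each layer
        }
        score = train_and_score(cfg)               # hypothetical: train + validate
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```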

II. Application Areas

Long Beach was bright and sunny for pretty much the entire conference.

Healthcare

Applying machine learning to healthcare was a popular topic at this year's conference. There were two dedicated workshops, Machine Learning for Health (ML4H) and Medical Imaging meets NIPS (both of which I will summarize in detail in coming articles), as well as several related workshops that featured a large number of healthcare-related submissions (e.g., computational biology). ML4H focused on a variety of areas of healthcare, including using ML for drug discovery, hospital operations (forecasting length of stay, monitoring hand hygiene, etc.), and genetics research. Medical Imaging meets NIPS focused exclusively on medical imaging, primarily exploring how ML techniques could help with medical image segmentation, reconstruction, and classification.

Healthcare popped up at the main conference as well. On Tuesday Brendan Frey of Deep Genomics gave a keynote on how AI could accelerate research on genetic disease and even help cure it. Additionally, there was an abundance of medical AI companies around, many with interactive demos in the convention center.

Robotics

Robotics was another area that seemed to get a fair amount of attention in the main conference and workshops. On Tuesday there was a demo of "Deep Robotic Learning using Visual Imagination and Meta-Learning" in the Pacific Ballroom. Then on Wednesday Pieter Abbeel gave a talk on deep learning for robotics. Likewise, there was a workshop on Friday that discussed "Acting and Interacting in the Real World."

Sciences, Energy, and Engineering

There also seemed to be a lot of content about applying machine learning techniques in other sciences (or, vice versa, using principles from other sciences to augment machine learning) and to solve energy problems. In the main conference there was a keynote related to energy and emissions: on Monday, after the opening remarks, John Platt talked about "Energy Strategies to Decrease CO2 Emissions." He described how ML could be applied to forecasting utilization and to finding combinations of energy sources in order to get emissions down to a set fraction of current levels. Platt went on to discuss how ML could be used to make progress in nuclear fusion research.

On the workshop side there were two interesting offerings: Deep Learning for the Physical Sciences and Deep Learning for Molecules and Materials. The ML for computational biology workshop also touched on these ideas (although, as one might expect, it had quite a bit of overlap with the healthcare-related workshops).

Conclusion

This is just a glimpse of some of the really exciting trends and ideas presented at NIPS this year. I only barely touched on many of these topics, so I encourage you to explore them in greater detail. Additionally, there were many other smaller topics that were just as interesting, if not more so. I will be writing several follow-up articles on ML4H, MedNIPS, interpretability, and (maybe) meta-learning, so stay tuned for those. As always, if you think I missed something or have other comments, please leave them below.

Links (as promised)

Bayesian Techniques

Keynote by Yee Whye Teh

Rigor and Interpretability

Ali Rahimi's test-of-time talk at NIPS

Facebook post in response by Yann LeCun and the resulting discussion in the comments.

Geometric Deep Learning

Reinforcement Learning

Tutorial on RL with people

Keynote at the Deep Reinforcement Learning Symposium

Main conference RL spotlight

GANs

As stated before, there was not a lot of "pure GAN" content, but GANs popped up in many workshops and spotlights.

This session has quite a lot about GANs.

Meta-learning

Workshop

Healthcare

https://ml4health.github.io

Energy and science

Robotics