This year NIPS had two healthcare/medicine related workshops: Machine Learning for Healthcare (ML4H) and Medical Imaging meets NIPS. Naturally, some people might wonder about the difference between the two. ML4H focused on the application of machine learning to many different areas of healthcare, including drug discovery, hospital operations, and precision medicine. This involved a variety of techniques, including computer vision, NLP, and time series forecasting. MedNIPs, as I described last month, focused on the specific application of machine learning to medical imaging. In other words, ML4H was more general. There was some overlap, but since I covered imaging in detail in my last article, I’m going to focus on all the other areas that ML4H covered.

Interestingly, despite the different areas of machine learning and medicine that presenters at both workshops dealt with, there seemed to be one universal underlying theme: the problem of limited and/or messy data. Likewise, techniques presented at the workshops, like multitask learning, generating synthetic data with GANs, and incorporating human feedback into training, all aimed at addressing this fundamental issue.

[Unfortunately, unlike MedNIPs, none of the talks (to my knowledge) for ML4H are online and only some of the abstracts are. Additionally, for a significant part of Friday I was running around trying to get my own poster printed. Therefore, I may be missing some important parts. So if one of the presenters or other attendees wants to add something, feel free to leave a comment and I’ll make sure to include it.]

Hospital Operations

Machine learning has the potential to improve hospital operations and the quality of care that patients receive, and several of the invited speakers and spotlight presenters discussed this issue. Fei-Fei Li gave a keynote on current research at Stanford to reduce hospital-acquired infections and improve care. She discussed how the majority of research focuses exclusively on finding new drugs or better interpreting radiology reports. However, a large part of healthcare happens not in the lab but in the interactions between healthcare professionals and their patients. She argued that by studying and augmenting these day-to-day interactions with machine learning we can potentially save many more lives. These areas include reducing hospital-acquired infections, monitoring ICU activities, and forecasting staffing levels.

In an extension to their previous article on identifying breaches of hand hygiene protocol, which they presented in Boston last summer, Michelle Guo et al. presented a poster on using “Viewpoint Invariant Convolutional Neural Networks” to automatically detect potentially risky hand hygiene scenarios in hospitals. They used depth images and fed them to convolutional networks to classify the type of action taken in the photo. Here they were interested in categorizing the photo as either touching a patient or a sterile area in the environment.

Classification of surgical instruments from page 2 of Jin et al.

The best paper of the workshop, Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks, by Amy Jin et al. (also from Fei-Fei Li’s research group), described how to use region-based CNNs, like F-RCNN, to gauge the skills of surgeons. They outlined an approach that tracks and analyzes the use of tools in surgical videos. This approach could provide valuable feedback to surgeons on how well they operate and which areas they need to improve. Specifically, they tracked the switching of tools and the length of the surgery in order to assess the surgeon’s skills. This type of monitoring could potentially reduce the number of surgical mistakes.
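To make the two signals mentioned above concrete (tool switches and surgery length), here is a minimal sketch. It assumes a detector has already produced one tool label per video frame; the function name, the label strings, and the frame rate are hypothetical and are not taken from the paper.

```python
from itertools import groupby


def summarize_tool_usage(frame_labels, fps):
    """Count tool switches and total duration from per-frame tool labels.

    frame_labels: one (hypothetical) detected tool name per video frame.
    fps: frames per second of the video.
    """
    # Collapse runs of identical consecutive labels into single segments.
    segments = [label for label, _ in groupby(frame_labels)]
    # A switch happens at every boundary between two segments.
    switches = max(len(segments) - 1, 0)
    duration_seconds = len(frame_labels) / fps
    return {"switches": switches, "duration_seconds": duration_seconds}


# Illustrative labels only; a real pipeline would get these from a detector.
labels = ["grasper", "grasper", "hook", "hook", "grasper"]
summary = summarize_tool_usage(labels, fps=1.0)
print(summary)
```

A real system would also have to deal with frames that contain several tools or none at all, which this sketch ignores.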

Classifying patient actions in an ICU Ward. Taken from Bianconi et al. poster.

Another spotlight, by Gabriel Bianconi et al. (also from Fei-Fei Li’s research group), described Vision Based Prediction of ICU Mobility Care Activities with RNNs. This work attempted to classify patient actions in the ICU ward of a hospital using HIPAA-compliant depth sensors. Interpreting these activities has multiple benefits; for example, it can be used to gauge the hospital’s compliance with protocols as well as the relationship between following the protocol and overall patient outcomes.

David Kale of the University of Southern California presented on creating a public benchmark on MIMIC-III data. His team trained a multitask LSTM to predict four correlated targets: overall mortality, length of stay, disease phenotype, and physiological decline. They published online both their models and the code necessary to generate the specific training and test sets that they used. Altogether this was an interesting application of multitask learning in healthcare, and it provides some valuable benchmarks to compare future results against.
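For readers new to multitask learning, the core idea is that one model is trained against several targets at once by combining their losses. The sketch below shows that combination for two of the targets named above. The loss choices, the task weights, and the function names are my own assumptions for illustration, not the benchmark's actual setup.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Log loss for a single binary label, clipped to avoid log(0)."""
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def squared_error(y_true, y_pred):
    """Squared error for a single real-valued target."""
    return (y_true - y_pred) ** 2


def multitask_loss(targets, predictions, weights):
    """Weighted sum of per-task losses for one patient stay.

    The task names and weights are hypothetical; a shared network would
    produce all predictions, and this total is what gets minimized.
    """
    per_task = {
        "mortality": binary_cross_entropy(
            targets["mortality"], predictions["mortality"]
        ),
        "length_of_stay": squared_error(
            targets["length_of_stay"], predictions["length_of_stay"]
        ),
    }
    total = sum(weights[name] * loss for name, loss in per_task.items())
    return total, per_task


total, per_task = multitask_loss(
    targets={"mortality": 1, "length_of_stay": 3.0},
    predictions={"mortality": 0.5, "length_of_stay": 5.0},
    weights={"mortality": 1.0, "length_of_stay": 0.5},
)
print(total, per_task)
```

Because the tasks are correlated, the shared layers can pick up structure from one target that helps the others, which is part of why this approach is attractive when labeled data is limited.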

ML supported treatment

Several papers described the application of reinforcement learning to the treatment of sepsis and to predicting whether someone has malaria, which I thought was interesting. Prior to this I had envisioned RL as a kind of niche topic that only worked on problems with well-defined rules, like Pac-Man or AlphaGo.

Figure from page 7 of Futoma et al.

“Learning to Treat Sepsis with Multi-Output Gaussian Process Deep Recurrent Q-Networks” by Futoma et al. described an application of reinforcement learning and Gaussian processes. Their approach has two major parts working in unison: a multi-output Gaussian process (MGP) and a recurrent Q-network. The MGP handles the patient’s physiological time series data, which is often messy because it is recorded at irregular intervals, while the recurrent Q-network learns the clinical policy. The specifics are somewhat complicated, but overall it is a very interesting synthesis of a number of different topics. Altogether the authors estimate that their approach could reduce overall patient mortality by 8.2% (from an overall baseline of 13.2%).
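If you have not seen Q-learning before, the sketch below shows the basic update rule in its simplest tabular form. This is a deliberately simplified stand-in: the paper uses a deep recurrent Q-network fed by a Gaussian process, not a lookup table, and the state names, action names, and hyperparameters here are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, terminal=False):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) pairs to estimated long-term reward.
    """
    # Value of the best action available in the next state (0 at episode end).
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical discretized states and treatment actions, for illustration only.
actions = ["no_treatment", "fluids"]
q = defaultdict(float)

# The final step of an episode ends with a positive reward (patient survives).
last = q_learning_update(q, "stable", "fluids", reward=1.0,
                         next_state=None, actions=actions, terminal=True)

# An earlier step now inherits part of that value through the next state.
earlier = q_learning_update(q, "low_bp", "fluids", reward=0.0,
                            next_state="stable", actions=actions)
print(last, earlier)
```

The second update illustrates how reward information propagates backward through a sequence of decisions, which is what lets the agent learn a policy from outcomes that only show up at the end of a hospital stay.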

A health policy oriented paper by Rajpurkar et al., titled “Malaria Likelihood Prediction By Effectively Surveying Households Using Deep Reinforcement Learning,” described training an RL agent to ask a series of questions to predict whether someone has malaria. Specifically, the authors write “the RL agent learns to determine which survey question to ask next and when to stop to make a prediction about their likelihood of malaria based on their responses hitherto” (abstract of Rajpurkar et al.).

Another paper that focused on treatment was “Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces.” Interestingly, the authors formulate exploring a clinical decision space as a K-armed bandit problem. However, the decision space is very large, which means that conventional methods can take a very long time to converge on the optimal arm. To overcome this, they develop an algorithm named CorrDuel that takes advantage of the correlations between arms in order to converge faster. They then applied it to finding the optimal electrical stimuli to help paraplegics regain control of their movements (through these spinal stimuli) and even stand.
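To see why a large decision space is a problem, here is a minimal sketch of a naive dueling bandit baseline that compares every pair of arms in every round. This is not CorrDuel and does not exploit correlations between arms; the preference matrix and all names are hypothetical. The point is that the number of duels per round grows quadratically with the number of arms.

```python
import itertools
import random


def run_duels(preference, rounds, rng):
    """Round-robin duels; returns the arm with the best empirical win rate.

    preference[i][j] is the (hypothetical) probability that arm i beats arm j.
    """
    n = len(preference)
    wins = [0] * n
    plays = [0] * n
    for _ in range(rounds):
        # Every pair is compared once per round: n * (n - 1) / 2 duels.
        for i, j in itertools.combinations(range(n), 2):
            i_wins = rng.random() < preference[i][j]
            wins[i if i_wins else j] += 1
            plays[i] += 1
            plays[j] += 1
    rates = [w / p for w, p in zip(wins, plays)]
    best = max(range(n), key=rates.__getitem__)
    return best, rates


# Illustrative matrix: arm 2 always beats arms 0 and 1, which are evenly matched.
preference = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [1.0, 1.0, 0.5],
]
best, rates = run_duels(preference, rounds=20, rng=random.Random(0))
print(best, rates)
```

With only three arms this is cheap, but with thousands of candidate stimuli the pairwise comparisons become infeasible, which is the gap that an algorithm sharing information across correlated arms is meant to close.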

Finally, “Hybrid Gradient Boosting Trees and Neural Networks for Forecasting Operating Room Data” by Chen et al. presented a method for predicting hypoxia.

Drug Discovery and Personalized Medicine

Several of the presenters addressed using machine learning for drug discovery. Atul Butte, for instance, discussed in his spotlight mining clinical trial data and feeding it to machine learning algorithms in order to discover new drugs. Additionally, he described data visualization techniques for plotting the most likely prognosis for a patient. (Butte’s slides are available on SlideShare and are well worth the read if you have the time.) Several other presenters detailed specific areas that personalized medicine could impact.

Jennifer Chayes presented on “Challenges and Opportunities for Machine Learning in Cancer Immunotherapy.” Chayes described the challenge of predicting the immunogenicity of neo-antigens and patient responses to specific types of immunotherapy. First, better predictions of immunogenicity make it easier to find neo-antigens that actually have anti-tumor properties. Second, by predicting specific patient responses to the treatment, the treatment could potentially be introduced earlier, when the immune system hasn’t been run down by chemo. However, like many other areas in medical research, this is limited by a small dataset. In addition to being small, the dataset is very high dimensional.

Another spotlight, “Ask the Doctor — Improving Drug Sensitivity Predictions through Active Expert Knowledge Elicitation,” described how to improve models that predict how sensitive a cancer cell is to a drug by asking “an expert”. This technique is somewhat similar to active learning (which I talked about in my previous article). In this instance, the experts were specialists in treating blood cancers. Normally, expert elicitation algorithms send drug features to the expert at random, and the expert indicates whether each one is relevant or not. The major contribution of the authors is an algorithm that reduces the number of queries sent to the experts by only sending them pairs where their feedback provides the most utility (rather than random pairs), and that includes directional feedback in addition to the relevant vs non-relevant feedback.
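As a rough illustration of choosing queries by usefulness rather than at random, the sketch below ranks candidate pairs by how uncertain the current model is about them and only sends the top few to the expert. Note that this swaps in plain uncertainty sampling as a stand-in; the authors' actual utility criterion is more involved, and the pair names and probabilities here are made up.

```python
def select_queries(relevance_prob, budget):
    """Pick the pairs the model is least sure about, up to a query budget.

    relevance_prob maps a (drug, feature) pair to the model's current
    estimated probability that the feature is relevant for that drug.
    """
    # A probability near 0.5 means the model is most uncertain, so expert
    # feedback on that pair is assumed to be the most informative.
    ranked = sorted(relevance_prob,
                    key=lambda pair: abs(relevance_prob[pair] - 0.5))
    return ranked[:budget]


# Hypothetical model beliefs, for illustration only.
beliefs = {
    ("drug_a", "feature_1"): 0.95,
    ("drug_a", "feature_2"): 0.52,
    ("drug_b", "feature_1"): 0.10,
    ("drug_b", "feature_3"): 0.40,
}
queries = select_queries(beliefs, budget=2)
print(queries)
```

The expert's time is the scarce resource here, so even a simple ranking like this spends the query budget on the pairs the model cannot resolve on its own instead of on ones it is already confident about.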

Synthetic Data

Several papers discussed using GANs to generate synthetic data. This is a useful technique because HIPAA compliance makes acquiring real medical data difficult. The paper “Synthetic Medical Images from Dual Generative Adversarial Networks,” by JT Guibas described how a pair of GANs could be used to generate realistic looking retinal images. The researchers introduced an online repository called SynthMed for synthetically generated medical images.

Diagram from Synthetic Medical Images from Dual Generative Adversarial Networks

Another paper, Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs (which was also a spotlight presentation), described how to generate synthetic ICU time series data with R-GANs. The authors then ran several interesting experiments to validate their results; for example, they looked at the distribution of reconstruction errors, qualitatively compared the generated samples, and examined interpolation. Altogether the paper presented a really good use case of GANs that could help resolve many of the data access issues plaguing researchers. If you are into GANs, this is definitely a work to check out.

Other

I had trouble placing several other papers in the above categories, but found them interesting nonetheless. In brief:

“Detection and Characterization of Illegal Marketing and Promotion of Prescription Drugs on Twitter” by Kalyanam et al.

“Mondrian Processes for Flow Cytometry Analysis” by Ji et al.

Finally, there were 97 total papers presented at the workshop, so I’m bound to have missed some good ones. Therefore, I encourage you to check out the workshop website. Also, if anyone has any info on the other keynotes, please let me know and I will add it.

Announcements

I will be giving a talk on Wednesday, February 7th on working with small datasets, with a focus on healthcare applications. The talk will take place in Orono, ME from 7–9pm EST and will be live streamed on Zoom at the following link: http://zoom.us/j/794523352 .

The CurativeAI Slack channel now has 40+ members and continues to grow. So come join the discussion if you haven’t already.