Data Challenge

The Data Challenge is NOW OPEN! https://www.synapse.org/ACIC2018Challenge

This is the third time a Data Challenge has taken place as part of the ACIC conference. As in previous years, the challenge focuses on computational methods for inferring causal effects from real-world healthcare-related data. This year the data challenge is organized by Prof. Ashley Naimi (University of Pittsburgh) and supported by the Machine Learning for Healthcare and Life Sciences group at IBM Research - Haifa.

This year's challenge extends the previous challenges in two main respects (as explained in more detail below):

Evaluating performance as a function of the size of the dataset

Using censored outcomes (i.e., missing outcome values for some of the samples)

Scaling

As datasets get larger, it becomes imperative to understand how run-time and memory requirements of the applied causal methods scale. Likewise, it is also important to understand which methods benefit from additional data and which do not.
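One way to study scaling is to run the same estimator on datasets of increasing size and record the wall-clock time and peak memory of each run. The sketch below is only an illustration and is not part of the challenge tooling: the toy difference-in-means estimator, the synthetic data generator, and the names `make_dataset` and `profile` are all assumptions made for this example.

```python
import random
import time
import tracemalloc


def make_dataset(n, effect=2.0, seed=0):
    """Hypothetical synthetic data: (treatment, outcome) pairs with a known effect."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        t = i % 2  # alternate assignment so both arms are non-empty
        y = effect * t + rng.gauss(0.0, 1.0)
        rows.append((t, y))
    return rows


def difference_in_means(rows):
    """Toy estimator: mean outcome of treated minus mean outcome of controls."""
    treated = [y for t, y in rows if t == 1]
    control = [y for t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def profile(estimator, sizes):
    """Run the estimator once per dataset size; record run time and peak memory."""
    results = []
    for n in sizes:
        rows = make_dataset(n)
        tracemalloc.start()  # traces only allocations made by the estimator
        start = time.perf_counter()
        estimate = estimator(rows)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append(
            {"n": n, "estimate": estimate, "seconds": elapsed, "peak_bytes": peak}
        )
    return results


if __name__ == "__main__":
    for r in profile(difference_in_means, [1_000, 10_000, 100_000]):
        print(r["n"], round(r["estimate"], 3), r["seconds"], r["peak_bytes"])
```

Plotting `seconds` and `peak_bytes` against `n` shows how cost grows with the data, and plotting the error of `estimate` against `n` shows whether a method benefits from additional data. A real comparison would repeat each run several times and average the results, since single timings are noisy.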

Censoring

Censoring is frequently encountered in real-world data. For example, some diseases may be treated by a highly effective drug, making follow-up data unlikely; the individuals for whom follow-up data exist are then mostly those for whom the drug was not effective. Censoring in such a case is highly informative and must be properly accounted for to estimate the effect correctly.
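The effect of informative censoring can be illustrated with a small simulation. The sketch below uses a simplified stand-in for the drug example, not the challenge's data-generating process: it assumes that censoring depends only on the treatment and on an observed binary covariate `x`, so that inverse probability weighting (one common correction, chosen here for illustration) can recover the effect. All names and numeric values are hypothetical.

```python
import random
from collections import defaultdict

TRUE_EFFECT = 1.0  # assumed treatment effect in this toy simulation

# Assumed probability that the outcome is observed, by (treatment, covariate).
# Treated individuals with x == 1 are censored most often.
P_OBSERVED = {(0, 0): 0.9, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.3}


def simulate(n, seed=0):
    """Return (treatment, covariate, outcome) rows; outcome is None when censored."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        t = int(rng.random() < 0.5)  # randomized treatment
        y = TRUE_EFFECT * t + 2.0 * x + rng.gauss(0.0, 1.0)
        observed = rng.random() < P_OBSERVED[(t, x)]
        rows.append((t, x, y if observed else None))
    return rows


def complete_case_effect(rows):
    """Naive estimate: drop censored rows and compare arm means."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for t, _, y in rows:
        if y is not None:
            sums[t] += y
            counts[t] += 1
    return sums[1] / counts[1] - sums[0] / counts[0]


def ipw_effect(rows):
    """Weight each observed row by 1 / (estimated probability of being observed)."""
    total = defaultdict(int)
    seen = defaultdict(int)
    for t, x, y in rows:
        total[(t, x)] += 1
        seen[(t, x)] += y is not None
    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for t, x, y in rows:
        if y is not None:
            w = total[(t, x)] / seen[(t, x)]  # inverse of the observed fraction
            weighted_sum[t] += w * y
            weight_total[t] += w
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


if __name__ == "__main__":
    data = simulate(20_000, seed=1)
    print("complete-case:", complete_case_effect(data))
    print("IPW:", ipw_effect(data))
```

Under these assumptions the complete-case estimate is pulled away from `TRUE_EFFECT`, because treated individuals with high outcomes are under-represented among the observed rows, while the weighted estimate lands close to it. If censoring depended on the unobserved outcome itself, weighting on observed variables alone would not be enough.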