1. Thomson, R., Luettel, D., Healey, F. & Scobie, S. Safer Care for the Acutely Ill Patient: Learning from Serious Incidents (National Patient Safety Agency, 2007).

2. Henry, K. E., Hager, D. N., Pronovost, P. J. & Saria, S. A targeted real-time early warning score (TREWscore) for septic shock. Sci. Transl. Med. 7, 299ra122 (2015).

3. Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. 1, 18 (2018).

4. Koyner, J. L., Adhikari, R., Edelson, D. P. & Churpek, M. M. Development of a multicenter ward-based AKI prediction model. Clin. J. Am. Soc. Nephrol. 11, 1935–1943 (2016).

5. Cheng, P., Waitman, L. R., Hu, Y. & Liu, M. Predicting inpatient acute kidney injury over different time horizons: how early and accurate? In AMIA Annual Symposium Proceedings 565 (American Medical Informatics Association, 2017).

6. Koyner, J. L., Carey, K. A., Edelson, D. P. & Churpek, M. M. The development of a machine learning inpatient acute kidney injury prediction model. Crit. Care Med. 46, 1070–1077 (2018).

7. Komorowski, M., Celi, L. A., Badawi, O., Gordon, A. C. & Faisal, A. A. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nat. Med. 24, 1716–1720 (2018).

8. Avati, A. et al. Improving palliative care with deep learning. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 311–316 (2017).

9. Lim, B. & van der Schaar, M. Disease-Atlas: navigating disease trajectories with deep learning. Proc. Mach. Learn. Res. 85, 137–160 (2018).

10. Futoma, J., Hariharan, S. & Heller, K. A. Learning to detect sepsis with a multitask Gaussian process RNN classifier. In Proc. International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1174–1182 (2017).

11. Miotto, R., Li, L., Kidd, B. A. & Dudley, J. T. Deep Patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci. Rep. 6, 26094 (2016).

12. Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzel, R. Learning to diagnose with LSTM recurrent neural networks. Preprint at https://arxiv.org/abs/1511.03677 (2016).

13. Cheng, Y., Wang, F., Zhang, P. & Hu, J. Risk prediction with electronic health records: a deep learning approach. In Proc. SIAM International Conference on Data Mining (eds Venkatasubramanian, S. C. & Meira, W.) 432–440 (2016).

14. Soleimani, H., Subbaswamy, A. & Saria, S. Treatment-response models for counterfactual reasoning with continuous-time, continuous-valued interventions. In Proc. 33rd Conference on Uncertainty in Artificial Intelligence (AUAI Press, 2017).

15. Alaa, A. M., Yoon, J., Hu, S. & van der Schaar, M. Personalized risk scoring for critical care prognosis using mixtures of Gaussian process experts. IEEE Trans. Biomed. Eng. 65, 207–218 (2018).

16. Perotte, A., Ranganath, R., Hirsch, J. S., Blei, D. & Elhadad, N. Risk prediction for chronic kidney disease progression using heterogeneous electronic health record data and time series analysis. J. Am. Med. Inform. Assoc. 22, 872–880 (2015).

17. Bihorac, A. et al. MySurgeryRisk: development and validation of a machine-learning risk algorithm for major complications and death after surgery. Ann. Surg. 269, 652–662 (2019).

18. Khwaja, A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin. Pract. 120, c179–c184 (2012).

19. Stenhouse, C., Coates, S., Tivey, M., Allsop, P. & Parker, T. Prospective evaluation of a modified early warning score to aid earlier detection of patients developing critical illness on a general surgical ward. Br. J. Anaesth. 84, 663P (2000).

20. Alge, J. L. & Arthur, J. M. Biomarkers of AKI: a review of mechanistic relevance and potential therapeutic implications. Clin. J. Am. Soc. Nephrol. 10, 147–155 (2015).

21. Wang, H. E., Muntner, P., Chertow, G. M. & Warnock, D. G. Acute kidney injury and mortality in hospitalized patients. Am. J. Nephrol. 35, 349–355 (2012).

22. MacLeod, A. NCEPOD report on acute kidney injury—must do better. Lancet 374, 1405–1406 (2009).

23. Lachance, P. et al. Association between e-alert implementation for detection of acute kidney injury and outcomes: a systematic review. Nephrol. Dial. Transplant. 32, 265–272 (2017).

24. Johnson, A. E. W. et al. Machine learning and decision support in critical care. Proc. IEEE Inst. Electr. Electron. Eng. 104, 444–466 (2016).

25. Mohamadlou, H. et al. Prediction of acute kidney injury with a machine learning algorithm using electronic health record data. Can. J. Kidney Health Dis. 5, 1–9 (2018).

26. Pan, Z. et al. A self-correcting deep learning approach to predict acute conditions in critical care. Preprint at https://arxiv.org/abs/1901.04364 (2019).

27. Park, S. et al. Impact of electronic acute kidney injury (AKI) alerts with automated nephrologist consultation on detection and severity of AKI: a quality improvement study. Am. J. Kidney Dis. 71, 9–19 (2018).

28. Chen, I., Johansson, F. D. & Sontag, D. Why is my classifier discriminatory? Preprint at https://arxiv.org/abs/1805.12002 (2018).

29. Schulam, P. & Saria, S. Reliable decision support using counterfactual models. In Advances in Neural Information Processing Systems 30 (eds Guyon, I. et al.) 1697–1708 (2017).

30. Telenti, A., Steinhubl, S. R. & Topol, E. J. Rethinking the medical record. Lancet 391, 1013 (2018).

31. Department of Veterans Affairs. Veterans Health Administration: Providing Health Care for Veterans. https://www.va.gov/health/ (accessed 9 November 2018).

32. Razavian, N. & Sontag, D. Temporal convolutional neural networks for diagnosis from lab tests. In 4th International Conference on Learning Representations (2016).

33. Zadrozny, B. & Elkan, C. Transforming classifier scores into accurate multiclass probability estimates. In Proc. 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Zaïane, O. R. et al.) 694–699 (ACM, 2002).

34. Zilly, J. G., Srivastava, R. K., Koutník, J. & Schmidhuber, J. Recurrent highway networks. In Proc. International Conference on Machine Learning (vol. 70) (eds Precup, D. & Teh, Y. W.) 4189–4198 (2017).

35. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

36. Collins, J., Sohl-Dickstein, J. & Sussillo, D. Capacity and trainability in recurrent neural networks. In International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) https://openreview.net/forum?id=BydARw9ex (2017).

37. Bradbury, J., Merity, S., Xiong, C. & Socher, R. Quasi-recurrent neural networks. In International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) https://openreview.net/forum?id=H1zJ-v5xl (2017).

38. Lei, T. & Zhang, Y. Training RNNs as fast as CNNs. Preprint at https://arxiv.org/abs/1709.02755v1 (2017).

39. Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. Preprint at https://arxiv.org/abs/1412.3555 (2014).

40. Graves, A., Wayne, G. & Danihelka, I. Neural Turing machines. Preprint at https://arxiv.org/abs/1410.5401 (2014).

41. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D. & Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proc. International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 1842–1850 (2016).

42. Graves, A. et al. Hybrid computing using a neural network with dynamic external memory. Nature 538, 471–476 (2016).

43. Santoro, A. et al. Relational recurrent neural networks. In Advances in Neural Information Processing Systems 31 (eds Bengio, S. et al.) 7310–7321 (2018).

44. Caruana, R., Baluja, S. & Mitchell, T. in Advances in Neural Information Processing Systems (eds Mozer, M. et al.) 959–965 (1996).

45. Wiens, J., Guttag, J. & Horvitz, E. Patient risk stratification with time-varying parameters: a multitask learning approach. J. Mach. Learn. Res. 17, 1–23 (2016).

46. Ding, D. Y. et al. The effectiveness of multitask learning for phenotyping with electronic health records data. Preprint at https://arxiv.org/abs/1808.03331v1 (2018).

47. Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics (vol. 9) (eds Teh, Y. W. & Titterington, M.) 249–256 (2010).

48. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) https://dblp.org/rec/bib/journals/corr/KingmaB14 (2015).

49. Guo, C., Pleiss, G., Sun, Y. & Weinberger, K. Q. On calibration of modern neural networks. In Proc. International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1321–1330 (2017).

50. Platt, J. C. in Advances in Large-Margin Classifiers (eds Smola, A. et al.) 61–74 (MIT Press, 1999).

51. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weath. Rev. 78, 1–3 (1950).

52. Niculescu-Mizil, A. & Caruana, R. Predicting good probabilities with supervised learning. In Proc. International Conference on Machine Learning (eds De Raedt, L. & Wrobel, S.) 625–632 (ACM, 2005).

53. Saito, T. & Rehmsmeier, M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 10, e0118432 (2015).

54. Efron, B. & Tibshirani, R. J. An Introduction to the Bootstrap (CRC, 1994).

55. Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947).

56. Lakshminarayanan, B., Pritzel, A. & Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 6402–6413 (2017).

57. De Fauw, J. et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat. Med. 24, 1342–1350 (2018).