Version 1.0 of the DALEX package is scheduled for CRAN release for Feb 20. The package was refactored according to the principles of Explanatory Model Analysis (EMA), which brings many improvements and changes. Below I briefly summarise how the package helps to develop better and safer predictive models. To see the code snippets, scroll to the end.

The XAI/EMA pyramid

Explainability of predictive models is often linked with methods such as LIME, SHAP or PDP, which help to better understand local or global model behaviour. These techniques respond to specific needs for understanding models as a whole or understanding of predictions for specific instances.

In order not to get lost in the variety of XAI techniques, several approaches to cataloguing them have been proposed (local/global, model agnostic/specific, case-based/rule-based/profile-based and so on). We tried a few over the last two years. Finally, we put together the most popular techniques into the XAI pyramid presented below (a triangular version of the periodic table).

Figure 1. The XAI/EMA pyramid. Needs related to the exploration of predictive models are gathered into an extensible drill-down map. The left side covers needs related to a single instance, the right side those related to the model as a whole. Consecutive layers dig into more and more detailed questions about the model behaviour (local or global).

The model exploration for an individual instance starts with a single number — a prediction. This is the top level of the pyramid.

Next, we want to attribute this prediction to particular variables, to understand which of them are important and how strongly they influence this particular prediction. One can use methods such as SHAP, LIME, Break Down, or Break Down with interactions. This is the second level from the top of the pyramid.
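To make the idea of variable attribution more tangible, here is a minimal base R sketch in the spirit of Break Down. It is not the DALEX implementation. The mtcars data, the linear model and the fixed order of variables are assumptions chosen only for illustration.

```r
# Sketch of a Break Down style attribution (illustrative, not the DALEX code).
# Assumption: mtcars and a linear model stand in for real data and a real model.
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
instance <- mtcars["Mazda RX4", ]
vars <- c("wt", "hp", "qsec")          # arbitrary, fixed order of variables

background <- mtcars
baseline <- mean(predict(model, background))   # average model prediction
contributions <- numeric(0)
previous <- baseline
for (v in vars) {
  background[[v]] <- instance[[v]]     # fix variable v at the instance's value
  current <- mean(predict(model, background))
  contributions[v] <- current - previous       # change attributed to v
  previous <- current
}

# baseline plus all contributions adds up to the prediction for the instance
round(c(baseline = baseline, contributions, prediction = previous), 2)
```

For models with interactions the attributions depend on the order of variables, which is one reason why several attribution methods exist.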

Moving down, the next level is related to the sensitivity of the model to changes in the values of one or more variables. Ceteris Paribus profiles allow us to explore the conditional behaviour of the model. Going further, we can investigate how good the local fit of the model is. It may happen that the model is very good on average, but for the selected observation the local fit is poor and the errors/residuals are larger than average. The above pyramid can be further extended, e.g. by adding interactions of variable pairs.
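The Ceteris Paribus idea can be sketched in a few lines of base R: take one observation, vary a single variable over a grid and keep everything else fixed. The data, model and grid size below are assumptions made for illustration only.

```r
# Sketch of a Ceteris Paribus profile (illustrative, not the DALEX code).
# Assumption: mtcars and a linear model stand in for real data and a real model.
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
instance <- mtcars["Mazda RX4", ]

# grid of values for the variable of interest
hp_grid <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 20)

# copy the instance once per grid point and change only hp
what_if <- instance[rep(1, length(hp_grid)), ]
what_if$hp <- hp_grid

cp_profile <- data.frame(hp = hp_grid, prediction = predict(model, what_if))
head(cp_profile)
```

For a linear model the profile is a straight line. For models such as gradient boosting it can take any shape, which is what makes the profile informative.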

The exploration of the whole model starts with an assessment of its quality, either with measures such as F1, MSE or AUC, or with LIFT/ROC curves. Such information tells us how good the model is in general.
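As a small illustration of such measures, the sketch below computes MSE for a regression model and AUC for a classifier in base R. The models and the mtcars data are assumptions, and both measures are calculated on the training data only to keep the example short.

```r
# Sketch of two model-level performance measures (illustrative only).
# regression: mean squared error of a linear model
reg_model <- lm(mpg ~ wt + hp, data = mtcars)
mse <- mean((mtcars$mpg - predict(reg_model, mtcars))^2)

# classification: AUC via the rank (Mann-Whitney) formula
clf_model <- glm(am ~ wt + hp, data = mtcars, family = binomial)
scores <- predict(clf_model, mtcars, type = "response")
n_pos <- sum(mtcars$am == 1)
n_neg <- sum(mtcars$am == 0)
auc <- (sum(rank(scores)[mtcars$am == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

c(mse = mse, auc = auc)
```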

The next level helps to understand which variables are important and which ones make the model work or not. A common technique is permutation importance of variables.
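Permutation importance is simple enough to sketch in base R: permute one variable, measure how much the loss grows and repeat. The data, the model, the RMSE loss and the number of repetitions are assumptions for illustration.

```r
# Sketch of permutation importance (illustrative, not the DALEX code).
set.seed(1)
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
rmse <- function(y, y_hat) sqrt(mean((y - y_hat)^2))

baseline_loss <- rmse(mtcars$mpg, predict(model, mtcars))

importance <- sapply(c("wt", "hp", "qsec"), function(v) {
  losses <- replicate(50, {
    permuted <- mtcars
    permuted[[v]] <- sample(permuted[[v]])   # break the link between v and the target
    rmse(mtcars$mpg, predict(model, permuted))
  })
  mean(losses) - baseline_loss               # average increase in loss
})

sort(importance, decreasing = TRUE)
```

The larger the increase in loss after permutation, the more the model relies on that variable.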

Moving down, methods on the next level help us to understand what the response profile of the model looks like as a function of certain variables. Here you can use such techniques as Partial Dependence Profiles or Accumulated Local Dependence.
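A Partial Dependence profile can be sketched as an average of Ceteris Paribus profiles: for each grid value, set the variable to that value for all observations and average the predictions. Again, the data and the model are assumptions for illustration.

```r
# Sketch of a Partial Dependence profile (illustrative, not the DALEX code).
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
hp_grid <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 20)

pd_values <- sapply(hp_grid, function(value) {
  modified <- mtcars
  modified$hp <- value                  # same hp for every observation
  mean(predict(model, modified))        # average prediction at this hp
})

pd_profile <- data.frame(hp = hp_grid, average_prediction = pd_values)
head(pd_profile)
```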

Going further we have more and more detailed analysis related to the diagnosis of the errors/residuals.

DALEX v 1.0 architecture

The first version of the DALEX package was released in 2018. Since then the architecture has been constantly improved, and the final approach is implemented in version 1.0. Each function is responsible for a single part of the pyramid. If a given part can be calculated in several ways, the desired method can be specified with the type argument.
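A minimal sketch of this design, assuming the DALEX package is installed. The apartments data bundled with DALEX and the linear model are stand-ins chosen only for illustration; the B argument (number of random orderings for the Shapley estimate) is passed on to the underlying method.

```r
library("DALEX")

# Assumption: apartments / apartments_test shipped with DALEX and a linear
# model play the role of a real dataset and a real model.
apart_lm <- lm(m2.price ~ ., data = apartments)
X <- apartments_test[, names(apartments_test) != "m2.price"]

apart_exp <- explain(apart_lm,
                     data = X,
                     y = apartments_test$m2.price,
                     verbose = FALSE)
new_apartment <- X[1, ]

# same function, same need (variable attribution), two different methods
bd  <- predict_parts(apart_exp, new_apartment, type = "break_down")
shp <- predict_parts(apart_exp, new_apartment, type = "shap", B = 10)
```

Both results can be passed to plot(), so switching the method does not change the rest of the analysis.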

Figure 2. Implementation of the XAI/EMA pyramid in the DALEX package. Dark violet names are names of functions that implement methods for particular needs. Light violet names stand for implemented methods that address selected needs.

Code snippets

Let’s see how to use the new DALEX, using the example of estimating football players’ value based on the FIFA 19 dataset. We will focus on the model exploration. If you are interested in the dataset itself or the modelling, find more reproducible examples here.

1. We start with a model. Let’s use the boosting model as implemented in the gbm library to predict log values of football players.

library("gbm")

# formula inferred from the target (LogValue) used in the explainer below
fifa_gbm <- gbm(LogValue ~ ., data = fifa19small,
                n.trees = 250,
                interaction.depth = 4,
                distribution = "gaussian")

2. Different models have different structures, so we need something with a uniform interface: a wrapper. In most cases it is enough to call the DALEX::explain() function with a single argument, the model. In this use case we trained the model on log-values, so the wrapper is slightly more complicated: model predictions need to be transformed from the log10 scale back to Euro.

library("DALEX")

fifa_exp <- explain(fifa_gbm,
                    data = fifa19small,
                    y = 10^fifa19small$LogValue,
                    # back-transform predictions from the log10 scale to Euro
                    predict_function = function(m, x) 10^predict(m, x, n.trees = 250))

3. We need an instance for which the prediction will be explained. In the FIFA example you can choose any player you like. Here, we will use Cristiano Ronaldo.

# assumes that rows of fifa19small are named by player
cr7 <- fifa19small["Cristiano Ronaldo", ]

Now we are ready for model exploration. Let’s start with the prediction for CR7.

> predict(fifa_exp, cr7)

## 47858094

Almost 48 million. But why? How do the variables contribute to this prediction?

library("magrittr")  # provides the %>% pipe used in the snippets below
predict_parts(fifa_exp, cr7) %>% plot()

The predict_parts() function uses the Break Down algorithm by default. Now, let’s try a different method for variable attribution, e.g. Shapley values. To achieve that, we only need to change the type argument.

predict_parts(fifa_exp, cr7, type = "shap") %>% plot(show_boxplots = FALSE)

In both cases, it looks like a high score in BallControl increases the predicted value, while high Age decreases it. Let’s look into this a bit deeper.

predict_profile(fifa_exp, cr7) %>% plot(variables = c("Age", "BallControl"))

Indeed, the close to perfect BallControl significantly increases the prediction. Also, we see that the GBM model is cruel for players older than 30 years.

Next question: how good is the model fit for this particular instance? Are residuals for players similar to CR7 larger or smaller than “average” residuals?

predict_diagnostics(fifa_exp, cr7) %>% plot()

The lime-coloured histogram shows the distribution of residuals for players similar to CR7, while the violet one shows the distribution for all players. We see that the distribution for similar players is much wider than the “usual” residuals, so we should be more suspicious of this prediction.
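The idea behind such a local diagnostic can be sketched in base R: find the observations most similar to the instance and compare their residuals with the residuals of all observations. The data, the model, the similarity measure (Euclidean distance on standardised predictors) and the number of neighbours are all assumptions made for this sketch, not a description of how predict_diagnostics() works internally.

```r
# Sketch of a local residual diagnostic (illustrative only).
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
residuals_all <- mtcars$mpg - predict(model, mtcars)

# Assumption: similarity is Euclidean distance on standardised predictors
predictors <- scale(mtcars[, c("wt", "hp", "qsec")])
distances <- sqrt(rowSums(sweep(predictors, 2, predictors["Mazda RX4", ])^2))
neighbours <- order(distances)[1:10]    # 10 most similar cars, the car itself included

residuals_local <- residuals_all[neighbours]

# compare the spread and the whole distribution of residuals
c(sd_local = sd(residuals_local), sd_all = sd(residuals_all))
suppressWarnings(ks.test(residuals_local, residuals_all))  # ties trigger a warning
```

A much wider local distribution would suggest that the model is less reliable in this neighbourhood than its average performance indicates.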

Above we showed how to explore the model around a single instance like CR7. DALEX can also be used for model-level exploration. The grammar is identical. The instance-level predict_parts() becomes model_parts().

model_parts(fifa_exp) %>% plot(show_boxplots = FALSE, max_vars = 10)

Globally, the most important variable is Reactions. BallControl and Age also rank high.

The instance-level predict_profile() becomes model_profile().

model_profile(fifa_exp) %>% plot(variables = c("Age", "BallControl"), geom = "profiles")

Find more football players in this interactive modelStudio app.

DALEX family

The functions available in the DALEX package are the backbone on which many tools for exploring machine learning models are built. Among the latest we have:

DALEXtra with connectors to popular ML frameworks like scikit-learn, H2O, mlr, caret, tensorflow, keras…

modelDown builds a static HTML website with a summary of dataset level model exploration,

modelStudio builds an interactive serverless D3js based website with a dashboard of complementary model views,

ingredients, iBreakDown, auditor with specific techniques for model exploration.

Sidenote: The problem with interpretability and explainability

Much attention is paid to the explainability and interpretability of machine learning models. But these concepts are poorly defined. They do not have a single, fixed and precise mathematical definition, and without such a definition it is easy to fall into wishful thinking. We’re confronted with problems like: what does it mean to be interpretable? Is the same thing equally understandable to different people? Can we explain our decisions ourselves?

Instead of struggling with ill-defined problems, in DALEX we decided to change the perspective. We are not looking for a single one-shot explanation that will reveal everything. We are focused on an iterative process of gradual exploration of a model. Each next step of the analysis increases our understanding of the model. In some cases, after a few steps we may realize: yes, I understand it now! In other cases, we may end up with increased doubts about the model. And that is also completely ok.

Appendix: Short history of DALEX

At useR! 2017 in Brussels I gave a talk “Show me your model”, which summarised the packages randomForestExplainer and factorMerger: tools devoted to model visualisation (in this case, random forest and linear models).

During the preparation of this talk, I reviewed a lot of tools for the analysis of complex predictive models. Each one allowed for very useful model analysis, but also each did it in a different way.

This topic started to draw me in more and more. A few months later, during a secondment at Nanyang Technological University in Singapore, I started to work on a tool to apply different techniques for model exploration in a consistent way. The first commit in the DALEX repository happened on 2018/02/18. At useR! 2018 in Brisbane I gave a workshop related to the DALEX package. It was a perfect opportunity to collect comments on the subject of interpretability and explainability of predictive models.

A group of great students and PhD students from Warsaw University of Technology and the University of Warsaw have joined the development of DALEX and other packages from the family. To name just a few with dozens of commits, in alphabetical order: @hbaniecki, @agosiewska, @AdamIzdebski, @Yue-Jiang, @kozaka93, @maksymiuks, @tmikolajczyk, @kasiapekala, @marta-tatarynowicz (the full list is much longer). And after two years of research, we have version 1.0 of DALEX and a dozen other packages in the DrWhy.AI family!

At the useR! 2020 European hub in Munich I am going to review the language for model explanations in the presentation “Talk with your model”.

See you there!