Now let's make some plots!

Let's start by plotting what I find the most interesting: the relationship between “employed” and “gov.support”.

I’m using an absolutely brilliant package called “sjPlot” for this:

library("sjPlot")

plot_model(full.fit, type = "pred", terms = c("gov.support"))

Giving us this plot:

Completely useless! What's going on here? Well, for our convenience 'plot_model' spits out the following error:

This is exactly why we needed to transform the data! Let’s try again:

plot_model(full.fit, type = "pred", terms = c("gov.support [exp]"))

This is pretty damn close to useless as well. What's going on now?

Let's take a look at our model and how it tries to capture the relationships in the data. There are a few different ways to do this, but let's begin by looking at the coefficients of our full model:

What's the first thing you notice when looking at this? For me it's the factor variable 'zero.young.children' having a coefficient of almost 13! This isn't necessarily a problem, though, but let's take a look at the confidence intervals of the coefficients:

confint(full.fit)

Right, this looks like trouble! Take a look at the span of 'zero.young.children': it means we'll get huge confidence intervals in our marginal-effect plots, and it's most likely why our earlier plot had such a huge confidence interval.

sjPlot can plot these as well:

plot_model(full.fit, transform = NULL, show.values = TRUE, axis.labels = "", value.offset = .4)

This makes it really easy to see that the factor variable 'zero.young.children' is quite problematic with regard to making any plots with real confidence.

Let's take a little detour and create a simpler model without any variable interactions. It turns out this is what we get:

simple.fit <- glm(employed ~ foreigner
                  + age
                  + zero.young.children
                  + zero.children
                  + log(gov.support)
                  + I(age^2),
                  family = binomial, data = dat)

summary(simple.fit)

A small increase in AIC, which isn't a good thing, but it's not too much, and we gain a lot of simplicity by using this model instead! (If we measure performance with BIC instead, we actually find that this model is better, since BIC penalizes complex models harder: 1047 vs 1068.)
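If you want to check this comparison yourself, a minimal sketch (assuming the `full.fit` and `simple.fit` objects fitted above) is:

```r
# Compare the two models on both criteria; lower is better.
# AIC slightly favors full.fit, while BIC, which penalizes the
# extra interaction terms harder, favors simple.fit.
AIC(full.fit, simple.fit)
BIC(full.fit, simple.fit)
```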

Let's look at the new coefficients:

plot_model(simple.fit, transform = NULL, show.values = TRUE, axis.labels = "", value.offset = .4)

This looks a lot better! Let's try making our plot once more using this model instead!

plot_model(simple.fit, type = "pred", terms = c("gov.support [exp]"))

Nice, something useful! So obviously the probability of being employed is lower the more government support you're entitled to. This makes quite a bit of intuitive sense! How about the difference between foreigners and non-foreigners? Well, plot_model makes this really easy as well; just add it as a term:

plot_model(simple.fit, type = "pred", terms = c("gov.support [exp]", "foreigner"))

This is a breeze! So this model suggests that foreigners are more likely to be employed than non-foreigners given that the other variables are identical.

How about age? This requires a slightly different approach, because we don't really assume a strictly positive or negative relationship with age, which is why the power transformation I(age²) makes sense. That is, we don't expect a 10-year-old to be employed, nor a 70-year-old, but we do expect a 30-year-old to be employed.

Sadly, sjPlot doesn't take too kindly to these kinds of power transformations (or I'm just an imbecile who can't work it out), so I'm using the package 'jtools' instead:

effect_plot(simple.fit, pred = age, data = dat, interval = TRUE)

So it seems employment “tops out” around the mid-to-late 30s and then tapers off, which also makes a lot of intuitive sense!

Now what about having children?:

plot_model(simple.fit, type = "pred", terms = c("gov.support [exp]", "zero.children"))

Surprise surprise! Not having any children makes you more likely to be employed and on the market!

Now, this probably seems a bit backwards, but let's try to compare our plots to the coefficients of our model and make sense of them! (I'm a visual learner, so this makes sense to me, damn it!):

The categorical variables foreigner, zero.children and zero.young.children are easily interpreted: being a non-foreigner (foreignerno) lowers your log-odds of being employed by 1.12, not having any young kids improves your log-odds by 1.31, and not having any kids at all improves them by 1.06. You might wonder, “why doesn't zero.children improve the odds more than zero.young.children?” Well, you have to consider that whenever zero.children is TRUE, so is zero.young.children, basically giving an “accumulated improvement” to the log-odds.
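If log-odds feel too abstract, exponentiating the coefficients above turns them into odds ratios, which are a bit friendlier to interpret:

```r
# Odds ratios from the (rounded) log-odds coefficients above.
exp(-1.12)  # foreignerno: ~0.33, i.e. roughly a third of the odds
exp(1.31)   # zero.young.children: ~3.71 times the odds
exp(1.06)   # zero.children: ~2.89 times the odds
```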

Higher age improves your log-odds of being employed by 0.37 per year, BUT the log-odds also decline by age² * -0.0055. What does this mean? Well, this is simply a concave second-degree polynomial, somewhat similar to our plot of age: we see an initial rise, followed by a maximum, and then a decline.
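We can even recover where that maximum lies: a quadratic b₁·age + b₂·age² peaks at age = -b₁ / (2·b₂). A back-of-the-envelope sketch using the rounded coefficients above:

```r
b.age  <- 0.37     # coefficient on age (rounded)
b.age2 <- -0.0055  # coefficient on I(age^2) (rounded)

# Vertex of the parabola b.age * age + b.age2 * age^2:
peak.age <- -b.age / (2 * b.age2)
peak.age  # ~33.6, matching the mid-to-late-30s peak in the effect_plot
```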

That's it! We've successfully replicated the values of our coefficients in a visual and intuitive manner. Job well done!