This page shows how you can plot statistical models and visualise the uncertainty inherent to them. To reproduce the analyses and plots, you’ll need R. In addition you’ll need a couple of add-on packages.

To install a package, run install.packages(), e.g.,:

install.packages("lme4")

# Load add-on packages
library(tidyverse) # for plotting and working with data
library(broom)     # optional: for tidying model outputs
library(lme4)      # for fitting frequentist mixed-effects models
library(brms)      # for fitting Bayesian models

The following command just changes some aesthetic aspects of the plots:

# optional
theme_set(theme_classic())

1 Why plot models, and why visualise uncertainty?

Graphs are easier to grok than coefficient tables. Some numeric model outputs are uninterpretable without visualisation (e.g., nonlinear terms in generalised additive models). Others can, in principle, be interpreted, but are difficult to do so and liable to misinterpretation (e.g., interaction terms).
Visualising uncertainty could serve as an antidote to uncertainty laundring (i.e., “\(p < 0.05\), hence this is how the world works”): A range of patterns that are compatible with data and model is shown rather than just one estimate.

Most of what I’ll illustrate below can also, and more quickly, be accomplished using the effects package (Fox 2003). But it’s good to know how to draw these plots yourself so that you can tweak them as needed and so that you know what these plots do and don’t show.

2 The principle: An example with simple linear regression

We’ll start with an example in which there is just one, linear predictor of a continuous outcome. The data stem from a study by DeKeyser et al. (2010) and were reanalysed by Vanhove (2013). The dataset can be downloaded from https://janhove.github.io/visualise_uncertainty/dekeyser2010.csv.

Each row in the dataset contains the data for a different participant for two variables:

AOA contains the age at which the participant started learning Hebrew as a second language.
GJT contains their score in a 204-item grammaticality judgement task.

The goal of the analysis was to characterise the relationship between these two variables.

# Read in the dataset
dekeyser <- read_csv("https://janhove.github.io/visualise_uncertainty/dekeyser2010.csv")

## Parsed with column specification:
## cols(
##   AOA = col_double(),
##   GJT = col_double()
## )

# Draw a scatterplot
ggplot(data = dekeyser,
       aes(x = AOA,
           y = GJT)) +
  geom_point(shape = 1) +
  xlab("Age of L2 acquisition") +
  ylab("L2 grammar test score")

Using geom_smooth(), we can add a trendline to this scatterplot. The regression line (in blue) connects the estimated means (according to a linear regression model, see below) at each age of acquisition. According to this model, if you collected data for a large group of L2 learners with an age of acquisition of 29, their mean grammar test performance would be about 155. The uncertainty about this mean is expressed in an uncertainty interval (here a 95% confidence interval). If you string together the estimated means at each age of acquisition, you obtain the blue line; if you string together the uncertainty intervals around these means, you obtain the grey ribbon (known as a ‘pointwise 95% confidence band’). Note, incidentally, that there were no participants with an age of acquisition of 29 in the dataset: the conditional mean of 155 is inferred on the basis of the other data and the analyst’s assumption that the relationship between AOA and GJT is linear.

ggplot(data = dekeyser,
       aes(x = AOA,
           y = GJT)) +
  geom_point(shape = 1) +
  # simple linear regression ("lm") with 95% confidence band
  geom_smooth(method = "lm") +
  xlab("Age of L2 acquisition") +
  ylab("L2 grammar test score") +
  geom_vline(xintercept = 29, linetype = "dotted") +
  geom_hline(yintercept = 155, linetype = "dotted")

## `geom_smooth()` using formula 'y ~ x'

In the following, we’re going to

reconstruct this regression line and confidence band ourselves without using a shortcut like geom_smooth();
apply the same techniques to other models, including multiple linear regression, logistic regression, and mixed-effect models.

The steps involved are mostly the same for all of these models, so let’s go through them in detail for the simple linear regression model.

2.1 Step 1: Fit the model

When you want to draw model-based graphs, you need a model! A linear regression model is obviously a model, but you can also rewrite a Student’s t-test as a model. We’ll do this below. See also https://lindeloev.github.io/tests-as-linear/.

The code below fits a simple linear regression model with AOA as the predictor and GJT as the outcome and then displays the estimated model coefficients.

dekeyser.lm <- lm(GJT ~ AOA, data = dekeyser)
tidy(dekeyser.lm)

# you could also use summary(dekeyser.lm)

2.2 Step 2: Compute the conditional means and confidence intervals

2.2.1 The analytic approach

For models fitted using the lm() function, you can compute conditional means and confidence intervals using the predict() function. The maths behind this function is explained in the tab Extra: More about the analytic approach, which you can feel free to skip at this stage. The predict() function takes as its arguments

the model object (dekeyser.lm),
a new data frame in which you specify the predictor value, or the combination of predictor values, for which you want to compute a modelled estimated mean and its confidence interval (here: AOA = 29),
which kind of interval, if any, you want to compute (here: a "confidence" interval; we’ll discuss another type of interval, viz., prediction intervals, in a bit),
the desired width of the interval (here: a 95% interval, hence 0.95).

The function then outputs the conditional mean (fit) and the lower (lwr) and upper (upr) bounds of the requested interval.

predict(dekeyser.lm,
        newdata = data.frame(AOA = 29),
        interval = "confidence",
        level = 0.95)

##      fit    lwr    upr
## 1 155.09 151.27 158.91

Now we just compute these conditional means and their confidence intervals along the entire AOA range observed in the dataset. First generate a new data frame (creatively called new_AOA) that contains this range of AOA values:

new_AOA <- data.frame(
  AOA = seq(from = 5, to = 71, by = 1)
)
new_AOA

Compute the conditional means and confidence intervals. I convert the resulting matrix to a “tibble” (or data frame) and add a column containing the AOA values for which they were computed:

ci_95 <- predict(dekeyser.lm,
                 newdata = new_AOA,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_AOA, .)
ci_95

2.2.2 Extra: More about the analytic approach

Computing a conditional mean is pretty easy. You take the regression equation and plug in the estimated coefficients: \(\widehat{\alpha}\) is the estimated intercept (here: 190.41) and \(\widehat{\beta}\) is the estimated slope for the AOA predictor (here: -1.22).

\[\widehat{y}_i = \widehat{\alpha} + \widehat{\beta} x_i = 190.41 + (-1.22)x_i\]

Then, plug in the predictor value (here: an age of acquisition of 29) for \(x_i\):

\[\widehat{\textrm{GJT}}_{\textrm{AOA}=29} = 190.41 + (-1.22) \times 29 \approx 155\]

If instead you’re interested in the conditional mean for an AOA of 14, use this value for \(x_i\):

\[\widehat{\textrm{GJT}}_{\textrm{AOA}=14} = 190.41 + (-1.22) \times 14 \approx 173\]

# Check with predict()
predict(dekeyser.lm,
        newdata = data.frame(AOA = 14),
        interval = "confidence",
        level = 0.95)

##      fit    lwr    upr
## 1 173.36 167.96 178.75

For the confidence interval, you need the standard error for the conditional mean. We can obtain this using predict() as well; we’ll see in a minute how it’s computed.

predict(dekeyser.lm,
        newdata = data.frame(AOA = 29),
        se.fit = TRUE)

## $fit
##      1 
## 155.09 
## 
## $se.fit
## [1] 1.9172
## 
## $df
## [1] 74
## 
## $residual.scale
## [1] 16.396

se.fit is the standard error for the conditional mean (here: the conditional mean for an AOA of 29). The confidence interval is then constructed using the t-distribution with the appropriate degrees of freedom (df, here: 74). 95% of a distribution lies between its 2.5th and 97.5th percentile. Multiplying these percentiles by the standard error and centring the resulting interval around the conditional mean yields the 95% confidence interval:

155.09 + 1.9172 * qt(0.025, df = 74)

## [1] 151.27

155.09 + 1.9172 * qt(0.975, df = 74)

## [1] 158.91

For an 80% confidence interval, you’d use the 10th and 90th percentiles, as 80% of the distribution lies between these two percentiles:

155.09 + 1.9172 * qt(0.10, df = 74)

## [1] 152.61

155.09 + 1.9172 * qt(0.90, df = 74)

## [1] 157.57

You can check this result using the predict() function by setting level to 0.80.

For the inquisitive, this still leaves the matter of computing the standard error by hand. Weisberg (2005, Sections 2.8 and 3.6) provides the formulae. For a simple linear regression model, you need to know

the estimated standard deviation of the residuals (here: 16.4)
the number of observations in the dataset (here: 76)
the predictor value for which you computed the corresponding outcome (here: 29)
the mean predictor value in the dataset (here: 32.54)
the sum of squares for the predictor values in the dataset (here: 24319)

# estimated standard deviation of the residuals
glance(dekeyser.lm)$sigma

## [1] 16.396

# number of observations in dataset
nrow(dekeyser)

## [1] 76

# mean predictor value in dataset
mean(dekeyser$AOA)

## [1] 32.539

# sum of squares of predictor values in dataset
sum((dekeyser$AOA - mean(dekeyser$AOA))^2)

## [1] 24319

From this you can compute the standard error of the conditional mean for AOA = 29:

\[16.4 \times \sqrt{\frac{1}{76} + \frac{(29 - 32.54)^2}{24319}} \approx 1.92\]

For AOA = 14:

\[16.4 \times \sqrt{\frac{1}{76} + \frac{(14 - 32.54)^2}{24319}} \approx 2.71\]

This formula also makes it clear that your confidence band will always be narrowest at the mean predictor value (here: 32.54) since the standard error is smallest for this value. (Try plugging in 32.54 for \(x_i\) yourself.)

For multiple regression models, you need to construct a matrix containing (in different columns) the predictor values (as well as a column with 1s to represent the intercept); the so-called model matrix (mm below). You also need to extract the model’s variance-covariance matrix, which contains estimates of how the model’s terms vary and covary with one another (vcox_matrix). This computation method can also be applied to simple regression models, so we can illustrate it here:

# Construct model matrix:
# 1st column contains 1s to represent intercept
# 2nd column contains the predictor values 
#   corresponding to the model's first predictor
mm <- matrix(c(1, 14,
               1, 29), ncol = 2, byrow = TRUE)
mm

##      [,1] [,2]
## [1,]    1   14
## [2,]    1   29

# Alternatively, you can use model.matrix()
# after having specified the data frame with 
# the predictor values.
nd <- data.frame(AOA = c(14, 29))
mm <- model.matrix(~ AOA, data = nd)
mm

##   (Intercept) AOA
## 1           1  14
## 2           1  29
## attr(,"assign")
## [1] 0 1

# Extract the model's variance-covariance matrix
vcov_matrix <- vcov(dekeyser.lm)
vcov_matrix

##             (Intercept)       AOA
## (Intercept)    15.24143 -0.359695
## AOA            -0.35969  0.011054

Now multiply the model matrix by the covariance matrix and the result by the transpose (t()) of the model matrix. This yields a new matrix (%*% is used to multiply matrices in R):

mm %*% vcov_matrix %*% t(mm)

##        1      2
## 1 7.3366 4.2625
## 2 4.2625 3.6756

Handily, the square roots of the elements on the main diagonal (i.e., 7.3366 and 3.6756) are the standard errors for the two conditional means:

mm %*% vcov_matrix %*% t(mm) %>% 
  diag() %>% 
  sqrt()

##      1      2 
## 2.7086 1.9172

2.2.3 Extra: The brute-force approach

It is also possible to construct confidence bands using a technique known as the ‘bootstrap’. I explain how to do this in my German-language introductory statistics booklet. See also Some illustations of bootstrapping.

2.3 Step 3: Plot!

While it is possible to add the actual observations from the dataset to this plot, I don’t do this here. The reason is that I don’t think this makes much sense for the multiple regression models, logistic models and mixed-effects models we’ll turn to next. (Perhaps that’ll change after I’ve read and understood Fox and Weisberg (2018).)

ggplot(data = ci_95,
       aes(x = AOA,
           y = fit,
           ymin = lwr,
           ymax = upr)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line()

What you can do is add a “rug” that show were the predictor values are located. This is useful to communicate to your audience that any association you found isn’t driven by just a couple of cases with outlying predictor values.

ggplot() +
  geom_ribbon(data = ci_95,
              aes(x = AOA,
                  ymin = lwr,
                  ymax = upr),
              fill = "grey90") +
  geom_line(data = ci_95,
            aes(x = AOA,
                y = fit)) +
  geom_rug(data = dekeyser,
           aes(x = AOA),
           sides = "b")

3 Predictions about individual cases vs. conditional means

An important point that some people gloss over is that confidence bands show the model’s uncertainty about the conditional means (or “expected values”). They do not show the model’s uncertainty about where new observations with a given set of predictor values may end up. To visualise this, you need prediction intervals and prediction bands.

Here I’ll (quickly) show how you can compute and visualise prediction intervals/bands, but if you’re interested in the maths, refer to Weisberg (2005, Sections 2.8 and 3.6).

To compute a 95% prediction interval for observations with an age of acquisition of 29, use

predict(dekeyser.lm, 
        newdata = data.frame(AOA = 29),
        interval = "prediction",
        level = 0.95)

##      fit   lwr    upr
## 1 155.09 122.2 187.98

In other words, if you’d collect GJT data for a large number of participants with an age of acquisition of 29 – and if this model is correct (a big “if”!) – you’d find that 95% of them had GJT scores between 122 and 188.

The following code computes prediction intervals for all AOA values in new_AOA and combines them with the 95% confidence intervals computed previously. It also renames the lwr and upr columns to make it clear which intervals they refer to.

pred_ci <- predict(dekeyser.lm,
                   newdata = new_AOA,
                   interval = "prediction",
                   level = 0.95) %>% 
  as_tibble() %>% 
  rename(pred_lwr = lwr,
         pred_upr = upr) %>% 
  bind_cols(ci_95 %>% select(AOA, lwr, upr), .) %>% 
  rename(ci_lwr = lwr,
         ci_upr = upr)
pred_ci

Now plot:

ggplot(data = pred_ci,
       aes(x = AOA,
           y = fit)) +
  geom_ribbon(aes(ymin = pred_lwr,
                  ymax = pred_upr),
              fill = "grey90") +
  geom_ribbon(aes(ymin = ci_lwr,
                  ymax = ci_upr),
              fill = "grey80") +
  geom_line() +
  xlab("age of L2 acquisition") +
  ylab("L2 grammar test score") +
  ggtitle("Age trend in L2 grammar score",
          subtitle = "(dark grey: 95% confidence band, light grey: 95% prediction band")

A prediction interval is invariably larger than the corresponding confidence interval. (Just like the standard deviation of a variable is larger than the standard error of the variable’s mean.)
Some assumptions such as homoskedasticity and normality aren’t too problematic when violated (within reason) for the mean trend and the confidence band. But prediction intervals are more sensitive to violated assumptions. In this example, the regression model captures the mean trend in the data pretty well, but it predicts impossible data: The maximum possible score on the grammar test was 204, yet the prediction interval for participants with low AOAs goes beyond this value. For more about assumptions and their relation to capturing mean trends vs. making predictions, see Before worrying about model assumptions, think about model relevance.

I think that in most strictly academic settings, researchers will be interested in the mean trends, so we’ll mainly focus on confidence bands. That said, in more applied settings, prediction intervals may be quite useful. What’s important in either case is that you don’t confuse them.

4 More examples

Let’s now turn to some more complex examples. We’ll start by visualising linear regression models that contain several predictors, then turning our attention to generalised linear models that can accommodate non-continuous outcomes and finally we’ll discuss mixed-effects models.

4.1 Several continuous predictors

4.1.1 How-to

To illustrate how the uncertainty about effects in multiple linear regression models can be visualised, we’ll work with a dataset from Vanhove and Berthele (2015). (They analysed their data differently from how we will for pedagogical purposes). The dataset can be downloaded from https://janhove.github.io/visualise_uncertainty/VanhoveBerthele2015.csv.

The dataset contains, among other variables, the following information:

TotCorrect: Number of correctly translated Swedish words (out of 45) by one of the German-speaking participants. Outcome variable.
WST: The participant’s score on a German vocabulary test.
English.Cloze: The participant’s score on an English cloze test.
Raven: The participant’s score on a set of Raven’s matrices (intelligence test).
BWDS: The participant’s backward digit span (short-term memory).
NrLang: Number of languages known by the participant other than German and Swiss-German.

In our first model, we’ll fit the outcome in terms of these five more or less continuous predictors:

vb <- read_csv("https://janhove.github.io/visualise_uncertainty/VanhoveBerthele2015.csv")

## Parsed with column specification:
## cols(
##   Subject = col_double(),
##   Modality = col_character(),
##   Sex = col_character(),
##   Age = col_double(),
##   NrLang = col_double(),
##   BWDS = col_double(),
##   WST = col_double(),
##   Raven = col_double(),
##   English.Cloze = col_double(),
##   TotCorrect = col_double()
## )

# There are some missing values in the predictors, so remove these:
vb <- vb %>% 
  select(TotCorrect, WST, English.Cloze, Raven, BWDS, NrLang, Sex, Age) %>% 
  drop_na()

vb.lm <- lm(TotCorrect ~ WST + English.Cloze +
              Raven + BWDS + NrLang,
            data = vb)
tidy(vb.lm)

The principle behind outputting an estimated conditional mean from this model is the same as for the simple regression model: Take the regression equation and plug in the desired predictor values. We’ll work directly with predict() here, though. So if you wanted to know what, according to the model, the mean TotCorrect would be for a large group of participants that all have a WST of 32, and English.Cloze of 8, a Raven of 14, a BWDS of 9 and a NrLang of 4, here’s what you’d do:

new_df <- data.frame(
  WST = 32,
  English.Cloze = 8,
  Raven = 14,
  BWDS = 9,
  NrLang = 4
)
predict(vb.lm,
        newdata = new_df,
        interval = "confidence",
        level = 0.95)

##    fit  lwr  upr
## 1 13.2 11.4 14.9

We can draw effect plots just like before, but there’s one complication: In models with multiple predictors, the outcome varies as a function of all the predictors simultaneously. One solution to this complication is to draw partial effect plots. These show how (according to the model) the conditional mean of the outcome varies along one predictor when all of the other predictors are fixed at some value set by the analyst. Typically, the other predictors are fixed at their sample mean or median; here we’ll fix them at their means.

Compute these means:

vb_means <- vb %>% 
  summarise_if(is.numeric, mean, na.rm = TRUE)
vb_means

To draw the partial effect plot for the WST predictor, you construct a data frame in which WST varies (typically along its sample range) and in which the other predictors are fixed at their sample means:

new_df <- expand.grid(
  NrLang = vb_means$NrLang,
  BWDS = vb_means$BWDS,
  # 50 WST values between the sample minimum and maximum:
  WST = seq(min(vb$WST, na.rm = TRUE),
            max(vb$WST, na.rm = TRUE),
            length.out = 50),
  Raven = vb_means$Raven,
  English.Cloze = vb_means$English.Cloze
)
new_df

Now use predict() to add the conditional means and their confidence intervals, just like before:

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)
vb_ci

And plot the results just like before:

ggplot() +
  geom_ribbon(data = vb_ci,
              aes(x = WST,
                  ymin = lwr,
                  ymax = upr),
              fill = "grey90") +
  geom_line(data = vb_ci,
            aes(x = WST,
                y = fit)) +
  geom_rug(data = vb,
           aes(x = WST),
           sides = "b")

If instead you want to plot a partial effect plot for NrLang, vary NrLang along its sample range and fix the other variables:

new_df <- expand.grid(
  NrLang = seq(min(vb$NrLang, na.rm = TRUE),
               max(vb$NrLang, na.rm = TRUE),
               by = 1),
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = vb_means$Raven,
  English.Cloze = vb_means$English.Cloze
)

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

ggplot() +
  geom_ribbon(data = vb_ci,
              aes(x = NrLang,
                  ymin = lwr,
                  ymax = upr),
              fill = "grey90") +
  geom_line(data = vb_ci,
            aes(x = NrLang,
                y = fit)) +
  geom_rug(data = vb,
           aes(x = NrLang),
           sides = "b")

4.1.2 Comments and caveats

McElreath (2016, Chapter 5) calls these plots counterfactual plots. They show how, according to the model, the conditional mean of the outcome varies with one predictor while keeping the other predictors fixed at some value. In the real world, however, it may not be possible to vary one predictor while keeping the others fixed. For instance, participants with high WST tended to have high English.Cloze values, too. If such correlations between predictors are very strong, it may not be possible for there to be a participant with a low or high value in the focal predictor whilst having just an average value on the other predictors. See also Collinearity isn’t a disease that needs curing.
You can fix the non-focal predictors at different values. In multiple linear regression, this shifts the regression line up- or downwards, but the shape remains the same. (This isn’t true for generalised linear models!) The confidence bands will be a bit wider, though: They’re narrowest when the non-focal predictors are fixed at their means:

# Illustration
new_df <- expand.grid(
  # 3 values for NrLang
  NrLang = c(1, 4, 9),
  BWDS = vb_means$BWDS,
  # 50 WST values between the sample minimum and maximum:
  WST = seq(min(vb$WST, na.rm = TRUE),
            max(vb$WST, na.rm = TRUE),
            length.out = 50),
  Raven = vb_means$Raven,
  English.Cloze = vb_means$English.Cloze
)

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

# Left: WST effect for NrLang == 1,
# Middle: WST effect for NrLang == 4,
# Right: WST effect for NrLang == 9
# Confidence band narrowest near mean (NrLang == 4),
# wider the farther away from the mean (NrLang == 9).
# Lines run parallel, though.
ggplot(data = vb_ci,
       aes(x = WST,
           y = fit,
           ymin = lwr,
           ymax = upr)) +
  geom_ribbon(fill = "grey90") +
  geom_line() +
  facet_wrap(~ NrLang)

4.2 Dealing with categorical predictors

4.2.1 A categorical predictor as the focal predictor

We’ll work with the same dataset as before and add the binary Sex variable to the model, just for illustration purposes:

vb.lm <- lm(TotCorrect ~ WST + English.Cloze +
              Raven + BWDS + NrLang + Sex,
            data = vb)
tidy(vb.lm)

Now construct data frame that contains the values for Sex that occur in the dataset and in which the other predictors are fixed at their sample means:

new_df <- expand.grid(
  NrLang = vb_means$NrLang,
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = vb_means$Raven,
  English.Cloze = vb_means$English.Cloze,
  Sex = unique(vb$Sex)
)
new_df

And apply the same procedure as before:

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

Since the predictor is categorical, use geom_pointrange() instead of geom_ribbon().

ggplot(data = vb_ci,
       aes(x = Sex,
           y = fit,
           ymax = upr,
           ymin = lwr)) +
  geom_pointrange()

To reiterate, these are ceteris paribus plots. The plot does not show that men are on average worse than women at translating Swedish words into German. (This does happen to be true for this dataset and others like it, but the plot doesn’t show it.) What it does show is that if you have a large group of participants with the same NrLang, BWDS, WST, Raven and English.Cloze results, the men in this group will perform worse on average than the women.

4.2.2 A categorical predictor as a non-focal predictor

What if you want to draw an effect plot for a continuous predictor but there are categorical predictors in the model as well?

Option 1: Draw the plot for all values of the categorical predictors. This is useful when you have an interaction between the categorical predictor and the focal predictor (more on that later), but it adds little when you don’t:

new_df <- expand.grid(
  NrLang = seq(1, 9, by = 1),
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = vb_means$Raven,
  English.Cloze = vb_means$English.Cloze,
  Sex = unique(vb$Sex)
)
vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

ggplot(data = vb_ci,
       aes(x = NrLang,
           y = fit,
           ymin = lwr,
           ymax = upr)) +
  geom_ribbon(fill = "grey90") +
  geom_line() +
  facet_wrap(~ Sex)

# or:
ggplot(data = vb_ci,
       aes(x = NrLang,
           y = fit,
           ymin = lwr,
           ymax = upr,
           fill = Sex,
           colour = Sex)) +
  geom_ribbon(alpha = 1/5) +
  geom_line()

Option 2: Fix the categorical predictor at some value. There are more women than men in the dataset, so we could treat female as the default sex. The resulting partial effect plot below shows the conditional means for women with varying Raven values but whose other predictor values correspond to the sample mean.

new_df <- expand.grid(
  NrLang = vb_means$NrLang,
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = seq(min(vb$Raven, na.rm = TRUE),
              max(vb$Raven, na.rm = TRUE),
              by = 1),
  English.Cloze = vb_means$English.Cloze,
  Sex = "female"
)

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

ggplot() +
  geom_ribbon(data = vb_ci,
              aes(x = Raven,
                  ymin = lwr,
                  ymax = upr),
              fill = "grey90") +
  geom_line(data = vb_ci,
            aes(x = Raven,
                y = fit))

Option 3: You can also choose some compromise between the different categorical values. This is easier to grasp and to do if you recode the categorical predictor as a numeric predictor:

vb$n.Male <- ifelse(vb$Sex == "female", 0, 1)
vb %>% 
  sample_n(size = 5) %>% 
  select(Sex, n.Male)

Now refit the model using the recoded predictor. The model estimates don’t change.

vb.lm <- lm(TotCorrect ~ WST + English.Cloze +
              Raven + BWDS + NrLang + n.Male,
            data = vb)
tidy(vb.lm)

But now you can compute the mean value for the categorical predictor. (0.44 means that 44% of the participants were men.)

vb_means <- vb %>% 
  summarise_if(is.numeric, mean, na.rm = TRUE)
vb_means

So to draw a partial effect plot that represents the average participant (i.e., 44% male and 56% female), plug in the mean value for n.Male:

new_df <- expand.grid(
  NrLang = vb_means$NrLang,
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = seq(min(vb$Raven, na.rm = TRUE),
              max(vb$Raven, na.rm = TRUE),
              by = 1),
  English.Cloze = vb_means$English.Cloze,
  n.Male = vb_means$n.Male
)

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

ggplot() +
  geom_ribbon(data = vb_ci,
              aes(x = Raven,
                  ymin = lwr,
                  ymax = upr),
              fill = "grey90") +
  geom_line(data = vb_ci,
            aes(x = Raven,
                y = fit))

The line is somewhat lower than the one in the previous graph, but the confidence band is a bit narrower. (Confidence bands are narrower when you fix the predictors at their sample means.)

4.3 t tests are models, too

We’ll work with the same dataset, just for illustration purposes. Let’s say we originally ran a t-test on the TotCorrect data and compared the men to the women:

# "var.equal = TRUE" runs Student's t-test rather than Welch'
t.test(TotCorrect ~ Sex, data = vb, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  TotCorrect by Sex
## t = 3, df = 157, p-value = 0.003
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.906 4.250
## sample estimates:
## mean in group female   mean in group male 
##                 17.6                 15.0

The same results (\(t = 3.0\), \(p = 0.003\)) can be obtained if you fit the data in a simple linear regression model:

ttest.lm <- lm(TotCorrect ~ Sex, data = vb)
tidy(ttest.lm)

This model can be visualised just like before:

new_df <- data.frame(
  Sex = unique(vb$Sex)
)
vb_ci <- predict(ttest.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)
ggplot(data = vb_ci,
       aes(x = Sex,
           y = fit,
           ymax = upr,
           ymin = lwr)) +
  geom_pointrange()

For a comparatively simple model such as this, it’s a good idea to (also, simultaneously, or just) show the distribution of the raw data (see Tutorial: Drawing a boxplot).

4.4 Dealing with interactions

4.4.1 Between categorical variables

Berthele (2012) asked teacher trainees to rate the academic Potential (6-point scale) of a young boy based on a recording of that boy speaking in a foreign language (French). The recordings either did or did not contain some instances of code-switching into the boy’s native language (Swiss German; CS with vs. without), and the teacher trainees were either told that the boy’s Name was Luca (a typical Swiss German name) or they were told that he was called Dragan (a name associated with the Balkan countries). The question was how the presence of code-switches affected the trainees’ ratings depending on whether they believed the boy to have a migration background or not.

Download the dataset from https://janhove.github.io/visualise_uncertainty/Berthele2012.csv.

b2012 <- read.csv("https://janhove.github.io/visualise_uncertainty/Berthele2012.csv")
b2012 %>% 
  sample_n(10)

We’ll treat the 6-point Potential ratings as continuous here.

b2012.lm <- lm(Potential ~ CS * Name, data = b2012)
tidy(b2012.lm)

To visualise the model effects and their uncertainty, we again construct a data frame with the model’s predictors. Since the effect of CS depends on the value of Name and vice versa, we vary both of these variables:

new_df <- expand.grid(
  CS = unique(b2012$CS),
  Name = unique(b2012$Name)
)
new_df

Then again add the fitted means and their confidence intervals and plot them:

vb_ci <- predict(b2012.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

# Option one: using facetting:
ggplot(data = vb_ci,
       aes(x = CS,
           y = fit,
           ymax = upr,
           ymin = lwr)) +
  geom_pointrange() +
  facet_wrap(~ Name)

# Option two: using colours and/or different symbols
ggplot(data = vb_ci,
       aes(x = CS,
           y = fit,
           ymax = upr,
           ymin = lwr,
           colour = Name,
           group = Name)) +
  geom_pointrange(position = position_dodge(width = 0.2)) +
  geom_line(position = position_dodge(width = 0.2))

Incidentally, these confidence intervals are not the same confidence intervals you’d obtain if you just computed the mean and standard error in each cell separately: The linear model assumes that the variability is the same in each cell (in the population) and so pools this information across the whole dataset. To the extent that this assumption is more realistic than the tacit assumption made when computing the standard errors separately for each cell (namely, that knowing about the variability in one cell doesn’t help you at all in estimating the variability in the other cells), these confidence intervals will be more or less accurate.

4.4.2 Between a categorical and a continuous variable

Just for illustration purposes, let’s fit the vb2015 data in a model containing an interaction between Sex and Raven:

vb.lm <- lm(TotCorrect ~ Sex*Raven + WST + English.Cloze +
              NrLang + BWDS, data = vb)
tidy(vb.lm)

Same procedure as before:

new_df <- expand.grid(
  NrLang = vb_means$NrLang,
  BWDS = vb_means$BWDS,
  WST = vb_means$WST,
  Raven = seq(min(vb$Raven, na.rm = TRUE),
              max(vb$Raven, na.rm = TRUE),
              by = 1),
  English.Cloze = vb_means$English.Cloze,
  Sex = unique(vb$Sex)
)

vb_ci <- predict(vb.lm, newdata = new_df,
                 interval = "confidence",
                 level = 0.95) %>% 
  as_tibble() %>% 
  bind_cols(new_df, .)

ggplot(data = vb_ci,
       aes(x = Raven,
           y = fit,
           ymin = lwr,
           ymax = upr,
           colour = Sex,
           fill = Sex)) +
  geom_ribbon(alpha = 1/5) + 
  geom_line()

# or
ggplot(data = vb_ci,
       aes(x = Raven,
           y = fit,
           ymin = lwr,
           ymax = upr)) +
  geom_ribbon(fill = "grey90") + 
  geom_line() +
  facet_wrap(~ Sex)

4.4.3 Between continuous variables

I’ve rarely seen people fit interactions between continuous variables outside of generalised additive models, so I won’t go into these.

4.5 Ordinary logistic regression

Vanhove and Berthele (2013) asked 101 Swiss speakers of German to translate 180 written words from Danish, Frisian, Dutch and Swedish into German; these words all had cognates in German, French and/or English. Here we’ll analyse the responses of a single participant (DB3) to see how his translation accuracy varied in terms of how frequent the words’ German and English cognates were (frequency was log-transformed: log.FreqGermanic) and in terms of how orthographically similar the words were to their German and English cognates (MinLevGermanic; 0 = maximally similar, 1 = maximally dissimilar).

Download the dataset from https://janhove.github.io/visualise_uncertainty/VanhoveBerthele2013.csv. Then read it in and retain only participant DB3‘s responses. (The reason for retaining just one participant’s responses is that you’d have to use mixed-effects modelling to deal with several participants’ responses. We’ll do this below, though.)

# Read in data
vb13 <- read_csv("https://janhove.github.io/visualise_uncertainty/VanhoveBerthele2013.csv")

## Parsed with column specification:
## cols(
##   Subject = col_character(),
##   Stimulus = col_character(),
##   Lx = col_character(),
##   Correct = col_double(),
##   MinLevGermanic = col_double(),
##   log.FreqGermanic = col_double(),
##   Sex = col_character(),
##   EngReading = col_double()
## )

# Retain only the observations for participant `DB3`:
db3 <- vb13 %>% 
  filter(Subject == "DB3")

db3 %>% 
  sample_n(5)

Now fit the model.

db3.glm <- glm(Correct ~ MinLevGermanic + log.FreqGermanic,
               data = db3,
               family = "binomial")

tidy(db3.glm)

The same steps are involved in drawing the first effect plot. When fitting a model with glm(), however, you can’t compute confidence intervals for the fitted values on the fly. Moreover, you have to specify the type of the fitted values. I use "link" here, which outputs the fitted values in log-odds:

# Effect display for MinLevGermanic
db3_lev <- expand.grid(
  # Fix clog.Freq at its sample mean
  log.FreqGermanic = mean(db3$log.FreqGermanic),
  # Vary MinLevGermanic along its sample range
  MinLevGermanic = seq(from = min(db3$MinLevGermanic),
                       to   = max(db3$MinLevGermanic),
                       length.out = 50)
)

# Fill in predicted values in LOG-ODDS
db3_lev$Prediction <- predict(db3.glm, db3_lev, type = "link")
db3_lev %>% 
  slice(1:5)

If you plot these values, you see that the relationship between the fitted values and the predictor is linear, but keep in mind that this is if you express the fitted values in log-odds:

ggplot(db3_lev,
       aes(x = MinLevGermanic,
           y = Prediction)) +
  geom_line()

Since hardly anyone thinks in log-odds, it makes sense to convert these values to probabilities using the logistic function (plogis():

db3_lev$p_Prediction <- plogis(db3_lev$Prediction)

ggplot(db3_lev,
       aes(x = MinLevGermanic,
           y = p_Prediction)) +
  geom_line() +
  ylab("Probability of correct translation")

Confidence intervals for the fitted values can’t be computed directly, but predict() can produce the standard errors (in log-odds):

Confidence band:

db3_lev$SE.Prediction <- predict(db3.glm, db3_lev, type = "link", se.fit = TRUE)$se.fit
db3_lev %>% 
  slice(1:5)

A 95% confidence interval roughly corresponds to the fitted value plus/minus two standard errors. Its bounds can then be converted to the probability scale. Compute the confidence intervals in log-odds before converting them to probabilities; do not add or substract the standard errors directly from the fitted values on the probability scale!

Note that while the confidence intervals lie symmetrically around the fitted values if expressed in log-odds, they are asymmetric if expressed in probabilities:

db3_lev <- db3_lev %>% 
  mutate(ci_lo = Prediction - 2 * SE.Prediction,
         ci_hi = Prediction + 2 * SE.Prediction,
         p_Prediction = plogis(Prediction),
         p_ci_lo = plogis(ci_lo),
         p_ci_hi = plogis(ci_hi))
db3_lev %>% 
  slice(1:5)

Now draw the effect plot:

ggplot(db3_lev,
       aes(x = MinLevGermanic,
           y = p_Prediction,
           ymin = p_ci_lo,
           ymax = p_ci_hi)) +
  geom_ribbon(fill = "grey90") +
  geom_line() +
  ylab("Probability of correct translation")

The same plot can be produced using the effects package by setting rescale.axis = FALSE. (rescale.axis = TRUE stretches or squeezes the y-axis so that the effect looks linear.)

library(effects)

## Loading required package: carData

## lattice theme set by effectsTheme()
## See ?effectsTheme for details.

plot(effect("MinLevGermanic", db3.glm),
     rescale.axis = FALSE)

Importantly, partial effect plots for GLMs

yield nonlinear effects if plotted on the response scale (cf. supra)
confidence bands are asymmetric if plotted on the response scale (cf. supra)
don’t run parallel to one another if other predictors are fixed at other values if plotted on the response scale (cf. ultra).

# Effect display for MinLevGermanic
db3_lev <- expand.grid(
  # Fix clog.Freq at its 30th and 90th percentile
  log.FreqGermanic = quantile(db3$log.FreqGermanic, probs = c(0.30, 0.90)),
  # Vary MinLevGermanic along its sample range
  MinLevGermanic = seq(from = min(db3$MinLevGermanic),
                       to   = max(db3$MinLevGermanic),
                       length.out = 50)
)

# Fill in predicted values in LOG-ODDS
db3_lev$Prediction <- predict(db3.glm, db3_lev, type = "link")
db3_lev$SE.Prediction <- predict(db3.glm, db3_lev, type = "link", se.fit = TRUE)$se.fit

db3_lev <- db3_lev %>% 
  mutate(ci_lo = Prediction - 2 * SE.Prediction,
         ci_hi = Prediction + 2 * SE.Prediction,
         p_Prediction = plogis(Prediction),
         p_ci_lo = plogis(ci_lo),
         p_ci_hi = plogis(ci_hi))

ggplot(db3_lev,
       aes(x = MinLevGermanic,
           y = p_Prediction,
           ymin = p_ci_lo,
           ymax = p_ci_hi)) +
  geom_ribbon(fill = "grey90") +
  geom_line() +
  ylab("Probability of correct translation") +
  facet_wrap(~ factor(log.FreqGermanic))

# This plot makes it easier to see that the
# lines don't run parallel to each other.
ggplot(db3_lev,
       aes(x = MinLevGermanic,
           y = p_Prediction,
           ymin = p_ci_lo,
           ymax = p_ci_hi,
           colour = factor(log.FreqGermanic))) +
  geom_line() +
  ylab("Probability of correct translation")

4.6 Mixed-effect models

We’ll use the sleepstudy dataset that comes with the lme4 package, so there’s no need for you to download another dataset. The Subject IDs are just numbers, but I tell R to treat them as characters:

df_sleep <- sleepstudy %>% 
  as_tibble() %>% 
  mutate(Subject = as.character(Subject))
df_sleep

The dataset contains the average Reaction time (in milliseconds) of 18 participants as they were deprived of sleep for 9 consecutive Days:

ggplot(df_sleep) + 
  aes(x = Days, y = Reaction) + 
  stat_smooth(method = "lm", se = FALSE) +
  geom_point() +
  facet_wrap("Subject") +
  labs(x = "Days", y = "Average reaction time (ms)") + 
  scale_x_continuous(breaks = seq(0, 10, 2))

## `geom_smooth()` using formula 'y ~ x'

We’ll fit these data in both frequentist and Bayesian mixed-effects models using the lmer() and brm() functions from the lme4 and brms packages, respectively, and then we’ll draw a number of effect plots.

4.6.1 `lmer()`

First fit the model:

sleep.lmer <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject),
                   data = df_sleep)
tidy(sleep.lmer)

# or: summary(sleep.lmer)

We can use this model to generate predictions in much the same way as we did for the previous models. However, this time we need to decide if we want to compute these predictions for a specific participant or if we want to predict the population average effect. Typically, the latter will be of interest: Where, according to the model, would the mean Reaction value for each day end up if we collected new data for new participants?

Computing these values is pretty easy:

new_sleep <- expand.grid(
  Days = 0:9
)
new_sleep$Prediction <- predict(sleep.lmer, newdata = new_sleep, re.form = NA)
new_sleep

re.form = NA tells predict() to ignore participant-specific random effects; see ?predict.merMod. You can set re.form = NULL to incorporate random effects (i.e., to compute conditional means for a given participant), but then you need to add a Subject column to new_sleep. However, I don’t know how to add confidence intervals to participant-specific predictions, so we’re not going to do this here.

predict() doesn’t compute confidence intervals for population average predictions by lmer()-fitted models, either, but you can apply the same computation described in the section called Extra: More about the analytic approach:

# Construct model matrix
mm <- model.matrix(~ Days, data = new_sleep)
# Extract variance-covariance matrix
vcov_sleep.lmer <- vcov(sleep.lmer)

# Compute standard errors for predictions
new_sleep$SE <- mm %*% vcov_sleep.lmer %*% t(mm) %>% 
  diag() %>% 
  sqrt()

# 95% confidence interval = Prediction +/- 2 SEs:
new_sleep <- new_sleep %>% 
  mutate(ci_lo = Prediction - 2 * SE,
         ci_hi = Prediction + 2 * SE)

And plot as before:

ggplot(data = new_sleep,
       aes(x = Days,
           y = Prediction,
           ymin = ci_lo,
           ymax = ci_hi)) +
  geom_ribbon(fill = "grey90") +
  geom_line() +
  scale_x_continuous(breaks = seq(0, 10, 2))

The same can be accomplished using the effects package:

plot(effect('Days', sleep.lmer))

4.6.2 `brm()`

Fitting the corresponding brm() model takes a lot longer, but you can draw a variety of effect plots afterwards:

sleep.brm <- brm(Reaction ~ 1 + Days + (1 + Days|Subject),
                 data = df_sleep,
                 # large iter so that output varies less from run to run;
                 # you don't have to do this
                 warmup = 1000, iter = 11000,
                 control = list(adapt_delta = 0.95))

## Compiling the C++ model

## Start sampling

summary(sleep.brm)

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: Reaction ~ 1 + Days + (1 + Days | Subject) 
##    Data: df_sleep (Number of observations: 180) 
## Samples: 4 chains, each with iter = 11000; warmup = 1000; thin = 1;
##          total post-warmup samples = 40000
## 
## Group-Level Effects: 
## ~Subject (Number of levels: 18) 
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)          26.80      6.81    15.43    42.18 1.00    19115    25841
## sd(Days)                6.56      1.50     4.17     9.98 1.00    17349    22237
## cor(Intercept,Days)     0.09      0.30    -0.48     0.68 1.00    10276    16524
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   251.23      7.31   236.63   265.64 1.00    23656    27988
## Days         10.42      1.69     7.04    13.73 1.00    18474    23321
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    25.92      1.56    23.08    29.19 1.00    39636    30015
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Use the tabs to navigate between different kinds of plots.

4.6.2.1 Conditional means for the population

This produces a similar plot to the lmer() one we just drew: What, according to the model, would the mean Reaction value for each day be if you sampled a large number of new participants? The fitted() function, if applied to models fitted using brm() automatically computes 95% credible intervals.

new_sleep <- expand.grid(
  Days = 0:9
)

# use "re_formula = NA" for population average effect:
average_sleep_population <- fitted(sleep.brm, newdata = new_sleep, re_formula = NA) %>% 
  as_tibble() %>%
  bind_cols(new_sleep, .)
average_sleep_population

Plot as before:

ggplot(data = average_sleep_population,
       aes(x = Days,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line() +
  ylim(180, 500) +
  ggtitle("Overall population effect",
          subtitle = "(i.e., random effects ignored)") +
  scale_x_continuous(breaks = seq(0, 10, 2))

Incidentally, I’ve set the limits of the y-axis to 180 and 550 milliseconds to facilitate comparisons between the different plots we’re going to draw.

Arguably, all of the plots we’ve drawn so far overemphasise the point estimates for the fitted values (i.e., the regression line) as well as the end points of the 95% uncertainty interval. To de-emphasise these, you could compute and overlay several uncertainty bands (here: 50%, 80% and 95% credible bands):

new_sleep <- expand.grid(
  Days = 0:9
)

average_sleep_population <- fitted(sleep.brm, newdata = new_sleep, re_formula = NA,
                    probs = c(0.025, 0.10, 0.25, 0.75, 0.90, 0.975)) %>% 
  as_tibble() %>%
  bind_cols(new_sleep, .)

ggplot(data = average_sleep_population,
       aes(x = Days,
           y = Estimate)) +
  geom_ribbon(aes(ymin = Q2.5,
                  ymax = Q97.5),
              fill = "grey90") +
  geom_ribbon(aes(ymin = Q10,
                  ymax = Q90),
              fill = "grey80") +
  geom_ribbon(aes(ymin = Q25,
                  ymax = Q75),
              fill = "grey70") +
  ylim(180, 500) +
  scale_x_continuous(breaks = seq(0, 10, 2))

I’m not sure if you’ll get them through peer review like this :), but the plot does emphasise that Bayesian models (such as the ones fitted using brm()) produce distributions of estimates. (You could actually do the same for lmer()-fitted models by multiplying the predictions’ standard errors by different numbers, e.g., 2, 1.28 and 0.67 for 95%, 80% and 50% confidence bands.)

4.6.2.2 Extra: Conditional means for a known participant

This calculates and then draws what according to the model the average reaction time would be if you gathered a lot of data on a particular participant that was present in the dataset. Here we’ll treat Subject 308 as the focal participant.

new_sleep <- expand.grid(
  Days = 0:9,
  Subject = "308"
)

# "re_formula = NULL" to incorporate participant-specific
# random effects in the predictions
average_sleep_old <- fitted(sleep.brm, new_sleep, re_formula = NULL) %>% 
  as_tibble() %>% 
  bind_cols(new_sleep, .)
average_sleep_old

And plot:

ggplot(data = average_sleep_old,
       aes(x = Days,
             y = Estimate,
             ymin = Q2.5,
             ymax = Q97.5)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line() +
  ylim(180, 500) +
  scale_x_continuous(breaks = seq(0, 10, 2)) +
  ggtitle("Estimated effect of sleep deprivation for Subject 308")

4.6.2.3 Extra: Conditional means for an unknown participant

We can even use the model to predict where the mean reaction times for each day for a new, unknown participant will end up:

new_sleep <- expand.grid(
  Days = 0:9,
  Subject = "new participant" # Subject not in original dataset
)

# use both "re_formula = NULL" and
# "allow_new_levels = TRUE"
average_sleep_new <- fitted(sleep.brm, new_sleep, re_formula = NULL,
                        allow_new_levels = TRUE) %>% 
  as_tibble() %>% 
  bind_cols(new_sleep, .)

average_sleep_new

The predicted averages are the same as the predicted population averages:

cbind(average_sleep_population$Estimate, average_sleep_new$Estimate)

##       [,1] [,2]
##  [1,]  251  251
##  [2,]  262  262
##  [3,]  272  272
##  [4,]  283  283
##  [5,]  293  293
##  [6,]  303  303
##  [7,]  314  314
##  [8,]  324  324
##  [9,]  335  335
## [10,]  345  345

However, the uncertainty about where the average values of one particular but unknown person might end up is considerably larger than the uncertainty about where the average values of a large group of unknown people might end up:

cbind(average_sleep_population$Est.Error, average_sleep_new$Est.Error)

##        [,1] [,2]
##  [1,]  7.31 24.4
##  [2,]  7.29 25.3
##  [3,]  7.65 27.6
##  [4,]  8.34 30.8
##  [5,]  9.30 34.7
##  [6,] 10.44 39.1
##  [7,] 11.71 43.9
##  [8,] 13.08 48.9
##  [9,] 14.52 54.1
## [10,] 16.00 59.4

ggplot(data = average_sleep_new,
       aes(x = Days,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line() +
  ylim(180, 500) +
  scale_x_continuous(breaks = seq(0, 10, 2)) +
  ggtitle("Estimated effect of sleep deprivation for unknown person")

I think that for the most part researchers are interested in the population average effects, not in individual effects. But if it is in fact the individual effects that are of interest, make sure to take into account the added uncertainty!

4.6.2.4 Extra: Predicted values for a known participant

We already saw how you can compute and visualise the predicted averages for known and unknown participants. But occasionally, it may be necessary to visualise where individual measurements for known or unknown participants may end up. To compute prediction intervals, apply predict() to the brm()-fitted model:

new_sleep <- expand.grid(
  Days = 0:9,
  Subject = "308"
)

# "re_formula = NULL" to incorporate subject-specific random effects
predict_sleep_old <- predict(sleep.brm, new_sleep, re_formula = NULL) %>% 
  as_tibble() %>% 
  bind_cols(new_sleep, .)

The average predicted values are the same as before:

cbind(average_sleep_old$Estimate, predict_sleep_old$Estimate)

##       [,1] [,2]
##  [1,]  254  254
##  [2,]  274  274
##  [3,]  293  293
##  [4,]  313  313
##  [5,]  332  332
##  [6,]  352  352
##  [7,]  372  372
##  [8,]  391  391
##  [9,]  411  411
## [10,]  431  431

But the uncertainty about where a single value might end up is considerably large than the uncertainty about where an average value might end up:

cbind(average_sleep_old$Est.Error, predict_sleep_old$Est.Error)

##        [,1] [,2]
##  [1,] 13.07 29.1
##  [2,] 11.17 28.3
##  [3,]  9.57 27.9
##  [4,]  8.41 27.3
##  [5,]  7.91 27.3
##  [6,]  8.18 27.2
##  [7,]  9.16 27.6
##  [8,] 10.65 28.1
##  [9,] 12.47 28.7
## [10,] 14.50 29.8

ggplot(data = predict_sleep_old,
       aes(x = Days,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line() +
  ylim(180, 500) +
  scale_x_continuous(breaks = seq(0, 10, 2))

4.6.2.5 Extra: Predicted values for an unknown participant

new_sleep <- expand.grid(
  Days = 0:9,
  Subject = "new participant"
)

# "re_formula = NULL" to incorporate subject-specific random effects
# "allow_new_levels = TRUE" to allow predictions for new participants
predict_sleep_new <- predict(sleep.brm, new_sleep, re_formula = NULL,
                             allow_new_levels = TRUE) %>% 
  as_tibble() %>% 
  bind_cols(new_sleep, .)

The average predicted values are the same as before:

cbind(average_sleep_new$Estimate, predict_sleep_new$Estimate)

##       [,1] [,2]
##  [1,]  251  252
##  [2,]  262  262
##  [3,]  272  272
##  [4,]  283  283
##  [5,]  293  293
##  [6,]  303  304
##  [7,]  314  314
##  [8,]  324  325
##  [9,]  335  335
## [10,]  345  346

But the uncertainty about where a single value might end up is considerably large than the uncertainty about where an average value might end up:

cbind(average_sleep_new$Est.Error, predict_sleep_new$Est.Error)

##       [,1] [,2]
##  [1,] 24.4 35.4
##  [2,] 25.3 36.1
##  [3,] 27.6 37.6
##  [4,] 30.8 39.8
##  [5,] 34.7 42.8
##  [6,] 39.1 46.6
##  [7,] 43.9 50.3
##  [8,] 48.9 54.8
##  [9,] 54.1 59.4
## [10,] 59.4 64.2

ggplot(data = predict_sleep_new,
       aes(x = Days,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "lightgrey") +
  geom_line() +
  ylim(170, 500) +
  scale_x_continuous(breaks = seq(0, 10, 2))

These plots are difficult to draw when using lmer():

There is no option for computing standard errors of predictions because it is difficult to define an efficient method that incorporates uncertainty in the variance parameters; we recommend bootMer for this task (source: ?predict.merMod)

4.7 Logistic mixed effects models

We’ll use the vb13 dataset again of which we analysed a small portion in Section Ordinary logistic regression.

4.7.1 `glmer()`

We’ll just model the effect of MinLevGermanic as including a by-subject random slope for log.FreqGermanic causes either convergence errors or warnings about singular fits.

vb.glmer <- glmer(Correct ~ 1 + MinLevGermanic +
                    (1 + MinLevGermanic | Subject) +
                    (1 | Stimulus),
                  data = vb13, family = "binomial")
tidy(vb.glmer) # or: summary(vb.glmer)

The steps involved in computing the predicted probability of translation success are the same as before, but keep in mind that we disregard subject-specific and stimulus-specific variation. In other words, the plot shows the modelled probability of translation success for participants of average skill and for stimuli of average difficulty (for their MinLevGermanic value):

new_lev <- expand.grid(
  MinLevGermanic = seq(0, 1, 0.05)
)

# re.form = NA: disregard subject- and stimulus-specific variability;
# the output is in log-odds
new_lev$Prediction <- predict(vb.glmer, new_lev, type = "link", re.form = NA)

Compute the fitted values’ standard errors using the model matrix and the model’s variance-covariance matrix:

mm <- model.matrix(~ MinLevGermanic, new_lev)
vcov_vb.glmer <- vcov(vb.glmer)
new_lev$SE <- mm %*% vcov_vb.glmer %*% t(mm) %>% 
  diag() %>% 
  sqrt()

Multiply the standard errors to obtain confidence intervals, then transform the fitted values and the confidence intervals to the probability scale:

new_lev <- new_lev %>% 
  mutate(ci_lo = Prediction - 2*SE,
         ci_hi = Prediction + 2*SE,
         p_Prediction = plogis(Prediction),
         p_ci_lo = plogis(ci_lo),
         p_ci_hi = plogis(ci_hi))

ggplot(data = new_lev,
       aes(x = MinLevGermanic,
           y = p_Prediction,
           ymin = p_ci_lo,
           ymax = p_ci_hi)) +
  geom_ribbon(fill = "grey90") +
  geom_line()

4.7.2 Extra: `brm()`

To circumvent glmer()’s warnings and error messages, we could fit the data using brm(). But fitting a logistic mixed model on over 18,000 data points takes a lot of time. For this reason, I reduce the size of the dataset to 1,250 data points for the following illustration. Obviously you wouldn’t do this in your own analysis!

set.seed(123)
vb13_small <- vb13 %>%
  filter(Subject %in% sample(unique(vb13$Subject), 25, replace = FALSE)) %>%
  filter(Stimulus %in% sample(unique(vb13$Stimulus), 50, replace = FALSE))
vb13_small <- droplevels(vb13_small)

I added control = list(adapt_delta = 0.95) to deal with warnings about divergent transitions:

vb.brm <- brm(Correct ~ 1 + MinLevGermanic + log.FreqGermanic +
                (1 + MinLevGermanic + log.FreqGermanic | Subject) +
                (1 | Stimulus),
              data = vb13_small, family = "bernoulli",
              control = list(adapt_delta = 0.95))

## Compiling the C++ model

## Start sampling

summary(vb.brm)

##  Family: bernoulli 
##   Links: mu = logit 
## Formula: Correct ~ 1 + MinLevGermanic + log.FreqGermanic + (1 + MinLevGermanic + log.FreqGermanic | Subject) + (1 | Stimulus) 
##    Data: vb13_small (Number of observations: 1250) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~Stimulus (Number of levels: 50) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     1.62      0.21     1.25     2.10 1.00     1272     2199
## 
## ~Subject (Number of levels: 25) 
##                                      Estimate Est.Error l-95% CI u-95% CI Rhat
## sd(Intercept)                            1.35      0.35     0.69     2.10 1.00
## sd(MinLevGermanic)                       1.17      0.76     0.07     2.89 1.00
## sd(log.FreqGermanic)                     0.11      0.08     0.00     0.29 1.00
## cor(Intercept,MinLevGermanic)            0.09      0.45    -0.77     0.88 1.00
## cor(Intercept,log.FreqGermanic)          0.05      0.44    -0.78     0.86 1.00
## cor(MinLevGermanic,log.FreqGermanic)    -0.03      0.48    -0.87     0.84 1.00
##                                      Bulk_ESS Tail_ESS
## sd(Intercept)                            2206     2267
## sd(MinLevGermanic)                        955     1944
## sd(log.FreqGermanic)                     1304     1814
## cor(Intercept,MinLevGermanic)            4058     2678
## cor(Intercept,log.FreqGermanic)          4527     2745
## cor(MinLevGermanic,log.FreqGermanic)     1852     2842
## 
## Population-Level Effects: 
##                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept            2.03      0.82     0.44     3.66 1.00     1381     2405
## MinLevGermanic      -8.14      1.48   -11.02    -5.21 1.00     1361     1972
## log.FreqGermanic     0.28      0.16    -0.03     0.58 1.00     1457     1697
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

The most useful plot for researchers is probably the one showing the population average effect, i.e., the plot showing the estimated probability of translation success for a participant of average skill and for items varying in their MinLevGermanic value, having the sample mean log.FreqGermanic value and being of otherwise average difficulty:

new_lev <- expand.grid(
  MinLevGermanic = seq(0, 1, 0.05),
  log.FreqGermanic = mean(vb13_small$log.FreqGermanic)
)

# use "re_formula = NA" for population average effect
lev_population <- fitted(vb.brm, new_lev, re_formula = NA) %>% 
  as_tibble() %>%
  bind_cols(new_lev, .)
lev_population

The estimated and uncertainty intervals are already expressed in probabilities, so there’s no need to apply the logistic function:

ggplot(data = lev_population,
       aes(x = MinLevGermanic,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "grey90") +
  geom_line()

You can also produce similar plots for a specific participant in the dataset. Since the participants vary in their sensibility to the focal predictor (MinLevGermanic), I use re_formula = ~ (1 + MinLevGermanic + log.FreqGermanic | Subject) to incorporate these sources of by-participant variability in the estimates.

new_lev <- expand.grid(
  MinLevGermanic = seq(0, 1, 0.05),
  log.FreqGermanic = mean(vb13_small$log.FreqGermanic),
  Subject = "CB3"
)

lev_CB3 <- fitted(vb.brm, new_lev, 
                  re_formula = ~ (1 + MinLevGermanic + log.FreqGermanic | Subject)) %>% 
  as_tibble() %>%
  bind_cols(new_lev, .)

ggplot(data = lev_CB3,
       aes(x = MinLevGermanic,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "grey90") +
  geom_line()

We can also ask the model to predict where the effect curve of a new, unknown participant might end up:

new_lev <- expand.grid(
  MinLevGermanic = seq(0, 1, 0.05),
  log.FreqGermanic = mean(vb13_small$log.FreqGermanic),
  Subject = "new participant"
)

# "allow_new_levels = TRUE", because unknown participant
lev_new <- fitted(vb.brm, new_lev, 
                  re_formula = ~ (1 + MinLevGermanic + log.FreqGermanic | Subject),
                  allow_new_levels = TRUE) %>% 
  as_tibble() %>%
  bind_cols(new_lev, .)

Clearly, the uncertainty about where an unknown participant’s effect curve lies is enormous.

ggplot(data = lev_new,
       aes(x = MinLevGermanic,
           y = Estimate,
           ymin = Q2.5,
           ymax = Q97.5)) +
  geom_ribbon(fill = "grey90") +
  geom_line()

5 Caveats

5.1 Other things may not be equal

When you have several predictors in one model, the effect plots are ceteris paribus (other things equal) plots. However, it may be rare or indeed impossible for one predictor to vary while the other predictors remain constant. For instance, you could use the number of tokens, the number of types and the ratio of the number of types and tokens in a text as the predictors for that text’s perceived quality. While it is possible mathematically to have such a model output the ‘effect’ of the number of tokens while keeping the number of types and the type-token ratio constant, it is impossible in the actual world for two texts to have a different number of tokens, but both the same number of types and the same type-token ratio: If you change the number of tokens but keep the number of types constant, then the type-token ratio will also vary.

See Collinearity isn’t a disease that needs curing.

5.2 Your model may be misspecified

A statistical model can be highly confident about the answer to a question without realising that the question doesn’t make any sense. A case in point would be the following model:

ggplot(data = vb,
       aes(x = Age,
           y = TotCorrect)) +
  geom_smooth(method = "lm") +
  geom_point(shape = 1)

## `geom_smooth()` using formula 'y ~ x'

A model that fits TotCorrect as a linear function of Age is highly confident that this relationship is essentially flat. And this is the correct answer. Unfortunately, the model doesn’t realise that the question is ill-posed: There clearly is a relationship between TotCorrect and Age – it just isn’t linear.

See Before worrying about model assumptions, think about model relevance.

Another relevant caveat is that your model may not account for all major sources of variability and hence uncertainty. For instance, if you analyse the data from a cluster-randomised experiment without taking the clusters into account, you’re bound to underestimate your uncertainty. But that’s not the model’s fault.

5.3 Other models may yield different pictures

Different statistical models may perform roughly equally well in terms of fit to the data yet may paint quite different pictures about the relationships between the predictors and the outcomes. See Breiman’s (2001) Rashomon effect.

Moreover, we typically fit several models to the data and then (formally or informally) select one of these for presentation. Almost invariably, the fact that one out of many models was selected for presentation is not taken into account in the uncertainty intervals around the model’s estimates and predictions. This is entirely understandable as it is hard to take this additional uncertainty into account. (I would hardly know where to begin myself.) Some references that may be useful are McElreath (2016, Section 6.5, on Bayesian model averaging), Efron (2014, which I don’t think I’ve fully understood yet), and Heinze et al. (2018, but see le Cessie et al. 2019).

6 References

Including references in talk.

Berthele, Raphael. 2012. The influence of code-mixing and speaker information on perception and assessment of foreign language proficiency: An experimental study. International Journal of Bilingualism 16(4). 453-466.

Breiman, Leo. 2001. Statistical modeling: The two cultures. Statistical Science 16(3). 199-231.

DeKeyser, Robert, Iris Alfi-Shabtay and Dorit Ravid. 2010. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics 31(3). 413-438.

Efron, Bradley. 2014. Estimation and accuracy after model selection. Journal of the American Statistical Association 109(507). 991-1007.

Fox, John. 2003. Effect displays in R for generalised linear models. Journal of Statistical Software 8. 1-27.

Fox, John and Sanford Weisberg. 2018. Visualizing fit and lack of fit in complex regression models with predict effect plots and partial residuals. Journal of Statistical Software 87(9).

Goldstein, Alex, Adam Kapelner, Justin Bleich and Emil Pitkin. 2015. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 24(1). 44-65.

Greenwell, Brandon M. 2017. pdp: An R package for constructing partial dependence plots. R Journal 9(1). 421-436.

Heinze, Georg, Christine Wallisch and Daniele Dunkler. 2018. Variable selection - A review and recommendations for the practicing statistician. Biometrical Journal 60. 431-449.

le Cessie, Saskia, Kim Luijken and Els Goetghebeur. 2019. Regarding “Variable selection - A review and recommendations for the practicing statistician” by G. Heinze, C. Wallisch, and D. Dunkler. Biometrical Journal.

McElreath, Richard. 2016. Statistical rethinking: A Bayesian course with examples in R and Stan. Boca Raton, FL: CRC Press.

Vanhove, Jan. 2013. The critical period hypothesis in second language acquisition: A statistical critique and a reanalysis. PLOS ONE 8(7). e69172.

Vanhove, Jan and Raphael Berthele. 2013. Factoren bij het herkennen van cognaten in onbekende talen: algemeen of taalspecifiek?. Taal en Tongval 65. 171-210.

Vanhove, Jan and Raphael Berthele. 2015. The lifespan development of cognate guessing skills in an unknown related language. International Review of Applied Linguistics in Language Teaching 53(1). 1-38.

Weisberg, Sanford. 2005. Applied linear regression (3rd edn). Hoboken, NJ: Wiley.

7 Session info

devtools::session_info()

## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.0.0 (2020-04-24)
##  os       Ubuntu 18.04.4 LTS          
##  system   x86_64, linux-gnu           
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  ctype    en_US.UTF-8                 
##  tz       Europe/Brussels             
##  date     2020-06-29                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package        * version  date       lib source        
##  abind            1.4-5    2016-07-21 [1] CRAN (R 4.0.0)
##  assertthat       0.2.1    2019-03-21 [1] CRAN (R 4.0.0)
##  backports        1.1.8    2020-06-17 [1] CRAN (R 4.0.0)
##  base64enc        0.1-3    2015-07-28 [1] CRAN (R 4.0.0)
##  bayesplot        1.7.2    2020-05-28 [1] CRAN (R 4.0.0)
##  blob             1.2.1    2020-01-20 [1] CRAN (R 4.0.0)
##  boot             1.3-25   2020-04-26 [4] CRAN (R 4.0.0)
##  bridgesampling   1.0-0    2020-02-26 [1] CRAN (R 4.0.0)
##  brms           * 2.13.0   2020-05-27 [1] CRAN (R 4.0.0)
##  Brobdingnag      1.2-6    2018-08-13 [1] CRAN (R 4.0.0)
##  broom          * 0.5.6    2020-04-20 [1] CRAN (R 4.0.0)
##  callr            3.4.3    2020-03-28 [1] CRAN (R 4.0.0)
##  carData        * 3.0-4    2020-05-22 [1] CRAN (R 4.0.0)
##  cellranger       1.1.0    2016-07-27 [1] CRAN (R 4.0.0)
##  cli              2.0.2    2020-02-28 [1] CRAN (R 4.0.0)
##  coda             0.19-3   2019-07-05 [1] CRAN (R 4.0.0)
##  colorspace       1.4-1    2019-03-18 [1] CRAN (R 4.0.0)
##  colourpicker     1.0      2017-09-27 [1] CRAN (R 4.0.0)
##  crayon           1.3.4    2017-09-16 [1] CRAN (R 4.0.0)
##  crosstalk        1.1.0.1  2020-03-13 [1] CRAN (R 4.0.0)
##  curl             4.3      2019-12-02 [1] CRAN (R 4.0.0)
##  DBI              1.1.0    2019-12-15 [1] CRAN (R 4.0.0)
##  dbplyr           1.4.4    2020-05-27 [1] CRAN (R 4.0.0)
##  desc             1.2.0    2018-05-01 [1] CRAN (R 4.0.0)
##  devtools         2.3.0    2020-04-10 [1] CRAN (R 4.0.0)
##  digest           0.6.25   2020-02-23 [1] CRAN (R 4.0.0)
##  dplyr          * 1.0.0    2020-05-29 [1] CRAN (R 4.0.0)
##  DT               0.13     2020-03-23 [1] CRAN (R 4.0.0)
##  dygraphs         1.1.1.6  2018-07-11 [1] CRAN (R 4.0.0)
##  effects        * 4.1-4    2019-11-15 [1] CRAN (R 4.0.0)
##  ellipsis         0.3.1    2020-05-15 [1] CRAN (R 4.0.0)
##  estimability     1.3      2018-02-11 [1] CRAN (R 4.0.0)
##  evaluate         0.14     2019-05-28 [1] CRAN (R 4.0.0)
##  fansi            0.4.1    2020-01-08 [1] CRAN (R 4.0.0)
##  farver           2.0.3    2020-01-16 [1] CRAN (R 4.0.0)
##  fastmap          1.0.1    2019-10-08 [1] CRAN (R 4.0.0)
##  forcats        * 0.5.0    2020-03-01 [1] CRAN (R 4.0.0)
##  fs               1.4.1    2020-04-04 [1] CRAN (R 4.0.0)
##  generics         0.0.2    2018-11-29 [1] CRAN (R 4.0.0)
##  ggplot2        * 3.3.2    2020-06-19 [1] CRAN (R 4.0.0)
##  ggridges         0.5.2    2020-01-12 [1] CRAN (R 4.0.0)
##  glue             1.4.1    2020-05-13 [1] CRAN (R 4.0.0)
##  gridExtra        2.3      2017-09-09 [1] CRAN (R 4.0.0)
##  gtable           0.3.0    2019-03-25 [1] CRAN (R 4.0.0)
##  gtools           3.8.2    2020-03-31 [1] CRAN (R 4.0.0)
##  haven            2.3.0    2020-05-24 [1] CRAN (R 4.0.0)
##  hms              0.5.3    2020-01-08 [1] CRAN (R 4.0.0)
##  htmltools        0.5.0    2020-06-16 [1] CRAN (R 4.0.0)
##  htmlwidgets      1.5.1    2019-10-08 [1] CRAN (R 4.0.0)
##  httpuv           1.5.4    2020-06-06 [1] CRAN (R 4.0.0)
##  httr             1.4.1    2019-08-05 [1] CRAN (R 4.0.0)
##  igraph           1.2.5    2020-03-19 [1] CRAN (R 4.0.0)
##  inline           0.3.15   2018-05-18 [1] CRAN (R 4.0.0)
##  jsonlite         1.6.1    2020-02-02 [1] CRAN (R 4.0.0)
##  knitr            1.28     2020-02-06 [1] CRAN (R 4.0.0)
##  labeling         0.3      2014-08-23 [1] CRAN (R 4.0.0)
##  later            1.0.0    2019-10-04 [1] CRAN (R 4.0.0)
##  lattice          0.20-41  2020-04-02 [4] CRAN (R 4.0.0)
##  lifecycle        0.2.0    2020-03-06 [1] CRAN (R 4.0.0)
##  lme4           * 1.1-23   2020-04-07 [1] CRAN (R 4.0.0)
##  loo              2.2.0    2019-12-19 [1] CRAN (R 4.0.0)
##  lubridate        1.7.8    2020-04-06 [1] CRAN (R 4.0.0)
##  magrittr         1.5      2014-11-22 [1] CRAN (R 4.0.0)
##  markdown         1.1      2019-08-07 [1] CRAN (R 4.0.0)
##  MASS             7.3-51.6 2020-04-26 [4] CRAN (R 4.0.0)
##  Matrix         * 1.2-18   2019-11-27 [4] CRAN (R 4.0.0)
##  matrixStats      0.56.0   2020-03-13 [1] CRAN (R 4.0.0)
##  memoise          1.1.0    2017-04-21 [1] CRAN (R 4.0.0)
##  mgcv             1.8-31   2019-11-09 [4] CRAN (R 4.0.0)
##  mime             0.9      2020-02-04 [1] CRAN (R 4.0.0)
##  miniUI           0.1.1.1  2018-05-18 [1] CRAN (R 4.0.0)
##  minqa            1.2.4    2014-10-09 [1] CRAN (R 4.0.0)
##  mitools          2.4      2019-04-26 [1] CRAN (R 4.0.0)
##  modelr           0.1.8    2020-05-19 [1] CRAN (R 4.0.0)
##  munsell          0.5.0    2018-06-12 [1] CRAN (R 4.0.0)
##  mvtnorm          1.1-1    2020-06-09 [1] CRAN (R 4.0.0)
##  nlme             3.1-147  2020-04-13 [4] CRAN (R 4.0.0)
##  nloptr           1.2.2.1  2020-03-11 [1] CRAN (R 4.0.0)
##  nnet             7.3-14   2020-04-26 [4] CRAN (R 4.0.0)
##  pillar           1.4.4    2020-05-05 [1] CRAN (R 4.0.0)
##  pkgbuild         1.0.8    2020-05-07 [1] CRAN (R 4.0.0)
##  pkgconfig        2.0.3    2019-09-22 [1] CRAN (R 4.0.0)
##  pkgload          1.1.0    2020-05-29 [1] CRAN (R 4.0.0)
##  plyr             1.8.6    2020-03-03 [1] CRAN (R 4.0.0)
##  prettyunits      1.1.1    2020-01-24 [1] CRAN (R 4.0.0)
##  processx         3.4.2    2020-02-09 [1] CRAN (R 4.0.0)
##  promises         1.1.0    2019-10-04 [1] CRAN (R 4.0.0)
##  ps               1.3.3    2020-05-08 [1] CRAN (R 4.0.0)
##  purrr          * 0.3.4    2020-04-17 [1] CRAN (R 4.0.0)
##  R6               2.4.1    2019-11-12 [1] CRAN (R 4.0.0)
##  Rcpp           * 1.0.4.6  2020-04-09 [1] CRAN (R 4.0.0)
##  RcppParallel     5.0.1    2020-05-06 [1] CRAN (R 4.0.0)
##  readr          * 1.3.1    2018-12-21 [1] CRAN (R 4.0.0)
##  readxl           1.3.1    2019-03-13 [1] CRAN (R 4.0.0)
##  remotes          2.1.1    2020-02-15 [1] CRAN (R 4.0.0)
##  reprex           0.3.0    2019-05-16 [1] CRAN (R 4.0.0)
##  reshape2         1.4.4    2020-04-09 [1] CRAN (R 4.0.0)
##  rlang            0.4.6    2020-05-02 [1] CRAN (R 4.0.0)
##  rmarkdown        2.2      2020-05-31 [1] CRAN (R 4.0.0)
##  rprojroot        1.3-2    2018-01-03 [1] CRAN (R 4.0.0)
##  rsconnect        0.8.16   2019-12-13 [1] CRAN (R 4.0.0)
##  rstan            2.19.3   2020-02-11 [1] CRAN (R 4.0.0)
##  rstantools       2.0.0    2019-09-15 [1] CRAN (R 4.0.0)
##  rstudioapi       0.11     2020-02-07 [1] CRAN (R 4.0.0)
##  rvest            0.3.5    2019-11-08 [1] CRAN (R 4.0.0)
##  scales           1.1.1    2020-05-11 [1] CRAN (R 4.0.0)
##  sessioninfo      1.1.1    2018-11-05 [1] CRAN (R 4.0.0)
##  shiny            1.5.0    2020-06-23 [1] CRAN (R 4.0.0)
##  shinyjs          1.1      2020-01-13 [1] CRAN (R 4.0.0)
##  shinystan        2.5.0    2018-05-01 [1] CRAN (R 4.0.0)
##  shinythemes      1.1.2    2018-11-06 [1] CRAN (R 4.0.0)
##  StanHeaders      2.21.0-5 2020-06-09 [1] CRAN (R 4.0.0)
##  statmod          1.4.34   2020-02-17 [1] CRAN (R 4.0.0)
##  stringi          1.4.6    2020-02-17 [1] CRAN (R 4.0.0)
##  stringr        * 1.4.0    2019-02-10 [1] CRAN (R 4.0.0)
##  survey           4.0      2020-04-03 [1] CRAN (R 4.0.0)
##  survival         3.1-12   2020-04-10 [4] CRAN (R 4.0.0)
##  testthat         2.3.2    2020-03-02 [1] CRAN (R 4.0.0)
##  threejs          0.3.3    2020-01-21 [1] CRAN (R 4.0.0)
##  tibble         * 3.0.1    2020-04-20 [1] CRAN (R 4.0.0)
##  tidyr          * 1.1.0    2020-05-20 [1] CRAN (R 4.0.0)
##  tidyselect       1.1.0    2020-05-11 [1] CRAN (R 4.0.0)
##  tidyverse      * 1.3.0    2019-11-21 [1] CRAN (R 4.0.0)
##  usethis          1.6.1    2020-04-29 [1] CRAN (R 4.0.0)
##  vctrs            0.3.1    2020-06-05 [1] CRAN (R 4.0.0)
##  withr            2.2.0    2020-04-20 [1] CRAN (R 4.0.0)
##  xfun             0.14     2020-05-20 [1] CRAN (R 4.0.0)
##  xml2             1.3.2    2020-04-23 [1] CRAN (R 4.0.0)
##  xtable           1.8-4    2019-04-21 [1] CRAN (R 4.0.0)
##  xts              0.12-0   2020-01-19 [1] CRAN (R 4.0.0)
##  yaml             2.2.1    2020-02-01 [1] CRAN (R 4.0.0)
##  zoo              1.8-8    2020-05-02 [1] CRAN (R 4.0.0)
## 
## [1] /home/jan/R/x86_64-pc-linux-gnu-library/4.0
## [2] /usr/local/lib/R/site-library
## [3] /usr/lib/R/site-library
## [4] /usr/lib/R/library

Visualising statistical uncertainty using model-based graphs

Jan Vanhove (jan.vanhove@unifr.ch, https://janhove.github.io)

BICLCE 2019, How to (:) Open Linguistics; 26–28 September 2019 [page updated on 29 June 2020]