Chapter 25 Mixed models

Mixed models are used when there is repetitions in the response due to (here) the person conducting the trial. The two days are repetitions, and hence we can use all the data (not splitting in to days), but need to account for the person in the model.

# subset the data
x <- pasta %>% 
  filter(str_detect(StationName,'mush')) 

mdl <- lmer(data = x, Consumption~I_like_taste_of_pasta_with_mushrooms + Day +  (1|Person)) 
summary(mdl)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Consumption ~ I_like_taste_of_pasta_with_mushrooms + Day + (1 |  
##     Person)
##    Data: x
## 
## REML criterion at convergence: 319.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.2986 -0.6573 -0.1820  0.4409  2.0326 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Person   (Intercept) 1325     36.40   
##  Residual             4871     69.79   
## Number of obs: 30, groups:  Person, 15
## 
## Fixed effects:
##                                      Estimate Std. Error      df t value
## (Intercept)                           -56.368    142.864  25.000  -0.395
## I_like_taste_of_pasta_with_mushrooms   30.552     21.146  23.728   1.445
## Day                                    -4.097     25.523  13.943  -0.161
##                                      Pr(>|t|)
## (Intercept)                             0.697
## I_like_taste_of_pasta_with_mushrooms    0.162
## Day                                     0.875
## 
## Correlation of Fixed Effects:
##             (Intr) I_____
## I_lk_ts____ -0.957       
## Day         -0.320  0.055

This is the joined effect between the two days. Think of an average of the two slopes - one for each day -. Here taking into account that each person has provided two responses of the consumption of pasta with mushrooms.

This can also be accomplished using the tidyverse setup engined by the broom.mixed package.

In principle, we simply do not loop over Day, but include it in the formula along with person.

tbmixed <- pastalong %>% 
  filter(!is.na(StationName )) %>% 
  group_by(StationName,question) %>% 
  do(lmer(data = ., Consumption~answnum + Day + (1|Person)) %>% tidy(conf.int = T))

The output here is a bit different than the lm() model. But it is still the slope of answnum which carries the interesting stuff.

tbmixed %>% 
  filter(term=='answnum') %>% 
  dplyr::select(-effect,-group) %>% 
kable(x = .,caption = 'All mixed linear models', digits = 2, format = 'simple')

Table 25.1: All mixed linear models
StationName	question	term	estimate	std.error	statistic	df	p.value	conf.low	conf.high
Pasta with legumes	I_like_taste_of_pasta_with_legumes	answnum	37.17	11.37	3.27	19.75	0.00	13.43	60.90
Pasta with legumes	I_like_taste_of_pasta_with_mushrooms	answnum	30.95	23.93	1.29	26.91	0.21	-18.16	80.06
Pasta with legumes	Pasta_with_legumes_is_visually_appealing	answnum	13.60	13.12	1.04	15.85	0.32	-14.23	41.42
Pasta with legumes	Pasta_with_mushrooms_is_visually_appealing	answnum	-7.68	19.63	-0.39	24.65	0.70	-48.14	32.77
Pasta with mushroom	I_like_taste_of_pasta_with_legumes	answnum	2.08	11.29	0.18	16.79	0.86	-21.76	25.92
Pasta with mushroom	I_like_taste_of_pasta_with_mushrooms	answnum	30.55	21.15	1.44	23.73	0.16	-13.12	74.22
Pasta with mushroom	Pasta_with_legumes_is_visually_appealing	answnum	10.26	11.21	0.91	15.70	0.37	-13.55	34.07
Pasta with mushroom	Pasta_with_mushrooms_is_visually_appealing	answnum	8.16	16.73	0.49	20.13	0.63	-26.72	43.03

tbmixed %>% 
  filter(term=='answnum') %>% 
  ggplot(data = ., aes(question,estimate,ymin = conf.low, ymax = conf.high)) + 
  geom_errorbar(width = 0.1) +geom_point()+ 
  geom_hline(yintercept = 0) + 
  coord_flip() +facet_grid(~StationName) + 
  theme(legend.position = 'bottom')

Do the associations match as expected?

25.1 With several variables

We can add several predictors to the model, here that could several likert-scale questions, and maybe demographics with the consumption as response. This is in principle the same for both linear models and linear mixed models.

x <- pasta %>% 
  filter(str_detect(StationName,'mush'))

mdl <- lmer(data = x, Consumption~I_like_taste_of_pasta_with_mushrooms + 
       Pasta_with_mushrooms_is_visually_appealing + Day + (1|Person))

summary(mdl)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: 
## Consumption ~ I_like_taste_of_pasta_with_mushrooms + Pasta_with_mushrooms_is_visually_appealing +  
##     Day + (1 | Person)
##    Data: x
## 
## REML criterion at convergence: 311.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.1024 -0.6475 -0.2450  0.4295  2.0689 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Person   (Intercept) 1168     34.18   
##  Residual             5144     71.73   
## Number of obs: 30, groups:  Person, 15
## 
## Fixed effects:
##                                            Estimate Std. Error      df t value
## (Intercept)                                 -41.738    146.340  22.135  -0.285
## I_like_taste_of_pasta_with_mushrooms         41.162     27.855  25.781   1.478
## Pasta_with_mushrooms_is_visually_appealing  -12.242     20.899  23.494  -0.586
## Day                                          -8.286     27.189  14.743  -0.305
##                                            Pr(>|t|)
## (Intercept)                                   0.778
## I_like_taste_of_pasta_with_mushrooms          0.152
## Pasta_with_mushrooms_is_visually_appealing    0.564
## Day                                           0.765
## 
## Correlation of Fixed Effects:
##             (Intr) I_____ P_____
## I_lk_ts____ -0.605              
## Pst_wth____ -0.177 -0.645       
## Day         -0.354 -0.130  0.263

mdl %>% tidy(conf.int = T)

## # A tibble: 6 × 10
##   effect   group    term     estimate std.error statistic    df p.value conf.low
##   <chr>    <chr>    <chr>       <dbl>     <dbl>     <dbl> <dbl>   <dbl>    <dbl>
## 1 fixed    <NA>     (Interc…   -41.7      146.     -0.285  22.1   0.778   -345. 
## 2 fixed    <NA>     I_like_…    41.2       27.9     1.48   25.8   0.152    -16.1
## 3 fixed    <NA>     Pasta_w…   -12.2       20.9    -0.586  23.5   0.564    -55.4
## 4 fixed    <NA>     Day         -8.29      27.2    -0.305  14.7   0.765    -66.3
## 5 ran_pars Person   sd__(In…    34.2       NA      NA      NA    NA         NA  
## 6 ran_pars Residual sd__Obs…    71.7       NA      NA      NA    NA         NA  
## # ℹ 1 more variable: conf.high <dbl>

Try to interpret the slopes? Are the slopes significantly different from 0 (i.e. the point of no association). .. And hey! Why is the slope for visual all of a sudden negative?… Does that mean that consumption increase the less you like the visual appearance? .. Or what?

Complete the same analysis with legumes.