Just when you think you understand most of what there is to understand about Mixed Effects Models (and you feel all smug about it), you encounter something that challenges – don't mind me being dramatic here – everything you thought you ever knew about them.

The Lay of the Land

In our little astronaut study, we collected data on perceived traveled distance from 12 astronauts in 5 test sessions (two of them onboard the International Space Station); in three of those sessions we tested them in two different postures (Sitting and Lying). In each test session and posture, we tested participants three times at each of 10 distances. Pretty complex data, but y'know, nothing a little mixed effects model can't handle.

My initial analysis

Our hypotheses really relate only to the posture and the exposure to microgravity, so I set up the model in the following way:

# Model1 <- lmer(Ratio ~ PointInTime*Posture2 + (PointInTime + Posture2 + as.factor(Distance_Real) | id),
#                data = Task2_Astronauts %>%
#                  mutate(Posture2 = case_when(
#                    Posture == "Sitting" ~ "1Sitting",
#                    TRUE ~ Posture)),
#                REML = FALSE,
#                control = lmerControl(optimizer = "bobyqa",
#                                      optCtrl = list(maxfun = 2e5)))
load("Model1.RData")

The distance itself didn't really matter to us (given our hypotheses), but I still wanted to allocate as much variability to this source as warranted, so I added random slopes for the presented distance (Distance_Real) per participant (id) – but no fixed effect for it.
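If you want to sanity-check how much between-participant variability those distance slopes actually soak up, lme4's VarCorr() prints the estimated random-effect standard deviations and correlations – a quick look along these lines, assuming Model1 is loaded as above:

# Estimated random-effect SDs/correlations; the as.factor(Distance_Real) rows
# show how much between-participant variability the distance slopes absorb.
VarCorr(Model1)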

The Statistical Weirdness (?!)

Now in principle, if you have random slopes for a variable per some grouping variable (in our case the participant), and you also have that variable as a fixed effect (like Posture2 and PointInTime), the random slopes (one per participant) should average out to roughly zero: the fixed effect captures the group-wide effect, while the random slopes capture how much each participant deviates from that average. This is, to some extent, true if we look at the random slopes for Posture2 per id:

Ranefs_Model1 = ranef(Model1)
Ranefs_Model1$id$Posture2Lying
##  [1] -0.005241311 -0.024258265  0.080092023 -0.091645785  0.048171712
##  [6]  0.019932352  0.024018573 -0.043306412  0.084845389 -0.032249824
## [11] -0.041496352  0.009501675
mean(Ranefs_Model1$id$Posture2Lying)
## [1] 0.002363648

However, when we look at the same thing for the difference between BDC 1 and, for example, BDC 2, we get a very different picture:

Ranefs_Model1 = ranef(Model1)
Ranefs_Model1$id$`PointInTimeSession 4 (BDC 2)`
##  [1] -0.11186842 -0.19108941 -0.13270094 -0.61690659 -0.31983742 -0.19835978
##  [7] -0.45556264  1.13616781 -0.05930019 -0.31883011 -0.07153968 -0.06103040
mean(Ranefs_Model1$id$`PointInTimeSession 4 (BDC 2)`)
## [1] -0.1167381

Here we see that the mean random slope for this difference contrast is about -0.12 across all 12 astronauts. Compare this to the corresponding fixed effect estimate of +0.14 for the same contrast:

summary(Model1)$coef["PointInTimeSession 4 (BDC 2)",]
##    Estimate  Std. Error          df     t value    Pr(>|t|) 
##  0.14498685  0.06541121 19.51771592  2.21654432  0.03871216

… which is, stupidly, significantly different from 0 (as per the Satterthwaite-approximated p values from lmerTest). So the model basically invents a significant difference by making the random effects, on average, too low and then compensating for that by raising the fixed effect estimate. This of course doesn't affect the model fit: if we subtract 1000 from each of the relevant random effects but then ADD 1000 to the fixed effect, the overall fit stays the same. But of course that's, uh, bullshit, because this difference is full-on hallucinated.
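You can see the trade-off directly by adding the mean of the random slopes back onto the fixed effect – a little arithmetic along these lines, assuming Model1 is loaded; the result lands very close to the estimate Model2 gives below:

# Fixed effect plus the mean of the corresponding random slopes:
# roughly 0.145 + (-0.117) = 0.028, i.e. what the contrast looks like
# once the offset hiding in the random effects is taken into account.
fixef(Model1)["PointInTimeSession 4 (BDC 2)"] +
  mean(ranef(Model1)$id$`PointInTimeSession 4 (BDC 2)`)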

What causes this???

I was super confused as to where this discrepancy was coming from, but we tried a couple of things (thanks to Rob Allison) and it turns out that, when we add Distance_Real as a fixed effect, everything is as it should be.

# Model2 <- lmer(Ratio ~ PointInTime*Posture2 + as.factor(Distance_Real) + (PointInTime + Posture2 + as.factor(Distance_Real) | id),
#                data = Task2_Astronauts %>%
#                  mutate(Posture2 = case_when(
#                    Posture == "Sitting" ~ "1Sitting",
#                    TRUE ~ Posture)),
#                REML = FALSE,
#                control = lmerControl(optimizer = "bobyqa",
#                                      optCtrl = list(maxfun = 2e5)))
load("Model2.RData")

Posture is fine:

Ranefs_Model2 = ranef(Model2)
Ranefs_Model2$id$Posture2Lying
##  [1] -0.007761663 -0.026212996  0.078095331 -0.094188657  0.046037046
##  [6]  0.017634526  0.021538760 -0.045780320  0.082209019 -0.034956477
## [11] -0.043927023  0.007312455
mean(Ranefs_Model2$id$Posture2Lying)
## [1] -6.066848e-14

And the comparison between test sessions as well:

Ranefs_Model2 = ranef(Model2)
Ranefs_Model2$id$`PointInTimeSession 4 (BDC 2)`
##  [1]  0.004311661 -0.073300687 -0.015951370 -0.500705246 -0.202416094
##  [6] -0.081062358 -0.338656024  1.252857484  0.056565407 -0.202826130
## [11]  0.045486487  0.055696870
mean(Ranefs_Model2$id$`PointInTimeSession 4 (BDC 2)`)
## [1] 9.127754e-13
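If you don't feel like checking each column separately, taking column means over the whole per-participant random-effects data frame does the same check in one go – a quick sketch, assuming Model2 is loaded as above:

# Column means of the conditional modes (one column per random-effect term);
# if everything is as it should be, these should all be numerically ~0.
colMeans(ranef(Model2)$id)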

Accordingly, the fixed effect estimate is much lower here (+0.03; notably, pretty much the sum of Model1's fixed effect and the mean of its random slopes) and no longer spuriously significant:

summary(Model2)$coef["PointInTimeSession 4 (BDC 2)",]
##   Estimate Std. Error         df    t value   Pr(>|t|) 
##  0.0282997  0.1199541 12.1321814  0.2359210  0.8174297
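If you're curious whether letting Distance_Real into the fixed part also changes the overall fit, both models were fit with REML = FALSE, so a likelihood-ratio comparison is straightforward – just a suggestion, the output isn't shown here:

# Model1's fixed effects are nested within Model2's (same random effects),
# so this gives a likelihood-ratio test for the as.factor(Distance_Real) terms.
anova(Model1, Model2)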

(The same fix works, by the way, if we instead remove the random slopes for Distance_Real per participant entirely.)
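For completeness, that stripped-down variant would look something like this – an untested sketch, and Model3 is just a name I'm using here:

# Model1 without the as.factor(Distance_Real) random slopes
# (same data wrangling, same optimizer settings as above):
Model3 <- lmer(Ratio ~ PointInTime*Posture2 + (PointInTime + Posture2 | id),
               data = Task2_Astronauts %>%
                 mutate(Posture2 = case_when(
                   Posture == "Sitting" ~ "1Sitting",
                   TRUE ~ Posture)),
               REML = FALSE,
               control = lmerControl(optimizer = "bobyqa",
                                     optCtrl = list(maxfun = 2e5)))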

The morals of this story are…