Missing Data Mechanisms
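Three types of missingness (Rubin, 1976), where \(R\) is the missingness indicator and \(Y_{obs}\), \(Y_{mis}\) are the observed and missing parts of the data: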
MCAR: Missing Completely at Random
Definition: Probability of missingness is the same for all observations; it depends on neither the observed nor the missing values.
Mathematically: \(P(R | Y_{obs}, Y_{mis}) = P(R)\)
Example: Data lost due to random computer error.
For MICE: Complete-case analysis is unbiased under MCAR, but MICE can improve efficiency by using the information in partially observed cases.
MAR: Missing at Random
Definition: Probability of missingness depends on observed data but not on the missing values themselves.
Mathematically: \(P(R | Y_{obs}, Y_{mis}) = P(R | Y_{obs})\)
Example: Younger people are less likely to report income (age is observed, income is missing).
For MICE: This is the key assumption. MICE produces valid results under MAR, provided the imputation models are adequately specified.
MNAR: Missing Not at Random
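Definition: Probability of missingness depends on the unobserved (missing) values, even after conditioning on the observed data.
Mathematically: \(P(R | Y_{obs}, Y_{mis})\) depends on \(Y_{mis}\) and cannot be reduced to \(P(R | Y_{obs})\)
Example: People with higher incomes are less likely to report income.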
For MICE: MICE may produce biased results under MNAR. Consider sensitivity analyses.
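The three mechanisms can be made concrete with a small simulation. The sketch below is illustrative only: the population (income rising with age), the coefficients, and the missingness probabilities are all assumptions chosen for the demonstration, not values from any real dataset. It deletes income under each mechanism and compares the complete-case mean with the true mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population: income rises with age (all numbers are illustrative).
age = rng.uniform(20, 70, size=n)
income = 20_000 + 800 * age + rng.normal(0, 10_000, size=n)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Probability that income is missing under each mechanism.
p_mcar = np.full(n, 0.3)                              # constant for everyone
p_mar = sigmoid(-(age - 45) / 10)                     # depends on observed age only
p_mnar = sigmoid((income - income.mean()) / 10_000)   # depends on income itself


def complete_case_mean(p_missing):
    """Delete income with the given probabilities and average what remains."""
    missing = rng.random(n) < p_missing
    return income[~missing].mean()


true_mean = income.mean()
cc_mcar = complete_case_mean(p_mcar)
cc_mar = complete_case_mean(p_mar)
cc_mnar = complete_case_mean(p_mnar)

print(f"true mean:          {true_mean:10.0f}")
print(f"complete-case MCAR: {cc_mcar:10.0f}")
print(f"complete-case MAR:  {cc_mar:10.0f}")
print(f"complete-case MNAR: {cc_mnar:10.0f}")
```

In this setup the MCAR complete-case mean should sit close to the true mean, while the MAR and MNAR versions drift away from it (upward and downward, respectively). The difference between the last two is that the MAR bias can be corrected by conditioning on age, which is observed; the MNAR bias cannot be corrected from the observed data alone.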
Practical Implications
Making MAR plausible: Include in the imputation model variables that:
Predict missingness
Correlate with incomplete variables
Help explain why data is missing
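The effect of including such a variable can be sketched with a single stochastic regression imputation. This is a deliberately simplified stand-in, not MICE itself (which cycles over several incomplete variables and produces multiple imputed datasets). The data, coefficients, and the choice of age as the auxiliary variable are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical data: income depends on age (illustrative numbers only).
age = rng.uniform(20, 70, size=n)
income = 20_000 + 800 * age + rng.normal(0, 10_000, size=n)

# MAR: younger respondents are more likely to have income missing.
missing = rng.random(n) < 1.0 / (1.0 + np.exp((age - 45) / 10))
observed = ~missing

# Check 1: does age predict missingness? Compare mean age in the two groups.
age_gap = age[observed].mean() - age[missing].mean()

# Fit income ~ age on the complete cases using ordinary least squares.
X_obs = np.column_stack([np.ones(observed.sum()), age[observed]])
coef, *_ = np.linalg.lstsq(X_obs, income[observed], rcond=None)
resid_sd = np.std(income[observed] - X_obs @ coef, ddof=2)

# Impute missing incomes from the fitted line plus residual noise.
X_mis = np.column_stack([np.ones(missing.sum()), age[missing]])
imputed = income.copy()
imputed[missing] = X_mis @ coef + rng.normal(0, resid_sd, size=missing.sum())

true_mean = income.mean()
naive_mean = income[observed].mean()   # ignores age entirely
imputed_mean = imputed.mean()          # uses age in the imputation model

print(f"mean age gap (observed - missing): {age_gap:.1f} years")
print(f"true mean income:      {true_mean:10.0f}")
print(f"complete-case mean:    {naive_mean:10.0f}")
print(f"mean after imputation: {imputed_mean:10.0f}")
```

Because age both predicts missingness and correlates with income here, leaving it out leaves the complete-case mean biased, while imputing with it should bring the estimate back near the true mean. A single imputation like this understates uncertainty, which is one reason multiple imputation is used in practice.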
See Understanding Missing Data for practical guidance.