Method Details

This page gives a brief technical overview of the five imputation methods available in mice-py.

PMM: Predictive Mean Matching

Algorithm:

Key feature: Imputed values are always taken from observed values, which rules out impossible values (for example, a negative count).

Best for: Numeric variables, preserving distributions, data with outliers.

Parameters: pmm_donors (default: 5), pmm_matchtype, pmm_ridge
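
The donor-matching idea behind PMM can be illustrated with a minimal sketch. It is not mice-py's implementation: the function names `fit_line` and `pmm_impute` are hypothetical, it uses a single predictor with ordinary least squares, and it omits the parameter draw and the options controlled by pmm_matchtype and pmm_ridge. The `donors` argument plays the role of pmm_donors.

```python
import random

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def pmm_impute(x, y, donors=5, rng=None):
    """Fill None entries of y by predictive mean matching on predictor x.

    Hypothetical helper for illustration only (not part of mice-py).
    """
    rng = rng if rng is not None else random.Random()
    obs = [i for i, v in enumerate(y) if v is not None]
    # 1. Fit the regression on the observed cases only.
    intercept, slope = fit_line([x[i] for i in obs], [y[i] for i in obs])
    # 2. Predict for every case, observed or missing.
    pred = [intercept + slope * xi for xi in x]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            # 3. Donors: observed cases whose predictions are closest.
            nearest = sorted(obs, key=lambda j: abs(pred[j] - pred[i]))[:donors]
            # 4. Copy the observed value of one randomly chosen donor.
            out[i] = y[rng.choice(nearest)]
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
completed = pmm_impute(x, y, donors=3, rng=random.Random(0))
```

Because step 4 copies a donor's observed value rather than the model prediction, every imputed entry in `completed` is one of the values already present in `y`.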

CART: Classification and Regression Trees

Algorithm:

Key feature: Automatically captures interactions and non-linear patterns.

Best for: Non-linear relationships, interactions, categorical variables.

Parameters: cart_max_depth, cart_min_samples_split, cart_min_samples_leaf
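
One common way to impute with a tree is to fit it on the observed cases and then draw each missing value from the observed values in the leaf where the incomplete case lands. The sketch below assumes that approach; whether mice-py does exactly this is not stated here. The helpers `build_tree`, `leaf_values`, and `cart_impute` are hypothetical, handle one numeric predictor only, and take `max_depth` and `min_samples_leaf` arguments that loosely mirror cart_max_depth and cart_min_samples_leaf.

```python
import random

def build_tree(rows, max_depth=3, min_samples_leaf=2):
    """Grow a small regression tree on (x, y) pairs; leaves keep observed y values."""
    def sse(ys):
        m = sum(ys) / len(ys)
        return sum((v - m) ** 2 for v in ys)

    best = None
    if max_depth > 0 and len(rows) >= 2 * min_samples_leaf:
        xs = sorted({x for x, _ in rows})
        # Candidate thresholds are midpoints between neighbouring x values.
        for lo, hi in zip(xs, xs[1:]):
            t = (lo + hi) / 2
            left = [r for r in rows if r[0] <= t]
            right = [r for r in rows if r[0] > t]
            if min(len(left), len(right)) < min_samples_leaf:
                continue
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, t, left, right)
    if best is None:
        return {"leaf": [y for _, y in rows]}
    _, t, left, right = best
    return {
        "threshold": t,
        "left": build_tree(left, max_depth - 1, min_samples_leaf),
        "right": build_tree(right, max_depth - 1, min_samples_leaf),
    }

def leaf_values(tree, x):
    """Walk down the tree and return the observed values stored in the leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def cart_impute(x, y, rng=None, **tree_args):
    """Fill None entries of y with a random observed value from the matching leaf."""
    rng = rng if rng is not None else random.Random()
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    tree = build_tree(observed, **tree_args)
    return [yi if yi is not None else rng.choice(leaf_values(tree, xi))
            for xi, yi in zip(x, y)]

# A step-shaped relationship that a straight line would fit poorly.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.0, 1.2, None, 0.9, 1.1, 5.0, 5.2, None, 4.9, 5.1]
completed = cart_impute(x, y, rng=random.Random(0), max_depth=2, min_samples_leaf=2)
```

In this toy data the tree splits between x = 5 and x = 6, so the missing value at x = 3 is drawn from the low group and the one at x = 8 from the high group.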

RF: Random Forest

Algorithm:

Key feature: More stable than a single tree; handles complex relationships well.

Best for: Complex patterns, high-dimensional data, many interactions.

Parameters: rf_n_estimators (default: 100), rf_max_depth, rf_max_features
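
A forest-based imputation can be sketched by fitting many trees on bootstrap samples of the observed cases, pooling the leaf values each tree offers for an incomplete case, and drawing one value from that pool. This is one published variant and is an assumption here, not a description of mice-py internals. To stay short, each "tree" is a single-split stump on one predictor, so rf_max_depth and rf_max_features have no counterpart; `fit_stump` and `rf_impute` are hypothetical names, and `n_estimators` mirrors rf_n_estimators.

```python
import random

def fit_stump(rows, min_samples_leaf=1):
    """Best single split on x by squared error: returns (threshold, left_ys, right_ys)."""
    def sse(ys):
        m = sum(ys) / len(ys)
        return sum((v - m) ** 2 for v in ys)

    best = None
    xs = sorted({x for x, _ in rows})
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x <= t]
        right = [y for x, y in rows if x > t]
        if min(len(left), len(right)) < min_samples_leaf:
            continue
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t, left, right)
    if best is None:
        # No valid split (e.g. the bootstrap sample has one distinct x):
        # treat the whole sample as a single leaf.
        ys = [y for _, y in rows]
        return (float("inf"), ys, ys)
    return best[1:]

def rf_impute(x, y, n_estimators=100, rng=None):
    """Fill None entries of y by drawing from leaf values pooled over all stumps."""
    rng = rng if rng is not None else random.Random()
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    # Each stump is fitted on its own bootstrap sample of the observed cases.
    forest = [fit_stump(rng.choices(observed, k=len(observed)))
              for _ in range(n_estimators)]
    out = list(y)
    for i, (xi, yi) in enumerate(zip(x, y)):
        if yi is None:
            pool = []
            for threshold, left, right in forest:
                pool.extend(left if xi <= threshold else right)
            out[i] = rng.choice(pool)
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.0, 1.2, None, 0.9, 1.1, 5.0, 5.2, None, 4.9, 5.1]
completed = rf_impute(x, y, n_estimators=100, rng=random.Random(0))
```

Pooling donors across many bootstrap fits is what makes the result less sensitive to any one tree's split points than a single-tree imputation.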

MIDAS: Multiple Imputation Using Distance-Aided Selection of Donors

Algorithm:

Key feature: Uses the local structure of the data; works well for skewed distributions.

Best for: Small samples, skewed distributions, when PMM struggles.

Parameters: midas_donors (default: 5), midas_ridge
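
Distance-aided donor selection differs from plain PMM mainly in how a donor is picked: closer donors get a higher selection probability instead of all donors being equally likely. The sketch below is a simplified illustration under several assumptions that are not confirmed for mice-py: the regression is fitted on one bootstrap sample, candidates are limited to the `donors` nearest cases (mirroring midas_donors), and weights are inverse distances raised to a hypothetical power `kappa`. The names `fit_line` and `midas_impute` are also hypothetical, and midas_ridge is not modelled.

```python
import random

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def midas_impute(x, y, donors=5, kappa=3.0, rng=None):
    """Fill None entries of y using distance-weighted donor selection.

    Hypothetical helper for illustration only (not part of mice-py).
    """
    rng = rng if rng is not None else random.Random()
    obs = [i for i, v in enumerate(y) if v is not None]
    # Bootstrap the observed cases; redraw if the sample has one distinct x,
    # because a line cannot be fitted in that case.
    boot = rng.choices(obs, k=len(obs))
    while len({x[i] for i in boot}) < 2:
        boot = rng.choices(obs, k=len(obs))
    intercept, slope = fit_line([x[i] for i in boot], [y[i] for i in boot])
    pred = [intercept + slope * xi for xi in x]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            nearest = sorted(obs, key=lambda j: abs(pred[j] - pred[i]))[:donors]
            # Closer predictions get larger weights; the small constant
            # avoids division by zero for exact matches.
            weights = [1.0 / (abs(pred[j] - pred[i]) + 1e-8) ** kappa
                       for j in nearest]
            out[i] = y[rng.choices(nearest, weights=weights, k=1)[0]]
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
completed = midas_impute(x, y, donors=3, rng=random.Random(0))
```

As with PMM, imputed values are copied from observed cases; the weighting only changes which donor is likely to be chosen.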

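Sample: Random Sampling from Observed Values

Algorithm:

Key feature: Simplest method; imputed values follow the observed marginal distribution of the variable, and relationships with other variables are ignored.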

Best for: Initial imputation, categorical variables with many levels, quick exploration.

Parameters: None
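
Random sampling needs no model, which a few lines can show. The function name `sample_impute` is hypothetical and the snippet is a sketch of the general technique, not mice-py's code.

```python
import random

def sample_impute(values, rng=None):
    """Replace each None with a value drawn at random from the observed entries."""
    rng = rng if rng is not None else random.Random()
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot sample: the variable has no observed values")
    return [v if v is not None else rng.choice(observed) for v in values]

colours = ["red", None, "blue", "red", None, "green"]
completed = sample_impute(colours, rng=random.Random(42))
```

Since values are drawn with probability proportional to how often they were observed, frequent categories such as "red" are also the most likely imputations. No predictor is used, so any association with other variables is lost.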

See Imputation Methods for practical selection guidance.