MICE - Multiple Imputation by Chained Equations

Multiple imputation by chained equations, implemented from scratch.
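To make the idea concrete, here is a minimal sketch of a chained-equations loop, assuming numeric data and a plain linear regression for every column. The function `mice_sketch` is a hypothetical helper written for illustration; it is not the code in `src/mice.py`, and its parameter names only mirror the ones used in Example 1. It is also a simplification: it adds residual noise to the predictions but does not draw the regression coefficients from their posterior, as a full MICE implementation would.

```python
import numpy as np

def mice_sketch(X, n_iterations=10, m_imputations=5, seed=0):
    """Return a list of completed copies of X (illustrative helper only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    imputations = []
    for _ in range(m_imputations):
        # Start each imputed dataset from simple mean imputation
        filled = np.where(missing, col_means, X)
        for _ in range(n_iterations):
            for j in range(X.shape[1]):
                miss_j = missing[:, j]
                if not miss_j.any():
                    continue
                # Regress column j on an intercept plus all other columns,
                # using only the rows where column j was observed
                others = np.delete(filled, j, axis=1)
                A = np.column_stack([np.ones(len(filled)), others])
                obs = ~miss_j
                beta, *_ = np.linalg.lstsq(A[obs], filled[obs, j], rcond=None)
                resid = filled[obs, j] - A[obs] @ beta
                dof = max(obs.sum() - A.shape[1], 1)
                sigma = np.sqrt(resid @ resid / dof)
                # Replace missing entries with prediction plus residual noise
                noise = rng.normal(0.0, sigma, miss_j.sum())
                filled[miss_j, j] = A[miss_j] @ beta + noise
        imputations.append(filled)
    return imputations

# Synthetic data with about 20% of the entries removed at random
rng = np.random.default_rng(1)
X_full = rng.normal(size=(200, 3))
X_full[:, 2] = X_full[:, 0] + 0.5 * X_full[:, 1] + rng.normal(scale=0.1, size=200)
X_miss = X_full.copy()
X_miss[rng.random(X_full.shape) < 0.2] = np.nan
demo = mice_sketch(X_miss, n_iterations=5, m_imputations=3, seed=42)
```

Each element of `demo` is a full array with the observed values untouched and the missing ones filled in; the imputed values differ between elements because of the random noise term.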

Example 1: iris dataset

Load the iris data from sklearn and introduce missing values with the pyampute package:

from sklearn.datasets import load_iris
from pyampute.ampute import MultivariateAmputation

iris = load_iris(as_frame=True, return_X_y=False)["data"]
ma = MultivariateAmputation()
X_amp = ma.fit_transform(iris.to_numpy()) # pyampute requires the input as numpy array

Now we can apply MICE to the amputed dataset:

from src import mice
imp = mice.mice(X_amp, n_iterations=20, m_imputations=10, seed=42)
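Example 2 indexes the result as `imp[m][:, p]`, which suggests that `imp` is a sequence of completed arrays, one per imputation. Under that assumption, an estimate can be combined across the imputed datasets with Rubin's rules. The sketch below pools the mean of a single column; `pool_mean` is a hypothetical helper, and `fake_imp` is stand-in data used here instead of real `mice.mice` output.

```python
import numpy as np

def pool_mean(imputations, column):
    """Pool the mean of one column across imputed datasets (Rubin's rules)."""
    m = len(imputations)
    # Point estimate from each completed dataset
    estimates = np.array([d[:, column].mean() for d in imputations])
    # Within-imputation variance: average squared standard error of the mean
    within = np.mean([d[:, column].var(ddof=1) / d.shape[0] for d in imputations])
    # Between-imputation variance of the point estimates
    between = estimates.var(ddof=1)
    total = within + (1 + 1 / m) * between
    return estimates.mean(), total

# Stand-in for ten imputed datasets of shape (150, 4)
rng = np.random.default_rng(0)
fake_imp = [rng.normal(loc=5.0, scale=1.0, size=(150, 4)) for _ in range(10)]
pooled, total_var = pool_mean(fake_imp, column=3)
print(f"pooled mean = {pooled:.3f}, standard error = {total_var ** 0.5:.3f}")
```

The between-imputation term is what separates this from analysing a single imputed dataset: it carries the extra uncertainty caused by the missing values.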

Example 2: distribution plot for the sample data

After imputation, make diagnostic plots and compare the distribution of the multiply imputed datasets with the complete-case data. The code below produces this plot for the example provided in the /tests directory:

import seaborn as sns
import matplotlib.pyplot as plt

# Assumes `df` holds the complete sample data (pandas DataFrame), `X_amp` the
# amputed array, and `imp` the sequence of imputed arrays for the same data.
p = 3  # column to be plotted

# Proxy lines for the legend, in the same order as the labels passed below
custom_lines = [plt.Line2D([0], [0], color="red", lw=4),
                plt.Line2D([0], [0], color="grey", lw=4),
                plt.Line2D([0], [0], color="blue", lw=4)]

fig, ax = plt.subplots()

# One thin density curve per imputed dataset
for m in range(len(imp)):
    sns.kdeplot(imp[m][:, p], label="Imputed", color="black", lw=0.2, ax=ax)
sns.kdeplot(X_amp[:, p], label="Missing", color="blue", ax=ax)
sns.kdeplot(df.to_numpy()[:, p], label="Complete", color="red", ax=ax)
plt.xlabel("Age (years)")
ax.legend(custom_lines, ['Complete', 'Imputed', 'Missing'], loc="upper left")
plt.savefig("qol_distribution_mice.png")
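A numeric summary can complement the density plot. The sketch below compares the mean and standard deviation of the observed values in one column with those of the imputed values in each completed dataset. The helper `imputed_vs_observed` and the stand-in arrays are assumptions made for illustration; in practice `X_amp` and `imp` from the examples above would take their place.

```python
import numpy as np

def imputed_vs_observed(X_amp, imputations, column):
    """Summarise observed values and the imputed values in each dataset."""
    missing = np.isnan(X_amp[:, column])
    observed = X_amp[~missing, column]
    observed_stats = (observed.mean(), observed.std(ddof=1))
    # Only the cells that were missing in X_amp carry imputed values
    imputed_stats = [
        (d[missing, column].mean(), d[missing, column].std(ddof=1))
        for d in imputations
    ]
    return observed_stats, imputed_stats

# Stand-in data: every fifth value of column 3 is missing, then filled
# by resampling the observed values (a placeholder for real MICE output)
rng = np.random.default_rng(0)
X_demo_amp = rng.normal(loc=50.0, scale=10.0, size=(100, 4))
X_demo_amp[::5, 3] = np.nan
observed_values = X_demo_amp[~np.isnan(X_demo_amp[:, 3]), 3]
demo_imp = []
for _ in range(3):
    filled = X_demo_amp.copy()
    filled[::5, 3] = rng.choice(observed_values, size=20)
    demo_imp.append(filled)

obs_stats, imp_stats = imputed_vs_observed(X_demo_amp, demo_imp, column=3)
print("observed mean/sd:", obs_stats)
for i, (mean, sd) in enumerate(imp_stats):
    print(f"imputation {i}: mean={mean:.2f}, sd={sd:.2f}")
```

Large gaps between the observed and imputed summaries are not necessarily an error, since values may be missing for reasons related to the data, but they are worth inspecting alongside the plot.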

Beware

This is a low-performance implementation meant for pedagogical purposes only. It has several limitations and leaves room for improvement. For research, please use one of the available packages for multiple imputation.
