mice runtime on large datasets #404
Unanswered
amirdol asked this question in Getting started
This should not take 10 hours.
Hi mice contributors!
I have a research project that requires imputation, and I'm not sure what the best way is to approach a problem I encountered. Any help would be greatly appreciated.
I have a large data set of 500K rows, with ~20 variables that have ~30% missing values on average.
Running any mice method (cart, norm, etc.) with m=5 imputations takes a very long time (around 10 hours).
I'm using the `ignore` argument to fit the imputation model on a smaller subset of the data (~150K rows), on the assumption that this speeds up the imputation process (is this right?). Are there any other suggestions on how to shorten the time it takes the mice function to complete its run?
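For reference, here is a minimal sketch of the kind of call described above. It is not my actual code: the synthetic data frame, the 30% training fraction, and the names `dat` and `ignore_rows` are assumptions chosen only for illustration. Rows flagged `TRUE` in `ignore` are left out when the imputation models are fitted, but they are still imputed.

```r
library(mice)

set.seed(123)

# Synthetic stand-in for the real data (assumption): 2000 rows, 3 variables,
# each with roughly 30% of its values set to missing.
n <- 2000
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
x3 <- x2 - x1 + rnorm(n)
dat <- data.frame(x1, x2, x3)
for (v in names(dat)) {
  dat[sample(n, size = 0.3 * n), v] <- NA
}

# Hypothetical split: fit the imputation models on a random 30% of the rows.
# TRUE means "ignore this row when fitting"; ignored rows are still imputed.
train_idx <- sample(n, size = 0.3 * n)
ignore_rows <- !(seq_len(n) %in% train_idx)

# m is the number of imputed datasets; maxit is the number of iterations.
imp <- mice(dat, m = 5, maxit = 5, method = "norm",
            ignore = ignore_rows, seed = 123, printFlag = FALSE)

# Extract the first completed dataset.
completed <- mice::complete(imp, 1)
```

How much time this saves in practice presumably depends on the method, since every row with missing values is still imputed in each iteration.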