Build Variable Importance Function #4

Open
gaffney2010 opened this issue Jan 21, 2018 · 0 comments

Per Dan:

Build a variable importance function that uses:
the built-in importance functions in scikit-learn
Shapley-value-based importance (run time is exponential: 2^n models to fit, where n is the number of predictors/features in the model)
Perhaps we could use correlation to build a network of features, so that instead of testing all coalitions we only test those whose members are highly correlated
The assumption would be that the contributions of independent variables are roughly additive (this seems fair)
We would still account for all possible subsets, but for uncorrelated variables we could just add up their contributions
If Shapley value importance is fit on training data and evaluated on holdout data, then after calculating the Shapley values we could simply remove every variable with a negative Shapley value
This would be an alternative to forward/backward stepwise regression for variable selection
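
To make the Shapley idea above concrete, here is a minimal sketch of the exact calculation and the "drop negative values" selection step. It is not an implementation from this project: `shapley_values`, `keep_non_negative`, and `toy_value` are hypothetical names, and `toy_value` is a made-up stand-in for "fit a model on the training data using this feature subset and return its holdout score". The sketch evaluates all 2^n subsets, so it only illustrates the brute-force version, not the correlation-based shortcut.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature.

    `value` maps a frozenset of feature names to a score (assumed here to be
    a holdout score of a model fit on that subset). It is called once for
    each of the 2^n subsets; results are cached.
    """
    n = len(features)
    cache = {}

    def v(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = value(key)
        return cache[key]

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Marginal contribution of f when it joins this coalition.
                total += weight * (v(subset + (f,)) - v(subset))
        phi[f] = total
    return phi


def keep_non_negative(phi):
    """Variable selection step: drop features with a negative Shapley value."""
    return [f for f, val in phi.items() if val >= 0]


# Toy stand-in for "fit on training, score on holdout" (illustrative numbers).
CONTRIB = {"a": 0.3, "b": 0.1, "noise": -0.05}


def toy_value(subset):
    bonus = 0.2 if {"a", "b"} <= subset else 0.0  # a and b interact
    return sum(CONTRIB[f] for f in subset) + bonus


phi = shapley_values(["a", "b", "noise"], toy_value)
selected = keep_non_negative(phi)
```

In this toy game the interaction bonus is split evenly between `a` and `b`, and `noise` keeps its negative contribution, so it is the only feature removed. With a real model, `value` would be the expensive part, which is why reducing the number of coalitions matters.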

Figure out a way to evaluate variable importance when using dummy variables
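
One possible way to handle dummy variables, offered only as a sketch and not as a decided approach, is to treat all dummy columns that came from the same categorical variable as a single group and permute them together, so each shuffled row still holds a valid one-hot encoding. The function name `grouped_permutation_importance`, the row-as-dict data layout, and the `predict`/`score` callables below are all assumptions made for illustration.

```python
import random


def grouped_permutation_importance(rows, targets, predict, score, groups,
                                   n_repeats=5, seed=0):
    """Mean drop in score when each group of columns is shuffled jointly.

    rows:    list of dicts mapping column name -> value
    predict: callable taking one row dict and returning a prediction
    score:   callable (targets, predictions) -> float, higher is better
    groups:  dict mapping a variable name to its list of dummy columns
    """
    rng = random.Random(seed)
    baseline = score(targets, [predict(r) for r in rows])
    importance = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            order = list(range(len(rows)))
            rng.shuffle(order)
            shuffled = []
            for i, row in enumerate(rows):
                new_row = dict(row)
                # Copy the whole dummy group from one donor row so the
                # one-hot encoding stays internally consistent.
                for c in cols:
                    new_row[c] = rows[order[i]][c]
                shuffled.append(new_row)
            permuted = score(targets, [predict(r) for r in shuffled])
            drops.append(baseline - permuted)
        importance[name] = sum(drops) / n_repeats
    return importance


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: the target depends only on the "color" dummies.
rows = [
    {"color_red": i % 2, "color_blue": 1 - i % 2,
     "size_small": (i // 2) % 2, "size_large": 1 - (i // 2) % 2}
    for i in range(8)
]
targets = [r["color_red"] for r in rows]
groups = {"color": ["color_red", "color_blue"],
          "size": ["size_small", "size_large"]}

result = grouped_permutation_importance(
    rows, targets, lambda r: r["color_red"], accuracy, groups)
```

Because the toy predictor ignores the `size` columns, shuffling them cannot change the score, so that group's importance is exactly zero. The same grouping could in principle be used to define the players in a Shapley calculation, with one player per original variable instead of one per dummy column.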