
# Project-Customer-Churn-Python

## Data cleaning

1. Checked for missing/null values

2. Examined correlations between variables and variable importance

3. Handled outliers

4. Transformed skewed numerical features with a log(1 + x) transformation
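A minimal sketch of these steps, assuming the data is loaded into a pandas DataFrame named `df` (the file name and the skewed column list below are hypothetical, not taken from the repository):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("churn.csv")  # hypothetical file name

# 1. Check for missing/null values
print(df.isnull().sum())

# 2. Correlations between numeric variables
print(df.corr(numeric_only=True))

# 4. Reduce skew in numeric features with log(1 + x)
skewed_cols = ["TENURE"]  # hypothetical column name
df[skewed_cols] = np.log1p(df[skewed_cols])
```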

## EDA

1. Histograms for the numeric variables

2. Boxplots for the categorical variables (numeric distributions split by category)
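A quick sketch of these plots with pandas and seaborn; the column names used for the boxplot are hypothetical:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Histograms for all numeric variables
df.hist(figsize=(10, 8))
plt.tight_layout()
plt.show()

# Boxplot of a numeric variable split by a categorical variable
sns.boxplot(x="F2HOMRGN", y="TENURE", data=df)  # hypothetical columns
plt.show()
```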

## Data Transformation (for model processing)

1. Encoded the response variable (RENEW) as a binary factor (1/0)

2. Reduced the number of levels in the categorical predictors (4)

3. Created dummy variables to represent binary factors for the categorical predictors:
   - replace: M2EXCFLG, HOMEFCTYCHANGE, RECENTMOVING
   - one-hot encoding: F2HOMRGN (each categorical feature with n categories is transformed into n binary features)

4. Normalized the data
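A sketch of these steps, reusing the assumed `df` from above; the column names come from this README, but the raw factor values (`"Y"`) are assumptions:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# 1. Encode the response variable as 1/0
df["RENEW"] = (df["RENEW"] == "Y").astype(int)  # raw labels assumed

# 3a. Replace binary factors with 0/1 columns
for col in ["M2EXCFLG", "HOMEFCTYCHANGE", "RECENTMOVING"]:
    df[col] = (df[col] == "Y").astype(int)      # raw labels assumed

# 3b. One-hot encode F2HOMRGN: n categories -> n binary features
df = pd.get_dummies(df, columns=["F2HOMRGN"])

# 4. Normalize the predictors (the 0/1 dummy columns could be
# excluded here; this sketch scales everything but the response)
num_cols = df.columns.drop("RENEW")
df[num_cols] = StandardScaler().fit_transform(df[num_cols])
```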

## Feature selection: numerical predictors vs. categorical response (RENEW)

1. Univariate feature selection works by selecting the best features based on univariate statistical tests. f_classif computes the ANOVA F-value between the label and each feature for classification tasks.

2. Wrapper-based methods treat the selection of a feature subset as a search problem. Recursive feature elimination (RFE) selects features by recursively considering smaller and smaller sets of features.

3. Embedded methods use algorithms that have built-in feature selection, e.g., tree-based feature selection with a random forest (RF).

4. Principal Component Analysis (PCA)
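A combined sketch of the four approaches with scikit-learn, assuming `X` holds the normalized predictors and `y` the encoded RENEW column; the selected feature counts are assumptions:

```python
from sklearn.feature_selection import SelectKBest, f_classif, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA

X = df.drop(columns=["RENEW"])
y = df["RENEW"]

# 1. Univariate: ANOVA F-test between each feature and the label
X_f = SelectKBest(f_classif, k=10).fit_transform(X, y)

# 2. Wrapper: recursive feature elimination around a linear model
X_rfe = RFE(LogisticRegression(max_iter=1000),
            n_features_to_select=10).fit_transform(X, y)

# 3. Embedded: importance-based selection from a random forest
X_tree = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
).fit_transform(X, y)

# 4. PCA
X_pca = PCA(n_components=4).fit_transform(X)
```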

## Build Machine Learning Models

Evaluated each model using various metrics (including precision and recall):

1. Logistic Regression: predictors should be independent; numerical variables only
   - After f-test feature selection + LogisticRegressionCV: accuracy = 0.70; recall = 0.70
   - After RFE feature selection + LogisticRegressionCV: accuracy = 0.70; recall = 0.70

Drawback: logistic regression learns a linear decision surface separating the classes, and our two classes may not be linearly separable. In that case, classifiers such as Support Vector Machines can learn more complex decision boundaries; tree-based classifiers such as decision trees, which learn rules from the dataset, are another option.
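A sketch of this step, assuming `X_f` is the f-test subset from the selection sketch above; the split ratio and CV settings are assumptions:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import accuracy_score, recall_score

X_train, X_test, y_train, y_test = train_test_split(
    X_f, y, test_size=0.3, random_state=0)  # split ratio assumed

clf = LogisticRegressionCV(cv=5, max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)
print(accuracy_score(y_test, pred), recall_score(y_test, pred))
```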

2. Random Forest: works on mixed datasets (numerical + categorical features); can handle missing values and outliers
   - After f-test feature selection: accuracy = 0.75; recall = 0.75
   - After f-test feature selection + grid search: accuracy = 0.75; recall = 0.75
   - After tree-based feature selection + grid search: accuracy = 0.74; recall = 0.74
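A sketch of the random-forest grid search, reusing the split above; the parameter grid is an assumption, not the grid used in the project:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300],   # grid values assumed
                "max_depth": [5, 10, None]},
    scoring="recall", cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))
```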

3. KNN: numeric features only; requires features to be scaled; sensitive to feature selection

A larger K gives a smoother decision boundary (a less complex model); a smaller K gives a more complex model (which may overfit).

   - After f-test feature selection: k = 61, accuracy = 0.70; recall = 0.70
   - After RFE feature selection: k = 62, accuracy = 0.70; recall = 0.69
   - After tree-based feature selection: k = 70, accuracy = 0.70; recall = 0.70
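A sketch of sweeping k on held-out data to pick the best value; the k range is an assumption:

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

scores = {}
for k in range(1, 101):  # k range assumed
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = accuracy_score(y_test, knn.predict(X_test))

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```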

4. SVM: requires features to be scaled; choose proper kernel functions; numerical variables only; not suitable for large datasets (longer training time); sensitive to the kernel choice

C (penalty parameter): controls the misclassification penalty. A smaller C yields a larger-margin hyperplane that tolerates more misclassified points, while a larger C yields a smaller-margin hyperplane.

Gamma: a lower value of gamma fits the training dataset loosely, whereas a higher value of gamma fits the training dataset more exactly, which can cause overfitting.

   - After f-test feature selection: accuracy = 0.70; precision = 0.70; recall = 0.70
   - After RFE feature selection: accuracy = 0.69; precision = 0.68; recall = 0.69
   - After tree-based feature selection: accuracy = 0.69; precision = 0.69; recall = 0.69
   - Grid search: {'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}; accuracy = 0.71; precision = 0.70; recall = 0.71
   - PCA (n = 4) + SVM: accuracy = 0.70; precision = 0.70; recall = 0.70
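A sketch of the SVM grid search; the grid is an assumption chosen to include the reported best parameters:

```python
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

svm_grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10],            # grid values assumed
                "gamma": [0.001, 0.01, 0.1],
                "kernel": ["rbf"]},
    cv=5)
svm_grid.fit(X_train, y_train)
print(svm_grid.best_params_, svm_grid.score(X_test, y_test))
```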

## Design and Build a Neural Network using TensorFlow (Sequential Model)

1. Used the feature subset from f-test feature selection

2. Designed a multi-layer sequential model in Keras with rectified linear (ReLU) activations and a sigmoid on the output layer

3. Improved model performance by adding 0.2 dropout, choosing proper batch and epoch sizes, and adopting the Adam optimization algorithm to update the network weights iteratively from the training data

4. Tuned the model by adding early stopping, resulting in 72% model accuracy with a batch size of 32 and 200 epochs
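A sketch of the described model, reusing the f-test split from above; the hidden-layer widths and early-stopping patience are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),
    layers.Dense(64, activation="relu"),   # layer widths assumed
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10,       # patience assumed
    restore_best_weights=True)
model.fit(X_train, y_train, validation_split=0.2,
          batch_size=32, epochs=200, callbacks=[early_stop])
```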
