diff --git a/notes/2024-10-17.md b/notes/2024-10-17.md
index c00498b..b183bac 100644
--- a/notes/2024-10-17.md
+++ b/notes/2024-10-17.md
@@ -109,7 +109,7 @@ X_train, X_test, y_train, y_test = train_test_split(iris_df[feature_vars],
 This function returns multiple values, the docs say that it returns [twice as many](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html#:~:text=splittinglist%2C%20length%3D2%20*%20len(arrays)) as it is passed. We passed two separate things, the features and the labels separated, so we get train and test each for both.
 
 ```{note}
-If you get different numbers fort the index than I do here or run the train test split multipe times and see things change, you have a different ranomd seed above.
+If you get different numbers for the index than I do here or run the train test split multipe times and see things change, you have a different ranomd seed above.
 ```
 
 ```{code-cell} ipython3
@@ -298,7 +298,8 @@ Gaussian Naive Bayes is a very simple model, but it is a {term}`generative` mode
 
 ```{code-cell} ipython3
 N = 20
-gnb_df = pd.DataFrame(np.concatenate([np.random.multivariate_normal(th, sig*np.eye(4),N)
+n_features = len(feature_vars)
+gnb_df = pd.DataFrame(np.concatenate([np.random.multivariate_normal(th, sig*np.eye(n_features),N)
                  for th, sig in zip(gnb.theta_,gnb.var_)]),
                  columns = gnb.feature_names_in_)
 gnb_df['species'] = [ci for cl in [[c]*N for c in gnb.classes_] for ci in cl]