ENH: Add default losses to KerasClassifier and KerasRegressor #208

stsievert · 2021-02-27T03:02:27Z

What does this PR implement?
It adds loss="categorical_crossentropy" to KerasClassifier by default. It adds some protection in case the user passes multiple classes in y but the model is only configured for one class. This implementation covers the following:

multi-class classification with multiple outputs in the final layer of the model (e.g., Dense(10) for MNIST).
binary classification with a single output (e.g., Dense(1)).

I test both these cases.

Reference issues/PR
This PR closes #206

adriangb

Thank you for the PR. I left some more general questions.

adriangb · 2021-02-27T04:56:51Z

tests/test_simple_usage.py

+    * binary classification
+    * one hot classification
+    * single class classification


Does this mean we do not support 1 output with multiple classes? Am I getting confused by the usage of outputs vs. output units.

The most common way to set up single target multi-class problems in Keras is with output=Dense(n_classes, activation="softmax") and one of categorical_crossentropy or sparse_categorical_crossentropy:

def clf(): model = tf.keras.Sequential() model.add(tf.keras.layers.Input(shape=(FEATURES,))) model.add(tf.keras.layers.Dense(N_CLASSES, activation="softmax")) model.compile(loss="cce") # or "scce" return model

Would this use no longer be supported?

I think I see now. This should work!

I've made the uses cases where this works more clear in 9358d6e. It works for all major use cases.

This PR simply changes the default loss; it doesn't change compatibility in any way.

adriangb · 2021-02-27T04:59:40Z

scikeras/wrappers.py

+    def _fit_keras_model(self, *args, **kwargs):
+        try:
+            super()._fit_keras_model(*args, **kwargs)
+        except ValueError as e:
+            if (
+                self.loss == "categorical_crossentropy"
+                and hasattr(self, "model_")
+                and 1 in {o.shape[1] for o in getattr(self.model_, "outputs", [])}
+            ):
+                raise ValueError(
+                    "The model is configured to have one output, but the "
+                    f"loss='{self.loss}' is expecting multiple outputs "
+                    "(which is often used with one-hot encoded targets). "
+                    "More detail on Keras losses: https://keras.io/api/losses/"
+                ) from e
+            else:
+                raise e


This seems like it should live in _check_model_compatibility, or they should be merged in some way.

This error message only provides marginal utility: it protects against cases when the model has one output but there are multiple classes.

It can not go in _check_model_compatibility; I wait for an error to be raised before issuing this warning (otherwise a model with a single output raises an error).

Got it. Is there a specific error message we can check for, like if "some Keras error" in str(e)?

getattr(self.model_, "outputs", [])

Is this necessary? model_ should always have an outputs attribute, except in the case described in #207, but that should be a separate check/error.

f"loss='{self.loss}' is expecting multiple outputs "

Can you clarify what you mean by a loss expecting a number of outputs? My understanding is that Keras "broadcasts" losses to outputs, so if you give it a scalar loss (ie.. loss="bce") with 2 outputs (i.e. len(model_.outputs) == 2), it will implicitly compile the model with loss=[original_loss] * len(outputs). But you can actually map losses to outputs manually, by passing loss=["bce", "mse"] or loss={"out1": "bce", "out2": "mse"}. From the tests, it seems like by "loss is expecting multiple outputs" you mean that there is a single output unit but multiple classes, which I feel like could be confused with the above concept of configuring a loss for multiple outputs.

I'm also curious about the iteration through outputs (1 in {o.shape[1] for o in self.model_.outputs}). SciKeras does not support >1 output out of the box (users need to override target_encoder) so it seems a bit strange to try to account for that when using the default loss. I feel that using the default loss should only be supported for the simple single-output cases that target_encoder supports.

As a side note: I think giving users better errors and validating their inputs like you are doing here can be a very valuable part of SciKeras, but currently it is done in an ad-hoc manner via _check_model_compatibility, etc. I think if we add more of these types of things, it would be nice to have an organized interface for it. I opened #209 to try to brainstorm ideas for this.

adriangb · 2021-02-28T19:34:51Z

Should we do anything for models with a single output and a single linear output unit and 2 classes? This is a subset of your error catching / test, but Keras won't raise a ValueError for it. Instead, it will train and not learn anything. The same model with loss="bce" will easily predict with 100% accuracy:

import numpy as np
import tensorflow as tf

from scikeras.wrappers import KerasClassifier


N_CLASSES = 2
FEATURES = 1
n_eg = 10000
y = np.random.choice(N_CLASSES, size=n_eg).astype(int)
X = y.reshape(-1, 1).astype("float32")


def clf():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(FEATURES,)))
    model.add(tf.keras.layers.Dense(4))
    model.add(tf.keras.layers.Dense(1))
    return model


model = KerasClassifier(clf, loss="categorical_crossentropy", epochs=50)

model.fit(X, y).score(X, y)  # ~ 0.5

adriangb

Looks like you're making some progress, thank you for keeping at it. If you're budding up against annoying / hard to debug tests let me know, I'm happy to jump in and try to iron it out.

scikeras/_types.py

stsievert · 2021-03-02T04:11:54Z

I've got most of the issues resolved:

A warning is issued for a binary classification target but loss="categorical_crossentropy"; the user might have compiled their own model.
I removed the GitHub action dependence; why should the tests depend on linting? It's really annoying.

The tests on fully compliant Scikit-learn estimators are failing now. I'd appreciate some help.

adriangb

A warning is issued for a binary classification target but loss="categorical_crossentropy"; the user might have compiled their own model.

👍

I removed the GitHub action dependence; why should the tests depend on linting? It's really annoying.

This is a somewhat arbitrary choice. I'm open to removing it, but I'd like that done in a separate PR. If you use the git pre-commit hook, you should never have a problem being annoyed by linting because you won't even be able to make a commit that fails linting (unless you manually override it).

The tests on fully compliant Scikit-learn estimators are failing now

I'll take a look.

scikeras/wrappers.py

tests/test_simple_usage.py

adriangb · 2021-03-02T04:33:34Z

tests/test_simple_usage.py

+    est = KerasRegressor(model=shallow_net, model__single_output=True)
+    assert est.loss == "mse"
+    est.partial_fit(X, y)
+    assert est.model_.loss.__name__ == "mean_squared_error"


Is the assertion that the "long" name for the loss was used in the model necessary here? I don't see the same assertion for classifiers.

This assert statement is present to make sure the BaseWrapper.loss is mirror in BaseWrapper.model_.loss. I'll add a test for KerasClassifier too.

Maybe this is good use case for scikeras.utils.loss_name?

stsievert · 2021-03-02T22:01:44Z

Tests are passing now; I manually specified loss=None. I think loss="auto" should decide between binary_crossentropy and categorical_crossentropy, and I'm not a huge fan of changing the package implementation based on the tests.

On the whole, this PR makes SciKeras more usable, though technically backwards incompatible for a very narrow use case. That relies on a couple points:

Simple usage works now.
- Before: For some tests in test_simple_usage.py, the master branch raises a "ValueError: No valid loss function found. You must provide a loss function to train" for most of the tests in test_simple_usage.py.
- After: If I specify a commonly used model (e.g., a regressor with one output node or classifier with N_CLASSES output nodes) and supply training data, my Keras model will learn.
- Technically, this is backward incompatible if the user is compiling their own model and depending on loss is None in their function.
One less error raised:
- Before: An exception is raised if BaseWrapper.loss and BaseWrapper.model_.loss are different.
- After: a warning is issued for the same case. (if the user is compiling their own model, I presume they have a special use case or know what they're doing).

adriangb · 2021-03-03T20:17:36Z

I believe this PR is not breaking basic wrapper backwards compatibility:

import numpy as np
from scikeras.wrappers import KerasClassifier
from tensorflow import keras

num_classes = 3
n_features = 4
n_samples = 100

X = np.random.random(size=(n_samples, n_features))
y = np.random.randint(low=0, high=num_classes, size=(n_samples,))

def get_model():
    model = keras.Sequential(
        [
            keras.Input((n_features,)),
            keras.layers.Dense(n_features),
            keras.layers.Dense(num_classes, activation="softmax"),
        ]
    )
    model.compile(loss="sparse_categorical_crossentropy")
    return model

# Works via Keras
get_model().fit(X, y)

# Works via Keras wrappers
keras.wrappers.scikit_learn.KerasClassifier(get_model).fit(X, y)

# Does not work with SciKeras stsievert:clf-default-loss but does work on SciKeras master
KerasClassifier(get_model).fit(X, y)  # InvalidArgumentError: logits and labels must have the same first dimension

This can be "fixed" by using KerasClassifier(get_model, loss=None) like you do in the tests, but this is a breaking change so I'd like to call it out explicitly before moving forward.

As to why this happens, it is. because if loss="categorical_crossentropy" but the user passes non one-hot encoded labels, SciKeras (and the TF wrappers before it) will automatically one-hot encode the labels. So in this case, the labels get one-hot encoded but then the loss function is expecting non one-hot encoded labels.

...in user compiled model with N_CLASSES outputs

stsievert · 2021-03-04T03:09:19Z

This error only happens when both of the following occur:

When the user compiles the model themselves w/ loss="sparse_categorical_crossentropy".
When the default loss is not changed from categorical_crossentropy.

I'm going to raise an exception if the Keras model specifies loss="sparse_categorical_crossentropy" and, in the error message, tell the user to how to clear the error (changing the loss parameter in KerasClassifier.fit is not allowed by the Scikit-learn API).

adriangb · 2021-03-04T04:34:54Z

Can't this also be solved by just defaulting to "auto"?

This also makes me wonder if the design decision to put the description of the data and encoding / decoding into the same place was flawed. Or if perhaps inspecting the compiled model was better than relying on the user's input.

stsievert · 2021-03-04T05:03:16Z

Yes, loss="auto" is a good idea. It simplifies the tests, and also allows choosing loss="bce" if there is one output node, there's only two unique values in y and the user has not compiled the model.

adriangb · 2021-03-04T05:10:58Z

Yes, loss="auto" is a good idea.

I'll try to cook up an implementation of that in the next couple of days. Regardless, thank you for all of the great work you've done on this PR so far, I think it's done a lot to illuminate our options and challenges with this feature.

stsievert · 2021-03-04T05:21:20Z

I'd like to see a new function inserted after model creation and before model compiling to choose the some compile arguments based on the model architecture. It'd be useful to have this function to choose the loss function from the number of output neurons. That'd help a lot with the loss="auto" implementation and choosing between loss="bce" and loss="cce".

adriangb · 2021-03-04T05:27:41Z

We have _compile_model, although it currently also decides if it should compile the model or not. Perhaps we can split that into _should_compile_model and _compile_model where _compile_model is a designated dependency injection point for insertion of arbitrary compile logic based upon model architecture and any other pre-determined parameters (like the output from target_encoder)? Then SciKeras can by default use this to handle loss="auto", but users can also override it to support other uses.

stsievert · 2021-03-04T17:06:49Z

I'm thinking of this pseudo-code:

assert len(self.model_.outputs) == 1
out = self.model_.outputs[0]

if self.loss == "auto":
    if out.shape[1] == 1 and y_n_unique <= 2:
        loss = "binary_crossentropy"
    elif out.shape[1] > 1 and y_encoded.ndim == 1:
        loss = "sparse_categorical_crossentropy"
    elif out.shape[1] > 1 and y_encoded.ndim > 1:
        loss = "categorical_crossentropy"

self._compile_model(loss=loss)

This code isn't exact – I'm not sure if all the conditions for the loss functions are right.

This requires that _build_fn be called (for self.model_) and that the data be encoded (for metadata about y_encoded). It looks like this code could go an overwrite of _ensure_compiled_model (which needs a better name). KerasClassifier would also have to override _get_compile_kwargs to return a default loss value (the promise of compile_kwargs is that they're all valid arguments to self.model_.compile, right?).

adriangb · 2021-03-04T17:13:34Z

This seems pretty similar to what I am proposing in #210 , please check that PR if you can.

See thread in #208 (review) Since we use GHA as a FOSS project, we really don't pay for usage. This will increase usage because tests will run even if linting fails, but will help PR feedback TAT because failed linting won't preclude getting test data back.

adriangb · 2021-03-06T19:13:44Z

I removed the GitHub action dependence; why should the tests depend on linting? It's really annoying.

Resolved via #217

stsievert added 4 commits February 26, 2021 20:45

Add default loss to KerasClassifier

4be5d0c

update message/tests

dccd92b

black

1d80b57

isort

1f45285

adriangb reviewed Feb 27, 2021

View reviewed changes

stsievert added 2 commits February 28, 2021 12:40

better test

9358d6e

Add default loss for KerasRegressor

6cc112e

stsievert changed the title ~~ENH: Add default loss to KerasClassifier~~ ENH: Add default losses to KerasClassifier and KerasRegressor Feb 28, 2021

black

80618bf

adriangb mentioned this pull request Feb 28, 2021

ENH: programmatic validations and error handling #209

Open

stsievert added 6 commits February 28, 2021 21:53

catch binary cross entropy

60b2404

black

dcb0823

Clean type hints in __init__

0faadd9

isort

0449481

change KerasRegressor.__init__

ed4c1f5

tests run

c58ec74

adriangb reviewed Mar 2, 2021

View reviewed changes

scikeras/_types.py Show resolved Hide resolved

stsievert added 7 commits March 1, 2021 20:05

MAINT

e73710d

add right loss back

4e7e09f

Try removing binary_crossentropy check

2e830ff

black

e1ea339

remove annoying 'needs linting'

36e6499

Uncomment error

8310834

warn for user compiled models

6ee8b50

stsievert added 3 commits March 1, 2021 22:18

Union[T, None] → Optional[T]

b88b74e

DOC: complete docstring

3a3a536

DOC: complete docstring

9808cf2

adriangb reviewed Mar 2, 2021

View reviewed changes

update warning with more recommendations

0cf7610

stsievert added 6 commits March 3, 2021 20:09

TST: all classification losses

d4c3eea

...in user compiled model with N_CLASSES outputs

Re-initialize

7fab517

tmp

59e7012

loss_name is None

0386e4e

black

3a46538

Remove backticks

8f2b00b

stsievert added 7 commits March 3, 2021 21:12

typing for utils/*_name

e80338b

raise

7e23480

API: loss_name / metric_name return None

94df48a

try cce

35e1a6c

catch loss is not None

f092b7a

tmp

5af8b4c

typo

7b38bc8

adriangb mentioned this pull request Mar 4, 2021

Add loss="auto" as the default loss #210

Open

1 task

adriangb mentioned this pull request Mar 4, 2021

DOC: Better type hints #211

Open

adriangb mentioned this pull request Mar 6, 2021

MAINT: remove test dependance on linting #217

Merged

stsievert mentioned this pull request Mar 8, 2021

MAINT: better error message about one-hot encoded targets w/ loss="auto" #218

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: Add default losses to KerasClassifier and KerasRegressor #208

ENH: Add default losses to KerasClassifier and KerasRegressor #208

stsievert commented Feb 27, 2021 •

edited

Loading

adriangb left a comment

adriangb Feb 27, 2021

adriangb Feb 27, 2021

stsievert Feb 28, 2021

adriangb Feb 27, 2021

stsievert Feb 28, 2021

adriangb Feb 28, 2021

adriangb Feb 28, 2021

adriangb commented Feb 28, 2021

adriangb left a comment

stsievert commented Mar 2, 2021 •

edited

Loading

adriangb left a comment

adriangb Mar 2, 2021

stsievert Mar 2, 2021

adriangb Mar 2, 2021

stsievert commented Mar 2, 2021 •

edited

Loading

adriangb commented Mar 3, 2021 •

edited

Loading

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021 •

edited

Loading

adriangb commented Mar 4, 2021

adriangb commented Mar 6, 2021

ENH: Add default losses to KerasClassifier and KerasRegressor #208

Are you sure you want to change the base?

ENH: Add default losses to KerasClassifier and KerasRegressor #208

Conversation

stsievert commented Feb 27, 2021 • edited Loading

adriangb left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adriangb commented Feb 28, 2021

adriangb left a comment

Choose a reason for hiding this comment

stsievert commented Mar 2, 2021 • edited Loading

adriangb left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stsievert commented Mar 2, 2021 • edited Loading

adriangb commented Mar 3, 2021 • edited Loading

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021

adriangb commented Mar 4, 2021

stsievert commented Mar 4, 2021 • edited Loading

adriangb commented Mar 4, 2021

adriangb commented Mar 6, 2021

stsievert commented Feb 27, 2021 •

edited

Loading

stsievert commented Mar 2, 2021 •

edited

Loading

stsievert commented Mar 2, 2021 •

edited

Loading

adriangb commented Mar 3, 2021 •

edited

Loading

stsievert commented Mar 4, 2021 •

edited

Loading