Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Recursive config interpolation with custom object resolvers #1758

Open
jalabort opened this issue Aug 10, 2021 · 8 comments · May be fixed by #2269
Open

[Feature Request] Recursive config interpolation with custom object resolvers #1758

jalabort opened this issue Aug 10, 2021 · 8 comments · May be fixed by #2269

Comments

@jalabort
Copy link

jalabort commented Aug 10, 2021

🚀 Feature Request

First of all, thanks for open sourcing this amazing project!

Is there a reason why we need to force interpolations to be eagearly resolved by this line, rather than lazily resolved as they are needed, when instantiating objects?

Looking at the commits history I can see this line was introduced in #1487 as part of a fix for #1483 and seems to also be related to #1295 as per this comment.

Motivation

The reason I am asking is because eagerly resolving the config prevents the instantiation of this minimal example:

import hydra
from omegaconf import OmegaConf


class Encoder:
    def __init__(self, param1: bool, param2: int) -> None:
        OmegaConf.register_new_resolver(
            "encoder",
            lambda x: getattr(self, x),
            replace=True,
            use_cache=False,
        )

        self.param1 = param1
        self.param2 = param2

    @property
    def out_channels(self) -> int:
        if self.param1:
            return 10
        else:
            return 100


class Decoder:
    def __init__(self, in_channels: int, param2: int) -> None:
        OmegaConf.register_new_resolver(
            "decoder",
            lambda x: getattr(self, x),
            replace=True,
            use_cache=False,
        )
        self.in_channels = in_channels
        self.param2 = param2

    @property
    def out_channels(self) -> int:
        return self.in_channels // 2


class Model:
    def __init__(self, encoder: Encoder, decoder: Decoder):
        param2 = 5
        self.encoder = encoder or Encoder(param1=False, param2=param2)
        self.decoder = decoder or Decoder(
            in_channels=self.encoder.out_channels, param2=param2
        )


if __name__ == "__main__":
    cfg = OmegaConf.create(
        {
            "_target_": "__main__.Model",
            "encoder": {"_target_": "__main__.Encoder", "param1": True, "param2": 5},
            "decoder": {
                "_target_": "__main__.Decoder",
                "in_channels": "${encoder:out_channels}",
                "param2": "${..encoder.param2}",
            },
        }
    )

    m = hydra.utils.instantiate(cfg)
    assert m.decoder.out_channels == 5

More specifically, the eager resolution of the config prevents the correct instantiation of the decoder, in this example, because "in_channels": "${encoder:out_channels}" cannot be resolved (the resolver for theencoder does not yet exist!).

This example is a simplification of some of my encoder/decoder model configs but, hopefully, it communicates the idea in a clear manner.

Pitch

Removing the eager resolution of the config by removing this line would allow this kind of recursive config interpolation with custom object resolvers to be correctly instantiated by the current hydra.utils.instantitate function. I am not sure what the consequences if doing that would be and, hence, I filled out this issue.

Alternatively, this type of recursive config interpolation with custom object resolvers might be an antipattern that I am not aware of. In this case, it would be good to have a little explanation regarding why this might be problematic.

@jalabort jalabort added the enhancement Enhanvement request label Aug 10, 2021
@jalabort jalabort changed the title [Feature Request] [Feature Request] Recursive config interpolation with custom object resolvers Aug 10, 2021
@Jasha10
Copy link
Collaborator

Jasha10 commented Aug 10, 2021

Hi @jalabort,

This is an interesting idea. In theory, it should be possible to use Encoder.__init__(self, ...) to register a resolver that points to self -- so long as the call to OmegaConf.register_new_resolver happens before any interpolations referencing the new resolver are dereferenced.

In practice, it may be safer to register your resolvers before using them in interpolations.

I believe that the line you mentioned was introduced to enable arguments toinstantiate to be detached from their parent configs; if OmegaConf.resolve is not called first, then detaching a config from its parent config would break any relative interpolations that go through the parent node (see this part of the diff from the same PR that you mentioned).
Detaching config objects from their parent configs seems to be a fix for some issues with serialization (à la #1483 and #1295). I suspect that removing the line in question would require re-addressing some of the logic introduced by that PR.

@omry
Copy link
Collaborator

omry commented Aug 11, 2021

Is there a reason why we need to force interpolations to be eagearly resolved by this line, rather than lazily resolved as they are needed, when instantiating objects?

Yes, as Jasha said - this is related to issues with config nodes being attached to the config tree, causing issues in some scenarios.
There are no plans of changing this behavior.

@jalabort
Copy link
Author

jalabort commented Aug 11, 2021

Thanks for your swift replies @Jasha10 and @omry!

Would you be open to consider a PR where the eager resolution happens inside the _call_target function?

Essentially, removing line 175 from _instantiate2.py and changing the function _call_target on that same file to:

def _call_target(_target_: Callable, *args, **kwargs) -> Any:  # type: ignore
    """Call target (type) with args and kwargs."""
    try:
        args, kwargs = _extract_pos_args(*args, **kwargs)
        # Detaching configs from parent.
        # The parent link can cause issues when serializing objects in
        # some scenarios.
        for arg in args:
            if OmegaConf.is_config(arg):
                # Ensure everything is resolved
                OmegaConf.resolve(arg)
                arg._set_parent(None)
        for v in kwargs.values():
            if OmegaConf.is_config(v):
                # Ensure everything is resolved
                OmegaConf.resolve(v)
                v._set_parent(None)

        return _target_(*args, **kwargs)
    except Exception as e:
        raise type(e)(
            f"Error instantiating '{_convert_target_to_string(_target_)}' : {e}"
        ).with_traceback(sys.exc_info()[2])

This change seems to generate the expected output for the examples given in #1483 and #1295 and it also covers my usecase of recursive config interpolation with custom object resolvers. I had a look at your contributing and develoment guide sections and I believe the current tests for the instantiation part also pass (pytest tests/instantiate) with the proposed change.

When you have time, let me know your thoughts on this; I'd be happy to open a PR and try to contribute this change if you think it has merit.

Thanks again!

@omry
Copy link
Collaborator

omry commented Aug 11, 2021

I think making instantiate more nuanced is not something I want to do.
As you have noticed, the code is complicated and supports many different use cases simultaneously.
Such a solution would only add more constraints to an already very complicated system, and in return it will work for your particular use case but will also introduce inconsistent behavior for everyone else.

I am open to consider a per-node configuration to turn off eager interpolation resolution similar to flags like _recursive_ and _conver_mode_ .
Keep in mind that this will not make it to Hydra 1.1 and we have not yet started to develop 1.2 (It will take months before we start working on the next version).

@jalabort
Copy link
Author

jalabort commented Aug 12, 2021

Understood.

I'd be happy to contribute a per-node configuration solution to enable this feature request; if you could give me a gentle nudge when the time is right for me to do so that'd be appreciated.

Meanwhile I'll patch up your _call_target function on my end.

Thanks for the prompt reply once again.

@omry omry added this to the Hydra 1.2.0 milestone Aug 12, 2021
@Jasha10
Copy link
Collaborator

Jasha10 commented Sep 19, 2021

I'd be happy to contribute a per-node configuration solution to enable this feature request; if you could give me a gentle nudge when the time is right for me to do so that'd be appreciated.

@jalabort FYI work on Hydra 1.2 has begun.

@ibro45
Copy link

ibro45 commented Oct 11, 2022

This is a super useful feature that I've actually been waiting for for a while, do you think it'll make it into 1.3.0? Thanks!

@Jasha10
Copy link
Collaborator

Jasha10 commented Oct 18, 2022

Hi @ibro45,
I think it's unlikely we'll be able to implement this in time for 1.3 due to bandwidth constraints.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants