Is hierarchical configs desirable or boiler plate? #2836
-
I am new to hydra and one of the things that confuses me is the duality between Python code and hydra config. I have seen projects using hydra that have a However, this means that (1) default values are potentially specified in two places (in the code and config) and (2) forces developers to go back and forth between the code and configs to really understand what is going on. So my question is: is this pattern considered desirable, or if it is done only because it is necessary to enable configuration of a deeply nested parameter. As a concrete example, consider # main.py
class DefaultCutoff:
def __init__(self, threshold: float = 0.1):
self.threshold = threshold
class DefaultPostProcessor:
def __init__(self, cutoff):
self.cutoff = cutoff
class App:
def __init__(self, pre_processor = None, main_processor = None, post_processor = None):
self.pre_processor = pre_processor or DefaultPreProcessor()
self.main_processor = main_processor or DefaultMainProcessor()
self.post_processor = post_processor or DefaultPostProcessor() In order to be able to configure app:
_target_: main.App
post_processor:
_target_: main.DefaultPostProcessor
cutoff:
_target_: main.DefaultCutoff
threshold: 0.1 # default value is duplicated! In an ideal world, I would like to keep the default value in the code. It helps the reader understand the meaning of the value because it is close to the docstring. Also the reader doesn't have to jump to the corresponding config file to understand what a piece of code does most of the time. However, this is not the case when we instantiate a nested object with As an alternative, I can imagine putting everything in the default argument # main.py
class DefaultCutoff:
def __init__(self, threshold: float = 0.1):
self.threshold = threshold
class DefaultPostProcessor:
def __init__(self, cutoff_factory = DefaultCutoff):
self.cutoff = cutoff_factory()
class App:
def __init__(
self,
pre_processor_factory = DefaultPreProcessor,
main_processor_factory = DefaultMainProcessor,
post_processor_factory = DefaultPostProcessor,
):
self.pre_processor = pre_processor_factory()
self.main_processor = main_processor_factory()
self.post_processor = post_processor_factory() This is functionally the same as the first app:
_target_: main.App
post_processor_factory:
_partial_: True
cutoff_factory:
_partial_: True
threshold: 0.5 Here, I am using app:
_target_: main.App
post_processor_factory:
cutoff_factory:
threshold: 0.5 because when the default value is a callable, I believe something like this or similar (with a support from hydra) could make it easier for developers to understand the code (because of the presence of the default values), reduce duplication between code and configs, and reduce the config file to describe only the non-default values. I would love to hear your thoughs! |
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 2 replies
-
Are you familiar with structured configs? In many cases you should be able to leverage them to keep everything in the code. If this isn't enough, you may also be interested in hydra-zen. |
Beta Was this translation helpful? Give feedback.
-
I believe hydra-zen has. |
Beta Was this translation helpful? Give feedback.
I believe hydra-zen has.