Use a parameter multiple times in hydra config file

Question:

I am currently trying to replace the usage of argparse with hydra files to set the hyperparameters of a deep learning neural network.

I succeeded in using a config.yaml file linked to a hydra main file to run a training and a prediction.

However, I am loading three .py files for the process and there are some common parameters between them (file path, number of labels for example).

Is there a way to reuse a parameter several times in a config.yaml file with Hydra?

Main file structure:

import time
import warnings

import hydra
from omegaconf import DictConfig, OmegaConf
from segmentation_monai import split, train, predict

warnings.filterwarnings('ignore', category=UserWarning)

@hydra.main(config_path='.', config_name="config_bis")
def my_param(cfg: DictConfig) -> None:
    # Each stage runs only when its `run` flag is set in the config.
    if cfg.split.run: split.main(cfg.split)
    if cfg.train.run: train.main(cfg.train)
    if cfg.predict.run: predict.main(cfg.predict)

if __name__ == "__main__":
    my_param()

Config file:

split:
  run: False
#  mandatory:
  root_path: D:/breast_seg/db_test
  data_dim: 3
  train_dim: 3
  [...]

train:
  run: False
# mandatory:
  root_path: D:/breast_seg/db_test
  data_dim: 3
  train_dim: 3
  [...]

predict:
  run: True
# mandatory:
  root_path: D:/breast_seg/db_test
  data_dim: 3
  train_dim: 3
  [...]

Thank you.

Asked By: ofares


Answers:

You can use the same parameter multiple times in the config by using OmegaConf interpolations.


# Extracting to an individual config node. 
# You can also reuse one of your own nodes for this.
data:
  root_path: D:/breast_seg/db_test
  data_dim: 3
  train_dim: 3

split:
  run: False
#  mandatory:
  root_path: ${data.root_path}
  data_dim: ${data.data_dim}
  train_dim: ${data.train_dim}
  [...]

train:
  run: False
# mandatory:
  root_path: ${data.root_path}
  data_dim: ${data.data_dim}
  train_dim: ${data.train_dim}
  [...]

predict:
  run: True
# mandatory:
  root_path: ${data.root_path}
  data_dim: ${data.data_dim}
  train_dim: ${data.train_dim}
  [...]
Answered By: Omry Yadan

The accepted answer still violates DRY (don't repeat yourself) in a significant way.

Add a single parameter to the common data set and all three sections (split, train, predict) must be edited as well.

(I’m learning Hydra myself… for a strange reason, I’ve struggled in getting it, but working through this has helped…)

The OP question: Is there a way of using a parameter several times in a config.yaml file supported by hydra ?

Hydra can solve this cleanly, as the first example shows. The second example shows how to extend it so that split, train, and predict can each have different params.

Primary Example

The example below is derived from the Hydra "Overriding Packages" documentation.

First let’s look at the output:

cfg:
------------
split:
  params:
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
  run: true
train:
  params:
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
  run: false
predict:
  params:
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
  run: false

split.main:
------------
root_path: D:/breast_seg/db_test
data_dim: 3
train_dim: 3

This shows that each of the three config sets (split, train, predict) receives the common data params.

Note that the common data, and the custom params in the next example, are all held under the params key. This lets run act purely as a switch for invoking a stage, e.g. cfg.split.run, while only the stage's actual parameters are passed on, e.g. cfg.split.params.

The code which produced the above:

# ----- myapp.py
import time
from omegaconf import DictConfig, OmegaConf
import hydra

config_name = 'config_with_base_plus_custom.yaml'   # secondary example
config_name = 'config.yaml'                         # primary example

@hydra.main(version_base='1.2', config_path='conf', config_name=config_name )
def my_param( cfg : DictConfig ) -> None:

    resolve = True

    print(f'cfg:\n------------\n{OmegaConf.to_yaml(cfg)}\n')

    if cfg.split.run:   print(f'split.main:\n------------\n{OmegaConf.to_yaml(cfg.split.params)}')
    if cfg.train.run:   print(f'train.main:\n------------\n{OmegaConf.to_yaml(cfg.train.params)}')
    if cfg.predict.run: print(f'predict.main:\n------------\n{OmegaConf.to_yaml(cfg.predict.params)}')


if __name__ == "__main__":
    my_param()
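
In a real application, the print calls would be replaced by the actual stage entry points. Here is a minimal sketch of that dispatch, using a plain dict in place of the composed Hydra config and stub functions in place of split.main, train.main and predict.main. The stubs, the stages mapping and run_enabled_stages are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Plain-dict stand-in for the composed config (same shape as the output above).
common = {"root_path": "D:/breast_seg/db_test", "data_dim": 3, "train_dim": 3}
cfg: Dict[str, Dict[str, Any]] = {
    "split":   {"params": dict(common), "run": True},
    "train":   {"params": dict(common), "run": False},
    "predict": {"params": dict(common), "run": False},
}

# Records which stub was called and with what, so the effect is visible.
received: List[Tuple[str, Dict[str, Any]]] = []

def make_stub(name: str) -> Callable[[Dict[str, Any]], None]:
    """Build a stand-in for a stage's main(); it only records its arguments."""
    def stub(params: Dict[str, Any]) -> None:
        received.append((name, params))
    return stub

# Hypothetical mapping from config section to stage entry point.
stages = {name: make_stub(name) for name in ("split", "train", "predict")}

def run_enabled_stages(config, entry_points) -> List[str]:
    """Call each stage whose `run` flag is set, passing only its `params`."""
    ran = []
    for name, entry_point in entry_points.items():
        section = config[name]
        if section["run"]:
            entry_point(section["params"])
            ran.append(name)
    return ran

ran = run_enabled_stages(cfg, stages)
print(ran)
```

Each stage function sees only its own params and never the run switch.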

Directory structure and yaml files:

|- myapp.py
|- conf
   |- config.yaml
   |- params
      |- common.yaml

The @split.params suffix places the config found in params/common.yaml in the package split.params; likewise for the other two keys. See the referenced Hydra doc.

# ----- config.yaml
defaults:
    - params@split.params   : common
    - params@train.params   : common
    - params@predict.params : common
    - _self_

split:
    run: True

train:
    run: False

predict:
    run: False

# ----- common.yaml
root_path: 'D:/breast_seg/db_test'
data_dim: 3
train_dim: 3

This is really clean and DRY.

Need another common param? Simply place it in common.yaml and it will be populated in appropriate locations.

Secondary Extended Example

Now let's assume that split needs additional params of its own: a base set that can also be extended.

In myapp.py, swap the two config_name lines so that the secondary example's assignment comes last.

Extend the directory structure and add two yaml files:

|- myapp.py
|- conf
   |- config.yaml
   |- config_with_base_plus_custom.yaml
   |- params
      |- common.yaml
      |- split_base.yaml
      |- split_custom.yaml

config.yaml is unused here; it is left over from the prior example.

common.yaml is still used and remains unchanged.

The other three files are as follows:

# ----- config_with_base_plus_custom.yaml (an expansion of original config.yaml)
defaults:
    - params@split.params   : common
    - params@train.params   : common
    - params@predict.params : common
    - override params@split.params   : split_custom
    - _self_

split:
    run: True

train:
    run: False

predict:
    run: False

# ----- split_base.yaml
split_paramA: 'localhost'
split_paramB: 'base paramB'
split_paramC: ???
split_paramD: 'base paramD'

# ----- split_custom.yaml
defaults:
    - split_base
    - common

split_paramC: 'fills in required paramC'
split_paramD: 'custom paramD overrides base paramD'
split_paramE: 'unique to split custom'

The output is as follows:

cfg:
------------
split:
  params:
    split_paramA: localhost
    split_paramB: base paramB
    split_paramC: fills in required paramC
    split_paramD: custom paramD overrides base paramD
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
    split_paramE: unique to split custom
  run: true
train:
  params:
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
  run: false
predict:
  params:
    root_path: D:/breast_seg/db_test
    data_dim: 3
    train_dim: 3
  run: false


split.main:
------------
split_paramA: localhost
split_paramB: base paramB
split_paramC: fills in required paramC
split_paramD: custom paramD overrides base paramD
root_path: D:/breast_seg/db_test
data_dim: 3
train_dim: 3
split_paramE: unique to split custom

So several things to note:

  • The split key continues to have the same common data as the other two keys.
  • The split key gets additional params.
  • Those params come from a base file, which also has a mandatory value (???) that must be filled in later.
  • The params come from both the base and the custom.
  • split_paramA and split_paramB are only in the base.
  • split_paramC is filled in by custom.
  • split_paramD, which occurs in both base and custom, is overridden by the custom.
  • split_paramE is only in the custom, not in base.
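
The layering described in the list above can be pictured as successive dictionary merges, where a later layer overrides an earlier one. This is only a plain-Python analogy, not how Hydra is implemented. The merge order (split_base, then common, then the body of split_custom.yaml) is an assumption read off the defaults list of split_custom.yaml and the output shown above. MISSING, merge_layers and split_custom_body are names made up for this sketch.

```python
from typing import Any, Dict

MISSING = "???"  # the YAML marker for a value that must be supplied later

# Contents of the three YAML files, written as plain dicts.
split_base: Dict[str, Any] = {
    "split_paramA": "localhost",
    "split_paramB": "base paramB",
    "split_paramC": MISSING,
    "split_paramD": "base paramD",
}
common: Dict[str, Any] = {
    "root_path": "D:/breast_seg/db_test",
    "data_dim": 3,
    "train_dim": 3,
}
split_custom_body: Dict[str, Any] = {
    "split_paramC": "fills in required paramC",
    "split_paramD": "custom paramD overrides base paramD",
    "split_paramE": "unique to split custom",
}

def merge_layers(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge dicts left to right; later layers override earlier ones."""
    merged: Dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged

# Assumed order: the defaults of split_custom.yaml first, then its own body.
split_params = merge_layers(split_base, common, split_custom_body)

# Any key still set to MISSING would be a required value nobody supplied.
unfilled = [key for key, value in split_params.items() if value == MISSING]

for key, value in split_params.items():
    print(f"{key}: {value}")
```

Because an overridden key keeps the position where it was first defined, the keys print in the same order as in the split.params output above.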

Personally, I think Hydra provides an excellent, elegant solution once one figures it out. It's taken me a bit, and I'm still learning.

..Otto

Answered By: Otto Hirr