Store user attributes in Optuna Sweeper plugin for Hydra

Question:

How can I store additional information in an Optuna trial when running Optuna through the Hydra sweeper plugin?

My use case is as follows:
I want to optimize a bunch of hyperparameters. I am storing all reproducibility information of all experiments (i.e., trials) in a separate database.
I know I can get the best values via optuna.load_study().best_params or even best_trial. However, that only lets me replicate the experiment by rerunning it, which can take quite some time. To avoid this, I need to link each trial to my own database: I would like to store the ID of my database record somewhere in the trial object.

Without using Hydra, I suppose I’d set User Attributes on the trial. However, with Hydra abstracting all that away, there seems to be no option to do so.

I know that I can just query my own database for the exact combination of best params that optuna found, but that just seems like a difficult solution to a simple problem.

Some minimal code:

from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING


@dataclass
class TrainConfig:
    x: float | int = MISSING
    y: int = MISSING
    z: int | None = None


ConfigStore.instance().store(name="config", node=TrainConfig)


@hydra.main(version_base=None, config_path="conf", config_name="sweep")
def sphere(cfg: TrainConfig) -> float:
    x: float = cfg.x
    y: float = cfg.y
    return x**2 + y**2


if __name__ == "__main__":
    sphere()

And the config, conf/sweep.yaml:

defaults:
  - override hydra/sweeper: optuna
  - override hydra/sweeper/sampler: tpe

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: sphere
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5, 0, 5)
      z: choice(0, 3, 5)

x: 1
y: 1
z: 1
Asked By: Michel Kok

||

Answers:

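A hacky solution is to use the sweeper's custom_search_space option. It points to a function that receives the trial object, and that function can call trial.set_user_attr.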

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: sphere
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5, 0, 5)
      z: choice([0, 1], [2, 3], [2, 5])
    custom_search_space: package.run.configure
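
with configure defined in package/run.py:

from optuna.trial import Trial

def configure(_, trial: Trial) -> None:
    trial.set_user_attr("experiment_db_id", 123456)  # ID of the record in my own database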
Answered By: Michel Kok