Weights&Biases Sweep Keras K-Fold Validation
Question:
I’m using Weights & Biases cloud-based sweeps with Keras.
First, I create a new sweep within a W&B project with a config like the following:
description: LSTM Model
method: random
metric:
  goal: maximize
  name: val_accuracy
name: LSTM-Sweep
parameters:
  batch_size:
    distribution: int_uniform
    max: 128
    min: 32
  epochs:
    distribution: constant
    value: 200
  node_size1:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size2:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size3:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size4:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  node_size5:
    distribution: categorical
    values:
      - 64
      - 128
      - 256
  num_layers:
    distribution: categorical
    values:
      - 1
      - 2
      - 3
  optimizer:
    distribution: categorical
    values:
      - Adam
      - Adamax
      - Adagrad
  path:
    distribution: constant
    value: "./path/to/data/"
program: sweep.py
project: SLR
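For reference, the same configuration can also be built as a Python dict and registered programmatically with `wandb.sweep` instead of a YAML file. This is only a sketch; the `wandb.sweep()` call itself is commented out because it requires a logged-in W&B session:

```python
# Sketch: the YAML sweep config above expressed as a Python dict.
# The wandb.sweep() call is commented out because it needs W&B credentials.
node_size = {"distribution": "categorical", "values": [64, 128, 256]}

sweep_config = {
    "description": "LSTM Model",
    "method": "random",
    "metric": {"goal": "maximize", "name": "val_accuracy"},
    "name": "LSTM-Sweep",
    "program": "sweep.py",
    "parameters": {
        "batch_size": {"distribution": "int_uniform", "min": 32, "max": 128},
        "epochs": {"distribution": "constant", "value": 200},
        # node_size1 .. node_size5 all share the same categorical choices
        **{f"node_size{i}": dict(node_size) for i in range(1, 6)},
        "num_layers": {"distribution": "categorical", "values": [1, 2, 3]},
        "optimizer": {"distribution": "categorical",
                      "values": ["Adam", "Adamax", "Adagrad"]},
        "path": {"distribution": "constant", "value": "./path/to/data/"},
    },
}

# import wandb
# sweep_id = wandb.sweep(sweep_config, project="SLR")
```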
My sweep.py file looks something like this:
# imports
init = wandb.init(project="my-project", reinit=True)
config = wandb.config

def main():
    skfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=7)
    cvscores = []
    group_id = wandb.util.generate_id()
    X, y =  # load data
    i = 0
    for train, test in skfold.split(X, y):
        i = i + 1
        run = wandb.init(group=group_id, reinit=True, name=group_id + "#" + str(i))
        model =  # build model
        model.fit([...], callbacks=[WandbCallback()])
        cvscores.append([...])
        wandb.join()

if __name__ == "__main__":
    main()
I start this with the wandb agent command from the folder containing sweep.py.
What I experienced with this setup is that the first wandb.init() call initializes a new run. Okay, I could just remove that. But when wandb.init() is called a second time, it seems to lose track of the sweep it is running in. Online, an empty run is listed in the sweep (created by the first wandb.init() call); all the other runs are listed in the project, but not in the sweep.
My goal is to have one run for each fold of the k-fold cross-validation. At least, I thought that would be the right way to do this.
Is there a different approach to combining sweeps with Keras k-fold cross-validation?
Answers:
We put together an example of how to accomplish k-fold cross validation:
https://github.com/wandb/examples/tree/master/examples/wandb-sweeps/sweeps-cross-validation
The solution requires some contortions to get the wandb library to spawn multiple jobs on behalf of a launched sweep job.
The basic idea is:
- The agent requests a new set of parameters from the cloud-hosted parameter server. This is the run called sweep_run in the main function.
- The sweep run sends information about what each fold should process over a multiprocessing queue to waiting processes
- Each spawned process logs to its own run, organized with group and job_type to enable auto-grouping in the UI
- When a process is finished, it sends its primary metric over a queue back to the parent sweep run
- The sweep run reads the metrics from the child runs and logs them to the sweep run, so that the sweep can use that result to influence future parameter choices and/or Hyperband early-termination optimizations
Example visualizations of the sweep and the k-fold grouping can be found in the linked repository.