How to customize ImageDataGenerator in order to modify the target variable values?
Question:
I have a dataset of images annotated with coordinates representing facial keypoints. As I would like to augment the dataset, I am looking for a way to implement the ImageDataGenerator so that it modifies the target variables according to the image transformation (for example, if the image is flipped horizontally, the x coordinates should be set to image_width - x, or if the image is padded, the x coordinates should be set to x - padding).
I would be thankful for any suggestion or resource that might illustrate any similar example.
Thanks,
Niko
Answers:
The easiest and most obvious way is to modify the Keras code (you can find the full implementation of ImageDataGenerator here). However, Keras does provide an elegant API to deal with this, although it is not very well documented.
Writing a generator
We would need to write a new Keras generator, inheriting from the Iterator class. The Iterator class itself is just a convenient child class of Sequence, a detailed tutorial of which can be found here.
from keras.preprocessing.image import Iterator, ImageDataGenerator


class MyIterator(Iterator):
    """This is a toy example of a wrapper around ImageDataGenerator"""

    def __init__(self, n, batch_size, shuffle, seed, **kwargs):
        super().__init__(n, batch_size, shuffle, seed)

        # Load any data you need here (CSV, HDF5, raw files). The code
        # below is just pseudo-code for demonstration purposes. The arrays
        # are stored on self so that the batch method below can reach them.
        self.input_images = ...
        self.ground_truth = ...

        # Here is our beloved image augmentor <3
        self.generator = ImageDataGenerator(**kwargs)

    def _get_batches_of_transformed_samples(self, index_array):
        """Gets a batch of transformed samples from array of indices"""

        # Get a batch of image data
        batch_x = self.input_images[index_array].copy()
        batch_y = self.ground_truth[index_array].copy()

        # Transform the inputs and correct the outputs accordingly
        for i, (x, y) in enumerate(zip(batch_x, batch_y)):
            transform_params = self.generator.get_random_transform(x.shape)
            batch_x[i] = self.generator.apply_transform(x, transform_params)
            # process_outputs_accordingly is a placeholder: write it yourself
            # so that it maps the targets through the same transform.
            batch_y[i] = process_outputs_accordingly(y, transform_params)

        return batch_x, batch_y
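As a rough illustration of what process_outputs_accordingly could look like, here is a minimal sketch that handles flips only. It assumes that transform_params carries the boolean keys flip_horizontal and flip_vertical (which is what get_random_transform returns in the Keras versions I have seen), that keypoints are stored as (x, y) pixel indices, and that the image shape is passed in as an extra argument. The name flip_keypoints and its signature are hypothetical. Rotations, shifts, shears and zooms would need their own handling.

```python
import numpy as np


def flip_keypoints(keypoints, transform_params, img_shape):
    """Mirror (x, y) keypoints to match the flips recorded in transform_params.

    keypoints: array of shape (num_points, 2), columns are (x, y) pixel indices.
    img_shape: (height, width, channels) of the image being transformed.
    Only flips are handled; other transforms are ignored in this sketch.
    """
    height, width = img_shape[0], img_shape[1]
    out = np.array(keypoints, dtype=float)  # work on a copy

    # With pixel indices, column j moves to column (width - 1 - j).
    # For continuous coordinates you may prefer width - x instead.
    if transform_params.get("flip_horizontal", False):
        out[:, 0] = (width - 1) - out[:, 0]
    if transform_params.get("flip_vertical", False):
        out[:, 1] = (height - 1) - out[:, 1]
    return out


# Toy usage with a hand-written parameter dict (an assumption for the demo)
params = {"flip_horizontal": True, "flip_vertical": False}
points = np.array([[10.0, 20.0], [30.0, 40.0]])
flipped = flip_keypoints(points, params, img_shape=(96, 96, 1))
```

Depending on the annotation scheme, a horizontal flip may also require swapping left/right labels (for example, left eye and right eye), which this sketch does not do.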
Why do we inherit from Keras Iterator?
It is recommended to have your generators inherit from keras.utils.Sequence (which is also the base class of Iterator). It allows the data to be loaded in parallel across several threads.
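If you do not need ImageDataGenerator at all, a direct keras.utils.Sequence subclass may be enough. The sketch below is only an assumption-laden example: the class name KeypointSequence, the array layouts, and the fixed 0.5 flip probability are all made up for illustration, and only horizontal flips are implemented. The import fallback is there purely so the snippet runs without TensorFlow installed.

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # fallback so the sketch still runs without TensorFlow
    Sequence = object


class KeypointSequence(Sequence):
    """Hypothetical batch loader that flips images and keypoints together."""

    def __init__(self, images, keypoints, batch_size=32, seed=0):
        super().__init__()
        self.images = images        # shape (n, height, width, channels)
        self.keypoints = keypoints  # shape (n, num_points, 2), columns (x, y)
        self.batch_size = batch_size
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, idx):
        rows = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        batch_x = self.images[rows].copy()
        batch_y = self.keypoints[rows].astype(float)  # astype returns a copy
        width = batch_x.shape[2]

        for i in range(len(batch_x)):
            if self.rng.random() < 0.5:
                # Mirror the image along its width axis ...
                batch_x[i] = batch_x[i][:, ::-1, :].copy()
                # ... and mirror the x coordinates to match (pixel indices)
                batch_y[i, :, 0] = (width - 1) - batch_y[i, :, 0]
        return batch_x, batch_y
```

An instance can then be passed to model.fit in place of the arrays, since fit accepts Sequence objects.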
I want the same training API!
You can write a custom generator class with the methods flow, flow_from_directory, and flow_from_dataframe, which are the core functions of the Keras API.
To illustrate how to create a Keras Image Generator that will modify the target labels when an image is flipped, I have created a GitHub repo at: https://github.com/frobertpixto/tf_keras_generator_with_targets
The repo contains:
- A Jupyter notebook with the Generator class and calls to the Generator to illustrate the data augmentation.
- An example of how to call model.fit using the Generator
I defined a simple function that wraps a generator by changing its __next__ function. All you need to do is define a function that takes a batch of data x, y and outputs a modified version:
def my_wrapper(x, y):
    return transform(x), y

wrap_generator(train_generator, my_wrapper)
Here’s the definition of wrap_generator:
def wrap_generator(gen, wrapper, restore_original=False):
    '''Wrap the generator's __next__ with a given wrapper function.

    NOTE:
    Calling this multiple times won't stack up multiple wrappers. Instead, the very
    first/original __next__ method is stored and wrapped with a new function each time.
    To restore the original __next__ function, call
    wrap_generator(gen, None, restore_original=True).

    CAVEAT:
    The patch is set on the instance. Python resolves next(gen) and for-loops
    through type(gen).__next__, so only explicit gen.__next__() calls see the wrapper.

    Parameters
    ----------
    gen : iterator produced by tensorflow.keras.preprocessing.image.ImageDataGenerator
        a fully initialized batch iterator (for example, the object returned by flow)
    wrapper : function (x, y) -> (x, y)
        a simple function that takes one batch of data (like the ones the generator
        produces) and outputs a modified version of it
    restore_original : boolean
        if true, the original __next__ will be restored

    Returns
    -------
    the same generator object (for convenience)

    Example of usage:
        def my_wrapper(x, y):
            return transform(x), y
        wrap_generator(train_generator, my_wrapper)
    '''
    # store the original __next__ method in the generator (if not done before)
    if not hasattr(gen, '_original_next'):
        gen._original_next = gen.__next__

    # Restore original generator if demanded
    if restore_original:
        gen.__next__ = gen._original_next
        del gen._original_next
        return gen

    # wrap original _original_next method with the wrapper function
    def fixed_next():
        x, y = gen._original_next()
        return wrapper(x, y)

    gen.__next__ = fixed_next
    return gen
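Because Python looks special methods such as __next__ up on the class rather than on the instance, an alternative that does not rely on patching is a plain generator function around the original iterator. This is only a sketch: the name wrapped_batches is hypothetical, and the toy list of batches stands in for a real Keras iterator.

```python
def wrapped_batches(gen, wrapper):
    """Yield wrapper(x, y) for every (x, y) batch produced by gen."""
    for x, y in gen:
        yield wrapper(x, y)


# Stand-in for a Keras batch iterator: any iterable of (x, y) pairs works.
toy_batches = [([1, 2], [10, 20]), ([3, 4], [30, 40])]


def double_targets(x, y):
    # Example wrapper: leave the inputs alone, modify the targets
    return x, [2 * value for value in y]


results = list(wrapped_batches(toy_batches, double_targets))
```

Keras batch iterators typically loop forever, so when feeding such a generator to model.fit you would normally also pass steps_per_epoch.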