Random orthogonal, 90 degrees rotation with ImageDataGenerator
Question:
I use the following code to train my CNN model on invoice images.
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size)
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
The problem is that my training data set contains only upright images. All my images look like the following image:
After training, when I send an image like the one below, my model fails to predict its correct class.
As you can see below, I pass horizontal_flip = True to ImageDataGenerator:
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)
How can I change my code so that the model can also classify flipped images? Or should I add manually flipped images to my training dataset?
Answers:
I would rotate the images randomly with ImageDataGenerator. Just specify the following argument:
rotation_range: Int. Degree range for random rotations.
Note that this picks an arbitrary angle within the range, not only multiples of 90 degrees.
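To see what rotation_range implies, the sketch below imitates the sampling with plain NumPy. As far as I know, Keras draws the angle uniformly from [-rotation_range, rotation_range], so the result is almost never an exact right angle. The helper name sample_rotation_angle is made up for illustration and is not part of Keras.

```python
import numpy as np


def sample_rotation_angle(rotation_range, rng):
    # Hypothetical helper (not a Keras function): draw one angle in degrees,
    # uniformly from [-rotation_range, rotation_range].
    return rng.uniform(-rotation_range, rotation_range)


rng = np.random.default_rng(0)
angles = [sample_rotation_angle(90, rng) for _ in range(5)]
print(angles)  # five arbitrary angles between -90 and 90 degrees
```

So rotation_range=90 helps with slightly tilted scans, but it does not restrict the augmentation to orthogonal rotations.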
Or, you can pass a preprocessing function to ImageDataGenerator, which gives you more flexibility.
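import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def orthogonal_rot(image):
    # Rotate by -90, 0, or 90 degrees, chosen at random.
    return np.rot90(image, np.random.choice([-1, 0, 1]))

train_datagen = ImageDataGenerator(
    preprocessing_function=orthogonal_rot)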
This function rotates the image by either -90, 0, or 90 degrees.
(np.rot90() rotates the image by 90 degrees times its second argument, counterclockwise for positive values. Accordingly, -1 gives -90 degrees, 0 gives no rotation, 1 gives 90 degrees, and 2 would give 180 degrees.)
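A quick way to check this behaviour is to rotate a tiny array by hand. The sketch below assumes a channels-last (height, width, channels) layout and uses a made-up 2x3 array in place of a real image.

```python
import numpy as np

# A tiny 2x3 single-channel "image" in (height, width, channels) layout.
image = np.arange(6).reshape(2, 3, 1)

rot_ccw = np.rot90(image, 1)    # 90 degrees counterclockwise
rot_cw = np.rot90(image, -1)    # 90 degrees clockwise
rot_180 = np.rot90(image, 2)    # 180 degrees
unchanged = np.rot90(image, 0)  # no rotation

# Height and width swap for the 90 degree rotations, but not for 180.
print(image.shape, rot_ccw.shape, rot_cw.shape, rot_180.shape)
```

By default np.rot90 rotates in the plane of the first two axes, so the channel axis is left alone.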
If you need right-angle rotations only, you can write a preprocessing function that uses the Keras apply_affine_transform function and pass it to ImageDataGenerator via the preprocessing_function argument. With this approach, you can also use the same fill_mode for the right-angle rotation and for the rest of the data generation.
Documentation apply_affine_transform
Documentation ImageDataGenerator
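import random

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing.image import apply_affine_transform

FILL_MODE = 'nearest'

def right_angle_rotate(input_image):
    # Pick a right-angle rotation at random; 0 leaves the image unchanged.
    angle = random.choice([0, 90, 180, 270])
    if angle != 0:
        # Axes are given explicitly for channels-last (height, width, channels)
        # images, because the defaults are not the same in every Keras version.
        input_image = apply_affine_transform(
            input_image, theta=angle,
            row_axis=0, col_axis=1, channel_axis=2,
            fill_mode=FILL_MODE)
    return input_image

data_gen = ImageDataGenerator(
    fill_mode=FILL_MODE,
    preprocessing_function=right_angle_rotate)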
However, the numpy.rot90 function will raise an exception if your input images are rectangular (not square), because the image shape no longer matches the expected input size after 90° and 270° rotations.
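If you still want to use numpy.rot90, one way around this is to allow only rotations that keep the shape: all four right angles for square images, and only 0° and 180° otherwise. This is a minimal sketch, assuming images arrive in (height, width, channels) layout; the name shape_preserving_rot is hypothetical.

```python
import numpy as np


def shape_preserving_rot(image):
    # Hypothetical helper: rotate by a random multiple of 90 degrees,
    # restricted to rotations that leave the array shape unchanged.
    height, width = image.shape[:2]
    if height == width:
        allowed_k = [0, 1, 2, 3]  # square: every right angle keeps the shape
    else:
        allowed_k = [0, 2]        # rectangular: only 0 and 180 degrees do
    return np.rot90(image, np.random.choice(allowed_k))


rect = np.zeros((4, 6, 3))
print(shape_preserving_rot(rect).shape)  # same shape as the input
```

It can be passed as preprocessing_function like the functions above. If you need 90° rotations of rectangular images, choosing a square target_size is another option, since the resized images then keep their shape under every right-angle rotation.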