How to use to_categorical when using ImageDataGenerator

Question:

I am using Keras to classify images (multiple classes) and I’m using ImageDataGenerator. It automatically finds all of the classes, but it doesn’t seem to store the labels in any variable. I figured I need to use to_categorical to store my labels in matrix form, but I don’t know where to use it.

Here is a snippet of my code:

...
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# generator for training
train_generator = datagen.flow_from_directory(
    train_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

# generator for validation
val_generator = datagen.flow_from_directory(
    val_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

# generator for testing
test_generator = datagen.flow_from_directory(
    test_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

# train
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=val_generator,
    validation_steps=nb_validation_samples // batch_size)

The generators just print “Found 442 images belonging to 5 classes.” or something like that. How can I use to_categorical on my labels?

Asked By: Kamil Saitov

||

Answers:

Since you are passing class_mode='categorical', you don't have to manually convert the labels to one-hot encoded vectors using to_categorical().

The generator already yields the labels in one-hot (categorical) form, one row per image in each batch.
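
To make the equivalence concrete, here is a minimal NumPy-only sketch of the encoding that to_categorical performs, which is also the form the generator's labels take. The `int_labels` array is a made-up stand-in for integer class indices (the kind of values `train_generator.classes` holds), and `num_classes = 5` is taken from the "5 classes" message in the question.

```python
import numpy as np

# Hypothetical integer class indices for five images (illustrative only).
int_labels = np.array([0, 2, 1, 4, 3])
num_classes = 5  # from "Found 442 images belonging to 5 classes."

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels gives one one-hot row per image.
one_hot = np.eye(num_classes, dtype="float32")[int_labels]

print(one_hot.shape)  # one row per image, one column per class
print(one_hot[1])     # the image with class index 2
```

With class_mode='categorical', each batch's label array already has this layout, so no extra conversion step is needed before training.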

Answered By: Sreeram TP

It might be useful (even after two years) to mention that if you want a specific order for the one-hot vectors, you can pass it through the classes argument of flow_from_directory.
For example, if you want "dog"=[1,0] and "cat"=[0,1], explicitly set:
classes=["dog", "cat"].
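
As a rough illustration of why the order matters, the sketch below mimics the name-to-vector mapping in plain NumPy. The helper `one_hot_for` is hypothetical and is not part of Keras. It assumes the default behaviour, where class indices follow the alphanumeric order of the subdirectory names, and compares that with an explicit list.

```python
import numpy as np

def one_hot_for(class_names):
    """Map each class name to a one-hot vector based on its list position."""
    eye = np.eye(len(class_names), dtype="float32")
    return {name: eye[i] for i, name in enumerate(class_names)}

# Default: names sorted alphanumerically, so "cat" gets index 0.
default_order = one_hot_for(sorted(["dog", "cat"]))

# Explicit order, as with classes=["dog", "cat"]: "dog" gets index 0.
explicit_order = one_hot_for(["dog", "cat"])

print(default_order["dog"], explicit_order["dog"])
```

On a real generator, the mapping actually in use can be inspected through its class_indices attribute.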

Answered By: Ala Tarighati

The answers above are clear enough, but for more information, you can check whether your labels are in categorical form.

The code below prints one actual label array and its shape:

for batch_tuple in train_generator:
    print(batch_tuple[1][0],batch_tuple[1][0].shape)
    break

The output shows:

[0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] (40,)
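
A small follow-up sketch for checking such a label programmatically and recovering its integer class index. The `label` array is rebuilt by hand to match the printed output (40 entries, a single 1 at position 7); in practice it would come from `batch_tuple[1][0]`.

```python
import numpy as np

# Rebuild the printed label: 40 classes, with the 1 at index 7.
label = np.zeros(40, dtype="float32")
label[7] = 1.0

# A one-hot vector is 1-D, contains only 0s and 1s, and sums to exactly 1.
is_one_hot = bool(
    label.ndim == 1
    and np.isin(label, [0.0, 1.0]).all()
    and label.sum() == 1.0
)

# argmax returns the position of the single 1, i.e. the class index.
class_index = int(np.argmax(label))

print(is_one_hot, class_index)
```

The index can then be matched against the generator's class_indices dictionary to find the corresponding class name.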
Answered By: gulf1324