Strange results while training with Keras

Question:

I am trying to train a U-Net model on the BraTS18 dataset (medical data with NIfTI images) using Keras with the TensorFlow backend. However, I am getting very strange results:

[plots: training/validation accuracy and loss over epochs]

As you can see, the accuracy starts at 96% and reaches 99% by the third epoch. The validation loss never decreases, and the trained model predicts nothing at all.

I have split the data in different ways (20% train / 60% validation, or 60% train / 20% validation), but it didn't help. I think the problem might be with my model or with my data generator. Here is the code:

U-Net model:

import os

from keras import optimizers
from keras.layers import (Input, Lambda, Conv2D, Dropout, MaxPooling2D,
                          Conv2DTranspose, concatenate)
from keras.models import Model
from keras.utils import plot_model

# img_height, img_width, img_channels, lr, beta1, lr_decay, save_dir,
# and the dice/jaccard metrics are defined elsewhere in the script
def unet_model(filters=16, dropout=0.1, batch_normalize=True):
    # note: the dropout and batch_normalize arguments are currently
    # unused; the dropout rates below are hard-coded

    # Build U-Net model
    inputs = Input((img_height, img_width, img_channels), name='main_input')
    s = Lambda(lambda x: x / 255) (inputs)

    c1 = Conv2D(filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c1') (s)
    c1 = Dropout(0.1) (c1)
    c1 = Conv2D(filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c1_d') (c1)
    p1 = MaxPooling2D((2, 2)) (c1)

    c2 = Conv2D(2*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c2') (p1)
    c2 = Dropout(0.1) (c2)
    c2 = Conv2D(2*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c2_d') (c2)
    p2 = MaxPooling2D((2, 2)) (c2)

    c3 = Conv2D(4*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c3') (p2)
    c3 = Dropout(0.2) (c3)
    c3 = Conv2D(4*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c3_d') (c3)
    p3 = MaxPooling2D((2, 2)) (c3)

    c4 = Conv2D(8*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c4') (p3)
    c4 = Dropout(0.2) (c4)
    c4 = Conv2D(8*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c4_d') (c4)
    p4 = MaxPooling2D(pool_size=(2, 2)) (c4)

    c5 = Conv2D(16*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c5') (p4)
    c5 = Dropout(0.3) (c5)
    c5 = Conv2D(16*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c5_d') (c5)

    u6 = Conv2DTranspose(8*filters, (2, 2), strides=(2, 2), padding='same', name = 'u6') (c5)
    u6 = concatenate([u6, c4])
    c6 = Conv2D(8*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c6') (u6)
    c6 = Dropout(0.2) (c6)
    c6 = Conv2D(8*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c6_d') (c6)

    u7 = Conv2DTranspose(4*filters, (2, 2), strides=(2, 2), padding='same', name = 'u7') (c6)
    u7 = concatenate([u7, c3])
    c7 = Conv2D(4*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c7') (u7)
    c7 = Dropout(0.2) (c7)
    c7 = Conv2D(4*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c7_d') (c7)

    u8 = Conv2DTranspose(2*filters, (2, 2), strides=(2, 2), padding='same', name = 'u8') (c7)
    u8 = concatenate([u8, c2])
    c8 = Conv2D(2*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c8') (u8)
    c8 = Dropout(0.1) (c8)
    c8 = Conv2D(2*filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c8_d') (c8)

    u9 = Conv2DTranspose(filters, (2, 2), strides=(2, 2), padding='same', name = 'u9') (c8)
    u9 = concatenate([u9, c1], axis=3)
    c9 = Conv2D(filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c9') (u9)
    c9 = Dropout(0.1) (c9)
    c9 = Conv2D(filters, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same', name = 'c9_d') (c9)

    outputs = Conv2D(1, (1, 1), activation='sigmoid', name = 'output') (c9)

    adam = optimizers.Adam(lr=lr, beta_1=beta1, decay=lr_decay, amsgrad=False)

    model = Model(inputs=[inputs], outputs=[outputs])
    model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy',dice,jaccard])

    plot_model(model, to_file=os.path.join(save_dir, "model.png"))
    if os.path.exists(os.path.join(save_dir, "model.txt")):
        os.remove(os.path.join(save_dir, "model.txt"))
    with open(os.path.join(save_dir, "model.txt"), 'w') as fh:
        model.summary(positions=[.3, .55, .67, 1.], print_fn=lambda x: fh.write(x + '\n'))

    model.summary()

    return model

and here is the code for the data generator:

def generate_data(X_data, Y_data, batch_size):

    # total_folders is a global; note that this is float division,
    # so number_of_batches may not be a whole number
    samples_per_epoch = total_folders
    number_of_batches = samples_per_epoch / batch_size
    counter = 0

    while True:

        # slice out the next batch sequentially; no shuffling is done
        X_batch = X_data[batch_size * counter:batch_size * (counter + 1)]
        Y_batch = Y_data[batch_size * counter:batch_size * (counter + 1)]

        counter += 1

        yield X_batch, Y_batch

        if counter >= number_of_batches:
            counter = 0
...
and in the main function:
...

from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

if __name__ == "__main__":

    callbacks = [
        EarlyStopping(patience=1000, verbose=1),
        ReduceLROnPlateau(factor=0.1, patience=3, min_lr=0.00001, verbose=1),
        ModelCheckpoint(save_dir + 'model.{epoch:02d}-{val_loss:.2f}.h5',
                        verbose=1, save_best_only=True, save_weights_only=True)
    ]

    model = unet_model(filters=16, dropout=0.05, batch_normalize=True)

    # note: because of operator precedence, validation_steps evaluates as
    # (total_folders / batch_size) * 2, not total_folders / (batch_size * 2),
    # even though the validation generator uses a batch size of batch_size * 2
    H = model.fit_generator(generate_data(X_train, Y_train, batch_size),
                            epochs=epochs,
                            steps_per_epoch=total_folders / batch_size,
                            validation_data=generate_data(X_test, Y_test, batch_size * 2),
                            callbacks=callbacks,
                            validation_steps=total_folders / batch_size * 2)

What am I doing wrong here?

Answers:

I believe your problem is your loss function/metrics. If most patients do not have any tumor, and accuracy (or an unweighted Jaccard distance) regards both classes equally, your model can reach high accuracy and a low Jaccard index simply by predicting everything as background/healthy. You can check this by building a dummy baseline that always predicts the background class and comparing its metrics to your current results. To solve your problem, use something like a Jaccard or Dice loss that gives lower weight to the background. An overview of different metrics that may be more suitable than accuracy can be found here.
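For example, here is a minimal soft-Dice loss sketch. This is an illustration, not your exact code: it assumes the same standalone keras API, and the model, adam, dice, and jaccard names come from the code in your question. Because the Dice score is overlap-based, an all-background prediction on a tumor mask scores poorly instead of looking ~99% correct the way per-pixel accuracy does:

    from keras import backend as K

    def soft_dice_loss(smooth=1.0):
        # soft Dice: 2*|intersection| / (|y_true| + |y_pred|),
        # smoothed to avoid division by zero on empty masks
        def loss(y_true, y_pred):
            y_true_f = K.flatten(y_true)
            y_pred_f = K.flatten(y_pred)
            intersection = K.sum(y_true_f * y_pred_f)
            dice_coef = (2.0 * intersection + smooth) / (
                K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
            return 1.0 - dice_coef
        return loss

    # compile with the overlap-based loss instead of plain binary cross-entropy
    model.compile(optimizer=adam, loss=soft_dice_loss(),
                  metrics=['accuracy', dice, jaccard])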

Also, maybe I have misunderstood the dataset, but shouldn't you be segmenting different kinds of tumor tissue, and therefore using categorical instead of binary classification?
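If so, the output head of your model would change along these lines (a sketch only; num_classes = 4 is an assumption based on the BraTS label set, and your masks would need to be one-hot encoded to match):

    num_classes = 4  # assumption: background plus three tumor sub-regions

    # softmax over the class axis instead of a single sigmoid channel
    outputs = Conv2D(num_classes, (1, 1), activation='softmax', name='output')(c9)

    model = Model(inputs=[inputs], outputs=[outputs])
    model.compile(optimizer=adam, loss='categorical_crossentropy',
                  metrics=['accuracy'])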

Answered By: a-doering