Reset weights in Keras layer

Question:

I’d like to reset (randomize) the weights of all layers in my Keras (deep learning) model. The reason is that I want to be able to train the model several times with different data splits without having to do the (slow) model recompilation every time.

Inspired by this discussion, I’m trying the following code:

# Reset weights
for layer in KModel.layers:
    if hasattr(layer,'init'):
        input_dim = layer.input_shape[1]
        new_weights = layer.init((input_dim, layer.output_dim),name='{}_W'.format(layer.name))
        layer.trainable_weights[0].set_value(new_weights.get_value())

However, it only partly works.

Partly, because I’ve inspected some layer.get_weights() values, and they do seem to change. But when I restart training, the cost values are much lower than the initial cost values on the first run. It’s almost as if I’ve succeeded in resetting some of the weights, but not all of them.

Asked By: Tor


Answers:

Try set_weights.

For example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model

print("Building Model...")
inp = Input(shape=(1,None,None))
x   = Convolution2D(1, 3, 3, border_mode='same', init='normal', bias=False)(inp)  # Keras 1 API
output = Convolution2D(1, 3, 3, border_mode='same', init='normal', bias=False)(x)
model_network = Model(input=inp, output=output)

w = np.asarray([ 
    [[[
    [0,0,0],
    [0,2,0],
    [0,0,0]
    ]]]
    ])

for layer_i in range(len(model_network.layers)):
    print(model_network.layers[layer_i])

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)

print("Weights after change:")
print(model_network.layers[1].get_weights())


input_mat = np.asarray([ 
    [[
    [1.,2.,3.,10.],
    [4.,5.,6.,11.],
    [7.,8.,9.,12.]
    ]]
    ])

print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))

w2 = np.asarray([ 
    [[[
    [0,0,0],
    [0,3,0],
    [0,0,0]
    ]]]
    ])


for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)

print("Output:")
print(model_network.predict(input_mat))

First, build a model with, say, two convolutional layers:

print("Building Model...")
inp = Input(shape=(1,None,None))
x   = Convolution2D(1, 3, 3, border_mode='same', init='normal', bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal', bias=False)(x)
model_network = Model(input=inp, output=output)

Then define your weights (I’m using a simple w here, but you could use np.random.uniform or anything like that if you want; a random variant is sketched right after the block):

w = np.asarray([ 
    [[[
    [0,0,0],
    [0,2,0],
    [0,0,0]
    ]]]
    ])
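For instance, a random kernel of the same shape could be swapped in like this (a sketch; the uniform bounds are arbitrary):

# A random kernel with the same (1, 1, 3, 3) shape as w, wrapped in a
# list because set_weights expects a list of arrays.
w_random = [np.random.uniform(low=-0.05, high=0.05,
                              size=(1, 1, 3, 3)).astype('float32')]

for layer_i in range(1, len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w_random)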

Take a peek at the layers inside the model:

for layer_i in range(len(model_network.layers)):
    print(model_network.layers[layer_i])

Set the weights for each convolutional layer (you’ll see that the first layer is actually the input layer, which you don’t want to change; that’s why the range starts at 1, not 0).

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)

Generate some input for your test and predict the output from your model:

input_mat = np.asarray([ 
    [[
    [1.,2.,3.,10.],
    [4.,5.,6.,11.],
    [7.,8.,9.,12.]
    ]]
    ])

print("Output:")
print(model_network.predict(input_mat))

You can change the weights again if you want and check the output once more:

w2 = np.asarray([ 
    [[[
    [0,0,0],
    [0,3,0],
    [0,0,0]
    ]]]
    ])

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)

print("Output:")
print(model_network.predict(input_mat))

Sample output:

Using Theano backend.
Building Model...
<keras.engine.topology.InputLayer object at 0x7fc0c619fd50>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6166250>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6150a10>
Weights after change:
[array([[[[ 0.,  0.,  0.],
         [ 0.,  2.,  0.],
         [ 0.,  0.,  0.]]]], dtype=float32)]
Input:
[[[[  1.   2.   3.  10.]
   [  4.   5.   6.  11.]
   [  7.   8.   9.  12.]]]]
Output:
[[[[  4.   8.  12.  40.]
   [ 16.  20.  24.  44.]
   [ 28.  32.  36.  48.]]]]
Output:
[[[[   9.   18.   27.   90.]
   [  36.   45.   54.   99.]
   [  63.   72.   81.  108.]]]]

From the peek at .layers you can see that the first layer is the input layer and the others are your convolutional layers.

Answered By: maz

Save the initial weights right after compiling the model but before training it:

model.save_weights('model.h5')

and then after training, “reset” the model by reloading the initial weights:

model.load_weights('model.h5')

This gives you an apples-to-apples model for comparing different data splits, and it should be quicker than recompiling the entire model.
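For example, a minimal sketch of the resulting train/reset/train loop (build_and_compile and splits are placeholders for your own model and data):

model = build_and_compile()         # the slow compile happens only once
model.save_weights('model.h5')      # snapshot of the initial weights

for x_train, y_train in splits:     # iterate over your data splits
    model.load_weights('model.h5')  # reset to the same starting point
    model.fit(x_train, y_train, epochs=10)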

Answered By: ezChx

If you want to truly re-randomize the weights, and not merely restore the initial weights, you can do the following. The code is slightly different depending on whether you’re using TensorFlow or Theano.

from keras.initializers import glorot_uniform  # Or your initializer of choice
import keras.backend as K

initial_weights = model.get_weights()

backend_name = K.backend()
if backend_name == 'tensorflow': 
    k_eval = lambda placeholder: placeholder.eval(session=K.get_session())
elif backend_name == 'theano': 
    k_eval = lambda placeholder: placeholder.eval()
else: 
    raise ValueError("Unsupported backend")

new_weights = [k_eval(glorot_uniform()(w.shape)) for w in initial_weights]

model.set_weights(new_weights)

Answered By: BallpointBen

Reset all layers by checking for initializers:

def reset_weights(model):
    import keras.backend as K
    session = K.get_session()
    for layer in model.layers: 
        if hasattr(layer, 'kernel_initializer'): 
            layer.kernel.initializer.run(session=session)
        if hasattr(layer, 'bias_initializer'):
            layer.bias.initializer.run(session=session)     

Update: the initializer is now accessed as kernel.initializer rather than kernel_initializer.
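A usage sketch (this relies on the TF1-style session API, so it assumes a TensorFlow 1.x backend; n_splits and the data lists are placeholders):

model.compile(optimizer='adam', loss='mse')  # compile once
for i in range(n_splits):                    # n_splits is a placeholder
    reset_weights(model)                     # fresh random weights, no recompile
    model.fit(x_trains[i], y_trains[i])      # placeholder data lists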

Answered By: Mendi Barel

Alternatively, replace the backend session and re-run the global variable initializer (TF1-style API):

import tensorflow as tf
import keras.backend as K

K.get_session().close()
K.set_session(tf.Session())
K.get_session().run(tf.global_variables_initializer())

Answered By: Ashot Matevosyan

To “random” re-initialize weights of a compiled untrained model in TF 2.0 (tf.keras):

weights = [glorot_uniform(seed=random.randint(0, 1000))(w.shape) if w.ndim > 1 else w for w in model.get_weights()]

Note the “if w.ndim > 1 else w”. You don’t want to re-initialize the biases (they stay at 0 or 1).
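Pieced together with the imports it needs, that idea might look like this (a sketch; it assumes TF 2.x eager execution and an already-built model):

import random
import numpy as np
from tensorflow.keras.initializers import glorot_uniform

# Re-initialize multi-dimensional weights (kernels) only; keep the
# 1-D weights (biases, batch-norm parameters) as they are.
weights = [
    np.asarray(glorot_uniform(seed=random.randint(0, 1000))(w.shape))
    if w.ndim > 1 else w
    for w in model.get_weights()
]
model.set_weights(weights)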

TensorFlow 2 answer:

for ix, layer in enumerate(model.layers):
    if hasattr(model.layers[ix], 'kernel_initializer') and \
            hasattr(model.layers[ix], 'bias_initializer'):
        weight_initializer = model.layers[ix].kernel_initializer
        bias_initializer = model.layers[ix].bias_initializer

        old_weights, old_biases = model.layers[ix].get_weights()

        model.layers[ix].set_weights([
            weight_initializer(shape=old_weights.shape),
            bias_initializer(shape=old_biases.shape)])

Original weights:

model.layers[1].get_weights()[0][0]
array([ 0.4450057 , -0.13564804,  0.35884023,  0.41411972,  0.24866664,
        0.07641453,  0.45726687, -0.04410008,  0.33194816, -0.1965386 ,
       -0.38438258, -0.13263905, -0.23807487,  0.40130925, -0.07339832,
        0.20535922], dtype=float32)

New weights:

model.layers[1].get_weights()[0][0]
array([-0.4607593 , -0.13104361, -0.0372932 , -0.34242013,  0.12066692,
       -0.39146423,  0.3247317 ,  0.2635846 , -0.10496247, -0.40134245,
        0.19276887,  0.2652442 , -0.18802321, -0.18488845,  0.0826562 ,
       -0.23322225], dtype=float32)

Answered By: Nicolas Gervais

I found the clone_model function, which creates a clone of the network with the same architecture but freshly initialized weights.

Example of use:

model_cloned = tensorflow.keras.models.clone_model(model_base)

Comparing the weights:

original_weights = model_base.get_weights()
print("Original weights", original_weights)
print("========================================================")
print("========================================================")
print("========================================================")
model_cloned = tensorflow.keras.models.clone_model(model_base)
new_weights = model_cloned.get_weights()
print("New weights", new_weights)

If you execute this code several times, you will notice that the cloned model receives new weights each time.

Answered By: danielsaromo

For tf2 the simplest way to actually reset weights would be:

tf_model.set_weights(
    clone_model(tf_model).get_weights()
)

clone_model(), as mentioned by @danielsaromo, returns a new model whose trainable parameters are initialized from scratch. We use its weights to reinitialize our model, so no model compilation (knowledge of its loss or optimizer) is needed.

There are two caveats, though. The first is mentioned in clone_model()’s documentation:

clone_model will not preserve the uniqueness of shared objects within the model (e.g. a single variable attached to two distinct layers will be restored as two separate variables).

Another caveat is that for large models, cloning might fail due to memory limits.

Answered By: Emsi

Use keras.backend.clear_session().
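Note that clear_session() only frees the current graph and its variables; to get fresh weights you rebuild the model afterwards. A sketch (build_model is a placeholder for your own model-construction function):

import keras.backend as K

K.clear_session()                 # drop the old graph and its variables
model = build_model()             # rebuild the architecture; weights start fresh
model.compile(optimizer='adam', loss='mse')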

Answered By: sjoi84