Standardization of input images used in quantized neural networks
Question:
I have been working with quantized neural networks (which need input images with pixel values in [0, 255]) for a while. For the ssd_mobilenet_v1.tflite model, the following standardization parameters are given at https://tfhub.dev/tensorflow/lite-model/ssd_mobilenet_v1/1/metadata/2 :
mean: 127.5
std : 127.5
So, with these parameters, the common formula normalized_input = (input - mean) / std doesn't make sense to me. When a pixel value is smaller than 128, the bracket becomes 0 and the normalized input is 0 too. So every value under 128 will lead to black pixels. This can't be right, or am I wrong?
Thanks for your help. I would love to have a discussion here.
Kind regards, Chris
Answers:
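I would say that each value in the tensor is normalized element-wise based on the mean and std. Values below the mean become negative rather than 0, and they show up as black pixels if you display the result directly, which is completely normal behavior: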
import tensorflow as tf
mean = 127.5
std = 127.5
input = tf.concat([tf.random.uniform((1, 2, 2, 2)), tf.reshape(tf.repeat(tf.constant(128.0), repeats=4), (1, 2, 2, 1))], axis=-1)
normalized_input = (input - mean) / std
print(input)
print(normalized_input)
tf.Tensor(
[[[[ 0.50647175 0.20693159 128. ]
[ 0.18777049 0.9095379 128. ]]
[[ 0.42894745 0.76806736 128. ]
[ 0.58564055 0.31613588 128. ]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.9960277 -0.998377 0.00392157]
[-0.9985273 -0.99286634 0.00392157]]
[[-0.99663574 -0.99397594 0.00392157]
[-0.99540675 -0.9975205 0.00392157]]]], shape=(1, 2, 2, 3), dtype=float32)
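To make the arithmetic concrete, here is a minimal pure-Python sketch (no TensorFlow needed) of the same formula applied to a few pixel values in [0, 255]. The helper names normalize and saturating_sub_uint8 are mine, for illustration only. The second one models a hypothetical unsigned subtraction that clamps at 0, which is only my guess at where the idea that "the bracket gets 0" comes from; it is not what the formula does in floating point.

```python
MEAN = 127.5
STD = 127.5


def normalize(pixel, mean=MEAN, std=STD):
    """Apply (pixel - mean) / std in float arithmetic."""
    return (float(pixel) - mean) / std


def saturating_sub_uint8(pixel, offset):
    """Hypothetical unsigned subtraction that clamps at 0 (NOT what the formula does)."""
    return max(0, pixel - offset)


pixels = [0, 64, 127, 128, 255]
normalized = [normalize(p) for p in pixels]

for p, n in zip(pixels, normalized):
    # Float normalization keeps the sign; the clamped variant loses it.
    print(f"{p:3d} -> {n:+.4f}  (clamped bracket: {saturating_sub_uint8(p, 128)})")
```

Values below 127.5 come out negative rather than 0, and the full range [0, 255] maps to [-1, 1].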
I have often come across projects where the mean and the std have been calculated based on the whole image dataset and the images have been standardized based on these measures:
import tensorflow as tf
import matplotlib.pyplot as plt
input = tf.concat([tf.random.uniform((1, 2, 2, 2)), tf.reshape(tf.repeat(tf.constant(128.0), repeats=4), (1, 2, 2, 1))], axis=-1)
normalized_input = (input - tf.reduce_mean(input, keepdims=True)) / tf.math.reduce_std(input, keepdims=True)
print(input)
print(normalized_input)
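# Show both images side by side; imshow clips float RGB values outside [0, 1]
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(tf.squeeze(input, axis=0))
ax2.imshow(tf.squeeze(normalized_input, axis=0))
plt.show()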
tf.Tensor(
[[[[7.1283507e-01 6.4363706e-01 1.2800000e+02]
[1.5691042e-02 2.3734951e-01 1.2800000e+02]]
[[6.6603470e-01 1.3576746e-01 1.2800000e+02]
[3.1267488e-01 9.6504271e-01 1.2800000e+02]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.70291406 -0.7040649 1.414201 ]
[-0.71450937 -0.7108226 1.414201 ]]
[[-0.70369244 -0.71251214 1.414201 ]
[-0.7095697 -0.69871914 1.414201 ]]]], shape=(1, 2, 2, 3), dtype=float32)
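As a rough illustration of the dataset-wide variant, the sketch below computes one mean and one population standard deviation over all pixels of a tiny made-up dataset and then standardizes each image with them. The data and the helper name standardize are hypothetical. I use the population standard deviation (statistics.pstdev) because that is what tf.math.reduce_std computes.

```python
import statistics

# Hypothetical toy "dataset": two flattened grayscale images with values in [0, 255].
dataset = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
]

# Statistics are computed once over every pixel of every image.
all_pixels = [float(p) for image in dataset for p in image]
dataset_mean = statistics.mean(all_pixels)
dataset_std = statistics.pstdev(all_pixels)  # population std


def standardize(image, mean, std):
    """Standardize one image with precomputed dataset statistics."""
    return [(p - mean) / std for p in image]


standardized = [standardize(image, dataset_mean, dataset_std) for image in dataset]
flat = [v for image in standardized for v in image]

print(dataset_mean, dataset_std)
print(standardized)
```

After this step the pooled pixels have mean 0 and standard deviation 1, but individual images generally do not.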
In many other projects you will also see uint8 images simply being scaled to the range [0, 1], which essentially means that each image is divided by 255. Check this post for more details.
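The two conventions are closely related. A small sketch, with function names that are mine and purely illustrative, shows that the mean = std = 127.5 normalization is just the [0, 1] scaling stretched to [-1, 1]:

```python
def scale_to_unit(pixel):
    """Scale a uint8 pixel value to [0, 1]."""
    return pixel / 255.0


def scale_to_signed_unit(pixel, mean=127.5, std=127.5):
    """Scale a uint8 pixel value to [-1, 1] with (pixel - mean) / std."""
    return (pixel - mean) / std


# (x - 127.5) / 127.5 equals 2 * (x / 255) - 1 for every uint8 value.
max_difference = max(
    abs(scale_to_signed_unit(p) - (2.0 * scale_to_unit(p) - 1.0))
    for p in range(256)
)
print(max_difference)
```

Which of the two a model expects depends on how it was trained, so the values from the model's metadata should be used as given.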
Sorry, Alone! I think the normalization function in TensorFlow is a fractional function that takes beta, gamma, and sigma values into account.
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1, 32, 32, 3)), # input shape to have value 25088 but received input with shape (None, 784)
    tf.keras.layers.Normalization(mean=3., variance=2., name='Layer_1'),
    tf.keras.layers.Normalization(mean=4., variance=6., name='Layer_2'),
    tf.keras.layers.Dense(256, activation='relu', name='Layer_3'),
])
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(6, activation=tf.nn.softmax ,name='Layer_4'))
model.summary()
with tf.compat.v1.variable_scope('Layer_1', reuse=tf.compat.v1.AUTO_REUSE):
    v2 = tf.compat.v1.get_variable('v2', shape=[256]) # <tf.Variable 'Layer_1/v2:0' shape=(256,) dtype=float32, numpy=array([-0.06715409, 0.10130859, 0.05591007, -0.05931217, 0.10036706, ...
    x1 = tf.compat.v1.get_variable('x', shape=[256]) # <tf.Variable 'Layer_1/x:0' shape=(256,) dtype=float32, numpy=array([-6.63143843e-02, 3.17198113e-02, 1.04614533e-01, -2.30028257e-02, ...
    y1 = tf.compat.v1.get_variable('y', shape=[256]) # <tf.Variable 'Layer_1/y:0' shape=(256,) dtype=float32, numpy=array([-0.10782533, 0.01488321, -0.04950972, -0.09561327, 0.10698273, ...
    y2 = tf.compat.v1.get_variable('y_', shape=[256]) # <tf.Variable 'Layer_1/y_:0' shape=(256,) dtype=float32, numpy=array([-0.04931336, -0.10670284, -0.10054329, -0.09619174, 0.08752564, ...
    mu = tf.compat.v1.get_variable('mu', shape=[256]) # <tf.Variable 'Layer_1/mu:0' shape=(256,) dtype=float32, numpy=array([-0.06098992, 0.02202646, -0.05624849, 0.0602672 , -0.02878931, ...
    sigma = tf.compat.v1.get_variable('sigma', shape=[256]) # <tf.Variable 'Layer_1/sigma:0' shape=(256,) dtype=float32, numpy=array([ 2.84786597e-02, 1.00004725e-01, -8.51654559e-02, -5.34656569e-02, ...
    gamma = tf.compat.v1.get_variable('gamma', shape=[256]) # <tf.Variable 'Layer_1/gamma:0' shape=(256,) dtype=float32, numpy=array([ 0.10177503, 0.04634983, -0.02325767, 0.04158259, 0.10051229, ...
    beta = tf.compat.v1.get_variable('beta', shape=[256]) # <tf.Variable 'Layer_1/beta:0' shape=(256,) dtype=float32, numpy=array([-7.85651207e-02, -4.94908020e-02, 8.88925046e-03, 9.37148184e-03, ...
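For comparison, here is a plain-Python sketch of the two formulas that seem to be in play here. As far as I understand it, a Normalization layer given a fixed mean and variance computes (x - mean) / sqrt(variance), while the beta, gamma, and sigma terms belong to the batch-normalization form gamma * (x - mu) / sqrt(sigma^2 + eps) + beta. The function names, the eps default, and the sample input 5.0 are my own assumptions for illustration, and the small epsilon guard the real layer applies is left out of the first function.

```python
import math


def fixed_stats_normalization(x, mean, variance):
    """(x - mean) / sqrt(variance), with fixed, non-learned statistics."""
    return (x - mean) / math.sqrt(variance)


def batch_norm_style(x, mu, sigma, gamma, beta, eps=1e-3):
    """gamma * (x - mu) / sqrt(sigma**2 + eps) + beta, with learned gamma and beta."""
    return gamma * (x - mu) / math.sqrt(sigma ** 2 + eps) + beta


# The two stacked layers from the model above, applied to a sample scalar.
layer_1 = fixed_stats_normalization(5.0, mean=3.0, variance=2.0)
layer_2 = fixed_stats_normalization(layer_1, mean=4.0, variance=6.0)

# With variance = std**2, the fixed-statistics form reproduces the question's formula.
white_pixel = fixed_stats_normalization(255.0, mean=127.5, variance=127.5 ** 2)

print(layer_1, layer_2, white_pixel)
```

Seen this way, the mean = 127.5, std = 127.5 pair from the model metadata corresponds to mean=127.5 and variance=127.5 ** 2 in the fixed-statistics form, with no learned gamma or beta involved.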