Standardization of input images used in quantized neural networks

Question

I have been working with quantized neural networks (which need input images with pixel values in [0, 255]) for a while. For the ssd_mobilenet_v1.tflite model, the following standardization parameters are given at https://tfhub.dev/tensorflow/lite-model/ssd_mobilenet_v1/1/metadata/2 :

 mean: 127.5
 std : 127.5

So, with these parameters, the common formula normalized_input = (input - mean) / std doesn't make sense to me. When a pixel value is smaller than 128, the bracket becomes 0 and the normalized input is 0 too. So every value under 128 will lead to black pixels. This can't be right, or am I wrong?

Thanks for your help. I would love to have a discussion here.

Kind regards, Chris

Asked By: Chris

Answer 1

I would say that each value in the tensor is normalized based on the mean and std. Small values end up close to -1 (black pixels), which is completely normal behavior:

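import tensorflow as tf

mean = 127.5
std = 127.5
# Two random channels in [0, 1) plus one constant channel with the value 128.0
input = tf.concat([tf.random.uniform((1, 2, 2, 2)), tf.reshape(tf.repeat(tf.constant(128.0), repeats=4), (1, 2, 2, 1))], axis=-1)
# Standardize every value with the fixed mean and std
normalized_input = (input - mean) / std
print(input)
print(normalized_input)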

tf.Tensor(
[[[[  0.50647175   0.20693159 128.        ]
   [  0.18777049   0.9095379  128.        ]]

  [[  0.42894745   0.76806736 128.        ]
   [  0.58564055   0.31613588 128.        ]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.9960277  -0.998377    0.00392157]
   [-0.9985273  -0.99286634  0.00392157]]

  [[-0.99663574 -0.99397594  0.00392157]
   [-0.99540675 -0.9975205   0.00392157]]]], shape=(1, 2, 2, 3), dtype=float32)
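
To see the effect over the full uint8 range rather than on random values in [0, 1), the same formula can be evaluated for a few pixel values. This is only a minimal sketch in plain Python; the helper name normalize_pixel is hypothetical and chosen for illustration. Values below 127.5 become negative rather than 0, and the range [0, 255] is mapped to [-1, 1]:

```python
MEAN = 127.5
STD = 127.5


def normalize_pixel(value, mean=MEAN, std=STD):
    """Apply (value - mean) / std to a single pixel value."""
    return (value - mean) / std


# A few sample pixel values spanning the uint8 range
for pixel in (0, 64, 127, 128, 255):
    print(pixel, "->", round(normalize_pixel(pixel), 4))
```

The smallest pixel value maps to -1, the largest to 1, and only values close to 127.5 land near 0.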

I have often come across projects where the mean and the std have been calculated based on the whole image dataset and the images have been standardized based on these measures:

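import tensorflow as tf
import matplotlib.pyplot as plt

# Two random channels in [0, 1) plus one constant channel with the value 128.0
input = tf.concat([tf.random.uniform((1, 2, 2, 2)), tf.reshape(tf.repeat(tf.constant(128.0), repeats=4), (1, 2, 2, 1))], axis=-1)
# Standardize with the mean and std computed from the data itself
normalized_input = (input - tf.reduce_mean(input, keepdims=True)) / tf.math.reduce_std(input, keepdims=True)

print(input)
print(normalized_input)
plt.imshow(tf.squeeze(input, axis=0))
plt.figure()  # separate figure so the second image does not overwrite the first
plt.imshow(tf.squeeze(normalized_input, axis=0))
plt.show()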

tf.Tensor(
[[[[7.1283507e-01 6.4363706e-01 1.2800000e+02]
   [1.5691042e-02 2.3734951e-01 1.2800000e+02]]

  [[6.6603470e-01 1.3576746e-01 1.2800000e+02]
   [3.1267488e-01 9.6504271e-01 1.2800000e+02]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.70291406 -0.7040649   1.414201  ]
   [-0.71450937 -0.7108226   1.414201  ]]

  [[-0.70369244 -0.71251214  1.414201  ]
   [-0.7095697  -0.69871914  1.414201  ]]]], shape=(1, 2, 2, 3), dtype=float32)

In many other projects, uint8 images are simply scaled to the range [0, 1], which essentially means that each image is divided by 255. Check this post for more details.
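
The two conventions are closely related: dividing by 255 gives values in [0, 1], and subtracting 0.5 and then dividing by 0.5 gives the same result as (input - 127.5) / 127.5. The following is a rough sketch in plain Python; all three function names are hypothetical and only serve the illustration:

```python
def scale_to_unit(value):
    """Scale a uint8 pixel value to [0, 1]."""
    return value / 255.0


def scale_to_signed_unit(value):
    """Scale a uint8 pixel value to [-1, 1] with mean = std = 127.5."""
    return (value - 127.5) / 127.5


def unit_to_signed(unit_value):
    """Shift a value from [0, 1] to [-1, 1]."""
    return (unit_value - 0.5) / 0.5


# Compare the direct formula with the two-step route for a few pixel values
for pixel in (0, 51, 128, 255):
    direct = scale_to_signed_unit(pixel)
    two_step = unit_to_signed(scale_to_unit(pixel))
    print(pixel, scale_to_unit(pixel), direct, two_step)
```

Which convention is correct for a given model depends on how that model was trained, so the values in the model's metadata should take precedence.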

Answered By: AloneTogether

Answer 2

Sorry, Alone! I think the normalization function in TensorFlow is a fractional function that takes beta, gamma, and sigma values into account.

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1, 32, 32, 3)),  # input shape to have value 25088 but received input with shape (None, 784)
    tf.keras.layers.Normalization(mean=3., variance=2., name='Layer_1'),
    tf.keras.layers.Normalization(mean=4., variance=6., name='Layer_2'),
    tf.keras.layers.Dense(256, activation='relu', name='Layer_3'),
])

model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(6, activation=tf.nn.softmax, name='Layer_4'))
model.summary()

with tf.compat.v1.variable_scope('Layer_1', reuse=tf.compat.v1.AUTO_REUSE):                 
            v2 = tf.compat.v1.get_variable('v2', shape=[256])       # <tf.Variable 'Layer_1/v2:0' shape=(256,) dtype=float32, numpy=array([-0.06715409,  0.10130859,  0.05591007, -0.05931217,  0.10036706, ...
            x1 = tf.compat.v1.get_variable('x', shape=[256])        # <tf.Variable 'Layer_1/x:0' shape=(256,) dtype=float32, numpy=array([-6.63143843e-02,  3.17198113e-02,  1.04614533e-01, -2.30028257e-02, ...
            y1 = tf.compat.v1.get_variable('y', shape=[256])        # <tf.Variable 'Layer_1/y:0' shape=(256,) dtype=float32, numpy=array([-0.10782533,  0.01488321, -0.04950972, -0.09561327,  0.10698273, ...
            y2 = tf.compat.v1.get_variable('y_', shape=[256])       # <tf.Variable 'Layer_1/y_:0' shape=(256,) dtype=float32, numpy=array([-0.04931336, -0.10670284, -0.10054329, -0.09619174,  0.08752564, ...
            mu = tf.compat.v1.get_variable('mu', shape=[256])       # <tf.Variable 'Layer_1/mu:0' shape=(256,) dtype=float32, numpy=array([-0.06098992,  0.02202646, -0.05624849,  0.0602672 , -0.02878931, ...
            sigma = tf.compat.v1.get_variable('sigma', shape=[256]) # <tf.Variable 'Layer_1/sigma:0' shape=(256,) dtype=float32, numpy=array([ 2.84786597e-02,  1.00004725e-01, -8.51654559e-02, -5.34656569e-02, ...
            gamma = tf.compat.v1.get_variable('gamma', shape=[256]) # <tf.Variable 'Layer_1/gamma:0' shape=(256,) dtype=float32, numpy=array([ 0.10177503,  0.04634983, -0.02325767,  0.04158259,  0.10051229, ...
            beta = tf.compat.v1.get_variable('beta', shape=[256])   # <tf.Variable 'Layer_1/beta:0' shape=(256,) dtype=float32, numpy=array([-7.85651207e-02, -4.94908020e-02,  8.88925046e-03,  9.37148184e-03, ...
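
As a point of comparison, the Keras Normalization layer used in the model above is documented as computing (input - mean) / sqrt(variance) when mean and variance are passed explicitly. The snippet below is a pure-Python stand-in for that arithmetic, not the Keras implementation itself; the function name keras_style_normalize is hypothetical:

```python
import math


def keras_style_normalize(value, mean, variance):
    """Pure-Python stand-in for (value - mean) / sqrt(variance)."""
    return (value - mean) / math.sqrt(variance)


# Parameters of 'Layer_1' from the model above, applied to a sample value
print(keras_style_normalize(5.0, mean=3.0, variance=2.0))

# Choosing variance = std ** 2 reproduces the metadata formula from the question
print(keras_style_normalize(255.0, mean=127.5, variance=127.5 ** 2))
```

Under this reading, the layer is the same standardization formula as in the question, with the variance supplied instead of the std.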

Answered By: Jirayu Kaewprateep
