Matrix multiplication in TensorFlow model
Question:
I want to use matrix multiplication inside a TF model. My model is a NN with input shape (1,9). I want to get the product of each input vector with itself (i.e. the transposed input vector multiplied by the input vector itself, so the result has shape (9,9)).
Code example:
inputs = tf.keras.layers.Input(shape=(1,9))
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.transpose(inputs) @ inputs)
model = tf.keras.Model(inputs, outputs)
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(optimizer=adam, loss='mse', metrics=['mae'])
But I have a problem with the shape of the result. With the code above I get the following architecture:
If I understand correctly, the first dimension (None) in the input layer corresponds to the batch size of the input data. When I use the transpose operation, it is applied to all dimensions of this shape, so I get a result with shape (9,1,9) after the transpose and multiplication. But I think this is not correct, because I want the product of the transposed input vector with itself for every vector in the batch (i.e. the result I want has shape (None, 9, 9)).
Computing this product outside the model and passing it in as an input is not suitable, because I want the model to have both the original input vector and the result of the multiplication available for further operations (the architecture above is incomplete and serves only as an example).
How can I get the correct result? What is the correct way to multiply matrices and vectors in TF when the operation should be applied to every vector (matrix) in the batch?
Answers:
Try tf.linalg.matmul, since it respects the batch dimension:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(1,9))
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.linalg.matmul(inputs, inputs, transpose_a=True))
model = tf.keras.Model(inputs, outputs)
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(optimizer=adam, loss='mse', metrics=['mae'])
print(model.summary())
Model: "model_3"
__________________________________________________________________________________________________
 Layer (type)                     Output Shape      Param #   Connected to
==================================================================================================
 input_5 (InputLayer)             [(None, 1, 9)]    0         []
 tf.linalg.matmul_3 (TFOpLambda)  (None, 9, 9)      0         ['input_5[0][0]', 'input_5[0][0]']
 dense_4 (Dense)                  (None, 9, 1)      10        ['tf.linalg.matmul_3[0][0]']
==================================================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
__________________________________________________________________________________________________
None
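To check the batch behaviour outside of a model, here is a minimal eager-mode sketch. The batch size of 4 and the input values are made up for illustration. It compares tf.linalg.matmul with transpose_a=True against two alternatives that should be equivalent: tf.transpose with an explicit perm that leaves the batch axis in place, and tf.einsum.

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch of 4 samples, each a row vector of shape (1, 9).
x = tf.constant(np.arange(36, dtype=np.float32).reshape(4, 1, 9))

# Option 1: batched matmul; transpose_a swaps only the last two axes of `a`.
outer_matmul = tf.linalg.matmul(x, x, transpose_a=True)

# Option 2: explicit transpose that keeps the batch axis (0) where it is.
outer_perm = tf.transpose(x, perm=[0, 2, 1]) @ x

# Option 3: einsum, with b as the batch index and i as the summed axis.
outer_einsum = tf.einsum("bij,bik->bjk", x, x)

print(tuple(outer_matmul.shape))  # (4, 9, 9)
```

The original attempt fails because tf.transpose without perm reverses all axes, including the batch axis. Depending on the Keras version, calling a raw TF op on a symbolic Input may not be supported; wrapping the op in tf.keras.layers.Lambda is one possible workaround in that case.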
As I read your question, you want to do the matrix multiplication inside the NN, where plain number multiplication is easy to do.
This is a sequence-to-sequence problem, and there are many examples of those (e.g. word or sentence inputs with a target multiplication dictionary).
You do not need to specify the output shape; a sequence output is still a valid answer.
- Using tf.where or tf.greater
input:
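import numpy as np
import tensorflow as tf

array_1 = [ 0, 1, 1, 0 ]
array_2 = np.concatenate((array_1, array_1), axis = 0)
temp = [ 0, 1, 1, 0 ]
# temp == [0, 1, 1, 0] compares two Python lists and yields a single True,
# so tf.where broadcasts it and returns array_2 unchanged
print( np.asarray( tf.where([ temp == [0, 1, 1, 0] ], array_2, 0 ) ) )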
output:
[0 1 1 0 0 1 1 0]
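For comparison, tf.where is more commonly used with an element-wise boolean mask, such as one produced by tf.greater. The following is a small sketch with made-up values, not part of the answer's own code:

```python
import tensorflow as tf

# Hypothetical input values, chosen only for illustration.
values = tf.constant([0, 1, 1, 0, 0, 1, 1, 0])

# Element-wise comparison: True where the value is greater than 0.
mask = tf.greater(values, 0)

# Take values * 10 where the mask is True, and -1 elsewhere.
selected = tf.where(mask, values * 10, -1)

print(selected.numpy())
```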
- Using tfa.seq2seq.BasicDecoder sum
input:
index = 1
next_char = tf.strings.substr(
input_word, index, len(input_word[0].numpy()) - index, unit="UTF8_CHAR", name=None
)
output, state, lengths = decoder(
next_char, start_tokens=start_tokens, end_token=end_token, initial_state=initial_state)
print('next_char[0].numpy(): ' + str(next_char[0].numpy()))
output:
input_word[0].numpy() length: tf.Tensor([b'Gl\xc3\xbccklicherweise '], shape=(1,), dtype=string)
input_word[0].numpy() length: 18
next_char[0].numpy(): b'Gl\xc3\xbccklicherweise '
next_char[0].numpy(): b'l\xc3\xbccklicherweise '
next_char[0].numpy(): b'\xc3\xbccklicherweise '
next_char[0].numpy(): b'cklicherweise '
next_char[0].numpy(): b'klicherweise '
next_char[0].numpy(): b'licherweise '
sum = G + L + L + ...
- Model multiplication: you use a dense input, and the output is the sequence of the desired target, as in the picture.