Matrix multiplication in TensorFlow model

Question:

I want to use matrix multiplication inside a TF model. My model is a NN with input shape (1, 9), and I want the product of each input vector with itself (i.e. the matrix product of the transposed input vector and the vector itself, so the result has shape (9, 9)).

Code example:

import tensorflow as tf
from tensorflow import keras

inputs = tf.keras.layers.Input(shape=(1, 9))
# multiply the transposed input by the input itself before the Dense layer
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.transpose(inputs) @ inputs)

model = tf.keras.Model(inputs, outputs)

adam = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

model.compile(optimizer=adam, loss='mse', metrics=['mae'])

But I have a problem with the shape of the result. With the above code I get the following architecture:

[Model summary screenshot: the multiplication output has shape (9, 1, 9) instead of (None, 9, 9)]

If I understand correctly, the first dimension (None) of the input layer corresponds to the batch size. When I apply the transpose operation, it is applied to all dimensions of that shape, so after the transpose and multiplication I get a result of shape (9, 1, 9). I don't think that is correct: I want the product of the transposed input vector with itself for every vector in the batch, i.e. the correct result shape is (None, 9, 9).
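For illustration, a minimal shape check (with an assumed batch size of 4, used only for this sketch) showing that the default transpose also moves the batch axis:

import tensorflow as tf

x = tf.ones((4, 1, 9))        # a batch of 4 row vectors

# tf.transpose with no perm argument reverses *all* axes, batch axis included
print(tf.transpose(x).shape)  # (9, 1, 4)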

Computing this product outside the model and feeding it in as an input is not suitable, because I want to keep both the original input vector and the result of the multiplication inside the model for further operations (the architecture above is not complete and is only an example).

How can I get the correct result? What is the correct way to multiply matrices and vectors in TF when the operation should be applied to every vector (or matrix) in the batch?

Asked By: Alimagadov K.


Answers:

Try tf.linalg.matmul, since it will respect the batch dimension:

import tensorflow as tf
from tensorflow import keras

inputs = tf.keras.layers.Input(shape=(1, 9))
# transpose_a=True transposes only the inner two dimensions, leaving the batch dimension intact
outputs = tf.keras.layers.Dense(1, activation='linear')(tf.linalg.matmul(inputs, inputs, transpose_a=True))

model = tf.keras.Model(inputs, outputs)

adam = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

model.compile(optimizer=adam, loss='mse', metrics=['mae'])
print(model.summary())
Model: "model_3"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_5 (InputLayer)           [(None, 1, 9)]       0           []                               
                                                                                                  
 tf.linalg.matmul_3 (TFOpLambda  (None, 9, 9)        0           ['input_5[0][0]',                
 )                                                                'input_5[0][0]']                
                                                                                                  
 dense_4 (Dense)                (None, 9, 1)         10          ['tf.linalg.matmul_3[0][0]']     
                                                                                                  
==================================================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
__________________________________________________________________________________________________
None
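As a usage note, the same batched outer product can also be expressed with tf.einsum, which spells out the index pattern explicitly; a minimal sketch equivalent to the matmul call above:

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(1, 9))
# 'bij,bik->bjk' sums over the singleton axis i, giving a (None, 9, 9) outer product per sample
outer = tf.einsum('bij,bik->bjk', inputs, inputs)
outputs = tf.keras.layers.Dense(1, activation='linear')(outer)
model = tf.keras.Model(inputs, outputs)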
Answered By: AloneTogether

I read from your question that you want to do the matrix multiplication inside the NN, whereas multiplying plain numbers would be easy.
This can also be treated as a sequence-to-sequence task, of which there are many examples (word-sequence inputs with a target multiplication dictionary).
There is no need to specify the output shape; a sequence output is still a valid answer.

  1. Using tf.where or tf.greater

    input:

    import numpy as np
    import tensorflow as tf

    array_1 = [0, 1, 1, 0]
    array_2 = np.concatenate((array_1, array_1), axis=0)
    temp = [0, 1, 1, 0]

    # tf.where keeps the values of array_2 wherever the condition is True, otherwise 0
    print(np.asarray(tf.where([temp == [0, 1, 1, 0]], array_2, 0)))

    input('...')

output:

[0 1 1 0 0 1 1 0]
  2. Using a tfa.seq2seq.BasicDecoder sum

    input:

    # Fragment: assumes a string tensor `input_word`, a tfa.seq2seq.BasicDecoder `decoder`,
    # and `start_tokens`, `end_token`, `initial_state` are defined elsewhere.
    index = 1
    next_char = tf.strings.substr(
        input_word, index, len(input_word[0].numpy()) - index, unit="UTF8_CHAR", name=None
    )
    output, state, lengths = decoder(
        next_char, start_tokens=start_tokens, end_token=end_token, initial_state=initial_state)

    print('next_char[0].numpy(): ' + str(next_char[0].numpy()))
output:

input_word[0].numpy() length: tf.Tensor([b'Gl\xc3\xbccklicherweise '], shape=(1,), dtype=string)
input_word[0].numpy() length: 18
next_char[0].numpy(): b'Gl\xc3\xbccklicherweise '
next_char[0].numpy(): b'l\xc3\xbccklicherweise '
next_char[0].numpy(): b'\xc3\xbccklicherweise '
next_char[0].numpy(): b'cklicherweise '
next_char[0].numpy(): b'klicherweise '
next_char[0].numpy(): b'licherweise '

sum = G + L + L + ...
  3. Model multiplication: use a dense input, and the output is a sequence of the desired targets, as in the picture.
Answered By: Jirayu Kaewprateep