Tensorflow Multi Head Attention on Inputs: 4 x 5 x 20 x 64 with attention_axes=2 throwing mask dimension error (tf 2.11.0)
Tensorflow Multi Head Attention on Inputs: 4 x 5 x 20 x 64 with attention_axes=2 throwing mask dimension error (tf 2.11.0) Question: The expectation here is that the attention is applied on the 2nd dimension (4, 5, 20, 64). I am trying to apply self attention using the following code (issue reproducible with this code): …
Total answers: 1