transformer-model

Failure to install old versions of transformers in colab

Question: I recently had a problem installing transformers version 2.9.0 in Colab. Asked By: hana Answers: Colab has recently upgraded to Python 3.9. There is a temporary mechanism for users to run the Python 3.8 runtime (Linux-5.10.147+-x86_64-with-glibc2.29 platform). This is available from the Command Palette …

Total answers: 2
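
As a rough illustration of the direction the answers take, the sketch below pins the old transformers release in a Colab cell together with a tokenizers build from the same era; the exact dependency pin is an assumption, not something quoted from the thread.

```python
# Hedged sketch for a Colab cell, assuming the old release also needs an
# era-matched tokenizers build (the 0.7.0 pin is a guess, not from the thread).
!pip install "transformers==2.9.0" "tokenizers==0.7.0"

import transformers
print(transformers.__version__)  # expect "2.9.0" if the install succeeded
```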

Decoding hidden layer embeddings in T5

Question: I’m new to NLP (pardon the very noob question!), and am looking for a way to perform vector operations on sentence embeddings (e.g., randomization in embedding-space in a uniform ball around a given sentence) and then decode them. I’m currently attempting to use the following strategy with T5 …

Total answers: 1
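
For context, here is a minimal sketch (using the Hugging Face transformers API, not necessarily the asker's exact strategy) of one way to round-trip such an operation: encode the sentence with T5's encoder, perturb the hidden states, and decode by handing the modified states to generate() through encoder_outputs.

```python
# Minimal sketch: perturb T5 encoder states and decode them again.
# The noise scale and model size are arbitrary illustration choices.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
encoder_out = model.encoder(**inputs)            # last_hidden_state: (1, seq_len, d_model)
noisy = encoder_out.last_hidden_state + 0.01 * torch.randn_like(encoder_out.last_hidden_state)

# Decode directly from the perturbed hidden states instead of re-encoding text.
generated = model.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=noisy),
    max_length=20,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```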

Transformer Positional Encoding — What is maxlen used for

Question: class PositionalEncoding(nn.Module): def __init__(self, emb_size: int, dropout: float, maxlen: int = 5000): super(PositionalEncoding, self).__init__() den = torch.exp(- torch.arange(0, emb_size, 2)* math.log(10000) / emb_size) pos = torch.arange(0, maxlen).reshape(maxlen, 1) pos_embedding = torch.zeros((maxlen, emb_size)) pos_embedding[:, 0::2] = torch.sin(pos * den) pos_embedding[:, 1::2] = torch.cos(pos * den) pos_embedding …

Total answers: 1
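
As a pointer toward the usual answer, the sketch below rebuilds the same sinusoidal table: maxlen only controls how many positions are precomputed, and the table is sliced to the actual sequence length at run time, so it merely has to exceed the longest sequence you expect to feed the model.

```python
# Sketch of the precomputed sinusoidal table from the question's code,
# showing that only the first seq_len of the maxlen rows are ever used.
import math
import torch

def build_table(emb_size: int, maxlen: int = 5000) -> torch.Tensor:
    den = torch.exp(-torch.arange(0, emb_size, 2) * math.log(10000) / emb_size)
    pos = torch.arange(0, maxlen).reshape(maxlen, 1)
    table = torch.zeros((maxlen, emb_size))
    table[:, 0::2] = torch.sin(pos * den)
    table[:, 1::2] = torch.cos(pos * den)
    return table

table = build_table(emb_size=16, maxlen=5000)
seq_len = 37                     # an input shorter than maxlen
print(table[:seq_len].shape)     # torch.Size([37, 16])
```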

How to import Transformers with Tensorflow

Question: After installing Transformers using pip install Transformers I get version 4.25.1, but when I try to import a Transformer layer with from tensorflow.keras.layers import Transformer # or from tensorflow.keras.layers.experimental import Transformer I get this error: ImportError: cannot import name ‘Transformer’ from ‘tensorflow.keras.layers’ I am using TensorFlow 2.10 and python …

Total answers: 3
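
For reference, tf.keras ships no layer named Transformer; a block is usually assembled from MultiHeadAttention and a feed-forward sublayer, roughly as in the sketch below (layer sizes are arbitrary), or a Hugging Face TF* model class is used instead.

```python
# Hedged sketch: assemble a single Transformer encoder block from the layers
# tf.keras actually provides; there is no importable `Transformer` layer.
import tensorflow as tf

def transformer_block(embed_dim: int = 64, num_heads: int = 4, ff_dim: int = 128) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(None, embed_dim))
    attn = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(inputs, inputs)
    x = tf.keras.layers.LayerNormalization(epsilon=1e-6)(inputs + attn)   # residual + norm
    ff = tf.keras.layers.Dense(ff_dim, activation="relu")(x)
    ff = tf.keras.layers.Dense(embed_dim)(ff)
    outputs = tf.keras.layers.LayerNormalization(epsilon=1e-6)(x + ff)    # residual + norm
    return tf.keras.Model(inputs, outputs)

block = transformer_block()
block.summary()
```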

Masking layer vs attention_mask parameter in MultiHeadAttention

Question: I use the MultiHeadAttention layer in my transformer model (my model is very similar to named entity recognition models). Because my sequences come with different lengths, I use padding and the attention_mask parameter in MultiHeadAttention to mask the padding. If I were to use the Masking layer before MultiHeadAttention, will …

Total answers: 2
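
A sketch of the explicit route, for comparison: build a boolean attention_mask from the padding pattern and pass it to MultiHeadAttention directly, rather than relying on a Masking layer's implicit Keras mask, whose handling inside MultiHeadAttention varies with the TF version. Shapes and sizes below are illustrative.

```python
# Hedged sketch: derive a (batch, query_len, key_len) boolean attention_mask
# from the padding pattern (token id 0 assumed to be padding) and pass it
# explicitly, instead of relying on implicit mask propagation.
import tensorflow as tf

vocab_size, embed_dim, num_heads = 1000, 64, 4

token_ids = tf.keras.Input(shape=(None,), dtype=tf.int32)
keep = tf.not_equal(token_ids, 0)                                   # True at real tokens
attention_mask = tf.logical_and(keep[:, tf.newaxis, :], keep[:, :, tf.newaxis])

x = tf.keras.layers.Embedding(vocab_size, embed_dim)(token_ids)
attn_out = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(
    x, x, attention_mask=attention_mask
)
model = tf.keras.Model(token_ids, attn_out)
model.summary()
```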

How to save/load a model checkpoint with several losses in Pytorch?

Question: Using Ubuntu 20.04, PyTorch 1.10.1. I am trying to solve a music generation task with a transformer architecture and multi-embeddings, for processing tokens with several characteristics. In each training iteration, I have to calculate the loss of each token characteristic and store it …

Total answers: 2
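
A common pattern (a sketch, not the thread's exact code) is to bundle the per-characteristic losses into the same checkpoint dictionary as the model and optimizer state; the loss keys below are purely illustrative.

```python
# Hedged sketch: save and restore several named losses alongside model and
# optimizer state in a single checkpoint file.
import torch

def save_checkpoint(path, model, optimizer, epoch, losses):
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "losses": losses,   # e.g. {"pitch": 0.42, "duration": 0.17, "velocity": 0.31} (illustrative keys)
    }, path)

def load_checkpoint(path, model, optimizer):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state_dict"])
    optimizer.load_state_dict(ckpt["optimizer_state_dict"])
    return ckpt["epoch"], ckpt["losses"]
```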

Get probability of multi-token word in MASK position

Question: It is relatively easy to get a token’s probability according to a language model, as the snippet below shows. You can get the output of a model, restrict yourself to the output of the masked token, and then find the probability of your requested token in …

Total answers: 2
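
One heuristic for the multi-token case, sketched below under the assumption of a BERT-style masked LM: place one [MASK] per sub-token of the word and multiply the per-position probabilities of its pieces. A masked LM defines no exact joint probability, so this is an approximation; the model name and sentence are placeholders.

```python
# Hedged sketch: approximate P(word) at a masked position by inserting one
# [MASK] per sub-token and multiplying per-position piece probabilities.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

word = "skateboarding"                                             # may split into several pieces
word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]

text = "The children love {} after school."
masked = text.format(" ".join([tokenizer.mask_token] * len(word_ids)))
enc = tokenizer(masked, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits

mask_positions = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
probs = logits[0, mask_positions].softmax(dim=-1)                  # (num_pieces, vocab_size)

word_prob = 1.0
for row, piece_id in enumerate(word_ids):
    word_prob *= probs[row, piece_id].item()
print(word, word_prob)
```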