How to add a dense layer on top of SentenceTransformer?

Question:

In this tutorial (Train and Fine-Tune Sentence Transformers Models) they go through creating a SentenceTransformer by combining a word embedding module with a pooling layer:

from sentence_transformers import SentenceTransformer, models

## Step 1: use an existing language model
word_embedding_model = models.Transformer('distilroberta-base')

## Step 2: use a pool function over the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())

## Join steps 1 and 2 using the modules argument
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# model.encode("Hi there")  # => works fine

And then they say:

If necessary, additional layers can be added, for example, dense, bag of words, and convolutional.

I tried to add a dense layer on top of the model, but I’m getting an error:

import torch
from sentence_transformers import SentenceTransformer, models

## Step 1: use an existing language model
word_embedding_model = models.Transformer('distilroberta-base')

## Step 2: use a pool function over the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())

##  My Dense Layer
dense_layer = torch.nn.Linear(pooling_model.get_sentence_embedding_dimension(), 128)

## Join steps 1 and 2 using the modules argument
model = SentenceTransformer(modules=[word_embedding_model, pooling_model, dense_layer])

And when I run model.encode("hi there") I get:

TypeError: linear(): argument 'input' (position 1) must be Tensor, not dict

I found the same error here but using BertModel.from_pretrained, not models.Transformer. The suggested answer (passing the argument return_dict=False) doesn’t work:

word_embedding_model = models.Transformer('distilroberta-base', return_dict=False)

TypeError: Transformer.__init__() got an unexpected keyword argument 'return_dict'

Any ideas how to add a dense layer correctly?

Asked By: Alaa M.

Answers:

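SentenceTransformer passes a dictionary of features from one module to the next, so a plain torch.nn.Linear, which expects a tensor, fails with the TypeError above. The library's own models.Dense reads the sentence embedding from that dictionary and writes the result back. According to the documentation, replace this line: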

dense_layer = torch.nn.Linear(pooling_model.get_sentence_embedding_dimension(), 128)

with the following:

dense_layer = models.Dense(pooling_model.get_sentence_embedding_dimension(), 128)
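
To illustrate why the swap works, here is a minimal sketch that uses only PyTorch and needs no model download. DictDense is a hypothetical stand-in written for this example, not the library's models.Dense implementation. It assumes the pooled vector is stored under the "sentence_embedding" key of the features dictionary, and it mirrors the Tanh activation that models.Dense applies by default. The input size of 768 is the hidden size of distilroberta-base; the batch of random vectors stands in for real pooled embeddings.

```python
import torch
from torch import nn


class DictDense(nn.Module):
    """Hypothetical stand-in for models.Dense (illustration only).

    It takes the features dictionary, transforms the pooled sentence
    embedding, and returns the dictionary so the next module can use it.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.activation = nn.Tanh()

    def forward(self, features: dict) -> dict:
        pooled = features["sentence_embedding"]
        features["sentence_embedding"] = self.activation(self.linear(pooled))
        return features


# Fake output of the pooling step: a batch of 2 pooled vectors of size 768.
features = {"sentence_embedding": torch.randn(2, 768)}

# A bare nn.Linear receives the whole dictionary and rejects it.
raised_type_error = False
try:
    nn.Linear(768, 128)(features)
except TypeError as err:
    raised_type_error = True
    print("nn.Linear failed:", err)

# The dictionary-aware layer unpacks the tensor, projects it, and repacks it.
out = DictDense(768, 128)(features)
print(tuple(out["sentence_embedding"].shape))
```

In the pipeline from the question, the models.Dense instance takes the place of dense_layer as the third entry of the modules list, and model.encode then returns 128-dimensional embeddings.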

Answered By: Ro.oT