How to convert HuggingFace's seq2seq models to ONNX format
Question:
I am trying to convert the Pegasus-newsroom model from Hugging Face's Transformers library to the ONNX format. I followed this guide published by Hugging Face. After installing the prerequisites, I ran this code:
!rm -rf onnx/
from pathlib import Path
from transformers.convert_graph_to_onnx import convert
convert(framework="pt", model="google/pegasus-newsroom", output=Path("onnx/google/pegasus-newsroom.onnx"), opset=11)
and got these errors:
ValueError Traceback (most recent call last)
<ipython-input-9-3b37ed1ceda5> in <module>()
3 from transformers.convert_graph_to_onnx import convert
4
----> 5 convert(framework="pt", model="google/pegasus-newsroom", output=Path("onnx/google/pegasus-newsroom.onnx"), opset=11)
6
7
6 frames
/usr/local/lib/python3.6/dist-packages/transformers/models/pegasus/modeling_pegasus.py in forward(self, input_ids, attention_mask, encoder_hidden_states, encoder_attention_mask, head_mask, encoder_head_mask, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
938 input_shape = inputs_embeds.size()[:-1]
939 else:
--> 940 raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
941
942 # past_key_values_length
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
I have never seen this error before. Any ideas?
Answers:
Pegasus is a seq2seq (encoder-decoder) model, and you can't convert it directly with this method. The guide is written for BERT, which is an encoder-only model; only encoder-only or decoder-only transformer models can be converted this way.
To convert a seq2seq (encoder-decoder) model, you have to split it and convert each part separately: the encoder to ONNX and the decoder to ONNX. You can follow this guide (it was written for T5, which is also a seq2seq model).
Why are you getting this error?
When converting PyTorch to ONNX with
_ = torch.onnx._export(
    model,
    dummy_input,
    ...
)
you need to provide a dummy input to the encoder and to the decoder separately. By default, this conversion method supplies a dummy input only to the encoder. Since it doesn't handle the decoder of a seq2seq model, the decoder never receives a dummy input, and you get the error above:
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
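You can reproduce this without any export at all. The sketch below (hedged: config values are arbitrary, chosen only to keep the model tiny and avoid downloading a checkpoint) calls a randomly initialised Pegasus with encoder inputs only, just as convert() does:

```python
import torch
from transformers import PegasusConfig, PegasusForConditionalGeneration

# A tiny randomly-initialised Pegasus -- hyperparameters are arbitrary,
# picked only so the model builds instantly with no download.
config = PegasusConfig(
    vocab_size=100, d_model=16, max_position_embeddings=64,
    encoder_layers=1, decoder_layers=1,
    encoder_attention_heads=2, decoder_attention_heads=2,
    encoder_ffn_dim=32, decoder_ffn_dim=32,
)
model = PegasusForConditionalGeneration(config).eval()
input_ids = torch.randint(0, 100, (1, 8))

# Supply only the encoder input, as convert() does during tracing:
try:
    model(input_ids=input_ids)
except ValueError as e:
    print(e)  # the same "decoder_input_ids or decoder_inputs_embeds" error
```

With no decoder_input_ids (and no labels to derive them from), the decoder's forward pass raises the ValueError seen in the traceback.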
The ONNX export of canonical models from the Transformers library is supported out of the box in the Optimum library (pip install optimum):
optimum-cli export onnx --model t5-small --task seq2seq-lm-with-past --for-ort t5_small_onnx/
Which will give:
.
└── t5_small_onnx
├── config.json
├── decoder_model.onnx
├── decoder_with_past_model.onnx
├── encoder_model.onnx
├── special_tokens_map.json
├── spiece.model
├── tokenizer_config.json
└── tokenizer.json
You can check optimum-cli export onnx --help for more details. What's cool is that the exported model can then be used directly with ONNX Runtime via ORTModelForSeq2SeqLM (e.g. here).
Pegasus itself is not yet supported, but will soon be: https://github.com/huggingface/optimum/pull/620
Disclaimer: I am a contributor to this library.