How to load transformers pipeline from folder?
Question:
According to the documentation, a pipeline can be saved locally with its save_pretrained method. When I use it, I see a folder created with a bunch of json and bin files, presumably for the tokenizer and the model.
But the documentation does not specify a load method. How does one initialize a pipeline from a locally saved pipeline?
Answers:
If you read the specification for save_pretrained, it simply states that it
"Save[s] the pipeline’s model and tokenizer."
I’ve also given a slightly related answer on how custom models and tokenizers can be loaded. Essentially, you can pass the specific models or paths directly to pipeline:
from transformers import pipeline, AutoModelForSeq2SeqLM, AutoTokenizer

# Replace with your custom model of choice. Use the Auto class that matches the
# task: a bare AutoModel has no task head, so summarization needs a seq2seq class.
model = AutoModelForSeq2SeqLM.from_pretrained('/path/to/your/model')
tokenizer = AutoTokenizer.from_pretrained('/path/to/your/tokenizer')
pipe = pipeline(task='summarization',  # replace with whatever task you have
                model=model,
                tokenizer=tokenizer)
The default initialization also works with local folders. So one can download a model and save it like this:
pipe = pipeline("text-classification")
pipe.save_pretrained("my_local_path")
And later load it like this:
pipe = pipeline("text-classification", model="my_local_path")
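If loading from the folder fails, it can help to check what save_pretrained actually wrote before calling pipeline. The sketch below is only an illustration: the helper names missing_pipeline_files and load_local_pipeline are hypothetical, and the expected file names (config.json, tokenizer_config.json, and a weights file ending in .safetensors, .bin, .h5 or .msgpack) are an assumption based on what the library commonly writes, which may differ across versions.

```python
from pathlib import Path

# Assumption: weights are saved under one of these extensions, depending on the
# framework and library version (sharded checkpoints use the same extensions).
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".h5", ".msgpack")


def missing_pipeline_files(folder):
    """Return a list of problems found in a saved-pipeline folder (empty if none)."""
    root = Path(folder)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json (model configuration) is missing")
    if not any(p.is_file() and p.suffix in WEIGHT_SUFFIXES for p in root.iterdir()):
        problems.append("no model weights file found")
    if not (root / "tokenizer_config.json").is_file():
        problems.append("tokenizer_config.json is missing")
    return problems


def load_local_pipeline(task, folder):
    """Load a pipeline from a local folder, failing early with a readable message."""
    problems = missing_pipeline_files(folder)
    if problems:
        raise FileNotFoundError("; ".join(problems))
    # Imported lazily so the folder check itself works without transformers installed.
    from transformers import pipeline
    return pipeline(task, model=str(folder))
```

With the folder from the answer above, load_local_pipeline("text-classification", "my_local_path") would then behave like the plain pipeline call, but report a missing config or tokenizer file up front.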