huggingface-transformers

How to prevent the transformers generate function from producing certain words?

How to prevent the transformers generate function from producing certain words? Question: I have the following code:

    from transformers import T5Tokenizer, T5ForConditionalGeneration
    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
    sequence_ids = model.generate(input_ids)
    sequences = tokenizer.batch_decode(sequence_ids)
    sequences

Currently it produces this: ['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>'] Is there a …
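
One established route, sketched below under the assumption of a recent transformers version, is the bad_words_ids argument of generate(), which blocks the listed token sequences during decoding; banning "park" here is purely illustrative:

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids

    # Token ids of the words to ban; add_special_tokens=False keeps only
    # the word pieces themselves, without </s> etc.
    bad_words_ids = tokenizer(["park"], add_special_tokens=False).input_ids

    sequence_ids = model.generate(input_ids, bad_words_ids=bad_words_ids)
    print(tokenizer.batch_decode(sequence_ids))

One caveat: subword tokenizers often encode a word differently with and without a leading space, so both variants may need to appear in the ban list.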

Total answers: 1

Loading a Hugging Face model is taking too much memory

Loading a Hugging Face model is taking too much memory Question: I am trying to load a large Hugging Face model with code like below:

    model_from_disc = AutoModelForCausalLM.from_pretrained(path_to_model)
    tokenizer_from_disc = AutoTokenizer.from_pretrained(path_to_model)
    generator = pipeline("text-generation", model=model_from_disc, tokenizer=tokenizer_from_disc)

The program quickly crashes after the first line because it runs out of memory. Is there a way …
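
A minimal sketch of the usual mitigations, assuming a recent transformers with accelerate installed and a checkpoint that tolerates half precision; the path is a placeholder:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    path_to_model = "path/to/model"  # placeholder

    # low_cpu_mem_usage avoids materializing a second full copy of the
    # weights while loading; float16 halves the size of the weights.
    model = AutoModelForCausalLM.from_pretrained(
        path_to_model,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        device_map="auto",  # needs accelerate; can offload layers to CPU/disk
    )
    tokenizer = AutoTokenizer.from_pretrained(path_to_model)
    generator = pipeline("text-generation", model=model, tokenizer=tokenizer)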

Total answers: 1

What does config inside "super().__init__(config)" actually do?

What does config inside "super().__init__(config)" actually do? Question: I have the following code to create a custom model for named-entity recognition. Using ChatGPT and Copilot, I've commented it to understand its functionality. However, the point of config inside super().__init__(config) is not clear to me. What role does it play, since we have already used XLMRobertaConfig at …
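
The short answer: the parent's __init__ (ultimately PreTrainedModel.__init__) validates the config and stores it as self.config, which the weight-initialization and save/from_pretrained machinery read later, so the subclass has to pass it up. A hedged sketch of the typical pattern, assuming a recent transformers; the class name and head are illustrative:

    import torch.nn as nn
    from transformers import XLMRobertaConfig
    from transformers.models.roberta.modeling_roberta import (
        RobertaModel, RobertaPreTrainedModel)

    class XLMRobertaForTokenClassificationCustom(RobertaPreTrainedModel):
        config_class = XLMRobertaConfig

        def __init__(self, config):
            # Runs PreTrainedModel.__init__, which stores the config as
            # self.config for everything that consults it afterwards.
            super().__init__(config)
            self.num_labels = config.num_labels
            # Every layer below is sized from fields of that same config.
            self.roberta = RobertaModel(config, add_pooling_layer=False)
            self.dropout = nn.Dropout(config.hidden_dropout_prob)
            self.classifier = nn.Linear(config.hidden_size, config.num_labels)
            self.post_init()  # weight init also reads self.config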

Total answers: 1

Huggingface Trainer throws an AttributeError: 'Namespace' object has no attribute 'get_process_log_level'

Huggingface Trainer throws an AttributeError: 'Namespace' object has no attribute 'get_process_log_level' Question: I am trying to run Trainer from Hugging Face (PyTorch) with an argument parser. My code looks like:

    if __name__ == '__main__':
        parser = HfArgumentParser(TrainingArguments)
        parser.add_argument('--model_name_or_path', type=str, required=True)
        ...
        training_args = parser.parse_args()
        print('args', training_args)
        os.makedirs(training_args.output_dir, exist_ok=True)
        random.seed(training_args.seed)
        set_seed(training_args.seed)
        dataset_train = …
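
The error fits a common mix-up: parse_args() returns a plain argparse.Namespace, while Trainer expects a TrainingArguments instance, which is where get_process_log_level is defined. A hedged sketch of the fix, assuming the extra argument should stay outside the dataclass:

    from transformers import HfArgumentParser, TrainingArguments, set_seed

    if __name__ == '__main__':
        parser = HfArgumentParser(TrainingArguments)
        parser.add_argument('--model_name_or_path', type=str, required=True)
        # Returns a real TrainingArguments, plus a Namespace holding any
        # arguments added via add_argument after construction.
        training_args, extra_args = parser.parse_args_into_dataclasses()
        set_seed(training_args.seed)
        print('model:', extra_args.model_name_or_path)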

Total answers: 1

How to do inference with fine-tuned Huggingface models?

How to do inference with fine-tuned Huggingface models? Question: I have fine-tuned a Huggingface model using the IMDB dataset, and I was able to use the trainer to make predictions on the test set by doing trainer.predict(test_ds_encoded). However, when doing the same thing with the inference set that has a dummy label feature (all -1s …
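
A hedged sketch of a route that sidesteps dummy labels entirely: load the saved checkpoint and call the model directly. The directory and texts are placeholders, and a sequence-classification head is assumed since IMDB is a sentiment task:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_dir = "path/to/fine-tuned-checkpoint"  # placeholder
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    texts = ["A gripping film.", "Two hours I want back."]
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # no labels needed at inference time
    print(logits.argmax(dim=-1).tolist())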

Total answers: 1

Why is the parallel version of my code slower than the serial one?

Why is the parallel version of my code slower than the serial one? Question: I am trying to run a model multiple times, which is time-consuming. As a solution, I tried to make it parallel, but it ends up being slower: the parallel version takes 40 seconds while the serial one takes 34 seconds. …
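
Without seeing the code this is only a guess, but a frequent culprit with PyTorch models is thread oversubscription: each worker process spawns a full set of intra-op threads and they contend for the same cores. A hedged sketch of the usual countermeasure, with a toy matmul standing in for the real model:

    import multiprocessing as mp
    import torch

    def run_once(seed: int) -> float:
        torch.set_num_threads(1)  # one intra-op thread per worker
        torch.manual_seed(seed)
        x = torch.randn(512, 512)
        return (x @ x).sum().item()  # stand-in for the real model call

    if __name__ == "__main__":
        with mp.Pool(processes=4) as pool:
            results = pool.map(run_once, range(8))
        print(results)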

Total answers: 1

Sending a large file to gcloud worked on another internet connection but not mine

Sending a large file to gcloud worked on another internet connection but not mine Question: So I am doing this to send my 400 MB AI model to the cloud:

    model_file = pickle.dumps(model)
    blob = bucket.blob("models/{user_id}.pickle")
    blob.upload_from_string(model_file)

It takes a long time to process, then I get three errors: ssl.SSLWantWriteError: The operation did not complete (write) …
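
A hedged sketch of the usual workaround for a flaky uplink with google-cloud-storage: write the pickle to disk, shrink the resumable-upload chunk size so each request stays short and retryable, and raise the timeout. (As an aside, the snippet above lacks the f prefix on "models/{user_id}.pickle", so the placeholder is uploaded literally.) Bucket name, user id, and model below are placeholders, and credentials are assumed to be configured:

    import pickle
    from google.cloud import storage

    model = {"weights": [0.0]}  # stand-in for the real 400 MB model
    user_id = "user123"         # placeholder

    with open("model.pickle", "wb") as fh:
        pickle.dump(model, fh)

    client = storage.Client()
    bucket = client.bucket("my-bucket")              # placeholder bucket
    blob = bucket.blob(f"models/{user_id}.pickle")   # note the f prefix

    # chunk_size must be a multiple of 256 KB; 5 MB chunks upload in small,
    # individually retried requests instead of one long-lived write.
    blob.chunk_size = 5 * 1024 * 1024
    blob.upload_from_filename("model.pickle", timeout=600)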

Total answers: 1

tokenizer.push_to_hub(repo_name) is not working

tokenizer.push_to_hub(repo_name) is not working Question: I'm trying to push my tokenizer to my Hugging Face repo… it consists of the model's vocab.json (I'm making a speech recognition model). My code:

    vocab_dict["|"] = vocab_dict[" "]
    del vocab_dict[" "]
    vocab_dict["[UNK]"] = len(vocab_dict)
    vocab_dict["[PAD]"] = len(vocab_dict)
    len(vocab_dict)

    import json
    with open('vocab.json', 'w') as vocab_file:
        json.dump(vocab_dict, vocab_file)

    from transformers import …
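
A hedged sketch of the usual sequence for a CTC speech model: build the tokenizer from vocab.json with Wav2Vec2CTCTokenizer and push it, which requires prior authentication (e.g. huggingface-cli login). The vocabulary and repo name are placeholders:

    import json
    from transformers import Wav2Vec2CTCTokenizer

    vocab_dict = {"a": 0, "b": 1, "|": 2, "[UNK]": 3, "[PAD]": 4}  # toy vocab
    with open("vocab.json", "w") as vocab_file:
        json.dump(vocab_dict, vocab_file)

    tokenizer = Wav2Vec2CTCTokenizer(
        "vocab.json",
        unk_token="[UNK]",
        pad_token="[PAD]",
        word_delimiter_token="|",
    )
    # Fails without a valid token; run `huggingface-cli login` first.
    tokenizer.push_to_hub("my-username/my-asr-tokenizer")  # placeholder repo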

Total answers: 3

Decoding hidden layer embeddings in T5

Decoding hidden layer embeddings in T5 Question: I’m new to NLP (pardon the very noob question!), and am looking for a way to perform vector operations on sentence embeddings (e.g., randomization in embedding-space in a uniform ball around a given sentence) and then decode them. I’m currently attempting to use the following strategy with T5 …
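
A hedged sketch of one way to decode from (possibly perturbed) encoder states with T5, assuming a recent transformers version: run the encoder yourself, wrap the hidden states in a BaseModelOutput, and pass them to generate() via encoder_outputs. The noise scale is purely illustrative:

    import torch
    from transformers import T5ForConditionalGeneration, T5Tokenizer
    from transformers.modeling_outputs import BaseModelOutput

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    inputs = tokenizer("translate English to German: The house is wonderful.",
                       return_tensors="pt")
    # Encoder hidden states, shape (batch, seq_len, d_model).
    hidden = model.encoder(**inputs).last_hidden_state
    hidden = hidden + 0.01 * torch.randn_like(hidden)  # illustrative noise

    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=hidden),
        attention_mask=inputs.attention_mask,
    )
    print(tokenizer.batch_decode(out, skip_special_tokens=True))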

Total answers: 1