Error updating NER model in spaCy 3, any advice?

Question:

I am updating the NER model from the fr_core_news_lg pipeline. The code worked when I last ran it one or two months ago, but now it no longer runs. I haven’t changed anything in the code; I just wanted to run it again. Instead, I get the following error:

Traceback (most recent call last):
  File "../nermodel.py", line 174, in <module>
    ner_model.train(med_label)
  File "../nermodel.py", line 102, in train
    optimizer = self.nlp.entity.create_optimizer()
AttributeError: 'French' object has no attribute 'entity'

The error points to the part of the code where I update my NER model with new examples:

def train(self, label, n_iter=10, batch_size=50):
    # creating an optimizer and selecting a list of pipes NOT to train
    optimizer = self.nlp.entity.create_optimizer()
    other_pipes = [pipe for pipe in self.nlp.pipe_names if pipe != 'ner']

    # adding a named entity label
    ner = self.nlp.get_pipe('ner')
    ner.add_label(label)

    with self.nlp.disable_pipes(*other_pipes):
        for itn in range(n_iter):
            random.shuffle(self.train_data)
            losses = {}

            # batch the examples and iterate over them
            for batch in spacy.util.minibatch(self.train_data, size=batch_size):
                texts = [text for text, entities in batch]
                annotations = [entities for text, entities in batch]

                # update the model
                self.nlp.update(texts, annotations, sgd=optimizer, losses=losses)
                print(losses)
    print("Final loss: ", losses)

A single training example, so that NER learns that ‘consultation’ is an entity, goes as follows:

('et la consultation post-réanimation', {'entities': [(6, 18, 'MEDICAL_TERM')]})
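
Each entity is a (start, end, label) triple of character offsets into the text, with the end offset exclusive. As a rough sketch for checking that the offsets select the intended span, the helper below slices the text with them; `entity_texts` is a hypothetical name used only for illustration and is not part of the original nermodel.py.

```python
# Hypothetical helper (not from the original code): return the substring
# and label that each (start, end, label) annotation selects.
def entity_texts(example):
    text, annotations = example
    return [
        (text[start:end], label)
        for start, end, label in annotations["entities"]
    ]


sample = (
    "et la consultation post-réanimation",
    {"entities": [(6, 18, "MEDICAL_TERM")]},
)

# Characters 6 up to (but not including) 18 spell "consultation".
print(entity_texts(sample))
```

Running a check like this over the whole training set before training can surface off-by-one offsets early.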

I’ve updated spaCy to the most recent version, re-downloaded the fr_core_news_lg model, and even tried a fresh Python environment, to no avail. This makes me think something changed in the pipeline or in the spaCy library. Googling around, I couldn’t find a precise answer. Does anybody have a fix for this?

EDIT: Provided more details.

Asked By: Rafael


Answers:

I think this code should work for you:

# module-level imports needed by train()
import random

import spacy
from spacy.training import Example

def train(self, label, n_iter=10, batch_size=50):
    # creating an optimizer and selecting a list of pipes NOT to train
    optimizer = self.nlp.create_optimizer()
    other_pipes = [pipe for pipe in self.nlp.pipe_names if pipe != 'ner']

    # adding a named entity label
    ner = self.nlp.get_pipe('ner')
    ner.add_label(label)

    with self.nlp.disable_pipes(*other_pipes):
        for itn in range(n_iter):
            random.shuffle(self.train_data)
            losses = {}

            # batch the examples and iterate over them
            for batch in spacy.util.minibatch(self.train_data, size=batch_size):
                # spaCy 3: wrap each (text, annotations) pair in an Example
                examples = [
                    Example.from_dict(self.nlp.make_doc(text), annotations)
                    for text, annotations in batch
                ]
                # update the model on the whole batch
                self.nlp.update(examples, drop=0.35, sgd=optimizer, losses=losses)
                print(losses)
    print("Final loss: ", losses)

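To break it down a little further, two things changed in spaCy 3:

  1. The nlp.entity attribute no longer exists, so nlp.entity.create_optimizer() becomes nlp.create_optimizer().
  2. nlp.update() no longer takes texts and annotations directly. Each (text, annotations) pair is wrapped in an Example (from spacy.training), and a list of Example objects is passed instead.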
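
The two changes can be tried in isolation with a minimal sketch like the one below. It assumes a blank French pipeline from spacy.blank("fr") in place of fr_core_news_lg, so that it runs without downloading a model. Because of that assumption it calls nlp.initialize(), which a blank pipeline needs and which also returns an optimizer; with a pretrained pipeline you would likely keep nlp.create_optimizer() as in the code above.

```python
import spacy
from spacy.training import Example

# Assumption: a blank French pipeline stands in for fr_core_news_lg.
nlp = spacy.blank("fr")
ner = nlp.add_pipe("ner")
ner.add_label("MEDICAL_TERM")

text = "et la consultation post-réanimation"
annotations = {"entities": [(6, 18, "MEDICAL_TERM")]}

# Change 2: wrap the text and its annotations in an Example.
example = Example.from_dict(nlp.make_doc(text), annotations)

# Change 1: the optimizer comes from the nlp object, not from nlp.entity.
# A blank pipeline has untrained weights, so it is initialized here.
optimizer = nlp.initialize(lambda: [example])

losses = {}
nlp.update([example], sgd=optimizer, losses=losses)
print(losses)
```

The printed dictionary should contain a loss under the "ner" key; the exact value varies from run to run.
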
Answered By: krisograbek