spaCy 3 beam parse for NER probability

Question:

I’m trying to retrieve the probability (confidence score) that my spaCy model assigns to each entity label it predicts. I’m using spaCy 3.0.5.

from collections import defaultdict

threshold = 0.5

for text in testing_raw:
    doc = nlp_updated(text)
    # This call raises the AttributeError shown below in spaCy 3.
    beams = nlp_updated.beam_parse([doc], beam_width=16, beam_density=0.0001)
    entity_scores = defaultdict(float)

    # Sum the probability of every beam parse that contains each entity.
    for beam in beams:
        for score, ents in nlp_updated.entity.moves.get_beam_parses(beam):
            for start, end, label in ents:
                entity_scores[(start, end, label)] += score

    # Report entities whose accumulated score exceeds the threshold.
    for (start, end, label), score in entity_scores.items():
        if score > threshold:
            print('Label: {}, Text: {}, Score: {}'.format(label, doc[start:end], score))

The following line throws this error:

beams = nlp_updated.beam_parse([ doc ], beam_width = 16, beam_density = 0.0001)

AttributeError: 'English' object has no attribute 'beam_parse'

Is this because spaCy version 3 no longer supports beam_parse? If so, how can I get these scores in spaCy 3? I can’t find anything about it in the documentation.

Asked By: user47467

Answers:

This workaround for getting NER probabilities doesn’t work in v3 because the API has changed, and there’s no recommended replacement at the moment.

A SpanCategorizer is being developed that will allow you to get NER labels with confidence scores.

Answered By: polm23

Use nlp.get_pipe('ner') instead of nlp.entity to get the NER component (spaCy 3.4.4).
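
A minimal sketch of how the question's loop might look with that change, assuming the NER component returned by nlp.get_pipe('ner') exposes beam_parse and moves.get_beam_parses as in the spaCy 3.x source. The helper names aggregate_entity_scores and entity_scores_for_text are hypothetical and only serve the illustration. The sketch starts from nlp.make_doc(text) so that entities already set on the doc do not constrain the beam search; this assumes the NER component has its own tok2vec layer, which may not hold for every pipeline.

```python
from collections import defaultdict


def aggregate_entity_scores(parses):
    """Sum the probabilities of all beam parses that contain each entity.

    `parses` is an iterable of (probability, entities) pairs, where each
    entity is a (start, end, label) tuple of token offsets.
    """
    scores = defaultdict(float)
    for prob, ents in parses:
        for start, end, label in ents:
            scores[(start, end, label)] += prob
    return dict(scores)


def entity_scores_for_text(nlp, text, beam_width=16, beam_density=0.0001):
    """Hypothetical helper: beam-parse one text with the pipeline's NER component.

    Assumption: the component exposes `beam_parse` and
    `moves.get_beam_parses`, as it does in the spaCy 3.x source.
    """
    ner = nlp.get_pipe("ner")   # replaces nlp.entity from spaCy 2
    doc = nlp.make_doc(text)    # tokenized only, no entities set yet
    beams = ner.beam_parse([doc], beam_width=beam_width, beam_density=beam_density)
    parses = ner.moves.get_beam_parses(beams[0])  # one beam per input doc
    return doc, aggregate_entity_scores(parses)
```

With the question's variables this could be called as doc, scores = entity_scores_for_text(nlp_updated, text), after which each (start, end, label) key whose score exceeds threshold can be printed with doc[start:end] as before. These scores come from an unofficial workaround rather than a documented API, so treat them as rough confidence estimates.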

Answered By: Delta3