Spacy add_alias, TypeError

Question:

MWE

from spacy.kb import KnowledgeBase
import spacy 


#kb.add_entity already called.
nlp = spacy.blank("en")
kb = KnowledgeBase(vocab=nlp.vocab, entity_vector_length=96)
name = "test"
qid = 1 # type(qid) => int
kb.add_alias(alias=name.lower(), entities=[qid], probabilities=[1])

This produces an error at the last line: TypeError: an integer is required

A previous SO post suggested that the same error arose in another context (importing spaCy) because the installed version of srsly was greater than 2. Downgrading srsly to v1.0.1, as suggested there, merely changed the error to: module srsly has no attribute read_yaml.

I am using spacy 3.4.4 and srsly 2.4.5.

Update
A fuller stack trace points to line 228 in spacy/kb.pyx:

for entity, prob in zip(entities, probabilities):
    entity_hash = self.vocab.strings[entity]  # this gives the error
    if not entity_hash in self._entry_index:
        raise ValueError(Errors.E134.format(entity=entity))

    entry_index = <int64_t>self._entry_index.get(entity_hash)
    entry_indices.push_back(int(entry_index))
    probs.push_back(float(prob))

Asked By: mac389

||

Answers:

That looks like a bug. In the API docs, KnowledgeBase.add_alias has the type Iterable[Union[str, int]] for entities, but the code above only works for str values, not int values (the error is actually raised one line below the marked line). The marked line should use self.vocab.strings.as_int(entity) instead.

That said, the value 1 is probably not the right value here in any case. The simplest solution is to use a string ID instead, such as "1" or "Q1", which should currently work as expected. You also need to add the entity before adding aliases: this snippet will not work even with a string value.
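
For illustration, here is a minimal sketch of that order of calls (add the entity with a string ID first, then the alias). The ID "Q1", the frequency, the vector length of 3 and the all-zero vector are placeholder values chosen for this example, not values from the question. The try/except import is there because, as far as I know, spaCy 3.5 and later expose the concrete in-memory class as InMemoryLookupKB, while 3.4.x (the asker's version) uses KnowledgeBase directly.

```python
import spacy

try:
    # Assumption: spaCy >= 3.5, where the concrete in-memory KB has this name.
    from spacy.kb import InMemoryLookupKB as KnowledgeBase
except ImportError:
    # spaCy 3.4.x, as in the question.
    from spacy.kb import KnowledgeBase

nlp = spacy.blank("en")
# A short vector length keeps the placeholder vector readable.
kb = KnowledgeBase(vocab=nlp.vocab, entity_vector_length=3)

qid = "Q1"     # string ID instead of the int 1 (placeholder value)
name = "test"

# The entity has to exist in the KB before an alias can point to it.
kb.add_entity(entity=qid, freq=1, entity_vector=[0.0, 0.0, 0.0])

# The entity IDs are strings, so the lookup in add_alias succeeds.
kb.add_alias(alias=name.lower(), entities=[qid], probabilities=[1.0])

print(kb.get_entity_strings())
print(kb.get_alias_strings())
```

In a real knowledge base the entity vector and frequency would come from your own data; only the string ID and the add_entity-before-add_alias order matter for avoiding the error.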

Answered By: aab