displaCy: custom colors for custom entities

Question:

I have a list of words and noun-verb phrases, and I want to:

  • search for dependency patterns and words in a corpus of text
  • identify the paragraph each match appears in
  • extract the paragraph
  • highlight the matched words in the paragraph
  • create a snip/JPEG of the paragraph with the matched words highlighted
  • save the image in an Excel file.

The MWE below pertains to highlighting the matched words and displaying them using displacy. I have mentioned the rest of my task just to provide the context. The output isn’t coloring the custom entities with custom colors.

import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

good = ['bacon', 'chicken', 'lamb','hot dog']
bad = [ 'apple', 'carrot']

nlp = spacy.load('en_core_web_sm')  
patterns1 = [nlp(good) for good in good]
patterns2 = [nlp(bad) for bad in bad]
matcher = PhraseMatcher(nlp.vocab)
matcher.add('good', None, *patterns1)
matcher.add('bad', None, *patterns2)

doc = nlp("I like bacon and chicken but unfortunately I only had an apple and a carrot in the fridge")
matches = matcher(doc)

for match_id, start, end in matches:
    span = Span(doc, start, end, label=match_id)
    doc.ents = list(doc.ents) + [span]  # add span to doc.ents

print([(ent.text, ent.label_) for ent in doc.ents])  

The code above produces this output:

[('bacon', 'good'), ('chicken', 'good'), ('apple', 'bad'), ('carrot', 'bad')]

But when I try to assign custom colors to the entities, it doesn’t seem to work.

from spacy import displacy
colors = {'good': "#85C1E9", "bad": "#ff6961"}
options = {"ents": ['good', 'bad'], "colors": colors}

displacy.serve(doc, style='ent',options=options)

This is the output I get:

[Screenshot of the displaCy output; image not included]

Asked By: Amatya


Answers:

I just copy/pasted your code and it works fine here. I’m using spaCy v3.1.1.

What does the HTML output source look like?


I was able to reproduce your issue on spaCy 2.3.5, and I was able to fix it by making the labels uppercase (GOOD and BAD). I can’t find a bug report about this, but since the models normally only use uppercase labels, I guess this is an issue with older versions.
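
The uppercase workaround can be sketched roughly as follows. This is a minimal sketch, assuming the lowercase labels are the only problem; the helper names `uppercase_displacy_options` and `render_with_uppercase_labels` are hypothetical and not part of spaCy. The spaCy imports sit inside the second function so the first helper can be tried on its own.

```python
def uppercase_displacy_options(colors):
    """Return displaCy options whose entity labels are all uppercase."""
    upper = {label.upper(): color for label, color in colors.items()}
    return {"ents": sorted(upper), "colors": upper}


def render_with_uppercase_labels(doc, colors):
    """Relabel the doc's entities in uppercase and render them as HTML.

    Hypothetical helper: assumes `doc` is a spaCy Doc whose entities were
    set from the matcher results, as in the question.
    """
    # Imported here so the helper above does not require spaCy.
    from spacy import displacy
    from spacy.tokens import Span

    doc.ents = [
        Span(doc, ent.start, ent.end, label=ent.label_.upper())
        for ent in doc.ents
    ]
    options = uppercase_displacy_options(colors)
    # jupyter=False makes render() return the markup as a string.
    return displacy.render(doc, style="ent", options=options, jupyter=False)


options = uppercase_displacy_options({"good": "#85C1E9", "bad": "#ff6961"})
print(options)
```

With the `doc` from the question, `render_with_uppercase_labels(doc, {"good": "#85C1E9", "bad": "#ff6961"})` should return the HTML markup. Registering the patterns under 'GOOD' and 'BAD' in the first place would avoid the relabeling step.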

Answered By: polm23

I am sharing my own, quite customized, example, which uses spaCy to visualize FrameNet annotations:

import nltk
nltk.download('framenet_v17')
from nltk.corpus import framenet as fn
from spacy import displacy
import matplotlib
import matplotlib.pyplot as plt

FRAME_NAME = "Expectation"  # choose your own from fn.frames()
FRAME_ELEMENTS = [e.name for _, e in fn.frame_by_name(FRAME_NAME)['FE'].items() if e['coreType'] == 'Core']
print(f"Frame={FRAME_NAME},CoreElements={'+'.join(FRAME_ELEMENTS)}")

COLORS = plt.cycler("color", plt.cm.Pastel2.colors)  # choose your own colormap
COLORS = [matplotlib.colors.to_hex(c['color']) for c in COLORS]  # convert to hex strings; RGB tuples did not seem to work
COLORS = dict(zip([FRAME_NAME] + FRAME_ELEMENTS, COLORS))

for s in fn.exemplars(frame=FRAME_NAME):

    span_labels = [dict(s)['Target'][0]+(FRAME_NAME,)]+dict(s)['FE'][0]
    span_labels = [{"start":t[0],"end":t[1],"label":t[2]} for t in span_labels]

    args = {
        "text": s['text'],
        "ents": span_labels,
    }

    displacy.render(args,style="ent",manual=True,options={"colors":COLORS})
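
The same manual mode could also cover the paragraph-level task from the question. The sketch below is only an illustration under stated assumptions: paragraphs are separated by blank lines, plain case-insensitive word matching stands in for the PhraseMatcher, and the function name `paragraph_entities` is hypothetical. It builds the `{"text": ..., "ents": [...]}` dictionaries that displaCy's manual mode expects, with character offsets and uppercase labels.

```python
import re


def paragraph_entities(text, terms_by_label):
    """Return one manual-mode dict per paragraph that contains a match."""
    results = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        ents = []
        for label, terms in terms_by_label.items():
            for term in terms:
                # Whole-word, case-insensitive match of the literal term.
                pattern = r"\b" + re.escape(term) + r"\b"
                for m in re.finditer(pattern, paragraph, flags=re.IGNORECASE):
                    ents.append(
                        {"start": m.start(), "end": m.end(), "label": label.upper()}
                    )
        if ents:
            # Keep the spans in reading order.
            ents.sort(key=lambda e: (e["start"], e["end"]))
            results.append({"text": paragraph, "ents": ents})
    return results


corpus = (
    "I like bacon and chicken.\n\n"
    "The weather was fine.\n\n"
    "I only had an apple and a carrot."
)
terms = {"good": ["bacon", "chicken", "lamb", "hot dog"], "bad": ["apple", "carrot"]}
docs = paragraph_entities(corpus, terms)
for d in docs:
    print(d["text"], [d["text"][e["start"]:e["end"]] for e in d["ents"]])
```

The resulting list can be passed to `displacy.render(docs, style="ent", manual=True, options={"colors": {"GOOD": "#85C1E9", "BAD": "#ff6961"}})`. Turning the rendered HTML into an image and placing it in a spreadsheet is left out here.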

Answered By: Maciej S.