How to change the highlight color in a PDF using the fitz module in Python

Question:

Hi, I am trying to change the highlight color in a PDF but am not able to do so.
The default highlight color is yellow, but I want to change it.
Here is my code:

    import fitz

    doc = fitz.open(r"pathinput.pdf")

    page=doc[0]
    text="some text"
    text_instances = page.searchFor(text)


    for inst in text_instances:
        highlight = page.addHighlightAnnot(inst)
        highlight.setColors(colors='Red')
        highlight.update()


    doc.save(r"pathoutput.pdf")    

Also, how do I search the entire PDF at once rather than just one page?

And how can I highlight text that appears inside an image in a PDF?

Asked By: Gavya Mehta


Answers:

I think setColors expects a dictionary (check the documentation):

    import fitz

    doc = fitz.open("test.pdf")
    page = doc[0]
    text = "result"
    text_instances = page.searchFor(text)

    for inst in text_instances:
        highlight = page.addHighlightAnnot(inst)
        highlight.setColors({"stroke":(0, 0, 1), "fill":(0.75, 0.8, 0.95)})
        highlight.update()

    doc.save("output.pdf")


Answered By: ybl

I tried the following and it worked:

    import fitz

    doc = fitz.open(r"pathtopdffile.pdf")
    page = doc[6]
    # highlighting a pre-determined coordinate
    highlight = page.addHighlightAnnot((10, 628.9634743875279, 642.0, 640.9634743875279))
    highlight.set_colors(stroke=[1, 0.8, 0.8]) # light red color (r, g, b)
    highlight.update()

In this snippet I annotate using the coordinates of the text, which I get from external code.
Some more colors I used:

    highlight.set_colors(stroke=[0.5, 1, 1]) # light aqua
    highlight.set_colors(stroke=[0.5, 0, 0]) # dark red (maroon)

A simple way to work out the color is to multiply each of these values by 255.

To go the other way (converting 0-255 RGB to this notation), divide each value by 255.
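The conversion in both directions can be sketched as two small helpers. The function names here are made up for illustration and are not part of fitz/PyMuPDF; they only do the multiply/divide-by-255 arithmetic described above.

```python
# Hypothetical helpers (not part of PyMuPDF) for converting between
# 0-255 RGB and the 0-1 float notation used by set_colors.

def rgb255_to_unit(r, g, b):
    """Convert 0-255 RGB components to a tuple of 0-1 floats."""
    for c in (r, g, b):
        if not 0 <= c <= 255:
            raise ValueError(f"component out of range 0-255: {c}")
    return (r / 255, g / 255, b / 255)


def unit_to_rgb255(color):
    """Convert 0-1 float components to the nearest 0-255 integers."""
    for c in color:
        if not 0 <= c <= 1:
            raise ValueError(f"component out of range 0-1: {c}")
    return tuple(round(c * 255) for c in color)
```

The result of rgb255_to_unit(255, 204, 204) could then be passed as the stroke argument, giving the same light red as stroke=[1, 0.8, 0.8].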

Example:

    stroke=[0.5, 1, 1] # RGB(255*0.5, 255*1, 255*1) = RGB(127.5, 255, 255), i.e. about RGB(128, 255, 255)

Answered By: sourin

If anyone is having issues with the code above, replace

    text_instances = page.searchFor(text)

with

    text_instances = page.search_for(text)

as the method was renamed in later versions of fitz/PyMuPDF.
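On the follow-up about searching the whole PDF rather than one page: looping over every page of the document should do it. Below is a minimal sketch, assuming a PyMuPDF version that has the snake_case methods (search_for, add_highlight_annot, set_colors). The function name highlight_everywhere and its default color are made up for illustration.

```python
def highlight_everywhere(doc, text, stroke=(1, 0.8, 0.8)):
    """Highlight every occurrence of `text` on every page of `doc`.

    `doc` is assumed to be an already opened PyMuPDF document,
    e.g. the result of fitz.open(...). Returns the number of
    highlight annotations that were added.
    """
    count = 0
    for page in doc:  # iterating a document yields its pages in order
        for rect in page.search_for(text):
            annot = page.add_highlight_annot(rect)
            annot.set_colors(stroke=stroke)  # r, g, b as floats in 0-1
            annot.update()  # apply the color change to the annotation
            count += 1
    return count


# Usage (requires PyMuPDF; the file names are placeholders):
# import fitz
# doc = fitz.open("input.pdf")
# print(highlight_everywhere(doc, "some text"))
# doc.save("output.pdf")
```

Note that search_for only matches text in the page's text layer, so words that exist only as pixels inside an image will not be found without running OCR first.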

Answered By: Enda Crossan