Python AttributeError: 'Page' object has no attribute '_getContents'

Question:

I'm trying to remove a watermark from a PDF using Python. I am using PyMuPDF (imported as fitz), and the code I am running is:

import fitz
def remove_img_on_pdf(idoc, page):
    #image list
    img_list = idoc.getPageImageList(page)
    con_list = idoc[page]._getContents()


    # xref 274 is the only /Contents object of the page (could be
    for i in con_list:
        c = idoc._getXrefStream(i) # read the stream source
        #print(c)
        if c != None:
            for v in img_list:
                
                arr = bytes(v[7], 'utf-8')
                r = c.find(arr) # try find the image display command
                if r != -1:
                    cnew = c.replace(arr, b"")
                    idoc._updateStream(i, cnew)
                    c = idoc._getXrefStream(i)
    return idoc




doc=fitz.open('ELN_Mod3AzDOCUMENTS.PDF')
rdoc = remove_img_on_pdf(doc, 0) #first page
rdoc.save('no_img_example.PDF')

I get this error:

Traceback (most recent call last):
  File "watermark.py", line 27, in <module>
    rdoc = remove_img_on_pdf(doc, 0) #first page
  File "watermark.py", line 5, in remove_img_on_pdf
    con_list = idoc[page]._getContents()
AttributeError: 'Page' object has no attribute '_getContents'

Please help me find a solution to this. Thank you in advance.

Asked By: Manoj Shivaprakash


Answers:

Your function uses some unusual methods such as _getContents, _getXrefStream and _updateStream. They may be deprecated or removed in your version of PyMuPDF. Here is working code that solves your problem:

import fitz


def remove_img_on_pdf(idoc, page):
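    # Images referenced by the page; item[7] is the image's name.
    # Older PyMuPDF releases spell this method getPageImageList.
    img_list = idoc.get_page_images(page)
    # xrefs of the page's /Contents objects
    con_list = idoc[page].get_contents()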

    for i in con_list:
        c = idoc.xref_stream(i)
        if c is not None:
            for v in img_list:
                arr = bytes(v[7], 'utf-8')
                r = c.find(arr)
                if r != -1:
                    cnew = c.replace(arr, b"")
                    idoc.update_stream(i, cnew)
                    c = idoc.xref_stream(i)
    return idoc


doc = fitz.open('ELN_Mod3AzDOCUMENTS.PDF')
rdoc = remove_img_on_pdf(doc, 0)
rdoc.save('no_img_example.PDF')

As you can see, I've replaced the non-working methods with ones that exist in the current API. Also, see the PyMuPDF documentation.
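If the same script has to run against both older and newer PyMuPDF installs, one option is to try the newer method name first and fall back to the legacy one. The sketch below is only an illustration of that idea: call_first_available, CONTENTS_NAMES, LegacyPage and CurrentPage are hypothetical names made up here, and the two classes are stand-ins so the example runs without PyMuPDF or a PDF file. The name pairing (get_contents / _getContents) is the one from this thread.

```python
def call_first_available(obj, names, *args, **kwargs):
    """Call the first method listed in `names` that `obj` actually has."""
    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            return method(*args, **kwargs)
    raise AttributeError(
        f"{type(obj).__name__!r} object has none of: {', '.join(names)}"
    )


# Newer name first, legacy name second (pairing taken from this thread).
CONTENTS_NAMES = ("get_contents", "_getContents")


class LegacyPage:
    """Hypothetical stand-in for a Page from an older release."""

    def _getContents(self):
        return [274]


class CurrentPage:
    """Hypothetical stand-in for a Page from a newer release."""

    def get_contents(self):
        return [274]


legacy_result = call_first_available(LegacyPage(), CONTENTS_NAMES)
current_result = call_first_available(CurrentPage(), CONTENTS_NAMES)
print(legacy_result, current_result)
```

With a real document you would presumably pass idoc[page] in place of the stand-in objects. If you control the environment, pinning one PyMuPDF version and using only its names is simpler than a fallback like this.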

Answered By: INNO

If you print dir(idoc[0]), you will see the list of attributes available on the page object. You should use idoc[page].get_contents() instead.
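The dir() check can be narrowed down so that it only lists names resembling the one that failed. This is a rough sketch using only the standard library; suggest_attribute and FakePage are hypothetical names invented for the example, and FakePage stands in for a real page object so the code runs without PyMuPDF installed.

```python
import difflib


def suggest_attribute(obj, missing_name, cutoff=0.5):
    """Return attribute names of `obj` that resemble `missing_name`."""

    def normalize(name):
        # Ignore case and underscores so '_getContents' can match 'get_contents'.
        return name.replace("_", "").lower()

    # Map normalized names back to the real attribute names, skipping dunders.
    candidates = {normalize(n): n for n in dir(obj) if not n.startswith("__")}
    close = difflib.get_close_matches(
        normalize(missing_name), list(candidates), n=3, cutoff=cutoff
    )
    return [candidates[c] for c in close]


class FakePage:
    """Hypothetical stand-in for a page object."""

    def get_contents(self):
        return []

    def get_text(self):
        return ""


suggestions = suggest_attribute(FakePage(), "_getContents")
print(suggestions)
```

Against a real page, suggest_attribute(idoc[0], "_getContents") should work the same way, since it relies only on dir(). The best match is listed first.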

Answered By: Maryam Gholami