Python: TypeError: expected str, bytes or os.PathLike object, not PdfFileReader
Question:
I have the following code. This is just a starting point. Later on I’d like to replace the static “Hello world” text with items from a CSV file that I read, looping through every item in the CSV.
I want the watermark on every page.
# importing the required modules
import PyPDF2
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

def add_watermark(wmFile, pageObj):
    # opening watermark pdf file
    wmFileObj = open(wmFile, 'rb')
    # creating pdf reader object of watermark pdf file
    pdfReader = PyPDF2.PdfFileReader(wmFileObj)
    # merging watermark pdf's first page with passed page object.
    pageObj.mergePage(pdfReader.getPage(0))
    # closing the watermark pdf file object
    wmFileObj.close()
    # returning watermarked page object
    return pageObj

def main():
    import PyPDF2
    import io
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter
    # watermark pdf file name
    packet = io.BytesIO()
    # Create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.setFont('Helvetica-Bold', 18)
    can.drawString(10, 100, "Hello world")
    can.showPage()
    can.save()
    # Move to the beginning of the StringIO buffer
    packet.seek(0)
    mywatermark = PyPDF2.PdfFileReader(packet)
    # original pdf file name
    origFileName = 'Module1.pdf'
    # new pdf file name
    newFileName = 'watermarked_example.pdf'
    # creating pdf File object of original pdf
    pdfFileObj = open(origFileName, 'rb')
    # creating a pdf Reader object
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
    # creating a pdf writer object for new pdf
    pdfWriter = PyPDF2.PdfFileWriter()
    # adding watermark to each page
    for page in range(pdfReader.numPages):
        # creating watermarked page object
        wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
        # adding watermarked page object to pdf writer
        pdfWriter.addPage(wmpageObj)
    # new pdf file object
    newFile = open(newFileName, 'wb')
    # writing watermarked pages to new file
    pdfWriter.write(newFile)
    # closing the original pdf file object
    pdfFileObj.close()
    # closing the new pdf file object
    newFile.close()

if __name__ == "__main__":
    main()
The error I get is:
Traceback (most recent call last):
  File "watermark.py", line 101, in <module>
    main()
  File "watermark.py", line 83, in main
    wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
  File "watermark.py", line 32, in add_watermark
    wmFileObj = open(wmFile, 'rb')
TypeError: expected str, bytes or os.PathLike object, not PdfFileReader
I think I understand the point: open() expects a string, bytes, or a path-like object, and I am passing it none of those, just an “object”.
I tried a couple of things, but whatever I try makes things worse 🙁
Can someone help out? I’m pretty sure it’s just a small thing, as I’m good at overlooking the obvious.
Any help is appreciated.
Thanks
Answers:
I’ll leave the general advice for the end. Here’s how to fix this piece of code:
1) Set the variable ‘packet’ to the filename of an existing PDF file in the same directory as the script:
packet = 'my_watermark.pdf'
2) Delete the lines that rewind the ‘StringIO’ buffer and build a reader from it (with a filename they are no longer needed):
packet.seek(0) # delete this
mywatermark = PyPDF2.PdfFileReader(packet) #delete this too
3) Give ‘packet’ as an argument instead of ‘mywatermark’ in the for-loop block:
wmpageObj = add_watermark(packet, pdfReader.getPage(page))
4) In the add_watermark function, delete the lines that open and close the file. Keep only the construction of the PdfFileReader instance, but pass it the parameter ‘wmFile’:
wmFileObj = open(wmFile, 'rb') # delete this
pdfReader = PyPDF2.PdfFileReader(wmFile) # let this be, but change wmFileObj to wmFile
pageObj.mergePage(pdfReader.getPage(0)) # let this be
wmFileObj.close() # delete this
return pageObj # let this be
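For context on why the traceback appears: the built-in open() accepts only a path (str, bytes, or os.PathLike) or an integer file descriptor, whereas main() hands add_watermark an object that has already been constructed. The sketch below reproduces the same kind of error with only the standard library. Note that `describe_open_error` is a hypothetical helper name, and the io.BytesIO object merely stands in for the PdfFileReader from the question.

```python
import io


def describe_open_error(candidate):
    """Try to open `candidate` as a path and report what happened."""
    try:
        with open(candidate, "rb"):
            return "opened"
    except TypeError as exc:
        # open() rejects anything that is not path-like or a file descriptor
        return f"TypeError: {exc}"
    except OSError as exc:
        # a valid path type that could not be opened (e.g. file not found)
        return f"OSError: {exc}"


# Stand-in for the reader object passed in the question (an assumption
# made so the example runs without PyPDF2 installed).
in_memory_pdf = io.BytesIO(b"%PDF-1.4 placeholder bytes")
result = describe_open_error(in_memory_pdf)
print(result)
```

The takeaway is that anything already opened or parsed should be used directly rather than passed to open() a second time.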
Also, your code has imports inside the main function; move them to the top of the file. And do read some documentation: PyPDF2’s documentation shows how to merge pages (it’s the module’s specialty, to be honest), although it is a bit terse, while Reportlab’s User Guide is very thorough yet straightforward. Always try to understand the meaning behind your code, too.
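If you would rather keep the watermark in memory instead of saving it to a file first, one possible alternative is to skip open() entirely and pass the reader objects around. This is only a sketch, not the fix described in the steps above: `watermark_all` is a hypothetical name, and the code assumes the legacy PyPDF2 method names used in the question (numPages, getPage, mergePage, addPage). Newer PyPDF2 releases rename these, so check the version you have installed.

```python
def add_watermark(watermark_page, page):
    """Merge an already-loaded watermark page onto `page` and return it."""
    # No open() here: both arguments are page objects, not filenames.
    page.mergePage(watermark_page)
    return page


def watermark_all(reader, watermark_reader, writer):
    """Stamp the first page of `watermark_reader` onto every page of `reader`.

    The arguments are expected to behave like the question's PdfFileReader
    and PdfFileWriter objects (legacy camelCase API, an assumption).
    """
    # Fetch the watermark page once instead of re-reading it per page.
    watermark_page = watermark_reader.getPage(0)
    for index in range(reader.numPages):
        stamped = add_watermark(watermark_page, reader.getPage(index))
        writer.addPage(stamped)
    return writer
```

With the names from the question, the for loop in main() would become the single call watermark_all(pdfReader, mywatermark, pdfWriter), followed by pdfWriter.write(newFile) as before.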