PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Question:

Goal: open a file, encrypt it, and write the encrypted file.
I am trying to use the PyPDF2 module to accomplish this. I have verified that “input” is a file-type object. I have researched this error, and it translates to “file not found”. I believe it is linked somehow to the file or file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:

Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument

Below is the relevant code. I’m not sure how to correct this issue because I’m not really sure what the issue is. Any guidance is appreciated.

for ID in FileDict:
        if ID in EmailDict : 
            path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
            #print os.listdir(path)
            file = os.path.join(path + FileDict[ID])

            with open(file, 'rb') as input:
                print type(input)
                inputStream = PyPDF2.PdfFileReader(input)
                output = PyPDF2.PdfFileWriter()
                output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)  
        else : continue
Asked By: AlliDeacon


Answers:

I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:

with open(file, 'rb') as input :
    with open(file, 'wb') as outputStream :

The w mode will truncate the file, thus the second line truncates the input.
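
To make the truncation concrete, here is a minimal sketch that uses only the standard library. The file it creates is a throwaway stand-in for a PDF, and the helper name size_after_open_for_write is made up for this illustration. It shows that merely opening a file in 'wb' mode empties it, even if nothing is written.

```python
import os
import tempfile

def size_after_open_for_write(path):
    """Open path with 'wb', write nothing, and return the resulting size."""
    with open(path, 'wb'):
        pass  # opening in 'wb' mode alone truncates the file
    return os.path.getsize(path)

# Create a throwaway file with some content (a stand-in for a real PDF).
fd, demo_path = tempfile.mkstemp(suffix='.pdf')
with os.fdopen(fd, 'wb') as f:
    f.write(b'%PDF-1.4 dummy content')

size_before = os.path.getsize(demo_path)
size_after = size_after_open_for_write(demo_path)
os.remove(demo_path)

print("before: %d bytes, after: %d bytes" % (size_before, size_after))
```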
I’m not sure what your intention is, because you can’t read from the beginning of the file and overwrite it at the same time. Even if you wanted to write to the end of the file, you would have to position the file pointer somewhere.
So create an extra output file that has a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
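
The write-then-rename idea can be sketched roughly as follows. This is an illustration rather than a definitive implementation: rewrite_file and its transform argument are hypothetical names, and os.replace requires Python 3.3 or later (on Python 2.7 you would have to remove the original and call os.rename instead).

```python
import os
import tempfile

def rewrite_file(path, transform):
    """Write the transformed content to a temporary file, then replace path.

    transform is a hypothetical callable that takes two open binary files,
    (source, destination), and writes the new content to the destination.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # replace stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as dst, open(path, 'rb') as src:
            transform(src, dst)
        # Both files are closed here, so the original can be replaced safely.
        os.replace(tmp_path, path)
    except Exception:
        # Leave the original untouched and clean up the temporary file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With PyPDF2, the transform would build the PdfFileReader from the source file, fill and encrypt a PdfFileWriter, and call its write() on the destination file, all while the source is still open.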

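Or you could first read the complete file into memory, then write to it:

import io
import PyPDF2

with open(file, 'rb') as input:
    data = io.BytesIO(input.read())  # the whole PDF is now in memory
inputStream = PyPDF2.PdfFileReader(data)
output = PyPDF2.PdfFileWriter()
for i in range(inputStream.getNumPages()):
    output.addPage(inputStream.getPage(i))  # copy every page to the writer
output.encrypt(EmailDict[ID][1])  # encrypt() is a PdfFileWriter method
with open(file, 'wb') as outputStream:  # safe: the input file is already closed
    output.write(outputStream)

Notes:

  • in your code, the pages read by inputStream are never copied to a writer
  • you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never use the result from the first output = line. Note also that encrypt() is a method of PdfFileWriter, not of the reader or of the file object.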

Please check carefully what you’re doing, because it feels like there are numerous other problems with your code.


Alternatively, here are some other tips that may help:

The documentation suggests that you can also pass the filename as the first argument to PdfFileReader:

stream – A File object or an object that supports the standard read
and seek methods similar to a File object. Could also be a string
representing a path to a PDF file.

So try:

inputStream = PyPDF2.PdfFileReader(file)

You can also try to set the strict argument to False:

strict (bool) – Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to True.

For example:

inputStream = PyPDF2.PdfFileReader(file, strict=False)
Answered By: user707650

Using open(file, 'rb') was causing the issue because PdfFileReader() does that automagically. I just removed the with statement shown below, and that corrected the problem.

with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
Answered By: AlliDeacon

This error is raised when the PDF file is empty.
My PDF file was empty, which is why the error was raised. Once I filled the file with some data and then read it with PyPDF2.PdfFileReader, the problem was solved.
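
A possible explanation for why an empty file produces exactly this error, sketched with the standard library only. It assumes that the reader starts by seeking backwards from the end of the stream, which the read() frame in the traceback suggests but which is not confirmed here. The helper name seek_from_end_error is made up for this illustration.

```python
import errno
import os
import tempfile

def seek_from_end_error(path):
    """Return the errno raised by seeking one byte back from the end, or None."""
    with open(path, 'rb') as f:
        try:
            f.seek(-1, os.SEEK_END)  # lands before the start if the file is empty
        except (IOError, OSError) as exc:
            return exc.errno
    return None

# Create a zero-byte file that merely carries a .pdf suffix.
fd, empty_path = tempfile.mkstemp(suffix='.pdf')
os.close(fd)

result = seek_from_end_error(empty_path)
os.remove(empty_path)

print(result == errno.EINVAL)  # EINVAL is errno 22, "Invalid argument"
```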

Answered By: ZeevhY Org.

Late, but you may be opening an invalid PDF file, or an empty file that is named x.pdf and that you assume is a PDF file.
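
A quick way to rule this out before handing the file to PyPDF2 is to check that it is non-empty and contains the PDF header. The following is a rough heuristic, not a full validation, and looks_like_pdf is a hypothetical helper name. It searches the first 1024 bytes because some real-world PDFs carry a few stray bytes before the header.

```python
import os

def looks_like_pdf(path):
    """Cheap sanity check: a non-empty file with '%PDF-' near its start."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, 'rb') as f:
        return b'%PDF-' in f.read(1024)
```

If this returns False for a file that your loop is about to open, the problem lies in how the file was produced rather than in the encryption code.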

Answered By: Ahmed I. Elsayed