Renaming a list of PDF files with a for loop

Question:

I am trying to rename a list of PDF files by extracting the name from each file using pyPdf. I tried to use a for loop to rename the files, but I always get an error with code 32 saying that the file is being used by another process. I am using Python 2.7.
Here is my code:

import os, glob
from pyPdf import PdfFileWriter, PdfFileReader

# this function extracts the name of the file
def getName(filepath):
    output = PdfFileWriter()
    input = PdfFileReader(file(filepath, "rb"))
    output.addPage(input.getPage(0))
    outputStream = file(filepath + '.txt', 'w')
    output.write(outputStream)
    outputStream.close()

    outText = open(filepath + '.txt', 'rb')
    textString = outText.read()
    outText.close()

    nameStart = textString.find('default">')
    nameEnd = textString.find('_SATB', nameStart)
    nameEnd2 = textString.find('</rdf:li>', nameStart)

    if nameStart:
        testName = textString[nameStart+9:nameEnd]
        if len(testName) <= 100:
            name = testName + '.pdf'
        else:
            name = textString[nameStart+9:nameEnd2] + '.pdf'
    return name


pdfFiles = glob.glob('*.pdf')
m = len(pdfFiles)
for each in pdfFiles:
    newName = getName(each)
    os.rename(each, newName)
Asked By: chidimo


Answers:

You’re not closing the input stream (the file) used by the pdf reader.
Thus, when you try to rename the file, it’s still open.

So, instead of this:

input = PdfFileReader(file(filepath, "rb"))

Try this:

inputStream = file(filepath, "rb")
input = PdfFileReader(inputStream)
# ... when done with this file ...
inputStream.close()
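
A slightly more defensive variant of the same idea is to close the stream in a finally clause, so the handle is released even if reading the PDF raises an exception. The following is only a minimal sketch, not the asker's code: read_first_bytes and rename_after_reading are hypothetical names, a plain read stands in for the PdfFileReader work, and open() is used in place of file() so that it runs on both Python 2.7 and Python 3.

```python
import os


def read_first_bytes(stream, count=8):
    # Hypothetical stand-in for the PdfFileReader work done on the open stream.
    return stream.read(count)


def rename_after_reading(filepath, new_name):
    # Open the file explicitly so we keep a reference we can close.
    input_stream = open(filepath, "rb")
    try:
        header = read_first_bytes(input_stream)
    finally:
        # Always release the handle, even if reading raised an exception.
        input_stream.close()
    # The handle is closed now, so the rename is no longer blocked by it.
    os.rename(filepath, new_name)
    return header
```

The important part is the ordering: os.rename is only called after close() has run.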
Answered By: elmart

It does not look like you close the file object associated with the PDF reader object. It may be closed automatically at the end of the function, but to be sure, create a separate file object, pass it to PdfFileReader, and close that file handle when you are done. Then rename the file.

The example below is from the Stack Overflow question "How to close pyPDF "PdfFileReader" Class file handle":
import os.path
from pyPdf import PdfFileReader

fname = 'my.pdf'
fh = file(fname, "rb")
input = PdfFileReader(fh)

fh.close()
os.rename(fname, 'my_renamed.pdf')
Answered By: Paul

Consider using Python's with statement. With it, you do not need to close the file yourself, because the file is closed automatically when the block exits:

def getName(filepath):
    output = PdfFileWriter()
    with file(filepath, "rb") as pdfFile:
        input = PdfFileReader(pdfFile)
        ...
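
To show how this might fit into the whole rename loop, here is a rough, self-contained sketch. It makes some assumptions: get_name is a hypothetical stand-in for the asker's getName() that simply takes the new name from the first line of the file (no pyPdf involved), rename_all is a hypothetical helper, and open() replaces file() so that the code also runs on Python 3.

```python
import glob
import os


def get_name(filepath):
    # Hypothetical stand-in for the asker's getName(): derives the new name
    # from the first line of the file instead of parsing PDF metadata.
    with open(filepath, "rb") as pdf_file:
        first_line = pdf_file.readline().strip()
    # Leaving the with block has already closed pdf_file here.
    return first_line.decode("ascii") + ".pdf"


def rename_all(directory):
    # Snapshot the file list first, then rename each file once it is closed.
    renamed = []
    for old_path in sorted(glob.glob(os.path.join(directory, "*.pdf"))):
        new_path = os.path.join(directory, get_name(old_path))
        os.rename(old_path, new_path)
        renamed.append(os.path.basename(new_path))
    return renamed
```

Any work that still needs the open stream should stay inside the with block. Only the rename belongs after it.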
Answered By: Alfe