Syntax error while installing pdfminer using Python

Question:

I want to use pdfminer to extract text from PDF files. I have downloaded pdfminer-20131113, and Python is installed in C:\python34.
From cmd, I change to the directory that contains pdfminer's setup.py file
and run the following command:

python setup.py install

But I get the error below.

> D:\pdfminer-20101226>python setup.py install
Traceback (most recent call last):
  File "setup.py", line 3, in <module>
    from pdfminer import __version__
  File "D:\pdfminer-20101226\pdfminer\__init__.py", line 4
    if __name__ == '__main__': print __version__
                                               ^
SyntaxError: invalid syntax

It seems to be some error in the setup.py file of pdfminer, which I am not sure how to resolve.

Also, I saw a pdf2txt.py file in the build folder of pdfminer. I tried running it as pdf2txt.py -o output.html pdffilename.pdf (with the full path), but instead of converting the PDF, it just opens the pdf2txt.py file.

Asked By: Maverick


Answers:

The PDFMiner project homepage states:

Written entirely in Python. (for version 2.4 or newer)

and further down:

Install Python 2.4 or newer. (Python 3 is not supported.)

so you’ll have to install Python 2 to run this project.

Alternatively, you could try the Python 3 port, pdfminer3k; it hasn’t seen any updates in 20 months, while PDFMiner does have more recent releases, so your mileage may vary.
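To see why the traceback points at the print line, the failing statement can be checked on its own, without pdfminer installed. The sketch below is only an illustration: the helper name `compiles` is made up for this example, and the two source strings are the line from the traceback and the same line with print written as a function call.

```python
import sys

# The line from pdfminer/__init__.py that the traceback points at.
PY2_LINE = "if __name__ == '__main__': print __version__"
# The same statement with print written as a function call.
PY3_LINE = "if __name__ == '__main__': print(__version__)"


def compiles(source):
    """Return True if the running interpreter can parse `source`.

    compile() only parses the code; nothing is executed, so the
    undefined name __version__ does not matter here.
    """
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python", sys.version_info[0], sys.version_info[1])
    print("print statement parses:", compiles(PY2_LINE))
    print("print function parses: ", compiles(PY3_LINE))
```

On Python 3 the first check reports False, which is the same SyntaxError that setup.py hits when it imports the pdfminer package. The file is not broken; it is written for Python 2.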

Answered By: Martijn Pieters

pdfminer.six is a fork with Python 2+3 support using six. Last commit was 15 days ago.

Answered By: RUser4512

This should solve your problem on Python 3:

pip install pdfminer.six
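Once that is installed, text extraction from Python might look like the sketch below. It assumes a reasonably recent pdfminer.six release that provides `pdfminer.high_level.extract_text`; older releases may not have it. The helper names `pdfminer_available` and `pdf_to_text` are made up for this example, and `pdffilename.pdf` is the placeholder file name from the question.

```python
import importlib.util


def pdfminer_available():
    """Return True if a package named 'pdfminer' can be imported."""
    return importlib.util.find_spec("pdfminer") is not None


def pdf_to_text(path):
    """Return the text of the PDF at `path` as a single string.

    The import is done here so that this module can still be loaded
    when pdfminer.six is not installed.
    """
    # Assumption: the installed pdfminer.six release ships high_level.extract_text.
    from pdfminer.high_level import extract_text
    return extract_text(path)


if __name__ == "__main__":
    if pdfminer_available():
        # Placeholder file name taken from the question.
        print(pdf_to_text("pdffilename.pdf"))
    else:
        print("pdfminer is not installed; run: pip install pdfminer.six")
```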
Answered By: Sagun Shrestha