Python library to extract 'epub' information

Question:

I’m trying to create an epub uploader for iBooks in Python. I need a Python library to extract book information. Before implementing this myself, I wonder whether anyone knows of an existing Python library that does it.

Asked By: xiamx


Answers:

Something like epub-tools, for example? But that’s mostly about writing epub format (from various possible sources), as is epubtools (similar spelling, different project). For reading it, I’d try the companion project threepress, a Django app for showing epub books in a browser. I haven’t looked at that code, but I imagine that in order to show the book it must surely first be able to read it ;-).

Answered By: Alex Martelli

An .epub file is a zip archive containing a META-INF directory, which holds a file named container.xml. That file points to another file, usually named Content.opf, which indexes all the other files that make up the e-book (summary based on http://www.jedisaber.com/eBooks/tutorial.asp; full spec at http://www.idpf.org/2007/opf/opf2.0/download/).
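
To make that layout concrete, here is a minimal sketch (not taken from any of the libraries mentioned in this thread) that follows container.xml to the OPF file and lists the files its manifest indexes. It uses only the standard library; the function name list_epub_files and the NS mapping are made up for illustration, and it assumes an unencrypted book whose manifest hrefs are plain relative paths.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by container.xml and by the OPF package file.
NS = {
    "n": "urn:oasis:names:tc:opendocument:xmlns:container",
    "pkg": "http://www.idpf.org/2007/opf",
}


def list_epub_files(epub_path):
    """Return (opf_path, [(archive_path, media_type), ...]) for an .epub file.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(epub_path) as zf:
        # Step 1: container.xml names the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("n:rootfiles/n:rootfile", NS)
        if rootfile is None:
            raise ValueError("container.xml does not name a rootfile")
        opf_path = rootfile.get("full-path")

        # Step 2: the OPF manifest lists the other files in the book.
        package = ET.fromstring(zf.read(opf_path))
        opf_dir = posixpath.dirname(opf_path)
        files = []
        for item in package.findall("pkg:manifest/pkg:item", NS):
            # hrefs are relative to the directory holding the OPF file.
            href = posixpath.normpath(posixpath.join(opf_dir, item.get("href")))
            files.append((href, item.get("media-type")))
        return opf_path, files
```

Calling list_epub_files("book.epub") on a typical book should give the OPF path plus one entry per chapter, image, and stylesheet.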

The following Python code will extract the basic meta-information from an .epub file and return it as a dict.

import zipfile
from lxml import etree

def epub_info(fname):
    def xpath(element, path):
        return element.xpath(
            path,
            namespaces={
                "n": "urn:oasis:names:tc:opendocument: [...]

[...] epub module. It looks like an easy option.

Answered By: Hugh Bothwell
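
For anyone who wants something runnable, here is a rough sketch of the steps that answer describes: open the zip, read META-INF/container.xml, follow it to the OPF file, and collect the Dublin Core fields into a dict. This is not the answer's own code. It swaps lxml for the standard library's xml.etree.ElementTree, and the function name read_epub_metadata, the NS mapping, and the choice of fields in FIELDS are all assumptions made for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces for container.xml, the OPF package, and Dublin Core.
NS = {
    "n": "urn:oasis:names:tc:opendocument:xmlns:container",
    "pkg": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Dublin Core fields to report; this particular selection is an assumption.
FIELDS = ("title", "language", "creator", "date", "identifier")


def read_epub_metadata(epub_path):
    """Return a dict of basic metadata from an .epub file (illustrative helper)."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml points at the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("n:rootfiles/n:rootfile", NS)
        if rootfile is None:
            raise ValueError("container.xml does not name a rootfile")
        package = ET.fromstring(zf.read(rootfile.get("full-path")))

    # The <metadata> block of the OPF file holds the dc:* elements.
    metadata = package.find("pkg:metadata", NS)
    info = {}
    for field in FIELDS:
        element = None if metadata is None else metadata.find(f"dc:{field}", NS)
        text = element.text if element is not None else None
        # Missing or empty fields map to None instead of raising.
        info[field] = text.strip() if text else None
    return info
```

Usage would be something like info = read_epub_metadata("book.epub"). Only the first occurrence of each field is returned, so a book with several creators would need findall instead of find.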

I wound up here after looking for something similar and was inspired by Mr. Bothwell's code snippet to start my own project. If anyone is interested ... http://epubzilla.odeegan.com/

Answered By: Nicholas O'Deegan