apache-tika

How to split PDF into paragraphs using Tika

Question: I have a PDF document that I am currently parsing with Tika-Python, and I would like to split it into paragraphs. My idea is to build a list of paragraphs using the isspace() function. I also tried splitting using …
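One possible approach to the idea in the question, as a minimal sketch rather than a definitive answer: treat blank (whitespace-only) lines in the extracted text as paragraph boundaries. This assumes the PDF's extracted text actually separates paragraphs with blank lines, which is not guaranteed for every document. The helper names `split_paragraphs` and `extract_text` are hypothetical; the only Tika-Python call used is `parser.from_file`, which returns a dict whose `"content"` key holds the extracted text.

```python
import re


def split_paragraphs(text):
    """Split extracted text into paragraphs at blank (whitespace-only) lines.

    Assumption: paragraphs are separated by at least one blank line.
    """
    if not text:
        return []
    # Normalize line endings so the pattern below only deals with "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A paragraph break is a newline, optional whitespace, then another newline.
    chunks = re.split(r"\n\s*\n", text)
    # Drop empty or whitespace-only chunks and collapse line wraps
    # inside each paragraph into single spaces.
    return [" ".join(c.split()) for c in chunks if c and not c.isspace()]


def extract_text(path):
    """Hypothetical helper: return the text Tika-Python extracts from a file."""
    # Imported lazily so split_paragraphs can be used without Tika installed.
    from tika import parser

    parsed = parser.from_file(path)
    return parsed.get("content") or ""


# Usage (the file name is a placeholder):
# paragraphs = split_paragraphs(extract_text("document.pdf"))
```

If the extracted text has no blank lines between paragraphs, this returns the whole document as a single paragraph, and a different heuristic (for example indentation or line length) would be needed.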

Total answers: 2