in reply to Converting PDF file to text
See the update to "Parsing PDFs by text position?" and the threads linked from it.
> nothing had worked
What exactly does this mean?
If pdftohtml -xml doesn't produce readable text, your only remaining option is OCR: the PDF may embed its own font with a scrambled character mapping, or it may contain nothing but an image of the text.
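A rough sketch of that check, in case it helps: run pdftohtml -xml and see whether anything usable comes back. The helper names text_from_xml and looks_readable and the 0.7 threshold are my own assumptions, not part of any module, and the -i and -stdout switches are the ones poppler's pdftohtml accepts. The heuristic only catches empty or symbol-heavy output; text that comes out as wrong but plausible letters still needs a look by eye.

```perl
use strict;
use warnings;

# Pull the contents of <text ...>...</text> elements out of pdftohtml -xml output.
# A regex is enough for a sketch; a real script would use an XML parser
# and also decode entities such as &amp;.
sub text_from_xml {
    my ($xml) = @_;
    my @chunks;
    while ($xml =~ m{<text\b[^>]*>(.*?)</text>}gs) {
        my $chunk = $1;
        $chunk =~ s/<[^>]+>//g;    # drop inline markup such as <b> or <i>
        push @chunks, $chunk;
    }
    return join "\n", @chunks;
}

# Hypothetical heuristic: text counts as readable when at least
# $min_ratio of its non-whitespace characters are letters or digits.
sub looks_readable {
    my ($text, $min_ratio) = @_;
    $min_ratio //= 0.7;
    (my $dense = $text) =~ s/\s+//g;
    return 0 unless length $dense;
    my $alnum = () = $dense =~ /[[:alnum:]]/g;
    return $alnum / length($dense) >= $min_ratio ? 1 : 0;
}

if (@ARGV) {
    my $pdf = shift @ARGV;
    # List-form pipe open avoids shell quoting problems with the file name.
    open my $fh, '-|', 'pdftohtml', '-xml', '-i', '-stdout', $pdf
        or die "Cannot run pdftohtml: $!";
    binmode $fh, ':encoding(UTF-8)';    # assumes pdftohtml's default UTF-8 output
    my $xml = do { local $/; <$fh> };
    close $fh;
    my $text = text_from_xml($xml // '');
    print looks_readable($text)
        ? "Text layer looks usable\n"
        : "No usable text layer, OCR is probably the next step\n";
}
```

Call it with the PDF path as the only argument. If it reports no usable text layer, the file is most likely a scan or uses one of those scrambled embedded fonts.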
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
Re^2: Converting PDF file to text
by LanX (Saint) on May 11, 2017 at 19:08 UTC
Re^2: Converting PDF file to text
by cerian (Novice) on May 12, 2017 at 16:26 UTC
by runrig (Abbot) on May 12, 2017 at 21:54 UTC
by LanX (Saint) on May 13, 2017 at 11:59 UTC
by LanX (Saint) on May 12, 2017 at 17:08 UTC