See the update in "Parsing PDFs by text position?" and the threads linked from it.
> nothing had worked
What exactly does this mean?
If pdftohtml -xml doesn't produce readable text, your only remaining option is OCR, because the PDF might embed its own font with the glyphs in random order, or might contain nothing but an image of the text.
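To decide whether the pdftohtml -xml output is usable before reaching for OCR, something like the following rough check might help. It is only a sketch in Python: the helper names (`pdf_to_xml`, `extract_text`, `readable_ratio`, `needs_ocr`) and the 0.8 threshold are my own assumptions, and it presumes poppler's pdftohtml is on the PATH and writes `<text>` elements to standard output when called with -xml -stdout. It catches empty output and obvious garbage characters. It cannot catch a font whose scrambled mapping still yields ordinary letters.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET

# Punctuation we accept as "normal" in addition to letters and digits.
COMMON_PUNCT = set(".,;:!?'\"()-")


def pdf_to_xml(pdf_path):
    """Run pdftohtml and return its XML output as bytes.

    Assumes poppler's pdftohtml is installed; -i skips image extraction.
    """
    result = subprocess.run(
        ["pdftohtml", "-xml", "-i", "-stdout", pdf_path],
        capture_output=True,
        check=True,
    )
    return result.stdout


def extract_text(xml_data):
    """Join the text content of every <text> element in the XML."""
    root = ET.fromstring(xml_data)
    pieces = ["".join(node.itertext()) for node in root.iter("text")]
    return " ".join(pieces)


def readable_ratio(text):
    """Share of non-whitespace characters that look like ordinary text."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    good = sum(1 for c in chars if c.isalnum() or c in COMMON_PUNCT)
    return good / len(chars)


def needs_ocr(xml_data, threshold=0.8):
    """Heuristic: True if there is no text or it is mostly garbage."""
    return readable_ratio(extract_text(xml_data)) < threshold


if __name__ == "__main__":
    xml_data = pdf_to_xml(sys.argv[1])
    print("OCR recommended" if needs_ocr(xml_data) else "text looks usable")
```

If the script reports that the text looks usable but the words are still nonsense, the font mapping is probably scrambled in a way this check can't see, and OCR is still the way to go.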
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
In reply to "Re: Converting PDF file to text" by LanX, in thread "Converting PDF file to text" by cerian