I always used pdftohtml:
http://pdftohtml.sourceforge.net/
Then I parsed the HTML for content with HTML::TreeBuilder::XPath. This works particularly well for simple documents or documents with a standardized structure. You can check the x/y offset of each element to find the exact piece of information you're looking for.
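To illustrate the x/y offset idea, here is a rough sketch. It is written in Python with only the standard library, so it is not the HTML::TreeBuilder::XPath code I actually used. The sample markup, the `PositionedText` class, and the `text_at` helper are all made up for the example. I am assuming the converted HTML places each text fragment in an element with `top`/`left` pixel offsets in its `style` attribute, so check what your converter really emits before relying on it.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: absolutely positioned paragraphs, assumed to resemble
# what a PDF-to-HTML converter might emit. Real output may differ.
SAMPLE_HTML = """
<div>
<p style="position:absolute;top:100px;left:50px">Invoice No.</p>
<p style="position:absolute;top:100px;left:200px">12345</p>
<p style="position:absolute;top:130px;left:50px">Total</p>
<p style="position:absolute;top:130px;left:200px">99.50</p>
</div>
"""

# Matches "top:NNpx" / "left:NNpx" but not "margin-top" or "padding-left".
OFFSET_RE = re.compile(r"(?<![-\w])(top|left)\s*:\s*(-?\d+)px")


class PositionedText(HTMLParser):
    """Collects (top, left, text) for every element that has both offsets."""

    def __init__(self):
        super().__init__()
        self.items = []        # finished (top, left, text) tuples
        self._current = None   # [top, left, text_parts] while inside an element
        self._tag = None       # tag name of the element being collected
        self._depth = 0        # nesting depth of same-named tags inside it

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            if tag == self._tag:
                self._depth += 1
            return
        style = dict(attrs).get("style") or ""
        offsets = {name: int(value) for name, value in OFFSET_RE.findall(style)}
        if "top" in offsets and "left" in offsets:
            self._current = [offsets["top"], offsets["left"], []]
            self._tag = tag

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is None or tag != self._tag:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        top, left, parts = self._current
        self.items.append((top, left, "".join(parts).strip()))
        self._current = None
        self._tag = None


def text_at(html, top, left, tolerance=2):
    """Return the text of the first element within `tolerance` px of (top, left)."""
    parser = PositionedText()
    parser.feed(html)
    parser.close()
    for item_top, item_left, text in parser.items:
        if abs(item_top - top) <= tolerance and abs(item_left - left) <= tolerance:
            return text
    return None
```

With the sample above, `text_at(SAMPLE_HTML, 100, 200)` returns `"12345"`, the value sitting next to the "Invoice No." label. The small tolerance is there because offsets can shift by a pixel or two between otherwise identical documents.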
In reply to Re: pdf to html by snobol
in thread pdf to html by mouleeshmichael