PerlMonks
Re: How to parse PDF
by moritz (Cardinal) on Aug 24, 2007 at 07:36 UTC ( [id://634805]=note )
The simple answer is: you have to try it.
Pipe your PDF through the pdftotext tool (on Ubuntu it is in the poppler-utils package) and see whether the output is parsable. That doesn't take long; you can test it in literally two minutes. Also take a look at PDF::Parse and PDF and see if they help you. But in principle it is much easier to validate the data before it is put into a PDF. Have you asked the external vendor whether they could provide the same data in a more easily accessible format?
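As a rough illustration of the pdftotext check, here is a minimal sketch in Python. It assumes pdftotext is on your PATH, and it assumes a purely hypothetical record layout (an ID, a name, and an amount on each line); the RECORD pattern and the function names are invented for the example, so adapt them to whatever your vendor's PDF actually contains.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Run pdftotext (poppler-utils) and return the extracted text.

    "-layout" tries to preserve the physical column layout, and "-"
    as the output file sends the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical record layout (an assumption, not taken from any real PDF):
# an integer ID, a free-text name, and an amount with two decimals.
RECORD = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+\.\d{2})\s*$")


def parse_report(text):
    """Split extracted text into parsed records and rejected lines."""
    records, rejected = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = RECORD.match(line)
        if match:
            records.append(
                (int(match.group(1)), match.group(2), float(match.group(3)))
            )
        else:
            rejected.append(line)
    return records, rejected
```

Calling parse_report(pdf_to_text("report.pdf")) on a sample file (the filename is a placeholder) and looking at the rejected list gives a quick feel for how parsable the output is: if most data lines end up rejected, the PDF route is probably not worth pursuing.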