Looks like there are several CPAN modules that may help. Consider:
Other random links include:
--f
| [reply] |
If you want to get the text out of the PDF file, use 'pdftotext', which is provided by xpdf.
pdftotext works very well. You can send the text from the PDF to a file and then parse that file
with a Perl script.
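Here is a minimal sketch of that workflow, assuming pdftotext is installed and on your PATH. The word-count routine is only a hypothetical stand-in for whatever parsing you actually need. Passing '-' as the output file makes pdftotext write to standard output, so the script can read the text through a pipe instead of an intermediate file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count word frequencies in text read from a filehandle.
# (Hypothetical stand-in for your own parsing logic.)
sub count_words {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        $count{lc $1}++ while $line =~ /(\w+)/g;
    }
    return \%count;
}

# Only run the extraction when a PDF file name is given on the command line.
if (@ARGV) {
    my $pdf = shift @ARGV;

    # List-form pipe open avoids shell quoting problems;
    # '-' tells pdftotext to write the extracted text to stdout.
    open(my $fh, '-|', 'pdftotext', $pdf, '-')
        or die "Cannot run pdftotext: $!";
    my $count = count_words($fh);
    close($fh) or die "pdftotext failed on $pdf (exit status $?)";

    # Print words, most frequent first, ties in alphabetical order.
    for my $w (sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count) {
        printf "%5d %s\n", $count->{$w}, $w;
    }
}
```

Run it as `perl script.pl file.pdf`, where both names are placeholders. If you would rather keep the intermediate text file, run `pdftotext file.pdf file.txt` first and open file.txt for reading instead of the pipe.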
| [reply] |
Yup, this is the way to go. I've done it several times with good success. Multi-column text (as in newspapers or brochures) sucks, though, because you can't tell where the
columns start in the extracted text. For this, a little manual work with ghostview
might be needed (ghostview can copy and paste text from PDFs after it has extracted the text, e.g., after a search command).
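For simple two-column pages, one thing that may be worth trying before resorting to manual work is pdftotext's -layout option, which keeps the physical layout so the columns come out side by side on each line. The sketch below then cuts every line at a fixed character offset. The routine name and the offset are assumptions for illustration; you would have to find the right offset by looking at the extracted text, and it will not work if the column gap moves from page to page.

```perl
use strict;
use warnings;

# Split layout-preserved text into a left and a right column at a fixed
# character offset. $split_at is an assumption you tune per document.
sub split_columns {
    my ($text, $split_at) = @_;
    my (@left, @right);
    for my $line (split /\n/, $text) {
        my $l = substr($line, 0, $split_at);
        my $r = length($line) > $split_at ? substr($line, $split_at) : '';
        s/\s+$// for $l, $r;    # trim trailing whitespace from both halves
        push @left,  $l;
        push @right, $r;
    }
    return (join("\n", @left), join("\n", @right));
}
```

The text to feed in would come from something like `pdftotext -layout file.pdf -` (file name is a placeholder). Read the left column first, then the right, to get the text back in reading order.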
Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com
| [reply] |