in reply to Re^2: Extracting text from a PDF (using PDF::API2)
in thread Extracting text from a PDF (using PDF::API2)
As a side note, there is also a PDF::API3 available on CPAN.
I have mainly used CAM::PDF for extracting text. However, I have had little luck with HTML embedded in PDFs. You may be able to walk the PDF's root dictionary using CAM::PDF and store the information you need. There is also a module, CAM::PDF::Renderer::Text, that may be of some help.
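For what it's worth, here is a minimal sketch of the kind of page-by-page extraction I mean, assuming CAM::PDF is installed. The sub name extract_pdf_text is just something I made up for illustration; new, numPages and getPageText are the CAM::PDF methods doing the work. It won't help with the embedded HTML problem, and how clean the text comes out depends a lot on how the PDF was produced.

```perl
use strict;
use warnings;
use CAM::PDF;

# Hypothetical helper (not part of CAM::PDF): return the text of
# every page in the given PDF file, joined with newlines.
sub extract_pdf_text {
    my ($path) = @_;

    # new() returns undef on failure and sets $CAM::PDF::errstr
    my $pdf = CAM::PDF->new($path)
        or die "Cannot open $path: $CAM::PDF::errstr\n";

    my @pages;
    for my $n (1 .. $pdf->numPages()) {
        # getPageText() may return undef for a page with no content
        my $text = $pdf->getPageText($n);
        push @pages, defined $text ? $text : '';
    }
    return join "\n", @pages;
}

# Only run when a file name is given on the command line
print extract_pdf_text($ARGV[0]), "\n" if @ARGV;
```

Save it as a script and run it with the PDF path as the only argument; the text goes to STDOUT.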
Re^4: Extracting text from a PDF (using PDF::API2)
by music_man1352000 (Novice) on Dec 03, 2009 at 04:17 UTC