I'm the author of CAM::PDF. Even under the best circumstances, getpdftext.pl produces barely readable output. My module doesn't have a renderer, so the text extraction is a total hack that I tossed into the module for fun.
I'm quite pleased that other tools have produced good results! CAM::PDF (which I barely maintain anymore, I'm sorry to say) is optimized for high-performance, low-level editing of PDF documents.
In reply to Re: Extracting text from PDF. No really
by chrisdolan
in thread Extracting text from PDF. No really
by clinton