
If I just want the PDF's text to use for whatever (save it in a database, ...), I found this line quite convenient:

use feature 'say';
my $txt = `pdftotext whatever.pdf -` or die 'ERROR running pdftotext';
say $txt;
Or, if the file name is in a variable and the PDF contains umlauts or other non-ASCII characters:
my $command_line = qq{pdftotext -enc 'UTF-8' '$path' -};
my $text = `$command_line` or die 'ERROR running pdftotext';
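
Two caveats with the backtick form: `or die` also fires when pdftotext succeeds but prints nothing (for example a scanned PDF without a text layer), and a single quote inside $path breaks the shell quoting. Here is a minimal sketch of a more defensive variant that skips the shell and checks the exit status instead. The helper names run_capture and pdf_to_text are made up for illustration, and it assumes pdftotext is on the PATH:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Run a command without a shell and return its raw output (bytes).
# Dies if the command cannot be started or exits with a non-zero status.
sub run_capture {
    my @cmd = @_;
    # List-form pipe open: arguments are passed as-is, no shell quoting needed.
    open(my $fh, '-|', @cmd) or die "Cannot start $cmd[0]: $!";
    binmode($fh);
    local $/;                       # slurp mode: read everything at once
    my $out = <$fh>;
    $out = '' unless defined $out;  # empty output is not an error
    # close() on a pipe returns false when the child exited non-zero.
    close($fh) or die "$cmd[0] failed (wait status $?)";
    return $out;
}

# Hypothetical helper: return the text of one PDF as a Perl character string.
sub pdf_to_text {
    my ($path) = @_;
    my $bytes = run_capture('pdftotext', '-enc', 'UTF-8', $path, '-');
    return decode('UTF-8', $bytes);
}
```

With that in place, my $text = pdf_to_text($path); gives decoded characters, so remember binmode(STDOUT, ':encoding(UTF-8)') (or the equivalent on your database handle) before writing it out again.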