in reply to Find blank pages in PDF

Your code assumes getPageText() returns an empty string when a page has no text blocks. That is probably an incorrect assumption. In general, a function in list context could be returning a false value (or an error code such as -1), undef, or a string containing only whitespace (tab, CR, etc.). Try this:

    {
        my $foo = $doc->getPageText($_);
        print $_ unless (defined $foo &&                # Returned something and,
                         $foo =~ m/[[:alnum:]]+/ms );   # actually returned text
    }

Sorry, I didn't actually test this.

update: fixed that dratted ~=/=~
update: fixed regex, tested now.
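
For what it's worth, the same check can be pulled into a small helper so it can be tried without a PDF at hand. This is only a sketch: is_blank_text is a name I made up (it is not part of any PDF module), and the %pages hash is stand-in data that assumes getPageText() hands back one scalar per page.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any PDF module): true when the text
# returned for a page is undef, empty, or has no letters or digits in it.
sub is_blank_text {
    my ($text) = @_;
    return 1 unless defined $text;           # nothing returned at all
    return $text =~ /[[:alnum:]]/ ? 0 : 1;   # whitespace-only counts as blank
}

# Stand-in page texts keyed by page number, used here instead of a real
# PDF object; in real code each value would come from getPageText($page).
my %pages = (
    1 => "Hello world",
    2 => " \t\r\n",
    3 => undef,
    4 => "",
);

my @blank = grep { is_blank_text($pages{$_}) } sort { $a <=> $b } keys %pages;
print "Blank pages: @blank\n";
```

Note that a numeric error code such as -1 contains a digit, so this test would treat it as text. If the module you use signals errors that way, check for that value separately.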

s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}