in reply to regex for unicode email addresses

> I looked through the PDF:: family at cpan and did not see a way to slurp out a column directly,

I don't think your solution works well; you are showing us a mix of multiple columns.

As long as the fonts are not scrambled, you can use a proper solution like the one described here:

Parsing PDFs by text position?
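
To illustrate the idea (only a rough sketch, not the code from that thread): assuming you already have the text fragments together with their page coordinates, from whatever extraction tool you use, picking out one column is just a filter on x and a sort on y. The fragment structure, the sample data and the `column_text` helper are all made up for this example.

```perl
use strict;
use warnings;

# Hypothetical input: one hash per text fragment, with the position of
# its lower-left corner in PDF units (origin at the bottom left of the page).
my @fragments = (
    { x =>  72, y => 700, text => 'Name'  },
    { x => 300, y => 700, text => 'Price' },
    { x =>  72, y => 680, text => 'Apple' },
    { x => 300, y => 680, text => '1.20'  },
    { x =>  72, y => 660, text => 'Pear'  },
    { x => 300, y => 660, text => '0.95'  },
);

# Return the texts whose x lies in [$x_min, $x_max), ordered top to bottom.
sub column_text {
    my ($x_min, $x_max, @frags) = @_;
    return map  { $_->{text} }
           sort { $b->{y} <=> $a->{y} }    # higher y = further up the page
           grep { $_->{x} >= $x_min && $_->{x} < $x_max } @frags;
}

# The x limits are assumed column boundaries for the sample data above.
my @prices = column_text(250, 400, @fragments);
print join("\n", @prices), "\n";
```

In a real document you would first have to find the x boundaries of the column, for instance by looking at where the fragments cluster.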

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery