in reply to
Extracting caption of a image from PDF file
You can try this. It's Java, not Perl, but it will probably do the job you want if you work from the HTML in the original question rather than from the PDF: http://jsoup.org/
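To make the suggestion above a bit more concrete, here is a minimal sketch of pulling caption text out of HTML with jsoup. It assumes the captions live in `<figcaption>` elements inside `<figure>`, which is only a guess at the markup; the class name `CaptionExtractor`, the method `captions`, and the sample HTML are made up for illustration, and the jsoup jar needs to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class CaptionExtractor {

    // Returns the text of every <figcaption> nested in a <figure>.
    // ASSUMPTION: the page marks captions up this way; change the
    // selector to match whatever the real HTML uses.
    static List<String> captions(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element caption : doc.select("figure figcaption")) {
            result.add(caption.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the original question.
        String html = "<figure><img src=\"plot.png\">"
                    + "<figcaption>Figure 1: Sample plot</figcaption></figure>";
        for (String c : captions(html)) {
            System.out.println(c);
        }
    }
}
```

If the captions turn out to sit somewhere else (an `alt` attribute on the `<img>`, or a paragraph with a particular class), only the selector and the `text()` call should need changing.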
the hardest line to type correctly is: stty erase ^H
In Section
Seekers of Perl Wisdom