In reply to "get the html source of a webpage".

CAM::PDF should be able to get you what you want. You'll just have to extend the script below to scan for the beginning and end of each store's contact info:

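use strict;
use warnings;
use CAM::PDF;
use LWP::Simple qw(getstore);

my $url = 'http://data.lexus.nl/home/data/LexusV8/pdf/Lexusdealerlijst.pdf';

# Use the last path segment of the URL as the local file name
my ($file) = $url =~ m{([^/]*)$};

# Download the PDF only if we don't already have a local copy
getstore($url, $file) if ! -e $file;

# Print the text of the first page
my $cam = CAM::PDF->new($file);
print $cam->getPageText(1);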
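
As a rough sketch of the "scan for the beginning and end" step, something like the following might be a starting point. I haven't checked what the text layout of this PDF actually looks like, so the two marker patterns ($START, $END), the extract_entries helper and the sample text are all made up for illustration. Look at the real getPageText() output first and adjust the patterns to match.

```perl
use strict;
use warnings;

# ASSUMPTION: each dealer entry begins with a line starting with "Lexus "
# and ends with a line starting with "Tel". Both markers are hypothetical;
# adjust them after inspecting the real page text.
my $START = qr/^Lexus\s/;
my $END   = qr/^Tel\b/;

# Split page text into entries; each entry is an array ref of its lines.
sub extract_entries {
    my ($text) = @_;
    my (@entries, $current);
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
        next if $line eq '';               # ignore blank lines
        # Open a new entry when we see a start marker outside an entry
        $current = [] if !$current && $line =~ $START;
        next unless $current;              # skip text between entries
        push @$current, $line;
        if ($line =~ $END) {               # end marker closes the entry
            push @entries, $current;
            undef $current;
        }
    }
    return @entries;
}

# Made-up sample standing in for $cam->getPageText(1)
my $sample = <<'END_TEXT';
Dealerlijst
Lexus Example City
Sample Street 1
Tel: 000-0000000
Lexus Other Town
Other Road 2
Tel: 000-1111111
END_TEXT

print join(' | ', @$_), "\n" for extract_entries($sample);
```

To use it on the real file, pass $cam->getPageText($n) for each page instead of $sample. An entry that runs across a page break would be lost this way, so you may want to join the text of all pages before scanning.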