CAM::PDF should be able to get you what you want. You'll just have to enhance the script below to scan for the beginning and end of each store's contact info:
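    use CAM::PDF;
    use LWP::Simple qw(getstore);
    use strict;
    use warnings;

    my $url = 'http://data.lexus.nl/home/data/LexusV8/pdf/Lexusdealerlijst.pdf';

    # Use the last path component of the URL as the local file name
    my ($file) = $url =~ m{([^/]*)$};

    # Download the PDF only if we don't already have a local copy
    getstore($url, $file) if ! -e $file;

    # Dump the text of the first page
    my $cam = CAM::PDF->new($file);
    print $cam->getPageText(1);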
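As a starting point for the "scan for the beginning and end" part, here is a rough sketch of how the page text could be split into one block per store. I haven't checked what getPageText() actually returns for this PDF, so the start marker is purely an assumption: I'm guessing each entry begins with a line starting with "Lexus ". The helper name split_dealer_blocks and the sample text are made up for illustration; print the real page text first and adjust the regex to match.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CAM::PDF): split page text into one
# block per dealer.
# ASSUMPTION: every entry starts with a line beginning with "Lexus ".
# Pass a different regex as the second argument once you have looked at
# the real getPageText() output.
sub split_dealer_blocks {
    my ($text, $start_re) = @_;
    $start_re //= qr/^Lexus\s/;

    my @blocks;
    for my $line (split /\r?\n/, $text) {
        next if $line !~ /\S/;              # skip blank lines
        if ($line =~ $start_re) {
            push @blocks, [$line];          # a new entry begins here
        }
        elsif (@blocks) {
            push @{ $blocks[-1] }, $line;   # continuation of the current entry
        }
        # lines before the first entry (page headers etc.) are dropped
    }
    return map { join "\n", @$_ } @blocks;
}

# Made-up sample text, NOT taken from the real PDF.
my $sample = <<'END';
Dealer list
Lexus Exampletown
Example Street 1
Tel. 000-0000000

Lexus Sampleville
Sample Road 2
END

my @dealers = split_dealer_blocks($sample);
print scalar(@dealers), " entries\n";
print "---\n$_\n" for @dealers;
```

To use it with the script above, feed it $cam->getPageText($_) for each page in 1 .. $cam->numPages instead of the sample string. An entry ends wherever the next one begins, so you only need to get the start pattern right.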
In reply to Re: get the html source of a webpage .
by wind
in thread get the html source of a webpage .
by manjulakp