in reply to Re^2: Question regarding web scraping
in thread Question regarding web scraping
This is easy to do: wrap part of your code in a for() loop and turn the single scalar $URL into an array @URLS that holds a list of URLs. The for() loop then iterates over that list. Note that this assumes the same regex works for all URLs. Untested:
use strict;
use warnings;
use LWP::Simple;

my @URLS = qw(
    http://one.example.com
    http://two.example.com
    http://three.example.com
);

# Same pattern for every URL; capture group 1 is the text to extract.
my $regex = '<div class="usertext-body may-blank-within md-container ">'
          . '<div class="md">(.+?)</div>\s*</div>'
          . '</form><ul class="flat-list buttons">';

for my $URL (@URLS) {
    my $CONTENT = get($URL);
    # get() returns undef on failure, so skip pages that could not be fetched.
    unless (defined $CONTENT) {
        warn "Could not fetch $URL\n";
        next;
    }

    my $x     = '';
    my $count = 0;
    while ($CONTENT =~ m{$regex}gs) {
        $x .= $1;
        ++$count;
    }

    print "---$URL---\n";
    print $x;
    print "Count: $count\n";
}
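If the pages turn out not to share the same markup, one possible variation is to keep a separate pattern per URL in a hash. This is only a sketch: the URLs, the patterns, and the helper name extract_matches are placeholders of mine, not something from your code, and it runs against a literal string rather than a fetched page so it can be tried offline.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every capture-group-1 match of $regex in $content.
sub extract_matches {
    my ($content, $regex) = @_;
    my @found;
    while ($content =~ m{$regex}gs) {
        push @found, $1;
    }
    return @found;
}

# Placeholder URLs and patterns; replace them with your own.
my %regex_for = (
    'http://one.example.com' => qr{<div class="md">(.+?)</div>}s,
    'http://two.example.com' => qr{<p class="comment">(.+?)</p>}s,
);

# Offline demonstration on a literal string instead of a downloaded page.
my $sample = '<div class="md">first</div><div class="md">second</div>';
my @hits   = extract_matches($sample, $regex_for{'http://one.example.com'});

print "Count: ", scalar(@hits), "\n";
print "$_\n" for @hits;
```

In the real loop you would iterate over the keys of %regex_for and pass the result of get($URL) together with $regex_for{$URL} to the helper, after checking that get() actually returned something.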
Re^4: Question regarding web scraping
by Lisa1993 (Acolyte) on Oct 23, 2016 at 08:10 UTC
by Corion (Patriarch) on Oct 23, 2016 at 08:21 UTC