in reply to Still empty...here is a live and working example
in thread Key/Value pair from GET
Here's something that might do the job:

    <a href="PlatformSoftwareSection.jsp?siteId=1&jid=94DDB69B3747X42D738A8A4E54CDD8A4&platformId=1&special=&bySection=1&sectionId=2167&catalog=1&amp;title=FireViewer+Videos+%26+Images"> <span class="smallprint">E-Books & Document Readers</span></a>,
You'll note I took the liberty of redefining your regex completely. In this case, I'm scooping the "sectionId" variable (numeric only) followed by any amount of "stuff", then grabbing the non-tagged content of the 'span' tag. It works, as best as I can tell, but isn't very adaptable.

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    my %categories;    # No need for '= ()'

    my $page = get('http://www.handango.com/PlatformSoftware.jsp?platformId=1&siteId=1&zsortParams=true');

    while ($page =~ /
        sectionId=(\d+)               # Section ID (all digits)
        [^>"]+">                      # Remainder of param and tag
        \s+                           # Some whitespace
        <span\s+class="smallprint">   # SPAN tag
        ([^<]*)                       # "Stuff" up to next tag
        <                             # Start of next tag
    /xig) {
        $categories{$1} = $2;
        print "[$1] and [$2]\n";
    }

    foreach my $idkey (keys %categories) {
        print "$idkey,$categories{$idkey}\n";
    }
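For anyone who wants to try the pattern without hitting the live site, here is a minimal offline sketch. It assumes the anchor quoted earlier in this post is representative of the page, with the line-wrap breaks in the href joined back together. The `extract_categories` helper and the `$sample` variable are hypothetical names introduced only for this illustration; the regex itself is unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns a hash of
# sectionId => label for every matching anchor in the given HTML.
sub extract_categories {
    my ($html) = @_;
    my %found;
    while ($html =~ /
        sectionId=(\d+)               # Section ID (all digits)
        [^>"]+">                      # Remainder of param and tag
        \s+                           # Some whitespace
        <span\s+class="smallprint">   # SPAN tag
        ([^<]*)                       # "Stuff" up to next tag
        <                             # Start of next tag
    /xig) {
        $found{$1} = $2;
    }
    return %found;
}

# Assumption: the sample anchor from the post, with wrapped lines rejoined.
my $sample = <<'HTML';
<a href="PlatformSoftwareSection.jsp?siteId=1&jid=94DDB69B3747X42D738A8A4E54CDD8A4&platformId=1&special=&bySection=1&sectionId=2167&catalog=1&amp;title=FireViewer+Videos+%26+Images"> <span class="smallprint">E-Books & Document Readers</span></a>,
HTML

my %categories = extract_categories($sample);
print "$_,$categories{$_}\n" for sort keys %categories;
# prints: 2167,E-Books & Document Readers
```

Keeping the match in a sub that takes a string makes it easy to feed in saved copies of the page and see where the pattern stops matching, which is probably where the fragility will show up first.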
working great...will work on it to make it less fragile
by inblosam (Monk) on Jun 06, 2002 at 16:20 UTC