in reply to Re: Text::Balanced woes..
in thread Text::Balanced woes..
It works marvelously, except that when using extract_multiple with extract_tagged as the subroutine, there seems to be no (obvious :) way to access the 5th (#4) element of the array returned by extract_tagged.
Or is it that, when called from within extract_multiple, it isn't in list context? But if that's the case, then it must be in scalar context, so what happens to the remainder string?
I guess the crux of my question is: "When using extract_multiple, how does one access the other members of the returned array, since item 0 seems to be the only one available?"
I've got some working code, but am reluctant to post all of it here (it is an anti-spambot tool, after all), but I'd be happy to share it via email.
Thanks again for everyone's help and comments!

    # find all the URLs from the page contents, rejecting any from bianca
    @data = extract_multiple(
        $response->content,
        [ sub { extract_tagged($_[0], '<a href="http://', '</a>', undef, {reject => ['bianca.com']}) } ],
        undef, 1);

    # loop thru and strip the URL to its bare address, this is
    # what's needed to insert into the database
    for (my $i=0; $i<=$#data; $i++) {
        my @temp = extract_tagged($data[$i], '<a href="http://', '">', undef, undef);
        $data[$i] = $temp[4];
    }
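
On the list-context question: as I read the Text::Balanced documentation, extract_multiple does call a subroutine extractor in list context, but it only looks at the first three values that come back (the extracted text, the remainder, and any skipped prefix). Anything past that, including element #4, is dropped. One possible workaround is to wrap extract_tagged in a closure that stashes the extra field in a side array before handing the list back. The sketch below is untested against the real page; `$html`, `@inner`, and `@addresses` are made-up names, and the plain regex at the end is a swapped-in stand-in for the second extract_tagged pass, not the original approach.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_multiple extract_tagged);

# Hypothetical sample input standing in for $response->content.
my $html = 'x <a href="http://example.com/a">A</a> y '
         . '<a href="http://example.org/b">B</a> z';

my @inner;   # element #4 (text between the tags) of each successful match

my @links = extract_multiple(
    $html,
    [ sub {
        # $_[0] is the text being scanned; keep the whole returned list.
        my @f = extract_tagged($_[0], '<a href="http://', '</a>', undef,
                               { reject => ['bianca.com'] });
        # Remember the inner text only when the extraction succeeded.
        push @inner, $f[4] if defined $f[0] && length $f[0];
        return @f;   # extract_multiple uses only the leading elements
    } ],
    undef,   # no limit on the number of fields
    1,       # skip text that no extractor matched
);

# Assumption: the bare address is everything up to the first double quote.
my @addresses = map { /^([^"]*)/ ? $1 : () } @inner;

print "$_\n" for @addresses;
```

With this arrangement `@links` still holds the full tagged substrings, while `@inner` and `@addresses` line up with it index for index, so the second loop over `@data` may not be needed.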
Re: Text::Balanced woes..
by Smylers (Pilgrim) on May 28, 2002 at 10:29 UTC

Re: Text::Balanced woes..
by Smylers (Pilgrim) on May 28, 2002 at 10:37 UTC
by u914 (Pilgrim) on Jun 12, 2002 at 05:45 UTC