Monks,

Alas, I am stymied once again and have humbly come for assistance.

I am trying to pull the links off a page and store them in @links. There is standard code for this which I have used with success.

    my @links = ();
    sub callback {
        my($tag, %attr) = @_;
        return if $tag ne 'a';
        push(@links, values %attr);
    }
    # Make the parser.
    $p = HTML::LinkExtor->new(\&callback);
    # Request document and parse it as it arrives
    $res = $ua->request(HTTP::Request->new(GET => $url), sub {$p->parse($_[0])});

Now, however, I am trying to get the links off a page that requires a username/password. With the assistance of the monks I have already managed to grab such a page using basic authentication:

    $ua = LWP::UserAgent->new;
    $req = HTTP::Request->new(GET => $url);
    $req->authorization_basic('user', 'pass');
    $res = $ua->request($req)->as_string;

Now the question is how to merge the user/pass webpage grab with the link extractor.

I have tried

    $ua = LWP::UserAgent->new;
    $req = HTTP::Request->new(GET => $url);
    $req->authorization_basic('user', 'pass');
    $res = $ua->request($req)->as_string, sub {$p->parse($_[0])};

but when I print out @links I get nothing. I think (but really have no clue) this has something to do with the ->as_string, but without it the response prints as HTTP::Response=HASH(0x8435960) instead of the page.

Is there something else that I should be doing to get these links pulled out properly? Obviously there is, but do you guys know what that might be?

cdherold


In reply to Link Extraction when grabbing web page with USER/PASS by cdherold
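
One likely cause of the empty @links: in the attempted statement, the comma after ->as_string is Perl's comma operator, so the anonymous sub is built and thrown away rather than handed to request(), and the parser never sees any data. The ->as_string call is not the culprit; HTTP::Response=HASH(...) is simply what the response object looks like when printed directly. Below is a minimal sketch of merging the two snippets by passing the callback as the second argument to request(). The helper names new_link_parser and fetch_links are made up for illustration, and the URL and credentials in the comment are placeholders.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::LinkExtor;

# Build a parser that pushes the link attributes of every <a> tag
# onto the array referenced by $links (hypothetical helper).
sub new_link_parser {
    my ($links) = @_;
    return HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        return if $tag ne 'a';
        push @$links, values %attr;
    });
}

# Fetch $url with basic authentication and parse it as it arrives
# (hypothetical helper; needs network access and valid credentials).
sub fetch_links {
    my ($url, $user, $pass) = @_;
    my @links;
    my $p = new_link_parser(\@links);

    my $ua  = LWP::UserAgent->new;
    my $req = HTTP::Request->new(GET => $url);
    $req->authorization_basic($user, $pass);

    # The content callback is the SECOND ARGUMENT to request().
    my $res = $ua->request($req, sub { $p->parse($_[0]) });
    $p->eof;    # tell the parser the document is complete

    die 'Request failed: ', $res->status_line, "\n" unless $res->is_success;
    return @links;
}

# Offline check of the parser half, no network needed:
my @found;
my $parser = new_link_parser(\@found);
$parser->parse('<a href="/one">1</a> <img src="x.png"> <a href="/two">2</a>');
$parser->eof;
print "$_\n" for @found;    # prints /one and /two, one per line

# With a real page (placeholder URL and credentials):
# my @links = fetch_links('http://www.example.com/private/', 'user', 'pass');
```

When a callback is supplied, the body is handed to the callback chunk by chunk and is not kept in the response object, so $res is only useful here for checking the status. If the whole page is wanted as a string instead, calling $res->content on the response of a plain $ua->request($req) returns the body without the headers that ->as_string includes.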
