I am certainly no expert on HTML, and I am not sure exactly what you have; one line doesn't tell me much. Over the years I've written a few web scrapers with LWP and a couple with WWW::Mechanize. As long as the web page serves plain HTML rather than relying on JavaScript to build its content, you can use the base WWW::Mechanize module. That has been the case so far in my current applications. If the page requires executing JavaScript, Perl cannot do that alone, and you will need WWW::Mechanize::Chrome or similar. In that setup, Perl controls the browser, the browser executes the JavaScript, and Mechanize sees the result of what that code did.
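To give a feel for the plain-HTML case, here is a minimal sketch with the base WWW::Mechanize module. The helper name list_links and the user-agent string are made up for illustration, and the URL comes from the command line; treat it as a starting point for experimenting, not as the way your site has to be handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Fetch one page and return its title plus a list of [text, absolute URL]
# pairs, one per link. list_links is a hypothetical helper name.
sub list_links {
    my ($url) = @_;

    my $mech = WWW::Mechanize->new(
        autocheck => 1,                  # die on HTTP errors instead of failing silently
        agent     => 'my-scraper/0.1',   # hypothetical user-agent string
    );
    $mech->get($url);

    my @links;
    for my $link ( $mech->links ) {
        my $text = defined $link->text ? $link->text : '';
        push @links, [ $text, $link->url_abs->as_string ];
    }
    return ( $mech->title, \@links );
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my ( $title, $links ) = list_links( $ARGV[0] );
    print "Title: ", ( defined $title ? $title : '(none)' ), "\n";
    printf "%-40s %s\n", @$_ for @$links;
}
```

If the links you expect are missing from this output but show up in a real browser, that is a strong hint that the page builds its content with JavaScript and you need the browser-driven route instead.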

I would start by reading the WWW::Mechanize docs on CPAN and then take a look at some Mech examples. After that, I would start "hacking" and experimenting to see how far you can get with the base Mech module. If you are working with a public, heavily trafficked web site, then show us the URL.

Also be aware of the potential impact that your code could have on the target web site. I have one application that "beats up" one web site pretty hard, but I have an agreement with the site owner about what hours and what days my application can run. This is an important consideration if you are going to retrieve a lot of data.
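One way to build that kind of restraint into a scraper is sketched below, using only core Perl. The allowed days and hours, the delay, and the names in_allowed_window and fetch_politely are all assumptions for illustration; the real values would come from whatever you work out with the site owner.

```perl
use strict;
use warnings;

# Hypothetical agreement: Monday..Friday, from 01:00 up to (not including) 05:00 local time.
my %ALLOWED_DAY = map { $_ => 1 } 1 .. 5;   # localtime wday: 0 = Sunday .. 6 = Saturday
my $START_HOUR  = 1;
my $END_HOUR    = 5;

# Return 1 if the given hour (0..23) and weekday (0..6) fall inside the window.
sub in_allowed_window {
    my ( $hour, $wday ) = @_;
    return ( $ALLOWED_DAY{$wday} && $hour >= $START_HOUR && $hour < $END_HOUR ) ? 1 : 0;
}

# Call $fetch->($url) for each URL, pausing $delay seconds after each request
# and stopping as soon as the clock leaves the allowed window.
sub fetch_politely {
    my ( $urls, $fetch, $delay ) = @_;
    my @results;
    for my $url (@$urls) {
        my ( $hour, $wday ) = ( localtime )[ 2, 6 ];
        last unless in_allowed_window( $hour, $wday );
        push @results, $fetch->($url);
        sleep $delay;
    }
    return @results;
}
```

The $fetch argument is any code reference, so it could wrap a Mechanize call such as sub { $mech->get( $_[0] ); $mech->content }. Keeping the window check as a separate function that takes plain numbers makes it easy to verify without waiting for the clock.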

Update: s/Java/Javascript/; #Completely different things!


In reply to Re: need help determining which web browsing module to use by Marshall
in thread need help determining which web browsing module to use by Special_K
