Preface

Of course, the best way for one machine to talk to another over the web is through some machine-sensible protocol: XML, SOAP, whatever. That said, there are times when this option isn't available, forcing you to write an app that mimics a browser over HTTP(S).

This meditation describes my recent experiments writing such an app. Your Mileage May Vary.

And of course do make sure such automated apps conform with any Terms of Use of the site you're using.

LWP and WWW::Mechanize vs. OLE

There are many posts around the web from folks asking "How do I use Perl to mimic a browser?", and folks always answer "use LWP" or "use WWW::Mechanize". Those are astoundingly great modules for many circumstances, but they also have limitations.
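For readers who haven't seen the two usual answers, here is a minimal sketch of each. It is not code from my app: the sub names, the `username`/`password` field names, and the assumption that the login form is the first form on the page are all hypothetical and will differ on a real site.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::Mechanize;

# Quick page fetch with LWP: one request, one response.
sub fetch_page {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new( timeout => 30 );
    my $res = $ua->get($url);
    die "GET $url failed: " . $res->status_line . "\n" unless $res->is_success;
    return $res->decoded_content;
}

# Simple web app with WWW::Mechanize: it keeps cookies between requests
# and resubmits the hidden fields of the form it fills in.
sub login_and_fetch {
    my (%arg) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );    # die on HTTP errors
    $mech->get( $arg{login_url} );
    $mech->submit_form(
        form_number => 1,    # assumption: the login form is the first form
        fields      => {     # hypothetical field names
            username => $arg{user},
            password => $arg{pass},
        },
    );
    $mech->get( $arg{status_url} );
    return $mech->content;
}

# Runs only when a URL is given on the command line.
print fetch_page( $ARGV[0] ) if @ARGV;
```

Neither module runs JavaScript, so a page that builds or rewrites its forms in script will not look the same to this code as it does to a browser.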

I'd suggest the strengths of LWP and WWW::Mechanize are:

I would suggest the weaknesses of LWP and WWW::Mechanize are:

I have been experimenting with an app that interfaces with a website: it needs to log in, redirect to a secure site, examine the status of some pages, post multi-page forms full of hidden cookies and javascript, and repeat a handful of times.

After some struggles with LWP and WWW::Mechanize, I finally decided to try OLE.

I thought "surely OLE will break, or be slower, or be harder to implement."

I was pleasantly surprised: for my needs on this project, OLE was easier. Again, Your Mileage May Vary.

I used SAMIE (http://samie.sourceforge.net/) to get me started.
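To give a flavour of what driving IE over OLE looks like, here is a minimal sketch that uses Win32::OLE directly. It does not use SAMIE's own helper functions, the sub name `ie_page_title` is hypothetical, and it assumes a Windows machine with Internet Explorer installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Drive a real IE instance over OLE and return the loaded page's title.
sub ie_page_title {
    my ($url) = @_;
    die "Win32::OLE needs Windows\n" unless $^O eq 'MSWin32';
    require Win32::OLE;

    my $ie = Win32::OLE->new('InternetExplorer.Application')
        or die "Cannot start IE: " . Win32::OLE->LastError() . "\n";
    $ie->{Visible} = 1;    # watch the browser while it works
    $ie->Navigate($url);

    # Busy and ReadyState are properties of the IE object; 4 means "complete".
    sleep 1 while $ie->{Busy} || $ie->{ReadyState} != 4;

    my $title = $ie->{Document}{title};
    $ie->Quit;
    return $title;
}

# Runs only when a URL is given on the command line.
print ie_page_title( $ARGV[0] ), "\n" if @ARGV;
```

Because the page is loaded by IE itself, cookies, redirects and script all behave as they would for a person at the keyboard; the script only has to wait for the browser and then read the document.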

I'd suggest the strengths of OLE for IE (through SAMIE) are:

And the weaknesses of SAMIE:

Summary

Perl is about using the right tool for the job.

For quick page fetches, I'd use LWP. For simple web apps, I'd use WWW::Mechanize. For testing redirectors or lower-level code, I'd use LWP (so as to be able to see exactly what is going on). For interfacing with a complex multipage secure form quickly on a Win* platform, I'd now suggest considering OLE.
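As an illustration of the "see exactly what is going on" point, here is a small sketch that lists every hop LWP followed on the way to a page. The sub name `trace_hops` is hypothetical; `redirects` is the HTTP::Response method that returns the intermediate responses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Return one "status URL" line per hop: redirects first, final response last.
sub trace_hops {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new( timeout => 30 );
    my $res = $ua->get($url);
    return map { $_->code . ' ' . $_->request->uri } $res->redirects, $res;
}

# Runs only when a URL is given on the command line.
print "$_\n" for @ARGV ? trace_hops( $ARGV[0] ) : ();
```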

rkg

I found the following links of some help:

update (broquaint): shortened width-bursting URLs


In reply to On being a browser by rkg
