I think what you want to look into is LWP::Simple and its getstore() function, which would look something like this, using your example:

    use strict;
    use warnings;
    use LWP::Simple qw/get getstore is_success/;

    # getstore() returns the HTTP status code, not the page content
    my $status = getstore('http://www.google.com', 'test.html');
    die "Fetch failed with status $status\n" unless is_success($status);

    # Or you could store the content in a scalar, with:
    # my $content = get('http://www.google.com');

But you also mentioned you wanted images, so you'll want to look into using HTML::SimpleLinkExtor as well, which is used thusly:

    use strict;
    use HTML::SimpleLinkExtor;

    my $extract = HTML::SimpleLinkExtor->new();
    $extract->parse_file("test.html");
    # or you could use $extract->parse($content), as shown
    # above, using the raw HTML

    my @images = $extract->img;
    # feed each element of @images back to a getstore() for
    # each image found

If you want to maintain "browsability" of the site once you've fetched it locally, you'll want to look into rewriting the links found in the document. I would suggest using URI to get the absolute URL of each link, fetching those, and saving them locally to disk. If you want to keep the same filenames as referenced in the main HTML page you fetched, you can either regex off the filename portion of the link yourself, or use File::Basename to chop it off programmatically. Here's a regex that can help split the filename from the URL you get with URI's new_abs() method:

$file =~ s/.*[\/\\](.*)/$1/;
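
To make the URI and File::Basename route a bit more concrete, here is a minimal sketch rather than a finished mirroring tool. The helper name plan_downloads() and the example.com URLs are made up for illustration; URI->new_abs() and File::Basename's fileparse() are the real calls. It resolves each link against the page it came from and works out a local filename from the path only, so a query string never ends up in the name:

```perl
use strict;
use warnings;
use URI;
use File::Basename qw(fileparse);

# Hypothetical helper: turn the links found in a page into
# [absolute URL, local filename] pairs.
sub plan_downloads {
    my ($base, @links) = @_;
    my @plan;
    for my $link (@links) {
        # Resolve relative links against the URL of the page they came from
        my $abs = URI->new_abs($link, $base);

        # fileparse() returns the filename as its first element;
        # it is empty when the path ends in a slash
        my ($file) = fileparse($abs->path);
        next unless length $file;

        push @plan, [ $abs->as_string, $file ];
    }
    return @plan;
}

# Example base URL and links, assumed purely for illustration
my @plan = plan_downloads(
    'http://www.example.com/docs/index.html',
    'images/logo.png',
    '/css/site.css',
    'http://www.example.org/pic.jpg?size=2',
);
print "$_->[0] -> $_->[1]\n" for @plan;
```

Each pair could then be handed to getstore($url, $file) from LWP::Simple. Note that two links with the same filename in different directories would overwrite each other locally, so a real script would probably want to keep some of the directory structure.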

Why would you want to use these modules instead of OLE? Because they're portable, and if your code happens to move from Windows to Linux or BSD or some other non-OLE-supporting variant, it will still continue to work, without you having to rewrite it.

Good luck in your quest.


In reply to Re: SaveAs OLE Explorer by hacker
in thread SaveAs ole Explorer by Brett Wraight
