Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have been using LWP::Simple to mirror a site. I would also like to retrieve the .gif and .jpg files associated with the pages. How do I do this? Thanks

Replies are listed 'Best First'.
Re: How to retrieve gifs etc using LWP simple
by chromatic (Archbishop) on Dec 21, 2000 at 09:13 UTC
    There's the Image::Grab module, which uses the HTML::Parser module to look for images and, well, grab them.

    That's a much better idea than rolling your own regex solution.

Re: How to retrieve gifs etc using LWP simple
by Fastolfe (Vicar) on Dec 22, 2000 at 01:19 UTC
    The 'mirror' function of LWP::Simple only acts on a given URL. It does not "spider" the site, following links and image references. If that's the behavior you want, you will need to use something like HTML::Parser to parse the results of your 'mirror' (assuming it returned a value indicating something changed), pull out the links and image references, and then 'mirror' each of those.
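
    A minimal sketch of that approach, assuming you only want the images referenced by <img> tags on a single page. The helper names (image_urls, mirror_page_with_images) and the three command-line arguments are made up for illustration; URI is used to turn relative src values into absolute URLs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(mirror is_success);
use HTML::Parser ();
use URI ();
use File::Basename qw(basename);

# Hypothetical helper: return the absolute URL of every <img src="...">
# found in $html, resolved against $base_url, with duplicates removed.
sub image_urls {
    my ($html, $base_url) = @_;
    my (@urls, %seen);
    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tag, $attr) = @_;
                return unless $tag eq 'img' && defined $attr->{src};
                my $abs = URI->new_abs($attr->{src}, $base_url)->as_string;
                push @urls, $abs unless $seen{$abs}++;
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;
    return @urls;
}

# Hypothetical helper: mirror one page, then mirror each image it references.
# Images are only fetched when the page itself was fetched successfully,
# so a 304 (not modified) or an error skips the image pass.
sub mirror_page_with_images {
    my ($page_url, $page_file, $image_dir) = @_;
    my $rc = mirror($page_url, $page_file);
    return $rc unless is_success($rc);

    open my $fh, '<', $page_file or die "Cannot read $page_file: $!";
    my $html = do { local $/; <$fh> };
    close $fh;

    for my $url (image_urls($html, $page_url)) {
        # Assumption: images are stored flat, named after the last path segment.
        my $name = basename(URI->new($url)->path) or next;
        mirror($url, "$image_dir/$name");
    }
    return $rc;
}

# Usage: script.pl PAGE_URL LOCAL_PAGE_FILE EXISTING_IMAGE_DIR
mirror_page_with_images(@ARGV) if @ARGV == 3;
```

    Storing every image under its bare file name means two images with the same name in different directories will overwrite each other, and the saved page still points at the original image URLs. Following links to other pages is left out as well.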

    Surely there is a pre-written script or module that will do this though...? I don't have any experience with any, but I imagine spidering a site for the purposes of mirroring is something that's been solved several times over already. If you're only worried about one page, I'm sure most of these have a 'depth' option that you can simply set to 1.

      Try wget; you can't beat it.