in reply to Re^2: downloading images from a webpage
in thread downloading images from a webpage

I finally got output, but it looks like a kid did it. I'd like to polish it up and end up with a script that uses WWW::Mechanize more effectively.

#!/usr/bin/perl -w
use strict;
use LWP::Simple;

open FILE, "text1.txt" or die $!;
my $url;
my $text;
while (<FILE>) {
    $text = $_;
    $text =~ s/\s+//;
    $url = 'http://www.nobeliefs.com/' . $text;
    print qq[ '$url' ];
    $text =~ s#images/##;
    print "$text\n";
    getstore( $url, $text ) or die "Can't download: $@\n";
}

How would I use chomp instead of $text =~ s/\s+//;? Nothing I tried worked.
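(For reference: chomp strips only the trailing newline, i.e. the input record separator, which is also the first whitespace that s/\s+// happens to match on lines with no embedded blanks. A minimal sketch of the loop rewritten with chomp, reusing the text1.txt file name from the script above:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: chomp removes the trailing "\n" from each line read.
# "text1.txt" is the input file from the original script.
open my $fh, '<', 'text1.txt' or die "Can't open text1.txt: $!";
while ( my $text = <$fh> ) {
    chomp $text;    # drop the trailing newline, nothing else
    my $url = 'http://www.nobeliefs.com/' . $text;
    print "$url\n";
}
close $fh;
```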

My failure with WWW::Mechanize was almost complete. The most I could get it to do was dump the names of the images to STDOUT. How could I rewrite this to avoid all the nonsense of saving to a file that I then have to read back in? The documentation for $mech->images says: "Lists all the images on the current page. Each image is a WWW::Mechanize::Image object. In list context, returns a list of all images. In scalar context, returns an array reference of all images." I tried a dozen different things, but I don't get why this is not list context:

#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

open FILE, "text2.txt" or die $!;
my $domain = 'http://www.nobeliefs.com/nazis.htm';
my $m = WWW::Mechanize->new;
$m->get( $domain );
my @list = $m->images();
print "@list \n";
#$m->text();
#$m->content( format => 'text2.txt' );
#print FILE $m;
close FILE;
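(An aside on context: whether a call happens in list or scalar context is decided by what the result is assigned to. A tiny self-contained illustration, using a hypothetical stand-in sub rather than the real WWW::Mechanize method:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for $mech->images(), just to show context.
# wantarray is true when the sub is called in list context.
sub images {
    my @imgs = ( 'a.gif', 'b.png' );
    return wantarray ? @imgs : \@imgs;
}

my @list = images();    # list context: @list gets the elements
my $ref  = images();    # scalar context: $ref is an array reference

print scalar(@list), "\n";    # 2
print ref($ref), "\n";        # ARRAY
```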

Replies are listed 'Best First'.
Re^4: downloading images from a webpage
by blakew (Monk) on Apr 09, 2012 at 15:54 UTC
    Hi. Sorry for the delay.

    You do call images() in list context. What you're going to want to do is combine this with getstore to save your images:

my @list = $m->images();
for my $img (@list) {
    my $url = $img->url_abs();
    my $filename = ...;    # (fill this in, probably based on $url)
    getstore( $url, $filename ) or die "Can't download '$url': $@\n";
}
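    (One hypothetical way to fill in $filename — my sketch, not necessarily what blakew had in mind — is to keep the last path segment of the URL:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch: derive a local file name from the absolute image URL by
# stripping everything up to the last '/'. The URL is illustrative.
my $url = 'http://www.nobeliefs.com/images/foo.gif';
( my $filename = $url ) =~ s{.*/}{};
print "$filename\n";    # foo.gif
```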
Re^4: downloading images from a webpage
by Aldebaran (Curate) on Apr 14, 2012 at 00:56 UTC

    I really screwed up this thread with a lot of newbie mistakes, but I wanted to thank blake for helping me combine a couple of the things I've been trying to do here. I can't seem to do so where his reply was, but I would like him to see that his script improvements worked. Let me also state that I'm as far from national socialism ideologically as I could be: I support civil rights, union rights, democracy, and I'm a pacifist.

$ perl hitler8.pl
$ cat hitler8.pl
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
use LWP::Simple;

my $domain = 'http://www.nobeliefs.com/nazis.htm';
my $m = WWW::Mechanize->new;
$m->get( $domain );
my $counter = 0;
my @list = $m->images();
for my $img (@list) {
    my $url = $img->url_abs();
    $counter++;
    my $filename = "site/image_" . $counter;
    getstore( $url, $filename ) or die "Can't download '$url': $@\n";
}
$ cd site/
$ ls
image_1   image_10  image_11  image_12  image_13  image_14  image_15
image_16  image_17  image_18  image_19  image_2   image_20  image_21
image_22  image_23  image_24  image_25  image_26  image_27  image_28
image_29  image_3   image_30  image_31  image_32  image_33  image_34
image_35  image_36  image_37  image_38  image_39  image_4   image_40
image_41  image_42  image_43  image_5   image_6   image_7   image_8
image_9
$ echo "thx all for comments"
thx all for comments
$
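    (One possible refinement — my suggestion, not from the thread: the counter-based names drop the file extension, which a small tweak to the naming line can keep. A self-contained sketch, with $url and $counter standing in for the loop variables of hitler8.pl:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the naming tweak only. The URL is illustrative; in the
# real loop it would come from $img->url_abs().
my $url     = 'http://www.nobeliefs.com/images/foo.gif';
my $counter = 1;
my ($ext)   = $url =~ /(\.\w+)\z/;    # capture ".gif" if present
my $filename = 'site/image_' . $counter . ( defined $ext ? $ext : '' );
print "$filename\n";    # site/image_1.gif
```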