PerlMonks  

endless loop problems

by Anonymous Monk
on Jun 28, 2006 at 15:51 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I think I found myself in an endless loop. This is a followup of my post yesterday adding an IF to a push.

When run, if $limit_img_size eq "yes" it will do nothing but load until judgement day or until my server decides it's time to kill it. If I change it to "no" to catch the else, the else section works as intended.

A little background on the script: It goes to ImageFap which is a free image host with free searchable galleries. I am doing a search on it and pulling back all the galleries of results on the first page. From here, I go through the galleries one at a time and extract the image links from within the gallery. All I'm collecting are image links but I need to make sure all my image links have images of a specific size or less if that's what the user wants.

my $pics_found = $#found_images + 1;

if ($limit_img_size eq "yes")
{
    while ($pics_found < $pics_to_find * 30) # get more than needed so we can randomly choose images later
    {
        my $next_gal = pop(@found_galleries); # remove one gallery link at a time until we're done
        if (!$next_gal) { last; }

        my $get_gal = get($next_gal);

        while ($get_gal =~ m#(http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg)#g)
        {
            print "1 - $1<br>"; #testing, doesn't print
            my $image = get($1);
            print "found $image<br>"; #testing, doesn't print
            my ($height, $width) = imgsize(\$image);
            push @found_images, $image if ($height <= $max_height and $width <= $max_width);
        }
    }
}
else
{
    while ($pics_found < $pics_to_find * 30)
    {
        my $next_gal = pop(@found_galleries);
        if (!$next_gal) { last; }

        my $get_gal = get($next_gal);

        #########
        # collect our image links
        #########
        push @found_images, $get_gal =~ m#http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg#g;
        $pics_found = $#found_images + 1;
    }
}

Replies are listed 'Best First'.
Re: endless loop problems
by japhy (Canon) on Jun 28, 2006 at 16:00 UTC
    Your basic problem is that you're not updating $pics_found in your top while loop, like you are in the bottom loop. By the way, $pics_found = $#found_images + 1 is an ugly way of saying $pics_found = @found_images, since an array in scalar context returns its size. There's really no need for a separate variable, then! Also, move your $limit_img_size logic to the inside of the while loop. The two loops do the same thing; there's just a slight bit of extra logic if $limit_img_size is "yes".

    while (@found < $pics_to_find * 30) {
        last unless @found_galleries; # exit the loop if there are no more
        my $gallery_content = get(pop @found_galleries);

        while ($gallery_content =~ /.../) {
            ...
            # why use "yes" and "no"? why not just a true value or a false value?
            next if $limit_img_size eq "yes" and !some_size_constraint();
            push @found, ...;
        }
    }
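The scalar-context point can be tried in isolation. The following is only a minimal sketch with invented array contents and an invented loop limit; it is not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @found_images = ('a.jpg', 'b.jpg', 'c.jpg');

# $#array is the last index, so the count is one more than that...
my $count_via_last_index = $#found_images + 1;

# ...but an array evaluated in scalar context already yields its element count.
my $count_via_scalar = @found_images;

print "last index + 1: $count_via_last_index\n";
print "scalar context: $count_via_scalar\n";

# A numeric comparison also supplies scalar context, so the array itself
# can drive the loop condition and there is no counter to forget to update.
my $pics_to_find = 1;   # assumed value
while (@found_images < $pics_to_find * 5) {
    push @found_images, 'more.jpg';   # the condition sees every push
}
print "after loop: ", scalar(@found_images), "\n";
```

Because the condition reads the array directly, every push is reflected the next time the loop test runs.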

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: endless loop problems
by ikegami (Patriarch) on Jun 28, 2006 at 16:09 UTC

    The problem is that you never update $pics_found. Replace $pics_found with @found_images.

    I see that you tried to debug the problem by adding print statements. That's good. (We like monks that put in some effort, and it's what I would have done.) The problem with your print statements is that they are being buffered. Adding $| = 1 will reveal them.
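As a rough sketch of the buffering fix (the printed strings are placeholders, and a web server in front of a CGI script may still add buffering of its own):

```perl
use strict;
use warnings;
use IO::Handle;   # provides the autoflush method on filehandles

# $| applies to the currently selected output handle (STDOUT by default).
# Setting it to a true value makes every print flush immediately.
$| = 1;

# Equivalent, more explicit spelling for a specific handle:
STDOUT->autoflush(1);

print "1 - about to fetch<br>\n";   # flushed right away, not held in a buffer
# ... a slow get() call would go here ...
print "done<br>\n";
```

With flushing enabled, a debug line printed before a slow download shows up before the download finishes, which makes it easier to tell a slow loop from an endless one.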

    I took the liberty of cleaning up your code:

    # $| = 1; # Uncomment when adding print statements for debugging.

    # Get more than needed so we can randomly choose images later.
    while (@found_galleries && @found_images < $pics_to_find * 30) {

        # Remove one gallery link at a time until we're done.
        my $gal_url = pop(@found_galleries);
        my $gal = get($gal_url);

        # Extract the image URLs from the gallery page.
        my @image_urls = $gal =~ m#(http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg)#g;

        # Remove the images which are too big.
        if ($limit_img_size eq "yes") {
            @image_urls = grep {
                my $image = get($_);
                my ($height, $width) = imgsize(\$image);
                $height <= $max_height && $width <= $max_width
            } @image_urls;
        }

        push @found_images, @image_urls;
    }

      Then again, maybe the OP's code was working. Maybe he thought it was endless because no prints came back and because every image has to be loaded into memory to be sized. From ImageFap, that's one long wait because they are super slow.

      The code might not be pretty, but other than the loop variable not increasing, I think it should work the way it is. I'd try setting $| = 1 to turn off output buffering and see what happens.

      UPDATE: would it not be better for the server in terms of CPU and memory if the images were downloaded into a temp directory and sized locally rather than read from memory?

        I don't understand what you are trying to say in the first two paragraphs. (Yes, the debug statements would have worked with $| = 1. Yes, it would have worked by replacing $pics_found with @found_images. Just like I said in my post.)

        If your point is that you're taking exception to me cleaning up the code, keep in mind that good code is readable and maintainable code. The excess of redundancy and the poor choice of variable names in the original code affected both readability and maintainability.

        would it not be better for the server in terms of CPU and memory if the images were downloaded into a temp directory and sized locally rather than read from memory?

        The CPU usage would be (slightly) higher, since we'd have extra code to write to the disk and read from the disk.

        Yes, the memory usage could be smaller (depending on how we write to and read from the disk) by something less than the size of one image.

        And of course, it would be slower because we'd have to do all the work we currently do, plus more.

      Hi. I have been playing with this all afternoon and I can't get the output to work.

      It just doesn't size some of the images.

      my $pics_found = $#found_images + 1;

      while (@found_galleries && @found_images <= $pics_to_find * 20) {

          # Remove one gallery link at a time until we're done.
          my $gal_url = pop(@found_galleries);
          my $gal = get($gal_url);
          print "gal = $gal_url<br>";

          # Extract the image URLs from the gallery page.
          my @image_urls = $gal =~ m#(http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg)#g;

          foreach my $img (@image_urls) {
              $img =~ s/thumb/full/ig;
          }

          # Remove the images which are too big.
          if ($limit_img_size eq "yes") {
              @image_urls = grep {
                  my $this_url = $_;
                  my $image = get($this_url);
                  my ($height, $width) = imgsize(\$image);
                  print "IMG $this_url is $height and $width<br>";
                  $height <= $max_height && $width <= $max_width
              } @image_urls;
          }

          push @found_images, @image_urls;
      }

      The output below is generally what I get. It will put the size down for a few of them but only rarely.

      gal = http://www.imagefap.com/gallery.php?gid=158098
      IMG http://images.imagefap.com/images/full/5/190/190037365.jpg is and
      IMG http://images.imagefap.com/images/full/6/804/804735625.jpg is and
      IMG http://images.imagefap.com/images/full/7/478/478782005.jpg is and
      IMG http://images.imagefap.com/images/full/10/199/1994599748.jpg is and
      IMG http://images.imagefap.com/images/full/4/115/1151974238.jpg is and

      I don't suppose you or anyone else can see what's going wrong?

        It works for me, or rather, the following does:

        use strict;
        use warnings;

        use Data::Dumper qw( Dumper );
        use Image::Size qw( imgsize );
        use LWP::Simple qw( get );

        my $this_url = "http://images.imagefap.com/images/full/5/190/190037365.jpg";
        my $image = get($this_url);
        my ($height, $width) = imgsize(\$image);
        print "IMG $this_url is $height and $width\n";

        __END__
        IMG http://images.imagefap.com/images/full/5/190/190037365.jpg is 507 and 760

        Make sure $image is defined. Maybe an error occurred during the download?

        print(defined($image) ? "Ok." : "Error!", "<br>");

        Oh! And make sure your version of Image::Size accepts images passed as \$image.
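The defined check can be folded into the filter itself so that failed downloads are skipped instead of printing empty sizes. This is only a sketch: filter_by_size, the stub data, and the example.com URLs are invented, and the fetch and measure steps are passed in as code refs so it runs without network access (in the real script they would wrap LWP::Simple's get and Image::Size's imgsize). Image::Size documents imgsize as returning the width first and then the height, so the sketch names the values in that order, which is the reverse of the variable names used in the snippets in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the URLs whose image downloads
# successfully and fits within the given limits.
sub filter_by_size {
    my ($urls, $fetch, $measure, $max_width, $max_height) = @_;
    my @kept;
    for my $url (@$urls) {
        my $image = $fetch->($url);
        next unless defined $image;                       # download failed
        my ($width, $height) = $measure->(\$image);
        next unless defined $width && defined $height;    # size could not be determined
        push @kept, $url if $width <= $max_width && $height <= $max_height;
    }
    return @kept;
}

# Stub data standing in for real downloads (all values invented).
my %fake_sizes = (
    'http://example.com/small.jpg' => [ 640, 480 ],
    'http://example.com/huge.jpg'  => [ 4000, 3000 ],
);
# The stub "downloads" a known URL by returning the URL itself as the data.
my $fetch   = sub { exists $fake_sizes{ $_[0] } ? $_[0] : undef };
# The stub "measures" by looking the data up in the table above.
my $measure = sub { @{ $fake_sizes{ ${ $_[0] } } } };

my @urls = (
    'http://example.com/small.jpg',
    'http://example.com/huge.jpg',
    'http://example.com/missing.jpg',
);
my @ok = filter_by_size(\@urls, $fetch, $measure, 800, 600);
print "$_\n" for @ok;
```

Only the small image survives: the huge one exceeds the limits and the missing one is dropped by the defined check.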

        PS -
        foreach my $img (@image_urls) { $img =~ s/thumb/full/ig; }
        can be written as
        s/thumb/full/i foreach @image_urls;
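The one-line form works because foreach aliases $_ to each array element, so the substitution edits the array in place. A small sketch with invented URLs:

```perl
use strict;
use warnings;

# Invented URLs, for illustration only.
my @image_urls = (
    'http://example.com/images/thumb/1/2/3.jpg',
    'http://example.com/images/thumb/4/5/6.jpg',
);

# $_ is an alias for each element, so s/// modifies @image_urls directly.
s/thumb/full/i foreach @image_urls;

print "$_\n" for @image_urls;
```

Without the /g flag only the first "thumb" in each URL is replaced, which is enough here since each URL contains it once.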
