in reply to endless loop problems

The problem is that you never update $pics_found, so the loop's exit condition can never become true. Replace $pics_found with @found_images.
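The "before" shape here is a guess at your original loop, but it was presumably something like this:

    # Before: $pics_found is computed once and never updated, so the
    # condition never changes and the loop never ends.
    while (@found_galleries && $pics_found <= $pics_to_find * 20) { ... }

    # After: test the array itself; it grows as images are pushed onto it.
    while (@found_galleries && @found_images <= $pics_to_find * 20) { ... }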

I see that you tried to debug the problem by adding print statements. That's good. (We like monks that put in some effort, and it's what I would have done.) The problem with your print statements is that their output is being buffered. Setting $| = 1 will make them appear immediately.
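For example, near the top of the script:

    $| = 1;   # Unbuffer STDOUT so debug prints show up as they happen.

    # Or the same thing spelled more readably:
    use IO::Handle;
    STDOUT->autoflush(1);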

I took the liberty of cleaning up your code:

# $| = 1; # Uncomment when adding print statements for debugging.

# Get more than needed so we can randomly choose images later.
while (@found_galleries && @found_images < $pics_to_find * 30) {

    # Remove one gallery link at a time until we're done.
    my $gal_url = pop(@found_galleries);
    my $gal     = get($gal_url);

    # Extract the image URLs from the gallery page.
    my @image_urls = $gal =~ m#(http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg)#g;

    # Remove the images which are too big.
    if ($limit_img_size eq "yes") {
        @image_urls = grep {
            my $image = get($_);
            # Note: imgsize() returns (width, height), in that order.
            my ($width, $height) = imgsize(\$image);
            $height <= $max_height && $width <= $max_width
        } @image_urls;
    }

    push @found_images, @image_urls;
}

Re^2: endless loop problems
by coldfingertips (Pilgrim) on Jun 28, 2006 at 16:18 UTC
    Then again, maybe the OP's code was working. Maybe he thought it was endless because he wasn't getting any prints back, and because every image has to be loaded into memory before it can be sized. From ImageFap, that's one long wait, because they are super slow.

    The code might not be pretty, but other than the loop variable not increasing, I think it should work as it is. I'd turn on the buffering var ($|) and see what happens.

    UPDATE: would it not be better for the server in terms of CPU and memory if the images were downloaded into a temp directory and sized locally rather than read from memory?

      I don't understand what you are trying to say in the first two paragraphs. (Yes, the debug statements would have worked with $| = 1. Yes, it would have worked by replacing $pics_found with @found_images. Just like I said in my post.)

      If your point is that you're taking exception to me cleaning up the code, keep in mind that good code is readable and maintainable code. The excess of redundancy and the poor choice of variable names in the original code affected both readability and maintainability.

      would it not be better for the server in terms of CPU and memory if the images were downloaded into a temp directory and sized locally rather than read from memory?

      The CPU usage would be (slightly) higher, since we'd have extra code to write to the disk and read from the disk.

      Yes, the memory usage could be smaller (depending on how we write to and read from the disk) by something less than the size of one image.

      And of course, it would be slower, because we'd have to do all the work we currently do, plus more.
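      That said, if you did want the temp-file variant, here's a minimal sketch (using LWP::Simple's getstore and File::Temp; $url stands in for one of the image URLs):

        use File::Temp  qw( tempfile );
        use LWP::Simple qw( getstore is_success );
        use Image::Size qw( imgsize );

        # Write the image to a temp file instead of holding it in a scalar.
        my ($fh, $tmpfile) = tempfile(UNLINK => 1);   # Removed automatically on exit.
        if (is_success(getstore($url, $tmpfile))) {
            # imgsize() also accepts a file name, not just a scalar reference.
            my ($width, $height) = imgsize($tmpfile);
            print "IMG $url is $width x $height\n" if defined $width;
        }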

        This might be getting a little on the OT side, but how much memory is used up at a time in this situation? Does the memory clear after each image, or does it pile up?

        I'm not so much into the technical side of things, but I'd have to think that storing an image locally might take a little longer, yet it would free up a lot of unnecessary memory, and it would be faster to process the image sizes if they were stored locally. So when it comes to memory usage, if the files were stored, wouldn't it only be a problem for a short while instead of for the duration of the entire loop above?

        I could be speaking nonsense as I have no real clue how this stuff really works.

Re^2: endless loop problems
by Anonymous Monk on Jun 28, 2006 at 19:47 UTC
    Hi. I have been playing with this all afternoon and I can't get the output to work.

    It just doesn't size some of the images.

    my $pics_found = $#found_images + 1;

    while (@found_galleries && @found_images <= $pics_to_find * 20) {

        # Remove one gallery link at a time until we're done.
        my $gal_url = pop(@found_galleries);
        my $gal     = get($gal_url);
        print "gal = $gal_url<br>";

        # Extract the image URLs from the gallery page.
        my @image_urls = $gal =~ m#(http://images\.imagefap\.com/images/thumb/\d+/\d+/\d+\.jpg)#g;

        foreach my $img (@image_urls) {
            $img =~ s/thumb/full/ig;
        }

        # Remove the images which are too big.
        if ($limit_img_size eq "yes") {
            @image_urls = grep {
                my $this_url = $_;
                my $image    = get($this_url);
                my ($height, $width) = imgsize(\$image);
                print "IMG $this_url is $height and $width<br>";
                $height <= $max_height && $width <= $max_width
            } @image_urls;
        }

        push @found_images, @image_urls;
    }
    The output below is generally what I get. It will print the size for a few of them, but only rarely.
    gal = http://www.imagefap.com/gallery.php?gid=158098
    IMG http://images.imagefap.com/images/full/5/190/190037365.jpg is and
    IMG http://images.imagefap.com/images/full/6/804/804735625.jpg is and
    IMG http://images.imagefap.com/images/full/7/478/478782005.jpg is and
    IMG http://images.imagefap.com/images/full/10/199/1994599748.jpg is and
    IMG http://images.imagefap.com/images/full/4/115/1151974238.jpg is and
    I don't suppose you or anyone else can see what's going wrong?

      It works for me, or rather, the following does:

      use strict;
      use warnings;
      use Data::Dumper qw( Dumper );
      use Image::Size  qw( imgsize );
      use LWP::Simple  qw( get );

      my $this_url = "http://images.imagefap.com/images/full/5/190/190037365.jpg";
      my $image    = get($this_url);

      # imgsize() returns (width, height), in that order.
      my ($width, $height) = imgsize(\$image);
      print "IMG $this_url is $width and $height\n";

      __END__
      IMG http://images.imagefap.com/images/full/5/190/190037365.jpg is 507 and 760

      Make sure $image is defined. Maybe an error occurred during the download?

      print(defined($image) ? "Ok." : "Error!", "<br>");

      Oh! And make sure your version of Image::Size accepts in-memory images passed as a scalar reference (\$image).
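      Putting the two checks together, a defensive version of your grep body might look like this (a sketch only; the variable names are taken from your code, and note that imgsize() returns width first):

        @image_urls = grep {
            my $this_url = $_;
            my $image    = get($this_url);
            my ($width, $height) = defined($image) ? imgsize(\$image) : ();
            print defined($width)
                ? "IMG $this_url is $width and $height<br>"
                : "could not fetch or size $this_url<br>";
            # Keep only images we could size and that fit within the limits.
            defined($width) && $height <= $max_height && $width <= $max_width;
        } @image_urls;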

      PS -
      foreach my $img (@image_urls) { $img =~ s/thumb/full/ig; }
      can be written as
      s/thumb/full/i foreach @image_urls;