I couldn't resist posting this here -- even though it's at best a code snippet, and I'm less than a duffer, ranker than an armchair -- just cuz the context is so appropriate, in a self-referential sort of way. (Notwithstanding the inadequacies, at least it's short.)

It's a little script to download the cumulative "Cool Uses for Perl" pages and string them into one big page on my HD. Yeah I know. I should've filtered and scraped and WWW-Mechanize'd away all the duplicated headers and footers and sidebars and stuff. A rainy day perhaps ...

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

my $path = "http://www.perlmonks.org/?";
my $fn   = "coolness.html";

open TO_FILE, '>>', $fn or die "Can't open $fn: $!";

# The CUFP index shows 15 nodes per page, so walk next=0, 15, ..., 285
for ( my $i = 0; $i < 300; $i += 15 ) {
    my $url = $path . "next=" . $i . ";node_id=1044";
    warn $url, "\n";
    my $doc = get($url);
    print TO_FILE $doc;
}

close TO_FILE;

(But I'm curious, maybe someone knows: could the for( ) loop be replaced by a while(< >) construct? It looks a bit verbose, that $doc variable shunting inbound pages to the filehandle. Um?)

(Be kind. You were all once newbies too.)


Re: Cool Uses, re-used
by davido (Cardinal) on Nov 14, 2004 at 17:09 UTC

    LWP::Simple doesn't populate the diamond operator, so to answer that question: no, you can't use while(<>){...}. Maybe you could tie a class to a filehandle that does the LWP::Simple get() behind the scenes, but that just seems a little silly.
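    To make that tie idea a bit more concrete, here is a rough sketch (the CUFPPages package name is made up for illustration, and it borrows the TO_FILE handle and the 300/15 paging numbers from the original script):

    package CUFPPages;    # hypothetical name, just for this sketch
    use strict;
    use warnings;
    use LWP::Simple ();

    sub TIEHANDLE {
        my ( $class, %args ) = @_;
        return bless { next => 0, last => $args{last} || 300 }, $class;
    }

    # Each readline fetches the next CUFP index page and returns its HTML,
    # or returns nothing once we run past the last offset.
    sub READLINE {
        my $self = shift;
        return if $self->{next} >= $self->{last};
        my $url = "http://www.perlmonks.org/?next=$self->{next};node_id=1044";
        $self->{next} += 15;
        return LWP::Simple::get($url);
    }

    package main;
    tie *PAGES, 'CUFPPages', last => 300;
    while ( my $page = <PAGES> ) {    # now it really is a while(<...>) loop
        print TO_FILE $page;          # assumes TO_FILE is open, as in the original
    }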

    It might please the gods if your script were a little more lazy. By that I mean, perhaps there should be a sleep 15 inside the loop, so that you're not banging away at the server quite so furiously. ...just a thought.


    Dave

      WWW::Mechanize::Sleepy handles playing nice.
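      For what it's worth, a minimal sketch (assuming the module's documented sleep => '5..15' random-interval syntax, and that other options such as autocheck pass through to WWW::Mechanize):

      use strict;
      use warnings;
      use WWW::Mechanize::Sleepy;

      # Pause a random 5-15 seconds between requests; don't die on a failed GET.
      my $mech = WWW::Mechanize::Sleepy->new( sleep => '5..15', autocheck => 0 );

      open TO_FILE, '>>', 'coolness.html' or die "Can't open coolness.html: $!";
      for ( my $i = 0; $i < 300; $i += 15 ) {
          $mech->get("http://www.perlmonks.org/?next=$i;node_id=1044");
          print TO_FILE $mech->content if $mech->success;
      }
      close TO_FILE;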
Re: Cool Uses, re-used
by Your Mother (Archbishop) on Nov 14, 2004 at 18:40 UTC
    Could the for( ) loop be replaced by a while(< >) construct?
    use IO::All;   # needs IO::All::LWP for this to work

    my $io = io->http('http://www.perlmonks.org/');
    while ( my $line = $io->getline ) {
        print $line;
    }
Re: Cool Uses, re-used
by elwarren (Priest) on Nov 15, 2004 at 20:20 UTC
    the for loop could be rewritten as:
    my $path = "http://www.perlmonks.org/?node_id=1044;next=";
    foreach my $i ( 0 .. 20 ) {
        my $url = $path . 15 * $i;
        # ...then fetch and print $url, as in the original loop body
    }
    Going over 300 doesn't error out; it just returns an empty listing and prompts for a new node. So I'm not sure how you could do a while construct unless you checked the length of the page returned and saw it was smaller than a page with listings.
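    Something along those lines might look like this (the 5000-byte cutoff is a pure guess; you'd have to check how big an empty listing page actually is):

    use strict;
    use warnings;
    use LWP::Simple;

    open TO_FILE, '>>', 'coolness.html' or die "Can't open coolness.html: $!";
    my $i = 0;
    while ( defined( my $doc = get("http://www.perlmonks.org/?node_id=1044;next=$i") ) ) {
        last if length($doc) < 5000;   # probably a page with no listings left
        print TO_FILE $doc;
        $i += 15;
        sleep 15;                      # be nice to the server, per davido
    }
    close TO_FILE;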

    You may prefer the print format to the standard layout; add displaytype to your URL: http://www.perlmonks.org/?displaytype=print;node_id=1044;next=100

    As the others have said, you shouldn't pound on the server, and you really should check that each get($url) succeeded (or use is_success on getstore's status code) before concatenating to your file; a sketch is below. And you're not going to get any of the "Read More..." text from these pages either.
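    That check could look something like this (note that is_success() wants an HTTP status code, which getstore() returns; plain get() just returns undef on failure):

    use strict;
    use warnings;
    use LWP::Simple;   # also exports is_success(), per its docs

    my $url = "http://www.perlmonks.org/?next=0;node_id=1044";
    my $rc  = getstore($url, 'cufp_0.html');   # filename is just an example
    warn "fetch of $url failed with status $rc\n" unless is_success($rc);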

    But it's interesting :-) I read through all the CUFP when I first discovered Perl Monks.

    Cheers