
Thanks again for your patience with this. I believe I now have the code running under "use strict", with the help of others here. I just want to reiterate what I am trying to achieve:

In the LWP-based code excerpt below, I am trying to fetch the file at my $plyrurl (line 24) and save it as "$outputdir/$game_$players", which is held in my $filename (line 31), so a file might be saved as "$outputdir/gid_2007_08_06_quiaaa_yucaaa_1_112039.xml". One point of confusion for me is that the original author pulls content from two pages on the way to the destination page (lines 9 and 22). This type of code is called a "spider", so maybe it needs to pull data from each page on the path to the destination page? I just think it would be easier to find a solution with simpler code, if anything can be taken out!
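To make the question about the intermediate fetches concrete, here is a minimal sketch of what the first one appears to be for: the day page looks like a directory listing, and it is the only place the game directory names come from. The sample HTML is made up for illustration (it is not real output from gd2.mlb.com), and extract_games is a hypothetical helper name. The pattern is the same one used in the excerpt below.

```perl
use strict;
use warnings;

# Hypothetical helper: return every game directory linked from a day listing.
# The pattern is the one from the excerpt, so each name keeps its trailing "/".
sub extract_games {
    my ($html) = @_;
    my @games;
    while ($html =~ m/<a href="(gid_\w+\/)"/g) {
        push @games, $1;
    }
    return @games;
}

# Made-up listing, for illustration only.
my $sample_html = <<'HTML';
<li><a href="gid_2007_08_06_quiaaa_yucaaa_1/">gid_2007_08_06_quiaaa_yucaaa_1/</a></li>
<li><a href="gid_2007_08_06_abcaaa_xyzaaa_1/">gid_2007_08_06_abcaaa_xyzaaa_1/</a></li>
<li><a href="other_file.xml">other_file.xml</a></li>
HTML

print "$_\n" for extract_games($sample_html);
```

If that reading is right, the day page fetch cannot be dropped, because the game names are not known in advance. The per-game page fetch, on the other hand, only seems to be used to check for a players.xml link, so requesting players.xml directly and treating a failed response as "no player list" might do the same job with one request fewer.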

That said, I tried to adapt your original reply (the batters example) to this code, but I am getting lost with the character class abbreviations, so I am trying to concatenate the strings instead. The current code returns a "could not open file ./xxx_players/gid_2007_08_06_quiaaa_yucaaa_1/_112039.xml No such file or directory" error. Shouldn't the last part be without the forward slash before '_112039'?

my $sourceurl = "http://gd2.mlb.com/components/game/aaa";
my $outputdir = "./xxx_players";
my $dayurl = "$sourceurl/year_$year/month_$mon/day_$mday/";
print "\t$dayurl\n";

my $response = $browser->get($dayurl);
die "Couldn't get $dayurl: ", $response->status_line, "\n"
    unless $response->is_success;
my $html = $response->content;
my @games = @_;
# collect each game directory link; note that $1 keeps the trailing "/"
while ($html =~ m/<a href="(gid_\w+\/)"/g) {
    push @games, $1;
}

# the loop that downloads data
foreach my $game (@games) {
    my $gameurl = "$dayurl/$game";
    $response = $browser->get($gameurl);
    die "Couldn't get $gameurl: ", $response->status_line, "\n"
        unless $response->is_success;
    my $gamehtml = $response->content;
    if($gamehtml =~ m/<a href=\"players\.xml\"/ ) {
        my $plyrurl = "$dayurl/$game/players.xml";
        $response = $browser->get($plyrurl);
        die "Couldn't get $plyrurl: ", $response->status_line, "\n"
            unless $response->is_success;
        my $plyrhtml = $response->content;
        my $players = 'players.xml';
        my $filename = "$outputdir/$game" . "$players";
        print "\t\tfetching game: ${game}_$players\n";
        open(my $fh, '>', $filename)
            or die "could not open file $filename $!\n";
        print $fh $plyrhtml;    # write the fetched XML, not the file name
        close $fh;
    } else {
        my $players = 'players.xml';
        print "warning: no player list for $game . $players\n";
    }
}
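The stray "/" in the error message looks like it comes from the capture in the regex: (gid_\w+\/) includes the trailing slash, so $game ends in "/" and the path gains an extra directory level that does not exist. Below is a minimal sketch of one way to build the name with plain string operations. build_filename is a hypothetical helper, and the underscore separator is an assumption based on the target name "$outputdir/$game_$players".

```perl
use strict;
use warnings;

# Hypothetical helper: join the output directory, game id and file name
# into one flat path with no extra directory level.
sub build_filename {
    my ($outputdir, $game, $players) = @_;
    (my $gid = $game) =~ s{/+\z}{};    # drop the trailing "/" captured by the regex
    # ${gid} needs the braces; "$gid_$players" would look for a variable named $gid_
    return "$outputdir/${gid}_$players";
}

my $filename = build_filename('./xxx_players', 'gid_2007_08_06_quiaaa_yucaaa_1/', 'players.xml');
print "$filename\n";    # ./xxx_players/gid_2007_08_06_quiaaa_yucaaa_1_players.xml
```

Note that open does not create missing directories, so ./xxx_players has to exist before the loop runs, otherwise the same "No such file or directory" error would appear even with a correct file name.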

In reply to Re^6: How to Save Fetched Web Files as "path/$string.xml" by nase
in thread How to Save Fetched Web Files as "path/$string.xml" by nase
