Thanks again for your patience with this. I believe I have now made the code run under strict, with the help of others here. I just want to reiterate what I am trying to achieve:

In reference to the LWP-based code excerpt below, I am trying to pull the files represented by my $plyrurl (line 24) and save them as "$outputdir/$game_$players", represented by my $filename (line 31), so a file might be saved as "$outputdir/gid_2007_08_06_quiaaa_yucaaa_1_112039.xml". One point of confusion for me is that the original code author pulls content from two pages on the way to the destination page (the $dayurl and $gameurl fetches, lines 9 and 22). This type of code is called a "spider", so maybe it needs to pull data from each page on its path to the destination page? I just think it would be easier to find a solution with simpler code, if anything can be taken out!
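To make the "which fetches are needed" question concrete, here is a minimal sketch of the part that does not touch the network. It assumes the day index page is still needed to discover the gid_... directory names, and that the players.xml URL can then be built directly from each name. The helpers game_dirs and players_url, the sample HTML fragment, and the date in the URL are hypothetical stand-ins, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: list the game directory names linked from a day index page.
sub game_dirs {
    my ($html) = @_;
    # \w+ matches letters, digits and underscores; the slash sits outside
    # the parentheses, so it is matched but not captured.
    # Call this in list context to get every capture back.
    return $html =~ m{<a href="(gid_\w+)/"}g;
}

# Hypothetical helper: build the players.xml URL for one game directory.
sub players_url {
    my ($dayurl, $gid) = @_;
    $dayurl =~ s{/+\z}{};    # avoid a doubled slash when joining
    return "$dayurl/$gid/players.xml";
}

# Made-up index fragment standing in for the real page content.
my $html   = '<a href="gid_2007_08_06_quiaaa_yucaaa_1/">game</a>';
my $dayurl = 'http://gd2.mlb.com/components/game/aaa/year_2007/month_08/day_06/';
print players_url($dayurl, $_), "\n" for game_dirs($html);
```

Whether the per-game page fetch can really be dropped depends on whether a failed request for players.xml is an acceptable replacement for the "no player list" check; that is an assumption here, not something the excerpt confirms.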

That said, I tried to adapt your original reply (the batters example) to this code, but I am getting lost with the character class abbreviations. Because of that, I am trying to concatenate the strings instead. The current code is returning a "could not open file ./xxx_players/gid_2007_08_06_quiaaa_yucaaa_1/_112039.xml No such file or directory" error. Shouldn't the last part be free of the slash before '_112039'?

my $sourceurl = "http://gd2.mlb.com/components/game/aaa";
my $outputdir = "./xxx_players";

my $dayurl = "$sourceurl/year_$year/month_$mon/day_$mday/";
print "\t$dayurl\n";

my $response = $browser->get($dayurl);
die "Couldn't get $dayurl: ", $response->status_line, "\n"
    unless $response->is_success;

my $html  = $response->content;
my @games = @_;
while ( $html =~ m/<a href=\"(gid_\w+\/)\"/g ) {
    push @games, $1;
}

# the loop that downloads data
my $game;
foreach $game (@games) {
    my $gameurl = "$dayurl/$game";
    $response = $browser->get($gameurl);
    die "Couldn't get $gameurl: ", $response->status_line, "\n"
        unless $response->is_success;
    my $gamehtml = $response->content;

    if ( $gamehtml =~ m/<a href=\"players\.xml\"/ ) {
        my $plyrurl = "$dayurl/$game/players.xml";
        $response = $browser->get($plyrurl);
        die "Couldn't get $plyrurl: ", $response->status_line, "\n"
            unless $response->is_success;
        my $plyrhtml = $response->content;

        my $players  = 'players.xml';
        my $filename = "$outputdir/$game" . "$players";
        print "\t\tfetching game: ${game}_$players\n";

        open(FILEHANDLE, ">", "$filename") or die "could not open file $filename $!\n";
        print FILEHANDLE "$game" . "$filename";
        close FILEHANDLE;
    }
    else {
        my $players = 'players.xml';
        print "warning: no player list for $game . $players\n";
    }
}
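For what it is worth, the slash in the error path appears to come from the capture itself: the pattern (gid_\w+\/) keeps the trailing "/" inside the parentheses, so $game ends in a slash and the open is pointed at a subdirectory that does not exist. Below is a minimal sketch of one way to build a flat file name and write the fetched content, rather than the names, to it. The helpers players_filename and save_content are hypothetical, and the placeholder XML string stands in for $plyrhtml.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# Hypothetical helper: turn a captured directory name such as
# "gid_2007_08_06_quiaaa_yucaaa_1/" into a flat output file name.
sub players_filename {
    my ($outputdir, $game) = @_;
    (my $gid = $game) =~ s{/+\z}{};    # drop the trailing slash kept by the capture
    # ${gid} needs the braces: "$gid_players" would be read as one variable name.
    return "$outputdir/${gid}_players.xml";
}

# Hypothetical helper: write already-fetched content to $filename.
sub save_content {
    my ($filename, $content) = @_;
    open(my $fh, '>', $filename) or die "could not open file $filename: $!\n";
    print {$fh} $content;
    close($fh) or die "could not close file $filename: $!\n";
    return;
}

my $outputdir = './xxx_players';
make_path($outputdir);    # create the output directory if it is missing
my $filename = players_filename($outputdir, 'gid_2007_08_06_quiaaa_yucaaa_1/');
save_content($filename, "<game></game>\n");    # stand-in for $plyrhtml
print "$filename\n";
```

The same brace rule applies to a string like "$outputdir/$game_$players": under strict, Perl looks for a variable called $game_ unless it is written as ${game}_$players.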

In reply to Re^6: How to Save Fetched Web Files as "path/$string.xml" by nase
in thread How to Save Fetched Web Files as "path/$string.xml" by nase
