'2007_08_06_quiaaa_yucaaa_1' is the last part of the web URL I linked to, and it uniquely identifies each game. I will look into interpolation and the other suggestions. Many thanks, as always.
OK, I believe interpolation is what I tried to do with the string, but my question is which approach best performs that interpolation. I'm not even sure that an output filename can contain an interpolated variable. $game in the filename is the portion of the URL that I emphasized earlier.
use warnings;
use strict;
my $root = 'http://gd2.mlb.com/components/game/aaa/year_2007/month_08/day_06/gid_2007_08_06_quiaaa_yucaaa_1/batters/';
my @batters = map {"$root$_"} qw(112039.xml 120107.xml);
for my $url (@batters) {
    my ($prefix, $file) = $url =~ m!/(gid\w*)/batters/(.*)!;
    print "${prefix}_$file\n";
}
Prints:
gid_2007_08_06_quiaaa_yucaaa_1_112039.xml
gid_2007_08_06_quiaaa_yucaaa_1_120107.xml
The double-quoted string following the print can drop straight into your open: remove the \n at the end, use the three-parameter form of open, and check the result.
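A minimal sketch of that advice, assuming the URLs have the same shape as above. The helper names output_name and save_text are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Build "<gid>_<file>" from a batter URL. Returns undef if the URL
# does not have the expected .../gid_.../batters/<file> shape.
sub output_name {
    my ($url) = @_;
    my ($prefix, $file) = $url =~ m!/(gid\w*)/batters/(.*)!
        or return;
    return "${prefix}_$file";
}

# Write $text to $path with a three-argument open and a checked result.
sub save_text {
    my ($path, $text) = @_;
    open(my $fh, '>', $path) or die "Could not open $path: $!\n";
    print {$fh} $text;
    close($fh) or die "Could not close $path: $!\n";
    return $path;
}
```

Usage would look something like save_text("$outputdir/" . output_name($url), $content), where $outputdir and $content come from your own script.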
DWIM is Perl's answer to Gödel
This post now includes, at the bottom, the actual code that I had not provided earlier.
I have not yet been able to work on the script for those files specifically (that one is next). However, I am trying to apply the interpolation method to a similar script (in my scratchpad) that I've already built, using the same gid URL. Please note that I am unable to 'use strict', as it produces too many conflicts with the Time::Local module I'm using in another part of the code.
This three-parameter open works as you suggested, but it does not produce the filename I want:
open(FILEHANDLE, ">","$outputdir/players.xml")
This does not work:
open(FILEHANDLE, ">","$outputdir/$game._players.xml")
I'm trying to open a file with a name such as $outputdir/$game_players.xml, where $game="gid....". The only difference from the last problem is that '110246' is now 'players'. I've tried my best to adapt the code, but I keep hitting roadblocks. The important part of the code is now in my scratchpad. Note that I am using LWP to pull data from the web.
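For reference, a small sketch of how the variable name can be delimited inside a double-quoted string. The helper name players_path is hypothetical, and the game id is passed in without a trailing slash, which is an assumption about the value of $game.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "<dir>/<game>_players.xml".
sub players_path {
    my ($outputdir, $game) = @_;
    # "$game_players.xml" would look up a variable named $game_players,
    # and "$game._players.xml" keeps a literal dot in the name.
    # Braces mark exactly where the variable name ends.
    return "$outputdir/${game}_players.xml";
}

print players_path('./aaa_players', 'gid_2007_08_06_quiaaa_yucaaa_1'), "\n";
# prints ./aaa_players/gid_2007_08_06_quiaaa_yucaaa_1_players.xml
```

This is the same ${prefix}_$file form used in the earlier reply, applied to $game.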
Thanks as always. Edit: The error I am getting is the 'could not open' message from the die near the bottom of the code. I also read a little about the File::Spec module, which deals with directory and file names, but I could not find anything beyond the brief descriptions in the Perl documentation. It could be irrelevant.
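On File::Spec: it is a core module, and its catfile method joins directory and file names using the current platform's separator. It is optional here, since plain "/" in a path generally works, but a sketch might look like this (players_path_portable is a made-up name):

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: join the output directory and the per-game
# filename with the separator appropriate for the current platform.
sub players_path_portable {
    my ($outputdir, $game) = @_;
    return File::Spec->catfile($outputdir, "${game}_players.xml");
}

print players_path_portable('aaa_players', 'gid_2007_08_06_quiaaa_yucaaa_1'), "\n";
```

Note that catfile also tidies the path it returns, so the result may not be character-for-character what was passed in.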
my $sourceurl = "http://gd2.mlb.com/components/game/aaa";
my $outputdir = "./aaa_players";
my $dayurl = "$sourceurl/year_$year/month_$mon/day_$mday/";
my $response = $browser->get($dayurl);
die "Couldn't get $dayurl: ", $response->status_line, "\n"
    unless $response->is_success;
my $html = $response->content;
my @games = @_;
# collect the gid_... links from the day listing (each capture keeps its trailing slash)
while ($html =~ m/<a href=\"(gid_\w+\/)\"/g) {
    push @games, $1;
}
# the loop that downloads data
foreach $game (@games) {
    my $gameurl = "$dayurl/$game";
    $response = $browser->get($gameurl);
    die "Couldn't get $gameurl: ", $response->status_line, "\n"
        unless $response->is_success;
    $gamehtml = $response->content;
    if ($gamehtml =~ m/<a href=\"players\.xml\"/) {
        $plyrurl = "$dayurl/$game/players.xml";
        $response = $browser->get($plyrurl);
        die "Couldn't get $plyrurl: ", $response->status_line, "\n"
            unless $response->is_success;
        $plyrhtml = $response->content;
        print "\t\tfetching game: $game\n";
        # $! holds the system error message ($| is the autoflush flag)
        open(FILEHANDLE, ">", "$outputdir/players.xml")
            or die "could not open file $outputdir/players.xml: $!\n";
        print FILEHANDLE "$game\n";
        close FILEHANDLE;
    } else {
        print "warning: no player list for $game\n";
    }
}
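One thing worth checking in the code above: the pattern captures gid_\w+\/ with the slash included, so each $game ends in "/". Interpolated into a filename, that slash turns the game id into a directory component, and open will fail if that directory does not exist. A minimal sketch of stripping it, where clean_game_id is a hypothetical helper and the HTML string is a made-up sample rather than the real listing:

```perl
use strict;
use warnings;

# Hypothetical helper: drop any trailing slash from a captured game id.
sub clean_game_id {
    my ($game) = @_;
    $game =~ s{/+\z}{};
    return $game;
}

# Made-up sample standing in for the day listing HTML.
my $html = '<a href="gid_2007_08_06_quiaaa_yucaaa_1/">game</a>';

my @games;
while ($html =~ m{<a href="(gid_\w+/)"}g) {
    push @games, clean_game_id($1);
}
print "$_\n" for @games;
# prints gid_2007_08_06_quiaaa_yucaaa_1
```

With the slash removed, "$outputdir/${game}_players.xml" names a file directly inside $outputdir, which still has to exist before open is called.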