appletag has asked for the wisdom of the Perl Monks concerning the following question:

Scope: Extract a clean forecast from this url:
http://weather.noaa.gov/pub/data/forecasts/zone/pa/paz049.txt

I use HTTPGET on a WinXP Pro box to download the file. I was given this code (from Chatterbox users) to delete the first 16 lines:
perl -n -i.bak -e "print if $.>16" paz049.txt

Two Chatterbox users (bart and diotalevi) suggested a solution using:
print if s/^\.//

The "good" lines of the forecast all have a period at the beginning. I'm assuming this could be used to "extract" the good lines.
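As a rough illustration of what that substitution does, here is a minimal sketch run on made-up input. The header and footer lines and the helper name dot_lines are assumptions for the example; only the leading-dot convention comes from the post above.

```perl
use strict;
use warnings;

# Made-up sample input; only the leading-dot convention comes from the post.
my @raw = (
    "HEADER LINE (MADE UP)\n",
    "\n",
    ".TONIGHT...MOSTLY CLEAR.\n",
    ".WEDNESDAY...MOSTLY SUNNY.\n",
    "FOOTER LINE (MADE UP)\n",
);

# Hypothetical helper: keep only lines that start with a dot, minus the dot.
sub dot_lines {
    my @in = @_;                    # copy, so the caller's data is untouched
    return grep { s/^\.// } @in;    # s/// is true only when a dot was removed
}

print dot_lines(@raw);
```

The one-liner form would presumably be perl -n -i.bak -e "print if s/^\.//" paz049.txt, reusing the switches from the command above. Note that a filter like this drops any line that does not start with a dot, including continuation lines of a wrapped sentence.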

The .txt file is used for a weather crawler on in-store digital signage (think WalMart TV). I'm using a batch file to execute the HTTPGET and the lines of Perl.

Any help for this Perl NOOB is greatly appreciated.

Ben

Replies are listed 'Best First'.
Re: Editing a text file
by bart (Canon) on Nov 20, 2006 at 20:31 UTC
    I disagree with diotalevi: not all "interesting lines" start with a dot. I can see that those lines can be wrapped. An example of the file I just downloaded using wget:
    .WEDNESDAY...MOSTLY SUNNY. HIGHS IN THE LOWER 50S. EAST WINDS 5 TO 10 MPH. .WEDNESDAY NIGHT...MOSTLY CLEAR. LOWS IN THE UPPER 20S. EAST WINDS 5 TO 10 MPH. .THANKSGIVING DAY...PARTLY SUNNY. HIGHS IN THE LOWER 50S.
    See the wrapping in or around the wind speeds?

    I think one easy way to unwrap such a short file is to first collect the block you want, and then split on the dots that start a line. Something like:

    @ARGV = 'paz049.txt';
    my $buffer = '';
    while(<>) {
        if(s/^\.// .. /^$/) {
            $buffer .= $_;
        }
    }
    $buffer =~ s/\s+$//;
    $buffer =~ s/\n(?!\.)/ /g;
    my @lines = split /\s*\n\./, $buffer;

    If I now print out all the data in @lines, I seem to get the proper result:
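The following is not the original output, only a self-contained sketch of the same unwrapping idea applied to an inline sample. The header and trailer lines, the positions of the line wraps, and the helper name unwrap_forecast are assumptions made for the example. It uses an explicit flag instead of the flip-flop operator, and stops at the first blank line after the forecast block starts.

```perl
use strict;
use warnings;

# Sample text; header, trailer, and wrap positions are made up for illustration.
my $sample = <<'END';
ZONE FORECAST HEADER (MADE UP)
ANOTHER HEADER LINE

.WEDNESDAY...MOSTLY SUNNY. HIGHS IN THE LOWER 50S. EAST WINDS 5 TO
10 MPH.
.WEDNESDAY NIGHT...MOSTLY CLEAR. LOWS IN THE UPPER 20S. EAST WINDS
5 TO 10 MPH.
.THANKSGIVING DAY...PARTLY SUNNY. HIGHS IN THE LOWER 50S.

TRAILER LINE (MADE UP)
END

# Hypothetical helper: returns one string per forecast period.
sub unwrap_forecast {
    my ($text) = @_;
    my $buffer   = '';
    my $in_block = 0;
    for my $line (split /^/, $text) {
        if (!$in_block) {
            # The block starts at the first line with a leading dot;
            # that first dot is removed, later ones are kept for the split.
            next unless $line =~ s/^\.//;
            $in_block = 1;
        }
        elsif ($line =~ /^$/) {
            last;                       # a blank line ends the block
        }
        $buffer .= $line;
    }
    $buffer =~ s/\s+$//;                # trim trailing whitespace
    $buffer =~ s/\n(?!\.)/ /g;          # join wrapped continuation lines
    return split /\s*\n\./, $buffer;    # split where a new period begins
}

print "$_\n" for unwrap_forecast($sample);
```

With this sample, each of the three forecast periods should come out as a single line without its leading dot.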

      Thanks bart, I'm evaluating Geo::WeatherNOAA but I want to learn more about perl. I look forward to trying out this code. BR
Re: Editing a text file
by SheridanCat (Pilgrim) on Nov 20, 2006 at 21:48 UTC
    Have you looked at Geo::WeatherNOAA on CPAN? According to the POD:
    This module is intended to interpret the NOAA zone forecasts and current city hourly data files. It should give a programmer an easy time to use the data instead of having to mine it.
      Thanks! Penfold suggested this module earlier. I loaded the other prereq's and I'm trying it out. I'll let everyone know how it goes.
Re: Editing a text file
by madbombX (Hermit) on Nov 20, 2006 at 20:32 UTC
      I look forward to checking out the links you provided. I'm really enjoying this exercise. When a script has a print command in it, how can you send that to a file?
        When a script has a print command in it, how can you send that to a file?

        From inside the script, you can print to any file that's been opened for output just by specifying the filehandle in the print command, before the first argument:

        open SOMEFILE, '>', $filename;
        # Then...
        print SOMEFILE "Blah, blah, blah";

        Alternately, from outside the script, when calling it, most operating environments provide a way to redirect the standard output of the script to a file. For instance, in a POSIX-like environment you can use tee or a shell redirection operator (usually > in most Unix-style shells, and also in DOS, OS/2, and NT, among others). If you need more specific information about this, you'll have to specify more details about your operating environment.
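A slightly fuller sketch of the in-script approach, assuming a hypothetical output file called forecast.txt and two made-up lines of text. It uses a lexical filehandle and checks that open and close succeeded; the helper name write_lines is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical file name and sample lines, for illustration only.
my $filename = 'forecast.txt';
my @lines    = ('WEDNESDAY...MOSTLY SUNNY.', 'WEDNESDAY NIGHT...MOSTLY CLEAR.');

# Hypothetical helper: write each string to $path on its own line.
sub write_lines {
    my ($path, @out) = @_;
    open my $fh, '>', $path
        or die "Cannot open $path for writing: $!";
    print {$fh} "$_\n" for @out;        # filehandle goes before the list
    close $fh
        or die "Cannot close $path: $!";
    return scalar @out;                 # number of lines written
}

write_lines($filename, @lines);
```

For the outside-the-script route, a script that simply prints to standard output could be run as perl forecast.pl > forecast.txt from a batch file, where forecast.pl is a placeholder name for your script.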


        Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. You can just call me "Mister Sanity". Why, I've got so much sanity it's driving me crazy.
Re: Editing a text file
by Not_a_Number (Prior) on Nov 20, 2006 at 20:34 UTC

    No idea what you mean when you say "think Walmart TV", but maybe this could do something like what you seem to require (without the need for batch files):

    use strict;
    use warnings;
    use LWP::Simple;

    my $url = 'http://weather.noaa.gov/pub/data/forecasts/zone/pa/paz049.txt';
    my @content = grep s/^\.//, split "\n", get $url
        or die "Couldn't get it!\n";
    print join "\n", @content;
      Walmart TV is the in-house TVs running advertisements. Our system has ads and a weather ticker. I'm currently evaluating Get::WeatherNOAA but I want to learn some Perl, so I'm going to try what you sent. I have LWP::Simple loaded, so trying that code should be pretty easy. I need to learn how to handle output now. Thanks!
        correction: Geo::WeatherNOAA