A string parsing question

LordAvatar has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am writing an application which parses information
from National Weather Service warnings.
The top 3 lines of each warning message appear as follows:

WUUS52 KGSP 011455
SVRGSP
NCC045-071-109-SCC021-087-091-011545-
The third line contains a series of state/county codes. If the warning is issued for multiple counties in the same state, only the first county in that state carries the state abbreviation as a prefix. In the above example, counties 045, 071 and 109 in North Carolina are under a severe thunderstorm warning. Additional states and their counties follow in the same way, as in the example above. The last 6 digits are the expiration time in DDHHMM format.
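To make the DDHHMM field concrete, here is a minimal sketch that pulls the expiration off the end of the third line. The name `parse_expiration` is hypothetical, and the sketch assumes the field is always six digits followed by the closing hyphen, as in the sample above.

```perl
use strict;
use warnings;

# Assumed sample: the third line from the warning shown above.
my $line = 'NCC045-071-109-SCC021-087-091-011545-';

# Hypothetical helper: split the trailing DDHHMM field into its parts.
sub parse_expiration {
    my ($codes) = @_;
    # Exactly six digits, then the final hyphen, at the end of the line.
    if ($codes =~ /(?<!\d)(\d{2})(\d{2})(\d{2})-\s*$/) {
        return { day => $1, hour => $2, minute => $3 };
    }
    return;    # no expiration field found
}

my $exp = parse_expiration($line);
printf "day %s, %s:%s\n", @{$exp}{qw(day hour minute)} if $exp;
```

For the sample line this should report day 01 and time 15:45.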

I am trying to parse the third line and match the intermediate county codes to their parent state, i.e. 071 and 109 to NC.

Here is a snippet of code that I've written which reads the first and third lines. I've tried a few regexes for the third line but I haven't found a way to reliably add the state abbreviation to the intermediate codes. Right now I just get a list of all the codes like this:

NCC045 071 109 SCC021 087..etc.
I am trying to create:
NCC045 NCC071 NCC109 SCC021...etc.

Thanks for any help!

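LINE: while (defined($line = <F>)) {
    chomp $line;
    $count++;
    # first line: capture the six-digit issue time
    if ($line =~ /\w{4}\d{2}\s+\w{4}\s+(\d{6})/) {
        $issueTime = $1;
        push @warningInfo, $issueTime;
    }
    # third line: split the county codes on hyphens
    if ($count == 3) {
        push @warningInfo, split /-/, $line;
        $count = 0;
        next FILE;    # FILE is presumably the label of an enclosing loop not shown here
    }
    else { next; }
}
close F;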


Replies are listed 'Best First'.
Re: A string parsing question by suaveant (Parson) on Apr 20, 2001 at 19:12 UTC
Not too hard...

    # I just put it in $_ for example's sake.
    $_ = 'NCC045-071-109-SCC021-087-091-011545-';
    @foo = split '-', $_;
    until ($exp = pop @foo) {};
    my $code;
    for (@foo) {
        if (s/^([A-Z]+)//) {    # if there are letters at the front
            $code = $1;
        }
        push @{$data{$code}}, $_;
        # or do print $string_with_code = "$code$_\n";
    }

Tested, it works... The expiration is in $exp (assuming it is ALWAYS the last item). The for loop looks at each item, strips off the code and stores it as the current code; then you can do as you like with $_, which contains the number.

- Ant
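The same split-and-carry idea can be wrapped in a subroutine that returns the full codes the original question asked for. This is only a sketch under `use strict`; the name `expand_county_line` is hypothetical, and it assumes, like the reply above, that the expiration is always the last non-empty field.

```perl
use strict;
use warnings;

# Hypothetical helper built on the split-and-carry approach above.
sub expand_county_line {
    my ($line) = @_;
    my @fields  = grep { length } split /-/, $line;
    my $expires = pop @fields;    # assumes the expiration is always last
    my ($prefix, @codes);
    for my $field (@fields) {
        # A leading run of letters starts a new state; remember it.
        if ($field =~ s/^([A-Z]+)//) {
            $prefix = $1;
        }
        next unless defined $prefix;    # skip digits seen before any state
        push @codes, "$prefix$field";
    }
    return ($expires, @codes);
}

my ($expires, @codes) =
    expand_county_line('NCC045-071-109-SCC021-087-091-011545-');
print "@codes\n";
print "expires $expires\n";
```

For the sample line this should print NCC045 NCC071 NCC109 SCC021 SCC087 SCC091, followed by the expiration 011545.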
Re: A string parsing question by Sifmole (Chaplain) on Apr 20, 2001 at 19:35 UTC
I just realized I missed an aspect of your question... Here is one that actually works. :)

    my $data = 'NCC045-071-109-SCC021-087-091-011545-';
    my $state;
    while ($data =~ m/([A-Z]+)?(\d+)/g) {
        last if (length($2) == 6);
        $state = $1 unless (! defined $1);
        print $state, $2, ' ';
    }

Finally, that should be better.
Re: A string parsing question by jeroenes (Priest) on Apr 20, 2001 at 19:13 UTC
Try

    $line =~ s/([A-Z]{3})(\d+)-(\d+)-(\d+)-([A-Z])/$1$2 $1$3 $1$4 $5/g;

"We are not alone"(FZ)
Re: Re: A string parsing question by suaveant (Parson) on Apr 20, 2001 at 19:23 UTC
Only problem with that is if there are 4 or 2 counties... which I assume can happen. It's not a great fit for a single regexp... better handled in code. - Ant
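For comparison, a substitution can cope with a variable number of counties if the replacement side is evaluated as code. The following is a sketch rather than a recommendation; `expand_codes` is a hypothetical name, and it assumes county codes are always three digits and state prefixes always three letters, as in the sample line.

```perl
use strict;
use warnings;

# Hypothetical helper: expand every state group in one substitution.
sub expand_codes {
    my ($line) = @_;
    # A three-letter prefix followed by one or more 3-digit county codes.
    # The (?!\d) keeps the 6-digit expiration out of the county list.
    $line =~ s{([A-Z]{3})(\d{3}(?:-\d{3}(?!\d))*)}{
        my ($state, $counties) = ($1, $2);
        join '-', map { "$state$_" } split /-/, $counties;
    }ge;
    return $line;
}

print expand_codes('NCC045-071-109-SCC021-087-091-011545-'), "\n";
```

The expiration field is left untouched at the end of the returned string, so it still has to be split off separately.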
Re: A string parsing question by LordAvatar (Acolyte) on Apr 20, 2001 at 22:27 UTC
Hello fellow Monks, Thanks for your help! -LordAvatar
Re: A string parsing question by Sifmole (Chaplain) on Apr 20, 2001 at 19:26 UTC
IGNORE

Or..

    my $data = 'NCC045-071-109-SCC021-087-091-011545-';
    $data =~ s/([A-Z]+)//;
    my $state = $1;
    my @counties = split('-', $data);
    print join(' ', map { $state.$_; } @counties), "\n";

Runs, and will handle any number of codes on the same line.
Re: Re: A string parsing question by suaveant (Parson) on Apr 20, 2001 at 19:29 UTC
Ahhh, but it only catches the first state code... the 4th item becomes NCCSCC021, does it not? And the expiry date has NCC prepended, too... - Ant
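The failure is easy to see by printing each item separately. This sketch repeats the split-and-map approach from the post being replied to, with only the variable names `$rest` and `@items` introduced for illustration.

```perl
use strict;
use warnings;

my $data = 'NCC045-071-109-SCC021-087-091-011545-';

# The substitution removes only the first run of letters.
(my $rest = $data) =~ s/([A-Z]+)//;
my $state = $1;

# Every remaining field gets the first state's prefix, including
# the second state's first county and the expiration field.
my @items = map { "$state$_" } split /-/, $rest;
print "$_\n" for @items;
```

The fourth line of output should be NCCSCC021 and the last NCC011545, which are the two problems described above.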
Re: Re: Re: A string parsing question by Sifmole (Chaplain) on Apr 20, 2001 at 19:36 UTC
Thanks for pointing that out. I was never too good at those reading comprehension thingies.
Re: Re: A string parsing question by Sifmole (Chaplain) on Apr 20, 2001 at 19:29 UTC
Ignore... this is wrong

Another alternative if you don't like the map and split.

    my $data = 'NCC045-071-109-SCC021-087-091-011545-';
    $data =~ s/([A-Z]+)//;
    my $state = $1;
    print $state, $1, ' ' while ($data =~ m/(\d+)/g);