match tags

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
while(@data = <DATA>){
$au='';
$by='';
 for($i=0;$i<$#data;$i++){
if($data[$i]=~ m/^\.\.AU/){
                unless($data[$i+1]=~ /^\.\./){
                        $data[$i+1] =~ s/^\s+|\s+$//;
                        $au = $data[$i+1];
                }
        }
if($data[$i]=~ m/^\.\.BD/){
                unless($data[$i+1]=~ /^\.\./){
                        $data[$i+1] =~ s/^\s+|\s+$//;
                        $by = $data[$i+1];


                }
        }

}
print "$au,$by";
}

__DATA__
..HD:
..SE:
..AU:
C
..BD:
ON PAC
..BD:
BY PK
..SE:
..AU:
R CHRIS
..BD:
ON PAC-20 FOOTBALL
..BD:
ON PAC-30 BASKETBALL
..AU:
DK

How can I pair each AU with its BD values and print all of them? If an AU has corresponding BD values, they should be concatenated with ",", unless some other tag matching "..text:" is found. If no BD is found for an AU, no comma should be added. The output should look like:

C,ON PAC,BY PK
R CHRIS,ON PAC-20 FOOTBALL,ON PAC-30 BASKETBALL
DK


Re: match tags by jethro (Monsignor) on Nov 09, 2009 at 10:53 UTC
Create a variable (let's call it $collect) in which you concatenate any BDs. If you find anything else, print and clear $collect. If that 'anything else' is an AU, initialize $collect with it. In essence:

`if ($data[$i] =~ m/^\.\.BD/) { unless ... { $collect .= ',' . $data[$i+1]; } } else { print $collect, "\n" if ($collect); $collect = ''; if ($data[$i] =~ m/^\.\.AU/) { unless ... { $collect = $data[$i+1]; } } }`

You should be able to adapt the rest.

Note that you need to add the g modifier to `s/^\s+|\s+$//` so that both leading and trailing whitespace can be removed from the same string; without it, only the first match is removed.

If you need error detection (because there might be files that have a BD with no AU before it), you can test for that too: finding a BD while $collect is empty is the condition for printing an error message.

Also, your script has no `use warnings;` and `use strict;`. While not necessary for a working program, those lines are strongly recommended, and you will hear that advice every time you post something here unless you add them.
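For reference, here is one way the $collect idea might be fleshed out into something runnable. It is only a sketch under a few assumptions: the helper name `collect_records` is made up, each tag's value (if any) sits on the line right after the tag, and the value line is consumed together with its tag so that it does not count as "anything else".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one "AU,BD,BD" string
# per AU, following the "collect BDs, flush on anything else" idea.
sub collect_records {
    my @lines   = @_;
    my @records;
    my $collect = '';    # current AU plus any BDs appended so far
    my $i       = 0;

    while ($i < @lines) {
        my $line = $lines[$i++];
        next unless $line =~ /^\.\.(\w+):/;    # skip anything that is not a tag
        my $tag = $1;

        # Assumption: the value, if present, is the next non-tag line.
        my $value;
        if ($i < @lines && $lines[$i] !~ /^\.\./) {
            $value = $lines[$i++];
            $value =~ s/^\s+|\s+$//g;          # /g trims both ends
        }

        if ($tag eq 'BD') {
            next unless defined $value;
            if ($collect eq '') {
                warn "BD without a preceding AU: $value\n";
            }
            else {
                $collect .= ",$value";
            }
        }
        else {
            # Any other tag ends the current record.
            push @records, $collect if $collect ne '';
            $collect = ($tag eq 'AU' && defined $value) ? $value : '';
        }
    }
    push @records, $collect if $collect ne '';    # flush the last record
    return @records;
}

my @sample = ('..SE:', '..AU:', ' C ', '..BD:', 'ON PAC', '..BD:', 'BY PK',
              '..SE:', '..AU:', 'DK');
print "$_\n" for collect_records(@sample);
# prints:
# C,ON PAC,BY PK
# DK
```

Returning the records instead of printing them inside the loop keeps the final flush in one place and makes the BD-without-AU check easy to spot.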
Re: match tags by arun_kom (Monk) on Nov 09, 2009 at 12:03 UTC
A simple approach:

`use strict; use warnings; my $found = ''; while(<DATA>){ chomp; if($_=~ /^\.\.AU:$/) { $found = 'AU'; next; } if($found eq 'AU'){ print "\n".$_; } if($_=~ /^\.\.BD:$/) { $found = 'BD'; next; } if($found eq 'BD'){ print ','.$_; } $found = ''; }`

UPDATE: Could have been shorter ;)

`my $tag; while(<DATA>){ chomp; if(/([A-Z]{2}):$/){ $tag = $1; next; } if($tag eq 'AU'){ print "\n".$_; } if($tag eq 'BD'){ print ','.$_; } }`
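Both versions above print the newline before each AU, so the output starts with an empty line and ends without one. A possible variation on the same "remember the last tag" idea is sketched below. The helper name `format_records` is hypothetical, `$tag` starts as an empty string to avoid "uninitialized value" warnings, and the input is assumed to have no trailing whitespace after the tags.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): builds the output as a string,
# ending each record with a newline instead of starting it with one.
sub format_records {
    my ($text) = @_;
    my $tag = '';    # last tag seen; '' keeps "eq" quiet under warnings
    my $out = '';

    for my $line (split /\n/, $text) {
        if ($line =~ /^\.\.([A-Z]{2}):$/) {
            $tag = $1;
            next;
        }
        if ($tag eq 'AU') {
            $out .= "\n" if length $out;    # close the previous record
            $out .= $line;
        }
        elsif ($tag eq 'BD') {
            $out .= ",$line";
        }
    }
    $out .= "\n" if length $out;            # close the last record
    return $out;
}

print format_records(join "\n", '..AU:', 'C', '..BD:', 'ON PAC',
                                '..BD:', 'BY PK', '..AU:', 'DK');
# prints:
# C,ON PAC,BY PK
# DK
```

Like the original, this does not check for a BD that appears before any AU; such input would yield a record starting with a comma.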
Re: match tags by shmem (Chancellor) on Nov 09, 2009 at 16:19 UTC
`while (@data = <DATA>) { ... }` doesn't make sense. The expression `@data = <DATA>` already reads all lines from the filehandle `DATA` into `@data`, so the loop body runs only once; on the next test the assignment yields an empty list and the loop ends. Drop the `while`.

Instead of `for($i=0;$i<$#data;$i++){` better say `for my $line (@data) {`.

That said, I'd do

`my @tokens; while (<DATA>) { if (/\.\.(AU|BD):/) { if ($1 eq 'AU') { print join( ',',@tokens),"\n", @tokens = () if @tokens; } chomp (my $line = <DATA>); push @tokens, $line; } } print join( ',',@tokens),"\n", @tokens = () if @tokens;`
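The point about `while (@data = <DATA>)` can be checked with a small experiment. The sketch below uses an in-memory filehandle in place of `DATA` (an assumption made only to keep the example self-contained) and counts how often the loop body runs; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "..AU:\nC\n..BD:\nON PAC\n";

# List context: <$fh> returns every remaining line at once, so the body
# runs a single time; the second read returns an empty list and ends the loop.
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my $passes = 0;
my @data;
while (@data = <$fh>) {
    $passes++;
}
close $fh;
print "list-context loop ran $passes time(s)\n";      # 1

# Scalar context: one line per iteration, which is what a while loop
# over a filehandle is normally meant to do.
open $fh, '<', \$text or die "cannot open in-memory handle: $!";
my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;
}
close $fh;
print "scalar-context loop read ", scalar(@lines), " line(s)\n";   # 4
```

Note that after the first loop `@data` is empty again, because the final, failing assignment overwrote it. That is one more reason to slurp once with a plain `my @data = <DATA>;` or to read line by line.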