Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
while(@data = <DATA>){
    $au='';
    $by='';
    for($i=0;$i<$#data;$i++){
        if($data[$i]=~ m/^\.\.AU/){
            unless($data[$i+1]=~ /^\.\./){
                $data[$i+1] =~ s/^\s+|\s+$//;
                $au = $data[$i+1];
            }
        }
        if($data[$i]=~ m/^\.\.BD/){
            unless($data[$i+1]=~ /^\.\./){
                $data[$i+1] =~ s/^\s+|\s+$//;
                $by = $data[$i+1];
            }
        }
    }
    print "$au,$by";
}
__DATA__
..HD:
..SE:
..AU:
C
..BD:
ON PAC
..BD:
BY PK
..SE:
..AU:
R CHRIS
..BD:
ON PAC-20 FOOTBALL
..BD:
ON PAC-30 BASKETBALL
..AU:
DK

How can I pair each AU with its BD values and print all of them? If an AU has corresponding BD values, they should be joined to it with ",", stopping as soon as any other tag of the form "..text:" is found. If no BD is found for an AU, no comma should be added. The output should look like:
C,ON PAC,BY PK
R CHRIS,ON PAC-20 FOOTBALL,ON PAC-30 BASKETBALL
DK

Replies are listed 'Best First'.
Re: match tags
by jethro (Monsignor) on Nov 09, 2009 at 10:53 UTC
    Create a variable (let's call it $collect) in which you concatenate any BDs. If you find anything else, print and clear $collect. If that 'anything else' is an AU, initialize $collect with it. In essence:
    if ($data[$i]=~ m/^\.\.BD/) {
        unless ... {
            $collect .= ',' . $data[$i+1];
        }
    } else {
        print $collect,"\n" if ($collect);
        $collect='';
        if ($data[$i]=~ m/^\.\.AU/) {
            unless ... {
                $collect= $data[$i+1];
            }
        }
    }

    You should be able to adapt the rest.
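
For anyone who wants to see the $collect idea carried through, here is one possible adaptation. It is only a sketch: the sub name collect_records and the @sample list are made up for illustration, and it assumes every tag sits on its own line with its value on the following line. Value lines are skipped explicitly, since otherwise they would count as 'anything else' and flush $collect too early.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one joined string per AU.
sub collect_records {
    my @data = @_;
    my @records;
    my $collect = '';
    for (my $i = 0; $i < @data; $i++) {
        # Only tag lines drive the logic; value lines are read via $i+1 below.
        next unless $data[$i] =~ /^\.\.(\w+):/;
        my $tag = $1;

        # The value is the next line, unless it is missing or is itself a tag.
        my $value;
        if ($i < $#data && $data[$i+1] !~ /^\.\./) {
            ($value = $data[$i+1]) =~ s/^\s+|\s+$//g;   # trims newline too
        }

        if ($tag eq 'BD') {
            # Append only when an AU has already started a record.
            $collect .= ',' . $value if defined $value && $collect ne '';
        }
        else {
            # Any other tag ends the current record.
            push @records, $collect if $collect ne '';
            $collect = '';
            $collect = $value if $tag eq 'AU' && defined $value;
        }
    }
    push @records, $collect if $collect ne '';   # flush the last record
    return @records;
}

# Sample input mirroring the question's __DATA__ section.
my @sample = split /\n/, <<'END';
..HD:
..SE:
..AU:
C
..BD:
ON PAC
..BD:
BY PK
..SE:
..AU:
R CHRIS
..BD:
ON PAC-20 FOOTBALL
..BD:
ON PAC-30 BASKETBALL
..AU:
DK
END

print "$_\n" for collect_records(@sample);
```

Comparing $collect against '' rather than testing its truth keeps a record whose only content is "0" from being dropped. With a real file you could pass <DATA> (or any list of lines) instead of @sample.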

    Note that you would need to add a g modifier to s/^\s+|\s+$// so that both leading and trailing whitespace are removed from the same string. Without it, the substitution stops after the first match, so a string with leading whitespace keeps its trailing whitespace.
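
A quick way to see the difference, as a small sketch (the variable names are made up for the demonstration):

```perl
use strict;
use warnings;

my $padded = "  R CHRIS  \n";

# Copy the string, then substitute on the copy.
(my $without_g = $padded) =~ s/^\s+|\s+$//;    # stops after the first match
(my $with_g    = $padded) =~ s/^\s+|\s+$//g;   # keeps going until nothing matches

print "[$without_g]";     # leading spaces gone, trailing spaces and newline remain
print "[$with_g]\n";      # both ends trimmed, newline included
```

Since \s also matches the newline, the /g version makes a separate chomp unnecessary for these value lines.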

    If you need error detection (because there might be files that have a BD with no AU before it), you can test for that too. Finding a BD while $collect is empty is the condition for printing an error message.
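
One way such a check might look, as a sketch only. The sub name orphan_bd_lines is hypothetical, and a simple flag stands in for "$collect is not empty", which assumes every AU actually carries a value.

```perl
use strict;
use warnings;

# Hypothetical checker: returns the 1-based line numbers of BD tags
# that have no AU to attach to.
sub orphan_bd_lines {
    my @lines = @_;
    my $have_au = 0;    # true while an AU record is open
    my @orphans;
    for my $n (0 .. $#lines) {
        next unless $lines[$n] =~ /^\.\.(\w+):/;   # skip value lines
        if    ($1 eq 'AU') { $have_au = 1 }
        elsif ($1 eq 'BD') { push @orphans, $n + 1 unless $have_au }
        else               { $have_au = 0 }        # any other tag closes the record
    }
    return @orphans;
}

# The first BD follows ..SE: directly, so it has no AU.
my @bad = orphan_bd_lines('..SE:', '..BD:', 'X', '..AU:', 'C', '..BD:', 'Y');
warn "BD without AU at line $_\n" for @bad;
```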

    Also, you don't have use warnings; and use strict; in your script. While not necessary for a working program, those lines are strongly recommended. And you will hear that advice every time you post something here unless you add them.

Re: match tags
by arun_kom (Monk) on Nov 09, 2009 at 12:03 UTC

    A simple approach.

    use strict;
    use warnings;

    my $found = '';
    while(<DATA>){
        chomp;
        if($_=~ /^\.\.AU:$/) { $found = 'AU'; next; }
        if($found eq 'AU'){ print "\n".$_; }
        if($_=~ /^\.\.BD:$/) { $found = 'BD'; next; }
        if($found eq 'BD'){ print ','.$_; }
        $found = '';
    }

    UPDATE: Could have been shorter ;)

    my $tag;
    while(<DATA>){
        chomp;
        if(/([A-Z]{2}):$/){ $tag = $1; next; }
        if($tag eq 'AU'){ print "\n".$_; }
        if($tag eq 'BD'){ print ','.$_; }
    }

Re: match tags
by shmem (Chancellor) on Nov 09, 2009 at 16:19 UTC
    while (@data = <DATA>) { ... }

    doesn't make sense. The expression @data = <DATA> reads all lines from the filehandle DATA into @data in one go, so the loop body runs only once and the while adds nothing. Drop it.
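
To see this for yourself, here is a small sketch that counts the passes. It uses an in-memory filehandle in place of DATA (an assumption, just to keep the example self-contained); the counters are made up for the demonstration.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for DATA.
my $text = "..AU:\nC\n..BD:\nON PAC\n";
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

my $passes = 0;
my @sizes;
# The list assignment yields the number of lines read: 4 on the first
# test, 0 on the second (the handle is at EOF), which ends the loop.
while (my @data = <$fh>) {
    $passes++;
    push @sizes, scalar @data;
}
close $fh;

print "passes: $passes, lines read per pass: @sizes\n";
```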

    Instead of

    for($i=0;$i<$#data;$i++){

    better say

    for my $line (@data) {

    That said, I'd do

    my @tokens;
    while (<DATA>) {
        if (/\.\.(AU|BD):/) {
            if ($1 eq 'AU') {
                print join( ',',@tokens),"\n", @tokens = () if @tokens;
            }
            chomp (my $line = <DATA>);
            push @tokens, $line;
        }
    }
    print join( ',',@tokens),"\n", @tokens = () if @tokens;