gargs has asked for the wisdom of the Perl Monks concerning the following question:

Guys, this is probably a really easy problem to solve for most of you. However, being new to Perl, I am having all kinds of problems getting anything to work. I have a report file that has something like the following data in it:
BCG-011 ULTRASITE U WO 1 BC011 WO +
05823 00051 BTS-025 U WO 0 0
 SNBXCASD005X RF + 3
 TRX-001 U WO 713 0 117 MBCCH P 4
 TRX-002 U WO 729 0 117 0
05823 00052 BTS-026 U WO 0 0
 SNBXCASD005Y RF + 3
 TRX-005 U WO 722 0 117 MBCCH P 3
 TRX-006 U BL-TRX 731 0 117 8
05823 00053 BTS-027 U WO 0 1

What I need is for Perl to parse out which TRX has BL-TRX status (TRX-006 in the above case), which SNBX it belongs to (SNBXCASD005Y in the above case), and finally which BCG it is in (BCG-011 in the above case). I would like to generate a report that says something like:
BCG      LOCATION      TRX      STATUS
------------------------------------------------
BCG-011  SNBXCASD005Y  TRX-006  BL-TRX
Any help you can provide will be much appreciated...
A.N.C.

update (broquaint): title change (was Newbie Needs Help!)

Re: Newbie Needs Help!
by graff (Chancellor) on Jan 29, 2003 at 02:42 UTC
    Thanks for a very clear statement of the goal. Many here would also be grateful to see what sort of code you have tried, so we could see if your problems are few or many...

    In any case, the solution would depend on how easy it is to identify boundaries between consecutive records, given that each record consists of multiple lines with rather intricate white-space formatting. The example suggests that each new record begins with this sort of pattern, which can be used to capture this part of the info for the report:

    /^(BCG-\d+)/ # i.e. line-initial "BCG-" followed by digits
    and that the "SNBX" and "TRX" records always follow line-initial whitespace, so they can be captured as follows:
    /^\s+(SNBX\S+)/
    /^\s+(TRX-\d+)\s+\S+\s+BL-TRX/
    If those assumptions hold, then the logic could go something like this (not tested):
    print "BCG LOCATION TRX STATUS\n";
    print "---------------------------------\n";
    while (<>) {
        if (/^(BCG-\d+)/) {    # start of record
            $recid = $1;
            $loc = $trx = $stat = "";
        }
        elsif (/^\s+(SNB\S+)/) {
            $loc = $1;
        }
        elsif (/^\s+(TRX\S+)\s+\S+\s+(BL-TRX)/) {
            print "$recid $loc $1 $2\n";
        }
    }
    That would probably need some tweaking to get the column alignments right -- you could try the FORMAT mechanism (which I never really got the hang of) or the printf function (which I've been hooked on since cutting my teeth in C years ago).
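As a rough illustration of the printf route mentioned above, here is a minimal sketch that pads each column to a fixed width with `sprintf`. The helper name `report_row` and the column widths (8, 14, 8) are assumptions chosen for this sample data, not anything from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build one report line with left-justified,
# fixed-width columns. The widths are assumptions for this sample data.
sub report_row {
    my ($bcg, $loc, $trx, $status) = @_;
    return sprintf("%-8s %-14s %-8s %s\n", $bcg, $loc, $trx, $status);
}

# The header uses the same widths, so it lines up with the data rows.
print report_row(qw(BCG LOCATION TRX STATUS));
print "-" x 40, "\n";
print report_row(qw(BCG-011 SNBXCASD005Y TRX-006 BL-TRX));
```

In the loop above, the final `print "$recid $loc $1 $2\n"` could be swapped for `print report_row($recid, $loc, $1, $2)` to get aligned columns.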
Re: Newbie Needs Help!
by JamesNC (Chaplain) on Jan 29, 2003 at 06:26 UTC
    Here you go :-)
    #!/perl/bin/perl
    use strict;

    print "\n BCG LOCATION TRX STATUS";
    print "\n ---------------------------------------------\n";
    my $bcg;
    my $location;
    my $trx;
    my $status;
    while(<DATA>){
        chomp;
        $bcg = $1 if /\b(BCG-\d+)\s+/ig;
        $location = $1 if /\b(SNBX\w+)\b/ig;
        if( /\s+(TRX-\d+)\s+\w\s+(BL-TRX)/ig){
            $trx = $1;
            $status = $2;
            write;
        }
    }

    format STDOUT =
    @<<<<<< @<<<<<<<<<<< @<<<<<<< @<<<<<<
    $bcg, $location, $trx, $status
    .

    __DATA__
    BCG-011 ULTRASITE U WO 1 BC011 WO +
    05823 00051 BTS-025 U WO 0 0
     SNBXCASD005X RF + 3
     TRX-001 U BL-TRX 713 0 117 MBCCH P 4
     TRX-002 U WO 729 0 117 0
    05823 00052 BTS-026 U WO 0 0
     SNBXCASD005Y RF + 3
     TRX-005 U WO 722 0 117 MBCCH P 3
     TRX-006 U BL-TRX 731 0 117 8
    05823 00053 BTS-027 U WO 0 1
    BCG-012 ULTRASITE U WO 1 BC011 WO +
    05823 00051 BTS-025 U WO 0 0
     SNBXCASD125T RF + 3
     TRX-031 U BL-TRX 713 0 117 MBCCH P 4
     TRX-012 U BL-TRX 729 0 117 0
    05823 00052 BTS-026 U WO 0 0
     SNBXLASD995Y RF + 3
     TRX-005 U WO 722 0 117 MBCCH P 3
     TRX-106 U BL-TRX 731 0 117 8
    05823 00053 BTS-027 U WO 0 1
    Isn't perl great! :-)
      A very small suggested modification to an excellent reply. Format is a powerful tool, so use it to your advantage. Replace
      print "\n BCG LOCATION TRX STATUS";
      print "\n ---------------------------------------------\n";
      with
      format STDOUT_TOP =
      BCG LOCATION TRX STATUS
      ---------------------------------------------
      .
      to give automatic page headers, with a new page every 60 lines (the default).
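To make the header-format idea concrete, here is a minimal, self-contained sketch. It assumes hypothetical format names `REPORT` and `REPORT_TOP` and a hypothetical helper `render_report`, and it writes to an in-memory filehandle so the result can be inspected as a string. With plain STDOUT, naming the formats `STDOUT` and `STDOUT_TOP` as in the reply above makes the `$~` and `$^` assignments unnecessary. The column widths are assumptions.

```perl
use strict;
use warnings;

# Package variables, so both formats can see them.
our ($bcg, $location, $trx, $status);

# Top-of-page format: emitted automatically before the first row of a page.
format REPORT_TOP =
BCG      LOCATION      TRX      STATUS
---------------------------------------
.

# Body format: one left-justified line per write().
format REPORT =
@<<<<<<< @<<<<<<<<<<<< @<<<<<<< @<<<<<<
$bcg,    $location,    $trx,    $status
.

# Hypothetical helper: render rows (array refs) through the formats
# into a string, using an in-memory filehandle.
sub render_report {
    my @rows = @_;
    my $out  = '';
    open(my $fh, '>', \$out) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    $~ = 'REPORT';        # body format for the selected handle
    $^ = 'REPORT_TOP';    # header format for the selected handle
    for my $row (@rows) {
        ($bcg, $location, $trx, $status) = @$row;
        write;            # header is added automatically at the top of a page
    }
    select($old);
    close($fh);
    return $out;
}

print render_report(
    [qw(BCG-011 SNBXCASD005Y TRX-006 BL-TRX)],
    [qw(BCG-012 SNBXCASD125T TRX-031 BL-TRX)],
);
```

The page length that triggers a fresh header is held in `$=` for the selected handle, so a different page size can be set there if 60 lines does not suit the report.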
      Ay_Bee
      -_-_-_-_-_-_-_-_-_-_-_-
      My memory concerns me - but I forget why !!!
Re: Newbie Needs Help!
by Cody Pendant (Prior) on Jan 29, 2003 at 02:37 UTC
    Small amount of help follows:
    while(<DATA>){
        if(/^(BCG-\d+)/){$bcg = $1}    # current BCG, maybe?
        if(/^( SNB\w+)/){$snb = $1}    # current SNB, maybe?
        @stuff = split(/\s+/);         # put lines into an array
        if ($stuff[3] eq 'BL-TRX'){    # if the BLT is found
            # report that it has been found
            print "here's an interesting BCG:\n$bcg $snb @stuff\n";
        }
    }
    __DATA__
    BCG-011 ULTRASITE U WO 1 BC011 WO +
    05823 00051 BTS-025 U WO 0 0
     SNBXCASD005X RF + 3
     TRX-001 U WO 713 0 117 MBCCH P 4
     TRX-002 U WO 729 0 117 0
    05823 00052 BTS-026 U WO 0 0
     SNBXCASD005Y RF + 3
     TRX-005 U WO 722 0 117 MBCCH P 3
     TRX-006 U BL-TRX 731 0 117 8
    05823 00053 BTS-027 U WO 0 1
    Though I have no idea if that actually helps because your data is just random noise to me.

    Also I just bet some smarter monk than me is going to come along and say "what you've got there is a classic BCG file format and you should use File::Formats::Weird::BCG.pm to handle it for you!".
    --

    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.” M-J D