A loop with lots of flags.
This builds a hash. It loads _all_ the data, so you would want to filter out any fixed info you don't need. Depending on the data you expect, you would also want to make sure the regexes are tight enough (these are very loose). As you can see, it is written in a 'verbose' idiom.
I tried it with 2 records, one of them with more than 1 history row.
#!/bin/perl5
use strict;
use warnings;

my %hash;
my $true  = 1;
my $false = 0;
my ( $header, $history, $footer, @fields );
my ( $vendor, $i );

while (<DATA>) {
    chomp;
    next if /^$/;

    # Section markers: each one flips the state flags and is then skipped
    # (the vendor header line is also stored).
    if ( /^VENDOR/ and /PAGE/ ) {
        $header  = $true;
        $history = $false;
        $footer  = $false;
        @fields  = split;
        $vendor  = $fields[1];
        push @{ $hash{$vendor}{'header'} }, @fields;
        $i = 0;    # restart the history row counter for this vendor
        next;
    }
    elsif ( /AWARD\sHISTORY/ ) {
        $header  = $false;
        $history = $true;
        next;
    }
    elsif ( /PID/ ) {
        $history = $false;
        $footer  = $true;
        next;
    }

    # Data lines: store them according to the section we are in.
    if ($history) {
        @fields = split;
        my $history_row = join '', 'history ', $i;
        push @{ $hash{$vendor}{$history_row} }, @fields;
        $i++;
    }
    elsif ($footer) {    # bug fix, was $header
        @fields = split;
        push @{ $hash{$vendor}{'footer'} }, @fields;
    }
}

open my $out, '>', 'parse.txt' or die "Cannot open parse.txt: $!";
for my $v ( keys %hash ) {
    print $out "vendor: $v\n";
    for my $rec ( keys %{ $hash{$v} } ) {
        print $out "\trecord:\t$rec\n";
        print $out "\t\t";
        for my $fld ( @{ $hash{$v}{$rec} } ) {
            print $out "$fld\t";
        }
        print $out "\n";
    }
}
close $out or die "Cannot close parse.txt: $!";
__DATA__
VENDOR 61125 TOTAL DOLLAR VAR 77,097.60 PAGE 1 2003 08 01
VENDOR SIS UNIT BASE SHIP TOT DOL DOLLAR PERCENT
CONTRACT NUMBER PRICE PRICE QTY U/I DATE PR NUMBER BIN/PART NUMBER VALUE VARIANCE VARIANCE
YT67DY7898DUFT5126 88.20000 70.00000 50 EA 00000000 POI90809819856 1560007117067 4,410.00 910.00 0
AWARD HISTORY PIIN BSCM N/A U/I UNIT PRICE AWD DT QTY OPT DT FOB REP TYPE
765WTY34TF56A 7J777 N EA 39.55000 93012 147 00000 2 Y B
PID DATA LINE NR LINE NR
01 001PART, DESCRIPTION, DATA 02 002TECHNICAL DATA AVAILABILITY:
03 003
VENDOR 61126 TOTAL DOLLAR VAR 77,097.60 PAGE 1 2003 08 01
VENDOR SIS UNIT BASE SHIP TOT DOL DOLLAR PERCENT
CONTRACT NUMBER PRICE PRICE QTY U/I DATE PR NUMBER BIN/PART NUMBER VALUE VARIANCE VARIANCE
YT67DY7898DUFT5126 88.20000 70.00000 50 EA 00000000 POI90809819856 1560007117067 4,410.00 910.00 0
AWARD HISTORY PIIN BSCM N/A U/I UNIT PRICE AWD DT QTY OPT DT FOB REP TYPE
765WTY34TF56A 7J777 N EA 39.55000 93012 147 00000 2 Y B
765WTY34TF56B 7J777 N EA 39.55000 93012 147 00000 2 Y B
765WTY34TF56C 7J777 N EA 39.55000 93012 147 00000 2 Y B
PID DATA LINE NR LINE NR
01 001PART, DESCRIPTION, DATA 02 002TECHNICAL DATA AVAILABILITY:
03 003
produces:
vendor: 61125
    record: history 0
        765WTY34TF56A 7J777 N EA 39.55000 93012 147...
    record: footer
        01 001PART, DESCRIPTION, DATA 02 002TECHNICAL...
    record: header
        VENDOR 61125 TOTAL DOLLAR VAR 77,097.60 PAGE...
vendor: 61126
    record: history 0
        765WTY34TF56A 7J777 N EA 39.55000 93012 147...
    record: history 2
        765WTY34TF56C 7J777 N EA 39.55000 93012 147...
    record: history 1
        765WTY34TF56B 7J777 N EA 39.55000 93012 147...
    record: footer
        01 001PART, DESCRIPTION, DATA 02 002TECHNICAL...
    record: header
        VENDOR 61126 TOTAL DOLLAR VAR 77,097.60 PAGE...
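The records above come out in hash order (note history 0, 2, 1 for vendor 61126), because keys %hash makes no ordering promise. If you want header, then the history rows in numeric order, then footer, one option is a custom sort on the record names. This is only a sketch: the small %hash here is a hypothetical stand-in for the one the script builds, and %rank, sort_key and ordered_records are names made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the %hash built by the script above.
my %hash = (
    61126 => {
        'footer'    => [qw(01 001PART)],
        'history 2' => [qw(765WTY34TF56C)],
        'header'    => [qw(VENDOR 61126)],
        'history 0' => [qw(765WTY34TF56A)],
        'history 1' => [qw(765WTY34TF56B)],
    },
);

# Assumed display order of the record types.
my %rank = ( header => 0, history => 1, footer => 2 );

# Split a record name such as 'history 2' into (type rank, row number).
sub sort_key {
    my ($rec) = @_;
    my ( $type, $n ) = $rec =~ /^(\w+)(?: (\d+))?$/;
    my $r = ( defined $type && exists $rank{$type} ) ? $rank{$type} : 99;
    return ( $r, defined $n ? $n : 0 );
}

# Return the record names of one vendor in display order.
sub ordered_records {
    my ($records) = @_;
    return sort {
        my @ka = sort_key($a);
        my @kb = sort_key($b);
        $ka[0] <=> $kb[0] or $ka[1] <=> $kb[1]
    } keys %$records;
}

for my $v ( sort { $a <=> $b } keys %hash ) {
    print "vendor: $v\n";
    for my $rec ( ordered_records( $hash{$v} ) ) {
        print "    record: $rec\n";
        print "        @{ $hash{$v}{$rec} }\n";
    }
}
```

The row number is compared numerically, so 'history 10' would sort after 'history 2' rather than before it.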
Update: added output
Update2: Fixed bug! Footer wasn't stored.
Update3: Truncated and formatted the output (tabs were a bad idea)
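On the loose regexes: /PID/ matches any line that merely contains those three letters, and /^VENDOR/ and /PAGE/ only works because the column-heading line happens not to contain PAGE. Here is a sketch of tighter section markers. The patterns are assumptions drawn from the two sample records only, and classify() is a made-up helper, so check them against your real reports before relying on them.

```perl
use strict;
use warnings;

# Assumed layouts, inferred from the sample data only.
my $header_re  = qr/^VENDOR\s+(\d+)\s+TOTAL\s+DOLLAR\s+VAR\b.*\bPAGE\s+\d+/;
my $history_re = qr/^AWARD\s+HISTORY\b/;
my $footer_re  = qr/^PID\s+DATA\b/;

# Return the line type, plus the vendor number for a vendor header line.
sub classify {
    my ($line) = @_;
    return ( 'header', $1 ) if $line =~ $header_re;
    return ('history') if $line =~ $history_re;
    return ('footer')  if $line =~ $footer_re;
    return ('data');
}

my @samples = (
    'VENDOR 61125 TOTAL DOLLAR VAR 77,097.60 PAGE 1 2003 08 01',
    'VENDOR SIS UNIT BASE SHIP TOT DOL DOLLAR PERCENT',
    'AWARD HISTORY PIIN BSCM N/A U/I UNIT PRICE',
    'PID DATA LINE NR LINE NR',
    '01 001RAPID DELIVERY',    # made-up line: contains "PID" but is plain data
);

for my $line (@samples) {
    my ( $type, $vendor ) = classify($line);
    my $extra = defined $vendor ? " (vendor $vendor)" : '';
    print "$type$extra: $line\n";
}
```

Anchoring each marker at the start of the line and spelling out the fixed words keeps a data line that happens to contain PID or VENDOR from flipping the state flags.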