Re^4: Parsing large text file with perl

Thank you all for the help... Hopefully my last question. Here is the code that at the very least separates the data out.

#!/usr/bin/perl

use strict;
use warnings;

my $true = 1;
my $false = 0;
my ($header, $history, $footer, @fields);
my ($vendor, $i);
my $file = "AUG.txt";
my @FILE;
my $vendor_id = 0;
my @VENDORS;
my @CONTRACTS;
my $contract_id = 0;
my @AWARDS;

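# Three-argument open with a lexical handle; die if the file cannot be read.
open (my $in, '<', $file) or die "Cannot open $file: $!";
@FILE = <$in>;
close ($in);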


foreach (@FILE){
   chomp;
   next if /^$/;
   if (/VENDOR.+PAGE/){
      @fields = split;
      $vendor_id++;
      #push @VENDORS,"$vendor_id $fields[1]\n";
      print "\n\nVENDOR \= $fields[1]\n";
      next;
   }
   elsif (/\s+?\S{17}\s+?\S+?\./){
      #push @CONTRACTS,"$vendor_id $_\n";
      @fields = split;
      print "   CONTRACT NUMBER  \= $fields[0]\n";
      print "   VENDOR PRICE     \= $fields[1]\n";
      print "   BASE PRICE       \= $fields[2]\n";
      print "   QTY              \= $fields[3]\n";
      print "   SHIP DATE        \= $fields[4]\n";
      print "   PR NUMBER        \= $fields[5]\n";
      print "   ARR NUMBER       \= $fields[6]\n";
      print "   DOLLAR VALUE     \= $fields[7]\n";
      print "   DOLLAR VARIENCE  \= $fields[8]\n";
      print "   PERCENT VARIANCE \= $fields[9]\n";
      print "\n";
      next;
   }
   elsif (/^\s+?\S{13}\s+?\S+?\s+?\S/){
      #print "$_\n";
      $_ =~ s/^\s*//;
      my @fields = unpack "a21 a9 a9 a2 a13 a8 a9 a9 a4 a5 a6", $_;

        print "      PIIN       \= $fields[0]\n";
        print "      FSCM       \= $fields[1]\n";
        print "      N/A        \= $fields[2]\n";
        print "      U/I        \= $fields[3]\n";
        print "      UNIT PRICE \= $fields[4]\n";
        print "      AWD DT     \= $fields[5]\n";
        print "      QTY        \= $fields[6]\n";
        print "      OPT DT     \= $fields[7]\n";
        print "      FOB        \= $fields[8]\n";
        print "      REP        \= $fields[9]\n";
        print "      TYPE       \= $fields[10]\n";
        print "\n";
   }
   else{
     $_ =~ s/^\s*//;
     if (/^\d{2}\s\d{3}/){
        print "$_\n";
     }
   }
}

Part of the output:

VENDOR = 1NWV5
   CONTRACT NUMBER  = AAB40003VG880MODF
   VENDOR PRICE     = 3.25000
   BASE PRICE       = 0.76000
   QTY              = 34
   SHIP DATE        = EA
   PR NUMBER        = 00000000
   ARR NUMBER       = YPG03188000386
   DOLLAR VALUE     = 3110009197232
   DOLLAR VARIENCE  = 110.50
   PERCENT VARIANCE = 84.66

      PIIN       = CFS50080P7291
      FSCM       = 5N366
      N/A        = N
      U/I        = EA
      UNIT PRICE =       0.30000
      AWD DT     =    80004
      QTY        =     6,600
      OPT DT     =     00000
      FOB        =    D
      REP        =     Y
      TYPE       =      B

01 001ROLLER,NEEDLE     02 002DIV GENERAL MOTORS CORP
03 003PAGE 73342        04 004P/N 2275468
05 005IDENTIFY TO:      06 006
07 007

Of course, keep in mind that both the contract information and the history information can repeat any number of times per vendor. Now I need to somehow create a data structure that will allow me to easily read the data back out and make database inserts. Here is how the data is related:

VENDOR = 1NWV5
   FOREACH VENDOR
       LIST OF CONTRACTS
          FOREACH CONTRACT
             LIST OF CONTRACT INFORMATION
             LIST OF AWARDS
                FOREACH AWARD
                   LIST OF AWARD INFORMATION
             CONTRACT DESCRIPTION [The three or four lines after the history - This is getting dumped in a big text field in the database.]

From the looks of it I would have an Array of Vendors containing an Array of Contracts containing two Hashes (Contract Information and Contract Description) and an Array of Hashes. What I just said doesn't even make sense to me. So hopefully you can put it in perspective or suggest an easier way, as I need to be able to pull the data back out of the structure. Thanks again -Shawn
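For reference, here is a minimal sketch of one way the structure described above could be laid out, assuming a hash keyed by vendor code instead of an array of vendors. The key names (info, awards, description) and the @rows list are made up for illustration; the sample values come from the output shown earlier. The description is kept as a plain string since it goes into one text field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical layout: vendor code => list of contracts. Each contract
# holds an info hash, a list of award hashes, and a description string.
# The key names are illustrative assumptions, not part of the original code.
my %vendors = (
    '1NWV5' => [
        {
            info => {
                contract_number => 'AAB40003VG880MODF',
                vendor_price    => '3.25000',
            },
            awards => [
                { piin => 'CFS50080P7291', fscm => '5N366', unit_price => '0.30000' },
            ],
            description => "01 001ROLLER,NEEDLE     02 002DIV GENERAL MOTORS CORP",
        },
    ],
);

# Walk the structure in the order the database inserts would need:
# one row per contract, followed by one row per award on that contract.
my @rows;
for my $vendor (sort keys %vendors) {
    for my $contract (@{ $vendors{$vendor} }) {
        push @rows, "contract $vendor $contract->{info}{contract_number}";
        for my $award (@{ $contract->{awards} }) {
            push @rows, "award $contract->{info}{contract_number} $award->{piin}";
        }
    }
}
print "$_\n" for @rows;
```

In the parsing loop, a new contract hash would be pushed onto the current vendor's list each time a contract line matches, and award hashes pushed onto that contract's awards list as history lines follow.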

Re^5: Parsing large text file with perl by wfsp (Abbot) on Sep 03, 2004 at 05:27 UTC
Something like this: `push @{$hash{$vendor}{$history_row}}, @fields;` from my original suggestion. That's all you have to do! That one is a hash of a hash of arrays. The `for` loop at the end demonstrated how you would unroll it. Have a look at perldsc and perllol. You will see how to build complex data structures like the one above and adapt it. Once you get the hang of it, it is very easy to use. Extracting and reporting is perl's bread and butter! Have a look at the docs, see how I built my structure and have a go at adapting it. If you get stuck come back. btw I would get advice on that `unpack`! Ask another question. If it gets out I'm giving advice like that I'll be excommunicated!
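A rough sketch of how that `push` line builds the hash of a hash of arrays and how it can be unrolled again. This is not the original suggestion's code; the @records list, the row keys 'award1' and 'award2', and the second row's field values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash;

# Sample rows: vendor, history row key, then the fields for that row.
# The row keys and the second row's values are made-up placeholders.
my @records = (
    [ '1NWV5', 'award1', 'CFS50080P7291', '5N366', 'EA' ],
    [ '1NWV5', 'award2', 'SAMPLE-PIIN',   'XXXXX', 'EA' ],
);

for my $rec (@records) {
    my ($vendor, $history_row, @fields) = @$rec;
    # Autovivification creates the inner hash and the array on first use.
    push @{ $hash{$vendor}{$history_row} }, @fields;
}

# Unroll: vendor first, then each history row, then its fields in order.
for my $vendor (sort keys %hash) {
    print "VENDOR = $vendor\n";
    for my $history_row (sort keys %{ $hash{$vendor} }) {
        print "   $history_row: @{ $hash{$vendor}{$history_row} }\n";
    }
}
```

Because nothing has to be declared before the `push`, the same line works for the first row of a new vendor and for every later row.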
Re^6: Parsing large text file with perl by Anonymous Monk on Sep 03, 2004 at 10:32 UTC
Thanks again, I will ask.