in reply to Re^3: Parsing large text file with perl
in thread Parsing large text file with perl

Thank you all for the help... Hopefully my last question. Here is the code that, at the very least, separates the data out.
#!/usr/bin/perl
use strict;
use warnings;

my $true = 1;
my $false = 0;
my ($header, $history, $footer, @fields);
my ($vendor, $i);
my $file = "AUG.txt";
my @FILE;
my $vendor_id = 0;
my @VENDORS;
my @CONTRACTS;
my $contract_id = 0;
my @AWARDS;

open (INFILE, $file);
@FILE = <INFILE>;
close (INFILE);

foreach (@FILE){
    chomp;
    next if /^$/;
    if (/VENDOR.+PAGE/){
        @fields = split;
        $vendor_id++;
        #push @VENDORS,"$vendor_id $fields[1]\n";
        print "\n\nVENDOR \= $fields[1]\n";
        next;
    }
    elsif (/\s+?\S{17}\s+?\S+?\./){
        #push @CONTRACTS,"$vendor_id $_\n";
        @fields = split;
        print " CONTRACT NUMBER \= $fields[0]\n";
        print " VENDOR PRICE \= $fields[1]\n";
        print " BASE PRICE \= $fields[2]\n";
        print " QTY \= $fields[3]\n";
        print " SHIP DATE \= $fields[4]\n";
        print " PR NUMBER \= $fields[5]\n";
        print " ARR NUMBER \= $fields[6]\n";
        print " DOLLAR VALUE \= $fields[7]\n";
        print " DOLLAR VARIENCE \= $fields[8]\n";
        print " PERCENT VARIANCE \= $fields[9]\n";
        print "\n";
        next;
    }
    elsif (/^\s+?\S{13}\s+?\S+?\s+?\S/){
        #print "$_\n";
        $_ =~ s/^\s*//;
        my @fields = unpack "a21 a9 a9 a2 a13 a8 a9 a9 a4 a5 a6", $_;
        print " PIIN \= $fields[0]\n";
        print " FSCM \= $fields[1]\n";
        print " N/A \= $fields[2]\n";
        print " U/I \= $fields[3]\n";
        print " UNIT PRICE \= $fields[4]\n";
        print " AWD DT \= $fields[5]\n";
        print " QTY \= $fields[6]\n";
        print " OPT DT \= $fields[7]\n";
        print " FOB \= $fields[8]\n";
        print " REP \= $fields[9]\n";
        print " TYPE \= $fields[10]\n";
        print "\n";
    }
    else{
        $_ =~ s/^\s*//;
        if (/^\d{2}\s\d{3}/){
            print "$_\n";
        }
    }
}
Part of the output:
VENDOR = 1NWV5
 CONTRACT NUMBER = AAB40003VG880MODF
 VENDOR PRICE = 3.25000
 BASE PRICE = 0.76000
 QTY = 34
 SHIP DATE = EA
 PR NUMBER = 00000000
 ARR NUMBER = YPG03188000386
 DOLLAR VALUE = 3110009197232
 DOLLAR VARIENCE = 110.50
 PERCENT VARIANCE = 84.66

 PIIN = CFS50080P7291
 FSCM = 5N366
 N/A = N
 U/I = EA
 UNIT PRICE = 0.30000
 AWD DT = 80004
 QTY = 6,600
 OPT DT = 00000
 FOB = D
 REP = Y
 TYPE = B

01 001ROLLER,NEEDLE
02 002DIV GENERAL MOTORS CORP
03 003PAGE 73342
04 004P/N 2275468
05 005IDENTIFY TO:
06 006
07 007
Of course, keep in mind that both the contract information and the history information can repeat any number of times per vendor. Now I need to create a data structure that will let me easily read the data back out and make database inserts. Here is how the data is related:
VENDOR = 1NWV5
FOREACH VENDOR
    LIST OF CONTRACTS
    FOREACH CONTRACT
        LIST OF CONTRACT INFORMATION
        LIST OF AWARDS
        FOREACH AWARD
            LIST OF AWARD INFORMATION
        CONTRACT DESCRIPTION [The three or four lines after the history - This is getting dumped in a big text field in the database.]
From the looks of it, I would have an array of vendors, each containing an array of contracts, each containing two hashes (contract information and contract description) and an array of hashes (the awards). What I just said doesn't even make sense to me, so hopefully you can put it in perspective or suggest an easier way, as I need to be able to pull the data back out of the structure. Thanks again -Shawn

Re^5: Parsing large text file with perl
by wfsp (Abbot) on Sep 03, 2004 at 05:27 UTC
    Something like this:
    push @{$hash{$vendor}{$history_row}}, @fields;
    from my original suggestion. That's all you have to do! That one is a hash of a hash of arrays. The for loop at the end demonstrated how you would unroll it.
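
    As a reminder of how that push and the unrolling loop fit together, here is a minimal sketch. The sample rows are made up for illustration (the second history row key is hypothetical), and keying the inner hash on the PIIN is only an assumption about what $history_row holds.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample rows: [ vendor, history row key, fields... ].
# In the real script these values would come from the parsed lines.
my @rows = (
    [ '1NWV5', 'CFS50080P7291', 'EA', '0.30000', '6,600' ],
    [ '1NWV5', 'CFS50080P7292', 'EA', '0.45000', '1,200' ],
);

my %hash;
for my $row (@rows) {
    my ($vendor, $history_row, @fields) = @$row;
    # Autovivification creates the inner hash and the array on first use.
    push @{ $hash{$vendor}{$history_row} }, @fields;
}

# Unroll it again: vendor -> history row -> list of fields.
my @report;
for my $vendor (sort keys %hash) {
    for my $history_row (sort keys %{ $hash{$vendor} }) {
        my @fields = @{ $hash{$vendor}{$history_row} };
        push @report, join ' | ', $vendor, $history_row, @fields;
    }
}
print "$_\n" for @report;
```

    Collecting the lines in @report instead of printing directly makes it easy to swap the print for a database insert later.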

    Have a look at perldsc and perllol. You will see how to build complex data structures like the one above and adapt them. Once you get the hang of it, it is very easy to use.
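
    For the vendor / contract / award layout described in the question, one possible shape in the perldsc style is a hash of vendors, each holding an array of contract hashes. This is only a sketch: the key names (info, awards, description and the field names) are my own invention, and the values are copied from the sample output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %vendors;    # vendor id => array of contract hashes

# One contract: a hash of contract information, an array of award
# hashes, and the free-text description. Key names are hypothetical.
my $contract = {
    info        => { contract_number => 'AAB40003VG880MODF', vendor_price => '3.25000' },
    awards      => [],
    description => '',
};
push @{ $vendors{'1NWV5'} }, $contract;

# Each award (history) line becomes one hash in the awards array.
push @{ $contract->{awards} },
    { piin => 'CFS50080P7291', fscm => '5N366', unit_price => '0.30000' };

# Description lines are simply appended to one text field.
$contract->{description} .= "$_\n"
    for '01 001ROLLER,NEEDLE', '02 002DIV GENERAL MOTORS CORP';

# Reading it back out, e.g. as the basis for database inserts.
my @lines;
for my $vendor_id (sort keys %vendors) {
    for my $c (@{ $vendors{$vendor_id} }) {
        push @lines, "$vendor_id $c->{info}{contract_number}";
        push @lines, "  award $_->{piin} at $_->{unit_price}" for @{ $c->{awards} };
    }
}
print "$_\n" for @lines;
```

    In the parsing loop you would start a new contract hash whenever a contract line matches, and keep a reference to the current one so that the award and description lines that follow can be attached to it.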

    Extracting and reporting is perl's bread and butter! Have a look at the docs, see how I built my structure and have a go at adapting it.

    If you get stuck come back.

    btw I would get advice on that unpack! Ask another question. If it gets out I'm giving advice like that I'll be excommunicated!

      Thanks again, I will ask.