Thank you all for the help... Hopefully this is my last question. Here is the code that, at the very least, separates the data out.
#!/usr/bin/perl
use strict;
use warnings;

my $true  = 1;
my $false = 0;
my ($header, $history, $footer, @fields);
my ($vendor, $i);
my $file = "AUG.txt";
my @FILE;
my $vendor_id = 0;
my @VENDORS;
my @CONTRACTS;
my $contract_id = 0;
my @AWARDS;

open (INFILE, $file);
@FILE = <INFILE>;
close (INFILE);

foreach (@FILE){
    chomp;
    next if /^$/;
    if (/VENDOR.+PAGE/){
        @fields = split;
        $vendor_id++;
        #push @VENDORS,"$vendor_id $fields[1]\n";
        print "\n\nVENDOR \= $fields[1]\n";
        next;
    }
    elsif (/\s+?\S{17}\s+?\S+?\./){
        #push @CONTRACTS,"$vendor_id $_\n";
        @fields = split;
        print " CONTRACT NUMBER \= $fields[0]\n";
        print " VENDOR PRICE \= $fields[1]\n";
        print " BASE PRICE \= $fields[2]\n";
        print " QTY \= $fields[3]\n";
        print " SHIP DATE \= $fields[4]\n";
        print " PR NUMBER \= $fields[5]\n";
        print " ARR NUMBER \= $fields[6]\n";
        print " DOLLAR VALUE \= $fields[7]\n";
        print " DOLLAR VARIENCE \= $fields[8]\n";
        print " PERCENT VARIANCE \= $fields[9]\n";
        print "\n";
        next;
    }
    elsif (/^\s+?\S{13}\s+?\S+?\s+?\S/){
        #print "$_\n";
        $_ =~ s/^\s*//;
        my @fields = unpack "a21 a9 a9 a2 a13 a8 a9 a9 a4 a5 a6", $_;
        print " PIIN \= $fields[0]\n";
        print " FSCM \= $fields[1]\n";
        print " N/A \= $fields[2]\n";
        print " U/I \= $fields[3]\n";
        print " UNIT PRICE \= $fields[4]\n";
        print " AWD DT \= $fields[5]\n";
        print " QTY \= $fields[6]\n";
        print " OPT DT \= $fields[7]\n";
        print " FOB \= $fields[8]\n";
        print " REP \= $fields[9]\n";
        print " TYPE \= $fields[10]\n";
        print "\n";
    }
    else{
        $_ =~ s/^\s*//;
        if (/^\d{2}\s\d{3}/){
            print "$_\n";
        }
    }
}
Part of the output:
VENDOR = 1NWV5
 CONTRACT NUMBER = AAB40003VG880MODF
 VENDOR PRICE = 3.25000
 BASE PRICE = 0.76000
 QTY = 34
 SHIP DATE = EA
 PR NUMBER = 00000000
 ARR NUMBER = YPG03188000386
 DOLLAR VALUE = 3110009197232
 DOLLAR VARIENCE = 110.50
 PERCENT VARIANCE = 84.66

 PIIN = CFS50080P7291
 FSCM = 5N366
 N/A = N
 U/I = EA
 UNIT PRICE = 0.30000
 AWD DT = 80004
 QTY = 6,600
 OPT DT = 00000
 FOB = D
 REP = Y
 TYPE = B

01 001ROLLER,NEEDLE
02 002DIV GENERAL MOTORS CORP
03 003PAGE 73342
04 004P/N 2275468
05 005IDENTIFY TO:
06 006
07 007
Of course, keep in mind that both the contract information and the history information can repeat any number of times per vendor. Now I need to somehow create a data structure that will allow me to easily read the data back out and make database inserts. Here is how the data is related:
VENDOR = 1NWV5
FOREACH VENDOR
    LIST OF CONTRACTS
    FOREACH CONTRACT
        LIST OF CONTRACT INFORMATION
        LIST OF AWARDS
        FOREACH AWARD
            LIST OF AWARD INFORMATION
        CONTRACT DESCRIPTION [The three or four lines after the history - This is getting dumped in a big text field in the database.]
From the looks of it, I would have an array of vendors, each containing an array of contracts, with each contract containing two hashes (contract information and contract description) and an array of hashes (the awards). What I just said doesn't even make sense to me, so hopefully you can put it in perspective or suggest an easier way, as I need to be able to pull the data back out of the structure. Thanks again -Shawn
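
One possible shape for the structure described above, as a minimal sketch rather than a definitive answer: an array of vendor hashes, where each vendor holds an array of contract hashes, and each contract holds its information hash, an array of award hashes, and its description lines. The @parsed list and all key names (id, contracts, info, awards, description, number, piin, fscm) are hypothetical stand-ins for whatever the parsing loop extracts; only the sample values come from the output shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified records standing in for the lines the
# parser recognises. The key names are assumptions for illustration.
my @parsed = (
    [ vendor      => { id => '1NWV5' } ],
    [ contract    => { number => 'AAB40003VG880MODF', vendor_price => '3.25000' } ],
    [ award       => { piin => 'CFS50080P7291', fscm => '5N366' } ],
    [ description => '01 001ROLLER,NEEDLE' ],
    [ description => '02 002DIV GENERAL MOTORS CORP' ],
);

my @vendors;               # top level: one hash reference per vendor
my ($vendor, $contract);   # the vendor and contract currently being filled

for my $rec (@parsed) {
    my ($type, $data) = @$rec;

    if ($type eq 'vendor') {
        # A new vendor starts; it owns a (so far empty) list of contracts.
        $vendor = { id => $data->{id}, contracts => [] };
        push @vendors, $vendor;
        undef $contract;
    }
    elsif ($type eq 'contract') {
        next unless $vendor;       # skip a contract seen before any vendor
        $contract = { info => $data, awards => [], description => [] };
        push @{ $vendor->{contracts} }, $contract;
    }
    elsif ($type eq 'award') {
        next unless $contract;     # skip an award seen before any contract
        push @{ $contract->{awards} }, $data;
    }
    elsif ($type eq 'description') {
        next unless $contract;
        push @{ $contract->{description} }, $data;
    }
}

# Reading the data back out: one nested loop per level of the hierarchy.
# Each pushed line marks where a database insert could go.
my @report;
for my $v (@vendors) {
    for my $c (@{ $v->{contracts} }) {
        my $text = join "\n", @{ $c->{description} };   # the big text field
        push @report, "vendor=$v->{id} contract=$c->{info}{number}";
        for my $award (@{ $c->{awards} }) {
            push @report, "  award piin=$award->{piin} fscm=$award->{fscm}";
        }
        push @report, "  description lines=" . scalar @{ $c->{description} };
    }
}
print "$_\n" for @report;
```

Because $vendor and $contract always point at the most recently started record, repeated contract and history lines simply append to the right parent, and the nested loops at the end visit them in the same order they appeared in the file.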

In reply to Re^4: Parsing large text file with perl by maida
in thread Parsing large text file with perl by maida
