Hi,

I have a file containing ~20k lines of ASCII data. Each line has about 170 fields, of which I care about ~30, scattered throughout the line. The data is not delimited; each field has a fixed width. The first field on each line is unique (it is destined to become my hash key).

Speed is not a factor, but it seems "unpack" may be the best choice. My initial implementation is the ugliest thing I've ever coded. The line with the format specifier is almost 400 characters long. Yuk!

my $hash;
while (<$fh>) {
    next unless m/^(\d{14})/;
    my $code = $1;
    ($hash->{$code}->{'CODE'},
     $hash->{$code}->{'FIELD2'},
     $hash->{$code}->{'FIELD3'},
     $hash->{$code}->{'FIELD4'},
     $hash->{$code}->{'FIELD5'},
     $hash->{$code}->{'FIELD6'},
     ...
     $hash->{$code}->{'FIELD170'}) = unpack( "A14A1A1A1A5A5A30A50A20A1A5A5A5A5....");
}
My second implementation parsed the document that specifies the field widths, to build the format string automatically and grab the names of the fields:
($hash->{$code}->{$name[0]}, $hash->{$code}->{$name[1]}, $hash->{$code}->{$name[169]}) = unpack($fmt);
It was not significantly better, because I still need to know the number of fields in order to build the left-hand side, and now I've got another file to parse, etc.
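One way to avoid writing out the left-hand side at all is a hash slice: unpack returns a list, and a slice assigns that list to a list of keys in a single statement. What follows is only a minimal sketch, not a definitive implementation. The @spec list of [name, width, keep] triples, the parse_line helper and the sample record are all hypothetical; the real names and widths would come from the width document. Unwanted fields are skipped with the "x" template code, so only the wanted names end up in the hash.

```perl
use strict;
use warnings;

# Hypothetical layout: [ field name, width, keep? ]. The real list would be
# generated from the document that specifies the field widths.
my @spec = (
    [ CODE   => 14, 1 ],
    [ FIELD2 =>  1, 0 ],
    [ FIELD3 =>  1, 1 ],
    [ FIELD4 =>  5, 0 ],
    [ FIELD5 => 30, 1 ],
);

# Build the template and the list of wanted names in one pass:
# "A<width>" extracts a field, "x<width>" skips over one.
my $fmt = '';
my @names;
for my $field (@spec) {
    my ( $name, $width, $keep ) = @$field;
    $fmt .= ( $keep ? 'A' : 'x' ) . $width;
    push @names, $name if $keep;
}

# Parse one record into a hash reference using a hash slice.
sub parse_line {
    my ($line) = @_;
    my %record;
    @record{@names} = unpack $fmt, $line;
    return \%record;
}

# Made-up sample record, padded to the widths in @spec with pack.
my @lines = ( pack 'A14 A1 A1 A5 A30', '12345678901234', 'A', 'B', 'CCCCC', 'Widget' );

my %hash;
for my $line (@lines) {
    next unless $line =~ /^\d{14}/;
    my $record = parse_line($line);
    $hash{ $record->{CODE} } = $record;    # first field is the hash key
}
```

With this shape the number of fields never appears in the code: adding or dropping a field only changes the spec list. Note that "A" strips trailing spaces from each extracted field, and that "x" will die if a line is shorter than the template expects.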

I'm really missing something. If I used the regexp engine with /g, I could programmatically walk down each line pulling out the fields I want.
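For comparison, here is roughly what walking a line with the regexp engine could look like. This is a sketch under assumptions, not a recommendation: the @spec layout and the walk_line helper are hypothetical. It uses \G with /gc in scalar context so that each match resumes where the previous one stopped.

```perl
use strict;
use warnings;

# Hypothetical layout: [ field name, width, keep? ].
my @spec = (
    [ CODE   => 14, 1 ],
    [ FIELD2 =>  1, 0 ],
    [ FIELD3 =>  5, 1 ],
);

# Walk one line field by field, keeping only the wanted fields.
sub walk_line {
    my ($line) = @_;
    my %record;
    pos($line) = 0;    # start matching at the beginning of the line
    for my $field (@spec) {
        my ( $name, $width, $keep ) = @$field;
        # \G anchors the match at the current position; a scalar-context
        # /g match advances pos() past the field just consumed.
        $line =~ /\G(.{$width})/gc or last;    # stop if the line is too short
        $record{$name} = $1 if $keep;
    }
    return \%record;
}

my $record = walk_line('12345678901234XHELLO');
```

Unlike unpack with "A", this keeps any trailing spaces inside a field, so values may need trimming afterwards.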

I'm just not sure here...other than that I must be missing something. Your advice is GREATLY appreciated!

Thanks!
Cheers!


In reply to Unpack Many Fields by shoness
