
Our small group of programmers will be parsing (and sometimes writing) a fair number of fixed-width-field records, and we want to settle on a local convention for specifying the format string used by unpack(), as well as the array of field names for the record. We will do the unpacking into a hash slice with something like

my %test_vals;
@test_vals{ @TEST_FLD } = unpack $TEST_REC_PACK, $rec;

to load up the %test_vals hash. Below is our first pass at specifying the two values needed, $TEST_REC_PACK and @TEST_FLD.

I've looked at Data::FixedFormat, Text::FixedLength and Parse::FixedLength to see if they would simplify things, but they don't appear to offer enough to justify the mental overhead of having to refer to documentation in a separate module. Initially at least, plain old Perl looks clearer.

Update: Note that the name/spec pairs written with fat commas ("=>") are loaded into an array, not a hash. This keeps the field names and field specs available in record order. A hash would lose this order.
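To illustrate the point about order, here is a minimal sketch that pulls names and specs out of such a flat array by index. The two-field @spec and the names @names and @packs are made up for the example and are not part of our real record layout.

```perl
use strict;
use warnings;

# Hypothetical two-field spec, written the same way as the real one.
my @spec = ( status => 'n', code => 'a16' );

# In a flat list of pairs, even indices hold the field names
# and odd indices hold the pack templates.
my @names = @spec[ grep { $_ % 2 == 0 } 0 .. $#spec ];
my @packs = @spec[ grep { $_ % 2 == 1 } 0 .. $#spec ];

print "names: @names\n";   # names: status code
print "packs: @packs\n";   # packs: n a16
```

Because both slices walk the array from front to back, the names and templates come out in the order they were written, which is what unpack needs.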

I have two questions, at different levels, about our first pass approach below.

1. Does this make sense to the monks as a clear way to present the spec? ("Looks fine to me." is just as useful a response as "I suggest you do this differently...")
2. Are there suggestions for what we use to split the array of fld-name => pack-spec pairs? Does the cleaner and more compartmentalized approach of the sub-call version outweigh the compactness of the one-line-ish idiom, or is it the other way round? Do you have something clearer to suggest? More elegant?

Your considered wisdom, advice and opinions are much appreciated.

my @TEST_FLD = ();
my $TEST_REC_PACK = q{}; 
my @TEST_REC_PACK_FLDS = (
   status => 'n  ',   #   0
   time   => 'n  ',   #   2
   date   => 'N  ',   #   4
   code   => 'a16',   #   8   key
   msid   => 'a10',   #  24   key
);
my $TEST_REC_LEN =       34;

# split pack-spec & fld names (one-line-ish)
my $odd=0;
($odd^=1) ? push @TEST_FLD,$_ : ($TEST_REC_PACK.=$_) 
                                for (@TEST_REC_PACK_FLDS);

# split pack-spec & fld names (using a sub)
($TEST_REC_PACK,@TEST_FLD) = 
               Split_pack_flds_spec(@TEST_REC_PACK_FLDS);

sub Split_pack_flds_spec {
   my $spec;
   my @fld;
   my $odd = 0;
   for (@_) { 
      if ($odd = !$odd) {
         push @fld,$_;
      }
      else {
         $spec .= $_; 
      }
   }
   return ($spec,@fld);
}

Testing Code:

print 'fields    =  (',join(',',@TEST_FLD),")\n";
print "pack spec =  \"$TEST_REC_PACK\"\n";

my $rec = "\x{00}\x{7B}\x{06}\x{B3}\x{01}\x{32}\x{1A}\x{83}"
         .'code_value------msid_value';

my %test_vals;
@test_vals{ @TEST_FLD } = unpack $TEST_REC_PACK, $rec;

my ($key,$val);
while (($key,$val)=each(%test_vals)) {
   printf " %6s => %s\n",$key,$val;
}
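Since we will sometimes write these records as well, a sanity check on the template length and a pack/unpack round trip may be worth having next to the test above. This is only a sketch: the field names, template and test record are copied from the code above, while the shortened variable names ($template, @fields, $rec_len, $packed_len, $rebuilt) are just for this example.

```perl
use strict;
use warnings;

# Values copied from the test above; only the variable names differ.
my @fields   = qw(status time date code msid);
my $template = 'n n N a16 a10';
my $rec_len  = 34;

my $rec = "\x{00}\x{7B}\x{06}\x{B3}\x{01}\x{32}\x{1A}\x{83}"
        . 'code_value------msid_value';

# Packing one dummy value per field yields a string as long as
# the record the template describes.
my $packed_len = length( pack( $template, (0) x @fields ) );
die "template is $packed_len bytes, expected $rec_len\n"
    if $packed_len != $rec_len;

my %vals;
@vals{@fields} = unpack $template, $rec;

# Walk the field list rather than the hash so the output
# follows record order.
printf " %6s => %s\n", $_, $vals{$_} for @fields;

# Writing is the same hash slice fed to pack.
my $rebuilt = pack $template, @vals{@fields};
print $rebuilt eq $rec ? "round trip ok\n" : "round trip FAILED\n";
```

Looping over the field array instead of using each() keeps the printed fields in record order, which each() on a hash does not guarantee.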

In reply to Clarity in parsing fixed length data. by rodion
