in reply to Clarity in parsing fixed length data.

Both of your alternatives (one-line-ish, and using a sub) are very hard to read, and require lots of duplicate code for each message type. Hide the gory details, leaving a clean interface (such as those below). This process is called encapsulation.

A procedural interface:

my $TEST_REC_FORMAT = get_msg_format [ status => 'n ', # 0 time => 'n ', # 2 date => 'N ', # 4 code => 'a16', # 8 key msid => 'a10', # 24 key ]; my %test_vals = parse_msg($TEST_REC_FORMAT, $rec);

An OO interface:

my $TEST_REC_FORMAT = MessageFormat->new([ status => 'n ', # 0 time => 'n ', # 2 date => 'N ', # 4 code => 'a16', # 8 key msid => 'a10', # 24 key ]); my %test_vals = $TEST_REC_FORMAT->parse($rec);

What exactly get_msg_format or new returns is not important. How exactly it is calculate is not important either.

Replies are listed 'Best First'.
Re^2: Clarity in parsing fixed length data.
by runrig (Abbot) on Aug 10, 2006 at 22:17 UTC
    It's so very hard to read. What's important is to hide the gory details and provide a neat interface

    And that's why I like Parse::FixedLength. The interface is very similar to what you have. I used to rewrite the same gory details over and over (similar to what the OP has) until they were stuffed into that module.

Re^2: Clarity in parsing fixed length data.
by rodion (Chaplain) on Aug 10, 2006 at 22:42 UTC
    I take it the initial reaction then is that someone reading the code won't mind going elsewhere to understand what's going on, or won't need to do so because the function names are clear enough. (Or if they do mind a little, this is out-weighed by greater ease in writing the code.)

      In general you should aim to localise the understanding of the code to the current context. By that I mean, the bit of code you are looking at now should be written so its purpose is clear without having to refer to other materials.

      Generally that means rewriting something like:

      setOptions (1, 0, 'e');

      as:

      setOptions (send_log => 1, suppress_warnings => 0, max_level => 'e');

      A little more verbose, but now (even without the context) it's pretty clear what the parameters do and what setOption is likely to achieve.

      Or to paraphrase a great man: make your code as clear as possible, but no clearer :).


      DWIM is Perl's answer to Gödel

      That's my opinion. Even if the function is simple, I still wouldn't copy it over and over, because I believe it's a distraction from the rest of the code. If you're going to go this way, stuff the function in in a module called OurCompany::FixedLengthUtils or something, and then no one has to think about how it works, they can just use it, or look up the docs for it.

      And, I would probably create a separate module/class for each format especially if you'll be reusing the same formats in different scripts (which Parse-FixedLength has instructions on doing -- disclaimer: I'm the author :-)

        Parse-FixedLength is a really nice module, and was the leading contender among the three modules when I first went looking. Nice job.

        I couldn't tell from the docs if it accepted the "=>" form when using the pack/unpack format specs. We have binary data to read, and I want the name and format columns to line up, so using a ":" separator in a single parameter would have been a problem. With your encouragement I've tested the "=>" and it works fine. It also works in 5.6.1, which is important for us.

        I'll put together an example to present to the group at work, and I'm guessing we'll use Parse-FixedLength. Thanks.