http://qs1969.pair.com?node_id=684927

Boldra has asked for the wisdom of the Perl Monks concerning the following question:

I have a little utility to print the differences between two arrays. I pass the sub a hash (%lists), with the values being array refs. The code below calculates how broad to make the columns on the output, so it selects the longest word from either the keys to the hash or the elements of the array, then adds two for padding
my $col_width = ( sort { $b <=> $a } map { length } ( #sort lengths (keys %lists, map { @$_ } values %lists) + ) )[0] + 2; #2 is the padding
I just think I've gone a bit too far, and this could be a problem for my colleagues in the future. Can anyone suggest a way of making the above code more readable? The input looks something like this:
my %lists; my @amazon = qw(hot tropical humid wet); my @sahara = qw(hot hot tropical big); $lists{"The Amazon is"} = \@amazon; $lists{"The Sahara is"} = \@sahara;


- Boldra

Replies are listed 'Best First'.
Re: I fear my code is unreadable
by moritz (Cardinal) on May 06, 2008 at 12:42 UTC
    The first step is to replace (sort {$b <=> $a } ...)[0] with
    use List::Util qw(max); max(...)

    If you want to remove the map from that line, introduce a temporary variable that holds the output of map.

Re: I fear my code is unreadable
by Fletch (Bishop) on May 06, 2008 at 12:46 UTC

    Probably could be simplified with List::Util's max to something like my $col_width = 2 + max( map length, keys %lists, map( @$_, values %lists ) );.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: I fear my code is unreadable
by jplindstrom (Monsignor) on May 06, 2008 at 14:06 UTC
    When people talk about self documenting code, this is what they mean: compare

    $something + 2; #2 is the padding

    with

    my $padding = 2; ... $something + $padding;

    Naming things right is one of the most important things in programming, and very often overlooked.

    I can't tell from the context whether $padding is the best name for what it does. For example, maybe it's utterly obvious which unit padding is in (columns?), maybe it's better named if you include the unit ($cm_padding).

    This is very, very often missing from variables containing some kind of time. As an example, what does $timeout mean?

    • Is it a boolean?
    • Is it timeout in seconds?
    • Is it timeout in minutes?
    • Is it timeout in days?

    The name could tell you all that ($is_timeout / $seconds_timeout etc).

    The same thinking goes for naming configuration variable names. Actually it's even more important in config files because there is usually no context there whatsoever. In the source, at least you've got the surrounding code to get clues from.

    This actually leads to another principle: the bigger the scope of a variable, the more important it is to get the name right, and the more explicit the name should be.

    That means it's almost okay to name a for loop index $i, because that's both an expected convention and a very small scope.

    Naming a global variable $i would be insane. The name needs to work in any context and it needs to stand on its own and carry meaning in all those contexts. In this case, $i doesn't mean anything.

    /J

Re: I fear my code is unreadable
by BrowserUk (Patriarch) on May 06, 2008 at 14:25 UTC

    I agree with everyone else that List::Util::max() is the way to go here, but for your readability concerns, the layout of your code can have a huge influence on its readability. A slight restructuring can allow you to loose some of the parens, and make things clearer:

    use constant PADDING => 2; my $col_width = PADDING + ( sort { $b <=> $a } map { length } map { ref eq 'ARRAY' ? @$_ : $_ } %lists )[ 0 ]; ## Longest key or value

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I like your formatting suggestions, although I'm not sure about " ref eq 'ARRAY' " as being more readable... Unfortunately List::Util and Text::Table aren't available on my target platforms.


      - Boldra

        Well, you could go with

        use constant PADDING => 2; my $col_width = PADDING + ( sort { $b <=> $a } map { length } keys %list, map { @$_ } values %lists )[ 0 ]; ## Longest key or value

        But in

        use constant PADDING => 2; my $col_width = PADDING + ( sort { $b <=> $a } map { length } map { ref ? @$_ : $_ } %lists )[ 0 ]; ## Longest key or value

        The line ref eq 'ARRAY' ? @$_ : $_ just says: if it($_) is a value (array) flatten it to a list, (which you already had in your own version), or if it is a key, pass it through untouched.

        That avoids having to treat the keys and values separately, (and iterate the hash twice to generate them). In doing so it reduces the noise level a little further.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        List::Util and Text::Table aren't available on my target platforms

        What platform is that? List::Util is in the standard library. Everyone has it unless someone has gone to pathological efforts to reduce the size of the perl installation.

Re: I fear my code is unreadable
by toolic (Bishop) on May 06, 2008 at 13:39 UTC
    For printing data in automatically sized column widths, I often use Text::Table. It may be worth considering.
Re: I fear my code is unreadable
by perrin (Chancellor) on May 06, 2008 at 17:34 UTC
    The list manipulation stuff is fun, but when you stack it too deep it becomes unreadable. The problem is that it has to be read from back to front to get the order right. To fix it, I'd just use more temp variables and more lines. Here's a shot at a rewrite:
    our $PADDING = 2; # config, so probably belongs at top of file my @phrases = ( keys %lists, map { @{$_} } values %lists ); my @lengths = map { length($_) } @phrases; @lengths = sort { $b <=> $a } @lengths; my $longest = shift @lengths; return $longest + $PADDING;
    It's a lot longer, but it scans quickly, and if you need to come back and modify one of the steps it will be obvious where to do it.
Re: I fear my code is unreadable
by FunkyMonk (Chancellor) on May 06, 2008 at 22:50 UTC
    I just think I've gone a bit too far
    My experience is that if you "think you've gone a bit too far" you know you've really gone way too far. Split your code up so that each line saves it's results into an array, and use that in the next step.

    Doing that will make the code

    • self documenting (as long as you use sensible variable names)
    • easier to read
    • easier to test
    • easier to debug
    • easier to understand
    • easier to change

    (And I'm as guilty as anyone else for making single statements too long and too complex)


    Unless I state otherwise, all my code runs with strict and warnings

      I found that, although "self documenting" code is rarely found, good, clean, well-structured, and well styled code is more easily read, and better maintained.

      I suggest reading Tom Christiansen's Perl Style talk, and the article that influenced it, Rob Pike's Notes on writing C for some ideas and guidelines about style.

      Stop saying 'script'. Stop saying 'line-noise'.
      We have nothing to lose but our metaphors.

Re: I fear my code is unreadable
by mscharrer (Hermit) on May 06, 2008 at 13:35 UTC
    I'm answering not to your col_width code but to the input code. As long you don't need the arrays for other things you could write the input like below. This is far more readable and easier to maintain.
    my %lists = ( 'The Amazon is' => [ qw(hot tropical humid wet) ], 'The Sahara is' => [ qw(hot hot tropical big) ], );
Re: I fear my code is unreadable
by pc88mxer (Vicar) on May 06, 2008 at 16:02 UTC
    Another thing you could do is to put in a comment about what the fragment of code does. I.e., something like:
    # Compute the maximum width of any key or value in %lists # and add a padding. my $col_width = ...
    or just take a few sentences to explain what's going on.

      ... or you could stick that fragment of code in a subroutine or method whose name describes what it does.