Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello perl monks.

I had written a script to show this

my $Hash = { "A" => ["HYU"], "B" => ["TU6"], "C" => [ "11", "09", "88", "2" ], "D" => [ "01", "11" ] };
display as thus:
A, B, C, D HYU TU6 11 01 09 11 88 2

Of course, the script works, however, I 'feel' the code is not optimized enough. I had profiled the script using Devel::NYTProf to see what i could change, but yet nothing IMHO.

Please, could 'borrow' my your pair of eyes, and see if there are things that could be change.

Below is my script.
use warnings; use strict; my $Hash = { "A" => ["HYU"], "B" => ["TU6"], "C" => [ "11", "09", "88", "2" ], "D" => [ "01", "11" ] }; # get headiing print join( ",\t" => sort keys %$Hash ), $/; # get the value with the highest numbers of element my $highest_item = 0; foreach ( keys %{$Hash} ) { my $value = scalar @{ $Hash->{$_} }; $highest_item = $value if $highest_item < $value; } # re-write the data given as AoA, with data needed my @final_data; foreach my $key ( sort keys %{$Hash} ) { push @final_data => [ map { $Hash->{$key}[$_] || qq[] } 0 .. ( $highest_item - 1 ) ] +; } # print out in expected output foreach my $pos ( 0 .. $#final_data ) { foreach ( 0 .. $highest_item - 1 ) { print $final_data[$_][$pos], "\t"; } print $/; }
Thanks

Replies are listed 'Best First'.
Re: Can this script be Optimized?
by kcott (Archbishop) on May 01, 2014 at 07:01 UTC

    There's a number of improvements you could make. Here's a (not necessarily complete) list:

    • Deferencing variables has an overhead. Use a hash instead of a reference to a hash.
    • You're getting a list of keys three times and sorting those lists twice. Just do all of this once.
    • There's a builtin module List::Util which provides a max() function. There's no need roll your own code for this. [As far as I know, all of the functions provided by that module are implemented in C, so you also get a performance bonus.] (See also: List::MoreUtils)
    • Transforming the hash into an AoA is unnecessary. All the required functionality can be achieved by just using the initial hash.
    • You've used a total of four foreach loops; in the code I've shown below there's only one. [for and foreach are synonymous: save yourself some typing by just using for]
    • You can use the -l (ell) command switch so you don't need $/ (or "\n") after each print (see perlrun: Command Switches).

    I took your original spec and produced the following script. While I kept to the spec, I made no any attempt to reproduce any parts of your script.

    #!/usr/bin/env perl -l use strict; use warnings; use List::Util qw{max}; my %Hash = ( "A" => ["HYU"], "B" => ["TU6"], "C" => [ "11", "09", "88", "2" ], "D" => [ "01", "11" ] ); my @keys = sort keys %Hash; print join ",\t" => @keys; { no warnings 'uninitialized'; for my $i (0 .. max map { $#{$Hash{$_}} } @keys) { print join "\t" => map { $Hash{$_}[$i] } @keys; } }

    Output:

    A, B, C, D HYU TU6 11 01 09 11 88 2

    Using tabs for the output worked in this instance; however, with other data, columns may be out of alignment. Also, tab widths can vary; so what looks fine here may be misaligned elsewhere. Consider using printf to format your output.

    Also have a read of "Perl Performance and Optimization Techniques" for more tips you can use in your other scripts.

    -- Ken

      "As far as I know, all of the functions provided by that module are implemented in C, so you also get a performance bonus."

      Indeed, they are. List::Util also happens to make very efficient use of Perl's API (e.g. using multicall), allowing its functions to perform at speeds approaching Perl's built-in list operators (grep, map, sort).

      "See also: List::MoreUtils"

      This is also written in C (and also makes use of multicall), but has a fallback Perl implementation allowing it to be installed on machines lacking a C compiler.

      If a function exists in both List::Util and List::MoreUtils (and because recent List::Util releases have been adding new functions, the overlap between them is growing), then prefer the List::Util one because it will be guaranteed to be implemented in C.

      use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

        Thanks for the feedback.

        I had seen the C and pure-Perl implementation information in List::MoreUtils [from CPAN]; however, the List::Util [from perldoc.perl.org] documentation made no mention of this: hence the "As far as I know" qualifier.

        I was a little surprised when you mentioned there was an overlap between these two modules as I wasn't aware of this. I did a little investigation and found:

        http://perldoc.perl.org/List/Util.html

        Indicates Perl v5.18.2 and only shows these functions: first max maxstr min minstr reduce shuffle sum. It does show any, all, et al, which currently exist in List::MoreUtils, as suggested additions which haven't been included. (I couldn't find any mention of a module version number.)

        http://search.cpan.org/~pevans/Scalar-List-Utils-1.38/lib/List/Util.pm

        This does show any, all, et al as being included.

        $ perldoc List::Util

        This appears to be the same doco as the CPAN version. Checking my versions: List::Util 1.38 and Perl 5.18.1

        I have, up until now, used the perldoc.perl.org documentation for all builtin modules. It's clear that this is out-of-date for List::Util (i.e. it's not the POD that ships with 5.18.2); unfortunately, this leave me wondering what other parts of its doco aren't up-to-date.

        -- Ken

      In the interest of education, map is a "feature enhanced" version of for.

      map { expr; } @array; is equivalent to:

      for (@array) { push @newarray, expr; }

      Presumably, expr uses $_ either implicitly or explicitly.

      Similarly, grep { expr; } @array; is equivalent to:

      for (@array) { push @newaray, $_ if expr; }

        Actually, I'd say a closer equivalency would be

        my @newarray = map { expr } @array;

        and

        my @newarray; for (@array) { push @newarray, expr; }

        Or, conversely

        map { expr } @array;

        and

        for (@array) { expr; }

        or

        expr for @array;

        Anyway, as that was a reply to my post, were you suggesting I replace a map with a for, or vice versa? Or, perhaps, something else?

        -- Ken

Re: Can this script be Optimized?
by NetWallah (Canon) on May 01, 2014 at 05:49 UTC
    How about this:
    print join( ",\t" => my @kk=sort keys %$Hash ), $/; my $row = 0; my $items_in_row; do { $items_in_row = 0; my $v; for my $k (@kk){ if (defined ($v = $Hash->{$k}[$row])) { $items_in_row ++; }else{ $v=""; } print "$v\t" } print "$/"; $row++; } until $items_in_row == 0;
    Produces the same output as your code.

    Update: Another version:

    print join( ",\t" => my @kk=sort keys %$Hash ), $/; my $items_in_row; my $row=0; do { $items_in_row = 0; my $v; print +(map{ if ( defined ($v = $Hash->{$_}[$row]) ){ $items_in_row++ ; "$v\t" }else{ "\t" } } @kk ), "$/"; $row++; } until $items_in_row == 0;

            What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?
                  -Larry Wall, 1992

      Thanks alot NetWallah, I knew there was no need for all the multiples for loops I was using.
      Timtoday takes the day.

        OK - one final (short/cryptic) version:
        print join( ",\t" => my @kk=sort keys %$Hash ), $/; for (my $i=0; scalar grep {+defined} (my @v=map {$Hash->{$_}[$i] } @kk + ) ; $i++){ print join ("\t", map {defined $_ ? $_:""}@v),"$/"; }

                What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?
                      -Larry Wall, 1992