Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello again,

Everyone was so helpful the other day in regards to my thread regarding performance revisions, I thought I would ask another.

I have the following code, and I'm curious if it can be made more efficient.

BLOCK 1 sub CONVERT { # record->[1] is the key for the hash of arrays foreach my $record (@Data) { $record->[1] =~ s/\s+//g; $record->[1] = join ("_", $Map{$record->[1]}[1], $Map{$record->[1] +}[0]); } } BLOCK 2 foreach my $record (@Data ) { for my $i (0 .. 4) { print DATA $record->[$i] . " "; } print DATA "\n"; }
These are consuming the most time according to -d:SmallProf

Thanks again for the help

2006-03-28 Retitled by planetscape, as per Monastery guidelines
Original title: 'performance'

Replies are listed 'Best First'.
Re: advice on improving performance of this code
by gjb (Vicar) on Jan 03, 2003 at 21:43 UTC

    Hi again ;-)

    In the first block I'd replace

    $record->[1] = join ("_", $Map{$record->[1]}[1], $Map{$record->[1]}[0] +);
    with
    $record->[1] = $Map{$record->[1]}[1] . '_' . $Map{$record->[1]}[1];

    In the second block I'd avoid the iteration by using join:

    print DATA join(" ", @$record[0..4]), "\n";
    rather than
    for my $i (0 .. 4) { print DATA $record->[$i] . " "; } print DATA "\n";
    or even
    print DATA join(" ", @$_[0..4]), "\n" for (@Data);
    that does the whole block in one line.

    You can find the code I used to do the benchmarks on my scrachpad.

    Hope this helps, -gjb-

      Update

      Upon testing $, tests 25% faster that join in the best case to 50% SLOWER in many cases! Presumably it relates to being faster to pass one argument to print (after the join) rather than many. Suprised me. Credit gjb for this pickup. I will be using join in future

      You can set $, to ' ' and ditch the join as well which will be faster again. Manipulating $, and $" can often useful in controlling output. local ensures no suprises elsewhere in the code.

      { local $, = ' '; print @{$_}[0..4], "\n" for @Data; }

      Note that setting $, and doing it in one line like this has the side effect of adding a ' ' after the last element and before the "\n". This can be rectified with a separate call to print for the "\n" if required. print() is actually quite slow so it may be faster to do this:

      { local $, = ' '; print @$_[0..3], $$_[4] ."\n" for @Data; }

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      I appreciate all the advice, and especially found gjb's suggestions helpful.

      I took a look at your scratchpad, and noticed a significant improvement over my old code.

      Thanks again!

Re: advice on improving performance of this code
by jdporter (Paladin) on Jan 03, 2003 at 21:35 UTC
    That one line in block 1 could be written as:   $record->[1] = join '_', @{ $Map{ $record->[1] } }[1,0]; Block 2 could be written as:
    local $, = ' '; local $\ = "\n"; foreach my $record ( @Data ) { print OUT @{ $record->[$i] }[0..4]; }
    But I'm not sure how much those will impact performance...

    (PS - I'm not so sure using DATA as an output filehandle is such a great idea...)

    jdporter
    The 6th Rule of Perl Club is -- There is no Rule #6.

Re: advice on improving performance of this code
by webengr (Pilgrim) on Jan 03, 2003 at 22:17 UTC
    This is more of a question than a recommendation: would using "tr" instead of the regex substitution in the first block would be more efficient?

    PCS

      Yes, but then you have to replace the '\s' with the equivalent escapes something like tr/\t\r\f\n //d to match all of what \s means. (assuming no locales or unicode)


      Fun Fun Fun in the Fluffy Chair