A novice breaks his silence, seeking wisdom in the typing of many monkees...

I am after recommendations for a structural rewrite to improve performance: priorities are 0) maintainable 1) max speed, 2) min memory. This is in a data-processing context, not web pages etc.

Problem: read a number of input items (500K items), transform them into output items and save them into load tables in the database.

Focus: optimal loop structure and transformation.

Current overview
    load all items into array of hash
    for each input item {
        complex transformation rules
    }
    save all items (timestamped)
    remove all items with old timestamp

Some alternatives I am testing for the loop:

OUTER-IF:

    if ( $type eq 'apple' ) {
        for my $item ( @items ) {
            apple_wash($item);
            apple_core($item);
            apple_pulp($item);
        }
    }
    elsif ( $type eq 'banana' ) {
        for my $item ( @items ) {
            banana_bend($item);
            banana_hang($item);
        }
    }
    else {
        die "bad fruit: $type";
    }

INNER-IF:

    for my $item ( @items ) {
        if ( $type eq 'apple' ) {
            apple_wash($item);
            apple_core($item);
            apple_pulp($item);
        }
        elsif ( $type eq 'banana' ) {
            banana_bend($item);
            banana_hang($item);
        }
        else {
            die "bad fruit: $type";
        }
    }

ARRAY OF SUB-REF:

    my @funcs;
    if ( $type eq 'apple' ) {
        push @funcs, \&apple_wash, \&apple_core, \&apple_pulp;
    }
    elsif ( $type eq 'banana' ) {
        push @funcs, \&banana_bend, \&banana_hang;
    }
    else {
        die "bad fruit: $type";
    }
    for my $item ( @items ) {
        for my $func ( @funcs ) {
            $func->($item);
        }
    }

SYMBOL-TABLE FIDDLE:

    if ( $type eq 'apple' ) {
        *func1 = \&apple_wash;
        *func2 = \&apple_core;
        *func3 = \&apple_pulp;
    }
    elsif ( $type eq 'banana' ) {
        *func1 = \&banana_bend;
        *func2 = \&banana_hang;
        *func3 = \&noop;
    }
    else {
        die "bad fruit: $type";
    }
    for my $item ( @items ) {
        func1($item) unless \&func1 == \&noop;
        func2($item) unless \&func2 == \&noop;
        func3($item) unless \&func3 == \&noop;
    }

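One further variant worth testing is a hash-based dispatch table. It keeps the single loop of ARRAY OF SUB-REF but replaces the if/elsif chain with a lookup. This is only a sketch: the step subs below are hypothetical stand-ins for the real transformation rules, and %steps_for and transform_all are names invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real transformation steps.
sub apple_wash  { $_[0]{washed} = 1 }
sub apple_core  { $_[0]{cored}  = 1 }
sub apple_pulp  { $_[0]{pulped} = 1 }
sub banana_bend { $_[0]{bent}   = 1 }
sub banana_hang { $_[0]{hung}   = 1 }

# One entry per fruit type; each value is the ordered list of steps.
my %steps_for = (
    apple  => [ \&apple_wash, \&apple_core, \&apple_pulp ],
    banana => [ \&banana_bend, \&banana_hang ],
);

sub transform_all {
    my ( $type, $items ) = @_;

    # The type is resolved once, outside the loop.
    my $steps = $steps_for{$type} or die "bad fruit: $type";

    for my $item (@$items) {
        $_->($item) for @$steps;
    }
    return $items;
}

my @items = ( {}, {} );
transform_all( 'apple', \@items );
```

Adding a new fruit then means adding one line to the table rather than another branch, which may help with the maintainability priority. Whether it is any faster than the array version would need measuring.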
Specifics: I have profiled some test code with these four approaches -

OUTER-IF is (slightly) the fastest, but unwieldy in the real code: hundreds of infrastructural lines are omitted from the example, and they make the duplicated FOR loops hard to maintain.

INNER-IF is also fast, but results in long, difficult-to-maintain code inside the FOR loop.

ARRAY OF SUB-REF: slower than the IFs (is that the cost of dereferencing the function ref?), but it makes the huge FOR loop much clearer.
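To see how much of that slowdown is really the call through a reference (rather than, say, the extra inner loop), the two styles can be compared in isolation with the core Benchmark module. This is a rough sketch, not the profiling I actually ran: step_a and step_b are hypothetical, deliberately cheap steps so that call overhead dominates, and the item count is arbitrary.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical, trivially cheap steps so that call overhead dominates.
sub step_a { $_[0]{n}++ }
sub step_b { $_[0]{n} += 2 }

my @funcs = ( \&step_a, \&step_b );

# Direct named calls, as in the IF variants.
sub run_direct {
    my @items = map { +{ n => 0 } } 1 .. 1000;
    for my $item (@items) {
        step_a($item);
        step_b($item);
    }
    return \@items;
}

# Calls through an array of sub refs.
sub run_subref {
    my @items = map { +{ n => 0 } } 1 .. 1000;
    for my $item (@items) {
        $_->($item) for @funcs;
    }
    return \@items;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, { direct => \&run_direct, subref => \&run_subref } );
```

With real transformation steps doing real work, the relative gap is likely to shrink, so it is worth rerunning this with representative subs before deciding.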

SYMBOL-TABLE FIDDLE: I am new to the symbol table, and my symbol-table code looks poor. Is there better syntax for this? It also generates 'Redefined function XXX' warnings.
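On the warnings: as far as I can tell they belong to the 'redefine' warnings category, which can be switched off lexically around the glob assignments. A minimal sketch, assuming the re-binding is intentional; bind_steps and step1 are hypothetical names, and the two step subs are stand-ins for the real ones.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real transformation steps.
sub apple_wash  { $_[0]{washed} = 1 }
sub banana_bend { $_[0]{bent}   = 1 }

sub bind_steps {
    my ($type) = @_;

    # Re-binding step1 is deliberate, so silence only that warning,
    # and only inside this sub.
    no warnings 'redefine';

    if    ( $type eq 'apple' )  { *step1 = \&apple_wash }
    elsif ( $type eq 'banana' ) { *step1 = \&banana_bend }
    else                        { die "bad fruit: $type" }
    return;
}

bind_steps('apple');
my $apple = {};
step1($apple);     # currently bound to apple_wash

bind_steps('banana');
my $banana = {};
step1($banana);    # now bound to banana_bend
```

All other warnings stay enabled, and the calls inside the loop remain plain named calls.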

Oh monks of Perl, I seekest thy wisdom... TIA Jeff

edited: Tue Dec 17 15:23:04 2002 by jeffa - title truncation (was: performance - loops and complex decisions, sub refs, symbol tables, inner and outer if/elses)

update 2 (broquaint): added <readmore> tag

In reply to performance - loops, sub refs, symbol tables, and if/elses by jaa
