I have a fairly large perl-prog whose purpose is to translate some hefty (300mb+) textfiles into a ready-to-load-via-bcp ms-sql db.

Anyway, due to coming across more and more bugs in the input file, my code grew and grew to handle the various bits of duff data.

Consequently, I found that:

So... I tidied up the code, reduced the input reads to 1, and then set about doing some benchmarking to find out whether 'last LABEL' was a liability, and if so what the best alternative was.

The 'bad' news was that reducing the reads on the input file made comparatively little difference - I say 'bad' in inverted commas because, presumably, all this indicates is that perl is pretty efficient at reading a file, so that, provided the file is large enough, the following snippets of code are roughly equivalent:

    open(IN, "bigfile.txt");
    while(<IN>){
        &big_sub1($_);
    }
    open(IN, "bigfile.txt");
    while(<IN>){
        &big_sub2($_);
    }

Versus

    open(IN, "bigfile.txt");
    while(<IN>){
        &big_sub1($_);
        &big_sub2($_);
    }
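For anyone wanting to repeat the comparison, here is a minimal sketch of how the two read strategies could be timed against each other. It is not the code from my real program: the sample file, the iteration count, and the bodies of big_sub1 and big_sub2 are all made-up stand-ins, and a file this small will mostly be served from the OS cache, so treat any numbers it prints as illustrative only.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Temp qw(tempfile);

# Hypothetical stand-ins for the real per-line handlers.
my ($count1, $count2) = (0, 0);
sub big_sub1 { $count1 += length $_[0] }
sub big_sub2 { $count2++ if $_[0] =~ /7/ }

# Build a small sample input file (an assumption; the real file was 300mb+).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_ of sample data\n" for 1 .. 20_000;
close $fh or die "close $path: $!";

# Read the file twice, one handler per pass.
sub two_pass {
    for my $handler (\&big_sub1, \&big_sub2) {
        open my $in, '<', $path or die "open $path: $!";
        while (my $line = <$in>) { $handler->($line) }
        close $in;
    }
}

# Read the file once, calling both handlers on each line.
sub one_pass {
    open my $in, '<', $path or die "open $path: $!";
    while (my $line = <$in>) {
        big_sub1($line);
        big_sub2($line);
    }
    close $in;
}

timethese(50, { two_pass => \&two_pass, one_pass => \&one_pass });
```

If the handlers do a lot of work per line, the cost of the extra read should shrink relative to the total, which would be consistent with what I saw.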

Moving on from that to the 'last LABEL' issue, well, yep, 'last LABEL's are bad news.

Here's the code and output I used for my test:

    use Benchmark;

    sub sub1(){
        my $val = 3;
        TTEST: {
            if($val == 1){last TTEST;}
            if($val == 2){last TTEST;}
            if($val == 3){last TTEST;}
            if($val == 4){last TTEST;}
        }
    }

    sub sub2(){
        my $val = 3;
        if($val == 1){}
        if($val == 2){}
        if($val == 3){}
        if($val == 4){}
    }

    sub sub3(){
        my $val = 3;
        if($val == 1){}
        elsif($val == 2){}
        elsif($val == 3){}
        elsif($val == 4){}
    }

    my $codehash = {'sub1' => \&sub1, 'sub2' => \&sub2, 'sub3' => \&sub3};
    timethese(5000000, $codehash);

And here's the (shortened) benchmark output:

    sub1: 13 wallclock secs (12.80 CPU)
    sub2:  8 wallclock secs ( 8.24 CPU)
    sub3:  6 wallclock secs ( 6.66 CPU)

Two things surprised me about this:

  1. How very bad 'last LABEL' is.
  2. Hell, it's even worse than several 'ifs'
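To put those figures in perspective, the CPU times can be turned into a per-call cost and a ratio against the fastest variant. The sketch below uses only the three CPU times and the iteration count from the run above; the helper names per_call_us and ratio_to are made up for illustration. Bear in mind the per-call figure also includes the cost of calling the sub, not just the conditionals inside it.

```perl
use strict;
use warnings;

# Figures taken from the benchmark output above.
my $iterations = 5_000_000;
my %cpu_secs   = (sub1 => 12.80, sub2 => 8.24, sub3 => 6.66);

# CPU time per call, in microseconds (hypothetical helper).
sub per_call_us {
    my ($name) = @_;
    return $cpu_secs{$name} / $iterations * 1e6;
}

# How many times slower $name is than $base (hypothetical helper).
sub ratio_to {
    my ($name, $base) = @_;
    return $cpu_secs{$name} / $cpu_secs{$base};
}

for my $name (sort keys %cpu_secs) {
    printf "%s: %.2f us per call, %.2fx the time of sub3\n",
        $name, per_call_us($name), ratio_to($name, 'sub3');
}
```

So 'last LABEL' comes out at roughly 2.6 microseconds per call against roughly 1.3 for the elsif chain, i.e. nearly twice the time.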

Now, just to check what was happening, I set $val to 1, and even then the 'last LABEL' construct was still slower than multiple 'ifs', despite the fact that 'last LABEL' skips the other conditions. Heh, any chance of adding a "your code is crap" message when running under -w if you use labels?
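Here is a sketch of how that second check could be run for both values in one go, using cmpthese from the same Benchmark module to print a comparison table. It differs from my test code above in a few ways that are my own assumptions: the subs take $val as an argument, each returns the branch it matched (so the three variants can be checked against each other), and the iteration count is lower. A plausible explanation for the result is that perl treats a bare block as a loop that executes once, so entering it and leaving via 'last' carries loop set-up and tear-down cost that a plain chain of conditionals avoids; I have not profiled this to confirm it.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Labelled bare block, leaving early with 'last'.
sub with_last {
    my ($val) = @_;
    my $hit = 0;
    TTEST: {
        if ($val == 1) { $hit = 1; last TTEST }
        if ($val == 2) { $hit = 2; last TTEST }
        if ($val == 3) { $hit = 3; last TTEST }
        if ($val == 4) { $hit = 4; last TTEST }
    }
    return $hit;
}

# Independent 'if' statements: every condition is always tested.
sub with_ifs {
    my ($val) = @_;
    my $hit = 0;
    if ($val == 1) { $hit = 1 }
    if ($val == 2) { $hit = 2 }
    if ($val == 3) { $hit = 3 }
    if ($val == 4) { $hit = 4 }
    return $hit;
}

# if/elsif chain: testing stops at the first match.
sub with_elsif {
    my ($val) = @_;
    my $hit = 0;
    if    ($val == 1) { $hit = 1 }
    elsif ($val == 2) { $hit = 2 }
    elsif ($val == 3) { $hit = 3 }
    elsif ($val == 4) { $hit = 4 }
    return $hit;
}

# Compare the variants for an early match (1) and a late match (3).
for my $val (1, 3) {
    print "val = $val\n";
    cmpthese(1_000_000, {
        last_label  => sub { with_last($val) },
        plain_ifs   => sub { with_ifs($val) },
        elsif_chain => sub { with_elsif($val) },
    });
}
```

The extra closure call adds the same overhead to every variant, so it should narrow the relative differences without changing the ordering.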

Anyway, that's me done. Not so much a meditation, more of an aimless ramble...

Tom Melly, tom@tomandlu.co.uk

In reply to A Luser's Benchmarking Tale by Melly
