PerlMonks  

Re^3: Given When Syntax

by Laurent_R (Canon)
on Mar 16, 2014 at 18:03 UTC ( [id://1078533]=note )


in reply to Re^2: Given When Syntax
in thread Given When Syntax

In addition to the errors that have already been pointed out to you (especially the fact that you don't have a test2 subroutine), please note that you should pass a string as a parameter to your sub:
test2(1.00.000);
should be:
test2("1.00.000");
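To see why the quotes matter: a bare literal with two or more dots, such as 1.00.000, is parsed by Perl as a v-string (a string built from the character codes 1, 0 and 0), not as the text "1.00.000". A minimal sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $bare   = 1.00.000;      # parsed as a v-string: chr(1) . chr(0) . chr(0)
my $quoted = "1.00.000";    # an ordinary eight-character string

# %vd prints the character codes of a string joined by dots.
printf "bare:   length %d, character codes %vd\n", length($bare), $bare;
printf "quoted: length %d\n", length($quoted);

# Only the quoted version starts with the character "1".
print "bare matches /^1/\n"   if $bare   =~ /^1/;
print "quoted matches /^1/\n" if $quoted =~ /^1/;
```

So the unquoted call would hand the sub three control characters, and none of the /^1/, /^2/ or /^3/ tests could ever match.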
I would also submit that this:
sub test2 {
    my ($var2) = @_;
    return ("One")   if ($var2 =~ /^1/ );
    return ("Two")   if ($var2 == /^2/ );
    return ("Three") if ($var2 == /^3/ );
    return undef;
}
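is not correct for input values starting with 2 and 3, and is not very efficient, either in terms of performance or in terms of coding simplicity. The == operator compares numbers; it does not bind the regex to $var2, so the second and third tests never do what you intended. The immediate fix is to replace == with =~ for cases 2 and 3: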
sub test2 {
    my ($var2) = @_;
    return ("One")   if ($var2 =~ /^1/ );
    return ("Two")   if ($var2 =~ /^2/ );
    return ("Three") if ($var2 =~ /^3/ );
    return undef;
}
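To make the original error concrete: in `$var2 == /^2/`, the regex is not applied to $var2 at all. It is matched against $_, and the true/false result of that match is then compared numerically with $var2. A small sketch (the sub name buggy_test is made up for illustration, and warnings are switched off inside it only to keep the output readable):

```perl
use strict;
use warnings;

sub buggy_test {
    my ($var2) = @_;
    no warnings;    # would otherwise warn about undef $_ and non-numeric strings
    local $_;       # $_ is undef here, so /^2/ fails and yields an empty string (numerically 0)
    return "Two" if $var2 == /^2/;    # really means: $var2 == ($_ =~ /^2/)
    return undef;
}

# "2.00.000" is numerically 2, the other two inputs are numerically 0.
for my $input ("2.00.000", "0.00.000", "abc") {
    my $result = buggy_test($input);
    printf "%-10s => %s\n", $input, defined($result) ? $result : "undef";
}
```

With $_ undefined, the input starting with 2 is rejected, while inputs that are numerically zero are reported as "Two".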
Note that Marshall corrected these two errors, but I thought it would be useful to point these out to you for your benefit. An additional improvement would be to remove the triple regex and to extract the first digit from the string only once:
sub test2 {
    my $var2 = substr shift, 0, 1;
    return ("One")   if $var2 == 1 ;
    return ("Two")   if $var2 == 2 ;
    return ("Three") if $var2 == 3 ;
    return undef;
}
Doing the extraction of the first digit only once is cleaner, removes the risk of the error I pointed out just above and is likely to be faster if that matters (although it is true that an anchored regex is pretty fast). And it paves the way for yet another improvement, the use of an array rather than multiple evaluations. The full program may now be this:
use strict;
use warnings;

my @translation = qw / Zero One Two Three/;

sub test2 {
    return $translation[(substr shift, 0, 1)];
}

print test2("1.00.000");
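One thing to keep in mind with the array version is that it trusts the first character to be a digit within the range of the array. A possible variation, offered only as a sketch and not taken from the thread, is a hash keyed on the first character, which returns undef for anything unexpected, as the if-chain does. The names %name_for and first_digit_name are invented for this example:

```perl
use strict;
use warnings;

# Lookup table keyed on the first character of the input.
my %name_for = (1 => 'One', 2 => 'Two', 3 => 'Three');

sub first_digit_name {
    my ($input) = @_;
    return $name_for{ substr($input, 0, 1) };    # undef when the key is absent
}

for my $input ("1.00.000", "3.01.000", "9.99.999", "x.yz") {
    my $name = first_digit_name($input);
    printf "%-10s => %s\n", $input, defined($name) ? $name : "undef";
}
```

Whether the hash is as fast as the array would have to be measured; the point here is only the handling of unexpected input.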
Now, assuming you have a very large amount of data and performance matters, we may want to benchmark this against your (corrected) triple regex version and an intermediate solution extracting the first digit only once:
use strict;
use warnings;
use Benchmark qw/cmpthese/;

my @translation = qw / Zero One Two Three/;

sub test1 {
    my $var2 = shift;
    return ("One")   if ($var2 =~ /^1/ );
    return ("Two")   if ($var2 =~ /^2/ );
    return ("Three") if ($var2 =~ /^3/ );
    return undef;
}

sub test2 {
    my $var2 = substr shift, 0, 1;
    return ("One")   if ($var2 == 1 );
    return ("Two")   if ($var2 == 2 );
    return ("Three") if ($var2 == 3 );
    return undef;
}

sub test3 {
    return $translation[(substr shift, 0, 1)];
}

cmpthese( -1, {
    test_1 => sub {test1("3.01.000")},
    test_2 => sub {test2("3.01.000")},
    test_3 => sub {test3("3.01.000")},
} )
which gives the following results:
$ perl test_if.pl
            Rate test_1 test_2 test_3
test_1 1294050/s     --   -11%   -51%
test_2 1451608/s    12%     --   -45%
test_3 2642856/s   104%    82%     --
As you can see, the array solution is about twice as fast. Having said that, performance is often not that important (the code is usually fast enough anyway), and I quite regularly use solutions similar to Marshall's proposals.

Replies are listed 'Best First'.
Re^4: Given When Syntax
by tobyink (Canon) on Mar 16, 2014 at 18:17 UTC

    You're using a constant input which starts with a "3" though, which unfairly penalizes test1 and test2 (it's the final situation they check for). For inputs starting with a "1", test3 is still the fastest, but the difference between it and the other tests is much smaller.

    Also, I'd recommend running your benchmarks like this:

    cmpthese(-1, {
        test_1 => q{ test1("3.01.000") },
        test_2 => q{ test2("3.01.000") },
        test_3 => q{ test3("3.01.000") },
    });

    ... using q{ ... } instead of sub { ... }. If you use sub { ... } you're wrapping each iteration in an extra sub call layer. For micro-optimization benchmarks like this, that extra layer can make a significant difference to the results.
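To check how much the extra sub call layer matters on a given machine, the same call can be benchmarked in both forms side by side. This is only a sketch: the sub name lookup and the labels as_string and as_sub are made up here, and the rates will differ from system to system.

```perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;

# Stand-in for the function under test (same idea as test3 in the parent post).
my @translation = qw/Zero One Two Three/;
sub lookup { return $translation[(substr shift, 0, 1)] }

# Same call, benchmarked once as a code string and once as an anonymous sub.
# cmpthese prints the comparison chart and also returns its rows.
my $rows = cmpthese( -1, {
    as_string => q{ lookup("3.01.000") },
    as_sub    => sub { lookup("3.01.000") },
} );
```

The code string is evaluated in the caller's package, so it can call named subs such as lookup, but it cannot see the caller's lexical (my) variables directly.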

    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

      You're using a constant input which starts with a "3" though, which unfairly penalizes test1 and test2 (it's the final situation they check for). For inputs starting with a "1", test3 is still the fastest, but the difference between it and the other tests is much smaller.

      Yes, you are absolutely right, tobyink: I did unfairly penalize test1 and test2, and I did it deliberately. In a real situation, I would assume that the first digit in the input can take any value between 1 and 9 (and possibly 0), so that having a match at the third value actually gives an unfair advantage to test1 and test2. Having said that, with three possible values, matching at the second value should be fair if values are more or less equally distributed. Changing my benchmark test to:

      cmpthese( -1, {
          test_1 => sub {test1("2.01.000")},
          test_2 => sub {test2("2.01.000")},
          test_3 => sub {test3("2.01.000")},
      } )
      I obtain the following result:
      $ perl test_if.pl
                  Rate test_1 test_2 test_3
      test_1 1451608/s     --    -8%   -46%
      test_2 1578202/s     9%     --   -41%
      test_3 2667353/s    84%    69%     --
      which still shows a very clear-cut advantage to the array solution.
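The reasoning about which input is "fair" can be checked by counting how many tests the if-chain has to run before it returns. A rough sketch, where tests_needed is a hypothetical helper that mimics the chain rather than the real test1:

```perl
use strict;
use warnings;

# Hypothetical helper: number of regex tests the if-chain runs for an input.
sub tests_needed {
    my ($input) = @_;
    my $count = 0;
    for my $digit (1 .. 3) {
        $count++;
        return $count if $input =~ /^$digit/;
    }
    return $count;    # fell through all three tests
}

# One input per possible first digit, i.e. an even distribution.
my @inputs = ("1.00.000", "2.01.000", "3.01.000");
my $total  = 0;
$total += tests_needed($_) for @inputs;
printf "average tests per call: %.1f\n", $total / @inputs;    # prints 2.0
```

With evenly distributed inputs the chain runs two tests on average, which is the cost of an input starting with 2. The array lookup does the same amount of work whatever the first digit is.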

      As for using

      cmpthese(-1, {
          test_1 => q{ test1("3.01.000") },
          test_2 => q{ test2("3.01.000") },
          test_3 => q{ test3("3.01.000") },
      });
      I was not aware of the possibility of doing it this way. Thank you for the information; I'll investigate this further. I doubt, though, that it really makes a huge difference: a factor of two between one solution and the others is not exactly what I would call micro-optimization.

        I think any improvement - even a ten-fold speed-up - to a sub that can be run 1.5 million times a second counts as a micro-optimization. ;-)

        use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name
Re^4: Given When Syntax
by Deep_Plaid (Acolyte) on Mar 16, 2014 at 18:20 UTC

    Hello again, Laurent. Let me just start by saying "I'm not worthy! I'm not worthy!" This is great stuff. I had asked about performance and you replied. This is huge because the amount of data I'm dealing with is significant. I probably won't be able to fully examine your notes until later today or tomorrow (I'm under some deadlines), but just wanted to let you know your contribution is highly valued. Hope you are having a smashing weekend. Cheers, DP.
