Re: Given When Syntax

by davido (Cardinal)
on Mar 15, 2014 at 18:11 UTC


in reply to Given When Syntax

I have run across a situation where I need a Switch statement.

That need is of your own creation. There is no situation where the lack of a switch statement will prohibit the project from moving forward with efficiency and clarity. Perl5 was a dozen years old before it got a switch statement (called given/when), and the addition of that syntax didn't make any application of the Perl language that previously would have been impossible suddenly become possible. In fact, Perl's switch statement has been mostly repealed (or flagged as experimental) in modern releases of Perl.

given/when is a programmer convenience, not a language necessity. Use if/elsif/else, or a hash table filled with code refs, or if you're concerned about efficiency, use an array-table filled with coderefs, and assign constants that refer to the elements. Even if you're trying to make your decision based on pattern matching, that's not difficult:

use constant { RE_TRIGGER => 0, CODEREF => 1, };

my @dispatch = (
    [ qr/this/,  \&do_that  ],
    [ qr/those/, \&do_these ],
);

DISPATCH: for ( @dispatch ) {
    $string =~ $_->[RE_TRIGGER] && do {
        $_->[CODEREF]->();
        last DISPATCH;
    };
}

In this example I'm populating an array with a series of patterns and actions to take if a given pattern matches. I have control over which match is tested first, because I'm using an array for the dispatch table. And I have control over whether to continue on to the next test, or abort upon success by the use (or omission) of last DISPATCH. I can even add a default by including a regex test as the last element in @dispatch that will always succeed.

Is it more clear than a series of if/elsif/else statements? I guess that depends on how many tests are being conducted. If it's just a few, chained if/elsifs are probably clearer. If there are a lot of options, a table works well.

If you make the first element a subref instead of a regexp object, you have even more control, and a more generalized solution.
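For instance, a minimal sketch of that generalized form might look like this (the handler subs and the sample input are hypothetical, defined only so the sketch is self-contained):

use constant { TRIGGER => 0, ACTION => 1 };

# Hypothetical handlers, just so the example runs on its own.
sub handle_number  { print "number: $_[0]\n" }
sub handle_long    { print "long line\n"     }
sub handle_default { print "no match\n"      }

my $string = '12345';

my @dispatch = (
    [ sub { $_[0] =~ /^\d+$/ },   \&handle_number  ],
    [ sub { length($_[0]) > 80 }, \&handle_long    ],
    [ sub { 1 },                  \&handle_default ],   # always-true default case
);

DISPATCH: for my $case ( @dispatch ) {
    if ( $case->[TRIGGER]->( $string ) ) {
        $case->[ACTION]->( $string );
        last DISPATCH;
    }
}

Here the trigger is an arbitrary predicate, so the same table can mix regex matches, numeric range checks, or anything else that returns true or false.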


Dave

Replies are listed 'Best First'.
Re^2: Given When Syntax
by Deep_Plaid (Acolyte) on Mar 15, 2014 at 20:51 UTC

    Hi Dave. Thank you very much for the examples and detailed explanations you provided. This was exactly what I was looking for, and although I didn't ask directly, I thought there must be a good reason why Perl doesn't have a native switch, and your explanation makes sense in this regard. I greatly appreciate your input! DP

Re^2: Given When Syntax
by Anonymous Monk on Mar 17, 2014 at 02:02 UTC
    While hashes in other languages are more expensive computationally, a hash lookup in Perl costs the same as an array lookup. Efficiency advantages of arrays manifest when using shift/unshift/push/pop. This is explicitly stated in 'Learning Perl.'

      use Benchmark 'cmpthese';
      use List::Util 'sum';

      our %hash = (
          one => 1, two   => 2, three => 3, four => 4, five => 5,
          six => 6, seven => 7, eight => 8, nine => 9, ten  => 10,
      );

      our @array = ( 1 .. 10 );

      use constant {
          ONE => 0, TWO   => 1, THREE => 2, FOUR => 3, FIVE => 4,
          SIX => 5, SEVEN => 6, EIGHT => 7, NINE => 8, TEN  => 9,
      };

      cmpthese( -5, {
          hash  => q/ my $n = sum( @hash{ 'one', 'two', 'three', 'four', 'five',
                                          'six', 'seven', 'eight', 'nine', 'ten' } ); /,
          array => q/ my $n = sum( @array[ ONE, TWO, THREE, FOUR, FIVE,
                                           SIX, SEVEN, EIGHT, NINE, TEN ] ); /,
      } );

      ...produces...

                 Rate  hash array
      hash  1673833/s    --  -46%
      array 3103809/s   85%    --

      I've got the 2nd and 6th editions of Learning Perl, and haven't been able to find anywhere in either edition where the book states that the efficiency advantages of arrays versus hashes manifest themselves when using shift/unshift/push/pop. Nor have I found anywhere in the book that states that a hash lookup in Perl costs the same as an array lookup. If you know of somewhere in that book I should be reading, please provide the exact quote, along with which edition I might find it in.

      I think you're probably referring to this paragraph:

      Some implementations of hashes (such as in the original awk language, where Larry borrowed the idea from) slow down as the hashes get larger and larger. This is not the case in Perl—it has a good, efficient, scalable algorithm. So, if a hash has only three key-value pairs, it’s very quick to “reach into the barrel” and pull out any one of those. If the hash has three million key-value pairs, it should be just about as quick to pull out any one of those. A big hash is nothing to fear.

      That paragraph makes an assertion that the computational complexity of hash inserts or lookups is approximately constant as the size of the hash grows toward infinity. And the same could be said of arrays. But order of growth says nothing about the constant per-operation cost: the work that stays the same regardless of how large the data set grows.

      To understand this concept, look at the following two simple scenarios:

      • Find the numeric value 100 in an unsorted array of unique integers: We know that as the array grows, the growth in computational complexity is linear.
      • Find the string "David" in an unsorted array of strings: We know that as the array grows, the growth in computational complexity is linear.

      So those are both O(n) searches. However, to find numeric 100, we do a series of numeric comparisons while iterating over the elements of the array. To find "David", we must go to the first element and see if the first character in the first element is a "D". If it is, then we compare the second character and see if it is an "a". Each letter of the string results in another char comparison... essentially an integer comparison for each character of the string (short-circuiting as soon as there's a mismatch)... repeated for each element in the data set.

      So it's pretty easy to see in this example that the string comparisons have an overhead per element that doesn't exist for the numeric comparisons.
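      If you want to see that constant-factor difference for yourself, here's a rough benchmark sketch. The data sets are made up purely for illustration; both searches are O(n), only the per-element cost differs:

      use Benchmark 'cmpthese';

      our @nums    = ( 1 .. 1000 );
      our @strings = map { "name_$_" } 1 .. 1000;

      cmpthese( -2, {
          numeric => q/ my ($hit) = grep { $_ == 1000 } @nums; /,
          string  => q/ my ($hit) = grep { $_ eq 'name_1000' } @strings; /,
      } );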

      Now look at what goes on in hash lookups versus array lookups. To look up a single element in an array, internally, Perl must first start with the base address of the array. Then it must add an integral offset to the array's base address. This offset will be the element number times the size of the element's structure. This is a simple mathematical calculation: base + offset * size. Once that calculation is performed, Perl can dereference a pointer contained in the element, and fetch the element's scalar entity.

      Now for a Perl hash. First, Perl must take the key, a string of some arbitrary length, and run it through a hashing algorithm to compute the bucket. That bucket may contain more than one element in the form of a linked list. The linked list would then be walked to arrive at the specific element being searched for. Fortunately the linked-list chains are always kept pretty short, so we don't worry about their growth. But now, instead of a simple base + offset * size equation, we feed a (usually) multi-byte string to a hashing algorithm, which gives us an offset, which leads us to a short chain of elements to scan through.
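      To make that extra work concrete, here's a toy string-hashing sketch. This is not Perl's actual hash function, just an illustration of the shape of the work: every character of the key has to be visited before we even know which bucket's chain to scan.

      sub toy_hash {
          my ( $key, $nbuckets ) = @_;
          my $h = 5381;
          for my $ch ( split //, $key ) {           # one pass over every character of the key
              $h = ( $h * 33 + ord($ch) ) % 2**32;
          }
          return $h % $nbuckets;                    # which bucket's chain to scan next
      }

      my $bucket = toy_hash( 'some_key', 8 );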

      So the array lookup has some constant overhead, and the hash lookup has some constant overhead (overhead that doesn't change as the size of the structure grows). But the amount of constant overhead for the array is smaller than the amount of constant overhead for the hash.

      We mostly don't worry about that. If we were looking for bleeding edge performance we would be using C or C++ to begin with. But the point to my mention of efficiency in this case was this: Sometimes function tables are used inside of tight loops. And infrequently, those tight loops become a performance problem. In such cases, switching from a hash-based dispatch table to an array-based one will bring some performance improvement.
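      A quick sketch of what that comparison looks like, with a deliberately trivial pair of dispatch tables (the table contents here are hypothetical, and the coderefs do almost nothing so the lookup itself dominates the timing):

      use Benchmark 'cmpthese';
      use constant { ADD => 0, SUBTRACT => 1 };

      our %h_table = ( add => sub { $_[0] + $_[1] }, subtract => sub { $_[0] - $_[1] } );
      our @a_table = (        sub { $_[0] + $_[1] },             sub { $_[0] - $_[1] } );

      cmpthese( -2, {
          hash  => q/ my $r = $h_table{add}->( 2, 3 ); /,
          array => q/ my $r = $a_table[ADD]->( 2, 3 ); /,
      } );

      The difference is usually small, and only matters when the dispatch sits in a genuinely hot loop.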

      The more important point is that if one needs to assure true "switch-like" behavior, where the order of comparisons is predictable, an array is more appropriate; a hash-based dispatch table doesn't guarantee a predictable order in which the cases will be tested for a match.
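      A small sketch of why: if the dispatch table is a hash keyed on patterns, the iteration order comes from Perl's internal hash ordering, which is not guaranteed (and is deliberately randomized in modern perls). The patterns and handlers below are hypothetical:

      my %by_pattern = (
          'foo'    => sub { 'matched foo' },
          'foobar' => sub { 'matched foobar' },
      );

      my $input = 'foobar baz';
      for my $pat ( keys %by_pattern ) {            # order varies from run to run
          if ( $input =~ /\Q$pat\E/ ) {
              print $by_pattern{$pat}->(), "\n";
              last;                                 # so which case "wins" is unpredictable
          }
      }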


      Dave

        The advantages of push/pop are mentioned in the second footnote on page 49 in the 6th edition:

        You could add new items to the end of an array by simply storing them as elements with new, larger indices. But real Perl programmers don’t use indices.*

        * Of course, we’re joking, but there’s a kernel of truth in this joke. Indexing into arrays is not using Perl’s strengths. If you use the pop, push, and similar operators that avoid using indexing, your code will generally be faster than if you use many indices, and you avoid “off-by-one” errors, often called “fencepost” errors. Occasionally, a beginning Perl programmer (wanting to see how Perl’s speed compares to C’s) will take, say, a sorting algorithm optimized for C (with many array index operations), rewrite it straightforwardly in Perl (again, with many index operations) and wonder why it’s so slow. The answer is that using a Stradivarius violin to pound nails should not be considered a sound construction technique.
