in reply to Re^4: rough approximation to pattern matching using local (Multi Subs)
in thread rough approximation to pattern matching using local

Is it slower?

What penalty does it impose over two (or more), separate (differently named) subroutines?

Ie. Is the syntactic sugar worth the implementation costs?

A simple test I thought might be useful:

my $t;
sub a {}; sub b {};
multi sub c (1) {};
multi sub c (2) {};
for ^7 {
    my $iterations = ^10 ** $_;
    say $iterations.max ~ " calls";
    $t = now; for ^$iterations { $_ %% 2 ?? a() !! b() }; say "regular: {now - $t}";
    $t = now; for ^$iterations { c($_ %% 2 ?? 1 !! 2) };  say "multi: {now - $t}";
}

With a recent Rakudo on MoarVM:

1 calls        regular: 0.00630924   multi: 0.0046663
10 calls       regular: 0.00344620   multi: 0.00392575
100 calls      regular: 0.0056468    multi: 0.0079017
1000 calls     regular: 0.0238749    multi: 0.02800533
10000 calls    regular: 0.0541161    multi: 0.287505
100000 calls   regular: 0.43776129   multi: 2.39229575
1000000 calls  regular: 4.45541020   multi: 24.2623499

Unfortunately the numbers are clearly pretty useless.

I hope to follow up this coming week when I've got a better handle on how to usefully answer your question. I'm thinking I'll install perl6-bench, which has robust logic for squeezing noise (startup, GC, etc.) out of results. Setting the noise in the numbers aside, perhaps you could confirm that the basic benchmark I created is, in principle, a reasonable test for answering your question?

Update

I believe the "benchmark" I used above caused confusion in this thread due to its inclusion of implicit `where` constraints (where the argument had the value 1 or 2). These literal fixed value `where` constraints are supposed to one day be resolvable at compile-time but as I write this (April 2015) they are resolved at run-time.

So here's an alternative "benchmark":

my $t;
class A {}; class B {};
sub a (A $a) {}; sub b (B $a) {};
multi sub c (A $a) {};
multi sub c (B $a) {};
for ^7 {
    my $iterations = ^10 ** $_;
    say $iterations.max ~ " calls";
    $t = now; for ^$iterations { $_ %% 2 ?? a(A) !! b(B) }; say "regular: {now - $t}";
    $t = now; for ^$iterations { c($_ %% 2 ?? A !! B) };   say "multi: {now - $t}";
}

And the timing results:

1 calls        regular: 0.00321471   multi: 0.00306454
10 calls       regular: 0.0029107    multi: 0.0029139
100 calls      regular: 0.0032379    multi: 0.00331093
1000 calls     regular: 0.0091466    multi: 0.00742758
10000 calls    regular: 0.04885046   multi: 0.053550
100000 calls   regular: 0.5330036    multi: 0.5973669
1000000 calls  regular: 5.6755910    multi: 5.946978

So, as I tried to explain in other comments in this thread, type checking and candidate ordering of multisub calls are generally completed at compile-time, just like regular subs.

Replies are listed 'Best First'.
Re^6: rough approximation to pattern matching using local (Multi Subs)
by BrowserUk (Patriarch) on Feb 02, 2015 at 03:12 UTC

    Okay. I'm assuming that '^n' means '0 to n-1'. Even so, that doesn't explain the presence of the '^' in the line my $iterations = ^10 ** $_;, which seems to me (from the displayed effect of the statement) to behave exactly the same as my $iterations = 10 ** $_; (i.e. no caret).

    Anyway, making my assumption, and ignoring the anomaly, what your benchmark seems to show is that alternately calling two non-multi subs, 1/2 a million times each, takes 1/6 the time required to call one of two multisubs determined by pattern matching 1/2 a million times each.

    My conclusion:

    Is the cost of the convenience of runtime functional (argument) pattern matching worth the benefit of not having to decide which function to call explicitly?

    In my world: when 10 minutes becomes an hour; or an hour becomes six hours; or 6 hours becomes 36? Absolutely not!

    I suspect that in Haskell; ML; OCaml; Erlang; Miranda; Clean; -- Ie. compiled functional languages -- the runtime cost is minimal because the compilers can, if not completely infer the matching function at compile-time; at least substantially reduce the possibilities through type-inference.

    But (again; just my suspicion), P6 has to in(tro)spect the argument list at runtime and then attempt a best match (or fail?) against the signatured possibilities for the given function name, at runtime. Hence the performance cost.
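    The kind of per-call dispatch being suspected here can be sketched roughly in Python (purely an illustration of the mechanism; this is not Rakudo's actual implementation, and all names are made up):

```python
# Hypothetical sketch of naive runtime multi-dispatch: inspect the
# arguments on every call and scan the candidate list, in order, for
# the first signature whose types match.

def make_multi(candidates):
    """candidates: list of (tuple-of-types, callable) pairs, in order."""
    def dispatch(*args):
        for sig, fn in candidates:
            # Per-call introspection: length check plus a type test
            # for every parameter of every candidate tried so far.
            if len(sig) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, sig)
            ):
                return fn(*args)
        raise TypeError("no matching candidate")
    return dispatch

# Two candidates for the same name, distinguished only by argument type.
c = make_multi([
    ((int,), lambda x: "int candidate"),
    ((str,), lambda x: "str candidate"),
])
```

    All of that scanning and type testing happens on every single call, which is exactly the overhead that compile-time resolution would eliminate.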

    Update: For reference, this is Perl 5.10.1 choosing between two functions and executing each of them 1/2 million times:

    #! perl -slw
    use strict;
    use Time::HiRes qw[ time ];

    sub a{}
    sub b{}

    my $start = time;
    $_ % 2 ? a() : b() for 1 .. 1e6;
    printf "%.9f\n", time() - $start;
    __END__
    C:\test>junk99
    0.275810003

    That's 16 times faster than your a() | b() code; and 88 times faster than the P6 multi-sub version.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
      But (again; just my suspicion), P6 has to in(tro)spect the argument list at runtime and then attempt a best match (or fail?) against the signatured possibilities for the given function name, at runtime. Hence the performance cost.
      Erlang, for instance, is also a dynamic language (though strongly typed, like Python). It doesn't have any fancy static type system and it matches at runtime. It's not especially 'fast' overall, but its function calls are a lot faster than those of both Perls. It seems to me optimizations are possible... in theory.
      The extra caret was indeed bogus. Fixed below.


      Afaict, Rakudo is doing compile time resolution of the dispatch target for almost all calls to multisubs found in existing code. Aiui, if the leading candidates for a dispatch only use static nominal typing (specifying types like int, Int, Str, a class, role, etc. for their parameters) then resolution of which to finally pick is done at compile-time.

      (Operators are multisubs so it's a good job that their dispatch target is being resolved at compile time or current Rakudo would be even slower!)


      The code I wrote led to run-time resolution because A) Rakudo currently converts a literal parameter (e.g. the 1 in `multi sub c (1) {}`) into a `where` constraint (`multi sub c ($ where 1) {}`) and B) there's no static analysis in place to reduce this simple `where` constraint to a finite set of values (which is what would enable compile-time resolution despite use of a `where` constraint).

      If Rakudo treated a literal parameter as a singleton value (i.e. not doing shortcut A), or did basic analysis of simple `where` constraints to extract finite sets of values when possible (i.e. fixing B), then use of literals in a leading multisub candidate would no longer disable compile-time resolution.
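      To make the A/B distinction concrete, here's a contrast sketched in Python (illustrative only; the names are invented and this is not Rakudo source): a `where` constraint is an arbitrary predicate that must run on every call, whereas a finite set of literal values could be compiled into a lookup table up front:

```python
# Run-time path: each literal has been turned into a `where`-style
# predicate, so every call walks the candidate list evaluating
# predicates until one passes.
where_candidates = [
    (lambda x: x == 1, lambda x: "one"),
    (lambda x: x == 2, lambda x: "two"),
]

def c_runtime(x):
    for pred, fn in where_candidates:
        if pred(x):                  # predicate evaluated on every call
            return fn(x)
    raise TypeError("no matching candidate")

# Ahead-of-time path: the literals are recognised as a finite value
# set, so dispatch collapses to a single table lookup -- analogous to
# what compile-time resolution of the candidate would buy.
literal_table = {1: lambda x: "one", 2: lambda x: "two"}

def c_static(x):
    return literal_table[x](x)
```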


      Here's a hack workaround just to demonstrate that this can actually work in principle:

      my $t;
      sub a { }; sub b { };
      enum A <0>; enum B <1>;
      multi sub c (A) { };
      multi sub c (B) { };
      for ^7 {
          my $iterations = 10 ** $_;
          say $iterations ~ " calls";
          $t = now; for ^$iterations { $_ %% 2 ?? a() !! b() }; say "regular: {now - $t}";
          $t = now; for ^$iterations { c($_ %% 2 ?? A !! B) };  say "multi: {now - $t}";
      }

      This yields:

      1 calls        regular: 0.00633665  multi: 0.0044097
      ....
      1000000 calls  regular: 5.3146894   multi: 5.2373117

      So, using this enum trick, the times for these multisubs have basically caught up with the times for the plain subs. Resolution is compile-time with enums because Rakudo treats them as a finite set of values (as it should, because that's exactly what they are) and that theoretically enables (and in this case Rakudo actually implements) compile-time resolution.
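      The enum trick can be mimicked in Python (again just an analogy, not Rakudo internals): because an enum's value set is known up front, the candidate for each value can be chosen once, before the hot loop, rather than on every call:

```python
from enum import Enum

# A closed, known-in-advance set of values -- the property that makes
# ahead-of-time candidate selection possible.
class Tag(Enum):
    A = 0
    B = 1

# Build the dispatch table once, before any calls happen.
table = {
    Tag.A: lambda: "candidate A",
    Tag.B: lambda: "candidate B",
}

def c(tag):
    return table[tag]()   # constant-time lookup, no signature scanning
```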


      Some relevant excerpts from the design doc:

      The set of constraints for a parameter creates a subset type that implies some set of allowed values for the parameter. The set of allowed values may or may not be determinable at compile time. When the set of allowed values is determinable at compile time, we call it a static subtype.

      ... Note that all values such as 0 or "foo" are considered singleton static subtypes. ...

      As a first approximation for 6.0.0, subsets of enums are static, and other subsets are dynamic. We may refine this in subsequent versions of Perl.


      Of course, this still leaves the gulf (an order of magnitude? two?) between the basic sub call performance of Rakudo and of perl.

      Aiui there are tons of bog standard optimization techniques (speed and RAM usage) that still haven't yet been applied to Rakudo, NQP, and MoarVM. Aiui more of these optimizations are supposed to arrive this year but most will come in later years.

      I plan to update this thread if I find out any useful info about whether or not Rakudo sub calls might reasonably be expected to one day (year) eventually catch up with (maybe even overtake?) perl's performance.

        First: thank you for taking the time to respond to this. It is appreciated.

        Aiui, if the leading candidates for a dispatch only use static nominal typing (specifying types like int, Int, Str, a class, role, etc. for their parameters) then resolution of which to finally pick is done at compile-time.

        That implies that if I call a multi-sub defined to take (say) two Int vars; but I pass it integers embedded in ordinary scalars; then it will fail? What if the integers are being stored as strings in the PV of the scalar?

        If Perl6 is to retain scalars, but people write their modules using Ints & Strs etc. for efficiency, then either it forces their users to also use Ints & Strs etc., or multi-subs will have to use runtime resolution.

        Alternatively, I guess the programmers could add more multi-subs for each of the permutations of combinations of subscalar types and defined types; but that is a combinatorial nightmare.

        Of course, this still leaves the gulf (an order of magnitude? two?) between the basic sub call performance of Rakudo and of perl.

        Aiui there are tons of bog standard optimization techniques (speed and RAM usage) that still haven't yet been applied to Rakudo, NQP, and MoarVM. Aiui more of these optimizations are supposed to arrive this year but most will come in later years.

        That's understandable, it took Java many years and iterations to sort out their performance problems; and they basically had to invent(*) (or at least, radically refine and generalise) JIT compilation to do it.

        But my gut feel is that there are several Perl6 design elements -- multi-subs, junctions, autothreading, to name but 3 -- that individually make writing an efficient runtime implementation exceedingly hard.

        And writing a single VM to deal with all of those; plus the ability to introspect and reflect on everything including the kitchen sink; the neighbour's dog; uncle Tom Cobley an' all; makes for ... well, what we've seen till now.

        I am aware smalltalk had a form of JIT before Java; and of course, LISP did it first; but Java refined it, generalised it, popularised it, and brought it to the main stream.

