Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All,
I remember reading that in relatively recent versions of Perl, a recursive function called by sort would result in a segfault. I don't know the particulars and may not even be remembering correctly. I decided to try my luck here and was happily suprised to find out that not only did it not segfault - it actually worked*.

I then decided to try and convert the code to use the goto &sub form to see if it made any difference in performance. See Pure Perl tail call optimization for more information.

#!/usr/bin/perl use strict; use warnings; my @data = map {chomp; [ split '/' ] } <DATA>; @data = map { join '/', @$_ } sort my_sort @data; print $_,$/ for @data; sub my_sort { my $i = $_[0] || 0; if ( defined $a->[$i] && defined $b->[$i] ) { if ( my $result = $a->[$i] <=> $b->[$i] ) { return $result; } ++$_[0]; goto &my_sort; } return defined $a->[$i] ? 1 : -1; } __DATA__ 0/1/2/3 0 0/4/5/6 0/4 0/1 0/1/2 0/4/5 0/10/111/145 0/10/111 0/10
This b0rk on my box (perl -V output available upon request). I confirmed that the function worked correctly when not called as a parameter to sort. After "playing" around, I discovered the following change made it start working again.
#@data = map { join '/', @$_ } sort my_sort @data; @data = map { join '/', @$_ } sort { my_sort() } @data;
Can anyone shed any light on what is going on here? This is purely for my own education and does not reflect what I would be doing in production code.

Cheers - L~R

* - It works when the data does not contain duplicates

Update: In a nutshell, autovivication of @_ with goto &sub is broke when using the sort SUBNAME LIST prototype but not the sort BLOCK LIST prototype. I still have no idea why though!

Replies are listed 'Best First'.
Re: Sorting, recursion, and tail-call optimizations
by BrowserUk (Patriarch) on Jan 06, 2006 at 17:01 UTC

    See implicit sort disables a chained subroutine?. Same problem and answer I think?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      BrowserUk,
      Maybe - but I don't think so. First, I don't have a chained subroutine. I am calling sort with a user defined sort routine and a list only. There is no additional routine to modify the list before passing it to sort that could get confused as the comparator. Also, your post indicated that by b0rk it didn't produce any results. In my case, I am getting a pop up saying that perl encountered a problem and needed to close requesting a report be sent to M$.

      Cheers - L~R

        Perhaps not then, though I still think it is to do with the capriciousness of the dwimery that allows (and actually requires), you to supply a bareword for the user sub instead the more normal and rational \&usersub in place of the bare block.

        I always use an inline block for sorting because the rules surrounding when you can get away with using a bareword usersub seem to be undocumented and vary according to the phases of the moon as best as I can discern :)

        Much the same as I always use a bareblock with map and grep in preference to a undelimited expression, despite it's creating an extra level of scope. It is just much to easy for the significance of that comma separating the code from the list to get lost, especially if the complexity of the expression increases.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Sorting, recursion, and tail-call optimizations
by Roy Johnson (Monsignor) on Jan 06, 2006 at 19:41 UTC
    You might notice that you can get the same effect you want without the overhead of using goto by doing a slightly unorthodox loop instead.
    sub my_sort { SUB_BODY: { my $i = $_[0] || 0; if ( defined $a->[$i] && defined $b->[$i] ) { if ( my $result = $a->[$i] <=> $b->[$i] ) { return $result; } ++$_[0]; redo SUB_BODY; } return defined $a->[$i] ? 1 : -1; } }
    Update: Except this doesn't work (at least, it doesn't sort in any way that I consider sensible). Not sure what I did wrong. And oddly enough, when I translated it into what I thought was an equivalent for-loop, I got a different arrangement. I think it's what you want.
    sub my_sort { my $i; for ($i = 0; defined $a->[$i] && defined $b->[$i]; ++$i) { if ( my $result = $a->[$i] <=> $b->[$i] ) { return $result; } } return defined $a->[$i] ? 1 : -1; }
    So I guess I'm just having a high-Bozo-quotient day. But maybe you can make use of the idea.

    Caution: Contents may have been coded under pressure.

      You weren't incrementing $i.

      - tye        

        Shouldn't incrementing $_[0] effectively increment $i because of the assignment at the top of the block?

        Caution: Contents may have been coded under pressure.
      Roy Johnson,
      Heh. My first inclination was to go iterative. I am trying to wrap my head around thinking recursively so that I might grok Haskell, and hence avoided it. My iterative solution looked nothing like this though ;-)

      Cheers - L~R

Re: Sorting, recursion, and tail-call optimizations
by kwaping (Priest) on Jan 06, 2006 at 16:56 UTC
    I wonder if it's related to having a bareword (my_sort) while strict subs is in use? I changed your code slightly:
    @data = map { join '/', @$_ } sort {my_sort} @data;

    ... and got this error:
    Bareword "my_sort" not allowed while "strict subs" in use
      kwaping,
      I doubt it. The barewords error only comes into play when you put a block around my_sort without the parens. It becomes ambigous to parser and the parens are needed to tell it that it is a function call. This doesn't explain (to me at least) why
      @data = map { join '/', @$_ } sort { my_sort() } @data; # works @data = map { join '/', @$_ } sort my_sort @data; # breaks

      Cheers - L~R

        Well, this is purely speculation, but I was just thinking that maybe there was an error generated that was somehow internal to the sort function and wasn't elegantly passed along to $!. For what it's worth, the error received on OSX was "Bus error".
Re: Sorting, recursion, and tail-call optimizations
by Roy Johnson (Monsignor) on Jan 06, 2006 at 18:30 UTC
    The sort docs aren't much help, but I suspect that when you give the sub name instead of a sub ref, the args aren't passed in @_. You could try adding
    @_ ||= ($a,$b);

    Caution: Contents may have been coded under pressure.
      Roy Johnson,
      The function isn't supposed to be called with any explicit args. That is the point of using the tail call optimization (re-using @_). I first auto-vivify the first element of @_ and then re-use it throughout. That part works great - it is the undefined $a and $b that baffles me. As does the mysterious *boom*.

      Cheers - L~R

Re: Sorting, recursion, and tail-call optimizations
by kwaping (Priest) on Jan 06, 2006 at 18:41 UTC
    Forgive me if you already knew this, but simply removing the word "goto" allows it to execute properly. I suggest we start exploring that path.

    Update:

    Okay, this might be something here...

    From "perldoc -f goto":
    The "goto-&NAME" form is quite different from the other forms
    of "goto".  In fact, it isn't a goto in the normal sense at
    all, and doesn't have the stigma associated with other gotos.
    Instead, it exits the current subroutine.
    
    Now from "perldoc -f sort":
    You also cannot exit out of the sort block or subroutine using
    any of the loop control operators described in perlsyn or with
    "goto".
    
      kwaping,
      I suggest we start exploring that path.

      That is the whole point of the question. I am using the special form of goto for an optimization. I am wondering why it breaks. I am even more interesting in why changing it from one sort prototype to a different sort prototype, it starts working. In case you were wondering, the prototypes I am referring to are for sort and not the user-defined comparator sub you referred to elsewhere in this thread.

      With regards to your update. I think that is referring to goto label where you are actually trying to exit the routine. Using the special goto &sub you are immediately resuming execution after replacing the current call stack. At least that's my understanding of it. If that wasn't the case then neither example should work. At this point, I throw my hands up.

      Cheers - L~R

        It's not particularly special, and it doesn't optimize. I'll explain:
        When you use goto &sub, you are doing a real goto. You're not calling a sub, you're just identifying a location. So the sub you leave never returns. Instead, the sub you went to returns.

        The only optimization that buys you, relative to calling &sub, is that you don't have multiple returns. In my testing, I found that the use of goto more than offsets those gains.


        Caution: Contents may have been coded under pressure.
        At this point, I throw my hands up.

        Me too, lol.