DrWhy has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

Updated: Several typos and omissions fixed.

Here's a weird thing I don't understand. Can someone more enlightened share their knowledge? I am working on a large set of scripts and have a 'Utilities' module for simple stuff that's used repeatedly across multiple packages. I wanted to put some sort routines in there, but got tripped up by the fact that the $a and $b used in sort routines are package globals, and these sort routines are used in packages different from the one they are defined in. So I thought, hey, simple answer: export $a and $b from 'Utilities' so the package globals are aliased. That didn't work. I checked and made sure that $a and $b in the Utilities package and the use-ing package were really aliases of each other, and they were; I even printed out $a inside the sorting sub and the correct value was being printed.

What did turn out to work was exporting *a and *b. I don't understand why that worked and exporting $a/b didn't. Anybody got a clue as to why exporting $a/b doesn't work, but exporting the corresponding type globs does?

Here's some example code:

file1: main.pl

    #!/usr/bin/perl
    use lib '.';
    use Sorts;
    $, = "\n";
    print sort caseless ('a','s','d','f');

file 2: Sorts.pm

version 1 -- b0rken, a/b not exported:

    use Exporter;
    package Sorts;
    use base 'Exporter';
    @Sorts::EXPORT = qw(caseless);
    sub caseless { lc $a cmp lc $b }

version 2 -- still b0rken, $a/b exported:

    use Exporter;
    package Sorts;
    use base 'Exporter';
    @Sorts::EXPORT = qw(caseless $a $b);
    sub caseless { lc $a cmp lc $b }

version 3 -- works!, *a/b exported:

    use Exporter;
    package Sorts;
    use base 'Exporter';
    @Sorts::EXPORT = qw(caseless *a *b);
    sub caseless { lc $a cmp lc $b }

running main.pl with the first two versions of Sorts.pm produces:
a s d f
but with the third version outputs:
a d f s
--DrWhy

"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."

Replies are listed 'Best First'.
Re: sort routines crossing package lines
by ides (Deacon) on Nov 24, 2004 at 21:18 UTC

    This is because $a and $b are really $main::a and $main::b. You can solve this in two other ways:

    1. Use prototypes on your sort subs so instead you would have  sub caseless ($$) { lc($_[0]) cmp lc($_[1]) }
    2. Or use the full name of $a and $b like this  sub caseless { lc($main::a) cmp lc($main::b) }
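
A minimal sketch of the first option, assuming a hypothetical calling package named Other (not part of the original question). With a ($$) prototype the two elements arrive in @_, so the comparator never has to know which package's $a and $b sort filled in:

```perl
use strict;
use warnings;

{
    package Sorts;
    # With a ($$) prototype, sort passes the two elements in @_
    # instead of setting the calling package's $a and $b.
    sub caseless ($$) { lc($_[0]) cmp lc($_[1]) }
}

{
    package Other;    # hypothetical caller, for illustration only
    sub run {
        my $cmp = \&Sorts::caseless;    # sort accepts a code ref held in a scalar
        return join ' ', sort $cmp @_;
    }
}

print Other::run(qw(a S d F)), "\n";    # prints "a d F S"
```

The same call works unchanged from any other package, which is the case the second option (hard-coding $main::a and $main::b) does not cover.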

    Frank Wiles <frank@wiles.org>
    http://www.wiles.org

      Use prototypes on your sort subs so instead you would have sub caseless ($$) { lc($_[0]) cmp lc($_[1]) }

      Yes, but I don't want to. perldoc -f sort claims that this introduces a performance hit and the application is already slow enough. I'm sorting lists with thousands of elements.

      Or use the full name of $a and $b like this sub caseless { lc($main::a) cmp lc($main::b) }
      That is only true if all uses of the sort routine are in package main. In this application this is not the case. This routine has to work when called from a number of different packages.

      BTW, I'm not necessarily looking for an alternate implementation. What I'm looking for is understanding the reason why exporting $a and $b does not make this work, but exporting their typeglobs does.

      --DrWhy

        I've heard that prototypes slow down subroutines; however, in practice I've found that it doesn't usually affect an application. I would suggest trying it to see.

        I believe the reason is they aren't just package variables, but "special package variables" according to man perlvar.

        Frank Wiles <frank@wiles.org>
        http://www.wiles.org

        According to other monks' messages, below, you can sort short lists in microseconds or nanoseconds. So even though your lists have thousands of elements, I doubt that sorting is the cause of your speed concerns.

        Profile your program.

        I bet you'll find huge whacks of time spent copying arrays in a function call in an inner loop, where you should be passing a reference, or some other efficiency error. Or it might be your algorithm that needs some re-thinking. Are you processing things in an item-by-item loop, when you could leave Perl to process the whole set at once?
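
To illustrate the copy-versus-reference point, here is a small sketch. The function names total_by_copy and total_by_ref are made up for this example and do not come from the application being discussed:

```perl
use strict;
use warnings;

# Copies every element into a fresh array on each call.
sub total_by_copy {
    my @items = @_;
    my $total = 0;
    $total += $_ for @items;
    return $total;
}

# Receives a single scalar (the reference); no per-element copy is made.
sub total_by_ref {
    my ($items) = @_;
    my $total = 0;
    $total += $_ for @$items;
    return $total;
}

my @data = (1 .. 1000);
print total_by_copy(@data), "\n";    # prints 500500
print total_by_ref(\@data), "\n";    # prints 500500
```

Whether this matters in a given program is something only a profiler run can tell.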

        --
        TTTATCGGTCGTTATATAGATGTTTGCA

Re: sort routines crossing package lines
by fglock (Vicar) on Nov 25, 2004 at 00:12 UTC

    As a side note, it is interesting to see how List::Util deals with ($a,$b):

        sub reduce (&@) {
          my $code = shift;

          return shift unless @_ > 1;

          my $caller = caller;
          local(*{$caller."::a"}) = \my $a;
          local(*{$caller."::b"}) = \my $b;

          $a = shift;
          foreach (@_) {
            $b = $_;
            $a = &{$code}();
          }

          $a;
        }

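
As a usage sketch of the same technique: the helper below is a hypothetical Utilities::fold modelled on the reduce code above, not List::Util itself. It aliases the calling package's $a and $b to its own lexicals, so the block the caller passes in can use $a and $b from any package:

```perl
use strict;
use warnings;

{
    package Utilities;    # hypothetical module name
    sub fold (&@) {
        my $code = shift;
        return shift unless @_ > 1;

        my $caller = caller;       # package that called fold()
        no strict 'refs';          # needed for the symbolic glob names below
        local (*{ $caller . "::a" }) = \my $a;   # caller's $a now aliases our lexical
        local (*{ $caller . "::b" }) = \my $b;

        $a = shift;
        foreach (@_) {
            $b = $_;
            $a = &{$code}();
        }
        return $a;
    }
}

{
    package Client;       # hypothetical caller
    sub total { return Utilities::fold { $a + $b } @_ }
}

print Client::total(1 .. 4), "\n";    # prints 10
```

Note that this works from the helper's side (it reaches into the caller's package at run time), which is the opposite direction from exporting $a and $b out of the module that defines the comparator.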
Re: sort routines crossing package lines
by ysth (Canon) on Nov 24, 2004 at 23:49 UTC
    $a and $b are not always forced into package main (the way punctuation vars are), and are not magic. But they are localized and aliased to the elements being sorted. The localization temporarily removes the imported ones when you import the scalars rather than the globs. When the globs are imported, the localization replaces what's in the scalar slot of the glob, so it affects the exporting package's "copy".

    You can see the same effect here:

        $ perl -we'$Foo::a = 2; *a = \$Foo::a; for $a (4) { print $a, $Foo::a }'
        42
        $ perl -we'$Foo::a = 2; *a = \*Foo::a; for $a (4) { print $a, $Foo::a }'
        44

    Update: meant to say, the above is consistent whether either or both var names are a or b or not.
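
The same effect can be sketched with sort itself in a single file. The package names ScalarImport and GlobImport are invented for this illustration, and the two glob assignments only approximate what Exporter does for '$a' and '*a' respectively:

```perl
use strict;
use warnings;
no warnings 'once';

{
    package Sorts;
    # Always reads $Sorts::a and $Sorts::b, whichever package calls sort.
    sub caseless {
        no warnings 'uninitialized';    # the scalar-import case compares undefs
        lc $a cmp lc $b;
    }
}

{
    package ScalarImport;               # hypothetical caller
    # Roughly what importing '$a' and '$b' does: share only the scalar slot.
    *a = \$Sorts::a;
    *b = \$Sorts::b;
    sub run { my $cmp = \&Sorts::caseless; return join ' ', sort $cmp @_ }
}

{
    package GlobImport;                 # hypothetical caller
    # Roughly what importing '*a' and '*b' does: share the whole glob.
    *a = \*Sorts::a;
    *b = \*Sorts::b;
    sub run { my $cmp = \&Sorts::caseless; return join ' ', sort $cmp @_ }
}

my @words = qw(a S d F);
print "scalar import: ", ScalarImport::run(@words), "\n";   # a S d F (input order)
print "glob import:   ", GlobImport::run(@words),   "\n";   # a d F S
```

In the scalar case sort swaps new scalars into ScalarImport's own globs, so Sorts::caseless sees only undef and every comparison returns 0; in the glob case both packages look through the same glob, so the comparator sees the elements.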
      Eureka! I do believe ysth's got it.

      /me wishes he could up vote more than once...

      --DrWhy

Re: sort routines crossing package lines
by tall_man (Parson) on Nov 24, 2004 at 23:22 UTC
    If you look at the internals of Exporter::Heavy, you will see the two exports in your example are equivalent to:
        # Regular $a, $b export:
        *{main::a} = \${Sorts::a};
        *{main::b} = \${Sorts::b};

        # Glob *a, *b exports:
        *{main::a} = *{Sorts::a};
        *{main::b} = *{Sorts::b};

    The first one copies only the scalar value slot of the glob, the latter copies all slots. Therefore, I suspect the reason that the glob export works and the other doesn't is that $a and $b don't reside in the normal scalar slots. Update: (Or maybe their "magic" flags don't get copied without the glob. There's a section in perlsub about "localization of globs" that says you can make variables lose their magic by localizing their glob).

    The perlvar page indicates $a and $b are special, "Because of this specialness $a and $b don't need to be declared (using local(), use vars, or our()) even when using the strict vars pragma."

    Update 2: I believe ysth has the right answer.
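
A small sketch of the difference between the two kinds of assignment, using made-up package names (Src, ScalarOnly, WholeGlob). It shows that a scalar-reference assignment shares one slot, a glob assignment shares all of them, and that local on the source scalar is only visible through the glob alias:

```perl
use strict;
use warnings;
no warnings 'once';

$Src::v = 'original';
@Src::v = ('array');

*ScalarOnly::v = \$Src::v;    # like exporting '$v': only the scalar slot is shared
*WholeGlob::v  = \*Src::v;    # like exporting '*v': every slot is shared

print "array via scalar-only alias: ", scalar(@ScalarOnly::v), " element(s)\n";  # 0
print "array via glob alias:        ", scalar(@WholeGlob::v),  " element(s)\n";  # 1

sub peek { return "ScalarOnly=$ScalarOnly::v WholeGlob=$WholeGlob::v" }

my $during;
{
    local $Src::v = 'localized';    # puts a fresh scalar into *Src::v's scalar slot
    $during = peek();
}
print "during local: $during\n";       # ScalarOnly=original WholeGlob=localized
print "after local:  ", peek(), "\n";  # ScalarOnly=original WholeGlob=original
```

sort's handling of $a and $b behaves like the local in this sketch, which is why only the glob export lets the comparator see the elements.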

Re: sort routines crossing package lines
by Arunbear (Prior) on Nov 24, 2004 at 21:20 UTC
    You can avoid exporting $a and $b by defining the sort routine like this:
    sub caseless ($$) { lc $_[0] cmp lc $_[1] }
    though the docs for sort say this is slower than using the un-prototyped version with $a and $b.

      Update: I'm now saving the values for more realism, and corrected the extra 's' BrowserUK mentioned (which I had already fixed in the .pl file, just not in the post).

      Some concrete numbers:

          use Benchmark ();

          sub argless    { $a    cmp $b    }
          sub proto($$)  { $_[0] cmp $_[1] }
          sub args       { $_[0] cmp $_[1] }

          my @array = qw(
             Perl may be copied only under the terms of either the Artistic License or the
             GNU General Public License, which may be found in the Perl 5 source kit.
          );

          Benchmark::cmpthese(0, {
             inline  => sub { my @sorted = sort { $a cmp $b }     @array; },
             argless => sub { my @sorted = sort argless           @array; },
             proto   => sub { my @sorted = sort proto             @array; },
             args    => sub { my @sorted = sort { args($a, $b) }  @array; },
          });

          __END__
                     Rate    args   proto argless  inline
          args     4634/s      --    -57%    -63%    -81%
          proto   10735/s    132%      --    -14%    -57%
          argless 12462/s    169%     16%      --    -50%
          inline  24920/s    438%    132%    100%      --

      Here's a different comparison (one that yields less unfair optimization):

          use Benchmark ();

          sub argless    { $hash{$a   } cmp $hash{$b   } }
          sub proto($$)  { $hash{$_[0]} cmp $hash{$_[1]} }
          sub args       { $hash{$_[0]} cmp $hash{$_[1]} }

          our %hash = map { (''.rand()) => $_ } qw(
             Perl may be copied only under the terms of either the Artistic License or the
             GNU General Public License, which may be found in the Perl 5 source kit.
          );

          Benchmark::cmpthese(0, {
             inline  => sub { my @sorted = sort { $hash{$a} cmp $hash{$b} } keys %hash; },
             argless => sub { my @sorted = sort argless                     keys %hash; },
             proto   => sub { my @sorted = sort proto                       keys %hash; },
             args    => sub { my @sorted = sort { args($a, $b) }            keys %hash; },
          });

          __END__
                    Rate    args   proto argless  inline
          args    2832/s      --    -33%    -42%    -43%
          proto   4258/s     50%      --    -13%    -14%
          argless 4871/s     72%     14%      --     -2%
          inline  4973/s     76%     17%      2%      --

      Active Perl 5.6.1

        Not optimisations, errors.

            #! perl -slw
            use strict;
            #use sort
            use Benchmark ();

            sub argless    { $a    cmp $b    }
            sub proto($$)  { $_[0] cmp $_[1] }
            sub args       { $_[0] cmp $_[1] }

            my @array = qw(
               Perl may be copied only under the terms of either the Artistic License or the
               GNU General Public License, which may be found in the Perl 5 source kit.
            );

            Benchmark::cmpthese(-1, {
               inline  => q[ sort { $a cmp $b }     @array; ],
               argless => q[ sort argless           @array; ],
               proto   => q[ sort proto             @array; ],
               args    => q[ sort { args($a, $b) }  @array; ],
            });

            __END__
            P:\test>410266
            Possible attempt to separate words with commas at P:\test\410266.pl lin
            Name "main::b" used only once: possible typo at P:\test\410266.pl line
            Name "main::a" used only once: possible typo at P:\test\410266.pl line
            Useless use of sort in void context at (eval 4) line 1.
            ...
            Useless use of sort in void context at (eval 182) line 1.
                         Rate argless   proto  inline    args
            argless 6619798/s      --     -2%     -8%    -19%
            proto   6748958/s      2%      --     -6%    -18%
            inline  7189879/s      9%      7%      --    -12%
            args    8216252/s     24%     22%     14%      --

        Examine what is said, not who speaks.
        "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
        "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        I ran this on my machine -- a Dell Precision 340 running Fedora 2.6.something, Perl 5.8.4 --, fixing the arg(s)less problems and giving it a more hefty list to sort (~2500 items). My results from three runs:
                         Rate    proto argsless   inline     args
            proto    189740/s       --      -0%     -31%     -32%
            argsless 190421/s       0%       --     -30%     -32%
            inline   273132/s      44%      43%       --      -3%
            args     280139/s      48%      47%       3%       --

                         Rate argsless    proto   inline     args
            argsless 188543/s       --      -0%     -31%     -33%
            proto    188746/s       0%       --     -31%     -33%
            inline   273988/s      45%      45%       --      -3%
            args     281040/s      49%      49%       3%       --

                         Rate    proto argsless   inline     args
            proto    191010/s       --      -1%     -27%     -30%
            argsless 192801/s       1%       --     -27%     -30%
            inline   262320/s      37%      36%       --      -4%
            args     274019/s      43%      42%       4%       --

        I'm not terribly surprised that inline is faster than the other named sub methods, but am quite surprised that the 'args' version is the fastest.

        Update: Okay, adding an assignment in the benchmarked code gives more sensible results:

                       Rate     args    proto argsless   inline
            args     17.5/s       --     -45%     -49%     -51%
            proto    31.8/s      82%       --      -8%     -11%
            argsless 34.6/s      98%       9%       --      -3%
            inline   35.5/s     103%      12%       3%       --

                       Rate     args    proto argsless   inline
            args     17.3/s       --     -46%     -51%     -52%
            proto    32.0/s      85%       --      -9%     -11%
            argsless 35.1/s     103%      10%       --      -2%
            inline   35.7/s     107%      12%       2%       --

                       Rate     args    proto   inline argsless
            args     17.4/s       --     -47%     -49%     -50%
            proto    33.0/s      90%       --      -4%      -5%
            inline   34.4/s      97%       4%       --      -1%
            argsless 34.8/s     100%       5%       1%       --

        --DrWhy

        An interesting anomaly:

        sub argsless ...

        argless => sub { sort arg?less


Re: sort routines crossing package lines
by rinceWind (Monsignor) on Nov 25, 2004 at 12:41 UTC