
I'm wondering why you add the complication of an inner anonymous array and a three-argument split. I think neither is necessary and, since split defaults to operating on $_, one argument suffices.

print for
    map  { $_->[ 0 ] }
    sort { $a->[ 1 ] <=> $b->[ 1 ] || $a->[ 2 ] <=> $b->[ 2 ] }
    map  { [ $_ , ( split m{,} )[ 1, 0 ] ] }
    <DATA>;
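For anyone who wants to try the Schwartzian Transform above without the original DATA section, here is a minimal sketch. The sample records and the sub name st_sort are made up for illustration; the only assumption taken from the thread is that each line starts with two comma-separated numeric fields and that the second field is the primary sort key.

```perl
use strict;
use warnings;

# Hypothetical sample records: two leading numeric fields, then a payload.
my @sample = ( "10,2,pear\n", "9,2,apple\n", "3,10,fig\n", "1,1,kiwi\n" );

# Decorate each line with its two keys, sort on the keys, then undecorate.
sub st_sort {
    my @sorted =
        map  { $_->[ 0 ] }                                           # keep only the line
        sort { $a->[ 1 ] <=> $b->[ 1 ] || $a->[ 2 ] <=> $b->[ 2 ] }  # field 1, then field 0
        map  { [ $_, ( split m{,} )[ 1, 0 ] ] }                      # [ line, field 1, field 0 ]
        @_;
    return @sorted;
}

print for st_sort( @sample );
# prints:
# 1,1,kiwi
# 9,2,apple
# 10,2,pear
# 3,10,fig
```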

You could also use a Guttman-Rosler Transform.

print for
    map { substr $_, 8 }
    sort
    map { pack q{NNA*}, ( split m{,} )[ 1, 0 ], $_ }
    <DATA>;
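The reason the plain string sort works, and why the substr offset is 8, may be easier to see in a small sketch. It assumes both keys are non-negative integers that fit in 32 bits (pack's N format is an unsigned 32-bit big-endian integer); the sample records and the sub name grt_sort are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample records: two leading numeric fields, then a payload.
my @sample = ( "10,2,pear\n", "9,2,apple\n", "3,10,fig\n", "1,1,kiwi\n" );

# Each N packs to 4 big-endian bytes, so two keys make an 8-byte prefix
# whose byte order matches numeric order.
printf "%s\n", unpack q{H*}, pack q{NN}, 2, 10;    # prints 000000020000000a

sub grt_sort {
    my @sorted =
        map  { substr $_, 8 }                              # cut off the 8 key bytes
        sort                                               # default string comparison
        map  { pack q{NNA*}, ( split m{,} )[ 1, 0 ], $_ }  # keys first, then the line
        @_;
    return @sorted;
}

print for grt_sort( @sample );
# prints:
# 1,1,kiwi
# 9,2,apple
# 10,2,pear
# 3,10,fig
```

Negative or fractional keys would need a different pack format, so this sketch does not cover them.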

I hope this is of interest.

Cheers,

JohnGG

Re^3: numeric sort on substring
by Jim (Curate) on Jan 08, 2011 at 00:14 UTC

    In hindsight, the complication of the inner anonymous array is needless. It reflects how my mind reckoned the data structure at the moment I wrote the transform.

    The three-argument split is just a habit. The habit is based on the documentation, which states: "In time critical applications it behooves you not to split into more fields than you really need." I don't know whether the OP's application is time-critical or not, so I went with the more conservative assumption. Like I said: habit.
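    To illustrate what the limit changes, here is a small sketch using a made-up record. With a limit of 3, split stops after the second comma and leaves the rest of the line in the last element, while the two sort keys come out the same either way.

```perl
use strict;
use warnings;

my $line = "7,42,alpha,beta,gamma,delta";    # hypothetical record

my @all     = split m{,}, $line;       # splits every field
my @limited = split m{,}, $line, 3;    # stops after the second comma

printf "%d fields without a limit\n",   scalar @all;        # prints 6 fields without a limit
printf "%d fields with a limit of 3\n", scalar @limited;    # prints 3 fields with a limit of 3
print  "remainder: $limited[ 2 ]\n";                        # prints remainder: alpha,beta,gamma,delta

# The slice used for the sort keys is unaffected by the limit.
my @keys = ( split m{,}, $line, 3 )[ 1, 0 ];                # ( 42, 7 )
print "keys: @keys\n";                                      # prints keys: 42 7
```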

    I like the regular expression pattern matching version better anyway.

      The three-argument split is just a habit.

      I finally got around to benchmarking this and it seems to be a habit you should keep :-)

      ok 1 - grtRegex
      ok 2 - grtSplit
      ok 3 - grtSplit3
      ok 4 - nSubRegex
      ok 5 - nSubSplit
      ok 6 - nSubSplit3
      ok 7 - stRegex
      ok 8 - stSplit
      ok 9 - stSplit3
                   Rate nSubSplit nSubRegex nSubSplit3 stSplit grtSplit stSplit3 stRegex grtRegex grtSplit3
      nSubSplit  8.10/s        --      -69%       -71%    -86%     -88%     -93%    -93%     -94%      -94%
      nSubRegex  25.8/s      219%        --        -8%    -57%     -62%     -77%    -77%     -82%      -82%
      nSubSplit3 28.1/s      247%        9%         --    -53%     -59%     -75%    -75%     -80%      -80%
      stSplit    59.8/s      639%      132%       113%      --     -12%     -47%    -47%     -58%      -58%
      grtSplit   68.1/s      741%      164%       142%     14%       --     -39%    -39%     -52%      -52%
      stSplit3    112/s     1283%      334%       299%     87%      64%       --     -0%     -21%      -22%
      stRegex     112/s     1284%      334%       299%     87%      65%       0%      --     -21%      -22%
      grtRegex    143/s     1661%      452%       408%    138%     109%      27%     27%       --       -0%
      grtSplit3   143/s     1663%      453%       408%    139%     110%      28%     27%       0%        --

      Not constraining the split to just the fields you need is a significant performance hit (I'm guessing because there are many fields per record here), but the three-argument split is level-pegging with the regular expression approach.
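      A rough way to measure the effect of the limit in isolation is sketched below. This is not the benchmark that produced the figures above: the generated data (1,000 records with 20 filler fields each) and the names keys_unlimited and keys_limited are assumptions made for illustration, and it times only the key extraction, not the sorts.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: two numeric keys followed by 20 filler fields per record.
my @records = map {
    join( q{,}, int( rand 1000 ), int( rand 1000 ), ( q{filler} ) x 20 ) . "\n"
} 1 .. 1000;

# Extract the two sort keys (field 1, then field 0) with and without a limit.
sub keys_unlimited { my @k = ( split m{,}, $_[ 0 ]    )[ 1, 0 ]; return @k }
sub keys_limited   { my @k = ( split m{,}, $_[ 0 ], 3 )[ 1, 0 ]; return @k }

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    split  => sub { my @k = map { [ keys_unlimited( $_ ) ] } @records; return },
    split3 => sub { my @k = map { [ keys_limited( $_ )   ] } @records; return },
} );
```

      The rates will vary with the machine, the Perl version and the number of fields per record, so treat any figures it prints as indicative only.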

      Sorry for the slow reply, I hope this is of interest.

      Cheers,

      JohnGG

        If you want to sort anything in Perl fast, then go for Sort::Key!!!
        use Sort::Key::Multi qw(ii_keysort);
        ...
        my %methods = (
            ...
            skm => sub {
                my @sorted = ii_keysort { (split m{,}, $_, 3 )[ 1, 0 ] } @{$_[0]};
                return \@sorted
            }
        );
        That's what I get on my computer:
                     Rate nSubSplit nSubRegex nSubSplit3 stSplit stSplit3 stRegex grtSplit grtRegex grtSplit3  skm
        nSubSplit  21.7/s        --      -54%       -61%    -75%     -83%    -84%     -85%     -91%      -91% -95%
        nSubRegex  47.6/s      119%        --       -14%    -46%     -63%    -65%     -67%     -80%      -81% -89%
        nSubSplit3 55.6/s      156%       17%         --    -36%     -56%    -60%     -62%     -77%      -77% -87%
        stSplit    87.5/s      303%       84%        57%      --     -31%    -36%     -40%     -64%      -64% -80%
        stSplit3    127/s      486%      167%       129%     45%       --     -8%     -12%     -48%      -48% -70%
        stRegex     138/s      534%      189%       147%     57%       8%      --      -5%     -43%      -44% -68%
        grtSplit    145/s      569%      205%       161%     66%      14%      6%       --     -40%      -41% -66%
        grtRegex    243/s     1021%      411%       338%    178%      91%     77%      68%       --       -1% -43%
        grtSplit3   245/s     1029%      415%       341%    180%      93%     78%      69%       1%        -- -43%
        skm         431/s     1885%      805%       674%    393%     239%    213%     197%      77%       76%   --