in reply to Fastest way to turn coordinate list into array

I've got a couple of possibilities here for speeding up your code. Best performance thus far: a capturing regular expression:

#!/usr/bin/perl use strict; use warnings; use Benchmark qw(cmpthese :hireswallclock); my $coords = 'lon,lat,alt ' x 20000; cmpthese(100, { OP => sub { my @list = split(/[, ]/,$coords); my $size = @list; my @poly = (); for (my $i=0; $i<=($size-3); $i+=3) { my $lon = $list[$i+0]; my $lat = $list[$i+1]; my $point = [$lon,$lat]; push (@poly, $point); } }, double_split => sub { my @list = split(' ',$coords); my @poly; $poly[@list-1] = 0; for (@list) { push (@poly, [ (split /,/, $_)[0, +1] ]); } }, while_shift => sub { my @list = split(/[, ]/,$coords); my @poly = (); while (@list) { push @poly, [shift @list, shift @l +ist]; shift @list; } }, regex => sub { my @poly = (); while ($coords =~ /([^,]+),([^,]+),[^\ +s]+ /g) { push @poly, [$1,$2]; } }, chromatic => sub { my @poly = map { [ (split /,/, $_)[ +0, 1] ] } split / /, $coords;}, jwkrahn => sub { my @poly = map [ ( split /,/ )[ 0, 1 +] ], split ' ', $coords;}, ike1 => sub { my @poly = map [ (split /,/, $_)[0, 1] ] +, split ' ', $coords;}, ike2 => sub { my @poly = map [ /([^,]*),([^,]*)/ ], sp +lit ' ', $coords; }, }); __END__ Rate OP while_shift double_split ike1 jwkrahn chromat +ic ike2 regex OP 6.14/s -- -17% -49% -50% -51% -5 +2% -54% -59% while_shift 7.42/s 21% -- -38% -40% -40% -4 +2% -45% -51% double_split 11.9/s 95% 61% -- -3% -4% - +7% -11% -21% ike1 12.3/s 101% 66% 3% -- -1% - +4% -8% -18% jwkrahn 12.4/s 103% 67% 4% 1% -- - +3% -8% -18% chromatic 12.9/s 109% 73% 8% 4% 3% +-- -4% -15% ike2 13.4/s 119% 81% 13% 9% 8% +5% -- -11% regex 15.1/s 147% 104% 27% 23% 22% 1 +8% 13% --

Update: Following ikegami's post below, I reran with more tests, closing some apps I had open and changing the spec to match the OP (originally ran for x 1000, whereas spec says "tens of thousands") and got the same ordering.

This is perl, v5.8.9 built for MSWin32-x86-multi-thread (with 9 registered patches, see perl -V for more detail)

Replies are listed 'Best First'.
Re^2: Fastest way to turn coordinate list into array
by ikegami (Patriarch) on Apr 09, 2010 at 18:46 UTC

    Those results are different than what I would expect.

    On perl, v5.10.0 built for i486-linux-gnu-thread-multi, I get what I'd expect:

    Rate OP while_shift regex ike2 chromatic + ike1 OP 245/s -- -17% -28% -38% -48% + -50% while_shift 297/s 21% -- -13% -26% -36% + -39% regex 340/s 39% 15% -- -15% -27% + -31% ike2 398/s 63% 34% 17% -- -15% + -19% chromatic 467/s 91% 57% 37% 17% -- + -5% ike1 490/s 100% 65% 44% 23% 5% + --

    On perl 5, version 12, subversion 0 (v5.12.0-RC1) built for i686-linux, I the same results.

    Rate OP while_shift regex ike2 chromatic + ike1 OP 117/s -- -22% -28% -42% -52% + -54% while_shift 151/s 29% -- -7% -25% -38% + -41% regex 163/s 39% 8% -- -19% -33% + -36% ike2 202/s 72% 33% 24% -- -18% + -21% chromatic 244/s 109% 62% 50% 21% -- + -4% ike1 254/s 117% 68% 56% 26% 4% + --

    The two extra tests are:

    ike1 => sub { my @poly = map [ (split /,/, $_)[0, 1] ], split ' ', $coords; }, ike2 => sub { my @poly = map [ /([^,]*),([^,]*)/ ], split ' ', $coords; },

      Brilliant, thank you all for your speedy replies. FWIW the regex was simply hanging apache with real data at my end (some lists have 300K points) but the chromatic and ikegami solutions are definitely helping.

      It has highlighted that actually the real problem is algorithmic rather than performance of this section of code as the speed up, whilst beneficial, isn't the main problem. I can think of a way of whittling down the data to test in another part of the code so I'll try that out next. Again, many thanks - and ikegami's code will remain in as the winner so far. Roger
      As I've pointed out in the update above, I reran and maintained ordering. I thought perhaps it was an 5.8/10 issue, but ran on my Linux box and got the following:

      Rate while_shift OP regex ike2 double_split chromatic + jwkrahn ike1 while_shift 15.0/s -- -11% -30% -40% -46% -51% + -54% -54% OP 16.8/s 12% -- -22% -33% -40% -45% + -48% -48% regex 21.5/s 43% 28% -- -14% -23% -29% + -34% -34% ike2 25.0/s 67% 49% 16% -- -10% -18% + -23% -23% double_split 27.9/s 86% 66% 30% 11% -- -8% + -14% -14% chromatic 30.4/s 102% 81% 41% 22% 9% -- + -6% -6% jwkrahn 32.4/s 116% 93% 50% 29% 16% 6% + -- -0% ike1 32.5/s 116% 93% 51% 30% 17% 7% + 0% -- This is perl, v5.8.8 built for x86_64-linux-gnu-thread-multi

      Anyone have an idea why my regex engine is so fast for ActiveState Perl 5.8.9?