in reply to Re: Splitting a string to chunks
in thread Splitting a string to chunks

Hi,

I added another version that also returns a shorter last chunk when the string length is not a multiple of the chunk size. I also added /o to try to improve performance (which might help if you have several lines to split).

Added this to the benchmark:

'regex'  => sub { my @arr = $string =~ /(........)/g; },
'regexo' => sub { my @arr = $string =~ /(.{1,8})/og; },

The results:
                 Rate split_pos  split grep_split substr_map substr_loop unpack regex regexo
split_pos      7295/s        --   -57%       -60%       -68%        -77%   -78% -100%  -100%
split         16900/s      132%     --        -7%       -26%        -47%   -50% -100%  -100%
grep_split    18241/s      150%     8%         --       -20%        -43%   -46% -100%  -100%
substr_map    22883/s      214%    35%        25%         --        -29%   -32%  -99%  -100%
substr_loop   32139/s      341%    90%        76%        40%          --    -4%  -99%   -99%
unpack        33495/s      359%    98%        84%        46%          4%     --  -99%   -99%
regex       4342185/s    59421% 25593%     23705%     18876%      13411% 12864%    --    -6%
regexo      4596612/s    62909% 27098%     25099%     19988%      14202% 13623%    6%     --
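
To make the difference between the two patterns concrete, here is a minimal sketch. The 20-character input is a made-up example (not the string used in the benchmark), chosen so that the last chunk is shorter than eight characters.

```perl
use strict;
use warnings;

# Hypothetical 20-character input: two full chunks plus a 4-character tail.
my $str = 'abcdefghijklmnopqrst';

# Exactly eight characters per match, so the trailing 'qrst' is dropped.
my @fixed = $str =~ /(........)/g;

# One to eight characters per match, so the trailing 'qrst' is kept.
my @ragged = $str =~ /(.{1,8})/g;

print join('|', @fixed),  "\n";   # abcdefgh|ijklmnop
print join('|', @ragged), "\n";   # abcdefgh|ijklmnop|qrst
```

Note that `.` does not match a newline unless /s is given, so both patterns assume single-line input.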

Re^3: Splitting a string to chunks
by Fengor (Pilgrim) on Nov 29, 2006 at 14:30 UTC
    Hmm, you copied my typo. I accidentally used $string instead of $str when I first posted my regex solution, and that explains the high rates for the regex solutions. Here are the timings with the typo corrected:

                   Rate split_pos grep_split substr_map regexo regex substr_loop unpack
    split_pos    5587/s        --       -65%       -69%   -76%  -77%        -79%   -81%
    grep_split  15974/s      186%         --       -12%   -32%  -34%        -40%   -45%
    substr_map  18051/s      223%        13%         --   -23%  -26%        -32%   -38%
    regexo      23474/s      320%        47%        30%     --   -3%        -12%   -20%
    regex       24272/s      334%        52%        34%     3%    --         -9%   -17%
    substr_loop 26596/s      376%        66%        47%    13%   10%          --    -9%
    unpack      29240/s      423%        83%        62%    25%   20%         10%     --

    --
    "WHAT CAN THE HARVEST HOPE FOR IF NOT THE CARE OF THE REAPER MAN"
    -- Terry Pratchett, "Reaper Man"

Re^3: Splitting a string to chunks
by Limbic~Region (Chancellor) on Nov 29, 2006 at 14:35 UTC
    themage,
    Your benchmark disagrees with mine (with x 20 and x 200). Additionally, I think you should re-read perlre with regard to what /o does.

    I am sure diotalevi will improve upon my explanation but in a nutshell, /o is an old optimization predating qr//. If you needed to interpolate a variable inside a regex such as /$regex/ but knew that $regex would never change, the flag would tell perl to only compile the regex once. In fact, if you broke your promise and changed $regex then it would still not recompile it leading to buggy code. Then came along qr// and improved things greatly (see /o is dead, long live qr//!).

    Since you are not interpolating a variable in your regex, the /o has no effect.
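
A small sketch of the behaviour described above. The loop and the variable names are mine, purely for illustration: the same match op is executed twice with a different $pat each time, once with /o, once without, and once through qr//.

```perl
use strict;
use warnings;

my (@with_o, @without_o, @with_qr);

for my $pat ('a', 'b') {
    # /o: compiled on the first pass with 'a' and never recompiled,
    # so the second pass still tests against /a/.
    push @with_o, ('b' =~ /$pat/o ? 1 : 0);

    # No /o: the pattern follows the current value of $pat.
    push @without_o, ('b' =~ /$pat/ ? 1 : 0);

    # qr//: an explicitly compiled regex object per value of $pat.
    my $re = qr/$pat/;
    push @with_qr, ('b' =~ $re ? 1 : 0);
}

print "with /o:    @with_o\n";      # 0 0  (the stale pattern misses 'b')
print "without /o: @without_o\n";   # 0 1
print "with qr//:  @with_qr\n";     # 0 1
```

With a literal pattern such as /(.{1,8})/ there is nothing to interpolate, so adding /o changes nothing.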

    See also this regarding how current perls optimize regex compilation. Unfortunately I couldn't find this in any perldelta from 5.6.1 to 5.9.4, which makes me suspicious, so I posted Questions concerning /o regex modifier.

    Cheers - L~R

Re^3: Splitting a string to chunks
by BrowserUk (Patriarch) on Nov 29, 2006 at 14:33 UTC

    Without having run your benchmark, the huge disparity between your solutions and the others makes me very suspicious that your code is not producing the same results as the others. Have you checked?
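
One way to do that check before timing anything is sketched below. The three subs are my own stand-ins for the benchmarked solutions (the thread's actual implementations are not reproduced here), and the 40-character input is made up.

```perl
use strict;
use warnings;

# Hypothetical input: 40 lowercase letters, i.e. five chunks of eight.
my $str = join '', map { chr(97 + $_ % 26) } 0 .. 39;

# Stand-in implementations; each returns an array ref of chunks.
my %splitters = (
    regex       => sub { my ($s) = @_; [ $s =~ /(.{1,8})/g ] },
    unpack      => sub { my ($s) = @_; [ unpack '(A8)*', $s ] },
    substr_loop => sub {
        my ($s) = @_;
        my @out;
        for (my $i = 0; $i < length($s); $i += 8) {
            push @out, substr($s, $i, 8);
        }
        return \@out;
    },
);

# Group implementation names by the output they produce.
my %seen;
for my $name (sort keys %splitters) {
    my $key = join '|', @{ $splitters{$name}->($str) };
    push @{ $seen{$key} }, $name;
}

# Exactly one distinct output means every implementation agrees.
my $agree = scalar(keys %seen) == 1;
print $agree ? "all implementations agree\n" : "mismatch, do not benchmark yet\n";
```

A solution that returns nothing at all, as in this case, shows up immediately as a second key in %seen.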


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The fault for the huge disparity seems to be mine. I accidentally wrote $string instead of $str in the source code when I first posted my regex solution, and since themage seems to have based his solution on mine, he copied the mistake. The huge rates come from $string being uninitialized. I found the mistake when I tried my solution on my own PC later and fixed the typo, but that was after themage wrote his reply.
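
A minimal sketch of why the typo inflates the rate, assuming the benchmark script ran without strict (under strict an undeclared $string would have been a compile-time error). The input here is made up; $string is declared but never assigned to mimic the mistake.

```perl
use strict;
use warnings;

my $str = 'x' x 160;   # hypothetical stand-in for the real input
my $string;            # never assigned, mimicking the typo

# The intended variable yields 160 / 8 = 20 chunks.
my @real = $str =~ /(.{1,8})/g;

# The undefined variable gives the regex nothing to match, so the sub
# returns almost immediately and the measured rate looks enormous.
my @typo;
{
    no warnings 'uninitialized';   # silence the warning that would flag the bug
    @typo = $string =~ /(.{1,8})/g;
}

printf "real: %d chunks, typo: %d chunks\n", scalar @real, scalar @typo;
```

With warnings enabled, the match on $string would report a use of an uninitialized value, which is usually the quickest hint that something like this has happened.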

      --
      "WHAT CAN THE HARVEST HOPE FOR IF NOT THE CARE OF THE REAPER MAN"
      -- Terry Pratchett, "Reaper Man"