in reply to Splitting into variables columned data without delimiters with a regexp ?

I need to do it using a regexp instead of unpack('A6 A5',$chunk_of_data) because the latter is soooo slow...

The regex engine will not be quicker than unpack.

The performance problem you perceive with unpack is probably down to calling it once per chunk, rather than unpacking all the chunks in one go:

my @bits = unpack '(A6A5)*', $all_the_data;

This is much faster than unpacking each 11-byte chunk individually.

eg:

$data = 'AAAAAABBBBB' x 10;;
@bits = unpack '(A6A5)*', $data;;
print for @bits;;
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
AAAAAA
BBBBB
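
Since the single unpack call returns one flat list, the fields still have to be regrouped per record if you want them in separate variables. A minimal sketch of one way to do that, assuming each record is a 6-character field followed by a 5-character field; the sample data and the names $name, $id and @records are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample: three 11-byte records, each a 6-char field
# followed by a 5-char field, with no delimiters between them.
my $all_the_data = 'alpha 00001' . 'beta  00002' . 'gamma 00003';

# One unpack call for the whole buffer; the result is a flat list.
# Note that the A format strips trailing spaces from each field.
my @bits = unpack '(A6A5)*', $all_the_data;

# Regroup the flat list two fields at a time.
# splice empties @bits as it goes; the loop ends when nothing is left.
my @records;
while ( my ( $name, $id ) = splice @bits, 0, 2 ) {
    push @records, [ $name, $id ];
}

printf "%-6s => %s\n", @$_ for @records;
```

The regrouping loop is plain list handling, so the per-record cost of calling unpack is not reintroduced.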

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Splitting into variables columned data without delimiters with a regexp ?
by gerleu (Novice) on Nov 01, 2011 at 09:38 UTC
    Hi BrowserUK, and thanks again for your kind help! You are totally right: I will investigate the global unpack you suggest and will let you know the result... I was thinking about solutions with sprintf or split too, but I doubt they will be faster than a global unpack.
      I was thinking about solutions with sprintf or split too

      split also invokes the regex engine, and doesn't really lend itself to the task, so it's not going to help any.

      I've no idea how sprintf could be used for this as its primary purpose is composing strings, not decomposing them.

      The regex engine can decompose fixed-field data surprisingly efficiently, but it will never beat unpack, which was designed for this express purpose:

      cmpthese -1,{
          a => q[ my $s='x'x1100; my@bits= unpack'(A6A5)*',$s; ],
          b => q[ my $s='x'x1100; my@bits= $s=~m[(.{6})(.{5})]g; ],
      };;
          Rate    b    a
      b 4516/s   --  -6%
      a 4780/s   6%   --

      Not huge savings but they grow with size:

      cmpthese -1,{
          a => q[ my $s='x'x11000; my@bits= unpack'(A6A5)*',$s; ],
          b => q[ my $s='x'x11000; my@bits= $s=~m[(.{6})(.{5})]g; ],
      };;
         Rate    b    a
      b 427/s   -- -10%
      a 471/s  11%   --
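
To try the comparison outside a REPL, something like the following standalone script should do. It is only a sketch: the per_chunk variant is my guess at what calling unpack once per 11-byte chunk looks like (the original code was not posted), the test string is invented, and the rates will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 100 records of 11 bytes each.
my $s = 'AAAAAABBBBB' x 100;

my %variant = (
    # Assumed shape of the slow approach: one unpack call per chunk.
    per_chunk => sub {
        my @bits;
        for ( my $pos = 0 ; $pos < length($s) ; $pos += 11 ) {
            push @bits, unpack( 'A6 A5', substr( $s, $pos, 11 ) );
        }
        return @bits;
    },
    # One unpack call over the whole string.
    one_unpack => sub {
        my @bits = unpack '(A6A5)*', $s;
        return @bits;
    },
    # Global regex match capturing both fields of every record.
    regex => sub {
        my @bits = $s =~ m[(.{6})(.{5})]g;
        return @bits;
    },
);

# Check that all three variants produce the same list before timing them.
my $expected = join ',', $variant{one_unpack}->();
for my $name ( sort keys %variant ) {
    my $got = join ',', $variant{$name}->();
    die "$name disagrees with one_unpack\n" unless $got eq $expected;
}

# Run each variant for at least one CPU second and print the comparison.
cmpthese( -1, \%variant );
```

The variants agree here only because the sample fields contain no trailing spaces: the A format strips trailing whitespace, whereas the regex captures keep it.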
