Re^2: How can I reduce this with a REGEX?

I would also use matching instead of substitution. You mentioned in a private message that substitution is faster (a trick like that would deserve a comment in the code), but my benchmark doesn't show that effect. To speed things up, declare the hash with my %h rather than clearing it with %h = ().

#!/usr/bin/perl
use warnings;
use strict;

use Test::More;
use Benchmark qw{ cmpthese };

my @input = split /\n/, << '__INPUT__';
12-12
12-12-12
12-13-12-13
12-12-13-13
12-13-13-14
__INPUT__

sub lenno {
    my @back;
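    # Work on a copy: s/// modifies the aliased elements in place.
    my @i = @input;
    for (@i) {
        my %h;
        # Each s/// removes one two-digit group, so the loop ends when none are left.
        undef $h{$1} while s/(\d\d)//;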
        push @back, join ("-", sort keys %h);
    }
    \@back
}


sub choro {
    my @back;
    my @i = @input;
    for (@i) {
        my %h;
        # m//g in scalar context resumes after the previous match and leaves $_ intact.
        undef $h{$1} while m/(\d\d)/g;
        push @back, join ('-', sort keys %h);
    }
    \@back
}


is_deeply(lenno(), choro(), 'same');
done_testing();

cmpthese(-3,
         { lenno => 'lenno',
           choro => 'choro',
         });

Output:

         Rate lenno choro
lenno 25358/s    --   -4%
choro 26458/s    4%    --

Perl 5.16.2, x86_64-linux-thread-multi.

Update: Significantly faster: use a hash slice instead of the inner loop: undef @h{m/\d\d/g};
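As a rough illustration of the hash-slice idea, here is a minimal sketch. The helper name unique_pairs is hypothetical (it does not appear in the thread), and the slice is populated with the assignment form @seen{...} = () rather than undef; both are meant to create one key per match with an undefined value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect the distinct
# two-digit groups of a string with a single hash-slice assignment.
sub unique_pairs {
    my ($str) = @_;
    my %seen;
    # In list context m//g returns every match at once; assigning the
    # empty list to the slice creates one key per match (values undef).
    @seen{ $str =~ /\d\d/g } = ();
    return join '-', sort keys %seen;
}

print unique_pairs($_), "\n" for qw(12-12 12-13-12-13 12-13-13-14);
# prints:
# 12
# 12-13
# 12-13-14
```

The regex engine is entered once per string instead of once per match, which is presumably where the speed-up comes from.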

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^3: How can I reduce this with a REGEX?
by Lennotoecom (Pilgrim) on Mar 15, 2014 at 18:56 UTC

Yes, thank you for your reply. I was apparently mistaken, and I'm sorry. You were right. My results:

use Benchmark qw(cmpthese);
cmpthese(1000000, {
    'a' => sub {
        $_ = '12-12-12-12-12';
        undef $h{$1} while (s/(\d\d)//);
    },
    'b' => sub {
        $_ = '12-12-12-12-12';
        undef $h{$1} while (m/(\d\d)/g);
    },
    'c' => sub {
        $_ = '12-12-12-12-12';
        undef @h{m/\d\d/g};
    }
});

      Rate    a    b    c
a 195427/s   -- -19% -35%
b 241896/s  24%   -- -20%
c 300933/s  54%  24%   --