RE question...yup, another one ;)

snafu has asked for the wisdom of the Perl Monks concerning the following question:


Re: RE question...yup, another one ;)
by btrott (Parson) on May 26, 2001 at 23:26 UTC

A regex is *way* too big a tool for this job.

print $_ % 10, "\n" for 0..300;

Here's your hint: think of a mathematical operator that might do what you want here. It's in perlop, under Multiplicative Operators.
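A minimal sketch of where the hint leads, assuming the goal is the last decimal digit of an integer (the original question is not quoted in this thread). The sub name last_digit is made up for illustration, and the abs call is an addition: Perl's % takes the sign of its right operand, so -13 % 10 is 7 rather than 3.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): last decimal digit of an integer.
# abs() guards against negatives, because in Perl -13 % 10 is 7, not 3.
sub last_digit {
    my ($n) = @_;
    return abs($n) % 10;
}

print "$_ -> ", last_digit($_), "\n" for 7, 42, 300, -13;
```

For the 0..300 range used in this thread the abs call changes nothing; it only matters if negative input is possible.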


Re: Re: RE question...yup, another one ;)

by Anonymous Monk on May 27, 2001 at 00:31 UTC

print $_ . "-->" . $_ % 10 for 0..300


Re: Re: Re: RE question...yup, another one ;)

by btrott (Parson) on May 27, 2001 at 01:46 UTC

What version of perl are you running where it prints 3000?


Re: Re: Re: Re: RE question...yup, another one ;)

by Anonymous Monk on May 27, 2001 at 01:57 UTC

Re: Re: Re: RE question...yup, another one ;)

by larryk (Friar) on May 27, 2001 at 00:49 UTC

"Argument is futile - you will be ignorralated!"


japhy regex analysis: case study (RE question...)
by japhy (Canon) on May 27, 2001 at 08:41 UTC

%

There are four generic approaches to this:

/(\d)*(\d)/: needlessly captures the preceding digits, and does so one digit at a time, overwriting the capture on every iteration; forces useless backtracking; 2.96x slower than modulus
/(\d*)(\d)/: needlessly captures the preceding digits; forces useless backtracking; 2.80x slower than modulus
/\d*(\d)/: forces useless backtracking; 2.79x slower than modulus
/(\d)$/: optimized (goes to the end of the string automatically); 2.54x slower than modulus

use Benchmark 'timethese';

$x = int (1_000_000 * rand 1_000_000);

timethese(-5, {
  multiple    => sub { $x =~ /(\d)*(\d)/ },
  backtrack_c => sub { $x =~ /(\d*)(\d)/ },
  backtrack   => sub { $x =~ /\d*(\d)/   },
  opt         => sub { $x =~ /(\d)$/     },
  mod         => sub { $x % 10           },
});
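To make the capture behavior of the four patterns concrete, the following sketch runs each of them against a fixed string and prints what ends up in the capture groups. The sub name captures and the sample input '12345' are illustrative choices, not part of the post above.

```perl
use strict;
use warnings;

# Return the list of captures each pattern produces for a string of digits.
# In list context a successful match returns ($1, $2, ...).
sub captures {
    my ($s) = @_;
    my %got;
    $got{multiple}    = [ $s =~ /(\d)*(\d)/ ];
    $got{backtrack_c} = [ $s =~ /(\d*)(\d)/ ];
    $got{backtrack}   = [ $s =~ /\d*(\d)/   ];
    $got{opt}         = [ $s =~ /(\d)$/     ];
    return \%got;
}

my $got = captures('12345');
for my $name (qw(multiple backtrack_c backtrack opt)) {
    print "$name: @{ $got->{$name} }\n";
}
```

With this input, the first pattern leaves only the digit from its final iteration ('4') in $1, the second keeps the whole prefix ('1234'), and all four end with '5' in their last capture group.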

snafu

my book

japhy

Perl and Regex Hacker


(tye)Re: japhy regex analysis: case study (RE question...)

by tye (Sage) on May 28, 2001 at 02:42 UTC

Well, since you resorted to benchmarks (updated)...

            Rate    bt_c    mult      bt     opt     mod    chop
bt_c    180870/s      --     -1%     -5%    -28%    -62%    -74%
mult    181987/s      1%      --     -4%    -27%    -62%    -74%
bt      189426/s      5%      4%      --    -24%    -60%    -73%
opt     249612/s     38%     37%     32%      --    -48%    -64%
mod     476214/s    163%    162%    151%     91%      --    -31%
chop    692944/s    283%    281%    266%    178%     46%      --

chop

use Benchmark 'cmpthese';

$x = int (1_000_000 * rand 1_000_000);

cmpthese( -3, {
  mult => sub { $x =~ /(\d)*(\d)/  },
  bt_c => sub { $x =~ /(\d*)(\d)/  },
  bt   => sub { $x =~ /\d*(\d)/    },
  opt  => sub { $x =~ /(\d)$/      },
  mod  => sub { $x % 10            },
  chop => sub { my $x= $x; chop $x },
});

The results that follow are the original, bogus ones. Thanks to dkubb for pointing out that I should have used my rather than local. I realized I'd made a mistake and came back, but not quickly enough. It looks like local is quite a bit slower than my (which makes sense), so I'd be interested in how japhy's machine handles the new code.

        Rate    mult   bt_c     bt    opt    mod     chop
mult 180759/s     --    -1%    -7%   -31%   -62%     -71%
bt_c 182581/s     1%     --    -6%   -31%   -62%     -70%
bt   193680/s     7%     6%     --   -26%   -60%     -69%
opt  263234/s    46%    44%    36%     --   -45%     -57%
mod  481067/s   166%   163%   148%    83%     --     -22%
chop 618559/s   242%   239%   219%   135%    29%       --

chop

use Benchmark 'cmpthese';

$x = int (1_000_000 * rand 1_000_000);

cmpthese( -3, {
  mult => sub { $x =~ /(\d)*(\d)/ },
  bt_c => sub { $x =~ /(\d*)(\d)/ },
  bt   => sub { $x =~ /\d*(\d)/   },
  opt  => sub { $x =~ /(\d)$/     },
  mod  => sub { $x % 10           },
  chop => sub { local $x; chop $x },
});
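The difference between the two chop variants comes down to what local does: it gives the package variable a fresh, undefined value for the duration of the sub, so the benchmark ends up chopping nothing. A minimal sketch of that behavior, assuming $x is a package variable as in the benchmarks; the sub names chop_local and chop_copy are made up for illustration.

```perl
use strict;
use warnings;

our $x = 12345;   # package variable, as in the benchmark code

# local resets $x to undef until the sub returns, so chop never sees a digit.
sub chop_local {
    local $x;
    no warnings 'uninitialized';
    return chop $x;
}

# Copying first lets chop work on the real digits and leaves $x untouched.
sub chop_copy {
    my $copy = $x;
    return chop $copy;
}

my $from_local = chop_local();
printf "local: '%s'\n", defined $from_local ? $from_local : '';
printf "copy:  '%s'\n", chop_copy();
printf "x is still %s\n", $x;
```

Only the copying version returns the last digit, which is why the numbers from the local version are not comparable with the other entries.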

tye


Re: (tye)Re: japhy regex analysis: case study (RE question...)

by japhy (Canon) on May 28, 2001 at 05:22 UTC

Benchmark.pm

timethese( -3, {
  mod  => sub { $x % 10           },
  chop => sub { chop(my $x = $x) },
  substr => sub { substr($x, -1) },
});
__END__
(they ran for at least 3 seconds)

  chop:  8707.14/s  (n= 29256)
substr:  43620.45/s (n=136532)
   mod:  48906.87/s (n=163838)
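As a sanity check alongside the timings, the sketch below confirms that the three non-regex approaches agree on the result for non-negative integers. The sub names by_mod, by_chop, and by_substr and the sample values are illustrative, not taken from the post.

```perl
use strict;
use warnings;

# Three ways to read the last digit of a non-negative integer.
# Each sub works on its own copy of the argument, so the caller's value is safe.
sub by_mod    { my ($n) = @_; return $n % 10 }
sub by_chop   { my ($n) = @_; return chop $n }        # removes and returns the last character
sub by_substr { my ($n) = @_; return substr($n, -1) } # reads the last character, no modification

for my $n (0, 9, 10, 300, 123456) {
    printf "%6d: mod=%s chop=%s substr=%s\n",
        $n, by_mod($n), by_chop($n), by_substr($n);
}
```

Note that chop and substr operate on the string form of the number, while % operates on its numeric value, so the three only agree for plain integers.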

chop

substr

japhy

Perl and Regex Hacker


(tye)Re2: japhy regex analysis: case study (RE question...)

by tye (Sage) on May 28, 2001 at 07:53 UTC

Re: Re: (tye)Re: japhy regex analysis: case study (RE question...)

by snafu (Chaplain) on May 30, 2001 at 01:56 UTC

Re: (tye)Re: japhy regex analysis: case study (RE question...)

by japhy (Canon) on May 28, 2001 at 03:39 UTC

local $x

use Benchmark 'cmpthese';

$x = int (1_000_000 * rand 1_000_000);

cmpthese( -3, {
  mod => sub { $x % 10 },
  chop => sub { local $x; chop $x },
  chop2 => sub { chop(local $x = $x) },
});

__END__

          Rate chop2  chop   mod
chop2  25430/s    --  -88%  -93%
chop  204396/s  704%    --  -41%
mod   348051/s 1269%   70%    --

chop()

japhy

Perl and Regex Hacker


Re: japhy regex analysis: case study (RE question...)

by snafu (Chaplain) on May 30, 2001 at 01:42 UTC

Thanks!

----------
- Jim


Re: RE question...yup, another one ;)
by nardo (Friar) on May 26, 2001 at 23:23 UTC

perl -e 'for ( $dec = 0 ; $dec <= 300 ; $dec++ ) {$dec =~ /(\d)$/ && print "$dec -> $1\n"; }'


Re: Re: RE question...yup, another one ;)

by snafu (Chaplain) on May 26, 2001 at 23:25 UTC

nardo

perl -e 'for ( $dec = 0 ; $dec <= 300 ; $dec++ ) { ($one = $dec) =~ s/(\d)+(\d)/$2/; print "$dec -> $one\n"; }'

----------
- Jim


Re: Re: Re: RE question...yup, another one ;)

by nardo (Friar) on May 26, 2001 at 23:38 UTC

perl -e 'for (0..300) {print "$_ -> ", chop, "\n"; }'

perl -e 'for (0..300) {/(\d)$/ && print "$_ -> $1\n"; }'


Re: Re: Re: Re: RE question...yup, another one ;)

by snafu (Chaplain) on May 27, 2001 at 00:21 UTC