'use Benchmark' advice usually makes me shudder. As is so often the case, you have a tiny mistake in your code (the quoted snippets can't see your lexical $string), so you are benchmarking nearly identical do-nothing chunks of code.

#!/usr/bin/perl -w

use strict;

use Benchmark "cmpthese";

my $string  = pack "A*" => map { chr (32 + int rand 95) } 0..1024000;

cmpthese( -1, {
    subsingle   => '(my $s = $string) =~ s/[^a-zA-Z0-9 _-]//g',
    subplus     => '(my $s = $string) =~ s/[^a-zA-Z0-9 _-]+//g',
    tran        => '(my $s = $string) =~ tr/a-zA-Z0-9 _-//cd',
} );
warn "Second version\n";
cmpthese( -1, {
    subsingle   => sub { (my $s = $string) =~ s/[^a-zA-Z0-9 _-]//g },
    subplus     => sub { (my $s = $string) =~ s/[^a-zA-Z0-9 _-]+//g },
    tran        => sub { (my $s = $string) =~ tr/a-zA-Z0-9 _-//cd },
} );
__END__
Use of uninitialized value in transliteration (tr///) at (eval 14) line 1.
[about a million warnings]
Use of uninitialized value in transliteration (tr///) at (eval 140) line 1.
Second version
              Rate   subplus subsingle      tran
subplus   114916/s        --       -4%      -14%
subsingle 119259/s        4%        --      -10%
tran      133187/s       16%       12%        --
               Rate   subplus subsingle      tran
subplus   2171607/s        --       -4%      -70%
subsingle 2256550/s        4%        --      -68%
tran      7143583/s      229%      217%        --
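
In case the source of those "uninitialized value" warnings isn't obvious, here is a minimal sketch of the scoping issue. It assumes a reasonably recent Benchmark.pm, which compiles string snippets itself (without strict), so a bare $string in the snippet means the package variable $main::string rather than your lexical. The variable names $seen, $seen_by_string and $seen_by_sub are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timeit);

# Lexical variable: visible only inside this file's own scope.
my $string = 'a!b';

# Package variable that both kinds of snippet can write to.
# (The name $seen is made up for this illustration.)
our $seen = '';

# String form: Benchmark compiles this text itself, so "$string" here
# refers to the (never assigned) package variable $main::string.
timeit( 1, '$main::seen = defined($string) ? "defined" : "undef"' );
my $seen_by_string = $seen;

# Code-ref form: a closure compiled right here, so it captures the lexical.
timeit( 1, sub { $seen = defined($string) ? "defined" : "undef" } );
my $seen_by_sub = $seen;

print "string snippet saw: $seen_by_string\n";
print "sub snippet saw:    $seen_by_sub\n";
```

Passing code refs, as in the second cmpthese call, or declaring the data with "our" instead of "my", are the two usual ways around this.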

The most important take-away from this should be that, even with Benchmark.pm going to extraordinary efforts to try to subtract out the "overhead", I had to resort to ridiculously long strings before it could really tell a difference between the three choices.

So you are not going to notice a difference.

When something takes 0.0000004 seconds for an extraordinarily long string, making it take only 0.0000001 seconds rarely matters, all the more so when you don't have extraordinarily long strings. Outside of Benchmark.pm's imagined view of things, the overhead of actually getting to the point of running the regex or tr/// is going to swamp that 0.0000001-second fiction.
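
For anyone wondering where those tiny numbers come from, they are just the reciprocals of the rates Benchmark reported. A rough sketch of the arithmetic, with the rates copied by hand from the table with the rates in the millions (the helper name ns_per_call is made up here):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rates copied by hand from the cmpthese output (iterations per second).
my %rate = (
    subplus   => 2171607,
    subsingle => 2256550,
    tran      => 7143583,
);

# Convert a rate (iterations/second) to nanoseconds per iteration.
# ns_per_call is a hypothetical helper, not part of Benchmark.pm.
sub ns_per_call {
    my ($per_second) = @_;
    return 1e9 / $per_second;
}

# Slowest first, same order as the cmpthese table.
for my $name ( sort { $rate{$a} <=> $rate{$b} } keys %rate ) {
    printf "%-9s %8.1f ns per call\n", $name, ns_per_call( $rate{$name} );
}

# Time saved per call by switching from s///g to tr///cd.
my $saved_ns = ns_per_call( $rate{subsingle} ) - ns_per_call( $rate{tran} );
printf "saved     %8.1f ns per call\n", $saved_ns;
```

That works out to a few hundred nanoseconds saved per call, which is the scale to keep in mind before rewriting anything for speed.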

- tye

In reply to Re^3: Remove all non alphanumeric characters excluding space, underscore and minus sign (Benchmark--) by tye
in thread Remove all non alphanumeric characters excluding space, underscore and minus sign by ikkeniet
