tr/// is faster for deleting or translating single characters, but s/// takes a full regular expression and can be used for more complex matches.
I'll support my answer with a benchmarked example:
#!/usr/bin/perl -w
use strict;
use Benchmark;
# Generate 1 MB of random data
my $data;
for (my $i = 0; $i < 1048576; $i++) {   # exactly 1048576 bytes
    $data .= chr int rand 256;
}
my $copy = $data;
study $data;
study $copy;
# Remove X'es
Benchmark::cmpthese(-10, {
    's///'  => sub { (my $dummy = $data) =~ s/X//g;  },
    'tr///' => sub { (my $dummy = $copy) =~ tr/X//d; },
});
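# Check that both approaches produce the same result
(my $s_result  = $data) =~ s/X//g;
(my $tr_result = $copy) =~ tr/X//d;
print $s_result eq $tr_result ? "OK\n" : "NOT OK\n";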
This script's output:
Benchmark: running s///, tr///, each for at least 10 CPU seconds...
 s///: 12 wallclock secs (10.80 usr + 0.02 sys = 10.82 CPU) @ 37.89/s (n=410)
tr///: 15 wallclock secs (10.08 usr + 0.08 sys = 10.16 CPU) @ 42.81/s (n=435)
        Rate  s/// tr///
s///  37.9/s    --  -11%
tr/// 42.8/s   13%    --
OK
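The other half of the opening claim, that s/// handles matches tr/// cannot, can be illustrated with a small sketch. The sample string and variable names here are made up for illustration and are not part of the benchmark:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text, chosen only to show the difference
my $text = "aXbXXc  foo123";

# tr/// works on single characters: delete them, or just count them
(my $deleted = $text) =~ tr/X//d;     # every X removed
my $x_count = ($text =~ tr/X//);      # counts X'es, leaves $text unchanged

# s/// takes a regex, so it can match runs and use captures
(my $squeezed = $text) =~ s/X+/-/g;               # each run of X'es becomes one "-"
(my $swapped  = $text) =~ s/([a-z]+)(\d+)/$2$1/;  # swap a word and the digits after it

print "$deleted\n$x_count\n$squeezed\n$swapped\n";
```

For plain single-character deletion both forms give the same result, which is why the benchmark above is a fair comparison.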
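Update 200112212151: My first version forgot to copy the string before removing the X'es. Both operators modified their string in place, so after the first iteration there were no X'es left and every later iteration only timed a scan that removed nothing. For your entertainment, I present the previous version of my post: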
Hi,
I was going to answer that tr/// was faster, and s/// can be used for more complex matches.
I was going to support my answer with a Benchmarked example, which would show that tr/// was a lot faster.
BUT my benchmark told me s/// is the winner. On 1 MB of random data, s/X//g is faster than tr/X//d. If anyone can tell me why this is, or what I'm doing wrong, I'd really appreciate that.
#!/usr/bin/perl -w
use strict;
use Benchmark;
# Generate 1 MB of random data
my $data;
for (my $i = 0; $i < 1048576; $i++) {
    $data .= chr int rand 256;
}
my $copy = $data;
# Remove X'es
Benchmark::cmpthese(-10, {
    's///'  => sub { $data =~ s/X//g;  },
    'tr///' => sub { $copy =~ tr/X//d; }
});
print $copy eq $data ? "OK\n" : "NOT OK\n";
This script's output:
Benchmark: running s///, tr///, each for at least 10 CPU seconds...
 s///: 11 wallclock secs (10.52 usr + 0.01 sys = 10.53 CPU) @ 72.08/s (n=759)
tr///: 10 wallclock secs (10.53 usr + 0.01 sys = 10.54 CPU) @ 61.29/s (n=646)
        Rate tr///  s///
tr/// 61.3/s    --  -15%
s///  72.1/s   18%    --
OK
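Why this first version was misleading can be seen without Benchmark at all. The following is a minimal sketch, using a short repeated string as a hypothetical stand-in for the random megabyte. It relies on tr///d returning the number of characters it deleted:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the 1 MB of random data: 3000 X'es in total
my $data = "aXbXXc" x 1000;

# In-place, as in the first version: the second call has nothing to do
my $first  = ($data =~ tr/X//d);
my $second = ($data =~ tr/X//d);

# On a fresh copy each time, as in the corrected version
my $orig = "aXbXXc" x 1000;
my $copy_first  = ((my $tmp1 = $orig) =~ tr/X//d);
my $copy_second = ((my $tmp2 = $orig) =~ tr/X//d);

print "in place: $first then $second\n";            # in place: 3000 then 0
print "on copy:  $copy_first then $copy_second\n";  # on copy:  3000 then 3000
```

Copying inside the timed sub adds the same copy cost to both candidates, so the comparison stays fair even though the absolute rates drop.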
2;0 juerd@ouranos:~$ perl -e'undef christmas'
Segmentation fault
2;139 juerd@ouranos:~$