PerlMonks  

next question...

by 2501 (Pilgrim)
on Dec 22, 2001 at 00:48 UTC


in reply to Re: Re: Removing commas and dollar signs from a variable.
in thread Removing commas and dollar signs from a variable.

I am missing the benefits/drawbacks of s/// vs. tr///.
Is it something to truly consider, or is it just "more correct"?

thanks!

Replies are listed 'Best First'.
Re: next question...
by Juerd (Abbot) on Dec 22, 2001 at 01:11 UTC
    tr/// is faster, but s/// can be used for more complex matches.
    I'll support my answer with a Benchmarked example:

    #!/usr/bin/perl -w
    use strict;
    use Benchmark;

    # Generate 1 MB of random data
    my $data;
    for (my $i = 0; $i <= 1048576; $i++){
        $data .= chr int rand 256;
    }
    my $copy = $data;
    study $data;
    study $copy;

    # Remove X'es
    Benchmark::cmpthese(-10, {
        's///'  => sub { (my $dummy = $data) =~ s/X//g; },
        'tr///' => sub { (my $dummy = $copy) =~ tr/X//d; }
    });

    print $copy eq $data ? "OK\n" : "NOT OK\n";

    This script's output:
    Benchmark: running s///, tr///, each for at least 10 CPU seconds...
          s///: 12 wallclock secs (10.80 usr +  0.02 sys = 10.82 CPU) @ 37.89/s (n=410)
         tr///: 15 wallclock secs (10.08 usr +  0.08 sys = 10.16 CPU) @ 42.81/s (n=435)
            Rate  s/// tr///
    s///  37.9/s    --  -11%
    tr/// 42.8/s   13%    --
    OK


    Update 200112212151: I forgot to copy the string before removing the X'es. After the first iteration, there'd be no X'es left. For your entertainment, I present the previous version of my post:
    Hi,

    I was going to answer that tr/// was faster, and s/// can be used for more complex matches.
    I was going to support my answer with a Benchmarked example, which would show that tr/// was a lot faster.

    BUT my benchmark told me s/// is the winner. On 1 MB of random data, s/X//g is faster than tr/X//d. If anyone can tell me why this is, or what I'm doing wrong, I'd really appreciate that.

    #!/usr/bin/perl -w
    use strict;
    use Benchmark;

    # Generate 1 MB of random data
    my $data;
    for (my $i = 0; $i < 1048576; $i++){
        $data .= chr int rand 256;
    }
    my $copy = $data;

    # Remove X'es
    Benchmark::cmpthese(-10, {
        's///'  => sub { $data =~ s/X//g; },
        'tr///' => sub { $copy =~ tr/X//d; }
    });

    print $copy eq $data ? "OK\n" : "NOT OK\n";
    This script's output:
    Benchmark: running s///, tr///, each for at least 10 CPU seconds...
          s///: 11 wallclock secs (10.52 usr +  0.01 sys = 10.53 CPU) @ 72.08/s (n=759)
         tr///: 10 wallclock secs (10.53 usr +  0.01 sys = 10.54 CPU) @ 61.29/s (n=646)
            Rate tr///  s///
    tr/// 61.3/s    --  -15%
    s///  72.1/s   18%    --
    OK
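
The flaw described in the update (benchmarking against a string that has already been stripped) can be seen without Benchmark at all. This is only a small sketch: the input string and the variable names are made up for illustration, not taken from the post above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical small input; the original post used 1 MB of random bytes.
my $data = 'aXbXc' x 1000;

# The first pass modifies $data in place and removes 2000 X's.
my $first = ($data =~ s/X//g) || 0;

# A second pass over the same variable finds nothing left to remove,
# so a benchmark looping over it would mostly be timing a scan that deletes nothing.
my $second = ($data =~ s/X//g) || 0;

# Working on a copy, as the corrected benchmark does, leaves the
# source string untouched so every iteration does the same work.
my $fresh = 'aXbXc' x 1000;
(my $dummy = $fresh) =~ s/X//g;
my $still_there = ($fresh =~ tr/X//);   # tr/// without /d only counts

print "first=$first second=$second still_there=$still_there\n";
```

This should print first=2000 second=0 still_there=2000. The copy costs time too, but both candidates pay it equally.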

    2;0 juerd@ouranos:~$ perl -e'undef christmas'
    Segmentation fault
    2;139 juerd@ouranos:~$

substitution speed vs transliteration speed
by boo_radley (Parson) on Dec 22, 2001 at 01:43 UTC
    Yes, tr/// is not a regex, whereas s/// is. This means that tr/// should be faster. Let's find out! I'm using three sets of randomly generated data (256, 1024, and 5120 characters) and removing every occurrence of the letter 'e' from the string. The results? The winner is tr///!

    use Benchmark;
    $reps=500000;

    $x.=("a".."z")[rand 26] for (1..256);
    Benchmark::cmpthese($reps, {
        'sub256' => '$_=$x;s/e//g;',
        'trn256' => '$_=$x;tr/e//d;',
    });
    print "-"x40,"\n";

    $x="";
    $x.=("a".."z")[rand 26] for (1..1024);
    Benchmark::cmpthese($reps, {
        'sub1024' => '$_=$x;s/e//g;',
        'trn1024' => '$_=$x;tr/e//d;',
    });
    print "-"x40,"\n";

    $x="";
    $x.=("a".."z")[rand 26] for (1..5120);
    Benchmark::cmpthese($reps, {
        'sub5k' => '$_=$x;s/e//g;',
        'trn5k' => '$_=$x;tr/e//d;',
    });
    print "-"x40,"\n";

    Benchmark: timing 500000 iterations of sub256, trn256...
        sub256:  3 wallclock secs ( 3.84 usr +  0.00 sys =  3.84 CPU) @ 130208.33/s (n=500000)
        trn256:  2 wallclock secs ( 1.81 usr +  0.00 sys =  1.81 CPU) @ 276243.09/s (n=500000)
               Rate sub256 trn256
    sub256 130208/s     --   -53%
    trn256 276243/s   112%     --
    ----------------------------------------
    Benchmark: timing 500000 iterations of sub1024, trn1024...
       sub1024: 15 wallclock secs (14.56 usr +  0.00 sys = 14.56 CPU) @ 34340.66/s (n=500000)
       trn1024:  6 wallclock secs ( 6.81 usr +  0.00 sys =  6.81 CPU) @ 73421.44/s (n=500000)
               Rate sub1024 trn1024
    sub1024 34341/s      --    -53%
    trn1024 73421/s    114%      --
    ----------------------------------------
    Benchmark: timing 500000 iterations of sub5k, trn5k...
         sub5k: 66 wallclock secs (65.36 usr +  0.00 sys = 65.36 CPU) @ 7649.94/s (n=500000)
         trn5k: 31 wallclock secs (30.64 usr +  0.00 sys = 30.64 CPU) @ 16318.54/s (n=500000)
             Rate sub5k trn5k
    sub5k  7650/s    --  -53%
    trn5k 16319/s  113%    --
    ----------------------------------------
    
    
Re: next question...
by mrbbking (Hermit) on Dec 22, 2001 at 01:20 UTC
    tr/// uses straight character-for-character substitution.
    s/// uses the regex engine.
    Straight substitution is relatively faster and 'cheaper'.
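
A minimal sketch of that difference, with made-up input strings: tr/// pushes each character through a fixed table, while s/// can take the surrounding text into account, which is where the regex engine earns its cost.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# tr/// maps single characters through a fixed table: here, rot13.
(my $rot13 = 'Hello, Monks') =~ tr/A-Za-z/N-ZA-Mn-za-m/;

# s/// can look at context, which tr/// cannot: remove a comma only
# when it sits between two digits, leaving ordinary punctuation alone.
(my $cleaned = 'Total: 1,234,567 items, roughly') =~ s/(?<=\d),(?=\d)//g;

print "$rot13\n";     # Uryyb, Zbaxf
print "$cleaned\n";   # Total: 1234567 items, roughly
```

If all you need is "delete or swap these characters wherever they appear", the table lookup is enough; the second case has no tr/// equivalent.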
Re: next question...
by archen (Pilgrim) on Dec 22, 2001 at 04:51 UTC
    tr/// is for characters, s/// is for strings. tr/// is generally the better choice where it works because it's more efficient. s/// can do everything tr/// can, because it can work with strings of length 1 (i.e. one character); it's just slower. I guess it's more about taste and speed than anything else.
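
For the task in the thread title, both operators give the same result. A small sketch, assuming a price string like the one below (the value and variable names are invented for the example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input, in the spirit of the thread's title.
my $price = '$1,234,567.89';

# Delete every '$' and ',' with a substitution (regex engine).
# The backslash keeps Perl from interpolating the $, variable.
(my $via_s = $price) =~ s/[\$,]//g;

# Delete the same characters with a transliteration.
# tr/// never interpolates, so '$' and ',' are taken literally.
(my $via_tr = $price) =~ tr/$,//d;

# tr/// returns the number of characters it matched; with an empty
# replacement list and no /d it only counts and leaves $price alone.
my $count = ($price =~ tr/$,//);

print "$via_s\n";                     # 1234567.89
print "$via_tr\n";                    # 1234567.89
print "matched $count characters\n";  # matched 3 characters
```

The return value is a handy extra: tr/// doubles as a character counter, which s/// can only imitate by actually modifying the string.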
