Re^3: Fast Replacement (0.01 seconds)

Looking again I see you're right.

But still, rather than invoking the regex engine 50,000 times, better to search for the position of the 50,000th ! and then replace in one pass.

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

my $s = '1234!' x 55e3;

my $start = time;
my( $p, $c )  = ( 0, 50e3 );

1 while --$c and $p = 1+ index $s, '!', $p;

substr( $s, 0, $p ) =~ tr[!][\n];

printf "Took %f seconds\n", time() - $start;

__END__
C:\test>junk71;;
Took 0.011771 seconds

C:\test>junk71;;
Took 0.009690 seconds

That could probably be sped up with a binary chop for the position, but it hardly seems worth it.
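
For anyone who wants to check the result rather than just the timing, here is a minimal sketch of the same index-then-tr idea wrapped in a subroutine. The name `bang_to_newline` and the small test string are illustrative assumptions, not part of the post above. Note that a pre-decrement loop (`--$c`) starting from 50e3 calls index 49,999 times; the sketch uses a post-decrement so that exactly `$n` occurrences are covered, and it stops early if the string runs out of `!`.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): turn the first $n
# '!' characters of the referenced string into newlines, in place.
# Returns the number of characters actually changed.
sub bang_to_newline {
    my( $sref, $n ) = @_;
    my $end = 0;    # offset just past the last '!' found so far

    # Post-decrement: index() runs at most $n times.
    while( $n-- > 0 ) {
        my $i = index $$sref, '!', $end;
        last if $i < 0;         # fewer than $n occurrences in the string
        $end = $i + 1;
    }
    return 0 unless $end;       # no '!' found at all

    # substr() is an lvalue here, so tr/// edits the original string
    # and returns the count of characters it translated.
    return substr( $$sref, 0, $end ) =~ tr[!][\n];
}

my $s    = '1234!' x 5;
my $done = bang_to_newline( \$s, 3 );
print "replaced $done\n";
```

On the five-record test string, the first three `!` become newlines and the last two are left alone. The string is passed by reference so that a 275,000-character input is not copied on the way in.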

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.


Re^4: Fast Replacement (0.01 seconds) by davido (Cardinal) on Jun 14, 2013 at 15:49 UTC
For what it's worth, depending on how I measure, this is at least ten times faster than my solution. In real life it would be quite a bit more than 10x the speed of my solution -- I just used Benchmark to test, and that required making a copy of the input string on each test iteration so as not to mess with the original. Since the OP wouldn't be making copies (hopefully), that could be factored out, and would make the difference between our two algorithms all the more significant.

Dave
Re^5: Fast Replacement (0.01 seconds) by BrowserUk (Patriarch) on Jun 14, 2013 at 19:42 UTC
"making a copy of the input string on each test iteration so as not to mess with the original."

Indeed, it is a pig to benchmark. Here's my attempt. What I did was have the first iteration do `tr[!][\n]` and the second `tr[\n][!]`, using a flag to keep track of odd & even.

It also shows how the problem some people level at tr -- the need to know the lists at compile time -- can be addressed:

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

sub makeTR{ eval "sub{ \$_[ 0 ] =~ tr[$_[0]][$_[1]] }"; }

our $N //= 10;
die "$N must be even and positive" if $N &1 or $N < 2;

our $tr1 = makeTR( '!', "\n" );
our $tr2 = makeTR( "\n", '!' );

our $flag = 0;
our $s = '1234!' x 55e3;

cmpthese $N, {
    a => q[
        if( $flag ) {
            my( $p, $c ) = ( 0, 50e3 );
            1 while --$c and $p = index $s, "\n", $p;
            $tr2->( substr $s, 0, $p );
            $flag ^= 1;
        }
        else {
            my( $p, $c ) = ( 0, 50e3 );
            1 while --$c and $p = index $s, "!", $p;
            $tr1->( substr $s, 0, $p );
            $flag ^= 1;
        }
    ],
    b => q[
        if( $flag ) {
            $s =~ s/\n(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' })/!/g;
            $flag ^= 1;
        }
        else {
            $s =~ s/!(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' })/\n/g;
            $flag ^= 1;
        }
    ],
};

And the results put tr 5x to 30x faster, so your benchmark isn't bad at all:

C:\test>junk71 -N=2
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
  s/iter    a    b
a   5.84   -- -85%
b  0.899 550%   --

C:\test>junk71 -N=4
            (warning: too few iterations for a reliable count)
  s/iter     a    b
a   5.81    -- -92%
b  0.492 1081%   --

C:\test>junk71 -N=10
  s/iter     a    b
a   5.78    -- -95%
b  0.273 2013%   --

C:\test>junk71 -N=20
  s/iter     a    b
a   5.74    -- -97%
b  0.176 3167%   --
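
The makeTR trick above compiles a tr/// from lists that are only known at run time. A slightly more defensive sketch of the same idea follows; the name `make_tr`, the escaping step and the error check are my own assumptions rather than anything from the post. Since tr/// does not interpolate variables, the lists are spliced into source text and compiled once with a string eval.

```perl
#! perl
use strict;
use warnings;

# Hypothetical variant of makeTR: returns a closure that applies
# tr/$from/$to/ to its first argument, in place.
sub make_tr {
    my( $from, $to ) = @_;

    # Backslash-escape the characters that would change how the generated
    # source is parsed: the '/' delimiter, '-' (range) and '\' itself.
    s{([\\/\-])}{\\$1}g for $from, $to;

    # Compile once; die with the compiler's message if the eval fails.
    my $code = eval "sub { \$_[0] =~ tr/$from/$to/ }"
        or die "Could not compile tr/$from/$to/: $@";
    return $code;
}

my $to_nl = make_tr( '!', "\n" );
my $s     = 'a!b!c';
my $n     = $to_nl->( $s );   # $_[0] aliases $s, so $s is modified
print "$n changed\n";
```

The closure returns the count that tr/// reports, and because `@_` aliases the caller's arguments it can be handed an lvalue such as `substr $s, 0, $p`, as the benchmark does. Only pass lists you trust to a string eval like this.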
Re^6: Fast Replacement (0.01 seconds) by davido (Cardinal) on Jun 14, 2013 at 21:16 UTC
Wait, please tell me I'm reading the results wrong.... In my benchmarks yours was faster. But in your benchmarks, "a", which is your algorithm, is taking 5.xx seconds per iteration, whereas "b", which is mine, is taking 0.17-0.9 seconds per iteration. Your benchmark seems to be showing the regexp approach winning by a landslide.

Dave
Re^7: Fast Replacement (0.01 seconds) by BrowserUk (Patriarch) on Jun 14, 2013 at 22:13 UTC
Re^8: Fast Replacement (0.01 seconds) by davido (Cardinal) on Jun 14, 2013 at 22:21 UTC
Re^8: Fast Replacement (0.01 seconds) by davido (Cardinal) on Jun 16, 2013 at 16:52 UTC
Re^7: Fast Replacement (0.01 seconds) by BrowserUk (Patriarch) on Jun 14, 2013 at 21:52 UTC
Re^4: Fast Replacement (0.01 seconds) by muba (Priest) on Jun 14, 2013 at 12:19 UTC
poetry
po·e·try /ˈpəʊɪtri/
noun [mass noun]
`1 while --$c and $p = 1+ index $s, '!', $p;`