in reply to Re^2: Fast Replacement (0.000025s)
in thread Fast Replacement

Looking again I see you're right.

But still, rather than invoking the regex engine 50,000 times, it is better to search for the position of the 50,000th '!' and then replace in one pass.

    #! perl -slw
    use strict;
    use Time::HiRes qw[ time ];

    my $s = '1234!' x 55e3;

    my $start = time;
    my( $p, $c ) = ( 0, 50e3 );
    1 while --$c and $p = 1+ index $s, '!', $p;
    substr( $s, 0, $p ) =~ tr[!][\n];

    printf "Took %f seconds\n", time() - $start;

    __END__
    C:\test>junk71;;
    Took 0.011771 seconds

    C:\test>junk71;;
    Took 0.009690 seconds

That could probably be sped up with a binary chop for the position, but it hardly seems worth it.
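
One way the binary chop might look, as a rough sketch rather than anything measured: narrow down the prefix length that contains the Nth '!' by counting with tr over only the not-yet-counted half at each step. The helper names `chop_nth_bang` and `scan_nth_bang` are made up for illustration, and both take a copy of the string for simplicity. Whether the chop actually beats the plain index loop would need benchmarking.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the length of the shortest prefix of $s
# that contains $n '!' characters (i.e. 1 + index of the Nth '!'),
# or -1 if there are fewer than $n.
sub chop_nth_bang {
    my( $s, $n ) = @_;
    return -1 if $n < 1 or ( $s =~ tr/!// ) < $n;

    # Invariant: the first $lo chars hold $seen (< $n) bangs;
    # the first $hi chars hold at least $n bangs.
    my( $lo, $hi, $seen ) = ( 0, length $s, 0 );
    while( $hi - $lo > 1 ) {
        my $mid = $lo + int( ( $hi - $lo ) / 2 );
        # tr with an empty replacement list only counts; nothing is modified.
        my $k = substr( $s, $lo, $mid - $lo ) =~ tr/!//;
        if( $seen + $k >= $n ) { $hi = $mid }
        else                   { $lo = $mid; $seen += $k }
    }
    return $hi;
}

# Linear reference version: one index call per occurrence.
sub scan_nth_bang {
    my( $s, $n ) = @_;
    return -1 if $n < 1;
    my $p = 0;
    for( 1 .. $n ) {
        $p = 1 + index $s, '!', $p;
        return -1 unless $p;        # index returned -1: ran out of bangs
    }
    return $p;
}

my $s = '1234!' x 55e3;
my $p = chop_nth_bang( $s, 50e3 );
substr( $s, 0, $p ) =~ tr[!][\n] if $p > 0;
```

Both helpers count exactly N occurrences; a loop guarded by `--$c` that starts from N makes N-1 index calls, so its `$p` lands one occurrence earlier.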


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: Fast Replacement (0.01 seconds)
by davido (Cardinal) on Jun 14, 2013 at 15:49 UTC

    For what it's worth, depending on how I measure, this is at least ten times faster than my solution. In real life it would be quite a bit more than 10x the speed of my solution -- I just used Benchmark to test, and that required making a copy of the input string on each test iteration so as not to mess with the original. Since the OP wouldn't be making copies (hopefully), that could be factored out, and would make the difference between our two algorithms all the more significant.


    Dave

      making a copy of the input string on each test iteration so as not to mess with the original.

      Indeed, it is a pig to benchmark. Here's my attempt.

      What I did was have the first iteration do tr[!][\n] and the second tr[\n][!], using a flag to keep track of odd & even. It also shows how the objection some people level at tr -- the need to know the lists at compile time -- can be addressed:

      #! perl -slw
      use strict;
      use Benchmark qw[ cmpthese ];

      sub makeTR{ eval "sub{ \$_[ 0 ] =~ tr[$_[0]][$_[1]] }"; }

      our $N //= 10;
      die "$N must be even and positive" if $N &1 or $N < 2;

      our $tr1 = makeTR( '!', "\n" );
      our $tr2 = makeTR( "\n", '!' );
      our $flag = 0;
      our $s = '1234!' x 55e3;

      cmpthese $N, {
          a => q[
              if( $flag ) {
                  my( $p, $c ) = ( 0, 50e3 );
                  1 while --$c and $p = index $s, "\n", $p;
                  $tr2->( substr $s, 0, $p );
                  $flag ^= 1;
              }
              else {
                  my( $p, $c ) = ( 0, 50e3 );
                  1 while --$c and $p = index $s, "!", $p;
                  $tr1->( substr $s, 0, $p );
                  $flag ^= 1;
              }
          ],
          b => q[
              if( $flag ) {
                  $s =~ s/\n(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' })/!/g;
                  $flag ^= 1;
              }
              else {
                  $s =~ s/!(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' })/\n/g;
                  $flag ^= 1;
              }
          ],
      };

      And the results put tr 5x to 30x faster, so your benchmark isn't bad at all:

      C:\test>junk71 -N=2
                  (warning: too few iterations for a reliable count)
                  (warning: too few iterations for a reliable count)
        s/iter    a    b
      a   5.84   -- -85%
      b  0.899 550%   --

      C:\test>junk71 -N=4
                  (warning: too few iterations for a reliable count)
        s/iter     a    b
      a   5.81    -- -92%
      b  0.492 1081%   --

      C:\test>junk71 -N=10
        s/iter     a    b
      a   5.78    -- -95%
      b  0.273 2013%   --

      C:\test>junk71 -N=20
        s/iter     a    b
      a   5.74    -- -97%
      b  0.176 3167%   --
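
Taken out of the benchmark, the trick of generating a tr at run time can be sketched on its own. This is an illustrative variant, not the code above: the name `make_tr`, the escaping of brackets and backslashes, and the error check on eval are all additions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: build a tr/// closure from lists known only at
# run time. tr needs its lists at compile time, so the sub is compiled
# with a string eval once and can then be called as often as needed.
sub make_tr {
    my( $from, $to ) = @_;
    # Escape backslashes and the [] delimiters so the lists cannot
    # terminate tr[][] early. A '-' still forms a range, as in plain tr.
    s/([\\\[\]])/\\$1/g for $from, $to;
    my $code = eval sprintf 'sub { $_[0] =~ tr[%s][%s] }', $from, $to;
    die "could not compile tr: $@" unless $code;
    return $code;
}

my $to_nl = make_tr( '!', "\n" );
my $text  = 'a!b!c';
# $_[0] aliases the caller's variable, so $text is changed in place;
# the return value is the number of characters translated.
my $changed = $to_nl->( $text );
```

The string eval is paid once per pair of lists; each later call is an ordinary sub call around a compiled tr.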


        Wait, please tell me I'm reading the results wrong.... In my benchmarks yours was faster. But in your benchmarks, "a", which is your algorithm, is taking 5.xx seconds per iteration, whereas "b", which is mine, is taking 0.17-0.9 seconds per iteration. Your benchmark seems to be showing the regexp approach winning by a landslide.


        Dave

Re^4: Fast Replacement (0.01 seconds)
by muba (Priest) on Jun 14, 2013 at 12:19 UTC

    poetry po·e·try /ˈpəʊɪtri/
    noun [mass noun]
        1 while --$c and $p = 1+ index $s, '!', $p;