in reply to Re: should this backspace removal code be done better?
in thread should this backspace removal code be done better?

Small bug:

#!/usr/bin/perl -wl $s = "this is an\b\b not correct usage"; $s =~ s[.\cH][]g; print $s; print "and one's still there" if grep { ord == ord"\b" } split //,$s; __END__ output: this is not correct usage and one's still there

It looks correct at first glance, but the second backspace is still there.

Update: Much better. Only problem is in the case of a leading \b, you can get into an infinite loop. However only the regex needs alteration.

$s =~ s/(?:[^\cH]\cH|^\cH+)//g while ...

antirice    
The first rule of Perl club is - use Perl
The
ith rule of Perl club is - follow rule i - 1 for i > 1

Replies are listed 'Best First'.
Re: Re: Re: should this backspace removal code be done better?
by smackdab (Pilgrim) on Oct 05, 2003 at 06:46 UTC
    AWESOME, it sinks in slowly ;-)

    This solves the problem from the poster below...
    $s = "\bthis is an\b correct\b\b\b usage\b"; do 1 while ($s =~ s/(?:[^\cH]\cH|^\cH+)//g);

      A matter of style, but I don't care for do 1 while ... since you could just do a while with an empty block. Anyhow, I decided to benchmark all three ways:

      #!/usr/bin/perl -w use Benchmark qw(cmpthese); $s = "\bthis is an\b correct\b\b\b usage\b"; sub uk { $a = $s; $a =~ s[(?:[^\cH]\cH|^\cH)][]g while 1+index $a, chr(8); $a; } sub new1 { $a = $s; while ($a =~ s/(?:[^\cH]\cH|^\cH+)//g) {} $a; } sub smack { $a = $s; do 1 while ($a =~ s/(?:[^\cH]\cH|^\cH+)//g); $a; } cmpthese(-5,{uk=>\&uk,new1=>\&new1,smack=>\&smack}); $s x= 100; cmpthese(-5,{uk=>\&uk,new1=>\&new1,smack=>\&smack}); __END__ Benchmark: running new1, smack, uk for at least 5 CPU seconds... new1: 5 wallclock secs ( 5.23 usr + 0.00 sys = 5.23 CPU) @ 31 +024.33/s (n=162393) smack: 6 wallclock secs ( 2.16 usr + 3.05 sys = 5.22 CPU) @ 29 +16.79/s (n=15222) uk: 5 wallclock secs ( 5.19 usr + 0.00 sys = 5.19 CPU) @ 36 +570.60/s (n=189710) Rate smack new1 uk smack 2917/s -- -91% -92% new1 31024/s 964% -- -15% uk 36571/s 1154% 18% -- Benchmark: running new1, smack, uk for at least 5 CPU seconds... new1: 6 wallclock secs ( 5.48 usr + 0.00 sys = 5.48 CPU) @ 36 +8.11/s (n=2016) smack: 6 wallclock secs ( 5.01 usr + 0.41 sys = 5.41 CPU) @ 32 +6.56/s (n=1768) uk: 5 wallclock secs ( 5.23 usr + 0.00 sys = 5.23 CPU) @ 46 +2.14/s (n=2419) Rate smack new1 uk smack 327/s -- -11% -29% new1 368/s 13% -- -20% uk 462/s 42% 26% --

      For whatever reason, your version hits sys very hard. However, BrowserUK's version is the fastest of the pack.

      Hope this helps.

      antirice    
      The first rule of Perl club is - use Perl
      The
      ith rule of Perl club is - follow rule i - 1 for i > 1

        Not any more. This new smack is a lot faster by eliminating the alternation.

        #!/usr/bin/perl -w use Benchmark qw(cmpthese); $s = "\bthis is an\b correct\b\b\b usage\b"; sub uk { $a = $s; $a =~ s[(?:[^\cH]\cH|^\cH)][]g while 1+index $a, chr(8); $a; } sub new1 { $a = $s; while ($a =~ s/(?:[^\cH]\cH|^\cH+)//g) {} $a; } sub smack { $a = $s; $a =~ s/^\cH+//; 1 while ($a =~ s/[^\cH]\cH//g); $a; } cmpthese(-5,{uk=>\&uk,new1=>\&new1,smack=>\&smack}); $s x= 100; cmpthese(-5,{uk=>\&uk,new1=>\&new1,smack=>\&smack}); Benchmark: running new1, smack, uk, each for at least 5 CPU seconds... new1: 6 wallclock secs ( 5.39 usr + 0.03 sys = 5.42 CPU) @ 11 +344.28/s (n=61486) smack: 6 wallclock secs ( 5.24 usr + 0.02 sys = 5.26 CPU) @ 36 +814.64/s (n=193645) uk: 4 wallclock secs ( 5.26 usr + 0.06 sys = 5.32 CPU) @ 13 +261.09/s (n=70549) Rate new1 uk smack new1 11344/s -- -14% -69% uk 13261/s 17% -- -64% smack 36815/s 225% 178% -- Benchmark: running new1, smack, uk, each for at least 5 CPU seconds... new1: 5 wallclock secs ( 5.22 usr + 0.05 sys = 5.27 CPU) @ 14 +0.04/s (n=738) smack: 6 wallclock secs ( 5.26 usr + 0.07 sys = 5.33 CPU) @ 87 +3.55/s (n=4656) uk: 5 wallclock secs ( 5.18 usr + 0.04 sys = 5.22 CPU) @ 17 +6.44/s (n=921) Rate new1 uk smack new1 140/s -- -21% -84% uk 176/s 26% -- -80% smack 874/s 524% 395% --

        /-\

      Catering for multiline input and assuming backspaces at beginning of lines should simply be removed:

      $s = "\b\bthis is an\b correct\b\b\b usage\b\n\b2nd\n\b\b3rd\n"; $s =~ s/^[\b]+//mg; 1 while $s =~ s/[^\b][\b]//g;

      /-\