in reply to Cleaning the Log

Actually, this is more complicated than a pure regex can handle. Perl regexes _may_ be able to handle it, but the code would be scaaaary. The problem is even harder than matching balanced constructs, and it cannot be solved with a formal regular expression.

This is because we can't simply look for .{BS}, as is clear from your sample data such as bnack{BS}{BS}{BS}{BS}, where the last {BS} actually removes the 'n'.
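To make the point concrete, here is a small sketch of the naive single-pass substitution on that sample. The helper name naive_clean is made up for illustration and is not from the original thread.

```perl
use strict;
use warnings;

# Hypothetical helper: the naive approach, deleting every character
# that is directly followed by a {BS} marker in one global pass.
sub naive_clean {
    my ($text) = @_;
    $text =~ s/.\{BS\}//gs;
    return $text;
}

# Works when backspaces come one at a time...
print naive_clean('hee{BS}llo'), "\n";                 # hello

# ...but a run of them is mangled: the '.' ends up eating the closing
# brace of a neighbouring marker instead of an earlier typed character.
print naive_clean('bnack{BS}{BS}{BS}{BS}ack'), "\n";   # bnac{BS{BS}ack
```

The second call should have produced "back"; each {BS} has to remove the last character that survived the backspaces before it, which a single pattern match cannot track.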

But only a slight amount of additional code will allow a working solution....

my $logdata=<<EOLOG;
pass{BS}{BS}{BS}{BS}{BS}{BS}
hee{BS}llo, I'll be bnack{BS}{BS}{BS}{BS}ack next saturday.{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}
www.yahoo.com
"keyloggers" + "linux"{BS}{BS}{BS}
EOLOG

my $clean = '';
# Walk the log one token at a time: either a {BS} marker or a single character.
while ($logdata =~ /\G(\{BS\}|.)/sg) {
    if (length($1) > 1) {
        chop $clean;     # {BS}: drop the last kept character, if there is one
    }
    else {
        $clean .= $1;    # ordinary character (newlines included): keep it
    }
}
print $clean;
YMMV

Yves / DeMerphq
---
Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)

Re^2: Scaring the Log
by Aristotle (Chancellor) on Sep 03, 2002 at 14:34 UTC
    perl regexes _may_ be able to handle this, but the code would be scaaaary
    I felt like doing something scary. *grin* The trick is to use sexeger - a regex that operates on the reversed string.
    s/{BS}/\b/g;
    $_ = reverse $_;
    my $k = 0;
    s/(\010+)(??{ $k += length $+; "([^\010]{0,$k})" })(?{ $k -= length $+ })//g;
    $_ = reverse $_;
    Don't forget to tune in for the next issue of H.R. Giger meets Perl. *grin*
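For comparison, the same reversed-string idea can be spelled out as an explicit loop with no regex code blocks. This is only a sketch, not the code from the reply: the name clean_reversed is made up, and it assumes the log contains no literal \010 characters of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read the string backwards, so every backspace is
# seen before the character it erases.
sub clean_reversed {
    my ($text) = @_;
    $text =~ s/\{BS\}/\010/g;     # one control character per {BS} marker
    my $pending = 0;              # backspaces still waiting for a character
    my $kept    = '';
    for my $ch (split //, scalar reverse $text) {
        if ($ch eq "\010") { $pending++ }      # remember the backspace
        elsif ($pending)   { $pending-- }      # this character was erased
        else               { $kept .= $ch }    # this character survives
    }
    return scalar reverse $kept;  # restore the original reading order
}

print clean_reversed('bnack{BS}{BS}{BS}{BS}ack'), "\n";   # back
```

Backspaces left over when the loop reaches the start of the string are simply ignored, which matches a {BS} typed into an empty line.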

    Makeshifts last the longest.