in reply to Removing characters

Just for fun, I'll assume that I got the output from "man" and someone else has already removed the backspaces for me, so what I want is to remove the duplicates. And based on my experience, I may end up with tripled or quadrupled letters as well.

You could do pretty well by finding "words" that contain only doubled letters and modifying those. But a "good" way to do this hasn't popped into my brain yet...

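    my $inWord=  "[-\\w'(),]";
    my $notWord= "[^-\\w'(),]";
    # Match a "word" that starts and ends with a doubled character.
    # $len ends up as the smallest number of extra repeats in any run,
    # and that many repeats are then stripped from every run.
    # The loop is a while, not a for, so that $2 is read once per match.
    s#($notWord)(($inWord)\3(?:$inWord)*($inWord)\4)($notWord)#
        my( $pre, $word, $post )= ( $1, $2, $5 );
        my $len= length($word);
        while(  $word =~ /(.)(\1*)/g  ) {
            $len= length($2)   if  length($2) < $len;
        }
        $word =~ s/(.)\1{$len}/$1/g   if  0 < $len;
        $pre . $word . $post;
    #ge
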
Like I said, that doesn't seem like a great way to do it (untested as well). :-}
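
To make the run-length idea concrete, here is a minimal sketch of just that step, applied to one word at a time. The sub name `collapse_runs` is made up for illustration and is not part of the code above; the sketch assumes every character of the word was repeated at least the same number of times.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the overstrike repeats from a single word.
sub collapse_runs {
    my ($word) = @_;

    # Find the smallest number of extra repeats in any run of equal characters.
    my $extra = length $word;
    while ( $word =~ /(.)(\1*)/g ) {
        $extra = length $2 if length $2 < $extra;
    }

    # Remove that many repeats from every run (a no-op for ordinary words).
    $word =~ s/(.)\1{$extra}/$1/g if $extra > 0;
    return $word;
}

print collapse_runs("NNAAMMEE"), "\n";      # doubled letters
print collapse_runs("NNNAAAMMMEEE"), "\n";  # tripled letters
print collapse_runs("BBOOOOKK"), "\n";      # doubled word that itself contains "OO"
print collapse_runs("hello"), "\n";         # ordinary word, left alone
```

Taking the minimum over all runs is what lets "BBOOOOKK" come out as "BOOK" rather than "BOK": the shortest run tells you how many times each original character was overstruck.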

        - tye (but my friends call me "Tye")

Re: (tye)Re: Removing characters
by chipmunk (Parson) on Jan 09, 2001 at 04:03 UTC
    Here's another way of doing it, inspired by your code:
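        my $word = q{-\w'(),};
        s{
            (^|[^$word])            # $1: start of string or a non-word character
            ( (?:([$word])\3)+ )    # $2: one or more doubled word characters
            (?=[^$word]|$)          # followed by a non-word character or the end
         }{
            my @x = ($1, $2);
            $x[1] =~ s/(.)./$1/g;   # keep the first character of each pair
            join '', @x;
         }xige;
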
    That matches a complete "word" that consists entirely of doubled characters. The "word" is in $2; $1 holds the preceding character (or is empty at the start of the string). In the replacement, since I know that $2 contains only doubled characters, I just delete every other character. (Tested.)
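
For anyone who wants to try this on a sample line, here is a small sketch that wraps the same substitution in a sub. The name `undouble` and the sample strings are invented for the example; the pattern is the one from the post.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above.
sub undouble {
    my ($text) = @_;
    my $word = q{-\w'(),};
    $text =~ s{
        (^|[^$word])            # start of string or a non-word character
        ( (?:([$word])\3)+ )    # a word made up only of doubled characters
        (?=[^$word]|$)          # lookahead, so the separator is not consumed
    }{
        my @x = ($1, $2);
        $x[1] =~ s/(.)./$1/g;   # drop the second character of each pair
        join '', @x;
    }xige;
    return $text;
}

print undouble("NNAAMMEE ls - list directory contents"), "\n";
print undouble("llss --ll"), "\n";
```

Because the trailing separator is only a lookahead, two doubled words separated by a single space are both fixed in one pass. The catch is that a genuine word made entirely of doubled letters would be collapsed as well, and tripled letters are not handled.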