eduardo has asked for the wisdom of the Perl Monks concerning the following question:

ok, a friend of mine is taking a perl/cgi class in school and she had a problem posed to her: using perl, how do you make duplicate letters collapse to a single letter? example: say you have the word 'letters'. it should collapse to 'leters', and 'foo' should collapse to 'fo', but 'fooo' should be 'foo' because the 3rd o isn't a duplicate... am i making myself clear? i am sure there is some neato tr// or s// that i could have used, however i am pitiful with regexps, so the best i could come up with was:
#!/usr/bin/perl -w
use strict;

while (<>) {
    my $temp = '';
    print join ('',
        map {
            if ($_ eq $temp) {
                $temp = '';
            } else {
                $temp = $_;
            }
        } split('', $_)
    );
}
anyone have any better ideas? i would love to see some other cool perlish ways of doing this...

Re: removing duplicate letters
by plaid (Chaplain) on Jun 18, 2000 at 23:50 UTC
    This seems to work:
    s/(.)\1/$1/g;
    The tr operator does what you want, except it squashes all duplicates, not just two. For reference though:
    tr/a-zA-Z//s;
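
A minimal sketch comparing the two one-liners above on the examples from the question. The helper names `collapse_pairs` and `squash_runs` are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each adjacent pair of identical characters to one.
sub collapse_pairs {
    my ($s) = @_;
    $s =~ s/(.)\1/$1/g;
    return $s;
}

# Hypothetical helper: squash every run of a repeated letter down to one letter.
sub squash_runs {
    my ($s) = @_;
    $s =~ tr/a-zA-Z//s;
    return $s;
}

for my $word (qw(letters foo fooo)) {
    printf "%-8s pairs: %-8s squash: %s\n",
        $word, collapse_pairs($word), squash_runs($word);
}
```

The two only disagree once a run is longer than two characters: for 'fooo' the substitution leaves 'foo', while tr with /s leaves 'fo'.
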
      Actually, the dot doesn't fit his definition, since it matches any character. eduardo specified letters, so we should change it to
      s/(\w)(\1)/$1/g;
      Update: Oops! takshaka exposed me for the fake I am! (see reply below)
        If we're going to be pedantic, \w matches nonletters as well (digits and the underscore). Use a character class with the appropriate definition of "letters".

        s/([a-zA-Z])\1/$1/g;
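
To make the difference concrete, here is a small sketch that runs both patterns on an input containing doubled digits and underscores. The helper names and the sample string are assumptions chosen for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: \w matches letters, digits and underscore.
sub collapse_word_chars {
    my ($s) = @_;
    $s =~ s/(\w)\1/$1/g;
    return $s;
}

# Hypothetical helper: only ASCII letters count as "letters" here.
sub collapse_letters {
    my ($s) = @_;
    $s =~ s/([a-zA-Z])\1/$1/g;
    return $s;
}

my $input = 'aa11__';
print "\\w:       ", collapse_word_chars($input), "\n";
print "[a-zA-Z]: ", collapse_letters($input), "\n";
```

With \w the doubled digits and underscores are collapsed too; with the explicit class only the doubled 'a' is touched.
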

      damn it ;) that's it, that's what i wanted...
      s/(.)\1/$1/g;
      oh well, what i kept on trying was:
      s/(.)\$1/$1/g;
      and that sure as hell didn't work ;) thank you very much!
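
A small sketch of why the `\$1` attempt never matched: inside a pattern, `\$` is a literal dollar sign, so the regex looks for a character followed by the text "$1", whereas `\1` refers back to the first capture. The helper name and the second test string are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the attempt that did not work.
sub escaped_dollar {
    my ($s) = @_;
    $s =~ s/(.)\$1/$1/g;   # \$ is a literal dollar sign, not a backreference
    return $s;
}

print escaped_dollar('foo'), "\n";    # input has no literal "$1", so nothing is replaced
print escaped_dollar('a$1b'), "\n";   # here the literal "$1" after "a" is removed
```
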
Re: removing duplicate letters
by chromatic (Archbishop) on Jun 19, 2000 at 01:45 UTC
    Here's a different approach. I may benchmark this later:
    $newword = join '', split(/(\w+)\1/, $word);
    Whoops, missed the + the first time! Let's say the first was a warm up, and the second actually works:
    $newword = join '', split(/((\w)+)\1(\2{0,1})/, $word);
    There. Appropriately hard to read.

    Update: Yes, I had the backreferences backward, which breaks things, as the Anonymous Monk points out below. That'll teach me to be crafty.

      Trying the latest suggestion...

      $s = "foooobaaaarrr";
      print "Before: $s\n";
      $newword = join '', split(/((\w)+)\2(\1{0,1})/, $s);
      print "After:  $newword\n";
      

      Result: no change. Not hereabouts, anyway.
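
One possible explanation for the "no change" result, offered as a sketch rather than a diagnosis: when the pattern given to split contains capturing groups, split returns the captured text along with the fields, so text that the pattern matched can come straight back in the joined string. The helper name below is made up, and the `//` operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: split with the pattern from the post above, then rejoin.
sub rejoin_after_split {
    my ($s) = @_;
    my @parts = split /((\w)+)\2(\1{0,1})/, $s;
    # Print every piece split handed back, captured text included.
    print "piece: '", ($_ // 'undef'), "'\n" for @parts;
    return join '', map { $_ // '' } @parts;
}

my $s = "foooobaaaarrr";
print "rejoined: ", rejoin_after_split($s), "\n";
```
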

(jcwren) RE: removing duplicate letters
by jcwren (Prior) on Jun 18, 2000 at 23:53 UTC
    It seems like your rule needs some clarification before this problem can be solved. What dictates that 'fooo' should be collapsed to 'foo', but not 'fo'? What does 'foooo' collapse to? 'foo' or 'fooo'?

    --Chris
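
One way to pin the rule down, assuming the pairwise reading used by the s/(.)\1/$1/g answers elsewhere in the thread: each pair inside a run collapses to one character, so a run of n identical characters ends up with n/2 rounded up. The sketch below spells that out; `halve_runs` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: a run of n identical characters becomes ceil(n/2) of them.
sub halve_runs {
    my ($s) = @_;
    # $1 is the whole run, $2 is the repeated character.
    $s =~ s{((.)\2*)}{ $2 x int((length($1) + 1) / 2) }ge;
    return $s;
}

printf "%-6s -> %s\n", $_, halve_runs($_) for qw(foo fooo foooo);
```

Under this reading 'fooo' and 'foooo' both end up as 'foo'. A different reading of the question would need a different rule.
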