wtritchie has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I need a regex to show only every 2nd character of the same character in a string. I've been racking my brain trying to do it with back references, lookahead and lookbehind, but can't figure it out. To clarify: given the string "Hello this seems to be a good day", if you wanted only the second 'e' kept, the result would be "Hllo this sems to b a good day". So only the second 'e' is kept; hope this clarifies things. Thanks, Will
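Read literally, the example above asks for every occurrence of one chosen character to be removed except the second. A minimal sketch of that reading follows; the helper name `keep_nth_occurrence` and its parameters are hypothetical, not something from the thread, and this is only one possible interpretation of the question.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the $n-th occurrence of $char in $str
# and delete every other occurrence of that character.
sub keep_nth_occurrence {
    my ($str, $char, $n) = @_;
    my $count = 0;
    # /e evaluates the replacement as code; \Q...\E quotes regex metacharacters.
    $str =~ s/\Q$char\E/ ++$count == $n ? $char : '' /ge;
    return $str;
}

print keep_nth_occurrence('Hello this seems to be a good day', 'e', 2), "\n";
# prints: Hllo this sems to b a good day
```

The counter lives outside the substitution, so each match can be compared against the wanted position without any back references or lookarounds.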

Re: Regex to only show 2nd character of the same character in a string
by AnomalousMonk (Archbishop) on Aug 23, 2010 at 03:29 UTC

    wtritchie:

    I'm having trouble figuring out just what you want. Given the string 'abccabbcabc', just what output do you need?

    Also, can you elaborate on just how you differentiate between the first character and the second character of 'the same character in a string' given that they are the same character? E.g., are you looking for the positions of each second character?

    Update: Clarified who 'you' refers to in this reply.

      then...
      my $str = 'abccabbcabc';
      my %seen;
      while ($str =~ /(.)/g) {
          printf "character %s found at pos %d, reps %d\n", $1, pos($str), $seen{$1}
              if $seen{$1}++;
      }

        Yes, but is that what wtritchie really wants? That's what I'm still in the dark about, and likely to remain so until wtritchie throws some light on the subject.

        BTW: ikegami's guess (Update: anent UTF-16 or some such) seems like it might be somewhere near the ballpark; ikegami possesses a preternatural ability for making such guesses.

        Update: ikegami has withdrawn his first reply in its entirety, but I'd still be willing to bet a doughnut to a million computrons that something like his UTF-16 guess is not far wrong. But wtritchie holds the key to all this, and further wtritchie sayeth not.

Re: Regex to only show 2nd character of the same character in a string
by ikegami (Patriarch) on Aug 23, 2010 at 03:27 UTC
    my @chars = /.(.)/sg;

    By any chance, are you trying to decode UTF-16 improperly? (Is every second character a NUL, "\0"?)

    (I missed the "same character" bit.)
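For what the UTF-16 guess would look like in practice, here is a small sketch. The sample bytes are an assumption made up for illustration (the string "Hi" in UTF-16LE); `decode` is the function exported by the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed sample input: "Hi" encoded as UTF-16LE, so every second byte is NUL.
my $bytes = "H\0i\0";

# Picking every second "character" with the regex returns only the NULs here.
my @every_second = $bytes =~ /.(.)/sg;

# Decoding the bytes properly yields the intended two-character string.
my $text = decode('UTF-16LE', $bytes);

printf "regex picked %d NUL(s); decoded text is '%s'\n",
    scalar(grep { $_ eq "\0" } @every_second), $text;
```

If the data really were UTF-16, decoding it would be the fix, rather than stripping every other character with a regex.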

Re: Regex to only show 2nd character of the same character in a string
by ikegami (Patriarch) on Aug 23, 2010 at 17:20 UTC
    my @chars = grep 0==(++$seen{$_})%2, /./sg;

    For example,

    $ perl -E'say for grep 0==(++$seen{$_})%2, "abccabbcabc" =~ /./sg;'
    c
    a
    b
    b
    c

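The one-liner relies on a global `%seen`, which is fine on the command line but carries counts over if the expression is run more than once. A sketch of the same idea wrapped in a sub with a lexical hash; the name `even_occurrences` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return every even-numbered occurrence (2nd, 4th, ...)
# of each character, in the order those occurrences appear in the string.
sub even_occurrences {
    my ($str) = @_;
    my %seen;    # lexical, so the counts start from zero on every call
    return grep { ++$seen{$_} % 2 == 0 } $str =~ /./sg;
}

print join(' ', even_occurrences('abccabbcabc')), "\n";    # prints: c a b b c
```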
Re: Regex to only show 2nd character of the same character in a string
by BrowserUk (Patriarch) on Aug 23, 2010 at 18:02 UTC

    My guess to your meaning would be:

    print $2 while 'abccabbcabc' =~ m[(c).*?(\1)]g;;
    c c

    As for your purpose, that's beyond me. Care to enlighten us?


Re: Regex to only show 2nd character of the same character in a string
by ww (Archbishop) on Aug 23, 2010 at 20:11 UTC
    And just for variety (and also because this really *IS* the one way I can make OP's description make sense), do you mean you want to "show" the second char of each contiguously paired char?

    That would be, in the example provided by salva, 'abccabbcabc', and used by several other respondents:

    c   b

    which this provides:

    #!/usr/bin/perl
    use warnings;
    use strict;
    # 856624

    my $str   = 'abccabbcabc';
    my @chars = split(//, $str);
    my $seen  = '-';    # previous character; '-' is a placeholder before the first one

    for my $char (@chars) {
        # 'eq' compares literally; the original '=~' would treat $seen as a regex
        if ($char eq $seen) {
            print $char . "\t|\t";    # skip the tabs and pipe if desired
        } else {
            $seen = $char;
        }
    }

    =head Output:
    c | b |
    =cut

      If you're right, then the following would do:
      my @chars = /(.)\1/sg;
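A quick sketch of that regex applied to the sample string, wrapped in a hypothetical helper named `doubled_chars` so it can take any input. One difference from the loop above is worth noting: regex matches do not overlap, so a run of three identical characters is reported once here, whereas the loop prints it twice.

```perl
use strict;
use warnings;

# Hypothetical helper: return the repeated character of each adjacent pair.
sub doubled_chars {
    my ($str) = @_;
    return $str =~ /(.)\1/sg;
}

for my $str ('abccabbcabc', 'aaa') {
    my @chars = doubled_chars($str);
    print "$str: @chars\n";
}
# prints:
#   abccabbcabc: c b
#   aaa: a
```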