Micz has asked for the wisdom of the Perl Monks concerning the following question:

hello,
I am replacing strings in a string, using regexes. I would like to match e.g. 2: when it stands alone (a 2: U) but not when it is up against another string (a 2:A). Additionally, it should also work when the 2: is at the end of the string. Here is what I tried:

$f =~ s/\s(i|a|e|o|u|y|E):\s/ \1 /;

but \s doesn't match at the end of a string, and the word boundary \b doesn't work for the ":". What is the best way to do this?

thanks! jan

Replies are listed 'Best First'.
Re: finding nonword character at end of strings
by Abigail (Deacon) on Jun 26, 2001 at 23:35 UTC
    First, you shouldn't use all that alternation; it's much slower (and harder to read) than a character class. Second, you want a so-called zero-width negative lookahead: you want to make sure that what follows doesn't match some regular expression. Third, don't use \1 in the replacement, use $1.
    $f =~ s/\s([iaeouyE]):(?!\S)/ $1/;
    Replace a whitespace character, a vowel, and a colon, not followed by something that isn't whitespace, with a space and said vowel. Alternatively, if you know the first whitespace is always a space (or if you just want to keep whatever whitespace it was), you could use a zero-width positive lookbehind:
    $f =~ s/(?<=\s[iaeouyE]):(?!\S)//;
    The (?<= ) construct is the lookbehind. There's more about lookaheads and lookbehinds in the perlre manual page.
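
    A small sketch to compare the two substitutions side by side. The helper names and the sample strings are made up for illustration (they use e: rather than the 2: from the question, since the character class only lists vowels):

```perl
use strict;
use warnings;

# Hypothetical helper: capture the vowel and rewrite the leading whitespace.
sub strip_colon_capture {
    my ($f) = @_;
    # whitespace, vowel, colon, and then "no non-whitespace character next"
    $f =~ s/\s([iaeouyE]):(?!\S)/ $1/;
    return $f;
}

# Hypothetical helper: only the colon is consumed, so the whitespace
# in front of the vowel is left exactly as it was.
sub strip_colon_lookbehind {
    my ($f) = @_;
    $f =~ s/(?<=\s[iaeouyE]):(?!\S)//;
    return $f;
}

# Invented sample inputs: standalone, glued to a letter, end of string, tab.
for my $s ('a e: U', 'a e:A', 'a e:', "a\te: U") {
    printf "[%s] capture: [%s] lookbehind: [%s]\n",
        $s, strip_colon_capture($s), strip_colon_lookbehind($s);
}
```

    The two versions agree on the first three inputs. They differ only on the last one, where the first turns the tab into a space and the second keeps the tab.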

    -- Abigail

Re: finding nonword character at end of strings
by busunsl (Vicar) on Jun 26, 2001 at 14:27 UTC
    Use a character class:

    $f =~ s/\s(i|a|e|o|u|y|E):(\s|$)/ $1 /;

    I also changed the \1 to a $1, since $1 will catch the matched value in parens.

    Update: Of course iakobski and Hofmator are right, so don't look here, look further down.
    I changed the erroneous code.

      Good try, but you cannot use "end of string" in a character class. Also, the parser sees $] as a variable and cannot find the closing bracket.

      Try this one:

      $f =~ s/\s(i|a|e|o|u|y|E):(\s|$)/ $1$2/;
      It also captures the space or end of string and uses that rather than a space: this may or may not be what you want.
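
      To see what ends up in $2, here is a minimal sketch. The helper name and the inputs are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution above.
sub strip_colon_keep_trailing {
    my ($f) = @_;
    # $2 is whatever followed the colon: a single whitespace character,
    # or the empty string when the colon sat at the end of the string.
    $f =~ s/\s(i|a|e|o|u|y|E):(\s|$)/ $1$2/;
    return $f;
}

print strip_colon_keep_trailing('a e: U'),  "\n";  # a e U
print strip_colon_keep_trailing('a e:'),    "\n";  # a e
print strip_colon_keep_trailing('a e:A'),   "\n";  # a e:A (unchanged)
print strip_colon_keep_trailing("a e:\tU"), "\n";  # the tab after the colon is kept
```

      Note that the whitespace in front of the vowel is still replaced by a plain space; only the trailing whitespace is preserved.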

      -- iakobski

        In your solution the $ in the character class loses its special meaning and becomes a literal '$'. You have to use an alternation with | instead. Furthermore I would use a character class for the first part ... this leads to

        $f =~ s/\s([iaeouyE]):(?:\s|$)/ $1 /;
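
        A rough sketch of the difference, with made-up test strings. The $ inside the class is written as \$ here, which is my assumption for how to get a literal dollar sign past variable interpolation:

```perl
use strict;
use warnings;

# Inside a character class the dollar sign is only a literal '$'.
my $class = qr/\s([iaeouyE]):[\s\$]/;
# With alternation, $ keeps its "end of string" meaning.
my $alt   = qr/\s([iaeouyE]):(?:\s|$)/;

# Invented inputs: colon at the end, colon before a literal '$', colon before a space.
for my $s ('a e:', 'a e:$', 'a e: U') {
    printf "%-8s class: %-8s alternation: %s\n", $s,
        ($s =~ $class ? 'match' : 'no match'),
        ($s =~ $alt   ? 'match' : 'no match');
}
```

        The character class version misses 'a e:' (nothing follows the colon) and matches 'a e:$' instead; the alternation does the opposite, which is what the question asked for.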

        Update: Oops, this was meant as a reply to busunsl not to iakobski ...

        Update 2: and bikeNomad discovered some left-out parens ... thanks, fixed it

        -- Hofmator