in reply to Regex is eating up whitespace

Aside from this probably being an inadequate approach to solving the actual problem, you are using character class matches where you need zero-width assertions (anchors, loosely speaking). Changing [^<] to the negative look-behind (?<!<) avoids matching the preceding character but still ensures that it isn't <. Try:

$text =~ s/\b(?<!<)S\.(?!=)/<S.=initial>/g;

Remember that the substitution replaces all the characters matched, so you must either capture and reinsert any "extra" matched characters or not match them in the first place (anchors and look-around assertions don't "match" characters in this sense).
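
To make the difference concrete, here is a minimal sketch. The sample text is made up, and the character-class pattern is only my guess at what the original attempt looked like, since it isn't quoted in this thread:

```perl
use strict;
use warnings;

# Hypothetical sample text; the original poster's input is not shown here.
my $sample = 'Ulysses S. Grant and <S.=initial> Smith';

# Character-class version (an assumed reconstruction of the original attempt):
# [^<] matches, and therefore consumes, the character before the S.
(my $consuming = $sample) =~ s/[^<]S\.(?!=)/<S.=initial>/g;

# Look-behind version from the reply above: (?<!<) only checks the
# preceding character; that character is not part of the match.
(my $preserving = $sample) =~ s/\b(?<!<)S\.(?!=)/<S.=initial>/g;

print "class:       $consuming\n";
print "look-behind: $preserving\n";
```

With the character class, the space before "S." is part of the match and disappears in the replacement; with the look-behind it survives. The already tagged <S.=initial> is left alone in both cases.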


Perl reduces RSI - it saves typing

Re^2: Regex is eating up whitespace
by gatito (Novice) on Sep 29, 2008 at 23:42 UTC
    Interesting, I wasn't aware of anchors at all. That definitely makes it easier.

    I ended up coming up with this, which seems to work fine and avoids previously tagged cases, but the anchored version is better.

    $text =~ s/(\bS\.(?=\s))/\<\1=initial\>/g;

      You should use $1 instead of \1 in the substitution. < and > are not magical in regexen, and the replacement part is interpolated like a double-quoted string, so only $, @ and \ are special there - you do not need to escape < and >. Perl also allows you to use different delimiters for regexes, which can often make the regex much easier to read. Consider:

      $text =~ s!(\bS\.(?=\s))!<$1=initial>!g;
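
A quick way to check this, as a sketch only (the input string is invented for illustration), is to run the same substitution with two different delimiter styles and compare the results:

```perl
use strict;
use warnings;

# Hypothetical input, for illustration only.
my $text = 'John S. Smith met <S.=initial> Brown';

# Capture the matched "S." and reinsert it with $1. The look-ahead (?=\s)
# requires following whitespace without consuming it.
(my $tagged = $text) =~ s!(\bS\.(?=\s))!<$1=initial>!g;

# The same substitution with brace delimiters, another common choice.
(my $braced = $text) =~ s{(\bS\.(?=\s))}{<$1=initial>}g;

print "$tagged\n";
print "$braced\n";
```

Both lines should print the same thing, with the bare "S." tagged and the previously tagged one untouched, because the "S." inside the tag is followed by = rather than whitespace.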

      Perl reduces RSI - it saves typing
      You should definitely read the Regex chapter in the Camel book. It's a tough slog; it took me about 10 rereads to finally get it, but it's well worth it. I'm no regex master by any measure, but understanding that chapter went a long way toward making me fluent in Perl.

      My criteria for good software:
      1. Does it work?
      2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?