Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to substitute a word on many, many pages. The word "old" needs to be changed to "new". I don't want something like this to happen: lettersoldmoreletters switched to lettersnewmoreletters. I only want it changed like this: old to new, with no letters around the word. Do I need to use a boundary in the regular expression to do this?
s/old/new
s/\bbold\b/\bnew\b/gi

Replies are listed 'Best First'.
Re: boundary reg expression
by Skeeve (Parson) on Jul 21, 2003 at 12:11 UTC
    You don't need to, but it might simplify your regexp ;-)
    s/\bbold\b/new/gi
    should do it.
      rather s/\bold\b/new/gi, this typo is already in the original post ...

      And be aware that this changes "I'm doing fine. Old things are nice." to "I'm doing fine. new things are nice." (note the lowercase "new" at the start of the sentence).

      -- Hofmator

        Of course s/\bold\b/new/gi. That's why I wrote it.
Re: boundary reg expression
by allolex (Curate) on Jul 21, 2003 at 12:32 UTC

    s/\bold\b/new/gi;, just like Skeeve said. :)

    It might help you to remember that the boundary matcher is a so-called "zero-width assertion": it matches a position rather than any characters, so it only belongs on the left-hand (pattern) side of a substitution. The newline (\n) and other special sequences that involve formatting and printing can be on both sides. That's because Perl treats the replacement side like a double-quoted string, where \b means a backspace character rather than a word boundary. You can get more info on this by running perldoc perlre at the shell/command prompt.
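To make the point about the two sides of a substitution concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration; the second substitution is the one from the original question, with the pattern typo fixed.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen to include the "lettersold..." case.
my $text = "the old lettersoldmoreletters";

# \b only in the pattern: the standalone word is replaced, nothing else.
(my $good = $text) =~ s/\bold\b/new/g;

# \b also in the replacement: there it is a backspace character (0x08),
# so two invisible control characters end up in the output.
(my $bad = $text) =~ s/\bold\b/\bnew\b/g;

# Make the backspaces visible before printing.
(my $visible = $bad) =~ s/\x08/<BS>/g;

print "$good\n";     # the new lettersoldmoreletters
print "$visible\n";  # the <BS>new<BS> lettersoldmoreletters
```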

    HTH

    --
    Allolex

      Thanks all. So if I need to replace OLD with NEW, then I do this?
      s/OLD\b/NEW/g;
      All the words I'm replacing (OLD) are capitalized, and so is the new word.

        Well, sort of ;) Take these examples:

        s/\bold\b/new/gi;   # 1: replaces all instances of the string, e.g. "old",
                            #    "OLD", "Old", "OlD", with a word boundary on
                            #    either side, with "new".
        s/\bold\b/new/g;    # 2: replaces "old" with "new" -- case sensitive here.
        s/\bOLD\b/NEW/g;    # 3: replaces "OLD" with "NEW"
        s/\bOLD\b/NEW/gi;   # 4: replaces same variants of "OLD" as in 1 with "NEW"

        So, to finally answer your question: yes, but I think you might need that other "\b".
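A small sketch of why that other "\b" matters. The input string is invented for the demonstration; the two substitutions are the one from the question and example 3 above.

```perl
use strict;
use warnings;

# Hypothetical input with words that merely contain or start with OLD.
my $text = 'OLD FOLD TOLD OLDER';

# Boundary only at the end: OLD at the end of FOLD and TOLD still matches.
(my $trailing_only = $text) =~ s/OLD\b/NEW/g;

# Boundary on both sides: only the standalone word matches.
(my $both_sides = $text) =~ s/\bOLD\b/NEW/g;

print "$trailing_only\n";  # NEW FNEW TNEW OLDER
print "$both_sides\n";     # NEW FOLD TOLD OLDER
```

OLDER survives in both cases because there is no boundary between the D and the E.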

        Something that might help you a lot, as it did for me, is YAPE::Regex and YAPE::Regex::Explain by our very own japhy. They can be downloaded off the CPAN. For example:

        #!/usr/bin/perl
        use YAPE::Regex::Explain;
        print YAPE::Regex::Explain->new(qr/\bold\b/i)->explain;

        gives the following explanation.

        The regular expression:

        (?i-msx:\bold\b)

        matches as follows:

        NODE                     EXPLANATION
        ----------------------------------------------------------------------
        (?i-msx:                 group, but do not capture (case-insensitive)
                                 (with ^ and $ matching normally) (with . not
                                 matching \n) (matching whitespace and #
                                 normally):
        ----------------------------------------------------------------------
          \b                       the boundary between a word char (\w) and
                                   something that is not a word char
        ----------------------------------------------------------------------
          old                      'old'
        ----------------------------------------------------------------------
          \b                       the boundary between a word char (\w) and
                                   something that is not a word char
        ----------------------------------------------------------------------
        )                        end of grouping
        ----------------------------------------------------------------------

        --
        Allolex

        Add a boundary to the beginning and that should do the trick (watch out for words like FOLD, TOLD, etc). Also note that you don't need to write a full Perl script to make the changes to the files in question. You can do this with a one-liner:
        perl -pi -e's/\bOLD\b/NEW/g' *.txt
        And if you are paranoid and want to keep the originals:
        perl -pi.bak -e's/\bOLD\b/NEW/g' *.txt
        will preserve them with a .bak "extension" added to the end of the filename. (For example, foo.txt becomes foo.txt.bak)
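If the replacement ever has to live inside a larger script, the -pi.bak one-liner can be approximated with the $^I variable. This is only a sketch: the sub name replace_in_files is hypothetical, and the pattern is the same \bOLD\b used above.

```perl
use strict;
use warnings;

# Hypothetical helper: edit the given files in place, keeping .bak copies,
# roughly what perl -pi.bak -e's/\bOLD\b/NEW/g' does from the command line.
sub replace_in_files {
    my @files = @_;
    return unless @files;    # with no files, <> would wait on STDIN

    # $^I is the in-place edit extension (the -i switch); localizing
    # @ARGV makes the <> operator walk over our file list.
    local ($^I, @ARGV) = ('.bak', @files);

    while (my $line = <>) {
        $line =~ s/\bOLD\b/NEW/g;
        print $line;         # written to the file being edited, not the terminal
    }
    return;
}

# Usage: perl script.pl *.txt
replace_in_files(@ARGV);
```

As with the one-liner, foo.txt is rewritten and the original is kept as foo.txt.bak.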

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)