d4vis has asked for the wisdom of the Perl Monks concerning the following question:

Hullo,
I've been reading an online tutorial that illustrates a simple substitution of "them" for "us" in the following string:
$_='Us ? The bus usually waits for us, unless the driver forgets us.';
The tutorial's solution is the following:
$_='Us ? The bus usually waits for us, unless the driver forgets us.';
print "$_\n";
s/\b([Uu])s(\W)/chr(ord($1)-1).hem.$2/eg;
print "$_\n";
This works, but when I got to thinking about it...
$_='Us? The bus usually waits for us, unless the driver forgets us.';
print "$_\n";
s/\bus\b/them/ig;
print "$_\n";
also works and seems simpler to a newbie like me. I guess I'm just wondering whether there's something about substitutions that I don't know yet that makes the simpler solution 'bad practice' or sloppy programming. I get the sense that the tutorial chose its solution based on something I don't yet know about regexes, but I can't figure out what that is.
Any guidance greatly appreciated.
-d4vis
#!/usr/bin/fnord

Replies are listed 'Best First'.
(Ovid) Re: a simple substitution question
by Ovid (Cardinal) on Aug 21, 2000 at 22:31 UTC
    tilly: The author of the tutorial may have been trying to show off, but his/her answer is more correct than the answer that d4vis provided. What happens with the following?

    '"Who\'s going to the store?" "Us," she replied.'

    Yeah, I know. "She" is using pitiful grammar, but it's all I could come up with on the spur of the moment.

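    Here, you lose capitalization with d4vis's regex. The unknown author came up with a nifty trick to preserve capitalization: chr(ord($1)-1) steps back one character code, turning 'U' into 'T' and 'u' into 't'. However, it's not very clear. I'm also wondering whether some locales might break it, depending upon the alphabet used, since the trick relies on 'T' sitting immediately before 'U'.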
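To make the capitalization point concrete, here is a minimal sketch of what d4vis's case-insensitive substitution does to a capitalized "Us". The subroutine name replace_ignoring_case is made up for this illustration and does not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): applies d4vis's
# case-insensitive substitution to a copy of the input.
sub replace_ignoring_case {
    my ($text) = @_;
    $text =~ s/\bus\b/them/ig;   # /i matches "Us", but the replacement is always lowercase
    return $text;
}

print replace_ignoring_case('"Us," she replied.'), "\n";
# prints: "them," she replied.
```

The /i flag lets the pattern match "Us", "us", and "US" alike, but the replacement text is a fixed string, so the original case is lost.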

    Instead, I'd use something like the following (which I feel is more clear -- but untested):

    s/\b([Uu])s\b/$1 eq 'U' ? Them : them/eg;
    Cheers,
    Ovid

    Update: Oy! That's what I get for untested code. I guess tilly and I will spend the rest of the day spanking each other (figuratively speaking).

    Here's the correct version of the regex (which I still think is clearer than tilly's solution):

    $test =~ s/\b([Uu])s\b/$1 eq 'U' ? "Them" : "them"/eg;
    Tilly's solution, however, is better with multiple substitutions.
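As a quick check, the corrected ternary version can be run against the quoted sample sentence. This is only a sketch: the wrapper name ternary_version is hypothetical, and the substitution is applied to a copy of the argument rather than to $test.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the corrected regex from the update above.
sub ternary_version {
    my ($text) = @_;
    # /e evaluates the replacement as Perl code, so the ternary picks
    # "Them" or "them" based on the captured first letter.
    $text =~ s/\b([Uu])s\b/$1 eq 'U' ? "Them" : "them"/eg;
    return $text;
}

print ternary_version(q{"Who's going to the store?" "Us," she replied.}), "\n";
# prints: "Who's going to the store?" "Them," she replied.
```

Because the replacement strings are quoted, this form also compiles under strict, unlike the bareword version.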
      D'oh! Actually the author was even more clever and figured out how to leave 'US' alone. Your solution breaks under strict though. Try this:
      my %tr = qw(U T u t); s/\b([Uu])s\b/$tr{$1}hem/g;
      BTW I would wonder about the author's solution working with Unicode...
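The lookup-table version above can be exercised under strict and warnings roughly as follows. The wrapper name us_to_them and the sample sentence containing "US" are assumptions added for illustration.

```perl
use strict;
use warnings;

# Lookup table from the reply: first letter of "us" => first letter of "them".
my %tr = qw(U T u t);

# Hypothetical wrapper name, for illustration only.
sub us_to_them {
    my ($text) = @_;
    # No /e and no /i needed: the hash lookup is interpolated directly,
    # and the literal "s" keeps the all-caps "US" from matching.
    $text =~ s/\b([Uu])s\b/$tr{$1}hem/g;
    return $text;
}

print us_to_them('Us? The bus usually waits for us, unless the US driver forgets us'), "\n";
# prints: Them? The bus usually waits for them, unless the US driver forgets them
```

Because the pattern ends in \b rather than requiring a following non-word character, the "us" at the very end of the string is replaced as well.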
RE (tilly) 1: a simple substitution question
by tilly (Archbishop) on Aug 21, 2000 at 22:19 UTC
    Looks to me like the person writing the tutorial was just trying to show off. I would write it exactly like you did. To my eyes that is simpler, clearer, and should execute faster as well.

    Did I mention more reliable? Try the string, "What should happen with a trailing us"?
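The trailing "us" case can be compared side by side with a small sketch. The subroutine names are hypothetical, and 'hem' is quoted here (the tutorial used a bareword) so that the code compiles under strict.

```perl
use strict;
use warnings;

# The tutorial's pattern; it requires a non-word character after "us".
sub tutorial_version {
    my ($text) = @_;
    $text =~ s/\b([Uu])s(\W)/chr(ord($1) - 1) . 'hem' . $2/eg;
    return $text;
}

# d4vis's pattern; \b also matches at the end of the string.
sub boundary_version {
    my ($text) = @_;
    $text =~ s/\bus\b/them/ig;
    return $text;
}

my $sample = 'What should happen with a trailing us';
print tutorial_version($sample), "\n";   # prints the sample unchanged
print boundary_version($sample), "\n";   # prints: What should happen with a trailing them
```

The tutorial's (\W) must consume a character, so it cannot match when "us" is the last thing in the string, whereas \b is a zero-width assertion that succeeds there.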