b4swine has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I want to replace a whole word, as in s/\bfind\b/repl/ig, while preserving its case: if the matched word is all lowercase, all uppercase, or has only its first letter capitalized, the replacement should follow the same pattern. Mixed case has no meaningful equivalent, so in that case the replacement is made all lowercase. A slow and bad way to do this is:
s/\bfind\b/repl/g; s/\bFIND\b/REPL/g; s/\bFind\b/Repl/g; s/\bfind\b/repl/ig;
But this does the wrong thing if the replacement text itself contains the search text as a word, because a later pass matches it again inside text that was already replaced. It is also slow.
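
To make that failure concrete, here is a minimal sketch of the four-pass approach. The words "cat" and "cat food" are made up for illustration (they are not from my real data); the point is only that the replacement contains the search word.

```perl
use strict;
use warnings;

# Hypothetical example: the replacement "cat food" contains the search word "cat".
sub sequential_replace {
    my ($text) = @_;
    for ($text) {
        s/\bcat\b/cat food/g;
        s/\bCAT\b/CAT FOOD/g;
        s/\bCat\b/Cat food/g;
        s/\bcat\b/cat food/ig;    # catch-all pass also matches "cat"/"Cat" inside earlier replacements
    }
    return $text;
}

print sequential_replace("cat Cat"), "\n";    # prints "cat food food cat food food"
```

The intended result would have been "cat food Cat food"; the last pass both duplicates "food" and loses the capital.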

Perhaps it would be better to write a loop that does a case-insensitive search followed by a replace. What would be fastest (and most elegant)?

Replies are listed 'Best First'.
Re: Respect case in substitution
by ysth (Canon) on Feb 27, 2008 at 09:19 UTC
    Something like:
    s/\b(find)\b/$1 eq uc($1) ? "REPL" : $1 eq ucfirst(lc($1)) ? "Repl" : "repl"/gie
    or more flexibly:
    my %repl = ( find => 'repl', FIND => 'REPL', Find => 'Repl', );
    s/\b(find)\b/$repl{$1}||$repl{lc $1}/gie
Re: Respect case in substitution
by moritz (Cardinal) on Feb 27, 2008 at 09:30 UTC
    I know that won't help you very much, but Perl 6 has the :ii modifier, which can either transport case information on a char-by-char basis, or detect whether the matched text has a "simple" case (like all upper, all lower, ucfirst, lcfirst, capitalized) and apply that information to the substitution string.

    You can implement something like that in perl 5, but not with such a nice syntax:

    s/\b(find)\b/transport_case($1, 'repl')/eig;
    sub transport_case { ... }
    The complexity of transport_case strongly depends on what you want to achieve. The first described behaviour is a simple matter of iterating over all the chars, and testing/applying the case.
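
    For the char-by-char variant, a minimal sketch could look like the following. The body of transport_case here is only one possible implementation, not a standard routine, and it makes an arbitrary assumption: when the replacement is longer than the match, the extra characters take the case of the last matched character. Note that it carries mixed case over letter by letter instead of lowercasing it as the original question asked.

```perl
use strict;
use warnings;

# Hypothetical implementation: copy the case of $from onto $to, one character
# at a time. Characters of $to beyond the length of $from reuse the case of
# the last character of $from (an arbitrary choice for this sketch).
sub transport_case {
    my ($from, $to) = @_;
    my @from = split //, $from;
    my $out  = '';
    my $i    = 0;
    for my $ch (split //, $to) {
        my $model = $from[ $i < @from ? $i : $#from ];
        $i++;
        $model = '' unless defined $model;    # empty $from: nothing to copy
        $out .= $model =~ /\p{Lu}/ ? uc($ch)
              : $model =~ /\p{Ll}/ ? lc($ch)
              :                      $ch;     # non-letter model: leave as is
    }
    return $out;
}

my $text = "find Find FIND fInD";
$text =~ s/\b(find)\b/transport_case($1, 'repl')/eig;
print "$text\n";    # prints "repl Repl REPL rEpL"
```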
Re: Respect case in substitution
by ikegami (Patriarch) on Feb 27, 2008 at 10:42 UTC
    s/\b(find)\b/ uc('repl') | ( $1 ^ uc($1) ) /eig;

    Only guaranteed to work for search strings and replacement strings consisting entirely of ASCII letters. Non-letters and accented characters won't work. EBCDIC won't work. It also only works if the matched text ($1) and the replacement string are the same length.

    Update: Added clarification (by adding "guaranteed") and an extra failure mode.
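
    For anyone wondering why the one-liner works at all, here is a small sketch that spells out the bit manipulation. The helper names case_mask and apply_mask are made up for this illustration; the logic is meant to be the same as in the substitution above, and the same ASCII-only, equal-length caveats apply.

```perl
use strict;
use warnings;

# In ASCII, an uppercase letter and its lowercase form differ only in bit 0x20.
printf "%s = 0x%02X, %s = 0x%02X\n", 'A', ord('A'), 'a', ord('a');    # A = 0x41, a = 0x61

# String XOR of a word with its uppercase form leaves 0x20 wherever the
# original letter was lowercase and 0x00 wherever it was uppercase.
sub case_mask {
    my ($word) = @_;
    return $word ^ uc($word);
}

# OR-ing that mask onto the uppercased replacement sets bit 0x20 (that is,
# lowercases the letter) at exactly those positions.
sub apply_mask {
    my ($matched, $repl) = @_;
    return uc($repl) | case_mask($matched);
}

print join(' ', map { apply_mask($_, 'repl') } qw(find Find FIND fInD)), "\n";
# prints "repl Repl REPL rEpL"
```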

      What is your definition of "won't work" here?
      #!/usr/bin/perl -l
      use strict;
      my $find="find=1";
      my $repl="repl:2";
      for my $trial (qw/find=1 Find=1 FIND=1 fInD=1/) {
          $_ = "here is >$trial< data";
          s/\b($find)\b/ uc($repl) | ( $1 ^ uc($1) ) /eig;
          print
      }
      For me, that produces:
      here is >repl:2< data
      here is >Repl:2< data
      here is >REPL:2< data
      here is >rEpL:2< data
      Do you see something wrong with that? I'll grant that many accented characters won't work properly, in the sense that you'll get an incorrect character as the result, but actually, there are a fair number of them where the case distinction is a matter of a single bit being on or off (just like in the ASCII letters), and I'd expect those to work. (But I don't have time to test that just now -- and I'm sure not going to argue about EBCDIC...)

        It will *happen* to work for some invalid inputs. Here are cases that support what I said:

        Non-letters in $repl:

        my $find= "find";
        my $repl= "@@@@";
        $_ = "Find";
        print;
        s/\b($find)\b/ uc($repl) | ( $1 ^ uc($1) ) /eig;
        print;
        # @```
        # Should be @@@@

        Non-letters in $find:

        my $find= chr(1234) . 'find';
        my $repl= "aaaaa";
        $_ = chr(1234) . 'FiNd';
        print;
        s/\b($find)\b/ uc($repl) | ( $1 ^ uc($1) ) /eig;
        print;
        # AAaAa
        # Should be aAaAa