Regexes for Case Change

arunhorne has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Regexes for Case Change by Molt (Chaplain) on May 08, 2002 at 11:23 UTC
One way to do this is the code below. It goes through the string finding a word boundary followed by a series of word characters and another word boundary, and replaces the match with the lcfirst'ed version of the word. Another way would be to match on \b\w and replace it with the lowercased version of the letter. To me, though, this version seems more readable. The other version may well be more efficient, but I think that's a question for the benchmarkers.

`#!/usr/bin/perl -w
use strict;

my $test = "This Is A [Big] [Nasty Old] [Test]";
$test =~ s/\b(\w+)\b/lcfirst $1/eg;
print $test;`

Update: Okay, with the odd formatting I misread things as needing uppercasing. Fixed now, though. If you're just trying to reduce everything to lowercase, just use `lc $test`. It's quicker, easier, and does exactly what it says on the tin.
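The \b\w alternative mentioned above is described but not shown. A minimal sketch of it next to the posted version, assuming the same sample string; the variable names `$by_word` and `$by_letter` are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $test = "This Is A [Big] [Nasty Old] [Test]";

# Posted version: match each whole word and lowercase its first letter.
(my $by_word = $test) =~ s/\b(\w+)\b/lcfirst $1/eg;

# Alternative: match only the first word character after a word boundary.
(my $by_letter = $test) =~ s/\b(\w)/lc $1/eg;

print "$by_word\n";    # this is a [big] [nasty old] [test]
print "$by_letter\n";  # this is a [big] [nasty old] [test]
```

Both substitutions give the same result here. Which one is faster is left open, as in the post.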
Re: Regexes for Case Change by jmcnamara (Monsignor) on May 08, 2002 at 11:36 UTC
You could use something like this:

`#!/usr/bin/perl -w
use strict;

my $line = "Hello world [Pyruvate] [pyruvate]\n";
$line =~ s/\b(\w+)\b/\l$1/g;
print $line;

__END__
prints:
hello world [pyruvate] [pyruvate]`

However, this may be overkill if you can just `lc()` the entire line.

-- John.
Re: Regexes for Case Change by rob_au (Abbot) on May 08, 2002 at 11:40 UTC
If you are looking to drop the case on all of the characters in your string, you could easily do this with the transliteration operator. For example: `$string =~ tr [A-Z] [a-z];`
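Note that tr changes the string in place and returns the number of characters it translated. A small sketch of how that might be used to keep the original intact, or just to count uppercase letters; the sample string and variable names are hypothetical:

```perl
use strict;
use warnings;

my $string = "Hello World [Pyruvate]";

# Copy first, then transliterate the copy so $string stays untouched.
(my $lower = $string) =~ tr/A-Z/a-z/;

# With an empty replacement list, tr only counts matches and changes nothing.
my $count = ($string =~ tr/A-Z//);

print "$lower\n";                    # hello world [pyruvate]
print "$count uppercase letters\n";  # 3 uppercase letters
```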
Re: Re: Regexes for Case Change by Molt (Chaplain) on May 08, 2002 at 12:08 UTC
I think it's generally better to do this with the lc operator rather than tr, since lc handles localisation character sets (umlauts and so forth) and Unicode properly. Not that I think it matters in this case, but once you settle into one style, it may as well be the one that won't trip you up when you expand what you're working with.
Re: Re: Re: Regexes for Case Change by jmcnamara (Monsignor) on May 08, 2002 at 13:28 UTC
"I think it's generally better to do this with the lc operator rather than tr since lc handles localisation character sets"

Only if "use locale" is in effect. Otherwise the following is unlikely to do anything: `print uc 'ü';`

This assertion also depends on what the "general" case is considered to be. The general case is probably a single character set, so a transliteration, as shown by rob_au, is probably sufficient.

-- John.
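For what it's worth, the lc/tr difference is easiest to see with a character above U+00FF, where Perl applies Unicode case rules regardless of locale. This is only a sketch with a hypothetical sample string. How 'ü' itself behaves depends on the source encoding, on "use locale", and on the Perl version, so it is deliberately left out:

```perl
use strict;
use warnings;

# U+0416 is CYRILLIC CAPITAL LETTER ZHE; U+0436 is its lowercase form.
my $word = "\x{0416}ABC";

my $via_lc = lc $word;                 # Unicode-aware lowercasing
(my $via_tr = $word) =~ tr/A-Z/a-z/;   # only touches ASCII A-Z

printf "lc: first char U+%04X\n", ord $via_lc;  # lc: first char U+0436
printf "tr: first char U+%04X\n", ord $via_tr;  # tr: first char U+0416
```

Both versions lowercase the ASCII letters. Only lc changes the Cyrillic one.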
Re: Re: Re: Regexes for Case Change by rob_au (Abbot) on May 08, 2002 at 12:53 UTC
While the tr operator may not handle localisation character sets, it does have a speed advantage over substitution, as it doesn't perform interpolation or use the regex engine. As such, the choice between the two really comes down to the data being manipulated and whether character and locale classes will come into play. The transliteration solution was provided more as proof of TMTOWTDI. YMMV.
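The speed claim can be checked with the core Benchmark module. A rough sketch, assuming ASCII-only input; the test string and the candidate names are made up, and results will vary by machine and Perl version, so no numbers are shown:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: the sample string from the thread, repeated.
my $input = "This Is A [Big] [Nasty Old] [Test]" x 100;

my %candidates = (
    'lc_builtin'  => sub { my $s = lc $input; $s },
    'tr_operator' => sub { (my $s = $input) =~ tr/A-Z/a-z/; $s },
    'regex_subst' => sub { (my $s = $input) =~ s/([A-Z])/\l$1/g; $s },
);

# Run each candidate for about one CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

Each candidate works on a copy, so the copying cost is included in all three timings.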
Re: Regexes for Case Change by arunhorne (Pilgrim) on May 08, 2002 at 11:20 UTC
Sorry for my lame use of formatting. I clicked 'submit' when I meant 'preview' after making changes.
Fixing those slips of a finger by talexb (Chancellor) on May 08, 2002 at 13:35 UTC
"Sorry for my lame use of formatting, I clicked 'submit' when I meant preview having made changes"

Just go back, click on the title, and you can edit again to your heart's content. That's how people update their questions or replies. The manual is your friend. :)

--t. alex
"Nyahhh (munch, munch) What's up, Doc?" --Bugs Bunny