Quick question about pattern matching uppercase letters

cranberry13 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Quick question about pattern matching uppercase letters by mce (Curate) on Apr 27, 2004 at 14:47 UTC
Hi. Some context code would be nice. Anyway, this should do the trick: `perl -p -e "s|\b([A-Z]+)\b|<i>\1</i>|g;" yourfile`

---------------------------
Dr. Mark Ceulemans, Senior Consultant, BMC, Belgium
Re: Re: Quick question about pattern matching uppercase letters by dragonchild (Archbishop) on Apr 27, 2004 at 14:52 UTC
Better is `s|\b([A-Z]+)\b|<i>$1</i>|gm;`. Don't use backreferences if you don't have to. They're difficult to debug when they have an error. ~~Plus, you'll need the /m modifier to match across multiple lines.~~

Update: As pointed out to me, /m isn't needed here. This is a case of learning a rule early and never learning the reasons behind the rule. (/m is for "multiple lines")

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose
Re: Re: Re: Quick question about pattern matching uppercase letters by Anomynous Monk (Scribe) on Apr 27, 2004 at 17:06 UTC
//m changes the meaning of ^ and $ and there aren't any of those there, so it isn't needed. //m doesn't do anything else. And using \1 on the right side of a subst isn't actually a backreference and doesn't make anything harder to debug, it's just deprecated syntax. The use of \1 is a sign that the poster forgot to enable warnings, though.
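For anyone who wants to see the //m point in action, here is a minimal sketch. The two-line sample string and the variable names are made up for illustration and are not from the original question. It runs the same substitution with and without //m, then shows a pattern with `^` where //m does make a difference.

```perl
use strict;
use warnings;

# Hypothetical two-line sample text, made up for this illustration.
my $text = "hi MOM\nand DAD\n";

# The pattern has no ^ or $, so adding /m should change nothing.
(my $without_m = $text) =~ s|\b([A-Z]+)\b|<i>$1</i>|g;
(my $with_m    = $text) =~ s|\b([A-Z]+)\b|<i>$1</i>|gm;

# With an anchor, /m matters: ^ also matches just after each newline.
my @starts_without_m = $text =~ /^(\w+)/g;
my @starts_with_m    = $text =~ /^(\w+)/gm;

print $without_m eq $with_m ? "same result\n" : "different result\n";
print "without /m: @starts_without_m\n";
print "with /m: @starts_with_m\n";
```

On this sample the two substitutions give identical text, while the anchored match finds only `hi` without //m and both `hi` and `and` with it.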
Re: Quick question about pattern matching uppercase letters by dragonchild (Archbishop) on Apr 27, 2004 at 14:35 UTC
Firstly, wrap your code in <code> tags. Secondly, you'll need to add more info, namely: what isn't working, and the rest of your script, because the regex may be fine but you might not be writing the altered text back to the file (for one). Remember, our mind-reading helmets are usually broken. :-)
Re: Quick question about pattern matching uppercase letters by tkil (Monk) on Apr 28, 2004 at 04:42 UTC
Consider using something besides `/` as your regex delimiter when the pattern or replacement uses slashes. Also, you don't need to escape less-than or greater-than signs:

```perl
$myfile =~ s!([A-Z]+)!<i>$1</i>!g;
```

That regex looks fine to me. If I test it on the command line, it works:

```
$ perl -lwe '$_="hi MOM and DAD"; print; s!([A-Z]+)!<i>$1</i>!g; print'
hi MOM and DAD
hi <i>MOM</i> and <i>DAD</i>
```

Although, if you really want words, you should use the word boundary assertions (`\b`) around your series of upper-case letters:

```perl
$myfile =~ s!\b([A-Z]+)\b!<i>$1</i>!g;
```

The fact that you are trying to bind it to something called `$myfile` worries me, though; is that a variable that contains the entire contents of the file, or is it the filehandle itself?

If it is the contents, you need to write them back out for the changes to be visible on disk.

If it is a filehandle, then you need to loop over all the lines in the file and apply the regex to each line. (Or read the whole thing in.) Either way, you end up as above: you need to write it out for it to be visible on disk.

If this is all you are doing, note that perl has a convenience switch (command-line option) to do exactly this: take a list of files, save the original, then apply a program to each file and write out the results to the original filename. See perlrun, look at the -i switch. It is most often used with -p or -n, and often with -l (dash ell), -a, and -e.

As an example, to replace "mom" with "dad" in every .txt file in the current directory, saving backup copies of the original to .txt~ files:

```
perl -i~ -plwe 's/mom/dad/g' *.txt
```

If you are doing more processing than -i can accommodate, or if this is a part of another process, the template given in the -i documentation can help.
To read in a text file and italicize all-caps words for display in HTML, I might do something like this:

```perl
open my $fh, '<', "source-data.txt"
    or die "opening source-data.txt: $!";

print "<blockquote>\n";

while ( my $line = <$fh> )
{
    # protect against most egregious HTML violations
    $line =~ s/&/&amp;/g;
    $line =~ s/</&lt;/g;
    $line =~ s/>/&gt;/g;

    # mark upper-case words as italic.  not locale-safe.
    $line =~ s!\b([A-Z]+)\b!<i>$1</i>!g;

    # output the result
    print $line;
}

print "</blockquote>\n";

close $fh
    or die "closing source-data.txt: $!";
```

This code can also show why the default variable (`$_`) is so nice. Notice how much cleaner the while loop gets if we take advantage of the default variable:

```perl
while ( <$fh> )
{
    # protect against most egregious HTML violations
    s/&/&amp;/g;
    s/</&lt;/g;
    s/>/&gt;/g;

    # mark upper-case words as italic.  not locale-safe.
    s!\b([A-Z]+)\b!<i>$1</i>!g;

    # output the result
    print;
}
```
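For the other case raised in this reply, where `$myfile` holds the contents and the change has to reach the disk, a rough sketch of the read, modify, write-back cycle might look like the following. The sub name `italicize_caps_in_file` is hypothetical (nothing in the thread defines it), and this is only one way to do it, not necessarily what the original poster's script looks like.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): rewrite the file at $path,
# wrapping each all-caps word in <i>...</i>. Returns the number of
# substitutions made.
sub italicize_caps_in_file {
    my ($path) = @_;

    # Read the whole file into one string.
    open my $in, '<', $path or die "opening $path for reading: $!";
    my $contents = do { local $/; <$in> };
    close $in or die "closing $path: $!";

    # Modify the in-memory copy; the file on disk is untouched so far.
    my $count = ( $contents =~ s!\b([A-Z]+)\b!<i>$1</i>!g ) || 0;

    # Write the modified text back out so the change reaches the disk.
    open my $out, '>', $path or die "opening $path for writing: $!";
    print {$out} $contents;
    close $out or die "closing $path: $!";

    return $count;
}
```

Calling `italicize_caps_in_file('source-data.txt')` would then update that file directly. Unlike `-i~`, this sketch keeps no backup copy, so try it on a scratch file first.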
Re: Quick question about pattern matching uppercase letters by dreadpiratepeter (Priest) on Apr 27, 2004 at 14:47 UTC
Don't you want: `$myfile =~ s/([A-Z]+)/\<i\>$1\<\/i\>/g;`

Note the `[]` around the character class. You were matching a capital A followed by a dash, followed by 1 or more capital Z's.

UPDATE: I just noticed that you were missing the code tags and that you may have put the `[]` in. In that case, ignore my response.

-pete
"Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
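A quick way to see the difference between `A-Z+` and `[A-Z]+` is to try both on a couple of strings. The helper names and sample inputs below are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
# Without brackets: a literal "A", a literal "-", then one or more "Z".
sub matches_literal { return $_[0] =~ /^A-Z+$/ ? 1 : 0 }

# With brackets: one or more characters in the range A through Z.
sub matches_class { return $_[0] =~ /^[A-Z]+$/ ? 1 : 0 }

for my $sample ('MOM', 'A-ZZZ') {
    printf "%-6s A-Z+ => %d, [A-Z]+ => %d\n",
        $sample, matches_literal($sample), matches_class($sample);
}
```

Here `MOM` matches only the bracketed version, while `A-ZZZ` matches only the unbracketed one.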