Consider using something besides / as your regex delimiter when the pattern or replacement uses slashes. Also, you don't need to escape less-than or greater-than signs:

$myfile =~ s!([A-Z]+)!<i>$1</i>!g;

That regex looks fine to me. If I test it on the command line, it works:

$ perl -lwe '$_="hi MOM and DAD"; print; s!([A-Z]+)!<i>$1</i>!g; print'
hi MOM and DAD
hi <i>MOM</i> and <i>DAD</i>

Although, if you really want words, you should use the word boundary assertions (\b) around your series of upper-case letters:

$myfile =~ s!\b([A-Z]+)\b!<i>$1</i>!g;
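To see what the boundaries buy you, here is a small sketch that runs both patterns over the same string. The sample text and the sub names (italicize_loose, italicize_words) are made up for illustration:

```perl
use strict;
use warnings;

# Without \b: any run of capitals matches, even the first
# letter of a mixed-case word such as "Hi".
sub italicize_loose {
    my ($text) = @_;
    $text =~ s!([A-Z]+)!<i>$1</i>!g;
    return $text;
}

# With \b on both sides: the run of capitals must be a whole word.
sub italicize_words {
    my ($text) = @_;
    $text =~ s!\b([A-Z]+)\b!<i>$1</i>!g;
    return $text;
}

my $sample = "Hi MOM and DAD";
print italicize_loose($sample), "\n";   # <i>H</i>i <i>MOM</i> and <i>DAD</i>
print italicize_words($sample), "\n";   # Hi <i>MOM</i> and <i>DAD</i>
```

Note that one-letter words such as "I" or "A" still count as all-caps words under the second pattern.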

The fact that you are trying to bind it to something called $myfile worries me, though; is that a variable that contains the entire contents of the file, or is it the filehandle itself? If it is the contents, you need to write them back out for the changes to be visible on disk.

If it is a filehandle, then you need to loop over all the lines in the file and apply the regex to each line. (Or read the whole thing in.) Either way, you end up as above — you need to write it out for it to be visible on disk.
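As a rough sketch of the "read the whole thing in, then write it back out" route: the sub name italicize_file is my own invention, and it assumes the file fits comfortably in memory. It overwrites the file without keeping a backup, so try it on a copy first.

```perl
use strict;
use warnings;

# Read the whole file into one string, italicize all-caps words,
# and write the result back to the same path.
sub italicize_file {
    my ($path) = @_;

    open my $in, '<', $path
        or die "opening $path for read: $!";
    my $contents = do { local $/; <$in> };   # undef $/ reads to end of file
    close $in
        or die "closing $path: $!";

    $contents =~ s!\b([A-Z]+)\b!<i>$1</i>!g;

    # Nothing changes on disk until this part runs.
    open my $out, '>', $path
        or die "opening $path for write: $!";
    print {$out} $contents;
    close $out
        or die "closing $path: $!";
    return;
}

# Process every file named on the command line (none given: do nothing).
italicize_file($_) for @ARGV;
```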

If this is all you are doing, note that perl has a convenience switch (command-line option) to do exactly this: take a list of files, save the original, then apply a program to each file and write out the results to the original filename. See perlrun, look at the -i switch. It is most often used with -p or -n, and often with -l (dash ell), -a, and -e.

As an example, to replace "mom" with "dad" in every .txt file in the current directory, saving backup copies of the original to .txt~ files:

perl -i~ -plwe 's/mom/dad/g' *.txt
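The same in-place edit can be done from inside a script by setting $^I, the variable behind the -i switch. This is only a sketch and not the exact expansion that perlrun documents: the sub name replace_in_place is made up, and it leaves out the line-ending handling that -l adds.

```perl
use strict;
use warnings;

# Edit each named file in place, keeping the original as "name~".
sub replace_in_place {
    my (@files) = @_;
    return unless @files;       # with an empty @ARGV, <> would read STDIN

    local $^I   = '~';          # backup suffix, same as -i~
    local @ARGV = @files;       # <> walks the files listed in @ARGV

    while (<>) {
        s/mom/dad/g;
        print;                  # with $^I set, output goes to the rewritten file
    }
    return;
}

replace_in_place(@ARGV);
```

Run it as, say, `perl script.pl *.txt`, where script.pl is whatever you saved it as.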

If you are doing more processing than -i can accommodate, or if this is a part of another process, the template given in the -i documentation can help. To read in a text file and italicize all-caps words for display in HTML, I might do something like this:

open my $fh, "<", "source-data.txt"
    or die "opening source-data.txt: $!";

print "<blockquote>\n";

while ( my $line = <$fh> )
{
    # protect against most egregious HTML violations
    $line =~ s/&/&amp;/g;
    $line =~ s/</&lt;/g;
    $line =~ s/>/&gt;/g;

    # mark upper-case words as italic.  not locale-safe.
    $line =~ s!\b([A-Z]+)\b!<i>$1</i>!g;

    # output the result (to STDOUT; $fh is only open for reading)
    print $line;
}

print "</blockquote>\n";

close $fh
    or die "closing source-data.txt: $!";

This code can also show why the default variable ($_) is so nice. Notice how much cleaner the while loop gets if we take advantage of the default variable:

while ( <$fh> )
{
    # protect against most egregious HTML violations
    s/&/&amp;/g;
    s/</&lt;/g;
    s/>/&gt;/g;

    # mark upper-case words as italic.  not locale-safe.
    s!\b([A-Z]+)\b!<i>$1</i>!g;

    # output the result; print defaults to $_ and STDOUT
    print;
}

In reply to Re: Quick question about pattern matching uppercase letters by tkil
in thread Quick question about pattern matching uppercase letters by cranberry13
