UPDATE: I have to agree with the rest of them. For safety reasons (so you don't demolish the test file), you may want to open $file but save to $file2, just in case the unexpected happens.
I agree, and will update my node to do so.
How exactly isn't this going to treat HTML correctly? ... It's not interpreting the file as HTML at all
That's all I meant; it won't look for HTML tags, it will look for literal text, including what it finds in comments, script, etc.
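To illustrate the point, here is a minimal sketch. The sample input string and the single-entry @codes list are made up for the demonstration. The substitution lowercases the literal text wherever it appears, including inside an HTML comment:

```perl
use strict;
use warnings;

# Hypothetical input: one tag inside a comment, one real tag.
my @codes = ("<a href");
my $codes_regex = join "|", map { quotemeta($_) } @codes;

my $html = '<!-- old: <A HREF="x.html"> --><A HREF="y.html">y</A>';

# Copy first, then substitute on the copy; the regex sees plain text only.
(my $lowered = $html) =~ s/($codes_regex)/lc $1/gie;

print "$lowered\n";
# prints: <!-- old: <a href="x.html"> --><a href="y.html">y</A>
```

The commented-out tag is rewritten just like the live one, because nothing here parses the markup.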
My question to you was, what exactly is line 5 doing with the joining, mapping and sorting? You're playing with length, which I thought only stored the length in characters of the item you're using it with.
Sorting greatest length first ensures that the match will work if you have e.g. "<A " and "<A HREF". Without the sort, you get results like:
$ perl
use warnings;
use strict;
my @codes = ("<a ", "<a href");
my $codes_regex = join "|", map quotemeta $_,
# uncomment the sort to try the longest alternatives first:
# sort { length $b <=> length $a }
@codes;
my $text = "testing a link: <A HREF=\"fooble.html\">boofle</a>";
print "in: $text\n";
$text =~ s/($codes_regex)/lc $1/gie;
print "out: $text\n";
__END__
output with the sort:
in: testing a link: <A HREF="fooble.html">boofle</a>
out: testing a link: <a href="fooble.html">boofle</a>
and without:
in: testing a link: <A HREF="fooble.html">boofle</a>
out: testing a link: <a HREF="fooble.html">boofle</a>
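This is because Perl's regex engine tries the alternatives from left to right and takes the first one that matches at a given position, even if a later alternative would match more text. With "<a " listed first, the engine never gets to try "<a href".
The map is just to apply quotemeta to each code, so that any regex metacharacters in them are matched literally; the join is to put | between the tags.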
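Here is a small side-by-side sketch of the same thing, so both results come out of one run. The helper name build_alternation is made up for this example; everything else follows the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: join literals into an alternation, longest first,
# so a short prefix such as "<a " cannot shadow "<a href".
sub build_alternation {
    my @literals = @_;
    return join "|",
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @literals;
}

my @codes = ("<a ", "<a href");

my $unsorted = join "|", map { quotemeta($_) } @codes;
my $sorted   = build_alternation(@codes);

my $text = 'testing a link: <A HREF="fooble.html">boofle</a>';

# Work on copies so the original text stays available for both runs.
(my $out_unsorted = $text) =~ s/($unsorted)/lc $1/gie;
(my $out_sorted   = $text) =~ s/($sorted)/lc $1/gie;

print "unsorted: $out_unsorted\n";
print "sorted:   $out_sorted\n";
```

Sorting by length is a simple way to get longest-match behaviour out of a leftmost-first alternation; it works here because every alternative is a plain literal string.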