in reply to Re^2: Converting HTML tags to upper case
in thread Converting HTML tags to upper case

Bern: In keeping with your description of your experience, here's a rather basic trick you may find useful.

Re #2 - One good way to speed your learning is to run such constructs through a "try-out." Here's a fairly simple way to do so (using one specific bit of your proposed code):

my @files = ("foo.htm","two.htm","three.HTM","FOUR.HTML");
# etc etc for as many variants as you like
foreach $file(@files) {
    if ($file =~ s/html|htm$/i) {
        print $file;
    }
}

(What's happening above is that we're stuffing a variety of possible names into an array, which makes for an easy, compact way to check them all.)

and then tell perl to run a check ( -c ):

>perl -c bern.pl
Substitution replacement not terminated at bern.pl line 6.
>

Now, Perl has told you there's something wrong with that scheme. (Hint: You've said you're checking to make sure the file has an .htm or .html extension. So why are you using substitution ( s/// )?) Perhaps that's an "aha" moment. We don't want to substitute in the test (and yes, we do need that trailing i in order to test without being sensitive to case).

So what happens if we edit the script to test for a MATCH instead of attempting substitution?

my @files = ("foo.htm","two.htm","three.HTM","FOUR.HTML");
# etc etc for as many variants as you like
foreach $file(@files) {
    if ($file =~ /html|htm$/i) {
        print $file;
    }
}

Now, it passes the check... and running the script does this:

>perl bern.pl
foo.htmtwo.htmthree.HTMFOUR.HTML >

Well, ugly, but "yep" -- all four bits of test data matched either htm or html.
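
If the run-together output bothers you, a small variation on the same try-out prints one name per line. Treat this as a sketch only: the sub name matching_names is made up for illustration, and use strict / use warnings are additions the snippet above doesn't have.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): return the
# names that match the same pattern used in the try-out above.
sub matching_names {
    my @names = @_;
    return grep { /html|htm$/i } @names;
}

my @files = ("foo.htm", "two.htm", "three.HTM", "FOUR.HTML");

# One name per line makes the result much easier to read.
print "$_\n" for matching_names(@files);
```

Wrapping the test in a sub also makes it easy to feed in more names later without touching the loop.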

So we're done, right? BRRRRRRRRAATTTTTT!
  NO!

But suppose some evil user (and if you don't think "evil user" is redundant, beware!) tries to foist a file like "oneHTML.xyz" on you. Well, try it!

            (...short pause while you do so)

OK, so now we know the test works in part, but not well enough to do what you want -- that is, not well enough to restrict the acceptable files to those with an extension of .htm, .html, .HTM or .HTML (though your trailing "i" does provide the case insensitivity you probably want in the uploaded filename.)

That means it's time to read some more; say, for example, perldoc perlretut or one of the many nodes here on regular expressions. And then, just for the record, the advice you'll see frequently in what you read here, "Don't use regexen to parse html," refers to the NEXT step of your journey. Using a regex is just fine (at least IMO) to test a filename.
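
For whenever you've done that reading, here is a sketch of one way the test might be tightened, so you can compare it with whatever you come up with. It assumes the goal is exactly "name ends in .htm or .html, in any case"; the sub names loose_match and strict_match and the extra test name "nothtm" are invented for this example. The difference is that the literal dot and the end-of-string anchor now apply to the whole extension rather than to one side of an alternation.

```perl
use strict;
use warnings;

# The pattern from the try-out: "html" anywhere, OR "htm" at the end.
sub loose_match  { return $_[0] =~ /html|htm$/i ? 1 : 0 }

# A tighter sketch: a literal dot, then "htm" with an optional "l",
# then end of string (\z), all case-insensitive.
sub strict_match { return $_[0] =~ /\.html?\z/i ? 1 : 0 }

my @files = ("foo.htm", "FOUR.HTML", "oneHTML.xyz", "nothtm");

# Show both verdicts side by side for each test name.
for my $file (@files) {
    printf "%-12s loose=%d strict=%d\n",
        $file, loose_match($file), strict_match($file);
}
```

Running both patterns side by side over the same list is just the try-out trick again, pointed at two candidate answers instead of one.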