Bern: In keeping with your description of your experience, here's a rather basic trick you may find useful.

Re #2 - One good way to speed your learning is to run such constructs through a "try-out." Here's a fairly simple way to do so (using one specific bit of your proposed code):

my @files = ("foo.htm","two.htm","three.HTM","FOUR.HTML");  
                # etc etc for as many variants as you like

foreach my $file (@files) {
  if ($file =~ s/html|htm$/i) {
    print $file;
  }
}

(What's happening above is that we're stuffing a variety of possible names into an array, which makes for an easy, compact way to check them all.)

Save that as bern.pl, and then tell perl to run a syntax check ( -c ):

>perl -c bern.pl
Substitution replacement not terminated at bern.pl line 6. >

Now, Perl has told you there's something wrong with that scheme. (Hint: You've said you're checking to make sure the file has an .htm or .html extension. So why are you using substitution ( s/// )?) Perhaps that's an "aha" moment. We don't want to substitute in the test (and yes, we do need that trailing i in order to test without being sensitive to case).

So what happens if we edit the script to test for a MATCH instead of attempting substitution?

my @files = ("foo.htm","two.htm","three.HTM","FOUR.HTML");  # etc etc for as many variants as you like

foreach my $file (@files) {
  if ($file =~ /html|htm$/i) {
    print $file;
  }
}

Now, it passes the check... and running the script does this:

>perl bern.pl
foo.htmtwo.htmthree.HTMFOUR.HTML >

Well, ugly, but "yep" -- all four bits of test data matched either htm or html.

So we're done, right? BRRRRRRRRAATTTTTT!
NO!

Suppose some evil user (and if you don't think "evil user" is redundant, beware!) tried to foist a file like "oneHTML.xyz" on you. Well, try it!

(...short pause while you do so)
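If you'd rather not edit bern.pl by hand, here's a minimal sketch of that try-out. The sub name loose_match and the extra test name "html_notes.txt" are my own inventions for illustration; the pattern itself is the one from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the test exactly as written in the script above.
sub loose_match {
    my ($file) = @_;
    return $file =~ /html|htm$/i ? 1 : 0;
}

# The four original names, the "evil" one, and one more made-up hostile name.
my @files = ("foo.htm", "two.htm", "three.HTM", "FOUR.HTML",
             "oneHTML.xyz", "html_notes.txt");

foreach my $file (@files) {
    printf "%-16s %s\n", $file, loose_match($file) ? "matches" : "no match";
}
```

Every name reports "matches", the hostile ones included. The alternation splits the pattern into two independent halves: html anywhere in the name, or htm at the end. The $ anchor applies only to the second half.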

OK, so now we know the test works in part, but not well enough to do what you want -- that is, not well enough to restrict the acceptable files to those with an extension of .htm, .html, .HTM or .HTML (though your trailing "i" does provide the case insensitivity you probably want in the uploaded filename.)

That means it's time to read some more; say, for example, perldoc perlretut or one of the many nodes here on regular expressions. And then, just for the record, the advice you'll see frequently in what you read here, "Don't use regexen to parse html," refers to the NEXT step of your journey. Using a regex is just fine (at least IMO) to test a filename.
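Once you've done that reading, you might compare your own answer against something like the sketch below. It is one possible tightening, not the only one, and the sub name has_html_extension and the "page.html.bak" test name are made up for illustration. It assumes "an extension" means the name must end in a literal dot followed by htm or html.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the name ends in .htm or .html (any case).
sub has_html_extension {
    my ($file) = @_;
    # \.     a literal dot
    # html?  "htm", optionally followed by "l"
    # \z     the very end of the string
    return $file =~ /\.html?\z/i ? 1 : 0;
}

foreach my $file ("foo.htm", "FOUR.HTML", "oneHTML.xyz", "page.html.bak") {
    print "$file: ", (has_html_extension($file) ? "accepted" : "rejected"), "\n";
}
```

Here the anchor and the literal dot apply to the whole test, so "oneHTML.xyz" and "page.html.bak" are rejected while "foo.htm" and "FOUR.HTML" are accepted.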

In reply to Re^3: Converting HTML tags to upper case by ww
in thread Converting HTML tags to upper case by Bern
