Frankly, using a regex for this is even a little overkill.

Ah, I fear I must disagree with you again. :) The simplest non-regex solution I can think of (where "simple" is some vague measure of how easy or difficult it is to comprehend what's going on) is this:

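  my ( $just_name, $just_ext );
  # rindex returns the position of the last '.', or -1 if there is none
  my $last_dot_pos = rindex $filename, '.';
  if ( $last_dot_pos > -1 )
  {
      # everything before the dot is the name; everything after it is the extension
      $just_name = substr $filename, 0, $last_dot_pos;
      $just_ext  = substr $filename, $last_dot_pos+1;
  }
  else
  {
      # no dot at all: the whole string is the name
      $just_name = $filename;
      $just_ext  = undef;
  }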
Which has its own potentially subtle problems (yay off-by-one errors!), but is as straightforward as it gets. (In particular, this is the sort of code that a perl novice would likely write - so if that is the level of user / maintenance programmer we're aiming for...)
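
To make the "subtle problems" a bit more concrete, here is a rough sketch that wraps the rindex/substr approach in a sub and runs it over a few awkward inputs. The sub name split_at_last_dot and the sample filenames are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the rindex/substr approach, for illustration only.
sub split_at_last_dot {
    my ($filename) = @_;
    my $last_dot_pos = rindex $filename, '.';

    # No dot at all: the whole string is the name, and there is no extension.
    return ( $filename, undef ) if $last_dot_pos < 0;

    return (
        substr( $filename, 0, $last_dot_pos ),     # up to, but not including, the dot
        substr( $filename, $last_dot_pos + 1 ),    # skip the dot itself
    );
}

# Sample inputs (assumed, not from the original post).
for my $filename ( 'report.txt', 'README', '.bashrc', 'notes.' ) {
    my ( $name, $ext ) = split_at_last_dot($filename);
    printf "%-12s name=[%s] ext=[%s]\n", $filename, $name, $ext // 'undef';
}
```

Note that '.bashrc' comes back with an empty name, and 'notes.' with an empty (but defined) extension. Whether either of those is what you want depends on the application.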

While the power and flexibility of regexes might not be used to their fullest in splitting a filename from its extension, the "template" of the operation is a very common and easily comprehended one:

try a match;
see if it worked;
capture subexpressions if it did;
complain if it didn't.
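
In code, that template might look something like the sketch below. The sub name split_ext and the particular regex are my own stand-ins for illustration, not necessarily the ones used earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: split a filename at its last dot.
sub split_ext {
    my ($filename) = @_;

    # Try a match and see if it worked...
    if ( $filename =~ m/^(.*)\.([^.]*)$/ ) {
        # ...capture the subexpressions if it did.
        return ( $1, $2 );
    }

    # Complain if it didn't.
    warn "no extension found in '$filename'\n";
    return ( $filename, undef );
}

my ( $just_name, $just_ext ) = split_ext('archive.tar.gz');
print "name=$just_name ext=$just_ext\n";
```

The greedy (.*) is what makes the split happen at the last dot rather than the first.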

That is the aspect of the regex solution that I find compelling: the regex is a bit hairy, but the structure it is embedded in is one of the core patterns in perl. And the difference between "Here's a core pattern, I can instantly see what's going on, now I just need to grok the regex" versus "Wait, what does that module do again? What does this parameter mean? What are the special cases? What happens if it doesn't match? What errors can it throw?"... that is the difference I was trying to highlight.

Finally, I know this is all a tradeoff, and different people have different thresholds for where they'd draw the line. For me, regexes are not overkill: in perl, they are first-class citizens and an essential part of the programmer's vocabulary. (Heck, even your solutions use one, as the first argument to split -- which is still a regex, so you had better escape it. :-)
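
A small sketch of why the escaping matters, assuming an illustrative filename of my own choosing: a string given as the first argument to split is still treated as a pattern, so a bare '.' matches any character.

```perl
use strict;
use warnings;

my $filename = 'archive.tar.gz';    # illustrative value, not from the thread

# Unescaped: '.' is used as a pattern that matches any character, so every
# character acts as a separator and only empty fields are produced
# (which split then discards).
my @unescaped = split '.', $filename;

# Escaped: splits on literal dots only.
my @escaped = split /\./, $filename;

printf "unescaped: %d field(s)\n", scalar @unescaped;
printf "escaped:   %s\n", join ' | ', @escaped;
```

The unescaped version returns an empty list here, while the escaped one returns the three dot-separated pieces.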

In reply to Re^6: Removing File Extensions by tkil
in thread Removing File Extensions by BalochDude
