Applications need a way of knowing whether an extension was supplied. For example, an application needs to know that to decide whether it should add the default extension.

Not really. In the example you give, the application has its own idea of "extension", so all it has to do is:

# \Q...\E matches the dot in $ext literally; \z anchors at the very end of the string
$filename .= $ext unless $filename =~ m/\Q$ext\E\z/;

So the question is not "does this filename look like it has an extension" but rather "does this filename look like it has this particular extension". This allows the application to use anything and, contrary to what your second-to-last paragraph states, in no way restricts the characters that can appear in the extension.
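The check described above can be sketched as a small sub. This is only an illustration: the name ensure_extension is hypothetical and not from the parent post. It quotes the extension with \Q...\E so that the leading dot is matched literally, and anchors with \z so that a trailing newline does not count as the end of the name.

```perl
use strict;
use warnings;

# Hypothetical helper: append $ext only if $filename does not already end with it.
sub ensure_extension {
    my ($filename, $ext) = @_;
    # \Q...\E quotes regex metacharacters such as the leading dot;
    # \z anchors at the true end of the string.
    $filename .= $ext unless $filename =~ /\Q$ext\E\z/;
    return $filename;
}

print ensure_extension('report',     '.txt'), "\n";
print ensure_extension('report.txt', '.txt'), "\n";
print ensure_extension('reportxtxt', '.txt'), "\n";
```

The third call is the interesting one: without the quoting, the dot would match any character, so 'reportxtxt' would wrongly be treated as already ending in '.txt'.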

Still, what does it mean to be a filename "extension"? I keep using scare-quotes on that word because I've always found it slightly moronic. "The N characters we've allowed you weren't enough? Now you get M more!" It was an efficiency hack that was exposed to the end user and, what's more, given a name so that users would have a conceptual handle to hang the hack on. Terrible, terrible, terrible. They should have just stuck with filenames with a larger maximum length.

Anyway, if I were making recommendations as to what to consider an "extension" in general, I'd say it must match /\.[a-zA-Z0-9]+\z/ because that gives a nod to history and another nod to the modern day by not restricting it to 3 characters. (Is .jpeg a valid filename "extension"? :-) But, of course, this only covers 99% of the cases. There will still be strange suffixes used by some applications and that's okay.
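As a rough sketch of that general-purpose test, here is the pattern wrapped in a sub and run against a few names. The helper name has_extension and the sample filenames are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $filename ends in a dot followed by
# one or more ASCII letters or digits.
sub has_extension {
    my ($filename) = @_;
    return $filename =~ /\.[a-zA-Z0-9]+\z/ ? 1 : 0;
}

for my $name ('photo.jpeg', 'archive.tar.gz', 'README', 'notes.', 'backup.txt~') {
    printf "%-15s %s\n", $name, has_extension($name) ? 'yes' : 'no';
}
```

Here photo.jpeg and archive.tar.gz match (the latter on .gz only), while README, notes. and backup.txt~ do not. The last one is an instance of the strange suffixes that fall outside the pattern.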


In reply to Re^4: AWTDI: Renaming files using regexp by duff
in thread AWTDI: Renaming files using regexp by nimdokk
