I'm trying to validate some input from a datafile, just to clear out some obvious typos. In this case, it's to keep the Windows team members from trying to do something silly.

The task: to validate unix permission settings. However, using octal is just crazy - trying to remember the setuid bit is something better suited to a computer than a human. So I've adopted (mostly) the string that the chmod unix command uses.

According to the manpage, the symbolic mode is "[ugoa...][[+-=][rwxXs-tugo...]...][,...]" which is quite confusing. The idea is that there is a "USER", and an "OPERATION". The "USER" is one or more of "ugoa". The "OPERATION" is one of "+-=" (add, remove, or assign) and one or more of "rwxXs-tugo". Finally, you can do more than one operation (on different users) by separating them with commas.

In my particular case, I only need a small subset of this (I only allow assignment '=', not adding '+' or removing '-', and I want /[rwxsS]/ rather than /[rwxXs]/ to show up the same way as the ls command shows the letters), but that's not too important.

The question is ... what is the most efficient regexp available for this type of task? The idea is that I'm trying to take a single regexp, and have it match multiple times with an optional separator.

I checked Regexp::Common::list, but it seems that if there is only one item (no separator), it fails to match, which is not what I'm looking for:

$ perl5.8.6 -MRegexp::Common=list -e 'print map {$_.$/} grep { /$RE{list}{-pat=>"[ugoa]+=[rwxsS]+"}{-sep=>","}/ } @ARGV' ugo=rw,a=r ugo=rw
ugo=rw,a=r

The desired output is for both arguments to be printed as matches. Update: Yes, I know this is documented behaviour. I put this in here to show I've looked at some ways to do this, and this is the one that is the closest to what I'm looking for that I've found so far.

What I'm doing is this:

my $perm_re = qr/[ugoa]+=[rwxsS]+/;
$val =~ /^(?:$perm_re(?:,$perm_re)*)$/;
Note that a blank string is also valid for my purposes. Is there a better way?
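For reference, here is a minimal sketch of that approach wrapped in a sub so it can be tried against a few inputs. As written above, the pattern needs at least one clause, so this sketch assumes the blank string should be accepted by the regexp itself and makes the outer group optional. The sub name is_valid_perm_list is made up for illustration, and \A and \z are used instead of ^ and $ only because $ would also accept a trailing newline.

```perl
use strict;
use warnings;

# One permission clause: one or more of "ugoa", then "=", then one or more of "rwxsS".
my $perm_re = qr/[ugoa]+=[rwxsS]+/;

# Hypothetical helper: returns 1 if $val is a comma-separated list of
# zero or more clauses, 0 otherwise. The trailing "?" on the outer group
# is what lets the blank string through (an assumption about the intent).
sub is_valid_perm_list {
    my ($val) = @_;
    return $val =~ /\A(?:$perm_re(?:,$perm_re)*)?\z/ ? 1 : 0;
}

for my $val ('ugo=rw,a=r', 'ugo=rw', '', 'ugo+rw', 'ugo=rw,') {
    printf "'%s' => %s\n", $val, is_valid_perm_list($val) ? 'ok' : 'rejected';
}
```

The first three inputs should be accepted; 'ugo+rw' (not an assignment) and 'ugo=rw,' (trailing separator) should be rejected.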

Update: A bit of clarification. First off, what I'm doing is very similar to the chmod input. But not precisely the same. It's close enough that anyone entering this data will intuitively know what letters mean what, or anyone reading the data manually will understand it, assuming any unix experience. But there is zero desire to be able to modify permissions, only to set them. The other dissimilarity is that I'm using "S" the way that the "ls" output uses it because I'm only assuming a certain level of unix experience, and s-bits are confusing enough when chmod and ls disagree about them ;-)

Or, if you want to ignore all of the above, I want to match a comma-separated string, with zero or more items, and each item matches a particular regexp, which I think I have about as good as I'm going to get.

Update 2: I'm not necessarily looking for speed, but idiomaticness. Duplication of $perm_re seems a bit wrong to me, which is why I extracted it to a separate regular expression - because if I want to add, remove, or fix something, it'll likely need to be added, removed, or fixed for both before and after the comma.
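One way to mention the item pattern only once, sketched here as a possible alternative rather than a recommendation, is to split on the separator and check each piece on its own. The sub name is_valid_perm_list_split is hypothetical. The sketch relies on two documented behaviours of split: splitting an empty string yields no fields, and a negative limit keeps trailing empty fields.

```perl
use strict;
use warnings;

# The single-clause pattern, written once.
my $perm_re = qr/[ugoa]+=[rwxsS]+/;

# Hypothetical helper: split on commas, then require every piece to be
# exactly one clause. Returns 1 for a valid list (including blank), else 0.
sub is_valid_perm_list_split {
    my ($val) = @_;

    # Limit -1 keeps trailing empty fields, so "ugo=rw," produces an
    # empty last item and gets rejected. Splitting "" gives no items at all.
    my @items = split /,/, $val, -1;

    for my $item (@items) {
        return 0 unless $item =~ /\A$perm_re\z/;
    }
    return 1;
}

for my $val ('ugo=rw,a=r', 'ugo=rw', '', 'ugo=rw,,a=r', 'ugo=rw,') {
    printf "'%s' => %s\n", $val, is_valid_perm_list_split($val) ? 'ok' : 'rejected';
}
```

This trades the single regexp for a loop, so it only fits if matching in one pattern is not a hard requirement.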


In reply to Validation of unix permissions by Tanktalus