I'm trying to validate some input from a datafile, just to clear out some obvious typos. In this case, it's to keep the Windows team members from trying to do something silly.
The task: to validate unix permission settings. However, using octal is just crazy - trying to remember the setuid bit is something better suited to a computer than a human. So I've adopted (mostly) the string that the chmod unix command uses.
According to the manpage, the symbolic mode is "[ugoa...][[+-=][rwxXs-tugo...]...][,...]" which is quite confusing. The idea is that there is a "USER", and an "OPERATION". The "USER" is one or more of "ugoa". The "OPERATION" is one of "+-=" (add, remove, or assign) and one or more of "rwxXs-tugo". Finally, you can do more than one operation (on different users) by separating them with commas.
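To make that description concrete, here is a rough sketch of a pattern for the simplified grammar as I've described it (USER letters, one operator, permission letters, clauses joined by commas). It is not the full chmod grammar. The sub name `looks_like_symbolic_mode` is made up for illustration, and I'm assuming the permission letters are "rwxXstugo".

```perl
use strict;
use warnings;

# One clause: one or more USER letters, exactly one operator,
# then one or more permission letters.
# Assumption: the permission set is taken to be "rwxXstugo".
my $clause_re = qr/[ugoa]+[-+=][rwxXstugo]+/;

# Hypothetical helper: true if $mode is one or more comma-separated clauses.
sub looks_like_symbolic_mode {
    my ($mode) = @_;
    return $mode =~ /\A$clause_re(?:,$clause_re)*\z/ ? 1 : 0;
}

# Small demonstration on a few sample strings.
for my $mode (qw(u+x go-w,u=rwx ugo=rw,a=r u+ rwx)) {
    print "$mode: ", (looks_like_symbolic_mode($mode) ? "ok" : "rejected"), "\n";
}
```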
In my particular case, I only need a small subset of this (I only allow assignment '=', not adding '+' or removing '-', and I want /[rwxsS]/ rather than /[rwxXs]/ to show up the same way as the ls command shows the letters), but that's not too important.
The question is ... what is the most efficient regexp available for this type of task? The idea is that I'm trying to take a single regexp, and have it match multiple times with an optional separator.
Checking Regexp::Common::list, it seems that if there is only one item (no separator), it fails to match, which is not what I'm looking for:

    $ perl5.8.6 -MRegexp::Common=list -e 'print map {$_.$/} grep { /$RE{list}{-pat=>"[ugoa]+=[rwxsS]+"}{-sep=>","}/ } @ARGV' ugo=rw,a=r ugo=rw
    ugo=rw,a=r

The desired output is that both arguments are printed as matches. Update: Yes, I know this is documented behaviour. I put this in here to show I've looked at some ways to do this, and this is the closest to what I'm looking for that I've found so far.
What I'm doing is this:
    my $perm_re = qr/[ugoa]+=[rwxsS]+/;
    $val =~ /^(?:$perm_re(?:,$perm_re)*)$/;

Note that a blank string is also valid for my purposes. Is there a better way?
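For anyone who wants to poke at this, here is a small self-contained sketch of the pattern above run against a few sample values. As far as I can tell, the pattern exactly as written needs at least one item, so the blank string is not accepted by the regexp itself. The second pattern, with a trailing `?` on the outer group, is my assumption about one way to accept a blank string too. The variable names `$list_re` and `$list_or_blank_re` are made up for this example.

```perl
use strict;
use warnings;

my $perm_re = qr/[ugoa]+=[rwxsS]+/;

# The list pattern as written above: one item, then zero or more ",item".
my $list_re = qr/^(?:$perm_re(?:,$perm_re)*)$/;

# Assumed variant: the trailing ? makes the whole list optional,
# so a blank string matches as well.
my $list_or_blank_re = qr/^(?:$perm_re(?:,$perm_re)*)?$/;

# Show how each sample fares against both patterns.
for my $val ('ugo=rw,a=r', 'ugo=rw', '', 'ugo=rw,', 'u+x') {
    printf "%-14s as_written=%-3s blank_ok=%s\n",
        "'$val'",
        ($val =~ $list_re          ? 'yes' : 'no'),
        ($val =~ $list_or_blank_re ? 'yes' : 'no');
}
```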
Update: A bit of clarification. First off, what I'm doing is very similar to the chmod input. But not precisely the same. It's close enough that anyone entering this data will intuitively know what letters mean what, or anyone reading the data manually will understand it, assuming any unix experience. But there is zero desire to be able to modify permissions, only to set them. The other dissimilarity is that I'm using "S" the way that the "ls" output uses it because I'm only assuming a certain level of unix experience, and s-bits are confusing enough when chmod and ls disagree about them ;-)
Or, if you want to ignore all of the above, I want to match a comma-separated string, with zero or more items, and each item matches a particular regexp, which I think I have about as good as I'm going to get.
Update 2: I'm not necessarily looking for speed, but for idiomatic code. Duplicating $perm_re seems a bit wrong to me, which is why I extracted it into a separate regular expression: if I want to add, remove, or fix something, it will likely need to be added, removed, or fixed both before and after the comma.
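One possible way to avoid mentioning $perm_re twice, offered only as a sketch and not necessarily the most idiomatic answer, is to split on the comma and check each item against the single pattern. The sub name `valid_perm_list` is made up for illustration. This version treats a blank string as valid, since splitting an empty string yields no items.

```perl
use strict;
use warnings;

my $perm_re = qr/[ugoa]+=[rwxsS]+/;

# Hypothetical helper: true if every comma-separated item matches $perm_re.
sub valid_perm_list {
    my ($val) = @_;
    # A limit of -1 keeps trailing empty fields, so "a=r," yields an
    # empty item and is rejected. An empty $val yields no items at all.
    my @items = split /,/, $val, -1;
    my $bad = grep { !/\A$perm_re\z/ } @items;   # count of failing items
    return $bad ? 0 : 1;
}

# Small demonstration on a few sample values.
for my $val ('ugo=rw,a=r', 'ugo=rw', '', 'ugo=rw,', ',a=r', 'u+x') {
    print "'$val': ", (valid_perm_list($val) ? "valid" : "invalid"), "\n";
}
```

The trade-off is that validation is no longer a single regexp, which may or may not matter depending on where the check has to plug in.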
In reply to Validation of unix permissions by Tanktalus