Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Re: Re: Permissions Utility Script for sysadmins - seeking comments

by BuddhaNature (Beadle)
on Apr 21, 2004 at 22:19 UTC ( [id://347152]=note: print w/replies, xml ) Need Help??


in reply to Re: Permissions Utility Script for sysadmins - seeking comments
in thread Permissions Utility Script for sysadmins - seeking comments

First off, thanks for al the constructive criticism. I have embarked and a number of the changes you mentioned.

I am getting a bit confused with one or two things and was hoping you could elucidate a bit more on what might be going on and how I might make it better. Specifically where you mention "use a hash" - I am a bit unsure on how to best impliment this in this context. Also, do you have any insight into why (as you mentioned) if there are both dir-mode and dir-owner that the latter will be silently ignored. Thanks in advance.

-Shane

Replies are listed 'Best First'.
Re: Re: Re: Permissions Utility Script for sysadmins - seeking comments
by hv (Prior) on Apr 22, 2004 at 09:09 UTC

    Ok, here's a simplistic way to replace one of the string of elsifs with a hash:

    my %vars = ( 'dir-mode' => \$d_mode, 'file-mode' => \$f_mode, 'link-mode' => \$l_mode, 'dir-owner' => \$d_own, 'file-owner' => \$f_own, 'link-owner' => \$l_own, 'dir-group' => \$d_grp, 'file-group' => \$f_grp, 'link-group' => \$l_grp, ); for my $testy (keys %{ $config{$test}{$tester} }) { unless ($vars{$testy}) { warn "Unexpected test key '$testy'"; next; } ${ $vars{$testy} } = $config{$test}{$tester}{$testy}; }

    This is simply saying: we have a string of (els)ifs, and whichever string we match we're going to execute almost the same code, so let's abstract out the one bit that varies (the variable name in this case) and let the hash associate the string to match with the variable name to modify, leaving the rest of the code constant.

    However, we also have a series of similar variable names, which is also often a sign that a hash is needed. So let's look at the point we're using these variables: I'd characterise the wanted_c subroutine as doing something like:

    • where specified, update any of the mode, owner or group of the file
    • but select a different set of updates depending on whether the file to update is a directory, a link or a plain file
    so I'd be inclined to map that to a data structure that looks like:
    my $wanted_this_time = { mode => $wanted_mode, owner => $wanted_owner, group => $wanted_group, };
    but pick one of three such structures depending on the type of file:
    my $wanted = { if_directory => { mode => $dir_mode, owner => $dir_owner, group => $dir_group, }, if_link => { ... } if_file => { ... } };
    and having made that conceptual transformation, we can now get rid of the matrix of individually named variables:
    my %wanted; my %validfile = map +($_ => 1), qw/ dir file link /; my %validupdate = map +($_ => 1), qw/ mode owner group /; for (keys %{ $config{$test}{$tester} }) { my($filetype, $updatetype} = split /-/, $_, 2; unless ($validfile{$filetype} && $validupdate($updatetype}) { warn "Unexpected directive '$_'"; next; } $wanted{$filetype}{$updatetype} = $config{$test}{$tester}{$_}; } [...] sub wanted_c { stat($_); my $wanted = (-d _ ? $wanted{'dir'}) : (-l _ ? $wanted{'link'}) : (-f _ ? $wanted{'file'}) : return; # ?? warn unexpected type? return unless $wanted; # nothing to change if (defined $wanted->{'mode'}) { change_mode($_, $wanted->{'mode'}); } if (defined $wanted->{'owner'}) { change_owner($_, $wanted->{'owner'}); } if (defined $wanted->{'group'}) { change_group($_, $wanted->{'group'}); } }

    Note that you could as easily represent this data using arrays rather than hashes, but it then becomes much less self-documenting unless you also use named constants for the indexes into each array.

    In general, whenever I see code that needs to cope with several different cases, my instinct is to ask: what effort will be required when another new case needs to be added to the list? How far can I reduce that effort? How can I minimise the danger of getting it wrong when I need to add a new case, for example by adding it in one place and not in another?

    Usually, the answer is: use data structures to minimise code duplication; where possible, encapsulate all the information for the cases in a single data structure, so all the work involved in adding a new case is reduced to adding a new entry into that data structure.

    It may be overkill for this script, but we can take that extra step by taking advantage of code references. The data structures might then look something like this:

    my %filetypes = ( 'dir' => sub { -d _ }, 'link' => sub { -l _ }, 'file' => sub { -f _ }, ); my %updatetypes = ( 'mode' => \&change_mode, 'owner' => \&change_owner, 'group' => \&change_group, ); sub wanted_c { stat($_); for my $filetype (keys %wanted) { next unless &{ $filetypes{$filetype} }; my $wanted = $wanted{$filetype}; for my $update (keys %$wanted) { &{ $updatetypes{$update} }($_, $wanted->{$update}); } } }

    However, I suspect that would not be desirable, at least for the update types, since it makes it much harder to add interdependencies between the update types such as (for example) updating ownership before mode.

    Hugo

      So I took your advice and made the following changes:

      my %wanted; my %validfile = map +($_ => 1), qw/ dir file link /; my %validupdate = map +($_ => 1), qw/ mode owner group /; for (keys %{ $config{$test}{$tester} }) { my($filetype, $updatetype) = split /-/, $_, 2; unless ($validfile{$filetype} && $validupdate{$updatet +ype}) { warn "Unexpected directive '$_'"; next; } $wanted{$filetype}{$updatetype} = $config{$test}{$test +er}{$_}; } [...] sub wanted_c { print "wanted_c is working on $_\n"; my %wanted_prev = %wanted; my $wanted_here; stat($_); if (-d $_) {$wanted_here = $wanted_prev{'dir'}} elsif (-l $_) {$wanted_here = $wanted_prev{'link'}} elsif (-f $_) {$wanted_here = $wanted_prev{'file'}} else {return;} return unless $wanted_here; # nothing to change print Dumper($wanted_here); if (defined $wanted_here->{'mode'}) { change_mode($_, $wanted_here->{'mode'}) + } if (defined $wanted_here->{'owner'}) { change_owner($_, $wanted_here->{'owner'}); } if (defined $wanted_here->{'group'}) { change_group($_, $wanted_here->{'group'}); } }

      The changes work like a charm. I have one problem and one question though. The problem is that I keep getting a diagnostic warning:

      Variable "%wanted" will not stay shared at permmy7.pl line 317 (#1)

      I am a little unsure about how best to go about getting rid of it. Can't _really_ see an easy way to sub something out...

      The question I have is, previously I was having issues with the file/directory/link being checked by the wanted_c function hitting only one if and then returning, so I had to toss in some recurssion - why doesn't that problem occur with the above code? I am just not seeing it and it is bugging me. It works and I am thankful for that but I want to know _why_ it works... Thanks in advance.

      -Shane

        The problem is that I keep getting a diagnostic warning:

        Variable "%wanted" will not stay shared at permmy7.pl line 317 (#1)

        The description from diagnostics (search in perldoc perldiag, or run the script with perl -Mdiagnostics) describes what is happening here:

        An inner (nested) named subroutine is referencing a lexical variable defined in an outer named subroutine.

        When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared.

        This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are created, they are automatically rebound to the current values of such variables.

        In this case, you can avoid the problem just as described:

        my $wanted_c = sub { print "wanted_c is working on $_\n"; ... etc ... }; File::Find::find({wanted => $wanted_c}, $tester);

        The question I have is, previously I was having issues with the file/directory/link being checked by the wanted_c function hitting only one if and then returning, so I had to toss in some recurssion - why doesn't that problem occur with the above code?

        Recursion was not likely to be the right solution to this problem - it was not that wanted_c was only called once, but that within that one call the chain of elsif tests ensured that only one test could ever match. See another response for my suggested solution to that.

        You'll see, I hope, that the solution there and the new code are effectively the same - we only want to match one filetype, so we check for possible filetypes with a chain of elsif tests; we want to match all of the defined updates, so we use a sequence of if tests.

        Hugo

Re: Re: Re: Permissions Utility Script for sysadmins - seeking comments
by hv (Prior) on Apr 21, 2004 at 23:07 UTC

    Also, do you have any insight into why (as you mentioned) if there are both dir-mode and dir-owner that the latter will be silently ignored.

    Looking again, this was an error on my part - I'd missed the surrounding loop. Apologies for the misdirection.

    Update: bother, I was right the first time. See Re^5: Permissions Utility Script for sysadmins - seeking comments instead.

    I'll try tomorrow to come up with a couple of examples of how you might use a hash to improve things.

    Hugo

      Ah, but you _were_ right - it does happen (or did since I fixed it). It was because the wanted_c function was only being called once for each file, and if it had more than one applicable directive it just passed up the other two. So I changed the wanted_c function calls to something like this:

      sub wanted_c { if ((-d $_) && ($d_mode) { change_mode($_, $d_mode); $d_mode = ""; wanted_c(); } elsif ((-d $_) && ($d_own)) { change_owner($_, $d_own); $d_own = ""; wanted_c(); }

      That way it gets called as many times as is applicable to the file. Thanks for the pointer - I look forward to looking into your hash idea, which is slightly coalescing in my brain, but wouldn't mind your nudging it a bit with some advice/examples.

      -Shane UPDATE: This fixed the problem but created another - it is doing everything, but only on one file! I see it is because of setting the variables to "", but am kinda unsure of what to do instead...

      2nd UPDATE: Things are good now except for one thing, in order to fix the problem I created

      my %info = %{ $config{$test}{$tester} };

      after the spot you suggest a hash. I then put

      else { while(my($key,$val) = each(%info)) { if ($key eq "dir-mode") { $d_mode = $val} elsif ($key eq "file-mode") { $f_mode = $val} elsif ($key eq "link-mode") { $l_mode = $val} elsif ($key eq "dir-owner") { $d_own = $val} elsif ($key eq "file-owner") { $f_own = $val} elsif ($key eq "link-owner") { $l_own = $val} elsif ($key eq "dir-group") { $d_grp = $val} elsif ($key eq "file-group") { $f_grp = $val} elsif ($key eq "link-group") { $l_grp = $val} } return 0; }

      as the final part of the wanted_c function. Everything works fine now, except since I am using diagnostics I get:

      Variable "%info" will not stay shared at permmy6.pl line 372 (#1) (W closure) An inner (nested) named subroutine is referencing a lexical variable defined in an outer subroutine. When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first +* call to the outer subroutine; in this case, after the first call to th +e outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared. Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will never share the given variable. This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, the +y are automatically rebound to the current values of such variables.

      Any thoughts?

        Bother, I was right the first time, and looking at the wrong bit of code the second time.

        The wanted_c routine is called once for each file, and in the original code it goes through a series of elsifs until it finds a match, so it will only act on the first match - the first combination of filetype and updatetype that is appropriate.

        So to fix it you need to avoid that; the minimal fix would probably be something like this:

        sub wanted_c { if (-d $_) { if ($d_mode) { change_mode($_, $d_mode); } if ($d_owner) { change_owner($_, $d_owner); } if ($d_group) { change_group($_, $d_group); } } elsif (-l $_) { ... } elsif (-f $_) { ... } }

        The point is that the different filetypes are exclusive: you're only interested in the first type that matches, so an if/elsif chain is appropriate. The updatetypes are not exclusive: you want to act on each updatetype that is specified, so you much check each of them without jumping out on the first one. An if/elsif chain is not appropriate for that case.

        Hugo

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://347152]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others meditating upon the Monastery: (3)
As of 2024-03-29 07:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found