meirgold has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I am trying to create a structure that holds names and values which will later be turned into Linux environment variables.

For example, $hash{VAR1}="me" will later become an environment variable called $VAR1 whose value is "me".

However, I would like to do it in a more sophisticated way, where each variable has a priority. That way, I can decide that a certain variable will be created before another.

I would like my structure to look something like:

$hash{10}{VAR1}="me";
$hash{20}{VAR2}="you";
In this way, $VAR1 will be created before $VAR2.

My problem here is efficiency: during the program, I would like to query this structure to find out the value of a variable and its priority.

For example, if my structure were built the way I described above, I could just run a statement like if($hash{20}) { ... } to check whether a given priority already exists. However, if I wanted to know the value of a variable, I would have to run a foreach loop over the whole hash in order to find it.

If i build the structure as following:

$hash{VAR1}{priority}{value1}

It would be easier for me to get the value of a variable, or to check whether it already exists. However, if I wanted to check whether a priority exists, or whether one priority is lower than another, I would again have to go through the whole hash.

What would be your suggestion for building such a structure?

Thanks

20040718 Janitored by Corion: Converted TR tags to P tags

Replies are listed 'Best First'.
Re: a hash with priority
by Joost (Canon) on Jul 18, 2004 at 20:35 UTC
    Efficiency in this case depends on a couple of things:
    • a. How many different variable names are there going to be?
    • b. How many different priority levels are there going to be?
    • c. Do they have (m)any "gaps" - i.e. levels are 10, 15, 20, 30 ... or 1, 2, 3, 4...
    • d. How many checks on the existence of a priority level are there going to be?
    • e. How many checks on the existence of a variable name are there going to be?
    • f. Are there multiple uses of the same priority level or variable name?

    You can assume the efficiency doesn't really matter as long as a, b, d, and e are "low" for some value of low.

    If you have lots of lookups on priority, put priority first : $hash{$priority}->{$name} - you can use an array if you have a "continuous" list of priorities in this case: $array[$priority]->{$name}.

    If you have lots of lookups on name you can put name first: $hash{$name}->{$priority}.

    If you have lots of lookups on both name and priority and less "inserts" make 2 indexes: $names{$name}->{$priority} and $priorities{$priority}->{$name}.

    If you only use 1 priority per name and vice versa, you can collapse them so it will be faster to look up:

    # for named lookup
    $names{$name} = { priority => $priority, value => $value };
    # for priority lookup
    $priorities{$priority} = { name => $name, value => $value };
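
As a rough sketch of the collapsed two-index layout above, assuming one priority per name and vice versa: the helper name add_var and the sample data are made up for illustration, and the point is only that both hashes must be updated together so they never disagree.

```perl
use strict;
use warnings;

my (%names, %priorities);

# Hypothetical helper: record a variable under both indexes at once.
sub add_var {
    my ($name, $priority, $value) = @_;
    my $old = $names{$name};
    die "priority $priority already used\n"
        if exists $priorities{$priority}
        && $priorities{$priority}{name} ne $name;
    # Drop the stale priority entry if the name is being re-registered.
    delete $priorities{ $old->{priority} } if $old;
    $names{$name}          = { priority => $priority, value => $value };
    $priorities{$priority} = { name     => $name,     value => $value };
    return;
}

add_var(VAR1 => 10, 'me');
add_var(VAR2 => 20, 'you');

# Both questions are now single hash lookups.
print "VAR2 = $names{VAR2}{value} (priority $names{VAR2}{priority})\n";
print "priority 20 is taken\n" if exists $priorities{20};

# Walk the variables in ascending numeric priority.
for my $p (sort { $a <=> $b } keys %priorities) {
    print "$p: $priorities{$p}{name}=$priorities{$p}{value}\n";
}
```

The cost is that every insert touches two hashes, which is why this layout suits the "many lookups, fewer inserts" case.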

    As you can see, there are quite a lot of options.

    Note that, on Unix, setting %ENV only affects programs exec'd from the current program (including system(), backticks etc.), NOT the "calling" environment (the shell).
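
A small demonstration of that point, assuming a Unix system with sh on the PATH; VAR1 is just an example name. The child process started with backticks sees the value, while the shell that launched the script is unaffected.

```perl
use strict;
use warnings;

# Example variable name, used only for this demonstration.
$ENV{VAR1} = 'me';

# A child process started from this script inherits the new value.
# The backslash stops Perl from interpolating $VAR1 itself.
my $seen = `sh -c 'echo "\$VAR1"'`;
chomp $seen;
print "child sees VAR1=$seen\n";

# Once this script exits, running `echo $VAR1` in the parent shell
# still shows whatever the shell had before (usually nothing).
```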

    Updated: slightly better list layout

Re: a hash with priority
by graff (Chancellor) on Jul 18, 2004 at 23:34 UTC
    If you're talking about "environment variables" in the sense of "shell environment variables", which are provided to perl scripts via the "%ENV" global hash, then I'm curious why the order in which variables are defined should make any difference. As I understand it, environment variables are treated (by Perl and shells) as an unordered set of "name,value" pairs -- that's why ENV is a hash in Perl.

    To the extent that certain variables define "search paths", the ordering of elements within the value of such a variable is of course very important. So I could imagine a process, for example, that will come up with two or more distinct elements to be appended to the value of a variable named "PATH", and the order in which these elements are appended can make the difference between success and failure.

    But why would it be important (for example) to make sure that you assign a value to "PATH" before (or after) you assign a value to some other variable? Just curious...

    (update: reworded last paragraph)

    Another update: On second thought, I realize that people often design and implement shell environments by using dependencies among variables -- e.g. "FOO=baz; BAR=$FOO.bar" and so on. But I'm having trouble imagining any other situations where ordering of variable definitions is an issue, and in this case, assigning numeric priorities to the assignments would not seem like the best approach. It would be better for the organization of the hash to represent the dependencies directly.

      At a guess, I would say it's because some variables contain others.

      Real world example:

      MIS_ROOT=/path/to/root
      MIS_LOADPATH=$MIS_ROOT/loadfiles
      If MIS_LOADPATH is exported before MIS_ROOT is set it will be incorrect.

      C.

        some variables contain others

        Agreed, and as graff's second update points out, this is a dependency ordering problem which can better be solved with graph techniques that don't involve priority numbering.
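
One way the dependency idea could look in practice, as a sketch only: nobody in this thread posted an implementation, so the %def hash, the visit routine and the $NAME reference syntax are all assumptions. It orders variables with a depth-first topological sort, so each variable is defined after the ones its value mentions, and no priority numbers are needed.

```perl
use strict;
use warnings;

# Hypothetical definitions; values may refer to other variables as $NAME.
my %def = (
    MIS_LOADPATH => '$MIS_ROOT/loadfiles',
    MIS_ROOT     => '/path/to/root',
);

my (%state, @order);    # %state: 1 = being visited, 2 = done

sub visit {
    my ($name) = @_;
    my $s = $state{$name} // 0;
    return if $s == 2;
    die "circular reference involving $name\n" if $s == 1;
    $state{$name} = 1;
    # Visit every variable this value mentions that we also define.
    for my $dep ($def{$name} =~ /\$(\w+)/g) {
        visit($dep) if exists $def{$dep};
    }
    $state{$name} = 2;
    push @order, $name;    # dependencies are pushed before dependents
    return;
}

visit($_) for sort keys %def;

# Expand references in dependency order; unknown names are left as-is.
my %value;
for my $name (@order) {
    ($value{$name} = $def{$name})
        =~ s/\$(\w+)/exists $value{$1} ? $value{$1} : "\$$1"/ge;
}

print "$_=$value{$_}\n" for @order;
```

With the sample data, MIS_ROOT comes out before MIS_LOADPATH even though it sorts after it alphabetically, and a cycle such as A=$B, B=$A is reported instead of looping forever.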

Re: a hash with priority
by ccn (Vicar) on Jul 18, 2004 at 20:33 UTC
    You can check if a priority already exists with if (exists $hash{20}){...}

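    Further, to list every variable in priority order:

    # Sort numerically so that, e.g., priority 5 comes before priority 10.
    foreach my $priority (sort { $a <=> $b } keys %hash) {
        print "Priority: $priority\n";
        foreach (keys %{ $hash{$priority} }) {
            print "\t $_ => $hash{$priority}{$_}\n";
        }
    }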