Adam has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that does some stuff and then e-mails a report to various interested parties based on the results. There are not enough e-mail addresses to warrant a database, or even a separate file. I just have an ugly nested data structure. The structure is a hash where the keys are categories and the values are hashes. The sub-hashes then hold some number of name/address pairs.

This worked well until I wanted to look up someone's address by Name instead of category. So I wrote a subroutine to do that... but there went the efficiency of a hash! So now I'm wondering if anyone can think of a more efficient way to do this (keeping in mind that there are only about 15 names and 4 or 5 categories... so far.)

my %EMail = ( # This hash best accessed via GetEMailFor()
    Category1 => {
        "Name One", 'some1@address.com',
    },
    Category2 => {
        "Name Two", 'some2@address.com',
        "Name Three", 'some3@address.com',
    },
    Category3 => {
        "Name Four", 'some4@address.com',
        "Name Five", 'some5@address.com',
        "Name Six", 'some6@address.com',
        "Name Seven", 'some7@address.com',
        "Name Eight", 'some8@address.com',
    },
    # And so on...
);

# GetEMailFor( $key, [$key2, $key3...] )
#
# Returns the requested e-mail address.
# If more than one address is found, they are returned as
# a comma-delimited scalar.
# The $key may be the person's category or name.
#
sub GetEMailFor {
    my @addresses = ();
    while( @_ ) {
        my $request = shift;
        for( keys %EMail ) {
            if( $_ eq $request ) {
                push @addresses, values %{$EMail{$_}};
                next;
            }
            for my $name ( keys %{$EMail{$_}} ) {
                if( $name eq $request ) {
                    push @addresses, ${$EMail{$_}}{$name};
                    next;
                }
            }
        }
    }
    return join ',', reverse @addresses if $#addresses;
    return $addresses[0];
}

# So
print GetEMailFor( 'Category2' );
# prints: some2@address.com,some3@address.com
print GetEMailFor( 'Name Five' );
# prints: some5@address.com
print GetEMailFor( 'Category1', 'Name Seven' );
# prints: some1@address.com,some7@address.com

I suspect that I might just redo the hash to be less complex, maybe make names and categories keys, but have categories return an array of names... making the user have to re-access the hash with each name if a list gets returned... hmmm, that's almost as ugly.

Re: Complex Hash
by merlyn (Sage) on Sep 06, 2000 at 04:12 UTC
    Look into the Memoize module, which can remember the results of prior queries to speed things up. First, break out the lookup into a separate function:

    sub lookup {
        die unless wantarray; # call me in list context only
        my $item = shift;
        return values %{$EMail{$item}} if exists $EMail{$item}; # category name
        for my $cat (keys %EMail) {
            return $EMail{$cat}{$item} if exists $EMail{$cat}{$item}; # single name
        }
        return; # not found
    }

    sub GetEMailFor {
        my @results = map { lookup($_) } @_;
        return join ",", @results;
    }

    This should be equivalent code (and quite a bit faster, I might add {grin}). Now, let's speed it up further:

    use Memoize qw(memoize unmemoize); # unmemoize is not exported by default
    memoize('lookup');

    sub I_have_changed_EMail {
        # Drop the cached results and start a fresh cache.
        unmemoize('lookup');
        memoize('lookup');
    }

    and be sure to call I_have_changed_EMail any time you change your dataset (none at all if you "set it and forget it", as you indicate already).

    There. Speed without too much work.
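
A minimal sketch of the effect, not the original poster's script: the `$calls` counter and the reduced sample data are assumptions added for illustration, and the category values are sorted only so the output does not depend on hash ordering. It suggests that the cache works per item, so a new combination of already-seen keys does not run the real lookup again.

```perl
use strict;
use warnings;
use Memoize qw(memoize);

# Sample data, shaped like the %EMail hash from the question.
my %EMail = (
    Category1 => { 'Name One' => 'some1@address.com' },
    Category2 => {
        'Name Two'   => 'some2@address.com',
        'Name Three' => 'some3@address.com',
    },
);

my $calls = 0;    # hypothetical counter: how often the real lookup runs

sub lookup {
    my $item = shift;
    $calls++;
    # Sorted so the order does not depend on hash ordering.
    return sort values %{ $EMail{$item} } if exists $EMail{$item};
    for my $cat (keys %EMail) {
        return $EMail{$cat}{$item} if exists $EMail{$cat}{$item};
    }
    return;    # not found
}

memoize('lookup');

sub GetEMailFor { return join ',', map { lookup($_) } @_ }

my $first  = GetEMailFor('Category2', 'Name One');
my $second = GetEMailFor('Name One', 'Category2');    # new combination, same items

print "$first\n$second\n";
print "lookup ran $calls times\n";    # 2: one real call per distinct item
```

With only about 15 names the saving is tiny either way; the point is just that the cache key is the individual name or category, not the whole argument list.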

    -- Randal L. Schwartz, Perl hacker

      That would work well if GetEMailFor() were being called lots of times with the same query, but it isn't. It's being called several times, with almost every query being different.
        Please relook. I'm memoizing the underlying individual items. Not the combination of what's being called to GetEMailFor.
        "Please read for comprehension!" -- PurlGurl {grin}

        -- Randal L. Schwartz, Perl hacker

Re: Complex Hash
by tye (Sage) on Sep 06, 2000 at 08:06 UTC
    my %EMail = (
        Category1 => {
            "Name One", 'some1@address.com',
        },
        Category2 => {
            "Name Two", 'some2@address.com',
            "Name Three", 'some3@address.com',
        },
        Category3 => {
            "Name Four", 'some4@address.com',
            "Name Five", 'some5@address.com',
            "Name Six", 'some6@address.com',
            "Name Seven", 'some7@address.com',
            "Name Eight", 'some8@address.com',
        },
    );

    @EMail{keys %{$EMail{$_}}}= values %{$EMail{$_}}
        for keys %EMail;

    sub GetEMailFor {
        my @addresses= ();
        for my $dest (@_) {
            my $addr= $EMail{$dest}
                or die "No such e-mail destination: $dest\n";
            push @addresses, ref($addr) ? values %$addr : $addr;
        }
        return join ',', reverse @addresses;
    }

    Read for apprehension; it hasn't been tested. (:

    Actually it has; it had lots of typos. ): But I fixed them.
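
One variation on the flattening idea above, sketched under the assumption that you would rather leave `%EMail` itself untouched: build a separate index once, so every key, category or name, maps to an array ref of addresses. The name `%AddressesFor` and the reduced sample data are hypothetical.

```perl
use strict;
use warnings;

my %EMail = (
    Category1 => { 'Name One' => 'some1@address.com' },
    Category2 => {
        'Name Two'   => 'some2@address.com',
        'Name Three' => 'some3@address.com',
    },
);

# %AddressesFor is a hypothetical index built once at startup:
# every category and every name maps to an array ref of addresses.
my %AddressesFor;
for my $category (keys %EMail) {
    my $people = $EMail{$category};
    # Names are sorted so the order within a category is stable.
    $AddressesFor{$category} = [ map { $people->{$_} } sort keys %$people ];
    $AddressesFor{$_} = [ $people->{$_} ] for keys %$people;
}

sub GetEMailFor {
    my @addresses;
    for my $dest (@_) {
        my $found = $AddressesFor{$dest}
            or die "No such e-mail destination: $dest\n";
        push @addresses, @$found;
    }
    return join ',', @addresses;
}

print GetEMailFor('Category2', 'Name One'), "\n";
# prints: some3@address.com,some2@address.com,some1@address.com
```

Because every value has the same shape, the lookup needs no `ref` check. The trade-off is the same as in the merged hash: a person whose name happens to match a category name would collide with it.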

            - tye (but my friends call me "Tye")
Re: Complex Hash
by BlaisePascal (Monk) on Sep 06, 2000 at 04:27 UTC
    I'd be tempted to throw this thing into three hashes: one keyed by category, one by name, one by address. You can throw them into a closure to give you controlled access to them:

    {
        my %n;
        my %c;
        my %a;

        sub addrecord ($$$) {
            my ($n, $c, $a) = @_;
            my $r = [$n, $c, $a];
            # Each index maps a key to an array ref of records.
            push @{ $n{$n} }, $r;
            push @{ $c{$c} }, $r;
            push @{ $a{$a} }, $r;
        }

        sub getbyname ($) {
            my $n = shift;
            return @{ $n{$n} || [] };
        }

        sub getbycat ($) {
            my $c = shift;
            return @{ $c{$c} || [] };
        }

        sub getbyaddress ($) {
            my $a = shift;
            return @{ $a{$a} || [] };
        }
    }

    I haven't tested the above, and I might have some of the reference syntax wrong, but it should be clear how that works, that you can look up by any of category, name, and address, and that it allows a many-to-many-to-many relationship.

    Would this serve your need?
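
A rough usage sketch of the three-index idea, with the many-to-many case made explicit. The sub and variable names here are illustrative rather than the poster's, and the sample records are made up; each getter returns a list of `[name, category, address]` records.

```perl
use strict;
use warnings;

{
    # Three private indexes over the same records; names are illustrative.
    my (%by_name, %by_category, %by_address);

    sub add_record {
        my ($name, $category, $address) = @_;
        my $record = [ $name, $category, $address ];
        push @{ $by_name{$name} },         $record;
        push @{ $by_category{$category} }, $record;
        push @{ $by_address{$address} },   $record;
        return;
    }

    # Each getter returns a (possibly empty) list of records.
    sub get_by_name     { return @{ $by_name{ $_[0] }     || [] } }
    sub get_by_category { return @{ $by_category{ $_[0] } || [] } }
    sub get_by_address  { return @{ $by_address{ $_[0] }  || [] } }
}

add_record('Name One', 'Category1', 'some1@address.com');
add_record('Name Two', 'Category2', 'some2@address.com');
add_record('Name One', 'Category2', 'some1@address.com');    # same person, second category

my @cat2 = map { $_->[2] } get_by_category('Category2');
my @cats = map { $_->[1] } get_by_name('Name One');

print join(',', @cat2), "\n";    # some2@address.com,some1@address.com
print join(',', @cats), "\n";    # Category1,Category2
```

For 15 names this is probably more machinery than the problem needs, but it keeps every lookup a single hash access.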

Re: Complex Hash
by extremely (Priest) on Sep 07, 2000 at 04:02 UTC

    OK, Maybe it's the DB admin in me but why not this:

    my %Emails = (
        'Name 1' => 'Some1@somewhere.com',
        'Name 2' => 'Some2@whereelse.com',
        'Name 3' => 'Some3@nowhere.com',
        'Name 4' => 'Some4@someelse.com',
        'Category1' => [ 'Name 1', 'Name 2' ],
        'Category2' => [ 'Name 2', 'Name 3', 'Name 4'],
    );

    sub GetEMailFor {
        my %names = ();
        foreach my $request ( @_ ) {
            if (ref ($Emails{$request}) eq 'ARRAY') {
                foreach my $em (@{$Emails{$request}}) {
                    $names{$em}=1;
                }
            } else {
                $names{$request}=1;
            }
        }
        return join ',', map($Emails{$_}, keys %names);
    }

    print (GetEMailFor('Name 1', "Category2") ."\n");

    Actually tested... and I leave the recursion that would allow categories to contain other category references as an exercise for the reader.
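
One possible take on that exercise, as a sketch only: the `Everyone` category, the `expand` helper, and the choice to skip unknown keys silently are all assumptions made for illustration. A shared `%seen` hash stops cycles and drops duplicates.

```perl
use strict;
use warnings;

my %Emails = (
    'Name 1'    => 'Some1@somewhere.com',
    'Name 2'    => 'Some2@whereelse.com',
    'Name 3'    => 'Some3@nowhere.com',
    'Category1' => [ 'Name 1', 'Name 2' ],
    'Category2' => [ 'Name 3' ],
    'Everyone'  => [ 'Category1', 'Category2' ],    # hypothetical nested category
);

# Expand a request into addresses, descending into nested categories.
# %$seen guards against cycles and suppresses duplicates.
sub expand {
    my ($request, $seen) = @_;
    return if $seen->{$request}++;
    my $entry = $Emails{$request};
    return unless defined $entry;                  # unknown key: skipped silently
    return $entry unless ref($entry) eq 'ARRAY';   # plain name: one address
    return map { expand($_, $seen) } @$entry;      # category: recurse into members
}

sub GetEMailFor {
    my %seen;
    return join ',', map { expand($_, \%seen) } @_;
}

print GetEMailFor('Everyone', 'Name 2'), "\n";
# prints: Some1@somewhere.com,Some2@whereelse.com,Some3@nowhere.com
```

Unlike the version above, this keeps addresses in the order they are first encountered rather than in hash order.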

    --
    $you = new YOU;
    honk() if $you->love(perl)

Re: Complex Hash
by Anonymous Monk on Sep 06, 2000 at 21:56 UTC
    I would invoke virtue #2, laziness and ask why your data structure can't look like this:
    my %email = (
        Category1 => { "Name One", 'some1@address.com' },
        Category2 => {
            "Name Two",   'some2@address.com',
            "Name Three", 'some3@address.com',
        },
        "Name One" => 'some1@address.com',
    );

    You see, the hash does not have to be homogeneous. When you do a lookup, either key the category names so they're easily distinguishable as strings (all caps, or a * in front), or check whether the value of your hash key is a reference. If it is, dereference and look up again. Or better yet, realize that what you really want is a hash of arrays: when you provide a name, you want one or more e-mail addresses back. Note that you never return "Name One", only 'some1@address.com'. Try organizing it like this:

    my %Email = (
        'Category1' => [ qw(some1@address.com) ],
        'Category2' => [ qw(some2@address.com some3@address.com) ],
        'Name One'  => [ qw(some1@address.com) ],
    );
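
A short sketch of how a lookup against such a hash of arrays might go. The de-duplication step and the decision to ignore unknown keys are assumptions added here, not part of the suggestion above.

```perl
use strict;
use warnings;

# Every key, category or name, maps to an array ref of addresses.
my %Email = (
    'Category1' => [ qw(some1@address.com) ],
    'Category2' => [ qw(some2@address.com some3@address.com) ],
    'Name One'  => [ qw(some1@address.com) ],
);

sub GetEMailFor {
    my %seen;
    # Unknown keys contribute nothing; duplicates are dropped, first one wins.
    my @addresses = grep { !$seen{$_}++ }
                    map  { @{ $Email{$_} || [] } } @_;
    return join ',', @addresses;
}

print GetEMailFor('Category1', 'Name One', 'Category2'), "\n";
# prints: some1@address.com,some2@address.com,some3@address.com
```

The cost of this layout is that each address is typed once per key it appears under, which is manageable at this size but easy to get out of sync as the list grows.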