kayj has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I have data on about 100 individuals, each with a first name, middle name, last name, age, sex, and location.

I was wondering whether I can read it into a hash with two keys (the first and middle names together used as the key of the hash), with the values of the hash being the rest of the data (age, sex, location).

I read about hashes in a Perl book, but I have only seen a single variable used as the key. The reason I am asking is that so many records have the same first name that I cannot use the first name alone as the key. Can a hash have two keys instead of one for each record? If yes, how can this be done?

Your help is greatly appreciated.

Replies are listed 'Best First'.
Re: question on Hashes
by markkawika (Monk) on Sep 01, 2009 at 23:01 UTC

    A hash cannot have two keys.

    However, you can combine values together to create a single key:

    $var{$firstname . $middlename} = ...

    Or you can have a multi-dimensional hash:

    $var{$firstname}{$middlename} = ...

    But neither of those addresses the issue of collisions; you already stated that many records have the same first name, but can you guarantee that firstname + middlename is unique? Are you sure?
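
    As a minimal sketch of the two approaches above: the sample names, the '|' separator, and the variable names %by_composite, %by_nested and @duplicates are all assumptions for illustration. The names are joined with a separator so that two different name pairs cannot collapse into the same string, and exists is checked before storing so that a repeated name pair is reported rather than silently overwritten.

```perl
use strict;
use warnings;

# Hypothetical sample records: first, middle, last, age, sex, location.
my @rows = (
    [ 'Jean',    'Luc', 'Smith', 27, 'M', 'here'  ],
    [ 'JeanLuc', '',    'Jones', 56, 'M', 'there' ],
    [ 'Jean',    'Luc', 'Brown', 30, 'M', 'away'  ],
);

my ( %by_composite, %by_nested, @duplicates );

for my $row (@rows) {
    my ( $first, $middle, @rest ) = @$row;

    # A separator keeps "Jean" + "Luc" distinct from "JeanLuc" + "";
    # plain concatenation would give "JeanLuc" for both.
    my $key = join '|', $first, $middle;

    # Refuse to overwrite an existing record; remember the clash instead.
    if ( exists $by_composite{$key} ) {
        push @duplicates, $key;
        next;
    }

    $by_composite{$key}         = [@rest];    # one flat key
    $by_nested{$first}{$middle} = [@rest];    # two-level hash
}

print "stored: $_\n"    for sort keys %by_composite;
print "duplicate: $_\n" for @duplicates;
```

    Perl also accepts $var{$firstname, $middlename}, which joins the list with the value of $; (by default "\034") to form a single key, so it behaves like the flat-key version here.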

    Typically in databases of the sort you're describing, each record is given a key that is guaranteed to be unique, such as an ID number. Then you can store data about the person without worrying about key collisions.
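
    A rough sketch of the unique-key idea, assuming the data has no ID of its own so a simple counter is assigned while loading. The sample rows, the field names, and the names %person and $next_id are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data: two people who share first and middle names.
my @rows = (
    [ 'John', 'Paul', 'Smith', 27, 'M', 'here'  ],
    [ 'John', 'Paul', 'Jones', 56, 'M', 'there' ],
);

my %person;         # id => reference to a hash of fields
my $next_id = 1;    # assumed id scheme: a counter assigned on load

for my $row (@rows) {
    my %record;
    # A hash slice fills all six named fields in one assignment.
    @record{qw(first middle last age sex location)} = @$row;
    $person{ $next_id++ } = \%record;
}

for my $id ( sort { $a <=> $b } keys %person ) {
    my $p = $person{$id};
    print "$id: $p->{first} $p->{middle} $p->{last}, $p->{age}\n";
}
```

    Because the key carries no meaning of its own, two people with identical names simply get different ids; finding someone by name then means scanning the values or keeping a separate index.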

    Really, the question is, what do you want to do with the data that you have? Why are you storing it, and what will you do with it after you store it?

Re: question on Hashes
by ig (Vicar) on Sep 01, 2009 at 23:53 UTC

    update: After writing the following, I realized you don't want multiple keys, you want one key incorporating multiple values. markkawika and bichonfrise74 have already shown you how to combine the values to create a composite key. You can combine that idea with the hash of arrays below to deal with duplicate keys.

    You can create as many hashes as you want, so multiple keys to the same data is not a problem. You can also create a hash of arrays, where each hash value is a reference to an array, allowing you to store multiple values for each key. This might address your issue with collisions. Whatever combination of first, middle and last name you use for keys, you will have the possibility of collisions, so a hash of arrays is probably the way to go. The following demonstrates one way of doing it.

    use strict;
    use warnings;

    my ( %individuals_by_first_name, %individuals_by_middle_name );

    foreach my $line (<DATA>) {
        chomp($line);
        my ($first, $middle, $last, $age, $sex, $location) = split(/,/, $line);
        my $record = [ $first, $middle, $last, $age, $sex, $location ];
        push(@{$individuals_by_first_name{$first}}, $record);
        push(@{$individuals_by_middle_name{$middle}}, $record);
    }

    foreach my $first (sort keys %individuals_by_first_name) {
        foreach my $individual (@{$individuals_by_first_name{$first}}) {
            my ($first, $middle, $last, $age, $sex, $location) = @$individual;
            print "$first: $middle, $last, $age, $sex, $location\n";
        }
    }

    __DATA__
    first1,middle1,last1,27,M,here
    first2,middle2,last2,56,M,there
    first3,middle3,last3,30,F,everywhere
    first1,middle4,last4,7,F,home
    first4,middle2,last3,22,M,away

    which produces

    first1: middle1, last1, 27, M, here
    first1: middle4, last4, 7, F, home
    first2: middle2, last2, 56, M, there
    first3: middle3, last3, 30, F, everywhere
    first4: middle2, last3, 22, M, away

    Or, by adding one more layer of hash to create a hash of hashes of arrays, you can merge the two hashes into one and add further lookup keys later without declaring more variables.

    use strict;
    use warnings;

    my %individuals;

    foreach my $line (<DATA>) {
        chomp($line);
        my ($first, $middle, $last, $age, $sex, $location) = split(/,/, $line);
        my $record = [ $first, $middle, $last, $age, $sex, $location ];
        push(@{$individuals{first}{$first}}, $record);
        push(@{$individuals{middle}{$middle}}, $record);
    }

    foreach my $first (sort keys %{$individuals{first}} ) {
        foreach my $individual (@{$individuals{first}{$first}}) {
            my ($first, $middle, $last, $age, $sex, $location) = @$individual;
            print "$first: $middle, $last, $age, $sex, $location\n";
        }
    }

    __DATA__
    first1,middle1,last1,27,M,here
    first2,middle2,last2,56,M,there
    first3,middle3,last3,30,F,everywhere
    first1,middle4,last4,7,F,home
    first4,middle2,last3,22,M,away

    This produces the same output as the first example. You can read more about references and data structures in perlref, perldsc, perllol and References.
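
    One point worth illustrating about this layout: both indexes store a reference to the same array rather than a copy, so an update made through one index is seen through the other. The following is only a small sketch with one hypothetical record, using the same field order as the examples above.

```perl
use strict;
use warnings;

my %individuals;

# Hypothetical record: first, middle, last, age, sex, location.
my $record = [ 'first1', 'middle1', 'last1', 27, 'M', 'here' ];

# Both indexes hold a reference to the same array, not a copy.
push @{ $individuals{first}{ $record->[0] } },  $record;
push @{ $individuals{middle}{ $record->[1] } }, $record;

# Change the location (index 5) through the first-name index...
$individuals{first}{first1}[0][5] = 'elsewhere';

# ...and read it back through the middle-name index.
print $individuals{middle}{middle1}[0][5], "\n";
```

    If independent copies were wanted instead, each push would need its own anonymous array, for example [@$record].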

Re: question on Hashes
by bichonfrise74 (Vicar) on Sep 01, 2009 at 23:37 UTC
    If I were you, I would consider a data structure something like this, keyed first by the last name and then by the combination of the first and middle names:
    $rec{$last_name}{"$first_name $middle_name"} = [ $age, $sex, $location ];
    Ignore the syntax, as I have not tested it. Also, I'm not sure whether this can be considered a hash of hashes of arrays.

    Of course, this does not guarantee that record collisions will not happen.
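
    A hedged sketch of this last-name-first layout, adjusted so that each leaf holds an array of records instead of a single one; that adjustment is an assumption added here to tolerate collisions, and the sample rows are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample records: first, middle, last, age, sex, location.
my @rows = (
    [ 'John', 'Paul', 'Smith', 27, 'M', 'here'  ],
    [ 'John', 'Paul', 'Smith', 56, 'M', 'there' ],
    [ 'Mary', 'Ann',  'Jones', 30, 'F', 'away'  ],
);

my %rec;
for my $row (@rows) {
    my ( $first, $middle, $last, @rest ) = @$row;

    # Each leaf is an array of records, so a repeated full name adds
    # an entry instead of overwriting the earlier one.
    push @{ $rec{$last}{"$first $middle"} }, [@rest];
}

for my $last ( sort keys %rec ) {
    for my $name ( sort keys %{ $rec{$last} } ) {
        my $count = @{ $rec{$last}{$name} };
        print "$last, $name: $count record(s)\n";
    }
}
```

    Here two people named John Paul Smith end up as two entries under the same keys, so nothing is lost, although the caller still has to decide which of the two is meant.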