in reply to Re: Perl and arrays of arrays?
in thread Perl and arrays of arrays?

Here is what it looks like as a hash of arrays. It's not as scalable as a hash of hashes, since it needs to grep through the array of aliases each time, but on small files you shouldn't notice.
use strict;
use warnings;

# Hash of arrays
# USAGE: $masterhash{$ipaddress} = @aliases
my %masterhash;

my $input_file  = "test.in";
my $output_file = "test.out";

open my $in, '<', $input_file
    or die "Could not open $input_file: $!\n";

# Loop through each line in the input file
while (my $line = <$in>) {
    # Split line on whitespace (ignores leading and repeated spaces)
    my @line_parts = split ' ', $line;

    # Grab the IP address; skip blank lines
    my $ipaddress = shift @line_parts;
    next unless defined $ipaddress;

    # Loop through the aliases
    foreach my $alias (@line_parts) {
        # Search through the array for that IP address to determine
        # whether or not the current alias is already there
        #
        # The value in $masterhash{$ipaddress} is an array reference, so
        # we have to wrap it with a @{} to use it as an array
        unless (grep { $_ eq $alias } @{ $masterhash{$ipaddress} }) {
            # If we get here, it's a new alias for this IP address so add it
            push @{ $masterhash{$ipaddress} }, $alias;
        }
    }
}
close $in;

# Printing output...
open my $out, '>', $output_file
    or die "Could not open $output_file: $!\n";

# Print one line per IP address
foreach my $ip (sort keys %masterhash) {
    # Print the IP, followed by all of its aliases joined by spaces.
    print {$out} "$ip ", join(" ", sort @{ $masterhash{$ip} }), "\n";
}
close $out or die "Could not close $output_file: $!\n";

test.in
IPADDRESS alias1 alias2 alias5
IPADDRESS2 alias3 alias4
IPADDRESS2 alias3 alias5
IPADDRESS2 alias3 alias5

test.out
IPADDRESS alias1 alias2 alias5
IPADDRESS2 alias3 alias4 alias5
Hope this helps!

Re^3: Perl and arrays of arrays?
by QM (Parson) on Aug 17, 2005 at 15:43 UTC
    (Was your reply misplaced?)

    Your comment is wrong, but your code works:

    # USAGE: $masterhash{$ipaddress} = @aliases
    That's the same as:
    $masterhash{$ipaddress} = scalar @aliases
    What you want in this case is:
    $masterhash{$ipaddress} = \@aliases
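To make the difference between the three assignments concrete, here is a minimal sketch. The hash keys `count`, `ref`, and `copy` are made up for illustration, and the anonymous-array copy (`[@aliases]`) is an extra variant not mentioned above.

```perl
use strict;
use warnings;

my @aliases = ('alias1', 'alias2', 'alias5');
my %masterhash;

# An array assigned to a scalar slot is evaluated in scalar context,
# so this stores the number of elements, not the list itself.
$masterhash{count} = @aliases;

# A backslash takes a reference, so the hash value points at @aliases.
$masterhash{ref} = \@aliases;

# An anonymous array holds a copy (illustrative extra, not from the thread);
# later changes to @aliases will not show up here.
$masterhash{copy} = [@aliases];

print "count: $masterhash{count}\n";
print "ref:   @{ $masterhash{ref} }\n";
print "copy:  @{ $masterhash{copy} }\n";
```

The first line printed is the count (3); the other two print the aliases themselves.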
    The only justification for array of hashes I can come up with is if the number of entries is very large, and you manage duplicate hosts. (For example, the file is already sorted by host, and you keep track of the last host. If the current host is the same, the new data is added to the last host entry. Otherwise, a new host entry is created first.) But searching the host array to find the correct entry on a large array is problematic.
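The sorted-input case described above could be sketched roughly like this. It assumes the lines are already grouped by host, and the names `@hosts`, `ip`, and `aliases` are invented for the example. The sample data is inlined so the snippet runs on its own.

```perl
use strict;
use warnings;

# Sample input, already sorted by host (this approach depends on that).
my @lines = (
    "IPADDRESS alias1 alias2 alias5",
    "IPADDRESS2 alias3 alias4",
    "IPADDRESS2 alias3 alias5",
);

my @hosts;    # array of hashes: { ip => ..., aliases => [ ... ] }

for my $line (@lines) {
    my ($ip, @aliases) = split ' ', $line;
    next unless defined $ip;

    # Create a new entry only when the host differs from the last one seen.
    if (!@hosts or $hosts[-1]{ip} ne $ip) {
        push @hosts, { ip => $ip, aliases => [] };
    }

    # Add any alias not yet recorded for the current (last) entry.
    my $entry = $hosts[-1];
    for my $alias (@aliases) {
        push @{ $entry->{aliases} }, $alias
            unless grep { $_ eq $alias } @{ $entry->{aliases} };
    }
}

print join(' ', $_->{ip}, @{ $_->{aliases} }), "\n" for @hosts;
```

Only the last entry is ever inspected, so no search over the host array is needed. If the input is not sorted, the same host ends up with several entries.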

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      Oops, I should have replied to the original post.

      The comment was meant to indicate the usage; I wasn't worried about making it syntactically correct.

      I personally like the hash of arrays approach, as it more closely mirrors the way I think about the problem: a key (IP address) that can have one or more aliases. However, as I noted in my original post, it is not the most efficient way to do things...just a bit easier to understand for a beginner IMHO.
        I understand what you're saying, but I'd go in a different direction.

        If a beginner understands AoA:

        $array[$row][$col] = $pixel;
        and that hashes are sort of arrays with different brackets:
        $array{$row}{$col} = $pixel;
        then it's a small leap to suggest that (here, at least) the keys are the interesting bits, and the values don't matter:
        $hosts{$ip}{$alias} = 1;
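Applied to the sample data from the top of the thread, the hash-of-hashes idea might look like the sketch below. The input is inlined rather than read from test.in, and `@lines` and `@output` are names chosen for the example.

```perl
use strict;
use warnings;

# Same data as test.in, inlined so the snippet is self-contained.
my @lines = (
    "IPADDRESS alias1 alias2 alias5",
    "IPADDRESS2 alias3 alias4",
    "IPADDRESS2 alias3 alias5",
    "IPADDRESS2 alias3 alias5",
);

my %hosts;
for my $line (@lines) {
    my ($ip, @aliases) = split ' ', $line;
    next unless defined $ip;

    # The alias is the key, so duplicates collapse on their own;
    # the value 1 is only a placeholder.
    $hosts{$ip}{$_} = 1 for @aliases;
}

# One line per IP, followed by its sorted aliases.
my @output;
for my $ip (sort keys %hosts) {
    push @output, join ' ', $ip, sort keys %{ $hosts{$ip} };
}
print "$_\n" for @output;
```

No grep is needed here: checking for an existing alias becomes a hash lookup, which is where the scalability difference against the hash of arrays comes from.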

        -QM