kayak9630 has asked for the wisdom of the Perl Monks concerning the following question:

Oh great and wise Perl Monks... I have a question, and it is somewhat difficult to phrase, but I will try my best. I need an efficient way to crunch through some data. Here is what it looks like:

Login,Name,Location,Info,Info2,Infoblah

Here is my challenge: I need to be able to store the locations in an array, and then store the users (Logins) associated with each location in an array. In my head, I can kind of picture a hash that has a reference to the site as the key, then an anonymous array to hold the users as the value, but I'm not super experienced with data structures that use references. The other challenge I see is making sure that the array representing a location is created only once, and that all users after that point get associated with that location.

Replies are listed 'Best First'.
Re: Categorization Problem
by ikegami (Patriarch) on Nov 02, 2005 at 22:17 UTC
    Sounds right on
    # For each record:
    push(@{$logged_in{$fields[2]}}, $fields[0]);

    # Sample usage:
    foreach my $loc (sort keys %logged_in) {
        foreach my $user (@{$logged_in{$loc}}) {
            print("$user logged in at location $loc.\n");
        }
    }
    or if you want the users only listed at most once per location:
    # For each record:
    $logged_in{$fields[2]}{$fields[0]} = 1;

    # Sample usage:
    foreach my $loc (sort keys %logged_in) {
        foreach my $user (sort keys %{$logged_in{$loc}}) {
            print("$user logged in at location $loc.\n");
        }
    }
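The snippets above assume that `@fields` and `%logged_in` already exist. A minimal, hedged sketch of the surrounding loop might look like the following; `group_logins` is a hypothetical helper name, and the sample records are made up in the format given in the question. It assumes plain comma-separated lines with no quoted or embedded commas.

```perl
use strict;
use warnings;

# Hypothetical helper: build location => [logins] from comma-separated lines.
sub group_logins {
    my @lines = @_;
    my %logged_in;
    for my $line (@lines) {
        chomp $line;
        next unless length $line;          # skip blank lines
        my @fields = split /,/, $line;     # login is field 0, location is field 2
        # The array for a location is autovivified on the first push,
        # so it is created exactly once per location.
        push @{ $logged_in{ $fields[2] } }, $fields[0];
    }
    return \%logged_in;
}

my $by_loc = group_logins(
    "geroges,George Smart,West,bla,blah,blahbla\n",
    "pams,Pam Smart,South,bla,blah,blahbla\n",
    "dumas,Ed Dumas,West,bla,blah,blahbla\n",
);

for my $loc (sort keys %$by_loc) {
    print "$loc: @{ $by_loc->{$loc} }\n";
}
```

For real-world CSV with quoting, a CSV parsing module would be a safer choice than a bare `split`.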
Re: Categorization Problem
by blue_cowdawg (Monsignor) on Nov 02, 2005 at 22:21 UTC

    use strict;
    use Data::Dumper;

    my $table={};
    while (my $line=<DATA>){
        chomp $line;
        my ($login,$name,$location,$info,$info2,$infoblah)=
            split(",",$line);
        if ( not defined($table->{$location})){
            $table->{$location}=[];
        }
        push @{$table->{$location}},{name => $name,
                                     login => $login,
                                     info => $info,
                                     info2 => $info2,
                                     infoblah => $infoblah
                                    };
    }
    print Dumper($table);
    __END__
    geroges,George Smart,West,bla,blah,blahbla
    pams,Pam Smart,South,bla,blah,blahbla
    dumas,Ed Dumas,West,bla,blah,blahbla
    fink,George Fink,West,bla,blah,blahbla
    gerogel,George Learch,North,bla,blah,blahbla
    frankf,Frank Furter,West,bla,blah,blahbla

    This will yield

    $VAR1 = {
        'West' => [
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'George Smart',
                'login' => 'geroges'
            },
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'Ed Dumas',
                'login' => 'dumas'
            },
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'George Fink',
                'login' => 'fink'
            },
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'Frank Furter',
                'login' => 'frankf'
            }
        ],
        'North' => [
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'George Learch',
                'login' => 'gerogel'
            }
        ],
        'South' => [
            {
                'info' => 'bla',
                'info2' => 'blah',
                'infoblah' => 'blahbla',
                'name' => 'Pam Smart',
                'login' => 'pams'
            }
        ]
    };
    when run.... hope this helps.
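Once the data is in a hash of arrays of hashes like this, pulling just the logins back out per location is a short `map`. The following is only a sketch: `logins_for` is a hypothetical helper name, and the small `$table` literal is a cut-down version of the dump above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the logins stored under one location.
# The "|| []" avoids autovivifying an entry for an unknown location.
sub logins_for {
    my ($table, $location) = @_;
    return map { $_->{login} } @{ $table->{$location} || [] };
}

my $table = {
    West  => [ { login => 'geroges', name => 'George Smart' },
               { login => 'dumas',   name => 'Ed Dumas' } ],
    South => [ { login => 'pams',    name => 'Pam Smart' } ],
};

for my $location (sort keys %$table) {
    print "$location: ", join(', ', logins_for($table, $location)), "\n";
}
```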

Re: Categorization Problem
by GrandFather (Saint) on Nov 02, 2005 at 22:23 UTC

    That depends a whole lot on what you want to do with the data, how much data there is, how often it changes, where it's coming from ...

    If there is a small (for some definition of small) quantity of data that you want to process in some fashion indexed by location then something like this is likely what you are after:

    use strict;
    use warnings;

    my %data;

    while (<DATA>) {
        my ($id, $location, $info) = /^((?:[^,]*,){2})([^,]*),(.*)$/;
        push @{$data{$location}}, "$id$info"; #Array of user entries for each location
    }

    for (sort keys %data) {
        print "Location $_:\n ";
        print join "\n ", @{$data{$_}};
        print "\n";
    }

    __DATA__
    Login,Name1,Location1,Info,Info2,Infoblah
    Login,Name2,Location1,Info,Info2,Infoblah
    Login,Name3,Location2,Info,Info2,Infoblah
    Login,Name4,Location2,Info,Info2,Infoblah

    Prints:

    Location Location1:
     Login,Name1,Info,Info2,Infoblah
     Login,Name2,Info,Info2,Infoblah
    Location Location2:
     Login,Name3,Info,Info2,Infoblah
     Login,Name4,Info,Info2,Infoblah
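The regex above captures the first two fields (with their trailing commas), the location, and everything after it. As an alternative sketch, not the approach used in the reply, the same separation can be done with `split` and `splice`. `split_record` is a hypothetical name, and the code assumes every line has at least three comma-separated fields.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the location (third field) from a record and
# return it together with the remaining fields rejoined by commas.
sub split_record {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;      # -1 keeps trailing empty fields
    my ($location) = splice @fields, 2, 1;  # take field 2 out of the list
    return ($location, join(',', @fields));
}

my ($location, $rest) =
    split_record("Login,Name1,Location1,Info,Info2,Infoblah\n");
print "$location => $rest\n";
```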

    Perl is Huffman encoded by design.
Re: Categorization Problem
by injunjoel (Priest) on Nov 02, 2005 at 22:32 UTC
    Greetings,
    Well, if I am reading your question right, I would suggest using a hash for the locations array you mentioned. That way each location is a unique key in the hash and can be found a lot more easily than by grepping through an array. Now for the user information: I would store that under a "users" key in the location hash, either as a hash mapping login to name or as an array. This would allow you to add more categorical information to the location structure, maybe an IP address or something similar.
    so conceptually I would think something like this
    %locations = (
        "location1" => {
            "users" => {
                "Login1" => "Name1",
                "Login2" => "Name2",
                ...
            },
            "ip" => "127.0.0.1",
            "otherinfo" => [ "info", "info2", "infoblah"... ]
        },
        "location2" => {
            "users" => {
                "Login1" => "Name1",
                "Login2" => "Name2",
                ...
            },
            "ip" => "127.0.0.1",
            "otherinfo" => [ "info", "info2", "infoblah"... ]
        },
        ...
    )

    In order to do that you can take advantage of autovivification with regard to data structures.
    Here is a start.
    use strict;

    #input format for testing
    my $string = 'fooLogin,barName,bazLocation,Info,Info2,Infoblah';
    my %locations;

    #split it up on commas from our example.
    #you would need to decide what to do with the rest.
    my @temp_info = split /,/,$string;
    $locations{$temp_info[2]}->{users}->{$temp_info[0]} = $temp_info[1];
    $locations{$temp_info[2]}->{otherinfo} = [@temp_info[3..$#temp_info]];
    First off though I would read perldsc and perlref. Oh and you will want to use Dumpvalue or Data::Dumper to check your work.
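To make the autovivification point concrete, here is a small hedged sketch that assigns through a nested path that does not exist yet and then checks the result with Data::Dumper. The keys (`West`, `fooLogin`, and so on) are placeholder values in the spirit of the example, not real data.

```perl
use strict;
use warnings;
use Data::Dumper;

my %locations;

# Nothing exists under 'West' yet; assigning through the nested path
# autovivifies the intermediate hashes and the array.
$locations{West}{users}{fooLogin} = 'barName';
push @{ $locations{West}{otherinfo} }, 'Info', 'Info2';

# Testing only the top-level key does not create anything.
print "East not created\n" unless exists $locations{East};

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump
print Dumper(\%locations);
```

Note that testing a nested key such as `exists $locations{East}{users}` would itself autovivify `$locations{East}`, which is why the check above stops at the top level.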
    Good luck and I hope that helped.
    -InjunJoel
    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
Re: Categorization Problem
by kayak9630 (Acolyte) on Nov 03, 2005 at 17:26 UTC
    OK - you guys have hit what I need 100% on target.

    What I have to output is:
    location,login1,login2,login3
    location2,login1,login2,login3,login4, ...

    If my data structure is:
    $table = {
        'West' => [login1,login2,login3],
        'East' => [login1,login2,login3],
        'North' => [login1,login2,login3,login4],
        'South' => [login1,login2,login3]
    }

    Then how do I iterate through to get the data into:
    West,login1,login2,login3
    East,login1,login2,login3
    ...
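One possible way to do this, sketched under the assumption that `$table` is a reference to a hash of array references as described: join the location and its logins with commas, one line per key. `csv_lines` is a hypothetical helper name. A Perl hash has no inherent order, so this sketch sorts the locations alphabetically; a specific order such as West, East, North, South would need an explicit list of keys.

```perl
use strict;
use warnings;

# Hypothetical helper: one "location,login,login,..." string per location,
# with locations in sorted order.
sub csv_lines {
    my ($table) = @_;
    return map { join(',', $_, @{ $table->{$_} }) } sort keys %$table;
}

my $table = {
    West => [qw(login1 login2 login3)],
    East => [qw(login1 login2 login3)],
};

print "$_\n" for csv_lines($table);
```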