Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Each element of array1 and array2 holds 3 comma-separated fields. The arrays act as tables in a database with a one-to-many relationship: the values in array1 are unique and array2 contains duplicates, so for every $id in array1 there can be multiple matching $id's in array2.
Now that I've got the arrays defined, I can get the values by doing

foreach (@array1) { ($id,$name,$ref) = split( /,/, $_); }

and
foreach (@array2) { ($id,$name,$ref) = split( /,/, $_); }

I want to print out $id once from array1 and the number of times $id appears in array2. Once I do this I can massage the data as needed.

Any kind advice or direction is welcomed.

Re: @array1 vs @array2
by mikfire (Deacon) on Mar 19, 2001 at 22:11 UTC
    To be honest with you, I would be reaching for some hashes at this point. Given the relationship you have described, I would create a hash that looks kinda like this:
    # Sorry about the MANY keyword - I would choose a better name
    # for the secondary data, but I have no clue what would make
    # sense.
    my %keyhash = (
        $id => { NAME => $name, REF => $ref, MANY => [] }
    );
    To populate it, I would do something like this:
    for ( @array1 ) {
        my ($id,$name,$ref) = split /,/;
        # You may wish to add some error checking to make sure the
        # hash key $id does not already exist
        $keyhash{$id} = {
            NAME => $name,
            REF  => $ref,
            MANY => [],
        };
    }

    for ( @array2 ) {
        my ($id,$name,$ref) = split /,/;
        # Warn and do nothing if a record is found for which the
        # $id is not already in %keyhash
        unless ( defined( $keyhash{$id} ) ) {
            warn "No such record $id!\n";
            next;
        }
        push @{$keyhash{$id}{MANY}}, [ $name, $ref ];
    }
    Although you may want MANY to be a hash - it really depends on how you want to use your data later.

    Finally, to extract the number of records for each ID,

    # A little something to get the plurality correct
    # (an id with no matching records prints "0 times")
    for ( keys %keyhash ) {
        my $num = @{$keyhash{$_}{MANY}};
        printf "%s appeared %d %s\n", $_, $num, $num == 1 ? "time" : "times";
    }

    I will say my choice of data structures ( hashes instead of arrays ) is really dependent on how you intend to use the data later. Hashes, for me, seem to reflect the relationship between tables better than arrays. YMMV, of course.
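
For anyone who wants to run the whole thing end to end, here is a minimal self-contained sketch of the structure described above. The sample data and the sub name build_keyhash are made up for illustration; only %keyhash and the NAME, REF and MANY keys come from the post.

```perl
use strict;
use warnings;

# Sample data is made up for illustration; each line is "id,name,ref".
my @array1 = ( "1,Alice,a1", "2,Bob,b1" );
my @array2 = ( "1,doc one,x", "2,doc two,y", "2,doc three,z" );

# Build the hash of hashes: one entry per id from the first list, with
# the matching records from the second list collected under MANY.
# build_keyhash is a hypothetical helper name, not from the post.
sub build_keyhash {
    my ($one, $many) = @_;
    my %keyhash;
    for my $line (@$one) {
        my ($id, $name, $ref) = split /,/, $line;
        $keyhash{$id} = { NAME => $name, REF => $ref, MANY => [] };
    }
    for my $line (@$many) {
        my ($id, $name, $ref) = split /,/, $line;
        next unless exists $keyhash{$id};    # skip records with no parent id
        push @{ $keyhash{$id}{MANY} }, [ $name, $ref ];
    }
    return \%keyhash;
}

my $keyhash = build_keyhash( \@array1, \@array2 );
for my $id ( sort keys %$keyhash ) {
    my $num = @{ $keyhash->{$id}{MANY} };
    printf "%s appeared %d %s\n", $id, $num, $num == 1 ? "time" : "times";
}
```

Returning a reference keeps the builder reusable; if the lines come straight from a file, a chomp before the split would keep the trailing newline out of the last field.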

    Updated 14:04 It was pointed out I had dropped an equals sign. Sigh.

    mikfire

      You rock!!!! Finally we've done it. Many many thanks to you and all who posted on this topic. I'm sure I'll be asking more questions. :)
      Using the hash above, how would I print $name instead of $id? What exactly does the "MANY" keyword refer to? Many thanks for helping.
        I am storing the contents of the second file under the MANY key in the hash. Explore perldoc perldsc for further understanding of the structure I built. It isn't a proper keyword, but, well, it was a word I used as a key.

        You can print the name out by changing the $_ in the printf to $keyhash{$_}{NAME}
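
A small sketch of that change, assuming a structure shaped like the %keyhash above. The sample entries and the helper name report_lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical two-entry structure in the same shape as %keyhash.
my %keyhash = (
    1 => { NAME => "Alice", REF => "a1", MANY => [ [ "doc one", "x" ] ] },
    2 => { NAME => "Bob",   REF => "b1", MANY => [] },
);

# Return one report line per id, showing NAME rather than the id itself.
sub report_lines {
    my ($h) = @_;
    return map {
        my $num = @{ $h->{$_}{MANY} };
        sprintf "%s appeared %d %s", $h->{$_}{NAME}, $num,
            $num == 1 ? "time" : "times";
    } sort keys %$h;
}

print "$_\n" for report_lines( \%keyhash );
```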

        mikfire

      Keep in mind that I'm new to hashes. I'm getting this error when running the above code.

      syntax error at ./file.pl line 15, near "%keyhash ("

      Any suggestions?

      Thanks for all the help, I really appreciate it.
Re: @array1 vs @array2
by davorg (Chancellor) on Mar 19, 2001 at 21:59 UTC

    Use a hash to count the number of occurrences of each id in @array2, like this:

    my %count;
    foreach (@array2) {
        my ($id) = split /,/;
        $count{$id}++;
    }

    # Loop over @array1 so that each id is printed only once
    foreach (sort @array1) {
        my ($id) = split /,/;
        print "$id: $count{$id}\n";
    }

    If some of the ids in @array1 don't appear in @array2 then you'll need to account for that in the second loop.
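
One way to account for that, as a rough sketch: default the count to 0 when an id from the first list never shows up in the second. The sub name count_per_id and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Count ids in the "many" list, then report every id from the "one"
# list, defaulting to 0 for ids that never show up.
# count_per_id is a hypothetical helper name.
sub count_per_id {
    my ($one, $many) = @_;
    my %count;
    for my $line (@$many) {
        my ($id) = split /,/, $line;
        $count{$id}++;
    }
    my @report;
    for my $line (@$one) {
        my ($id) = split /,/, $line;
        push @report, [ $id, $count{$id} || 0 ];
    }
    return @report;
}

my @array1 = ( "1,Alice,a1", "2,Bob,b1", "3,Carol,c1" );    # made-up data
my @array2 = ( "1,doc one,x", "2,doc two,y", "2,doc three,z" );
printf "%s: %d\n", @$_ for count_per_id( \@array1, \@array2 );
```

The `|| 0` avoids an "uninitialized value" warning for ids such as 3 here, which have no entry in the count hash.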

    But, all in all, I think you should probably rethink your data structure.

    --
    <http://www.dave.org.uk>

    "Perl makes the fun jobs fun
    and the boring jobs bearable" - me

Re (tilly) 1: @array1 vs @array2
by tilly (Archbishop) on Mar 19, 2001 at 22:01 UTC
    While this is doable I would suggest playing around with DBI and DBD::RAM or else learning references and using arrays of hashes.
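
A minimal sketch of the array-of-hashes idea, assuming the same "id,name,ref" lines as in the question. The sub name parse_records, the ID/NAME/REF keys and the sample data are illustrative choices, not anything prescribed above.

```perl
use strict;
use warnings;

# Turn "id,name,ref" lines into a list of hash references, so fields
# can be looked up by name instead of by position.
# parse_records is a hypothetical helper name.
sub parse_records {
    my @records;
    for my $line (@_) {
        my ($id, $name, $ref) = split /,/, $line;
        push @records, { ID => $id, NAME => $name, REF => $ref };
    }
    return @records;
}

my @docs = parse_records( "1,doc one,x", "2,doc two,y", "2,doc three,z" );

# grep in scalar context returns the number of matching records.
my $for_two = grep { $_->{ID} eq "2" } @docs;
print "id 2 has $for_two docs\n";
```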
Re: @array1 vs @array2
by bjelli (Pilgrim) on Mar 19, 2001 at 21:52 UTC

    OK, here's some example-arrays as well.

    @array1 = ( "1,Larry Wall" , "2,Douglas Adams");
    @array2 = ( "1,Programming Perl,1",
                "2,The Hitchhikers Guide to the Galaxy,2",
                "2,Mostly Harmless,2");

    foreach (@array1) {
        ($id,$name) = split( /,/, $_);
        print "The author is $name";
        # grep in scalar context counts the matching elements
        $no = grep { m/,$id$/ } @array2;
        print "We have $no books by this Author\n";
    }
    --
    Brigitte    'I never met a chocolate I didnt like'    Jellinek
    http://www.horus.com/~bjelli/         http://perlwelt.horus.at
      I tried the first example and it only printed the customer name with a count of 0.
      open(FILE, "data.txt");
      open(FILE2, "info.csv");
      @data = <FILE>;
      @data2 = <FILE2>;

      ## TEST CODE
      foreach (@data) {
          ($id,$name,$ref) = split( /,/, $_);
          print "The customer is $name";
          $no = grep { m/,$id$/ } @data2;
          print "We have $no docs by this $id\n";
      }
      close(FILE);
      close(FILE2);
      exit;
      ## END TEST CODE

      Am I not thinking clearly (as I usually don't) on this or what?
      Here's the output:
      The customer is Customer1 coWe have 0 docs by this 1
      The customer is Customer2 coWe have 0 docs by this 2
      The customer is Customer3 coWe have 0 docs by this 3
        Remove the comma from the regex.

        But really, do not do it this way. Given that you are processing the entire contents of @array2 for each element in @array1, you have suddenly created an O(n^2) algorithm. This will hurt very quickly if either @array1 or @array2 gets very large. Especially since I would expect @array2 >> @array1.

        Try one of the two hash solutions presented ( either one does basically the same thing ).
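
To make the cost difference concrete, here is a rough sketch that counts how many records each approach has to look at. The sub names count_nested and count_one_pass and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Nested scan: every id is compared against every record, so the work
# grows as (number of ids) * (number of records).
sub count_nested {
    my ($ids, $many) = @_;
    my %count;
    my $steps = 0;
    for my $id (@$ids) {
        $count{$id} = 0;
        for my $line (@$many) {
            $steps++;
            my ($line_id) = split /,/, $line;
            $count{$id}++ if $line_id eq $id;
        }
    }
    return ( \%count, $steps );
}

# Single pass with a hash: each record is looked at exactly once.
sub count_one_pass {
    my ($ids, $many) = @_;
    my %count;
    $count{$_} = 0 for @$ids;
    my $steps = 0;
    for my $line (@$many) {
        $steps++;
        my ($line_id) = split /,/, $line;
        $count{$line_id}++ if exists $count{$line_id};
    }
    return ( \%count, $steps );
}

my @ids  = ( 1, 2 );                                         # made-up data
my @many = ( "1,doc one,x", "2,doc two,y", "2,doc three,z" );
my ( undef, $nested_steps ) = count_nested( \@ids, \@many );
my ( undef, $single_steps ) = count_one_pass( \@ids, \@many );
print "nested: $nested_steps steps, one pass: $single_steps steps\n";
```

With 2 ids and 3 records the gap is small (6 steps against 3), but it widens quickly as either list grows.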

        mikfire