Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Each record in array1 and array2 has 3 elements. The arrays act as tables in a database with a one-to-many relationship: the values in array1 are unique and the values in array2 are duplicated, so for every $id in array1 there are multiple matching $ids in array2.
Now that I've got the arrays defined, I can get the values by doing

foreach (@array1) { ($id,$name,$ref) = split( /,/, $_); }

and
foreach (@array2) { ($id,$name,$ref) = split( /,/, $_); }

I want to print out $id once from array1 and the number of times $id appears in array2. Once I do this I can massage the data as needed.

Any kind advice or direction is welcomed.

Re: @array1 vs @array2
by mikfire (Deacon) on Mar 19, 2001 at 22:11 UTC
    To be honest with you, I would be reaching for some hashes at this point. Given the relationship you have described, I would create a hash that looks kinda like this:
        # Sorry about the MANY keyword - I would choose a better name
        # for the secondary data, but I have no clue what would make
        # sense.
        my %keyhash = (
            $id => { NAME => $name, REF => $ref, MANY => [] },
        );
    To populate it, I would do something like this:
        for ( @array1 ) {
            my ($id,$name,$ref) = split /,/;

            # You may wish to add some error checking to make sure the
            # hash key $id does not already exist
            $keyhash{$id} = {
                NAME => $name,
                REF  => $ref,
                MANY => [],
            };
        }

        for ( @array2 ) {
            my ($id,$name,$ref) = split /,/;

            # Warn and do nothing if a record is found for which the
            # $id is not already in %keyhash
            unless ( defined( $keyhash{$id} ) ) {
                warn "No such record $id!\n";
                next;
            }
            push @{$keyhash{$id}{MANY}}, [ $name, $ref ];
        }
    Although you may want MANY to be a hash - it really depends on how you want to use your data later.
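
    As a rough sketch of that alternative, MANY could hold a hash reference keyed by $ref instead of an array reference. The sample records below are made up, and the sketch assumes $ref is unique within one $id, which the original question does not say:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data: one "id,name,ref" string per record
    my @array1 = ( "1,Alice,a1", "2,Bob,b1" );
    my @array2 = ( "1,doc one,r1", "1,doc two,r2", "2,doc three,r3" );

    my %keyhash;
    for (@array1) {
        my ( $id, $name, $ref ) = split /,/;

        # MANY is a hash reference here rather than an array reference
        $keyhash{$id} = { NAME => $name, REF => $ref, MANY => {} };
    }

    for (@array2) {
        my ( $id, $name, $ref ) = split /,/;
        next unless exists $keyhash{$id};

        # Assumes $ref is unique within one $id; a repeated $ref overwrites
        $keyhash{$id}{MANY}{$ref} = $name;
    }

    # The count is the number of keys in the inner hash
    my $count_for_1 = keys %{ $keyhash{1}{MANY} };
    print "id 1 has $count_for_1 records\n";
    ```

    The trade-off is that a hash gives direct lookup by $ref but silently drops duplicates, while the array keeps every record in input order.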

    Finally, to extract the number of records for each ID,

        # A little something to get the plurality correct
        for ( keys %keyhash ) {
            my $num = @{$keyhash{$_}{MANY}};
            printf "%s appeared %d %s\n", $_, $num, $num == 1 ? "time" : "times";
        }

    I will say my choice of data structures ( hashes instead of arrays ) really depends on how you intend to use the data later. Hashes, for me, seem to reflect the relationship between tables better than arrays. YMMV, of course.

    Updated 14:04 It was pointed out I had dropped an equals sign. Sigh.

    mikfire

      Using the hash above, how would I print $name instead of $id? What exactly does the "MANY" keyword refer to? Many thanks for helping.
        I am storing the contents of the second file under the MANY key in the hash. Explore perldoc perldsc for further understanding of the structure I built. It isn't a proper keyword, but, well, it was a word I used as a key.

        You can print the name out by changing the $_ in the printf to $keyhash{$_}{NAME}
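
        A minimal sketch of that change, using a small hand-built %keyhash in the same shape as the one above. The customer and document names are invented for illustration, and the lines are collected in a hypothetical @report array only so they can be inspected:

        ```perl
        use strict;
        use warnings;

        # Hypothetical data in the NAME / REF / MANY layout described above
        my %keyhash = (
            1 => {
                NAME => 'Customer1',
                REF  => 'x',
                MANY => [ [ 'docA', 'r1' ], [ 'docB', 'r2' ] ],
            },
            2 => {
                NAME => 'Customer2',
                REF  => 'y',
                MANY => [ [ 'docC', 'r3' ] ],
            },
        );

        my @report;
        for my $id ( sort keys %keyhash ) {
            # An array in scalar context yields its element count
            my $num = @{ $keyhash{$id}{MANY} };

            # Look up NAME through the id instead of printing the id itself
            push @report, sprintf( "%s appeared %d %s",
                $keyhash{$id}{NAME}, $num, $num == 1 ? "time" : "times" );
        }
        print "$_\n" for @report;
        ```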

        mikfire

      Keep in mind that I'm new to hashes. I'm getting this error when running the above code.

      syntax error at ./file.pl line 15, near "%keyhash ("

      Any suggestions?

      Thanks for all the help, I really appreciate it.
      You rock!!!! Finally we've done it. Many many thanks to you and all who posted on this topic. I'm sure I'll be asking more questions. :)
Re: @array1 vs @array2
by davorg (Chancellor) on Mar 19, 2001 at 21:59 UTC

    Use a hash to count the number of occurrences of each id in @array2, like this:

        my %count;
        foreach (@array2) {
            my ($id) = split /,/;
            $count{$id}++;
        }

        # Walk @array1 so that each id is printed only once
        foreach (sort @array1) {
            my ($id) = split /,/;
            print "$id: $count{$id}\n";
        }

    If some of the ids in @array1 don't appear in @array2 then you'll need to account for that in the second loop.
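
    One way to account for that, sketched with made-up records in which id 2 has no entries in @array2. The @lines array is a hypothetical addition that just collects the output:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data; id 2 never appears in @array2
    my @array1 = ( "1,Customer1,a", "2,Customer2,b", "3,Customer3,c" );
    my @array2 = ( "1,doc1,x", "1,doc2,y", "3,doc3,z" );

    my %count;
    for (@array2) {
        my ($id) = split /,/;
        $count{$id}++;
    }

    my @lines;
    for (@array1) {
        my ($id) = split /,/;

        # Ids never seen in @array2 have no entry in %count; report them as 0
        my $n = exists $count{$id} ? $count{$id} : 0;
        push @lines, "$id: $n";
    }
    print "$_\n" for @lines;
    ```

    Without the check, interpolating a missing $count{$id} prints an empty string and, under warnings, an "uninitialized value" message.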

    But, all in all, I think you should probably rethink your data structure.

    --
    <http://www.dave.org.uk>

    "Perl makes the fun jobs fun
    and the boring jobs bearable" - me

Re (tilly) 1: @array1 vs @array2
by tilly (Archbishop) on Mar 19, 2001 at 22:01 UTC
    While this is doable I would suggest playing around with DBI and DBD::RAM or else learning references and using arrays of hashes.
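
    The arrays-of-hashes idea might look roughly like the sketch below. The field names id, name and ref and the sample records are assumptions taken from the question's split, not from this reply, and the DBI route is not shown:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data: one "id,name,ref" string per record
    my @array1 = ( "1,Customer1,a", "2,Customer2,b" );
    my @array2 = ( "1,doc1,x", "2,doc2,y", "2,doc3,z" );

    # Turn each string into a reference to a hash with named fields
    my @parents  = map { my %r; @r{qw(id name ref)} = split /,/; \%r } @array1;
    my @children = map { my %r; @r{qw(id name ref)} = split /,/; \%r } @array2;

    # Count child records per id in a single pass
    my %seen;
    $seen{ $_->{id} }++ for @children;

    for my $p (@parents) {
        my $n = $seen{ $p->{id} } || 0;
        print "$p->{name} ($p->{id}): $n\n";
    }
    ```

    Named fields make later code easier to read than positional variables, at the cost of one small hash per record.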
Re: @array1 vs @array2
by bjelli (Pilgrim) on Mar 19, 2001 at 21:52 UTC

    OK, here's some example-arrays as well.

        @array1 = ( "1,Larry Wall" , "2,Douglas Adams");
        @array2 = ( "1,Programming Perl,1",
                    "2,The Hitchhikers Guide to the Galaxy,2",
                    "2,Mostly Harmless,2");

        foreach (@array1) {
            ($id,$name) = split( /,/, $_);
            print "The author is $name";

            # grep in scalar context counts the entries ending in ",$id"
            $no = grep { m/,$id$/ } @array2;
            print "We have $no books by this Author\n";
        }
    --
    Brigitte    'I never met a chocolate I didnt like'    Jellinek
    http://www.horus.com/~bjelli/         http://perlwelt.horus.at
      I tried the first example and it only printed the customer name with a count of 0.
          open(FILE, "data.txt");
          open(FILE2, "info.csv");
          @data = <FILE>;
          @data2 = <FILE2>;

          ## TEST CODE
          foreach (@data) {
              ($id,$name,$ref) = split( /,/, $_);
              print "The customer is $name";
              $no = grep { m/,$id$/ } @data2;
              print "We have $no docs by this $id\n";
          }
          close(FILE);
          close(FILE2);
          exit;
          ## END TEST CODE

      Am I not thinking clearly(as I usually don't) on this or what?
      Here's the output:
      The customer is Customer1 coWe have 0 docs by this 1
      The customer is Customer2 coWe have 0 docs by this 2
      The customer is Customer3 coWe have 0 docs by this 3
        Remove the comma from the regex.

        But really, do not do it this way. Given that you are processing the entire contents of @array2 for each element in @array1, you have suddenly created an O(n^2) algorithm. This will hurt very quickly if either @array1 or @array2 gets very large. Especially since I would expect @array2 >> @array1.

        Try one of the two hash solutions presented ( either one does basically the same thing ).

        mikfire
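
        For comparison with the nested grep, here is a hedged sketch of the single-pass hash approach applied to lines as they would come from a file. The in-memory @data and @data2 stand in for data.txt and info.csv, their contents are invented, and the sketch assumes the id is the first field in both files:

        ```perl
        use strict;
        use warnings;

        # Hypothetical file contents; lines read with <FILE> keep their newline
        my @data  = ( "1,Customer1,co\n", "2,Customer2,co\n", "3,Customer3,co\n" );
        my @data2 = ( "1,docA,x\n", "1,docB,y\n", "3,docC,z\n" );

        # One pass over the "many" side: count records by their first field
        my %count;
        for my $line (@data2) {
            chomp( my $copy = $line );
            my ($id) = split /,/, $copy;
            $count{$id}++;
        }

        # One pass over the "one" side: a single hash lookup per record
        my @out;
        for my $line (@data) {
            chomp( my $copy = $line );
            my ( $id, $name ) = split /,/, $copy;
            push @out, sprintf( "The customer is %s. We have %d docs by this %s",
                $name, $count{$id} || 0, $id );
        }
        print "$_\n" for @out;
        ```

        Each array is scanned once, so the work grows with the combined size of the two arrays rather than with their product.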