lampros21_7 has asked for the wisdom of the Perl Monks concerning the following question:

Hi to all the monks. I have two arrays, and each element is a string. Each string might exist more than once in each array. What I want is the number of elements that exist in both arrays. The catch is that if a string exists more than once in each array, only the smaller count should be added. So if a string appears twice in one array and 5 times in the other, add 2. I tried the following, but it does not handle this case: with the example above it would add 5.
%string_1 = map { $_ => 1 } @string1;
@intersection_of_the_2_arrays = grep( $string_1{$_}, @string2 );
$number_of_similar_words = scalar(@intersection_of_the_2_arrays);

Thanks for viewing and for anyone who can help.
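
For readers following along, here is a minimal reproduction of the over-count described in the question. The array contents are made up for illustration (they are not from the original post): 'foo' appears twice in @string1 and five times in @string2, so the wanted answer would be 2.

```perl
use strict;
use warnings;

# Hypothetical data, chosen to match the "twice vs. 5 times" example.
my @string1 = qw(foo foo bar);
my @string2 = qw(foo foo foo foo foo baz);

# Same approach as in the question: a lookup hash built from @string1 ...
my %string_1 = map { $_ => 1 } @string1;

# ... and a grep that keeps EVERY element of @string2 found in that hash,
# so repeated strings in @string2 are all counted.
my @intersection_of_the_2_arrays = grep { $string_1{$_} } @string2;
my $number_of_similar_words = scalar @intersection_of_the_2_arrays;

print "$number_of_similar_words\n";    # prints 5, not the wanted 2
```

The lookup hash only records whether a string occurs in @string1, not how often, which is why the count follows @string2 alone.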

Replies are listed 'Best First'.
Re: Comparing two arrays
by NetWallah (Canon) on Feb 25, 2006 at 17:48 UTC
    Your grep statement needs to first filter out duplicates in @string2 before assigning to @intersection_of_the_2_arrays.

    You could do this by using a statement similar to your first map.
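
One possible reading of this suggestion, as a rough sketch: remove duplicates from @string2 with a hash before running the grep. The data and the names %seen, @unique_string2 and $count are assumptions for illustration. Note that this counts each shared string once, so it gives the number of distinct common strings rather than the smaller of the two repeat counts.

```perl
use strict;
use warnings;

# Hypothetical data, not from the thread.
my @string1 = qw(foo foo bar);
my @string2 = qw(foo foo foo foo foo bar baz);

my %string_1 = map { $_ => 1 } @string1;

# Keep only the first occurrence of each string in @string2.
my %seen;
my @unique_string2 = grep { !$seen{$_}++ } @string2;

# Now each shared string can match at most once.
my @intersection_of_the_2_arrays = grep { $string_1{$_} } @unique_string2;
my $count = scalar @intersection_of_the_2_arrays;

print "$count\n";    # prints 2: 'foo' and 'bar' each counted once
```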

         "For every complex problem, there is a simple answer ... and it is wrong." --H.L. Mencken

Re: Comparing two arrays
by sk (Curate) on Feb 25, 2006 at 17:50 UTC
    Something like this? -

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @x = qw (hi there hello there hello world);
    my @y = qw (hi there hello new element there there);
    my (%xh, %yh) = ();

    # count the number of repeats in each array
    $xh{$_}++ for (@x);
    $yh{$_}++ for (@y);

    my $common = 0;

    # for all keys in the hash check if it exists in the other hash.
    # if it does then choose the min of the two for number of repeats.
    for (keys %xh) {
        $common += min($xh{$_}, $yh{$_}) if (exists($yh{$_}));
    }

    print "Number common = $common\n";

    sub min {
        return (($_[0] <= $_[1]) ? $_[0] : $_[1]);
    }

    Output

    Number common = 4

    In the above lists the common strings are hi (1), there (2) and hello (1), which adds up to 4. Note: I have not tested it extensively.
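
The same counting idea can be packaged as a reusable subroutine. This is only a sketch: the name count_common is hypothetical, and it swaps the hand-written min for the min function from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(min);

# count_common is a hypothetical helper name, not from the thread.
# It takes two array references and returns, summed over every string
# present in both arrays, the smaller of the two repeat counts.
sub count_common {
    my ($first, $second) = @_;

    my (%count1, %count2);
    $count1{$_}++ for @$first;
    $count2{$_}++ for @$second;

    my $common = 0;
    for my $word (keys %count1) {
        next unless exists $count2{$word};
        $common += min($count1{$word}, $count2{$word});
    }
    return $common;
}

my @x = qw(hi there hello there hello world);
my @y = qw(hi there hello new element there there);
print "Number common = ", count_common(\@x, \@y), "\n";    # Number common = 4
```

Passing references keeps the two lists separate inside the subroutine; passing the arrays directly would flatten them into one argument list.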

    cheers

    SK

Re: Comparing two arrays
by leocharre (Priest) on Feb 25, 2006 at 19:38 UTC

    This was freaking weird, at least for me. Sorry, maybe it's my terrible hangover. But here goes:

    #!/usr/bin/perl
    use strict;

    my @a1 = qw(yo and this too too too a b c d d d d d);
    my @a2 = qw(yo and too too x f 3 d d moon);

    my %a1 = ();
    my %a2 = ();

    for (@a1) {
        $a1{$_}++;
        #$found{$_}++;
    }

    for (@a2) {
        $a2{$_}++;
        #$found{$_}++;
    }

    for (keys %a1) {
        $a2{$_} or delete $a1{$_};
    }

    for (keys %a2) {
        $a1{$_} or delete $a2{$_};
    }

    # now a1 and a2 both contain the same keys, but possibly diff values

    my %highest = ();

    for (keys %a1) {    # or a2, same thing
        if ($a1{$_} > $a2{$_}) {
            $highest{$_} = $a1{$_};
        }
        else {
            $highest{$_} = $a2{$_};    # also if they are the same, none higher than the other
        }
    }

    for (keys %highest) {
        print "$_:$highest{$_}\n";
    }

    output is:

    yo:1
    d:5
    and:1
    too:3
    

    hope that helps or amuses
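
The code above keeps the higher of the two counts, while the original question asks for the smaller one. A hedged variation on the same data, keeping the lower count per string, might look like the following. The names %lowest and $total are assumptions for illustration, and the keys are sorted only to make the output order predictable.

```perl
use strict;
use warnings;

my @a1 = qw(yo and this too too too a b c d d d d d);
my @a2 = qw(yo and too too x f 3 d d moon);

# Count how often each string occurs in each array.
my (%a1, %a2);
$a1{$_}++ for @a1;
$a2{$_}++ for @a2;

# %lowest (hypothetical name) maps each string found in both arrays
# to the smaller of its two counts.
my %lowest;
for my $word (keys %a1) {
    next unless $a2{$word};    # skip strings missing from @a2
    $lowest{$word} = $a1{$word} < $a2{$word} ? $a1{$word} : $a2{$word};
}

my $total = 0;
for my $word (sort keys %lowest) {
    print "$word:$lowest{$word}\n";
    $total += $lowest{$word};
}
print "total:$total\n";
```

With this data it prints and:1, d:2, too:2, yo:1 and then total:6.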

Re: Comparing two arrays
by ayrnieu (Beadle) on Feb 25, 2006 at 22:23 UTC
    package Frobcounting;

    # count the occurrences of each element of one array
    sub new {
        my ($class, $a) = @_;
        my $h = {};
        $h->{$_}++ for @$a;
        bless $h, $class;
    }

    # hold the counts for two arrays side by side
    sub new_d {
        my ($class, $a, $b) = @_;
        my $h = { a => Frobcounting->new($a), b => Frobcounting->new($b) };
        bless $h, $class;
    }

    # count for one element; 0 if the element never occurred
    sub frobcount { $_[0]->{$_[1]} || 0 }

    # the smaller of the two counts for one element
    sub frobcount_d {
        my ($fc_d, $e) = @_;
        my $ea = $fc_d->{a}->frobcount($e);
        my $eb = $fc_d->{b}->frobcount($e);
        $ea < $eb ? $ea : $eb
    }

    # the shorter of the two key lists
    sub smaller_d {
        my ($fc_d) = @_;
        my @ka = keys %{$fc_d->{a}};
        my @kb = keys %{$fc_d->{b}};
        @ka < @kb ? @ka : @kb
    }

    package main;

    my $fc = Frobcounting->new_d(\@string1, \@string2);
    for (sort $fc->smaller_d) {
        print "$_: ${\$fc->frobcount_d($_)}\n"
    }
    All of that is for your specific problem. As for comparing two arrays in general, and getting such things as 'the number of elements that exist in both arrays', have you looked at List::Compare?