Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Mutually Exclusive Elements

by Anonymous Monk
on Dec 02, 2004 at 22:27 UTC ( #411973=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have few arrays with some common elements. I like to get minimum 'X' items from each array such that they represent maximum mutually exclusive elements if possible, to show the uniqness of the array. The top 'X' elements should be in the order of insertion if possible. All these arrays will have atleast one common element, which should be there in the results. My idea is to have 'elements' that I can use 'identity' or original array.

So

If $a = [1,2,3,4,5,6] $b = [3,4,5,7,8] $c = [3,6,8,9] and X = 3 then $a_ = [1,2,3] $b_ = [3,7,8] $c_ = [3,6,9]
What is the good algorithm or modules I should use for the same ?
Thanks

Replies are listed 'Best First'.
Re: Mutually Exclusive Elements
by NetWallah (Canon) on Dec 02, 2004 at 23:58 UTC
    Here is an outline of an algorithm you can adapt:
    my %h; my $n=3; sub add{ my ($n,$v,@a)=@_; (%h < $n and not exists $h{$_} and $h{$_}=$v ) for @a }; add($n, # Allow $n unique elements 1, # This is the First array to be added # Numbers below are the contens of the array to be added 3,4,5,6,3,2,4,1); # First set of numbers $n+=$n; # Now increase the number of elements allowed add($n,2,4,3,2,7,8,2); # Second set of numbers print qq($_ => $h{$_} \n) for sort keys %h --OUTPUT--- 2 => 2 3 => 1 4 => 1 5 => 1 7 => 2 8 => 2
    The VALUES of the hash indicate the source array it came from.

    Yes - this can be optimized, and use of the global %h can be avoided - enhancements left as an exercise.

        ...each is assigned his own private delusion but he cannot see the baggage on his own back.

Re: Mutually Exclusive Elements
by Zed_Lopez (Chaplain) on Dec 03, 2004 at 00:17 UTC

    I'm not sure about your requirements. I wrote a solution that produces

    [1,2,3] [3,4,7] [3,6,9]

    This seems more consistent with your stated requirements than your example results. 4 & 8 are equally common -- they appear twice in the arrays. 4 occurs before 8 in $b, thus the elements appear in $b's order of insertion.

    At any rate, my algorithm was:

    1. Traverse all the arrays, counting how many of each element appear.
    2. Sort the results in order of ascending frequency and assign it to @frequency.
    3. Take the last of these (i.e. most frequent, or tied for most frequent, and thus common to all) and assign it to $common.
    4. Die with an error message if $common < the number of arrays -- there wasn't really an element common to all.
    5. Start building the results for each array:
      1. First, assign $common.
      2. Then, for each element in @frequency, test if that element is present in the current array. If it is, assign it to the results for this array. Do this X-1 times (assigning $common took care of one element.)
      3. Sort this array's results.
    6. Push an arrayref for this array's results on your final results array.

    Updated: my unstated assumption in the above was that that code would be in a subroutine to be called like so:

    my ($a_, $b_, $c_) = mutually_exclusive($X, $a, $b, $c);
Re: Mutually Exclusive Elements
by osunderdog (Deacon) on Dec 03, 2004 at 00:07 UTC

    Sounds like set arithmetic to me.. I would recommend Set::Scalar

    I threw this together as an example. Course order is wacked when you use sets...

    use strict; use Set::Scalar; my $a = [1,2,3,4,5,6]; my $b = [3,4,5,7,8]; my $c = [3,6,8,9]; my $sa = Set::Scalar->new(@$a); my $sb = Set::Scalar->new(@$b); my $sc = Set::Scalar->new(@$c); my $x = 3; my $mina = $sa - $sb - $sc + ($sa * $x); my $minb = $sb - $sa + ($sb * $x); my $minc = $sc - $sb + ($sc * $x); print("Minimum Set A:" . join(',', $mina->members) . "\n"); print("Minimum Set B:" . join(',', $minb->members) . "\n"); print("Minimum Set C:" . join(',', $minc->members) . "\n"); __END__ Minimum Set A:1,3,2 Minimum Set B:8,3,7 Minimum Set C:6,3,9

    "Look, Shiny Things!" is not a better business strategy than compatibility and reuse.


    OSUnderdog

      Unless I completely misunderstand the requirements, it's only by coincidence you're getting the same results. $x is the number of elements to return, which happens to be equal to the value of the element which is common to all in the given example. And it's a coincidence that you get two elements after your set differences such that after your union, you're up to 3 elements.

      With a $x that was different from the value of the common element, or more elements in common such that the set differences didn't have two elements, your results would be very different.

      Why did you do $sa - $sb - $sc for the first and not $sb - $sa - $sc and $sc - $sa - $sb for the second and third?

Re: Mutually Exclusive Elements
by TedPride (Priest) on Dec 03, 2004 at 00:13 UTC
    Not sure exactly what you're trying to do here, but I've assumed that you want an output of 3 values for each array, the first two being the most unique two values and equal uniqueness being decided by array position, and the third being the first value common to all arrays. I know this doesn't yield the same output as what you have, but I wasn't sure what you wanted:
    use strict; use warnings; my (@sets, $n, %c, @u, @a); while (<DATA>) { chomp; push @sets, [split /,/]; } for (@sets) { $c{$_}++ for (@$_); } $n = $#sets + 1; for (@sets) { print '[' . join(',', @$_) . '] = ['; @u = (); @a = (); for (@$_) { if ($c{$_} != $n) { push(@u, [$_, $c{$_}]); } else { push (@a, $_); } } for ((sort {@$a[1] cmp @$b[1]} @u)[0..1]) { print @$_[0] . ','; } print $a[0] . "]\n"; } __DATA__ 1,2,3,4,5,6 3,4,5,7,8 3,6,8,9
    The key is the counts:
    for (@sets) { $c{$_}++ for (@$_); }
    This gives you the total number of times each value is used over the entire nested array. If the number is equal to the number of arrays, then the value is common to all arrays; if the number is 1, then the value is unique to whichever array contains it. Numbers in between can be used to fill in if there aren't enough 1's, as shown above.

    I've assumed that there are always at least two values in each array not common to all arrays. I've also assumed that the arrays can contain non-numerical values, which is why I'm using cmp for the sort instead of <=>.

Re: Mutually Exclusive Elements
by dragonchild (Archbishop) on Dec 03, 2004 at 14:29 UTC
    Your requirements make no sense, from a mathematical perspective. You're looking for "mutually exclusive elements" ... what does that mean for you? My reading of it would say that you don't want the common element of '3' ... but, you say you want it.

    Another help would be for you to describe how you got your results by hand ... remember, a computer program is nothing more than the encapsulation of what someone else would have done by hand.

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      Having single common element is my requirement. You can take that out and calculate 'maximum possible mututally exclusive elments' with minimum number of items in the list as X.

      Assume that I have 1000 of such lists. Each list has upto 100 items. I am searching for a particular element. So I like to get all lists which has 'that' element'. Now I don't want to display entire list. I like to list top 10 elements from each list which can represent the relative uniqueness of that particular list compared to others.

        If you're searching for which lists have a certain item, you will want to use a hash for each list. If you don't want to display the whole list, maybe display the name of the list and an option to click through to see the whole list - kind of a summary vs. detail approach.

        Either way, it seems like a lot of work to calculate a rather complicated set-theory value, just for display purposes.

        Being right, does not endow the right to be rude; politeness costs nothing.
        Being unknowing, is not the same as being stupid.
        Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
        Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://411973]
Approved by kvale
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (3)
As of 2023-10-02 15:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?