ambrus has asked for the wisdom of the Perl Monks concerning the following question:

Here's a question about a non-perl homework.

I have a set of objects, say in the array @sg. These objects have a certain property which is typically different for most objects, but sometimes it's equal for two or more of them. I'd like to find and output these sets of objects (in any reasonable format).

We compute the property of each object like this: @sgprop = map { property($_) } @sg;. (This is the most time-consuming part of the task btw, but it's not our task here to improve it.) However, two equal properties can have different representations.

You can do two things with these properties. Firstly, you can create a fingerprint of the properties @sgpropfp = map { fingerprint($_) } @sg;. These fingerprints are plain array that you can easily compare lexicographically. Assume that the function for three-way comparing these arrays is available: fp_compare($sgpropfp[$i], $sgpropfp[$j]) returns -1, 0, +1 for less, equal, and greater resp. For equal properties, this fingerprint will always be the same, but it may accidentally match for different properties.

Secondly, there's an equivalence relation that compares two properties for equality value: prop_matching($sgprop[$i], $sgprop[$j]) returns true for equal properties and false for unequal, but can't tell the order of unequal properties.

So now I have to write code that outputs those sets of objects from @sg that have matching property value.

The code I wrote works like this: it sorts the stuff by fingerprints, then compares each two adjacent properties in this order, and if they are equal, the two objects are output. For example (this code is untested because the real code is not in perl):

@order = sort { fp_compare($sgpropfp[$a], $sgpropfp[$b]) } 0 .. @sg - +1; for (0 .. @sg - 2) { if (prop_matching($sgprop[$order[$_]], $sgprop[$order[$_ + 1]] +)) { print "property of $sg[$order[$_]] and $sg[$order[$_ + + 1]] is equal"; } }

This solution has two problems. Firstly, if the properties of three objects have equal fingerprints but the property of the second one really is different, then this program will not detect that the properties of the first and the third are equal. Secondly, it doesn't handle the case when the property of more than two objects are the same.

I'd like to fix these two problems. I think I started the right way, that is, I have to categorize the objects by their property fingerprints. However, I don't know what exactly to do after that.

I'll be glad for any help. Thanks.

Oh, and I need this by yesterday.

Replies are listed 'Best First'.
Re: Equivalence classes from equivalence relation and fingerprints
by ambrus (Abbot) on Jan 17, 2007 at 21:19 UTC

    Here's tye's solution. It took some time for me to get it through the chatterbox, even though the idea isn't really complicated. It uses a different interface then I do in the parent post.

    my %hash; OBJ: for my $obj ( @objects ) { my $fingerprint = $fingerprint{$obj}; my $class = \@{$hash{$fingerprint}} for( @$class ) { if( same($obj,$_->[0] ) { push @$_, $obj; next OBJ: } } push @$class, [ $obj ] }
Re: Equivalence classes from equivalence relation and fingerprints
by merlyn (Sage) on Jan 17, 2007 at 20:05 UTC
    If the "plain array" can be serialized consistently, just bucket them into a hash keyed by the serialized version. When you're done, anything in the same bucket is equal!
Re: Equivalence classes from equivalence relation and fingerprints
by kyle (Abbot) on Jan 17, 2007 at 20:29 UTC

    UPDATE: I like tye's solution better than mine.

    I think you're most of the way there. I haven't written the code, but the procedure would go like this:

    1. Get a list of your objects, sorted by fingerprint (not just a list of indexes).
    2. Loop over the list with something like:
      my $run_start = 0; foreach my $index ( 0 .. @sorted - 2 ) { if ( ! prop_matching( $sorted[$index], $sorted[$index+1] ) ) { output_sets( $run_start, $index ); $run_start = $index + 1; } }
    3. The sub output_sets starts with return if ( $run_start == $index );
    4. It then loops through the items in the run, using prop_matching to group them. Just take the first one, loop through and print everything that matches, remove those from the list, and repeat. You want to avoid printing the ones that match nothing.

      This solves only one of my two problems: the runs of matching properties longer than two objects.

      The other problem is when two objects of matching property are not adjacent even after the sorting because they are separated by an object with different property but accidentally matching fingerprint.

        The other problem is when two objects of matching property are not adjacent even after the sorting because they are separated by an object with different property but accidentally matching fingerprint.

        I think it does that too, actually. It's not as efficient as tye's solution, but it should work. I probably didn't describe it well enough, though, so let me have another crack at it.

        1. The (unwritten) output_sets function accepts a range that specifies a group of objects that have matching fingerprints.
        2. Turn that range into a list (say @fpmatches).
        3. shift the first item off @fpmatches and compare it to every other item in @fpmatches (with prop_matching). This should find every object with matching properties, regardless of how they sorted.
        4. The ones that match, output as a set, and remove from the @fpmatches list.
        5. Go back and shift off another object to compare, until the list is empty.
Re: Equivalence classes from equivalence relation and fingerprints
by OfficeLinebacker (Chaplain) on Jan 18, 2007 at 02:51 UTC
    Is this a situation where FreezeThaw might help?

    I like computer programming because it's like Legos for the mind.
Re: Equivalence classes from equivalence relation and fingerprints
by Moron (Curate) on Jan 18, 2007 at 13:10 UTC
    It seems uncertain that a solution in Perl would necessarily assist homework in random-other-language. But if it really were Perl, the first thing that would pop into my mind would be grep, e.g.:
    my @matching = grep property( $_ ), @sg;

    -M

    Free your mind