Re: Partial sort for dependency graph
by Joost (Canon) on Nov 11, 2006 at 11:30 UTC
|
As far as I can tell, you're looking for "topological sort".
Here's what I used: it's not very efficient, but good enough for my purposes.
# $set is a hashref containing all the objects you're sorting
# $connections_to is a hashref containing $from => $to pairs
# (i.e. $to is dependent on $from)
sub toposort {
my ($set,$connections_to) = @_;
my @order;
while (%$set) {
my ($start_node) =
grep {!$connections_to->{$_} || !%{$connections_to->{$_}} } val
+ues %$set;
if (! $start_node ) {
return; # circular dependency found
}
push @order, $start_node;
delete $set->{$start_node};
delete $connections_to->{$_}->{$start_node} for keys %$connections
+_to;
}
return \@order;
}
| [reply] [d/l] |
Re: Partial sort for dependency graph
by tilly (Archbishop) on Nov 11, 2006 at 15:02 UTC
|
Straightforwardness in code is a virtue.
While you can shoehorn sort into doing this, it will not be as clear as a more straightforward algorithm. Furthermore should your requirements change (eg some dataset unexpectedly has a circular dependency and your code needs to detect that), it will be easier to make that change with a more straightforward solution. Something like this:
sub rank_objects {
my @objects = @_;
my $n = 0;
my @next_objects = grep {not $_->depends_on} @objects;
my @current_objects;
while (@next_objects and $n < @objects) {
$n++;
@current_objects = @next_objects;
@next_objects = ();
for my $object (@current_objects) {
$object->set_rank($n);
push @next_objects, $object->depended_on_by;
}
}
my @circular = @next_objects, grep {not $_->rank} @objects;
if (@circular) {
# We have circular dependencies, report that somehow.
# In production code I'd actually try to report one of
# the cycles here.
die "Circular dependencies found.";
}
}
| [reply] [d/l] |
Re: Partial sort for dependency graph
by bart (Canon) on Nov 11, 2006 at 19:17 UTC
|
I think joost is right. Take a look at the module Graph, the method for "topological sort". That can return the modules in order of dependency, in that no module depends on something that comes later in the list. After you added all the modules and dependecies as nodes and connections to a graph first, of course. | [reply] |
Re: Partial sort for dependency graph
by jbert (Priest) on Nov 11, 2006 at 13:38 UTC
|
@objects = sort {
if ($a->t_d_o($b)) {
return -1;
elsif ($b->t_d_o($a)) {
return 1;
}
else {
return 0;
}
} @objects;
But I would imagine this would be pretty inefficient - t_d_o would have to be a tree walk all of its own I think.
Also, if you're processing a set of dependencies in order to do some processing on the lowest-ranked first (for example, building a set of C/C++ projects, each of which requires its dependencies be built first) you could always not sort at all, but dynamically create a makefile and pipe it to 'make' on STDIN :-) | [reply] [d/l] |
|
|
OK, I'm more awake now. And I realized that this wouldn't work.
Suppose that we have 4 objects, the second one depends on the fourth, and there are no other dependencies. Then Perl will (does a quick test) compare the first to the second, the third to the fourth, the first to the third and the third to the second. Then conclude that they are all equivalent and will return the list in the original order.
Note that the second and fourth never got compared at all.
The moral of the story is that sort algorithms assume that comparison functions are well-behaved. Your comparison function isn't so it misbehaves.
| [reply] |
|
|
Ah, thanks for this. You're absolutely right.
The fundamental problem (as noted in other threads) is that we only have a partial ordering on the set, i.e. some pairs of nodes don't satisfy (a <= b or a => b). My sort comparator returns equal for such pairs (e.g. two root nodes) which breaks the sorting algo, as you noted.
perldoc -f sort says: "The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x1 is less than $x2 and sometimes saying the opposite, for example) the results are not well-defined." My comparator was badly-behaved for different reason. It doesn't satisfy transitivity: i.e. a == b && b == c does not imply a == c.
As well as CPAN's Graph module, there is slightly more direct Sort::Topological which appears to do the job (in essentially the same way the other code posted appears to do it - associate a depth and sort on that).
(Side note: if testing other approaches, note that your example list is already sorted - the first three elts are all root nodes). In the code below, I've reversed the dependency, so that the fourth depends on the second.)
#!/usr/bin/perl
use strict;
use warnings;
use Sort::Topological qw/toposort/;
use Class::Struct Item => {
id => '$',
deps => '@',
};
my @items;
push @items, Item->new(id => $_) for 1..4;
# Make $items[1] a rank 'deeper', so it should sort last
$items[3]->deps([$items[1]]);
map { print "id: ", $_->id, "\n" } @items;
# 1, 2, 3, 4
@items = toposort(sub { my $i = shift; @{$i->deps}; }, \@items);
map { print "id: ", $_->id, "\n" } @items;
# 1, 3, 4, 2
| [reply] [d/l] |
|
|
t_d_o would have to do a tree walk, true, but if it cached the result then t_d_o would not be that inefficient.
| [reply] |
|
|
I haven't thought it through in detail, but I would guess that (memoising t_d_o) would only help if the sort algo called the comparator multiple times for the same pair of $a and $b, which it presumably avoids doing as far as possible.
Inefficiency is in the eye of the beholder anyway. If it turns out to be the difference between 10ms and 100ms on the problem in question then it probably doesn't matter...
To fill in the missing piece, I guess t_d_o could look something like (sadly untested, so apologies for any craziness):
sub t_d_o {
my $s = shift;
my $obj = shift;
return 1 if $s eq $obj; # String compare on objects ok?
foreach my $dependency ($s->dependencies) {
return 1 if $dependency->t_d_o($obj); # Inf. loop on cycles. Whee.
}
return undef;
}
Hmmm...in which case memoising might help a lot (on the recursive steps) - you'd end up storing approx N^2 t_d_o values (for N objects). | [reply] [d/l] |
Re: Partial sort for dependency graph
by artist (Parson) on Nov 11, 2006 at 14:17 UTC
|
How sort is going to return you the ranks? If you can visuallize, it can only put your objects in sort orders, after you ranked them.
| [reply] [d/l] [select] |
Re: Partial sort for dependency graph
by zer (Deacon) on Nov 11, 2006 at 07:00 UTC
|
im having a hard time visualizing your intent. could you provide some examples? maybe some code? | [reply] |