MH1 has asked for the wisdom of the Perl Monks concerning the following question:

Greetings monks, long time lurker, first post. Typically I can googlefoo my way around to find a solution but I'm not sure how to go about that for this problem. I have a flat file consisting of item and item dependencies. But I need to be able to track it back to find all the potential dependencies, basically following the rabbit down the hole.

Plain Example: if item1 requires item2 and item3, and item3 requires item4 -> then item1 also requires item4.

I'm working with around 120k rows so I know the relationships are going to be rather complex, however a task is a task.

Relevant Code:

#load hash after some parsing in a foreach loop push(@{$jobs{$JobName}{1}}, $blah); ... $level =1; $nextlevel = 2; while ($level <= 10) { foreach my $dep (keys %jobs) { foreach (@{$jobs{$dep}{$level}}) { print "$dep $level $_\n"; push( @{$jobs{$dep}{$nextlevel}},@{$jobs{$_}{$level}} ); next if ${$jobs{$_}{$level}}[0] eq "$dep"; last if ${$jobs{$_}{$level}}[0] eq ''; #this cancels the loop +if any of the dependencies don't have dependencies, which is a proble +m if there's additional } } $level++; $nextlevel++; }

So one of the problems is the last statement is prematurely breaking the loop if X dep doesn't have a dep, but there's additional deps in the array. Without that terminate feature the script will run until the while loop has been fulfilled, I have no way of breaking out once I have tracked all the dependencies down. As for CPAN and modules, given my environment I'm very limited on what I can use.

Example Data(first column is job, rest are dependencies):

job1, job2,job1, job3,job2 job4,job2 job5,job2,job4 job6,jobz joba,job4,jobb jobz,jobbc,job2

Any guidance will be greatly appreciated!

I figured it out, made some loop structure changes, thanks VinsWorldcom for your replay

while ($level <= 10) { foreach $dep (keys %jobs) { foreach (@{$jobs{$dep}{$level}}) { print "Key:$dep Value:$_ Lvl:$level Seen:$seen{$_}\n"; if ($seen{$_} eq '') { push( @{$jobs{$dep}{$nextlevel}},@{$job +s{$_}{$level}} ); } $seen{$_} = 1; next if ${$jobs{$_}{$level}}[0] eq "$dep"; last if $seen{$_} == 1; } } %seen = ''; @temparray = ''; $level++; $nextlevel++; }

Replies are listed 'Best First'.
Re: Multi level dependency structure
by GrandFather (Saint) on Feb 11, 2016 at 23:19 UTC

    Recursion and a "been there" flag is a clean way to handle the problem:

    use strict; use warnings; use Data::Dump; my %depends; while (<DATA>) { chomp; my ($job, @deps) = split ','; $depends{$job}{$_} = undef for @deps; } addDepends (\%depends, $_, keys %{$depends{$_}}) for keys %depends; print Data::Dump::dump(\%depends); sub addDepends { my ($depends, $target, @deps) = @_; for my $dep (@deps) { next if $depends->{$target}{$dep}++; addDepends($depends, $target, keys %{$depends{$dep}}); } } __DATA__ job1, job2,job1, job3,job2 job4,job2 job5,job2,job4 job6,jobz joba,job4,jobb jobz,jobbc,job2

    Prints:

    { job1 => {}, job2 => { job1 => 1 }, job3 => { job1 => 1, job2 => 1 }, job4 => { job1 => 1, job2 => 1 }, job5 => { job1 => 2, job2 => 2, job4 => 1 }, job6 => { job1 => 2, job2 => 1, jobbc => 1, jobz => 1 }, joba => { job1 => 1, job2 => 1, job4 => 1, jobb => 1 }, jobb => {}, jobbc => {}, jobz => { job1 => 1, job2 => 1, jobbc => 1 }, }
    Premature optimization is the root of all job security

      ... wow ... ++

Re: Multi level dependency structure
by pokki (Monk) on Feb 11, 2016 at 19:34 UTC

    Edit: dammit, I *just* saw that you can't install CPAN modules. Still, Graph.pm has no dependencies, so maybe there's a chance?

    I've been there. What you need is a real directed graph structure. I'm not sure how it will handle a 120k edge graph, but I've put a sizeable subset of the CPAN dependency graph (our entire org's module dep graph, actually -- including the high level modules, web apps, etc. so DBIC and other heavyweight stuff) in a Graph.pm object.

    It makes for extremely readable code too. Below is an example (not guaranteed to run, I don't have Perl on this machine):

    use Graph::Directed; my $graph = Graph::Directed->new; # Building the graph. In this example, edges are directed from a job +to a dependency. while (my $line = $file->getline) { my ($job, @deps) = split(/,/, $line); foreach my $dep (@deps) { # adds both $job and $dep as nodes in the graph if they don't +exist already $graph->add_edge($job, $dep); } } # What does job FOO require? say "FOO requires $_" foreach $graph->all_successors('FOO'); # Who requires FOO? say "FOO is required by $_" foreach $graph->all_predecessors('FOO'); # Jobs that do not have prerequisites. say "$_ can be executed right now!" foreach $graph->sink_vertices; # Somebody messed up here: say "A job requires itself!" if $graph->has_a_cycle;

    It's really comfortable, and reading the POD for Graph.pm might give you lots of ideas of cool stuff to do with your dependency graph :)

      pokki, absolutely brilliant! ++ I needed to see this work myself and had Graph::Easy but couldn't duplicate. The CPAN install for Graph::Directed was easy (Win7 x64 / Strawberry 5.18.1) and I whipped this up based on your example:

      #!perl use strict; use warnings; use Graph::Directed; my $graph = Graph::Directed->new; while ( my $line = <DATA> ) { chomp $line; my ( $job, @deps ) = split( /,/, $line ); for my $dep (@deps) { $graph->add_edge( $job, $dep ); } } for my $job ( sort ( $graph->vertices ) ) { print "$job requires "; for ( sort ( $graph->all_successors($job) ) ) { print "$_ "; } print "\n"; } __DATA__ job1, job2,job1, job3,job2 job4,job2 job5,job2,job4 job6,jobz joba,job4,jobb jobz,jobbc,job2

      and outputs

      job1 requires job2 requires job1 job3 requires job1 job2 job4 requires job1 job2 job5 requires job1 job2 job4 job6 requires job1 job2 jobbc jobz joba requires job1 job2 job4 jobb jobb requires jobbc requires jobz requires job1 job2 jobbc
Re: Multi level dependency structure
by VinsWorldcom (Prior) on Feb 11, 2016 at 18:40 UTC

    I understand exactly what you're talking about - the CPAN client does something like this when installing modules and dependencies are discovered.

    I did something similar when coding a simulation where mobile nodes could talk to each other either directly or by relaying through adjacent nodes if the endpoints were too far way. Essentially, if node1 could talk to node2 but not directly to node3 *AND* node2 could talk to node3, then node1 could in fact talk to node3 (by relaying through node2). If that sounds to you (like it does to me) similar to your problem, have a look at Union/Intersection of Multiple Arrays where JavaFan gave me the idea of "transitive closure" that solved my problem.

Re: Multi level dependency structure
by Laurent_R (Canon) on Feb 11, 2016 at 22:32 UTC
    Hi,

    I *think* this quick program does what you need:

    use strict; use warnings; use Data::Dumper; my %dependencies; while (<DATA>) { chomp; my ($key, @items) = split /,/; $dependencies{$key} = [@items]; } # print Dumper \%dependencies; my $continue = 1; while ($continue) { $continue = 0; for my $key (keys %dependencies) { my @jobs = @{$dependencies{$key}}; my %seen = map { $_ => 1 } @jobs; my @add; for my $job (@jobs) { next unless exists $dependencies{$job}; my @new_jobs = @{$dependencies{$job}}; for my $new_job (@new_jobs) { next if exists $seen{$new_job}; $continue = 1; push @add, $new_job; $seen{$new_job} = 1; } } push @{$dependencies{$key}}, @add; } } print Dumper \%dependencies; __DATA__ job1, job2,job1, job3,job2 job4,job2 job5,job2,job4 job6,jobz joba,job4,jobb jobz,jobbc,job2
    I have tested only with the simple example you provided and it seems to do what you need, but much more thorough testing would be needed. Although I have tried to take the case into account and to guard against it, you would need to make really sure that the program does not go into an infinite loop in the event of circular dependencies.