Basically I need to create a list of values in the correct order so that a "dependent" is always loaded after the parent otherwise the load simply will not work.
It took me a while to get my head around the issue, but I think it's accurate to rephrase it like this:
In order for the data set to be loaded correctly as a whole, records containing top-level parents must be loaded before those containing intermediate parents. In other words, whenever a given value is first seen in the "parent" column, insertion of that record must be deferred until either:
Supposing that's right, here's a demonstration on a small sample data set, using a hash structure to keep track of parent and child relations for each field value, and a recursive function to print out the records in the order required. (Recursion is not needed to load the hash.)
All this does is sort the input lines into the desired order (if a value serves as both child and parent, the record that has the value as child always comes first). This works well for loading the database, because you'll probably want to use whatever native loader is available for the particular database server you're using (inserts via DBI are very slow in comparison).
Note that I took the liberty of enforcing a "one-parent-only" constraint (the last line of DATA causes a warning, and is ignored). But I wasn't clear about the status of values that occur as both parent and child in a single record. (Should the next-to-last line of DATA cause a warning as well?) The special treatment of the "child eq parent" records might be problematic...#!/usr/bin/perl use strict; use warnings; my %node; while (<DATA>) { my ( $c, $p ) = split; if ( $c eq $p ) { # these are easy, so finish them first print; next; } if ( exists( $node{$c}{child_of} )) { warn "$.: bad record: $c is child of both $p and $node{$c}{chi +ld_of}\n"; next; } $node{$c}{child_of} = $p; $node{$p}{parent_of}{$c} = undef; } # begin the sorted output by looping over values that do not have pare +nts: for my $parent ( grep { !exists( $node{$_}{child_of} ) } keys %node ) +{ my $children = $node{$parent}{parent_of}; # ref to hash of child +values trace_down( $children, \%node ); } sub trace_down { my ( $kids, $tree ) = @_; for my $kid ( keys %$kids ) { print "$kid $$tree{$kid}{child_of}\n"; if ( exists( $$tree{$kid}{parent_of} )) { trace_down( $$tree{$kid}{parent_of}, $tree ); } } } __DATA__ n1 n2 n3 n2 n4 n1 n5 n5 n6 n4 n7 n7 n8 n9 n10 n2 n11 n6 n2 n12 n7 n6 n8 n4
(Update: moved the line in the main "while" loop that stores the "parent_of" relation, so that it happens after the "one-parent-only" condition has been passed; if it happens before that check, it puts a redundant entry in the hash structure.)
In reply to Re: Perl modules or standard tools for searching hierachical data
by graff
in thread Perl modules or standard tools for searching hierachical data
by SlackBladder
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |