comment on

Basically I need to create a list of values in the correct order so that a "dependent" is always loaded after the parent otherwise the load simply will not work.

It took me a while to get my head around the issue, but I think it's accurate to rephrase it like this:

In order for the data set to be loaded correctly as a whole, records containing top-level parents must be loaded before those containing intermediate parents. In other words, whenever a given value is first seen in the "parent" column, insertion of that record must be deferred until either:

it is established that this value never occurs as the "child" in another record, or
the record containing this value as "child" has already been stored to the database.

(Meanwhile, records having the same value as both parent and child may be loaded at any time in the sequence.)

Supposing that's right, here's a demonstration on a small sample data set, using a hash structure to keep track of parent and child relations for each field value, and a recursive function to print out the records in the order required. (Recursion is not needed to load the hash.)

All this does is sort the input lines into the desired order (if a value serves as both child and parent, the record that has the value as child always comes first). This works well for loading the database, because you'll probably want to use whatever native loader is available for the particular database server you're using (inserts via DBI are very slow in comparison).

#!/usr/bin/perl

use strict;
use warnings;

my %node;

while (<DATA>) {
    my ( $c, $p ) = split;
    if ( $c eq $p ) {  # these are easy, so finish them first
        print;
        next;
    }

    if ( exists( $node{$c}{child_of} )) {
        warn "$.: bad record: $c is child of both $p and $node{$c}{chi
+ld_of}\n";
        next;
    }
    $node{$c}{child_of} = $p;
    $node{$p}{parent_of}{$c} = undef;
}

# begin the sorted output by looping over values that do not have pare
+nts:

for my $parent ( grep { !exists( $node{$_}{child_of} ) } keys %node ) 
+{
    my $children = $node{$parent}{parent_of};  # ref to hash of child 
+values
    trace_down( $children, \%node );
}

sub trace_down
{
    my ( $kids, $tree ) = @_;
    for my $kid ( keys %$kids ) {
        print "$kid $$tree{$kid}{child_of}\n";
        if ( exists( $$tree{$kid}{parent_of} )) {
            trace_down( $$tree{$kid}{parent_of}, $tree );
        }
    }
}

__DATA__
n1 n2
n3 n2
n4 n1
n5 n5
n6 n4
n7 n7
n8 n9
n10 n2
n11 n6
n2 n12
n7 n6
n8 n4
[download]

Note that I took the liberty of enforcing a "one-parent-only" constraint (the last line of DATA causes a warning, and is ignored). But I wasn't clear about the status of values that occur as both parent and child in a single record. (Should the next-to-last line of DATA cause a warning as well?) The special treatment of the "child eq parent" records might be problematic...

(Update: moved the line in the main "while" loop that stores the "parent_of" relation, so that it happens after the "one-parent-only" condition has been passed; if it happens before that check, it puts a redundant entry in the hash structure.)

In reply to Re: Perl modules or standard tools for searching hierachical data by graff
in thread Perl modules or standard tools for searching hierachical data by SlackBladder

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.