Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm using the following test code to match children with there parents, but it seems really expensive iterating over several data structures several times. Any suggestions or is there a better way?
#!/usr/bin/perl use warnings; use strict; chomp (my @array = <DATA>); my @map = @array; my %hash = (); my %hash2 = (); foreach my $parent (@array) { foreach my $child (@map) { if ($child =~ /$parent/) { push @{$hash{$parent}}, $child unless ($child eq $parent); } } } my @children; my @parent; my %family = (); my %seen = (); foreach my $key (keys %hash) { push @parent, $key; push @children, @{$hash{$key}}; } foreach my $parent (@parent){ foreach my $child (@children) { if ($child eq $parent) { $seen{$parent}++ } } } foreach my $parent (@parent) { if (!$seen{$parent}++ ){ print "Parent: ", $parent, "\n"; foreach (@{$hash{$parent}}) { print " child: $_ \n"; } } } __DATA__ Test_100.26.35.6_1 Test_100.26.35.6_13 Test2_9.25.6.27_2 Test2_9.25.6.27 Test_100.26.35.6 Test_100.26.35.6_10 Test3_28.20.116.210_ide Test3_28.20.116.210 Test4_28.25.6.21_45 Test4_28.25.6.21_45_25
Thanks!

Replies are listed 'Best First'.
Re: Algorithm performance enhancement needed
by dragonchild (Archbishop) on Apr 14, 2004 at 15:38 UTC
    You still need to define your parent-child relationships a little better. If you had "Test_100.26.35.6_103", who should be its parent(s)? With the data as given, it would have two parents, and I don't think that's what you want.

    Once you define your parent-child relationships more solidly, a better algorithm will more easily specified. (Right now, it's not really clear.)

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

      A child is considered to be anything that contains the whole parent (i.e. 10.2.3.5_1Test is a child to 10.2.3.5)
        I don't think you understand. Let's say we have the following data:
        abc abc1 abc12

        It's obvious that abc1 is a child of abc and abc12 is a child of abc1. But, is abc12 a direct child of abc or not? Your code would make abc12 a child of both abc and abc1.

        Essentially, you are attempting to take a list and make a two-level structure out of it. That may be perfectly correct for your needs, but that's not a standard parent-child data structure. Normally, trees have as many levels as are necessary and a given node has one and only one parent. Your rules do not match those criteria.

        Now, if you would give us what you're trying to accomplish, maybe we can help design the correct data structure, but all of our mind-reading helmets are currently in the shop.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

Re: Algorithm performance enhancement needed
by EdwardG (Vicar) on Apr 14, 2004 at 15:59 UTC

    As dragonchild rightly points out, you leave us guessing about the parent-child relationship, but assuming that a "child" is defined as being identical to a parent but with a suffix of /_\w+/, and also assuming that "children" will not appear in the list unless a matching parent also appears in the list, you could start with the following code.

    use strict; use warnings; use Data::Dumper; chomp (my @family=<DATA>); my %parents; for (@family) { if (/(Test\d*_\d+(?:\.\d+){3})(_.+)/) { push @{$parents{$1}},$_; } } print Dumper(\%parents); __DATA__ Test_100.26.35.6_1 Test_100.26.35.6_13 Test2_9.25.6.27_2 Test2_9.25.6.27 Test_100.26.35.6 Test_100.26.35.6_10 Test3_28.20.116.210_ide Test3_28.20.116.210 Test4_28.25.6.21_45 Test4_28.25.6.21_45_25

    Output:

    $VAR1 = { 'Test3_28.20.116.210' => [ 'Test3_28.20.116.210_ide' ], 'Test2_9.25.6.27' => [ 'Test2_9.25.6.27_2' ], 'Test4_28.25.6.21' => [ 'Test4_28.25.6.21_45', 'Test4_28.25.6.21_45_25' ], 'Test_100.26.35.6' => [ 'Test_100.26.35.6_1', 'Test_100.26.35.6_13', 'Test_100.26.35.6_10' ] };
Re: Algorithm performance enhancement needed
by Roy Johnson (Monsignor) on Apr 14, 2004 at 18:01 UTC