dunkirk_phil has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a program to do the following. I have a list of links. Each link is identified by a start and finish end and I need the program to iterate through the lis and join the links up. For example the links are stored in a hash like this.
my %T1 = ( 'a|b' => 'a|b', 'b|c' => 'b|c', 'c|d' => 'c|d', 'j|k' => 'j|k', 'k|l' => 'k|l', 'l|m' => 'l|m', 'm|n' => 'm|n' );
and I want the links to be joined as a|b|c|d and j|k|l|m|n i.ie in my code below I want my %newTrailsToCombine hash to look like
my %newTrailsToCombine = ( 'a|b' => 'a|b|c|d', 'b|c' => 'a|b|c|d', 'c|d' => 'a|b|c|d', 'j|k' => 'j|k|l|m|n', 'k|l' => 'j|k|l|m|n', 'l|m' => 'j|k|l|m|n', 'm|n' => 'j|k|l|m|n' );
instead it looks like
%newTrailsToCombine$VAR1 = { 'm|n' => 'm|n|l|m', 'j|k' => 'j|k|k|l', 'k|l' => 'k|l|j|k|l|m', 'b|c' => 'b|c|a|b|c|d', 'a|b' => 'a|b|b|c', 'l|m' => 'l|m|k|l|m|n', 'c|d' => 'c|d|b|c'
Below is my code , any help/ideas would be appreciated Thanks Phil
#! /opt/perl/bin/perl -w use strict; use Data::Dumper; my %T1 = ( 'a|b' => 'a|b', 'b|c' => 'b|c', 'c|d' => 'c|d', 'j|k' => 'j|k', 'k|l' => 'k|l', 'l|m' => 'l|m', 'm|n' => 'm|n' ); my %trailsToCombine = %T1; my %newTrailsToCombine; my %clone_trailsToCombine; my $seenFirst; my %seen; %clone_trailsToCombine = %trailsToCombine; while (my($K, $V) = each (%trailsToCombine)) { my @Karr = split (/\|/,$V); my $tmp = $K; foreach my $i (@Karr) { while (my($SK, $SV) = each (%clone_trailsToCombine)) { if( $K eq $SK) { #ignore } elsif((my $count = grep /$i/,$SK) >= 1 ) { $tmp = $tmp."|$SK"; } else { #no match } } } if( ! exists $newTrailsToCombine{$K}) { $newTrailsToCombine{$K} = $tmp; } } #process_trailsToCombine(%trailsToCombine); print "\n Dumping \%newTrailsToCombine"; print Dumper(\%newTrailsToCombine);

Replies are listed 'Best First'.
Re: Link Connectivity Algorithim
by liverpole (Monsignor) on Oct 30, 2006 at 14:17 UTC
    Hi dunkirk_phil,

    It doesn't appear that you need a hash for the original data, so I went with the assumption that it could be a list instead:

    my @T1 = ( 'a|b', 'b|c', 'c|d', 'j|k', 'k|l', 'l|m', 'm|n' );

    Having done that, I think my solution does what you want:

    #! /opt/perl/bin/perl -w # # By liverpole 061030 # use strict; use warnings; use Data::Dumper; my @T1 = ( 'a|b', 'b|c', 'c|d', 'j|k', 'k|l', 'l|m', 'm|n' ); # Map each endpoint to the pair of endpoints, and create a # doubly-linked list of left and right pointers, eg.: # # %endpoints = ( %pprev = ( %pnext = ( # 'n' => [ 'm', 'n' ], 'l' => 'k', 'l' => 'm', # 'a' => [ 'a', 'b' ], 'n' => 'm', 'c' => 'd', # 'm' => [ 'm', 'n' ], 'c' => 'b', 'k' => 'l', # 'd' => [ 'c', 'd' ], 'k' => 'j', 'a' => 'b', # 'j' => [ 'j', 'k' ], 'b' => 'a', 'b' => 'c', # 'l' => [ 'l', 'm' ], 'm' => 'l', 'm' => 'n', # 'c' => [ 'c', 'd' ], 'd' => 'c', 'j' => 'k', # 'k' => [ 'k', 'l' ], ); ); # 'b' => [ 'b', 'c' ], # ) # my (%endpoints, %pprev, %pnext); foreach my $pair (@T1) { my ($left, $right) = split(/\|/, $pair); $endpoints{$left} = $endpoints{$right} = [ $left, $right ]; $pprev{$right} = $left; $pnext{$left} = $right; } # Now merge all trails. Each time a previous or next trail exists, # merge the endpoints, both in the %endpoints hash, and by making # the previous or next trail point to the same endpoint pair. # foreach my $pair (@T1) { my ($left, $right) = split(/\|/, $pair); my $pendpoints = $endpoints{$left}; my $prev = $pprev{$left}; while (defined($prev)) { # Adjust start of link $endpoints{$prev} = $pendpoints; $pendpoints->[0] = $prev; $prev = $pprev{$prev}; } my $next = $pnext{$right}; while (defined($next)) { # Adjust end of link $endpoints{$next} = $pendpoints; $pendpoints->[1] = $next; $next = $pnext{$next}; } } # Now simply construct the results for each pair my %trailsToCombine = ( ); foreach my $pair (@T1) { my ($left, $right) = split(/\|/, $pair); my $pendpoints = $endpoints{$left}; my ($start, $end) = @$pendpoints; my $path = $start; while ($start ne $end) { $start = $pnext{$start}; $path .= "|$start"; } $trailsToCombine{$pair} = "$path"; } # Display results print "\%trailsToCombine = ", Dumper(\%trailsToCombine);

    When run, this produces:

    %trailsToCombine = $VAR1 = { 'm|n' => 'j|k|l|m|n', 'j|k' => 'j|k|l|m|n', 'k|l' => 'j|k|l|m|n', 'b|c' => 'a|b|c|d', 'a|b' => 'a|b|c|d', 'l|m' => 'j|k|l|m|n', 'c|d' => 'a|b|c|d' };

    Is that what you were looking for?

    Update:  made the output show the *entire* path, rather than just the endpoints.

    Update 2:  fixed comment to correctly show contents of %endpoints after initialization.


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
      Thanks a lot, it works a treat. How easy would it be to the script so that it could handle variable length values. E.g
      my @T1 = ( 'a|b|c', 'c|d', 'j|k|l|m', 'm|n|o', 'o|p|q|r|s' );
        Your question reminds me of the joke about the engineer and the mathematician who are asked for an algorithm to boil a pot of water.  They both say:

        1. Fill pot with water
        2. Carry pot to stove
        3. Light stove
        4. Wait for pot to boil

        Then they are asked for an algorithm for boiling a pot which already contains water.

        The engineer simply replies with steps 2-4 above.

        The mathematician replies:  "Step one = pour out the water.  Now the problem is reduced to the same as the first!"

        My simplistic answer would be to take your array and convert it to pairs (using a hash along the way):

        my @T1 = ( 'a|b|c', 'c|d', 'j|k|l|m', 'm|n|o', 'o|p|q|r|s' ); my %newT1; foreach (@T1) { my @points = split /\|/; for (my $i = 1; $i < @points; $i++) { my $link = $points[$i-1] . '|' . $points[$i]; $newT1{$link}++; } } my @newT1 = keys %newT1; # Verify that the new array contains the desired points printf "\@newT1 = %s\n", Dumper(\@newT1);

        Now the problem's been reduced to the first one.


        s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: Link Connectivity Algorithim
by jbert (Priest) on Oct 30, 2006 at 17:19 UTC
    You appear to be doing graph theory. You might want to check out the 'Graph' module on CPAN. It's possible your bigger problem (which you seem to want to solve by merging edges to form trails) is an instance of a more general problem solved by one of the supplied algorithms or methods.

    e.g. there is a connected_components method which looks like it might be related to your problem (perhaps not, depending on whether the ordering in your problem is an artifact of your example or not).

Re: Link Connectivity Algorithim
by Anonymous Monk on Oct 30, 2006 at 11:22 UTC
    The danger of presenting your problem with a single example is that people waste time coming up with solutions that aren't going to work for you.

    What for instance is supposed to happen if %T1 contains 'b|d' => 'b|d' instead of 'c|d' => 'c|d'? Or is that not allowed? (In which case, the solution seems to be very easy).

      Sorry this is my first post. To answer your question
      'b|d' => 'b|d'
      is not allowed
        So what *is* allowed? Can you have 'a|b' => 'b|c'? Can you have 'a|b' => 'b|a'? Can you have 'a|b|c' => 'x|y'? 'a|b' => 'x|y|z'? Be more specific on the constraints. (Your constraints will make the difference between a cubed, O(n log n) or a linear solution, so they are important).