I'm not sure if this really adds to what has already been said, but when I thought about the problem, I thought it'd be nice to at all times have a minimal graph representing everything known so far.

So the sequences (AB, AC, BC) would produce the series of graphs

A B
A B C
A B C
To do this, consider the intermediate step where you have a minimal graph for all the previous sublists seen, and you are walking through a new sublist. Keep a pointer to your current position in the graph, and move it down along edges as long as the next element is a child of the current position. If the next element is not a child, you're going to add it as a child, which might create redundant links in the graph: (after adding B->C)
A |\ B | |/ C
To remove them, loop through all parents of the new node and delete the ones that are reachable via the new link (in the above example, you're adding B->C, so you look at the parents of C and discover that A is reachable via B, so you delete that link). Do the same in the other direction to prune redundant children:
sub reachable { my ($links, $start, $dest) = @_; return 1 if $start eq $dest; foreach (keys %{ $links->{$start} }) { return 1 if reachable($links, $_, $dest); } return 0; } my %next; my %prev; my $ptr; while(<DATA>) { chomp; if ($_ eq 'Start') { undef $ptr; next; } if (defined $ptr) { if (! $next{$ptr}{$_}) { for my $parent (keys %{ $prev{$_} }) { if (reachable(\%prev, $ptr, $parent)) { delete $prev{$_}{$parent}; } } for my $child (keys %{ $next{$_} }) { if (reachable(\%next, $ptr, $child)) { delete $next{$_}{$child}; } } $next{$ptr}{$_} = 1; $prev{$_}{$ptr} = 1; } } $ptr = $_; }
(Because this is all done via hashes, you're not really keeping a pointer to where you are in the dag, because the string itself takes you directly to the right place via %next and %prev.)

Then you just need to look through %prev to find all keys with no values. They are the roots. If there is more than one, you have a problem. Otherwise, walk through %next to find the linear chain. If you ever have more than one child, you have a problem.

One way to make an arbitrary decision and resolve these problems is to, whenever faced with a choice (multiple roots or multiple children), add edges from one to another and rerun the same minimization fixup.

This isn't the fastest algorithm, because you're constantly doing full reachable() queries that could be truncated in many cases, but it's pretty straightforward.

It might even work.


In reply to Re: Reconstructing List Order From Partial Subsets by sfink
in thread Reconstructing List Order From Partial Subsets by QM

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.