The following appears to handle the shortest path variant:

#!/usr/bin/perl
use warnings;
use strict;

my %group = (    # Hash table/dictionary for all the groups
    'P'        => 'I_1',
    'Pl'       => 'I_2',
    'P.P'      => 'I_3',
    'P.Pl'     => 'I_4',
    'Pl.P'     => 'I_5',
    'Pl.Pl'    => 'I_6',
    'P.P.P'    => 'I_7',
    'P.P.Pl'   => 'I_8',
    'P.Pl.P'   => 'I_9',
    'P.Pl.Pl'  => 'I_10',
    'Pl.P.P'   => 'I_11',
    'Pl.P.Pl'  => 'I_12',
    'Pl.Pl.P'  => 'I_13',
    'Pl.Pl.Pl' => 'I_14',
    'E'        => 'II_15',
    'P.E'      => 'II_16',
    'Pl.E'     => 'II_17',
    'P.P.E'    => 'II_18',
    'P.Pl.E'   => 'II_19',
    'Pl.P.E'   => 'II_20',
    'Pl.Pl.E'  => 'II_21',
    'E.P'      => 'III_22',
    'E.Pl'     => 'III_23',
    'P.E.P'    => 'III_24',
    'P.E.Pl'   => 'III_25',
    'Pl.E.P'   => 'III_26',
    'Pl.E.Pl'  => 'III_27',
    'E.P.P'    => 'III_28',
    'E.P.Pl'   => 'III_29',
    'E.Pl.P'   => 'III_30',
    'E.Pl.Pl'  => 'III_31',
    'E.E'      => 'IV_32',
    'P.E.E'    => 'IV_33',
    'Pl.E.E'   => 'IV_34',
    'E.P.E'    => 'IV_35',
    'E.Pl.E'   => 'IV_36',
    'E.E.P'    => 'IV_37',
    'E.E.Pl'   => 'IV_38',
    'E.E.E'    => 'IV_39',
);

<DATA>;    # Skip the headers (first row).
my %tree;

while (<DATA>) {    # parse through the input data and fill in our tree data structure
    chomp;
    my ($child, $parent, $prob) = split /\t/;

    if ($child eq 'Q') {
        push @{$tree{$child}}, {parent => '', prob => $prob, dist => 0};
        next;
    }

    if ($parent eq 'Q') {
        push @{$tree{$child}}, {parent => $parent, prob => $prob, dist => 1};
        next;
    }

    for my $opt (@{$tree{$parent}}) {
        my $dist = $opt->{dist} + 1;
        push @{$tree{$child}},
            {parent => $parent, prob => $prob, dist => $dist};
    }
}

for my $child (sort {length $a <=> length $b or $a cmp $b} keys %tree) {
    my @bestPath = findBestPath($child, \%tree);
    my $probs = join '.', map {$_->{prob}} @bestPath;

    printf "%-5s ", "$child:";
    # Join the likelihood path. Then if group is found for a likelihood
    # from the group hash table then print it, else quit
    print join '<-', $child, grep {$_} map {$_->{parent}} @bestPath;
    print ", $probs";
    print ", $group{$probs}" if exists $group{$probs};
    print "\n";
}

sub findBestPath {
    my ($child, $tree) = @_;

    return $tree->{Q}[0] if $child eq 'Q';

    my @alts = sort {$a->{dist} <=> $b->{dist}} @{$tree->{$child}};
    return $alts[0], findBestPath($alts[0]{parent}, $tree);
}

__DATA__
child, Parent, likelihood
M7	Q	P
M54	M7	Pl
M213	M54	E
M206	M54	E
M194	M54	E
...

Prints (in part):

Q:    Q, E, II_15
M6:   M6<-Q, E.E, IV_32
M7:   M7<-Q, P.E, II_16
M10:  M10<-Q, E.E, IV_32
M13:  M13<-M7<-Q, E.P.E, IV_35
M17:  M17<-Q, P.E, II_16
M18:  M18<-Q, E .E
M22:  M22<-Q, E.E, IV_32
M23:  M23<-Q, E.E, IV_32
M28:  M28<-M6<-Q, P.E.E, IV_33
M33:  M33<-M28<-M6<-Q, E.P.E.E
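To make the last column easier to follow, here is a minimal sketch of just the lookup step: walk from a node up to Q, join the likelihoods with '.', and use the result as a key into the group table. The %edge hash and the likelihoodPath sub are hypothetical names for illustration only, and the four edges are inferred from the printed lines for Q, M6, M28 and M33 rather than taken from the real data file.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Cut-down group table (three of the keys from the full %group hash).
my %group = ('E' => 'II_15', 'E.E' => 'IV_32', 'P.E.E' => 'IV_33');

# Hypothetical structure: child => [parent, likelihood].
# The root Q has an empty parent. Edges are inferred from the output above.
my %edge = (
    Q   => ['',    'E'],
    M6  => ['Q',   'E'],
    M28 => ['M6',  'P'],
    M33 => ['M28', 'E'],
);

# Walk from a node up to the root, collecting each likelihood on the way,
# and return them joined with '.' (node first, root last).
sub likelihoodPath {
    my ($node) = @_;
    my @probs;

    while ($node ne '') {
        my ($parent, $prob) = @{$edge{$node}};
        push @probs, $prob;
        $node = $parent;
    }

    return join '.', @probs;
}

for my $node (qw(Q M6 M28 M33)) {
    my $key   = likelihoodPath($node);
    my $label = exists $group{$key} ? $group{$key} : '(no group)';
    print "$node: $key, $label\n";
}
```

The group table only has keys of up to three likelihoods, so a node that sits more than two steps below Q (M33 here, with the key E.P.E.E) finds no match and is printed without a group.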
True laziness is hard work

In reply to Re: find shortest path for each query from a CSV file by GrandFather
in thread find shortest path for each query from a CSV file by zing
