in reply to Maximal Parsimony Problem

    # this heuristic seems to be faulty
    my $min_mm = scalar( uniq(@colbp) ) - 1;

To me this seems perfectly right. If you want to speed things up a bit more (this is only a micro-optimization), you can use a hash in the first place:

    my %chars;
    foreach my $site ( @{$tfbs} ) {
        my $bp = substr( $site, $pos, 1 );
        $chars{$bp} = 1;
    }
    # this heuristic seems to be faulty
    # (parentheses make sure the "- 1" applies to the key count)
    my $min_mm = scalar( keys %chars ) - 1;
    push @mincol, $min_mm;

Instead I think that your sample output is wrong:

    # is:        00100000302011000000100
    # should be: 00100000301011000000100

If not, could you please explain how to get the 2 there?
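
For reference, here is a self-contained sketch of the per-column lower bound discussed above: the number of distinct bases in a column, minus one. The sub name `min_mismatches` and the four aligned sites are made up for illustration; they are not the data from the original post.

```perl
use strict;
use warnings;

# Lower bound on substitutions per column: distinct bases minus one.
# min_mismatches() is a hypothetical helper, not part of the original code.
sub min_mismatches {
    my @sites = @_;
    my @mincol;
    for my $pos ( 0 .. length( $sites[0] ) - 1 ) {
        my %chars;
        $chars{ substr( $_, $pos, 1 ) } = 1 for @sites;
        push @mincol, scalar( keys %chars ) - 1;
    }
    return join '', @mincol;
}

# Hypothetical aligned sites (NOT the original data).
my @tfbs = qw(ACGC ACTT AGAT ACCC);
print min_mismatches(@tfbs), "\n";    # prints 0131
```

With this rule a column can only reach 2 if it contains three distinct bases, which is why the 2 in the sample output needs an explanation.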

Re^2: Maximal Parsimony Problem
by neversaint (Deacon) on Sep 03, 2008 at 14:28 UTC
    Hi Moritz,
    It is based on the phylogenetic tree: substitutions at the lower nodes must be counted by comparing them against the higher-level nodes.

    E.g., for column 11, the base in row 3 is compared to the base in row 1. The same result is obtained if we compare row 3 to row 2. Both row 1 and row 2 are at a higher level than rows 3 and 4 in the phylogenetic tree.

    ---
    neversaint and everlastingly indebted.......
      I still don't quite understand. We have
      human: C
      chimp: T
      mouse: T
      rat:   C

      So which comparisons exactly lead to a count of 2? With which rules?

      As a poor Perl programmer I'm not familiar with the phylogenetic tree, so if that's relevant here please explain how these four examples are related in that tree.
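
One possible reading that would produce a 2 for this column is a Fitch-style parsimony count on a fixed tree. The sketch below is only a guess at the rule: the thread never states the topology, so `((human, chimp), (mouse, rat))` is an assumption, and the sub name `fitch` is made up for illustration.

```perl
use strict;
use warnings;

# Fitch small-parsimony count for one column on a fixed binary tree.
# A tree is either a leaf (a base as a plain string) or an
# array ref [ $left, $right ]. Returns (set of candidate bases, cost).
sub fitch {
    my ($node) = @_;
    return ( +{ $node => 1 }, 0 ) unless ref $node;    # leaf: {base}, cost 0

    my ( $lset, $lcost ) = fitch( $node->[0] );
    my ( $rset, $rcost ) = fitch( $node->[1] );

    # Non-empty intersection: the children can agree, no new substitution.
    my %common = map { $_ => 1 } grep { $rset->{$_} } keys %$lset;
    return ( \%common, $lcost + $rcost ) if %common;

    # Empty intersection: take the union and count one substitution.
    my %union = ( %$lset, %$rset );
    return ( \%union, $lcost + $rcost + 1 );
}

# ASSUMED topology: ((human, chimp), (mouse, rat))
my ( undef, $changes ) = fitch( [ [ 'C', 'T' ], [ 'T', 'C' ] ] );
print "$changes\n";    # prints 2
```

Under the same assumed topology, a column of C T T T gives 1 and a column of A C T G gives 3. Whether this is the rule the original poster had in mind is not confirmed anywhere in the thread.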

      Have you changed your example?
      A C T G
      How does that become 3? And the column moritz quoted above used to be:
      C T T T