First, use eq to test strings for equality in Perl and == to test numbers. Second, $a and $b are special variables that Perl's sort uses, so don't give ordinary variables those names.
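A minimal sketch of both points, in case it helps. The values in $left and $right are made up for illustration and are not from your code.

```perl
#!perl
use strict;
use warnings;

# Hypothetical sample values, chosen only to show the two operators.
my $left  = '1.0';
my $right = '1';

# eq compares as strings: '1.0' and '1' are different strings.
my $same_string = $left eq $right ? 'yes' : 'no';

# == compares as numbers: 1.0 and 1 are the same number.
my $same_number = $left == $right ? 'yes' : 'no';

print "same string: $same_string, same number: $same_number\n";

# Inside a sort block, sort itself sets $a and $b, which is why
# those names are best left alone everywhere else.
my @sorted = sort { $a <=> $b } ( 10, 9, 100 );
print "@sorted\n";
```

This prints "same string: no, same number: yes" followed by "9 10 100".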

Your specification isn't quite clear. It seems to me you're looking at generations and you want to know when generation n+1 shows a mutation of one single base from generation n. There's https://bioperl.org/ for that sort of work if you need it. There are set libraries as BillKSmith mentioned as well in case that's what you need.

As for your specific case, if you do want to compare A to B and B to C but not A to C, I have some working code I threw together for you. Some loops (whether explicit in your code or implicit in the language or in a library beneath your code) are necessary because you're doing repeated checks over different combinations of inputs. You can try to minimize the number of such loops, but loops (possibly mixed with recursion) are how these combinations get checked. If what you actually need is every string tested against every other string, then you need yet another loop and, yes, it will complete even more slowly, all else being equal. Forking, threading, IPC, flow-based programming, and similar topics are unnecessary for three strings of seven characters, but enough very long strings might require something other than a simple single-threaded, single-process approach to run in an acceptable amount of time.
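If it turns out you do need every string tested against every other string, the extra loop could look roughly like this. It's only a sketch: count_mismatches and all_pairs_within are names I made up for illustration, the sample strings are assumed to be of equal length, and the pair loop visits each unordered pair once.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: count the columns at which two
# equal-length strings differ.
sub count_mismatches {
    my ( $first, $second ) = @_;
    die "strings differ in length\n"
        unless length($first) == length($second);
    my $count = 0;
    for my $column ( 0 .. length($first) - 1 ) {
        $count++
            unless substr( $first, $column, 1 ) eq substr( $second, $column, 1 );
    }
    return $count;
}

# Hypothetical helper: return [ i, j, count ] for every pair of
# strings that differs in exactly $limit columns.
sub all_pairs_within {
    my ( $limit, @strings ) = @_;
    my @hits;
    for my $i ( 0 .. $#strings - 1 ) {
        for my $j ( $i + 1 .. $#strings ) {
            my $count = count_mismatches( $strings[$i], $strings[$j] );
            push @hits, [ $i, $j, $count ] if $count == $limit;
        }
    }
    return @hits;
}

# Assumed sample data, seven characters each.
my @strings = ( 'TTTATTT', 'TTTTTTT', 'TBTTTTT' );
printf "strings %d and %d differ in %d column(s)\n", @$_
    for all_pairs_within( 1, @strings );
```

With n strings that's n*(n-1)/2 comparisons instead of n-1, which is where the extra run time comes from.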

This code is quite particular to the problem as I restated it. There are probably a dozen better ways to do this, and some of those will be more generalizable.

#!perl
use strict;
use warnings;

# Number of mismatched columns that counts as a hit.
my $difference  = 1;
my @generations = ( 'TTTATTT', 'TTTTTTT', 'TBTTTTT' );
my %mismatch;

my $times   = $difference == 1 ? 'time'     : 'times';
my $columns = $difference == 1 ? 'column: ' : "columns:\n";
my $length  = length $generations[0];

# Compare each generation with the one before it, column by column.
for ( my $column = 0 ; $column < $length ; $column++ ) {
    for ( my $gen = 1 ; $gen < scalar @generations ; $gen++ ) {
        my $last    = substr $generations[ $gen - 1 ], $column, 1;
        my $current = substr $generations[ $gen ], $column, 1;
        push @{ $mismatch{ $gen } }, [ $column, $last, $current ]
            unless $last eq $current;
    }
}

# Report only the generations with exactly $difference mismatches.
my $out = '';
for my $mm ( sort { $a <=> $b } keys %mismatch ) {
    if ( scalar @{ $mismatch{ $mm } } == $difference ) {
        $out .= sprintf "mismatched %d %s between gens %d and %d at %s",
            $difference, $times, $mm, $mm - 1, $columns;
        foreach my $c_l_c ( @{ $mismatch{ $mm } } ) {
            $out .= sprintf " %d ( %s and %s ),\n", @$c_l_c;
        }
        $out .= "\n";
    }
}
print $out;

If you need the amount of difference to change or the input strings to change, that's not at all difficult. They could come from the command line or configuration files or a data file of some sort.
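As one possible way to take both from the command line, here is a sketch using the core Getopt::Long module. The --difference option name and the parse_args helper are my own inventions for illustration; anything left over after option parsing is treated as a generation string.

```perl
#!perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: pull --difference N out of an argument list
# and treat whatever is left over as the generation strings.
sub parse_args {
    my @args       = @_;
    my $difference = 1;    # same default as the script above
    GetOptionsFromArray( \@args, 'difference=i' => \$difference )
        or die "usage: $0 [--difference N] STRING...\n";
    return ( $difference, @args );
}

my ( $difference, @generations ) = parse_args(@ARGV);
print "difference: $difference\n";
print "generation $_: $generations[$_]\n" for 0 .. $#generations;
```

You'd call it along the lines of: perl script.pl --difference 2 TTTATTT TTTTTTT TBTTTTT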


In reply to Re: compare initial by mr_mischief
in thread compare initial by dideod.yang
