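I was wandering around SuperSearch looking for something else when I saw this and wondered, "Are the XORs with 'N' different from the XORs between A, C, G, and T?" It turns out they are: every XOR of two different letters from ACGT gives a byte that no XOR involving 'N' can produce. So there is no need for any of the masking in this post; the mismatch bytes can be counted directly in the XOR of the two strings.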

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=766743
use warnings;

my %thexor; # check for N mismatches different from non-N mismatches
for my $x ( qw( A T G C N ) )
  {
  for my $y ( qw( A T G C N ) )
    {
    $x lt $y and $thexor{$x ^ $y} .= "$x$y ";
    }
  }
use Data::Dump 'dd', 'pp';
dd \%thexor; # yes they are

# mismatch "\2\4\6\23\25\27"
# match "\0\t\r\17\32"

local $/ = '';
while( <DATA> )
  {
  my ($x, $y) = split;
  my $bad = ($x ^ $y) =~ tr/\2\4\6\23\25\27//; # therefore this counts mismatches
  print "$x ^ $y => ", pp($x ^ $y), $bad ? ' FAIL' : ' ok', "\n";
  }

__DATA__
ATGNCNC
ATGACNN

ATGNCNC
TTGNNNC

Outputs:

{
  "\2"  => "AC ",
  "\4"  => "CG ",
  "\6"  => "AG ",
  "\t"  => "GN ",
  "\r"  => "CN ",
  "\17" => "AN ",
  "\23" => "GT ",
  "\25" => "AT ",
  "\27" => "CT ",
  "\32" => "NT ",
}
ATGNCNC ^ ATGACNN => "\0\0\0\17\0\0\r" ok
ATGNCNC ^ TTGNNNC => "\25\0\0\0\r\0\0" FAIL
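
For reuse, the counting step can be wrapped in a small sub. This is only a sketch: the name count_mismatches and the length check are my own additions, not something from the thread, and it assumes both strings contain only the uppercase letters A, C, G, T and N. It uses no modules outside core Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread):
# count positions where two equal-length strings differ and neither
# character is N. Assumes input limited to uppercase A, C, G, T, N.
sub count_mismatches {
    my ($x, $y) = @_;
    die "strings must be the same length\n" unless length($x) == length($y);

    # Bytewise string XOR: identical letters give \0, pairs involving N give
    # \t \r \17 \32, and two different letters from ACGT give one of
    # \2 \4 \6 \23 \25 \27.
    my $xor = $x ^ $y;

    # tr with an empty replacement list only counts; it does not modify.
    return $xor =~ tr/\2\4\6\23\25\27//;
}

printf "%s vs %s: %d\n", @$_, count_mismatches(@$_)
    for [ 'ATGNCNC', 'ATGACNN' ], [ 'ATGNCNC', 'TTGNNNC' ];
```

A return value of 0 corresponds to "ok" in the output above, and anything greater than 0 corresponds to "FAIL".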

In reply to Re^2: simple string comparison for efficiency by tybalt89
in thread simple string comparison for efficiency by CaptainF
