If you really want to look for individual common characters between all possible pairs of 2000 sequences, then you just have to do the work, and there is a lot of it: 2000 sequences give 2000 * 1999 / 2 = 1,999,000 pairs to compare.
It may help to tell us why you want to do that. There may be a better solution to your problem than the brute force search implied so far.
You may find this code interesting, however:
use strict;
use warnings;

my @sequences = qw(ACGCATTCA ACTGGATAC TCAGCCATC);
my %matches;

for my $outer (0 .. $#sequences - 1) {
    for my $inner ($outer + 1 .. $#sequences) {
        my $mask = $sequences[$outer] ^ $sequences[$inner];

        next if index ($mask, "\0") == -1;    # No matching characters

        $mask =~ tr/\0/\xff/c;
        $mask |= $sequences[$outer];
        $mask =~ tr/\xff/./;
        push @{$matches{$mask}}, [$outer + 1, $inner + 1];
    }
}

for my $match (sort keys %matches) {
    print "$match pattern between ",
        join (', ', map {"$_->[0] and $_->[1]"} @{$matches{$match}}), "\n";
}
Prints:
.C....... pattern between 1 and 3
.C.G....C pattern between 2 and 3
AC....T.. pattern between 1 and 2
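To see what the string XOR is doing, here is a minimal sketch of the same masking steps applied to a single pair. The helper name common_mask is hypothetical (it is not part of the code above), and the sketch assumes both sequences have the same length and contain only plain ASCII characters.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a pattern with the shared characters in place
# and '.' everywhere else, or undef when no position matches.
# Assumes equal-length, ASCII-only sequences.
sub common_mask {
    my ($first, $second) = @_;

    # Bitwise string XOR: equal characters give "\0", different ones do not.
    my $mask = $first ^ $second;
    return if index($mask, "\0") == -1;    # no position matches

    $mask =~ tr/\0/\xff/c;    # every non-matching position becomes \xff
    $mask |= $first;          # "\0" | char is char; \xff | char stays \xff
    $mask =~ tr/\xff/./;      # show the mismatches as '.'
    return $mask;
}

print common_mask('ACGCATTCA', 'ACTGGATAC'), "\n";    # prints AC....T..
```

One caveat: this relies on ^ and | acting as string operators when both operands are strings. Under the "bitwise" feature (which use v5.28 or later turns on), those operators become numeric and you would need ^. and |. instead.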
In reply to Re: pattern finding algorithm by GrandFather, in thread pattern finding algorithm by kdt2006