in reply to Complicated pattern match
Even this falls short on "actual" data though: working against GATAGCATGGAGGCCATCGATAACGCGAATTTGAATTTGAAT it produces G AT A G CAT G G AG GCCA TCGA TAA CG CG AATTTGAA TTT GAAT..#!/usr/bin/perl -w use strict; use Algorithm::Diff qw(traverse_sequences); # construct arrayrefs containing an array of single chars my ($A, $B) = map [/(.)/sg], qw( ATGGAGTCGACGAATTTGAAGAAT xxxxxxATGGAGyxxxTCGAzxxxxCGAATTTGAAxxwGAAT ); my $prev = ''; my @seq; traverse_sequences( $A, $B, { MATCH => sub { my ($aidx, $bidx) = @_; if('=' ne $prev) { push @seq, ''; $prev = '='; } $seq[-1] .= $A->[$aidx]; }, DISCARD_A => sub { die "Sequence in A is not fully contained i +n B" }, DISCARD_B => sub { my ($aidx, $bidx) = @_; if('!' ne $prev) { push @seq, ''; $prev = '!'; } $seq[-1] .= $B->[$bidx]; }, }, ); print "@seq\n"; __END__ xxxxxx ATGGAG yxxx TCGA zxxxx CGAATTTGAA xxw GAAT
Makeshifts last the longest.
|
|---|