in reply to Re^3: Finding Neighbours of a String
in thread Finding Neighbours of a String

Dear Aristotle,

Sorry, there is a slight glitch here. I was working on your last modified code, shown below. It works fine in 99% of cases, except when the given string is in bracketed format.
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
use Carp;
use Algorithm::Combinatorics qw( combinations );
use Set::CrossProduct;

my $str1 = '[TA]TTCGG';
my $e    = 2;
find_nb( $str1, $e );

sub find_nb {
    my ( $str, $d ) = @_;

    my @base = $str =~ /\G ( \[ [^][]+ \] | [^][] ) /xg;
    #my @base = split //, $str;

    for my $exact_distance ( 1 .. $d ) {
        my $change_idx_iter = combinations( [ 0 .. $#base ], $exact_distance );
        while ( my $change_idx = $change_idx_iter->next ) {
            my @base_combo = map {
                my $i = $_;
                [ grep { $base[$i] !~ $_ } qw( A T C G ) ];
                #[ grep { $base[$i] ne $_ } qw( A T C G ) ];
            } @$change_idx;
            push @base_combo, [0] if $exact_distance == 1;

            my $bases_iter = Set::CrossProduct->new( \@base_combo );
            my @neighbour  = @base;
            while ( my $new_bases = $bases_iter->get ) {
                @neighbour[@$change_idx] = @$new_bases;
                #$_ = "[$_]" for @neighbour[@$change_idx];
                my $str = join( "", @neighbour );
                print "$str\n";
            }
        }
    }
    return;
}
Why doesn't my modification above produce the following list?
I generated it with a brute-force enumeration method and sorted it with the Unix "sort" command.
AAACGG AACCGG AAGCGG AATAGG AATCAG AATCCG AATCGA AATCGC AATCGG AATCGT AATCTG AATGGG AATTGG ACACGG ACCCGG ACGCGG ACTAGG ACTCAG ACTCCG ACTCGA ACTCGC ACTCGG ACTCGT ACTCTG ACTGGG ACTTGG AGACGG AGCCGG AGGCGG AGTAGG AGTCAG AGTCCG AGTCGA AGTCGC AGTCGG AGTCGT AGTCTG AGTGGG AGTTGG ATAAGG ATACAG ATACCG ATACGA ATACGC ATACGG ATACGT ATACTG ATAGGG ATATGG ATCAGG ATCCAG ATCCCG ATCCGA ATCCGC ATCCGG ATCCGT ATCCTG ATCGGG ATCTGG ATGAGG ATGCAG ATGCCG ATGCGA ATGCGC ATGCGG ATGCGT ATGCTG ATGGGG ATGTGG ATTAAG ATTACG ATTAGA ATTAGC ATTAGG ATTAGT ATTATG ATTCAA ATTCAC ATTCAG ATTCAT ATTCCA ATTCCC ATTCCG ATTCCT ATTCGA ATTCGC ATTCGT ATTCTA ATTCTC ATTCTG ATTCTT ATTGAG ATTGCG ATTGGA ATTGGC ATTGGG ATTGGT ATTGTG ATTTAG ATTTCG ATTTGA ATTTGC ATTTGG ATTTGT ATTTTG CATCGG CCTCGG CGTCGG CTACGG CTCCGG CTGCGG CTTAGG CTTCAG CTTCCG CTTCGA CTTCGC CTTCGG CTTCGT CTTCTG CTTGGG CTTTGG GATCGG GCTCGG GGTCGG GTACGG GTCCGG GTGCGG GTTAGG GTTCAG GTTCCG GTTCGA GTTCGC GTTCGG GTTCGT GTTCTG GTTGGG GTTTGG TAACGG TACCGG TAGCGG TATAGG TATCAG TATCCG TATCGA TATCGC TATCGG TATCGT TATCTG TATGGG TATTGG TCACGG TCCCGG TCGCGG TCTAGG TCTCAG TCTCCG TCTCGA TCTCGC TCTCGG TCTCGT TCTCTG TCTGGG TCTTGG TGACGG TGCCGG TGGCGG TGTAGG TGTCAG TGTCCG TGTCGA TGTCGC TGTCGG TGTCGT TGTCTG TGTGGG TGTTGG TTAAGG TTACAG TTACCG TTACGA TTACGC TTACGG TTACGT TTACTG TTAGGG TTATGG TTCAGG TTCCAG TTCCCG TTCCGA TTCCGC TTCCGG TTCCGT TTCCTG TTCGGG TTCTGG TTGAGG TTGCAG TTGCCG TTGCGA TTGCGC TTGCGG TTGCGT TTGCTG TTGGGG TTGTGG TTTAAG TTTACG TTTAGA TTTAGC TTTAGG TTTAGT TTTATG TTTCAA TTTCAC TTTCAG TTTCAT TTTCCA TTTCCC TTTCCG TTTCCT TTTCGA TTTCGC TTTCGT TTTCTA TTTCTC TTTCTG TTTCTT TTTGAG TTTGCG TTTGGA TTTGGC TTTGGG TTTGGT TTTGTG TTTTAG TTTTCG TTTTGA TTTTGC TTTTGG TTTTGT TTTTTG
So the output should always be without brackets. Currently one of the entries appears like this: [TA]TTTTG. Instead, this kind of string needs to be listed as separate strings:
TTTTTG ATTTTG
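
To make the desired representation concrete, here is a minimal sketch of the expansion step. It is only an illustration, not a fix to the code above: expand_brackets is a hypothetical helper name, and it assumes brackets are never nested and always hold one or more single-letter alternatives.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): expand a string such
# as '[TA]TTTTG' into every plain string it stands for.
sub expand_brackets {
    my ($str) = @_;

    # Tokenise: either a bracketed group like '[TA]' or a single base.
    my @tokens = $str =~ /\G ( \[ [^][]+ \] | [^][] ) /xg;

    my @expanded = ('');
    for my $token (@tokens) {
        # Drop the brackets (if any) and list the alternatives.
        ( my $chars = $token ) =~ tr/[]//d;
        my @alternatives = split //, $chars;

        # Append each alternative to every prefix built so far.
        @expanded = map {
            my $prefix = $_;
            map { $prefix . $_ } @alternatives
        } @expanded;
    }
    return @expanded;
}

print "$_\n" for expand_brackets('[TA]TTTTG');
```

With the input above this prints TTTTTG and ATTTTG, one per line, in the order the alternatives appear inside the brackets.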
Is there anything I can do to fix it? I really hope to hear from you again, since your solution is very important to me.

Here is my brute-force code that generates the result above.
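
As an illustration of the idea, a brute-force enumeration of this kind can be sketched as follows. This is a reconstruction under stated assumptions, not the original script: brute_force_nb is a hypothetical name, the alphabet is assumed to be A, C, G, T, and a position counts as a mismatch only when its base is not one of the alternatives allowed there (so both ATTCGG and TTTCGG are at distance 0 from '[TA]TTCGG' and are left out).

```perl
use strict;
use warnings;

# Hypothetical brute-force enumerator: list every plain string whose
# distance from the (possibly bracketed) pattern is between 1 and $d.
sub brute_force_nb {
    my ( $str, $d ) = @_;

    # Tokenise into bracketed groups and single bases.
    my @base     = $str =~ /\G ( \[ [^][]+ \] | [^][] ) /xg;
    my @alphabet = qw( A C G T );

    # Build every plain string of the same length over the alphabet.
    my @candidates = ('');
    for ( 1 .. @base ) {
        @candidates = map {
            my $prefix = $_;
            map { $prefix . $_ } @alphabet
        } @candidates;
    }

    my @neighbours;
    for my $cand (@candidates) {
        my @chars = split //, $cand;

        # Count positions whose base is not among the allowed ones.
        my $dist = grep { index( $base[$_], $chars[$_] ) < 0 } 0 .. $#base;
        push @neighbours, $cand if $dist >= 1 && $dist <= $d;
    }

    # Candidates were built in alphabetical order, so this list is sorted.
    return @neighbours;
}

print "$_\n" for brute_force_nb( '[TA]TTCGG', 2 );
```

The cost grows as 4 to the power of the string length, so this is only practical for short strings, but it gives a bracket-free reference list to compare other output against.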

Regards,
Edward

Re^5: Finding Neighbours of a String
by Aristotle (Chancellor) on Mar 03, 2006 at 20:15 UTC