in reply to Defining substring matches
I wish to use the substr() function to search for the particular motif.
substr is *not* designed for (nor capable of) searching for anything; so why are you specifying that particular function?
You've defined your IUPAC codes in terms of regex character classes; so why are you eschewing the regex engine?
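To make the distinction concrete, here is a minimal sketch. The sequence and motif are made up for illustration and are not taken from your data. substr only extracts text at an offset you already know, whereas a pattern built from character classes locates the motif for you:

```perl
use strict;
use warnings;

# Hypothetical sequence, for illustration only.
my $seq = 'TTGACGTAACC';

# substr extracts by position: you must already know the offset and length.
my $piece = substr $seq, 2, 4;

# A regex finds the position for you. The motif 'RACY' written as
# character classes: R (A or G) is [AG], Y (C or T) is [CT].
my $pos = $seq =~ m/[AG]AC[CT]/ ? $-[0] : -1;

print "substr gave '$piece'; regex matched at offset $pos\n";
```

Note that the text substr happens to return here ('GACG') is not a match for the motif at all; nothing in substr checks that.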
Given your table, it is trivial to convert IUPAC codes into a regex and use the regex engine to search your fasta file:
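    use strict;
    use warnings;

    ## IUPAC nucleotide codes mapped to regex character classes
    my %IUPAC = (
        A => '[A]',   C => '[C]',   G => '[G]',   T => '[T]',
        R => '[AG]',  Y => '[CT]',  M => '[AC]',  K => '[GT]',
        W => '[AT]',  S => '[GC]',  B => '[CGT]', D => '[AGT]',
        H => '[ACT]', V => '[ACG]', N => '[ACGT]',
    );

    my( $file, $motif ) = @ARGV;

    ## build the regex by replacing each code with its character class
    my $re = join '', map $IUPAC{ $_ }, split '', $motif;

    open FASTA, '<', $file or die "$file: $!";
    getc( FASTA );                       ## discard first '>'

    until( eof( FASTA ) ) {
        chomp( my $id = <FASTA> );       ## read ident
        ## read up to the next '>' (or EOF), then strip newlines and the '>'
        my $seq = do{ local $/ = '>'; <FASTA> };
        $seq =~ tr[\n>][]d;
        while( $seq =~ m[($re)]g ) {
            ## pass $1 and $id as arguments so a '%' in them cannot be
            ## misread as a format directive; $-[0] is the zero-based offset
            printf "Found: '%s' at '%s':%d\n", $1, $id, $-[0];
        }
    }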
NB: The above is untested code typed directly into my browser.
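If you want to try the conversion without a FASTA file to hand, the following is a minimal sketch that runs against an in-memory string. The helper names motif_to_regex and find_motif are hypothetical, invented for this example. It also makes two assumptions the code above does not: it dies on a letter that is not in the table, and it wraps the pattern in a lookahead so that overlapping occurrences are reported as well (a plain m/($re)/g resumes after the end of each match and so skips them).

```perl
use strict;
use warnings;

# Same table as above.
my %IUPAC = (
    A => '[A]',   C => '[C]',   G => '[G]',   T => '[T]',
    R => '[AG]',  Y => '[CT]',  M => '[AC]',  K => '[GT]',
    W => '[AT]',  S => '[GC]',  B => '[CGT]', D => '[AGT]',
    H => '[ACT]', V => '[ACG]', N => '[ACGT]',
);

# Hypothetical helper: turn an IUPAC motif into a regex source string.
sub motif_to_regex {
    my( $motif ) = @_;
    my $re = '';
    for my $code ( split //, uc $motif ) {
        die "Unknown IUPAC code '$code'\n" unless exists $IUPAC{ $code };
        $re .= $IUPAC{ $code };
    }
    return $re;
}

# Hypothetical helper: return [ offset, matched text ] pairs.
# The lookahead itself consumes nothing, so the engine retries at every
# position and overlapping matches are captured too.
sub find_motif {
    my( $seq, $motif ) = @_;
    my $re = motif_to_regex( $motif );
    my @hits;
    while( $seq =~ m/(?=($re))/g ) {
        push @hits, [ $-[1], $1 ];   # $-[1] is where capture group 1 starts
    }
    return @hits;
}

# 'AW' means A followed by A or T; it occurs at offsets 0 and 1 of 'AAAC'.
printf "%d: %s\n", @$_ for find_motif( 'AAAC', 'AW' );
```

Whether you want overlapping hits reported depends on what you are counting; drop the `(?= )` wrapper and use `$-[0]` to get the non-overlapping behaviour of the code above.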