Re: Help to build a REGEXP (BioPerl)

Re^2: Help to build a REGEXP (BioPerl) by Anonymous Monk on Mar 11, 2014 at 23:43 UTC
It's supposed to be for an assignment and we must use REGEXPS... The sequence, as you can see, is spread over multiple lines, which is why I tried to capture everything from `/translation` all the way to the first occurrence of `exon` in the file...
Re^3: Help to build a REGEXP (BioPerl) by Kenosis (Priest) on Mar 11, 2014 at 23:59 UTC
"It's supposed to be for an assignment and we must use REGEXPS..." That's akin to being asked to do a gainer off a diving board when just learning to swim, especially if you're in bioinformatics. In my experience, it would be more pedagogically sound to first learn to wield the (BioPerl) tools proficiently, then learn how to forge such tools... If you must use a regex in your script, however, perhaps the following will be helpful: `use strict; use warnings; use Bio::SeqIO; my $filename = 'sequences.gen'; my $stream = Bio::SeqIO->new( -file => $filename, -format => 'GenBank' ); while ( my $seq = $stream->next_seq() ) { my $trans = $seq->translate(); print $trans->seq(), "\n"; } my $string = 'This script uses a regex.'; $string =~ s/uses/doesn't use/; print $string;`
Re^4: Help to build a REGEXP (BioPerl) by erix (Prior) on Mar 12, 2014 at 00:20 UTC
Nice, but that doesn't work (because the text used by the OP does not constitute a valid GenBank record). That could be worked around by getting the complete record, I guess. But the wrath of the teacher needs to be deflected too. Perhaps make the regex a (quoted) multiline capture? :)
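As a rough sketch of "getting the complete record" without BioPerl: GenBank flat files end each record with a line containing only `//`, so setting the input record separator `$/` to `"//\n"` makes each read return one whole record. The two-record input below is invented for illustration and is read from an in-memory filehandle; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Hypothetical two-record input; real GenBank records are far longer.
my $data = <<'END';
LOCUS       REC1
     CDS             1..9
                     /translation="MKV
                     LAA"
//
LOCUS       REC2
     CDS             1..6
                     /translation="MT"
//
END

# Open the string as a file so the sketch needs no file on disk.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "//\n";    # each <$fh> now returns one whole record
    while ( my $record = <$fh> ) {
        push @records, $record;
    }
}
close $fh;

printf "Read %d records\n", scalar @records;
```

With the whole record in one scalar, a single regex can then span the line breaks inside the `/translation` value.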
Re^5: Help to build a REGEXP (BioPerl) by erix (Prior) on Mar 12, 2014 at 00:42 UTC
Re^6: Help to build a REGEXP (BioPerl) by Kenosis (Priest) on Mar 12, 2014 at 02:10 UTC
Re^5: Help to build a REGEXP (BioPerl) by Kenosis (Priest) on Mar 12, 2014 at 00:32 UTC
Re^3: Help to build a REGEXP (BioPerl) by Anonymous Monk on Mar 11, 2014 at 23:47 UTC
Still, this doesn't seem to work... `if($line7=~/^\s+\/translation\=\"(.*?)\"/s) {$amino_acid_seq=$1;}`
Re^4: Help to build a REGEXP (m//ms) by Anonymous Monk on Mar 12, 2014 at 00:21 UTC
Try m//ms instead of //gs. Also, `use re 'debug';` lets you see how the regex engine matches your string ... you can also use rxrx, a command-line REPL and wrapper for Regexp::Debugger. Also interesting (but a tad more of a PITA to install) is wxPPIxregexplain.pl
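A minimal sketch of the m//ms suggestion, assuming the whole feature block is already in one string (the excerpt and its sequence are made up for illustration). Without /m, `^` matches only at the very start of the string, so the `/translation` line is never reached. Without /s, `.` stops at the first newline.

```perl
use strict;
use warnings;

# Hypothetical excerpt; the sequence is invented for illustration.
my $record = <<'END';
     CDS             join(100..200,300..400)
                     /gene="abc"
                     /translation="MKVLAAGIVG
                     LLLAQPSW"
     exon            100..200
     exon            300..400
END

my $amino_acid_seq;

# /m lets ^ match at the start of any line; /s lets . match newlines,
# so (.*?) can run across line breaks up to the first closing quote.
if ( $record =~ m/^\s+\/translation="(.*?)"/ms ) {
    $amino_acid_seq = $1;
    $amino_acid_seq =~ s/\s+//g;    # drop line breaks and indentation
}

print "$amino_acid_seq\n";
```

Because the capture stops at the closing quote, there is no need to anchor on `exon` at all.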
Re^2: Help to build a REGEXP by Anonymous Monk on Mar 11, 2014 at 23:43 UTC
Also, once you get more than one line into $line7, you want non-greedy matching (`.*?`), as there are multiple "exon" strings. You also don't want to use m//g in scalar context. And perlrequick is a great quick reference :)
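Both points can be seen in a small sketch; the strings below are invented for illustration. A greedy `.*` runs to the last `exon`, while `.*?` stops at the first. A m//g match in scalar context remembers where it stopped (see `pos`), so running the same match again on the same string picks up from there instead of starting over.

```perl
use strict;
use warnings;

# Invented input with two "exon" lines after the translation.
my $text = qq{/translation="MKV"\nexon 1..10\nexon 20..30\n};

# Greedy: .* grabs as much as possible, so it stops at the LAST "exon".
my ($greedy) = $text =~ m/translation=(.*)exon/s;

# Non-greedy: .*? grabs as little as possible, so it stops at the FIRST.
my ($lazy) = $text =~ m/translation=(.*?)exon/s;

# m//g in scalar context resumes from the previous match position.
my $s = "exon A exon B";
$s =~ m/exon (\w)/g;
my $first = $1;     # matched the first "exon"
$s =~ m/exon (\w)/g;
my $second = $1;    # same regex, but it continued from pos($s)

print "greedy captured ", length($greedy), " chars\n";
print "lazy captured ",   length($lazy),   " chars\n";
print "first=$first second=$second\n";
```

Dropping the /g (or resetting `pos`) makes each match start from the beginning of the string again.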