(Update: added $species detection from @ARGV)
Here's the code...
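    # The first argument is the species name; the remaining arguments are input files.
    my ($c, $species) = (0, shift());
    while (<>) {
        chomp;
        next unless length;
        my ($seq, $scount) = split /\s+/;
        # Keep sequences counted at least twice and 15 to 30 characters long.
        next if $scount < 2 || length($seq) < 15 || length($seq) > 30;
        print ">$species" . $c++ . "_count=$scount\n$seq\n";
    }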
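To see what the filter keeps and what it drops, here is a minimal sketch that applies the same rule to a few made-up lines. The subroutine name format_records and the sample data are assumptions for illustration only; they are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: same rule as the script above
# (count >= 2, sequence length 15 to 30), returning the formatted records.
sub format_records {
    my ($species, @lines) = @_;
    my $c = 0;
    my @out;
    for my $line (@lines) {
        chomp $line;
        next unless length $line;
        my ($seq, $scount) = split /\s+/, $line;
        next if $scount < 2 || length($seq) < 15 || length($seq) > 30;
        push @out, ">$species" . $c++ . "_count=$scount\n$seq\n";
    }
    return @out;
}

# Made-up sample input: sequence, whitespace, count.
my @sample = (
    "ACGTACGTACGTACGTAC\t5\n",     # 18 characters, count 5: kept
    "ACGT\t9\n",                   # too short: skipped
    "\n",                          # blank line: skipped
    "ACGTACGTACGTACGTACGT\t1\n",   # count below 2: skipped
);

print format_records('bird', @sample);
```

Only the first sample line passes all three checks, so a single record is printed, numbered from 0.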
If the script is named process-sequences, then it could be invoked as:
    process-sequences speciesname inputfilename >outputfilename
It's too bad the filenames aren't named for the species they represent, because then you could do something like this:
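    my ($c, $species) = (0, $ARGV[0]);
    # Use "or", not "||": "||" binds to the filename string, so die would never run.
    open my $outfh, '>', $species . ".new" or die $!;
    while (<>) {
        chomp;
        if (length) {
            my ($seq, $scount) = split /\s+/;
            if ($scount >= 2 && length($seq) >= 15 && length($seq) <= 30) {
                print $outfh ">$species" . $c++ . "_count=$scount\n$seq\n";
            }
        }
        # eof without parentheses is true at the end of each input file.
        if (eof) {
            close $outfh or die $!;
            # @ARGV holds the files not yet read, so $ARGV[0] is the next species.
            if (@ARGV) {
                ($species, $c) = ($ARGV[0], 0);
                open $outfh, '>', $species . ".new" or die $!;
            }
        }
    }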
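The per-file switch depends on noticing when one input file ends and the next begins. As a minimal sketch, the snippet below writes two throwaway files (the names and contents are made up, created with File::Temp) and counts lines per file. It uses eof without parentheses, which is true at the end of each file read through <>, whereas eof() with empty parentheses is true only at the end of the last file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create two small throwaway input files (hypothetical names and contents).
my $dir = tempdir(CLEANUP => 1);
my %content = (bird => "a\nb\n", bee => "c\n");
for my $name (sort keys %content) {
    open my $fh, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$fh} $content{$name};
    close $fh or die $!;
}

# Read both files through <>, as the script above does.
@ARGV = map { "$dir/$_" } qw(bird bee);

my (%lines_in, @boundaries);
while (<>) {
    $lines_in{$ARGV}++;               # $ARGV is the name of the current file
    push @boundaries, $ARGV if eof;   # true on the last line of each file
}

print "$_: $lines_in{$_} line(s)\n" for @boundaries;
```

Each file name is recorded once, at its own last line, which is the point where the script above closes one output file and opens the next.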
That multi-file version would be invoked with a list of filenames, each named for the species, and would write one output file per species (bird.new, bee.new, and so on):
    process-sequences bird bee cat dog
Dave
In reply to Re: Manipulating tab delimited file
by davido
in thread Manipulating tab delimited file
by andyBio