in reply to Manipulating tab delimited file
(Update: added $species detection from @ARGV)
Here's the code...
    my ($c, $species) = (0, shift());
    while (<>) {
        chomp;
        next unless length;
        my ($seq, $scount) = split /\s+/;
        next if $scount < 2 || length $seq < 15 || length $seq > 30;
        print ">$species" . $c++ . "_count=$scount\n$seq\n";
    }
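To make the filter easier to check, here is a minimal sketch that applies the same tests to a few made-up lines. The helper name format_record and the sample data are hypothetical and are only there for illustration; they are not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the FASTA-style record for one input line,
# or undef if the line is blank or fails the count/length tests.
sub format_record {
    my ($line, $species, $n) = @_;
    chomp $line;
    return undef unless length $line;
    my ($seq, $scount) = split /\s+/, $line;
    return undef if $scount < 2 || length($seq) < 15 || length($seq) > 30;
    return sprintf ">%s%d_count=%s\n%s\n", $species, $n, $scount, $seq;
}

# Made-up sample input: sequence, whitespace, count.
my @lines = (
    "ACGTACGTACGTACGTACGT\t5\n",   # kept: 20 bases, count 5
    "ACGTACGT\t9\n",               # dropped: shorter than 15 bases
    "ACGTACGTACGTACGTACGT\t1\n",   # dropped: count below 2
    "\n",                          # dropped: blank line
);

my $c = 0;
my @records;
for my $line (@lines) {
    my $rec = format_record($line, 'bee', $c);
    next unless defined $rec;
    push @records, $rec;
    $c++;                          # the counter only advances for kept records
}
print @records;
```

Only the first sample line passes all three tests, so the counter stays at 0 for it and a single record is printed.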
If it's named process-sequences, then it could be invoked as:
    process-sequences speciesname inputfilename >outputfilename
It's too bad the filenames aren't named for the species they represent, because then you could do something like this:
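    my ($c, $species) = (0, $ARGV[0]);
    # 'or' rather than '||': '||' would bind to the filename string, so die could never fire.
    open my $outfh, '>', $species . ".new" or die $!;
    while (<>) {
        chomp;
        if (length) {
            my ($seq, $scount) = split /\s+/;
            if ($scount >= 2 && length $seq >= 15 && length $seq <= 30) {
                print $outfh ">$species" . $c++ . "_count=$scount\n$seq\n";
            }
        }
        # eof without parentheses is true at the end of each input file;
        # <> has already shifted that file off @ARGV, so $ARGV[0] is the next one.
        if (eof) {
            close $outfh or die $!;
            last unless @ARGV;    # no more input files, so nothing left to open
            ($species, $c) = ($ARGV[0], 0);
            open $outfh, '>', $species . ".new" or die $!;
        }
    }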
And that would be invoked with a list of filenames, each named for the species:
    process-sequences bird bee cat dog

Dave
Re^2: Manipulating tab delimited file
by andyBio (Novice) on Apr 29, 2016 at 04:41 UTC