in reply to Re^8: how to read input from a file, one section at a time?
A quick and dirty and UNTESTED modification to do what I think you want:

    my $name;
    while ( my $para = <$PROTFILE> ) {
        # Remove fasta header line
        if ( $para =~ s/^>(.*)//m ){
            $name = $1;
        };
        ...
    }
Warning: The requirement to "... get rid of duplicate entries ..." is ambiguous. If there is more than one entry with the same header (i.e., $name), which is (or are, if there are more than two) the duplicate(s)? The first one? The last one? Etc. Also, it might be wise to trim all leading/trailing whitespace from $name before any further processing whatsoever. The code modification below ignores all entries with a given $name after the first one (also untested):

    my $name;
    my %name_seen;  # fasta headers seen so far
    FASTA_RECORD:
    while ( my $para = <$PROTFILE> ) {
        # Remove fasta header line
        if ( $para =~ s/^>(.*)//m ){
            $name = $1;
            next FASTA_RECORD if $name_seen{ $name }++;
        };
        ...
    }
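For what it's worth, here is a minimal self-contained sketch that combines the two suggestions (skip repeated headers, trim $name first). It rests on several assumptions that are mine and not from the thread: records are separated by blank lines and read in paragraph mode ($/ = ''), paragraphs without a ">" header are skipped, and the sequence lines are joined by stripping all whitespace. The sub name read_unique_fasta and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a list of
# [ name, sequence ] pairs, keeping only the FIRST record per header.
sub read_unique_fasta {
    my ($fh) = @_;

    local $/ = '';    # assumption: paragraph mode, blank-line separated records
    my %name_seen;    # fasta headers seen so far
    my @records;

    FASTA_RECORD:
    while ( my $para = <$fh> ) {
        # Remove fasta header line; skip paragraphs that have none (assumption)
        next FASTA_RECORD unless $para =~ s/^>(.*)//m;
        my $name = $1;

        # Trim leading/trailing whitespace before using $name as a key
        $name =~ s/^\s+|\s+$//g;

        # Ignore every entry with this header after the first one
        next FASTA_RECORD if $name_seen{$name}++;

        $para =~ s/\s+//g;    # assumption: join sequence lines into one string
        push @records, [ $name, $para ];
    }
    return @records;
}

# Made-up sample: 'prot1' appears twice, 'prot2 ' has trailing whitespace
my $sample = ">prot1\nMKV\nLAA\n\n>prot2 \nGGH\n\n>prot1\nTTT\n";

open my $PROTFILE, '<', \$sample
    or die "Cannot open in-memory file: $!";
my @records = read_unique_fasta($PROTFILE);
close $PROTFILE;

print "$_->[0]\t$_->[1]\n" for @records;
```

With the sample above, the second prot1 record is dropped and "prot2 " is stored as "prot2". If you would rather keep the last entry for a header, one option is to store each record in a hash keyed on $name and let later entries overwrite earlier ones, at the cost of losing the input order.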
Give a man a fish: <%-{-{-{-<