A quick and dirty and UNTESTED modification to do what I think you want:

    my $name;
    while ( my $para = <$PROTFILE> ) {
        # Remove fasta header line
        if ( $para =~ s/^>(.*)//m ){
            $name = $1;
        };
        ...
    }
Warning: The requirement to "... get rid of duplicate entries ..." is ambiguous. If there is more than one entry with the same header (i.e., $name), which one is the duplicate (or which ones are, if there are more than two)? The first one? The last one? Etc. The %name_seen check (next FASTA_RECORD if $name_seen{ $name }++) ignores all entries with a given $name after the first one. Also, it might be wise to trim all leading/trailing whitespace from $name before any further processing whatsoever (also untested):

    my $name;
    my %name_seen;  # fasta headers seen so far

    FASTA_RECORD:
    while ( my $para = <$PROTFILE> ) {
        # Remove fasta header line
        if ( $para =~ s/^>(.*)//m ){
            $name = $1;
            next FASTA_RECORD if $name_seen{ $name }++;
        };
        ...
    }
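For what it's worth, here is a minimal self-contained sketch of the two ideas together: trim $name first, then keep only the first record seen for each header. Several things here are assumptions on my part and not from the original code: the sample data, the helper name unique_fasta_records, reading in paragraph mode ($/ = ''), which assumes records are separated by blank lines, and the in-memory filehandle standing in for $PROTFILE.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of $PROTFILE.
# Note the trailing spaces on the first header and the leading space on the third.
my $fasta = ">protA  \nMKT\nAYI\n\n"
          . ">protB\nGSH\n\n"
          . "> protA\nMKT\nAYI\n";

# Returns [ name, sequence ] pairs, keeping only the first record
# seen for each (whitespace-trimmed) header.
sub unique_fasta_records {
    my ($fh) = @_;
    local $/ = '';      # paragraph mode (assumed): one blank-line-separated record per read
    my %name_seen;      # fasta headers seen so far
    my @kept;

    FASTA_RECORD:
    while ( my $para = <$fh> ) {
        # Remove the fasta header line; skip paragraphs that have none
        next FASTA_RECORD unless $para =~ s/^>(.*)\n?//m;
        my $name = $1;
        $name =~ s/^\s+|\s+$//g;                      # trim leading/trailing whitespace
        next FASTA_RECORD if $name_seen{ $name }++;   # ignore all but the first entry
        ( my $seq = $para ) =~ s/\s+//g;              # join the sequence lines
        push @kept, [ $name, $seq ];
    }
    return @kept;
}

open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my @records = unique_fasta_records($fh);
close $fh;

print "$_->[0]\t$_->[1]\n" for @records;
```

Because the trim happens before the %name_seen lookup, "protA  " and " protA" count as the same header, so the third record is dropped. If the last entry should win instead of the first, one option would be to store each record in a hash keyed by $name and let later entries overwrite earlier ones.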
Give a man a fish: <%-{-{-{-<
In reply to Re^9: how to read input from a file, one section at a time?
by AnomalousMonk
in thread how to read input from a file, one section at a time?
by davi54