So it appears that sections of the file are defined by words in all caps starting in column 0. This actually lends itself pretty well to keeping track of the state (in this case the file section) you're in.
There are many other ways to do it, but this is an example of what I was suggesting:
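    my $state = '';    # current section; empty until the first header is seen
    while (my $line = <$in>) {
        # A run of capitals at the start of a line opens a new section.
        if ($line =~ /^([A-Z]+)/) {
            $state = $1;
        }
        print $line unless $state eq "COMMENT";
    }

which outputs: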
    07:37<sandvik@sat1> ~/perl$ ./pmtest.pl
    LOCUS       4       302276 bp DNA linear HTG 31-OCT-2008
    DEFINITION  Mus musculus chromosome 4 NCBIM37 partial sequence
                138489260..138791535 reannotated via EnsEMBL
    ACCESSION   chromosome:NCBIM37:4:138489260:138791535:-1
    KEYWORDS    .
    SOURCE      house mouse.
      ORGANISM  Mus musculus
                Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
                Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia;
                Sciurognathi; Muroidea; Muridae; Murinae; Mus.
    FEATURES             Location/Qualifiers
         source          1..302276
                         /db_xref="taxon:10090"
                         /organism="Mus musculus"
         gene            complement(267261..268504)
                         /note="locus_tag=Rnf186"
                         /gene="ENSMUSG00000070661"
                         /note="ring finger protein 186 [Source:MGI;Acc:MGI:1914075]
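
If you end up wanting to skip more than one section, the same state idea extends naturally. Here is a rough, self-contained sketch of how that might look. The `filter_sections` helper, the `%skip`-style hash, and the sample text are all made up for illustration (they are not from your script), and it reads from an in-memory string only so that it runs without your data file:

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $text whose current section
# is not listed in the %$skip hash.
sub filter_sections {
    my ($text, $skip) = @_;
    my $state = '';    # no section seen yet
    my @kept;

    # Open the string as an in-memory file so the loop matches the original.
    open my $in, '<', \$text or die "cannot open in-memory file: $!";
    while (my $line = <$in>) {
        # Capitals starting in column 0 mark a new section; indented
        # continuation lines leave $state unchanged.
        if ($line =~ /^([A-Z]+)/) {
            $state = $1;
        }
        push @kept, $line unless $skip->{$state};
    }
    close $in;
    return @kept;
}

# Made-up sample input, loosely shaped like the file in this thread.
my $sample = <<'END';
LOCUS       4
COMMENT     This line is dropped.
            So is this continuation line.
KEYWORDS    .
END

print filter_sections($sample, { COMMENT => 1 });
```

A hash lookup keeps the skip test to one expression no matter how many section names you add, and the indented continuation lines are dropped along with their header because they never change `$state`.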