in reply to Extract the matching strings

Perhaps you should consider writing a parser for this format - always a good idea if you're going to be dealing with it on a regular basis. Example:

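#!/usr/bin/perl -w
use strict;

open my $Gpff, '<', 'protein.gpff' or die "protein.gpff: $!\n";

my ($key, %data);
while (<$Gpff>){
    chomp;
    # A label is an all-caps word at column 0 or indented by two spaces
    if (/^(?:\s\s)?([A-Z]+)\s+(.*)$/){
        $key = $1;
        $data{$key} = $2;
    }
    # Anything else is a continuation of the most recent label
    elsif (defined $key){
        s/^\s+/ /;
        $data{$key} .= $_;
    }
}
close $Gpff;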

Given the above, you can now easily extract the data that you want by its label; that is, printing $data{SOURCE} will output "mitochondrion Ephydatia muelleri". You may need to adjust the parser to suit your exact application, but this should give you a good start.
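If you want to try the idea without the real file, here is a minimal sketch of the same loop wrapped in a sub and fed from an in-memory string. The sub name parse_record and the sample text are my own inventions for illustration (the sample is hand-written and abbreviated, not a real record), and stopping at the '//' record terminator is an addition that the parser above does not make.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_record is a hypothetical helper name: it reads one record from an
# open filehandle and returns a hash reference of label => text.
sub parse_record {
    my ($fh) = @_;
    my ($key, %data);
    while (my $line = <$fh>) {
        chomp $line;
        last if $line =~ m{^//};    # assumption: '//' marks the end of a record
        if ($line =~ /^(?:\s\s)?([A-Z]+)\s+(.*)$/) {
            # New label: remember it and start its value
            $key = $1;
            $data{$key} = $2;
        }
        elsif (defined $key) {
            # Continuation line: collapse the indent to one space and append
            $line =~ s/^\s+/ /;
            $data{$key} .= $line;
        }
    }
    return \%data;
}

# Hand-written, abbreviated sample for illustration only.
my $sample = <<'END';
SOURCE      mitochondrion Ephydatia muelleri
  ORGANISM  Ephydatia muelleri
            Eukaryota; Metazoa;
            Porifera.
//
END

# Perl can open a filehandle on a scalar reference, so no temp file is needed
open my $fh, '<', \$sample or die "in-memory open failed: $!\n";
my $rec = parse_record($fh);
close $fh;

print "$rec->{SOURCE}\n";      # mitochondrion Ephydatia muelleri
print "$rec->{ORGANISM}\n";    # Ephydatia muelleri Eukaryota; Metazoa; Porifera.
```

Note that a label that occurs more than once in a record overwrites the earlier value with this approach, so if you need every occurrence you would have to store a list per label instead.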


--
"Language shapes the way we think, and determines what we can think about."
-- B. L. Whorf