Re: parsing multiple lines

My preferred approach is basically the same as what has been said above, but I like to use a hash ref to keep track of the parsed information. Another way to look at this problem is that the file is a sequence of records, and after you parse a complete record you want to 'process' it in some fashion. The typical way to do this is:

my $r = {};  # hash to hold the parsed record
while (<IN>) {
  if (/^(\d+).../) { # found beginning of new record
    process($r) if defined $r->{id};   # hand off the previous record, if any
    $r = {};         # begin new record
    $r->{id} = $1;   # populate parsed info from this line
  } elsif (/KEGG.../) {
    $r->{kegg} = ...;
  } elsif ...
  }
}
process($r) if defined $r->{id};       # don't forget the final record

sub process {
  my $r = shift;
  ...
}

By changing the process subroutine you can re-use this code to perform different kinds of analyses on the file.
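To make the re-use point concrete, here is a minimal runnable sketch of the same loop with the processing step passed in as a code ref. The sample data, both regexes, the `desc` field and the name `parse_records` are my own assumptions for illustration; the real file format isn't shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample input: a line starting with digits opens a record,
# and "KEGG: ..." lines add a field to it. Both patterns are assumptions.
my $data = <<'END';
101 first entry
KEGG: K00001
102 second entry
KEGG: K00002
END

# Generic driver: parse records from a filehandle and hand each
# completed record to the supplied callback.
sub parse_records {
    my ($fh, $callback) = @_;
    my $r = {};
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^(\d+)\s+(.*)/) {          # beginning of a new record
            $callback->($r) if defined $r->{id};
            $r = { id => $1, desc => $2 };        # start a fresh record
        } elsif ($line =~ /^KEGG:\s*(\S+)/) {
            $r->{kegg} = $1;
        }
    }
    $callback->($r) if defined $r->{id};          # flush the last record
    return;
}

# First analysis: collect the ids.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @ids;
parse_records($fh, sub { push @ids, $_[0]{id} });
close $fh;

# Second analysis: same parser, different callback.
open $fh, '<', \$data or die "Cannot open in-memory file: $!";
my %kegg_for;
parse_records($fh, sub { my $r = shift; $kegg_for{ $r->{id} } = $r->{kegg} });
close $fh;

print "ids: @ids\n";
print "$_ => $kegg_for{$_}\n" for sort keys %kegg_for;
```

With the sample data this prints `ids: 101 102` followed by one `id => kegg` line per record. The parsing loop is written once; only the anonymous sub changes between the two runs.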


Re^2: parsing multiple lines by GrandFather (Saint) on May 21, 2008 at 21:19 UTC
Why a hash ref rather than a hash? Surely it is simpler and clearer to write:

my %record;
while (<IN>) {
  if (/^(\d+).../) { # found beginning of new record
    process (\%record) if $record{id};
    %record = ();     # Flush old record
    $record{id} = $1; # populate parsed info from this line
    ...

Perl is environmentally friendly - it saves trees
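One practical difference between the two styles, which nobody in this thread has spelled out, shows up only if process keeps the reference it is given (for example by pushing it onto an array). A fresh `{}` per record gives every record its own hash, whereas `\%record` always points at the same hash, which is then cleared and refilled. A small sketch, with made-up ids and array names:

```perl
use strict;
use warnings;

my @kept_refs;    # records stored via a fresh hash ref per record
my @kept_alias;   # references to one reused hash
my @kept_copies;  # shallow copies of the reused hash

# Style 1: a new anonymous hash for every record.
for my $id (1 .. 3) {
    my $r = { id => $id };
    push @kept_refs, $r;
}

# Style 2: one named hash, cleared and refilled for every record.
my %record;
for my $id (1 .. 3) {
    %record = ();
    $record{id} = $id;
    push @kept_alias,  \%record;       # same hash every time
    push @kept_copies, { %record };    # independent shallow copy
}

print join(',', map { $_->{id} } @kept_refs),   "\n";   # 1,2,3
print join(',', map { $_->{id} } @kept_alias),  "\n";   # 3,3,3
print join(',', map { $_->{id} } @kept_copies), "\n";   # 1,2,3
```

If process only reads the record and returns, the two styles behave the same and the plain hash is fine. If it stores records for later, either use the hash ref style or copy with `{ %record }` before storing.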
Re^2: parsing multiple lines by sm2004 (Acolyte) on May 22, 2008 at 00:53 UTC
Thanks a lot. I'm learning from all alternative ideas.