in reply to file parsing

Perhaps it'd be better to invest in a somewhat more generic parser, something like this:

    my (@records, $cur);
    while (<>) {
        chomp;
        if ($_ eq "//") {
            # "//" terminates a record
            push @records, $cur if defined $cur;
            $cur = undef;
        }
        elsif (/^(.+?) - (.+)$/) {
            my ($key, $value) = ($1, $2);
            if (defined $cur->{$key}) {
                # repeated key: collect the values in an array ref
                if (ref $cur->{$key}) { push @{ $cur->{$key} }, $value }
                else                  { $cur->{$key} = [ $cur->{$key}, $value ] }
            }
            else {
                $cur->{$key} = $value;
            }
        }
        else {
            warn "didn't handle input line: $_";
        }
    }
    # keep the last record if the input doesn't end with "//"
    push @records, $cur if defined $cur;

Note that changing @records into a hash keyed by UNIQUE-ID is as simple as:

    my %records = map { $_->{'UNIQUE-ID'} => $_ } @records;
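For what it's worth, here is a minimal sketch of using that hash. It assumes @records has the shape the parser builds; the second record and the extra grep are my own additions for illustration, not part of the one-liner. Keep in mind that records sharing a UNIQUE-ID would silently overwrite each other in the hash.

```perl
use strict;
use warnings;

# Stand-in for what the parser produces; the second record is hypothetical.
my @records = (
    { 'UNIQUE-ID' => 'EG11751', 'COMMON-NAME' => 'otsA' },
    { 'UNIQUE-ID' => 'EG00000', 'COMMON-NAME' => 'demo' },   # made-up entry
);

# Skip records without a UNIQUE-ID so nothing ends up under an undef key.
my %records = map  { $_->{'UNIQUE-ID'} => $_ }
              grep { defined $_->{'UNIQUE-ID'} } @records;

# Look a record up by ID and read one of its fields.
print $records{'EG11751'}{'COMMON-NAME'}, "\n";   # prints "otsA"
```

The hash values are the same references that sit in @records, so changing a record through %records changes it in @records as well.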

Output of the above code for your example input (dumped with Data::Dumper):

    $VAR1 = [
      {
        'ACCESSION-2' => 'ECK1895',
        'LEFT-END-POSITION' => '1978212',
        'RIGHT-END-POSITION' => '1979636',
        'UNIQUE-ID' => 'EG11751',
        'LAST-UPDATE' => '3609256889',
        'KNOCKOUT-GROWTH-OBSERVATIONS' => [
          'OBS0-40',
          'OBS0-37',
          'OBS0-33',
          'OBS0-49',
          'OBS0-44'
        ],
        'COMMENT-INTERNAL' => '1/24/05 keseler removed pexA as synonym',
        'COMMON-NAME' => 'otsA',
        'DBLINKS' => [
          '(ECOLIHUB "otsA" NIL |kr| 3474243543 NIL NIL)',
          '(REGULONDB "EG11751" NIL |kr| 3462030648 NIL NIL)',
          '(ASAP "ABE-0006318" NIL |paley| 3398447608 NIL NIL)',
          '(ECHOBASE "EB1701" NIL |pkarp| 3346767936 NIL NIL)',
          '(ECOGENE "EG11751" NIL |pick| 3292798423 NIL NIL)',
          '(OU-MICROARRAY "b1896" NIL NIL NIL NIL NIL)',
          '(CGSC "18073" NIL |pkarp| 3035559680 NIL NIL)'
        ],
        'PRODUCT' => 'TREHALOSE6PSYN-MONOMER',
        'ACCESSION-1' => 'b1896',
        'CENTISOME-POSITION' => '42.636864 ',
        'TRANSCRIPTION-DIRECTION' => '-',
        'TYPES' => [
          'BC-5.5.2',
          'BC-1.7.9',
          'BC-5.5.1'
        ],
        'MEMBER-SORT-FN' => 'NUMBERED-CLASS-SORT-FN',
        'COMPONENT-OF' => [
          'COLI-K12-39',
          'TU0-7722',
          'TU00391',
          'TU00312'
        ]
      }
    ];
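As the dump shows, a field holds a plain string when the key occurred once and an array reference when it repeated, so code reading the records has to cope with both. One way to hide that difference is a small accessor; values_of below is a hypothetical helper name of mine, not something from the parser above, and the record is a trimmed copy of the one in the dump.

```perl
use strict;
use warnings;

# Hypothetical helper: return a field's values as a list, whether the parser
# stored a plain scalar, an array reference, or nothing at all.
sub values_of {
    my ($record, $key) = @_;
    my $value = $record->{$key};
    return ()        unless defined $value;
    return @{$value} if ref($value) eq 'ARRAY';
    return ($value);
}

# Trimmed copy of the record from the dump.
my $rec = {
    'COMMON-NAME' => 'otsA',
    'TYPES'       => [ 'BC-5.5.2', 'BC-1.7.9', 'BC-5.5.1' ],
};

print join(', ', values_of($rec, 'TYPES')), "\n";         # BC-5.5.2, BC-1.7.9, BC-5.5.1
print join(', ', values_of($rec, 'COMMON-NAME')), "\n";   # otsA
```

The alternative would be to have the parser always store array references, which makes access uniform at the cost of a noisier dump for single-valued fields.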