Re: Parsing line by line

Two issues:

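1. Don't declare the lexical variables inside the loop. A my declaration inside the loop body creates new variables on every iteration, i.e. for each input line, so a value captured on one line is gone by the time the next line is read. Move the declaration before the while.
2. After adding a new record to the hash, clear the variable that triggers the assignment. Otherwise $cycLoc14 stays defined and the defined() check keeps firing on the lines that follow. It's enough to add the following line after the assignment: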
```
                undef $cycLoc14;
```
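To see why the first point matters, here is a minimal sketch. The sub name name_at_key, the variable names and the two sample lines are made up for illustration and are not taken from the original script. It compares a variable declared inside the loop body with one declared before the loop:

```perl
use strict;
use warnings;

# Hypothetical two-line input: the name arrives one line before the key.
my @lines = ("NAME - focA", "KEY - STM14_1100");

# Returns the name that is visible when the KEY line is reached,
# or the string 'undef' if nothing is visible at that point.
sub name_at_key {
    my ($declare_inside) = @_;
    my $outer_name;                 # declared once, survives across iterations
    my $result = 'undef';
    for my $line (@lines) {
        my $inner_name;             # fresh (undefined) on every iteration
        if ($line =~ /^NAME - (\S+)/) {
            $inner_name = $1;
            $outer_name = $1;
        }
        elsif ($line =~ /^KEY - /) {
            my $seen = $declare_inside ? $inner_name : $outer_name;
            $result = $seen if defined $seen;
        }
    }
    return $result;
}

print "declared inside the loop: ", name_at_key(1), "\n";
print "declared before the loop: ", name_at_key(0), "\n";
```

The first call reports undef because the value stored while reading the NAME line belonged to a different variable than the one that exists when the KEY line is read. The second call reports focA.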

```
#!/usr/bin/env perl
use strict;
use warnings;

use Data::Dumper;

my %Ho14Loc2GeNm;
my ($cycID14, $cycLoc14, $cycNm14);
while (<DATA>) {
    next unless /^(?:UNIQUE-ID|ACCESSION-1|COMMON-NAME)/;

    if (/^UNIQUE-ID - (GJDZ-[0-9]+)/) {
        $cycID14 = $1;

    } elsif (/^COMMON-NAME - (\S+)/) {
        $cycNm14 = $1;

    } elsif (/^ACCESSION-1 - (STM14_[0-9]+)/) {
        $cycLoc14 = $1;
    }

    if (defined($cycLoc14)){
        $Ho14Loc2GeNm{$cycLoc14} = $cycNm14;
        undef $cycLoc14;
    }
}
print Dumper(\%Ho14Loc2GeNm);

__DATA__
#
UNIQUE-ID - GJDZ-5046
TYPES - BC-4
TYPES - Unclassified-Genes
COMMON-NAME - STM14_5042
ACCESSION-1 - STM14_5042
CENTISOME-POSITION - 90.96536
COMPONENT-OF - CHROMOSOME-1-100
COMPONENT-OF - TUJDZ-2494
COMPONENT-OF - CHROMOSOME-1
LEFT-END-POSITION - 4430254
PRODUCT - GJDZ-5046-MONOMER
RIGHT-END-POSITION - 4430427
TRANSCRIPTION-DIRECTION - -
//
UNIQUE-ID - GJDZ-1101
TYPES - BC-4
TYPES - Unclassified-Genes
COMMON-NAME - focA
ACCESSION-1 - STM14_1100
CENTISOME-POSITION - 20.85712
COMPONENT-OF - CHROMOSOME-1-23
COMPONENT-OF - TUJDZ-587
COMPONENT-OF - CHROMOSOME-1
LEFT-END-POSITION - 1015797
PRODUCT - GJDZ-1101-MONOMER
RIGHT-END-POSITION - 1016774
TRANSCRIPTION-DIRECTION - -
//
```
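A different approach, which is not what the script above does, is to read a whole record at a time by setting the input record separator $/ to the // line. This is only a sketch under the assumption that every record ends with a line containing just //, as in the sample data. The sub name parse_records and the trimmed $sample text are hypothetical. Because the variables are scoped to one record, nothing has to be cleared by hand, and a record without a COMMON-NAME cannot pick up the name of the previous one:

```perl
use strict;
use warnings;

# Parse records terminated by a line containing only "//".
# Returns a hash ref mapping the ACCESSION-1 value to the COMMON-NAME value.
sub parse_records {
    my ($text) = @_;
    my %loc2name;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "//\n";              # read one whole record per iteration
    while (my $record = <$fh>) {
        # Scoped to a single record, so nothing leaks into the next one.
        my ($loc)  = $record =~ /^ACCESSION-1 - (STM14_[0-9]+)/m;
        my ($name) = $record =~ /^COMMON-NAME - (\S+)/m;
        next unless defined $loc;   # skip records without an accession
        $loc2name{$loc} = $name;
    }
    close $fh;
    return \%loc2name;
}

# Trimmed-down sample in the same layout as the data above.
my $sample = <<'END';
UNIQUE-ID - GJDZ-5046
COMMON-NAME - STM14_5042
ACCESSION-1 - STM14_5042
//
UNIQUE-ID - GJDZ-1101
COMMON-NAME - focA
ACCESSION-1 - STM14_1100
//
END

my $map = parse_records($sample);
print "$_ => $map->{$_}\n" for sort keys %$map;
```

The line-by-line version above stays the smaller change to existing code. The record-at-a-time version also does not depend on COMMON-NAME appearing before ACCESSION-1 within a record.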

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: Parsing line by line by AWallBuilder (Beadle) on Jan 14, 2015 at 14:42 UTC
Thanks, #2 was what I was looking for. I had thought about this conceptually but didn't know how to do it. I had already realized #1 earlier. Thanks.