in reply to Hash Help

by reading the lines five at a time and pulling the user number out of the first one:

use warnings;
use strict;
use Data::Dump::Streamer;

my %records;

while (! eof (DATA)) {
    my $line1 = <DATA>;
    next unless defined $line1 and $line1 =~ /^\s1/;

    $_ = <DATA> for my ($line2, $line3, $line4, $line5);
    my $id = substr $line1, 9, 5;
    $records{$id} = [$line1, $line2, $line3, $line4, $line5];
}

Dump (\%records);

__DATA__
 10101ABC000019101L0001374686047S30339 GA &DOE C080229CR7 7 00244 0000001 000000000
 2 CR7 000 060714Q Y 0000000000 000
 3 00030339 3JOHN DOE 36
 423 MAIN STREET ATLANTA GA 30339
 5 +0000000000080226I 052461 05241961
 10101ABC000029102 N D 3658 MAIN STREET
 2 ATLANTA GA30339 0001
 3JOHN DOE 05241961INDV374686047S
 4
 5

Prints:

$HASH1 = {
  "00001" => [
    " 10101ABC000019101L0001374686047S30339 GA &DOE C080229CR7 7 00".
    "244 0000001 000000000\n",
    " 2 CR7 000 060714Q ".
    " Y 0000000000 000\n",
    " 3 00030339 3JOHN".
    " DOE 36\n",
    " 423 MAIN STREET ATLANTA GA 30339\n",
    " 5 +00000000000".
    "80226I 052461 05241961\n"
  ],
  "00002" => [
    " 10101ABC000029102 N D 36".
    "58 MAIN STREET\n",
    " 2 ATLANTA GA30339 ".
    " 0001\n",
    " 3JOHN DOE 05241961INDV374686047S".
    "\n",
    " 4\n",
    " 5\n"
  ]
};

Note that I edited the second record to give it a unique id as described and that I restored what appeared to be a missing space in front of the first line of the first record.


Perl reduces RSI - it saves typing

Re^2: Hash Help
by mmittiga17 (Scribe) on Nov 25, 2008 at 18:30 UTC
    Thanks again for your help. I have a question, if you have time: I notice that in some cases the file I am trying to parse has 7 lines of info per ID, and in other cases only 5.
    my %records;

    while (! eof (DATA)) {
        my $line1 = <DATA>;
        next unless defined $line1 and $line1 =~ /^01KV/;

        $_ = <DATA> for my ($line2, $line3, $line4, $line5, $line6, $line7);
        my $id = substr $line1, 2, 9;
        $records{$id} = [$line1, $line2, $line3, $line4, $line5, $line6, $line7];
    }
    If I add line6 and line7, I notice that IDs with only five lines will also grab the first two lines of the next record. How can I dynamically account for IDs with more than 5 record lines? Thanks!

      In that case you need to be smarter about recognizing records. There seems to be a line number associated with each line of a record so you can notice when the line number resets:

      use warnings;
      use strict;
      use Data::Dump::Streamer;

      my %records;
      my @lines;

      while (! eof (DATA) or @lines) {
          my $line = <DATA>;

          $line ||= ''; # Avoid a bunch of defined tests
          chomp $line;
          next unless $line =~ /^([\s\d]\d)/ or @lines;

          if (! defined $1 or $1 <= @lines) {
              # Start of new record (or last record) - save previous
              my $id = substr $lines[0], 9, 5;

              $records{$id} = [@lines];
              @lines = ();
          }

          push @lines, $line if length $line;
      }

      Dump (\%records);

      __DATA__
       10101ABC000019101L0001374686047S30339 GA &DOE C080229CR7 7 00244 0000001 000000000
       2 CR7 000 060714Q Y 0000000000 000
       3 00030339 3JOHN DOE 36
       423 MAIN STREET ATLANTA GA 30339
       5 +0000000000080226I 052461 05241961
       6Additional line 1
       7and another additional line
       10101ABC000029102 N D 3658 MAIN STREET
       2 ATLANTA GA30339 0001
       3JOHN DOE 05241961INDV374686047S
       4
       5

      Perl reduces RSI - it saves typing
        With this new method, how can I access lines 1 through X of each ID's record? Before, I could do it this way:
        #########REC01##############
        $line = $records{$id}[0];
        $ACTNUM = substr $line, 2, 9 ;
        $TOT_AVBL_TO_PAY1 = substr $line, 32, 11;
        $TOT_AVBL_TO_PAY2 = substr $line, 43, 2;
        $TOT_AVBL_TO_PAY1 ="0" if (! $TOT_AVBL_TO_PAY1);
        $TOT_AVBL_TO_PAY2 ="0" if (! $TOT_AVBL_TO_PAY2);
        $TOT_AVBL_TO_PAY3 = join('.', $TOT_AVBL_TO_PAY1, $TOT_AVBL_TO_PAY2);
        $TOT_AVBL_TO_PAY = $TOT_AVBL_TO_PAY3;
        #########REC02##############
        $line = $records{$id}[1];
        #########REC03##############
        $line = $records{$id}[2];
        $MGN_FED_CALL_TOT1 = substr $line, 31, 11;
        $MGN_FED_CALL_TOT2 = substr $line, 42, 2;
        $MGN_FED_CALL_TOT1 ="0" if (! $MGN_FED_CALL_TOT1);
        $MGN_FED_CALL_TOT2 ="0" if (! $MGN_FED_CALL_TOT2);
        $MGN_FED_CALL_TOT3 = join('.', $MGN_FED_CALL_TOT1, $MGN_FED_CALL_TOT2);
        $MGN_FEDSIG = substr $line, 30, 1;
        if ($MGN_FEDSIG eq "+") {
            $MGN_FED_CALL_TOT = "-" . $MGN_FED_CALL_TOT3 ;
        }else{
            $MGN_FED_CALL_TOT = $MGN_FED_CALL_TOT3;
        }
        Thanks!!!!
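
One possible way to approach this, offered as a minimal sketch rather than a definitive answer: the line-number-reset script still stores each record as an array reference keyed by id, so `$records{$id}[0]`-style indexing should keep working as before. What changes is that the number of elements varies per id, so any line past the fifth needs a count check first. The `%records` contents below are made-up stand-ins for the parsed data, and `%line_count` is a hypothetical helper introduced only for illustration.

```perl
use strict;
use warnings;

# Made-up stand-in for the %records hash built by the parsing loop:
# one record with seven lines, one with five.
my %records = (
    '00001' => [' 1AAA', ' 2BBB', ' 3CCC', ' 4DDD', ' 5EEE', ' 6FFF', ' 7GGG'],
    '00002' => [' 1HHH', ' 2III', ' 3JJJ', ' 4KKK', ' 5LLL'],
);

my %line_count;    # hypothetical helper: number of lines stored per id

for my $id (sort keys %records) {
    my $lines = $records{$id};      # array ref, one element per record line
    $line_count{$id} = @$lines;     # array in scalar context gives the count

    # Lines that are always present are indexed exactly as before.
    my $rec01 = $lines->[0];
    my $rec03 = $lines->[2];
    print "$id REC01: $rec01\n";
    print "$id REC03: $rec03\n";

    # Optional lines: check the count before touching them.
    if (@$lines > 5) {
        print "$id REC06: $lines->[5]\n";
    }

    # Or walk whatever lines the record happens to have.
    for my $index (0 .. $#$lines) {
        printf "%s line %d: %s\n", $id, $index + 1, $lines->[$index];
    }
}
```

Because that script chomps each line before storing it, the stored strings no longer end in a newline. The `substr` offsets into each line are unaffected.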