in reply to Perl Hashes, keys under keys (I think)?


Hey Guys, thanks again for the help. I have another question, if anyone can help me with it.

The question is about assigning elements to a hash: why is a certain record in the hash being overwritten instead of being added alongside the existing one?

So I have a text file with the following data (below). The idea is that a record can have only one 'OWNER' and any number of users 'WAITING' for it ('RECORD_ID' is what links the OWNER and WAITING users, because it will be the same for both).

INPUT-DATA

FILENAME RECORD_ID M INBR DNBR OWNER UNBR UNO TTY TIME DATE
/ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 richard 5005444 156 pts/220 10:50:44 Sep 16
FILENAME RECORD_ID M INBR DNBR WAITING UNBR UNO TTY TIME DATE
/ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 marshall 3309650 35 none 10:50:30 Sep 16
/ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 steven 4565899 122 pts/101 10:45:44 Sep 16
FILENAME RECORD_ID M INBR DNBR OWNER UNBR UNO TTY TIME DATE
/ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 kevin 5555555 999 pts/900 10:25:00 Sep 16
FILENAME RECORD_ID M INBR DNBR WAITING UNBR UNO TTY TIME DATE
/ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 kelly 1234567 886 none 10:28:00 Sep 16
/ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 russel 7654321 456 tty/101 10:25:00 Sep 16
FILENAME RECORD_ID M INBR DNBR OWNER UNBR UNO TTY TIME DATE
/ud/NEW-PrOdUcTs/NWA 999?SheetMusic X 9876541 86555522211 mmartin 5151515 000 tty/100 10:51:00 Sep 16
FILENAME RECORD_ID M INBR DNBR WAITING UNBR UNO TTY TIME DATE
/ud/NEW-PrOdUcTs/NWA 999?SheetMusic X 9876541 86555522211 luther 1234567 987 none 10:51:00 Sep 16

And here is my sub-routine that uses an array that contains all the data above in @temp (where each WHOLE line is an element in the array).

#@temp contains all data from input file (line-by-line)
my @temp = <DATA>;
my %records;
my @fields;
my $x = 0;

if (@temp) {
    while ($x <= $#temp) {
        @fields = split " ", $temp[$x];
        if ($fields[5] =~ /OWNER/) {
            $x++;
            @fields = split " ", $temp[$x];
            print "OWNER --- @fields\n";

            #Add OWNER to hash based on RECORD_ID
            $records{"$fields[1]"}{ OWNER } = {
                USER     => "$fields[5]",
                FILENAME => "$fields[0]",
                PID      => "$fields[6]",
                TIME     => "$fields[9]",
                DATE     => (join " ", "$fields[10] $fields[11]"),
                ELAPSED  => []
            };

            #Increment to Next Line
            $x++;
        }
        else {
            #If line is WAITING header increment to next line.
            if ($fields[5] =~ /WAITING/) {
                $x++;
            }

            #Split line into array
            @fields = split " ", $temp[$x];
            print "WAITING --- @fields\n";

            #Add WAITING to hash based on RECORD_ID
            $records{"$fields[1]"}{ WAITING } = {
                USER     => "$fields[5]",
                FILENAME => "$fields[0]",
                PID      => "$fields[6]",
                TIME     => "$fields[9]",
                DATE     => (join " ", "$fields[10] $fields[11]"),
                ELAPSED  => []
            };
            $x++;
        }
    }
}
print Dumper \%records;

____OUTPUT____

OWNER --- /ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 richard 5005444 156 pts/220 10:50:44 Sep 16
WAITING --- /ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 marshall 3309650 35 none 10:50:30 Sep 16
WAITING --- /ud/QC-DATA/CONTROL 001!SCHEDULE X 3462970 922337219576856 steven 4565899 122 pts/101 10:45:44 Sep 16
OWNER --- /ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 kevin 5555555 999 pts/900 10:25:00 Sep 16
WAITING --- /ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 kelly 1234567 886 none 10:28:00 Sep 16
WAITING --- /ud/QC-PROD/NWO 667!NEWPRODUCT X 6468555 59761456123786 russel 7654321 456 tty/101 10:25:00 Sep 16
OWNER --- /ud/NEW-PrOdUcTs/NWA 999?SheetMusic X 9876541 86555522211 mmartin 5151515 000 tty/100 10:51:00 Sep 16
WAITING --- /ud/NEW-PrOdUcTs/NWA 999?SheetMusic X 9876541 86555522211 luther 1234567 987 none 10:51:00 Sep 16

$VAR1 = {
  '999?SheetMusic' => {
    'WAITING' => {
      'PID' => '1234567',
      'TIME' => '10:51:00',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/NEW-PrOdUcTs/NWA',
      'USER' => 'luther'
    },
    'OWNER' => {
      'PID' => '5151515',
      'TIME' => '10:51:00',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/NEW-PrOdUcTs/NWA',
      'USER' => 'mmartin'
    }
  },
  '667!NEWPRODUCT' => {
    'WAITING' => {
      'PID' => '7654321',
      'TIME' => '10:25:00',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/QC-PROD/NWO',
      'USER' => 'russel'
    },
    'OWNER' => {
      'PID' => '5555555',
      'TIME' => '10:25:00',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/QC-PROD/NWO',
      'USER' => 'kevin'
    }
  },
  '001!SCHEDULE' => {
    'WAITING' => {
      'PID' => '4565899',
      'TIME' => '10:45:44',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/QC-DATA/CONTROL',
      'USER' => 'steven'
    },
    'OWNER' => {
      'PID' => '5005444',
      'TIME' => '10:50:44',
      'DATE' => 'Sep 16',
      'WAITERS' => [],
      'FILENAME' => '/ud/QC-DATA/CONTROL',
      'USER' => 'richard'
    }
  }
};

After executing the code above, I can see from the print statements that @fields does hold each data line of @temp at the time it is assigned to the hash. But it seems that whichever 'WAITING' record was processed last overwrites the one before it.
From the INPUT file you can see that RECORD_IDs "001!SCHEDULE" and "667!NEWPRODUCT" should each have 1 OWNER and 2 WAITING, and the RECORD_ID "999?SheetMusic" should have 1 OWNER and 1 WAITING.
Does anyone know what I have to do so that a WAITING user does not overwrite the one before it?



Thanks in Advance,
Matt


Replies are listed 'Best First'.
Re^2: Perl Hashes, keys under keys (I think)?
by hbm (Hermit) on Sep 20, 2011 at 15:49 UTC

    Absolutely, you can only have one record of a given key. So you can tweak the key like you are doing, or rethink your structure. Maybe you want the WAITING keys to be arrayrefs, pointing to lists of zero or more WAITING records.
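
    As a minimal sketch of the arrayref idea (the sample rows and the three-column layout here are made up for illustration, not your real file format), each WAITING record gets pushed onto a list instead of being assigned to a single slot:

```perl
use strict;
use warnings;

# Hypothetical sample rows: [ RECORD_ID, state, user ]
my @rows = (
    [ '001!SCHEDULE', 'OWNER',   'richard'  ],
    [ '001!SCHEDULE', 'WAITING', 'marshall' ],
    [ '001!SCHEDULE', 'WAITING', 'steven'   ],
);

my %records;
for my $row (@rows) {
    my ($id, $state, $user) = @$row;
    if ($state eq 'OWNER') {
        # Only one owner per record, so a plain hashref is enough.
        $records{$id}{OWNER} = { USER => $user };
    }
    else {
        # push autovivifies the array, so every waiter is kept.
        push @{ $records{$id}{WAITING} }, { USER => $user };
    }
}

my @waiting = map { $_->{USER} } @{ $records{'001!SCHEDULE'}{WAITING} };
print "Waiting: @waiting\n";
```

    With a list you also get the number of waiters for free, via scalar @{ $records{$id}{WAITING} }.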

    Also, you have a lot of duplicate code. Consider consolidating like this, untested:

    if ($fields[5] =~ /(OWNER|WAITING)/) {
        my $state = $1;
        # and perhaps this, for your new solution:
        $state .= $count if $state eq 'WAITING';
        $x++;
        @fields = split " ", $temp[$x];
        print "$state --- @fields\n";

        #Add record to hash based on RECORD_ID
        $records{"$fields[1]"}{ $state } = {
            USER     => "$fields[5]",
            FILENAME => "$fields[0]",
            PID      => "$fields[6]",
            TIME     => "$fields[9]",
            DATE     => (join " ", "$fields[10] $fields[11]"),
            ELAPSED  => []
        };

        #Increment to Next Line
        $x++;
    }

      Hey hbm,

      Thanks for the reply. Good idea! You guys are so good at condensing code down to the bare minimum. I always seem to have a case of "code bloat".

      The only thing is that it will handle at most one WAITING user. That was one of the problems I ran into as well. Somewhere there would have to be logic along the lines of "once you process a WAITING record, keep going until the line contains 'OWNER'".

      But really thanks for the suggestion. I will play around with your condensed version and see if I can get it to run properly.



      Thanks Again,
      Matt




        Try this. Note that I changed it so that OWNER and WAITING are arrayrefs, hence the push. You'll probably want to add data validation too...

        Update: I made two late adjustments after 'strict'.

        use strict;
        use warnings;
        use Data::Dumper;

        my %records;
        my $state;    # UPDATE - Added this!

        while (<DATA>) {
            my @fields = split;    # and added 'my'
            if ($fields[5] =~ /(OWNER|WAITING)/) {
                $state = $1;
            }
            else {
                push @{ $records{$fields[1]}{$state} }, {
                    USER     => $fields[5],
                    FILENAME => $fields[0],
                    PID      => $fields[6],
                    TIME     => $fields[9],
                    DATE     => "$fields[10] $fields[11]",
                    ELAPSED  => []
                };
            }
        }
        print Dumper(%records);

        Partial Output:

        $VAR21 = '001!SCHEDULE';
        $VAR22 = {
          'WAITING' => [
            {
              'PID' => '33',
              'TIME' => undef,
              'DATE' => ' ',
              'ELAPSED' => [],
              'FILENAME' => '/ud/QC-DATA/CONTROL',
              'USER' => 'marshall'
            },
            {
              'PID' => '4565',
              'TIME' => undef,
              'DATE' => ' ',
              'ELAPSED' => [],
              'FILENAME' => '/ud/QC-DATA/CONTROL',
              'USER' => 'steven'
            }
          ],
          'OWNER' => [
            {
              'PID' => '500',
              'TIME' => undef,
              'DATE' => ' ',
              'ELAPSED' => [],
              'FILENAME' => '/ud/QC-DATA/CONTROL',
              'USER' => 'richard'
            }
          ]
        };

      Hey hbm,

      I think I got it working correctly. All I really had to do was add a couple of lines (i.e. add $count++, as well as an "else" clause, etc.).

      Here's what I got:

      if ($fields[5] =~ /(OWNER|WAITING)/) {
          $x++;
          $state = $1;
          if ($state eq 'OWNER') {
              $count = 0;
          }
          if ($state eq 'WAITING') {
              $state .= "-$count";
              $count++;
          }
          @fields = split " ", $temp[$x];
          print "$state --- @fields\n";

          #Add record to hash based on RECORD_ID
          $records{"$fields[1]"}{ $state } = {
              USER     => "$fields[5]",
              FILENAME => "$fields[0]",
              PID      => "$fields[6]",
              TIME     => "$fields[9]",
              DATE     => (join " ", "$fields[10] $fields[11]"),
              ELAPSED  => []
          };

          #Increment to Next Line
          $x++;
      }
      else {
          $records{"$fields[1]"}{ "WAITING-$count" } = {
              USER     => "$fields[5]",
              FILENAME => "$fields[0]",
              PID      => "$fields[6]",
              TIME     => "$fields[9]",
              DATE     => (join " ", "$fields[10] $fields[11]"),
              ELAPSED  => []
          };
          $count++;
          $x++;
      }



      Thanks Again,
      Matt



        Matt--

        Consider reading your data one line at a time, rather than slurping into an array and manipulating $x.

        Also, here:

        DATE => (join " ", "$fields[10] $fields[11]"),

        The double quotes on the right side interpolate $fields[10] and $fields[11] into a single string, which you then join(?) with a space. Joining a single item doesn't do anything; you can simply do:

        DATE => "$fields[10] $fields[11]"
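
        A quick way to convince yourself, as a small sketch with placeholder values standing in for your real fields (only indexes 10 and 11 matter here):

```perl
use strict;
use warnings;

my @fields = ('x') x 10;      # placeholder values for indexes 0..9
push @fields, 'Sep', '16';    # indexes 10 and 11

my $joined_one = join " ", "$fields[10] $fields[11]";    # join over a one-item list
my $plain      = "$fields[10] $fields[11]";              # plain interpolation
my $joined_two = join " ", @fields[10, 11];              # join doing real work on a slice

print "all three are the same string\n"
    if $joined_one eq $plain and $plain eq $joined_two;
```

        The slice form is the one where join actually earns its keep, since it receives two separate items.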
Re^2: Perl Hashes, keys under keys (I think)?
by mmartin (Monk) on Sep 20, 2011 at 15:32 UTC

    Ok well I think I may have answered my own question.

    I found the link below. (You have to view the Google cached version because the page no longer exists.)

    "http://webcache.googleusercontent.com/search?q=cache:yVSNSleva3EJ:docstore.mik.ua/orelly/perl4/lperl/ch05_02.htm+perl+hash+-+hash+element+overwriting+next+element&cd=1&hl=en&ct=clnk&gl=us"

    It basically says "Last one in wins". So I added a $count variable that increments after each 'WAITING' user, and then resets it to zero after executing each 'OWNER' record.
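
    Here is a tiny sketch of what I mean, using made-up hash names (%record and %numbered are just for illustration, not my real code):

```perl
use strict;
use warnings;

# Same key twice: the second assignment replaces the first.
my %record;
$record{WAITING} = 'marshall';
$record{WAITING} = 'steven';

print scalar(keys %record), " key, value $record{WAITING}\n";
# prints "1 key, value steven"

# Distinct keys (the $count approach) keep both values.
my %numbered;
my $count = 0;
$numbered{ "WAITING" . $count++ } = $_ for qw(marshall steven);

print join(", ", sort keys %numbered), "\n";
# prints "WAITING0, WAITING1"
```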

    So I changed the following hash assignment line for the WAITING user to this:

    $records{"$fields[1]"}{ "WAITING$count" } = {
        USER     => "$fields[5]",
        FILENAME => "$fields[0]",
        PID      => "$fields[6]",
        TIME     => "$fields[9]",
        DATE     => (join " ", "$fields[10] $fields[11]"),
        WAITERS  => []
    };

    Can anyone think of a reason why it would be a bad idea to do it this way?


    Thanks,
    Matt

