How to code a complex AoH?

iatros has asked for the wisdom of the Perl Monks concerning the following question:

Here is a task that has been puzzling me for quite a while now: We regularly receive our students' exam results as a .csv file. The header contains some metadata such as ID, gender, date of birth, status, exam room, seat number, exam version, etc. Each of the following lines starts with these data, followed by the scores for 60 questions (0.0, 0.5, or 1.0 points for a wrong, half-correct, or correct answer). There are 6 versions (A - F) of the exam, differing only in the order of the 60 questions. The information is stored for statistical evaluation, which requires the correct alignment according to the exam master (a .txt file with 7 columns for version A-F and the correct answer in the 7th column).

I tried to read the .csv file into an array of hashes in order to generate a different .csv or tabbed .txt file in which all exam results appear in a unified order for later statistical evaluation. But something went wrong.

Example:

header --
ID,gender,birthdate,order,room,seat,version,points,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
277710814533,f,01/02/1993,m,sr_3,A11,A, 1,1,1,1,0,1,1,1,.5,1,1,1,0,1,.5,1,1,1,0,1,.5,1,1,0,1,1,1,1,1,1,1,0,0,1,0,1,.5,1,1,1,1,.5,0,1,1,1,0,1,1,1,1,1,0,1,1,1,.5,1,1,1
755310765962,f,31/07/1992 00:00,v,aula,C11,C,1,.5,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,.5,1,0,.5,1,0,1,.5,0,.5,0,1,0,0,.5,1,1,0,.5,1,1,.5,.5,1,.5,.5,1,1,1,.5,.5
394610513538,m,20/10/1992 00:00,m,sr_3,E13,E,1,1,0,.5,1,1,1,1,1,1,1,.5,1,1,.5,.5,1,1,1,.5,.5,1,1,1,1,0,0,.5,1,1,.5,.5,.5,.5,0,1,0,.5,0,0,1,0,1,.5,0,1,0,0,.5,1,0,1,1,0,.5,.5,.5,.5,.5,.5

The code generates anonymous hash keys according to the following scheme:

    
    while ( <FH> ) {
    chomp ;
    if ( /^\d\d\d/) {
        ( $id , $gender , $birthday , $status , $room , $seat , $version , @points ) = split ( /,/ , $_ ) ;
        $student = { 
            'id'       => $id ,
            'gender'   => $gender , 
            'birthday' => $birthday ,
            'position' => $position ,
            'room'     => $room , 
            'seat'     => $seat ,
            'version'  => $version ,
            'points'   => @points
        } ;
        push ( @candidates , $student ) ;
    }
    } ;
    close FH ;
    print "Number of candidates processed: " . ( $#candidates + 1 ) . "\n" ;

Perl emits a warning for each record, e.g. "Odd number of elements in anonymous hash at /Documents//testAoH.pl line 38, <FH> line 16.", but the script still runs.

The script prints the correct number of processed records, but when I try to retrieve a specific record I only get the scalar values and the @points array yields only one (the first?) result as if it were destroyed. A data dumper output further shows that something must be internally wrong with this code.

Data Dumper e.g.

        755310765962 
    $VAR1 = \{
            '0' => '0',
            'gender' => 'f',
            'id' => '755310765962',
            'points' => '1',
            'room' => 'aula',
            '.5' => undef,
            '1' => '.5',
            'birthday' => '31/07/1992',
            'seat' => 'A11',
            'version' => 'A',
            'status' => 'v'
          };

Any clues?

Thx - Harald -

Replies are listed 'Best First'.

Re: How to code a complex AoH?
by Athanasius (Archbishop) on Mar 25, 2017 at 15:04 UTC

Hello iatros, and welcome to the Monastery!

Each element of a hash (or of an array, for that matter) must be a scalar value. So if you want to store an array of values in the points slot of a hash, you have to store a reference (Perl's counterpart of a pointer) to that array:

$student = {
    ...
    points => \@points,
};

or, to store a reference to a copy of the array:

$student = {
    ...
    points => [@points],
};

See perlreftut and perldsc.
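
Once the reference is stored, the scores are reached by dereferencing it. A minimal sketch with made-up sample values (not the real exam data), showing access to a single score and to the whole list:

```perl
use strict;
use warnings;

# Hypothetical sample values, only for illustration.
my @points  = ( 1, 0.5, 0, 1 );
my $student = {
    id     => '277710814533',
    points => [@points],    # reference to a copy of @points
};

# One element via the arrow operator, the whole list by dereferencing.
my $first = $student->{points}[0];
my @all   = @{ $student->{points} };
my $count = scalar @all;

# Sum the scores of this record.
my $total = 0;
$total += $_ for @all;

print "first=$first count=$count total=$total\n";
```

This should print first=1 count=4 total=2.5.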

Update: To elaborate on stevieb’s point: if the array @points contains the elements ('a', 'b', 'c', 'd'), then the assignment

$student = {
    ...
    points => @points,
};

is effectively this:

$student = {
    ...
    'points', 'a',
    'b', 'c',
    'd',
};

or, equivalently,

$student = {
    ...
    points => 'a',
    b      => 'c',
   'd',
};

— which explains why the compiler is warning about an odd number of elements in the hash: the last array value ('d') becomes a hash key with no associated value.

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^2: How to code a complex AoH?

by haukex (Archbishop) on Mar 25, 2017 at 15:25 UTC

points => \@points,

Actually, that won't work here - it looks like iatros isn't using strict, or has predeclared @points outside of the loop, so that \@points and therefore $$student{points} will always point to the same array, and that array will get overwritten on each iteration of the loop. points => [@points], will work correctly, since it creates a (shallow) copy of the array. (Athanasius, I know you know all this, the explanation is for the benefit of the OP.)

iatros: You really should Use strict and warnings, and then use my to declare your variables, including inside the loop: my ( $id , $gender , $birthday ... ) = .... Then you can use both of the code examples that Athanasius showed, because then on each iteration of the loop, @points will be a "new" array.
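
To make the difference concrete, here is a small sketch with two hypothetical records. It builds the same structure three ways: a reference to one array declared outside the loop, a copy of that array, and a reference to an array declared with my inside the loop. The variable names are only illustrative.

```perl
use strict;
use warnings;

my @rows = ( [ 'A', 1, 0 ], [ 'B', 0.5, 1 ] );    # hypothetical records

# 1) One array declared outside the loop: every record shares it.
my ( @shared, @points );
for my $row (@rows) {
    ( my $version, @points ) = @$row;
    push @shared, { version => $version, points => \@points };
}

# 2) Same outer array, but each record stores its own copy.
my @copied;
for my $row (@rows) {
    ( my $version, @points ) = @$row;
    push @copied, { version => $version, points => [@points] };
}

# 3) Array declared with my inside the loop: a new array per iteration.
my @fresh;
for my $row (@rows) {
    my ( $version, @points ) = @$row;
    push @fresh, { version => $version, points => \@points };
}

print "shared: @{ $shared[0]{points} }\n";    # shows the LAST row's values
print "copied: @{ $copied[0]{points} }\n";
print "fresh:  @{ $fresh[0]{points} }\n";
```

The first record of @shared shows "0.5 1", the values of the second row, while @copied and @fresh both keep "1 0".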

Update: Tweaked explanation a tiny bit.


Re^2: How to code a complex AoH?

by AnomalousMonk (Archbishop) on Mar 25, 2017 at 18:01 UTC

... the last array value ... becomes a hash key with no associated value.

iatros: Just a minor tweak to Athanasius's otherwise excellent ++explanation: the unpaired key will not be left without a value, but will get the undef value (that's where the '.5' => undef in the OPed dump comes from). This, of course, will probably just lead to more warnings down the line!
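
A quick way to see this, reusing the ('a', 'b', 'c', 'd') example from above. The sketch deliberately silences the "Odd number of elements" warning (category misc) so that only the resulting hash is shown:

```perl
use strict;
use warnings;

my @points = ( 'a', 'b', 'c', 'd' );

my $student;
{
    no warnings 'misc';    # silence "Odd number of elements in anonymous hash"
    $student = { points => @points };
}

# The flattened list yields the pairs points => 'a', b => 'c', d => undef.
my @keys = sort keys %$student;
print "keys: @keys\n";
print "d exists:  ", ( exists $student->{d}  ? "yes" : "no" ), "\n";
print "d defined: ", ( defined $student->{d} ? "yes" : "no" ), "\n";
```

The key d exists in the hash, but its value is undef, so the second test prints "no".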

Give a man a fish: <%-{-{-{-<


Re^2: How to code a complex AoH?

by iatros (Novice) on Mar 30, 2017 at 17:26 UTC

Thank you for the reply. I hope I've learned my lesson with your help. I will think over the parsing thing. As Larry put it: TIMTOWTDI -HM-


Re: How to code a complex AoH?
by stevieb (Canon) on Mar 25, 2017 at 14:57 UTC

Welcome to the Monastery, iatros!

I'm just about to run out the door so I don't have time for a thorough review, but one thing immediately popped out:

'points'   => @points

That's what is most likely causing the issue, because that really looks like this:

points => element1, element2, element3

In essence, assigning an array to a hash key flattens the array into the list, so its elements are treated as further key/value pairs. You need to assign a reference to the array instead:

points   => \@points

Also note that in a hash, unless there are special characters involved, the "fat comma" (i.e. =>) will auto-quote the left-hand side, so you don't have to quote the keys in your hash creation. This is why I left the single quotes off of the key in the example above. It just makes for slightly cleaner code.
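
A tiny sketch of the auto-quoting, with made-up keys and values: the quoted and the unquoted form build the same hash, while a key that is not a plain identifier (the hypothetical 'exam-room' here) still needs its quotes.

```perl
use strict;
use warnings;

my %quoted   = ( 'room' => 'aula', 'seat' => 'C11' );
my %unquoted = ( room => 'aula', seat => 'C11' );

# A key containing a hyphen is not a simple identifier, so quote it.
my %special = ( 'exam-room' => 'aula' );

# Compare both hashes by their sorted key=value pairs.
my $left  = join ',', map { "$_=$quoted{$_}" } sort keys %quoted;
my $right = join ',', map { "$_=$unquoted{$_}" } sort keys %unquoted;

print $left eq $right ? "identical\n" : "different\n";
print "special key: ", join( ',', keys %special ), "\n";
```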


Re^2: How to code a complex AoH?

by iatros (Novice) on Mar 30, 2017 at 17:18 UTC

Thanks for the reply and the tip with the auto-quote feature. Definitely more readable and less typing. Thx -HM-


Re: How to code a complex AoH?
by haukex (Archbishop) on Mar 25, 2017 at 15:55 UTC

You've already got some answers to your issue - just note the issue with \@points that I discussed in my other reply, including that it's a good idea if you Use strict and warnings.

Since it looks like you're parsing CSV, I'd recommend using a module for this, in particular Text::CSV:

#!/usr/bin/env perl
use warnings;
use strict;
use Text::CSV;

my $csv = Text::CSV->new({binary=>1, auto_diag=>2});
$csv->getline(\*DATA); # read and discard header
my @candidates;
while ( my $row = $csv->getline(\*DATA) ) {
    my %student;
    for my $key (qw/ id gender birthday status room seat version /) {
        # remove the first elements of the arrayref
        $student{$key} = shift @$row;
    }
    # all that's left in the arrayref is the points
    $student{points} = $row;
    push @candidates, \%student;
}
$csv->eof or $csv->error_diag;

use Data::Dump;
dd \@candidates;

__DATA__
ID,gender,birthdate,order,room,seat,version,points,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
277710814533,f,01/02/1993,m,sr_3,A11,A, 1,1,1,1,0,1,1,1,.5,1,1,1,0,1,.5,1,1,1,0,1,.5,1,1,0,1,1,1,1,1,1,1,0,0,1,0,1,.5,1,1,1,1,.5,0,1,1,1,0,1,1,1,1,1,0,1,1,1,.5,1,1,1
755310765962,f,31/07/1992 00:00,v,aula,C11,C,1,.5,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,1,.5,1,0,.5,1,0,1,.5,0,.5,0,1,0,0,.5,1,1,0,.5,1,1,.5,.5,1,.5,.5,1,1,1,.5,.5
394610513538,m,20/10/1992 00:00,m,sr_3,E13,E,1,1,0,.5,1,1,1,1,1,1,1,.5,1,1,.5,.5,1,1,1,.5,.5,1,1,1,1,0,0,.5,1,1,.5,.5,.5,.5,0,1,0,.5,0,0,1,0,1,.5,0,1,0,0,.5,1,0,1,1,0,.5,.5,.5,.5,.5,.5

Note that DATA is a special filehandle that refers to the __DATA__ section at the end of the script, but you can use any other filehandle here, like FH in your program. Although, it'd be better to switch to "lexical" filehandles, that is open my $fh, '<', $filename or die "open $filename: $!"; (see open).
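
As a rough sketch of the lexical-filehandle style: the example below opens an in-memory string in place of a real file so that it runs on its own, and it uses shortened, made-up records. It also uses a plain split instead of Text::CSV to stay dependency-free, which is only adequate while no field contains quoted commas.

```perl
use strict;
use warnings;

# Stand-in for the real .csv file: shortened, made-up records.
my $csv_text = <<'END';
ID,gender,birthdate,order,room,seat,version,points
277710814533,f,01/02/1993,m,sr_3,A11,A,1,1,.5,0
755310765962,f,31/07/1992 00:00,v,aula,C11,C,1,.5,0,1
END

# For a real file: open my $fh, '<', $filename or die "open $filename: $!";
open my $fh, '<', \$csv_text or die "open in-memory file: $!";

my @candidates;
while ( my $line = <$fh> ) {
    chomp $line;
    next unless $line =~ /^\d\d\d/;    # skip the header line
    # my inside the loop gives every record its own @points array
    my ( $id, $gender, $birthday, $status, $room, $seat, $version, @points )
        = split /,/, $line;
    push @candidates, {
        id     => $id,     gender => $gender, birthday => $birthday,
        status => $status, room   => $room,   seat     => $seat,
        version => $version, points => \@points,
    };
}
close $fh;

print "Number of candidates processed: ", scalar @candidates, "\n";
print "Seat of second candidate: $candidates[1]{seat}\n";
```

The same $fh could be handed to $csv->getline($fh) in the Text::CSV version above.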


Re^2: How to code a complex AoH?

by iatros (Novice) on Mar 30, 2017 at 15:42 UTC

#!/usr/bin/perl -w
use File::Basename ;
use Fcntl qw (:flock :seek) ;
use warnings ;
use strict ;

use Data::Dumper qw(Dumper);
