Array of Hashes population

barakuda has asked for the wisdom of the Perl Monks concerning the following question:

Re: Array of Hashes population by Anonymous Monk on Mar 06, 2008 at 16:56 UTC
You probably want to add a hash reference (a scalar) to the end of your array: `push @data, \%tmp;` You might also prefer to return an array reference rather than the array itself, for speed.
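A minimal sketch of both suggestions, assuming a made-up `parse_records` helper and a two-field "name,value" line format (neither is from the original question): each pass builds a fresh hash, pushes a reference to it, and the sub hands back a reference to the array rather than a copied list.

```perl
use strict;
use warnings;

# Hypothetical helper: turns "name,value" lines into an array of hash refs.
sub parse_records {
    my @lines = @_;
    my @data;
    for my $line (@lines) {
        my ($name, $value) = split /,/, $line;
        my %tmp = (NAME => $name, VALUE => $value);   # fresh hash on every pass
        push @data, \%tmp;                            # store a reference, not the hash
    }
    return \@data;    # a single scalar is returned instead of a copied list
}

my $records = parse_records("cpu,4", "ram,16");
print scalar(@$records), " records\n";
print "$records->[1]{NAME} = $records->[1]{VALUE}\n";
```

The caller dereferences with `@$records` for the whole list or `$records->[1]{NAME}` for a single field.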
Re^2: Array of Hashes population by barakuda (Initiate) on Mar 06, 2008 at 17:04 UTC
Wow! That's beautiful :) Thank you! So, you can't actually have an array of hashes, only an array of hash references?
Re^3: Array of Hashes population by swampyankee (Parson) on Mar 06, 2008 at 17:37 UTC
Correct; the elements of Perl arrays and the values of Perl hashes must be scalars. One takes a reference to a variable (@array, %hash) by prefixing it with a backslash (\), so the reference to @array is \@array. References can also point to subs, which means that a single array can hold elements that are, variously, references to hashes, references to arrays, references to subs, references to scalars, and plain scalars.
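To illustrate the mixed array described above, here is a short sketch; the variable names and values are invented for the example. It uses the core `ref` function, which returns the referent type for a reference and an empty string for a plain scalar.

```perl
use strict;
use warnings;

my %hash   = (color => 'red');
my @array  = (1, 2, 3);
my $scalar = 'plain text';

my @mixed = (
    \%hash,                 # reference to a hash
    \@array,                # reference to an array
    sub { return 'hi' },    # reference to an anonymous sub
    \$scalar,               # reference to a scalar
    42,                     # an ordinary scalar
);

# ref() yields 'HASH', 'ARRAY', 'CODE', 'SCALAR', or '' for a non-reference.
my @kinds = map { ref($_) || 'plain scalar' } @mixed;
print join(', ', @kinds), "\n";

# Each kind of reference has its own dereferencing syntax.
print "hash value:    $mixed[0]{color}\n";
print "array element: $mixed[1][2]\n";
print "sub returns:   ", $mixed[2]->(), "\n";
print "scalar value:  ${ $mixed[3] }\n";
```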
Re: Array of Hashes population by dwm042 (Priest) on Mar 06, 2008 at 18:25 UTC
I'm sure that by now you've gotten a number of good answers, but I couldn't resist working with this one. What happens with the statement `push @data, %tmp;` is that the hash is flattened into a list of keys and values, and the hash structure is lost. Wrapping %tmp in curly brackets, i.e. '{' and '}', copies it into a new anonymous hash and yields a reference to it, which preserves the hash structure:

```perl
push @data, { %tmp };
# and later ..
print $data[0]->{COMPONENT};
```

Code where you can see all this (play with the push statement) is given below:

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my $file = "<DATA>";    # placeholder; read_dat() reads from the DATA handle
my @columns = ( "ONE", "TWO", "THREE", "FOUR", "FIVE" );
my @AoH = read_dat($file);
print Data::Dumper->Dump(\@AoH);
print "\n\nArray of hash element: ", $AoH[0]->{"THREE"}, "\n";

sub read_dat {
    my $file = shift;
    my @data = ();
    for my $line ( <DATA> ) {
        next unless $line =~ m/\w+/;    # skip blank lines
        my @line = split (",", $line);
        my $i = 0;
        my %tmp;
        foreach my $col (@columns) {
            $tmp{$col} = $line[$i];
            $i++;
        }
        push @data, { %tmp };           # store a reference to a copy of %tmp
    }
    return @data;
}

__DATA__
0,1,2,3,4
1,2,3,4,5
2,3,4,5,6
3,4,5,6,7
```

Update: typo fixes
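The flattening is easy to see by counting elements. A small sketch, with a two-key `%tmp` whose contents are made up for the example:

```perl
use strict;
use warnings;

my %tmp = (COMPONENT => 'pump', STATUS => 'ok');

my @flat;
push @flat, %tmp;        # list context: keys and values are pushed one by one
print "flattened: ", scalar(@flat), " elements\n";

my @data;
push @data, { %tmp };    # a single element: a reference to a new anonymous hash
print "wrapped:   ", scalar(@data), " element\n";
print "lookup:    ", $data[0]->{COMPONENT}, "\n";
```

The flattened array holds four separate scalars in whatever order the hash returns them, so nothing ties a key to its value any more. The wrapped version holds one reference that can still be used for lookups.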
Re^2: Array of Hashes population by TGI (Parson) on Mar 06, 2008 at 19:56 UTC
That was an excellent explanation of the problem. I thought that it was worth commenting on the two suggested methods of storing the hash ref in the array, because there is a subtle difference between them.

```perl
# Method 1
push @data, {%tmp};
# Method 2
push @data, \%tmp;
```

Method 1 copies the contents of `%tmp` into a new anonymous hash. The reference to the anonymous hash is stored in `@data`.

Method 2 stores a reference to the `%tmp` hash itself.

Either of these approaches could be desirable or lead to interesting bugs, depending on the situation. Other times, it may make almost no difference. In this case, I'd probably use method 2 to avoid an unnecessary copy operation, but I see no strong argument for either approach in this case; it really comes down to personal preference.

The OP might find perldsc and perlreftut to be interesting reading.

BTW, I made a couple of other changes. I used map to generate the hash. It's a bit more compact than the for loop. Another change is using a while loop to read the file. When you use a for loop, you read the entire file into memory before processing begins. With a while loop, only one line is loaded at a time. Also, note the test in the while loop. The 'defined' keeps the loop from stopping early on a line that evaluates to false in a boolean context, for example a final line of "0" with no trailing newline. (Perl supplies this test implicitly for a bare `while (my $line = <DATA>)`, so writing it out is harmless but not required.) I also added a chomp so that the last field doesn't end in "\n" all the time; that gets annoying.

```perl
sub read_dat {
    my $file = shift;
    my @data = ();
    while ( defined (my $line = <DATA>) ) {
        next unless $line =~ m/\w+/;    # skip blank lines
        chomp $line;
        my @line = split (",", $line);
        # pair each column name with the field at the same index
        my %tmp = map { $columns[$_] => $line[$_] } 0..$#columns;
        push @data, \%tmp;              # %tmp is new on every pass, so each ref is distinct
    }
    return @data;
}
```

Warning: I haven't tested the above code. It may harbor typos and silly logic errors.
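A short sketch of where the two methods diverge; the hash contents and the names `@copies`, `@aliases`, and `%shared` are made up for illustration. Modifying the hash after the push shows which array holds a snapshot and which holds a live reference, and the loop at the end reproduces the kind of bug that can arise when a single hash declared outside the loop is referenced repeatedly.

```perl
use strict;
use warnings;

my %tmp = (ID => 1);

my (@copies, @aliases);
push @copies,  { %tmp };   # Method 1: snapshot of the current contents
push @aliases, \%tmp;      # Method 2: reference to %tmp itself

$tmp{ID} = 2;              # change the original after pushing

print "copy:  $copies[0]{ID}\n";    # the snapshot is unaffected
print "alias: $aliases[0]{ID}\n";   # the reference sees the change

# One hash declared outside the loop, with a reference pushed on each pass.
my %shared;
my @rows;
for my $id (1 .. 3) {
    %shared = (ID => $id);
    push @rows, \%shared;  # every element points at the same hash
}
print join(',', map { $_->{ID} } @rows), "\n";
```

In the `read_dat` code above, `%tmp` is declared with `my` inside the loop body, so each pass gets a new hash and Method 2 does not run into this problem there.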
Re: Array of Hashes population by Roy Johnson (Monsignor) on Mar 06, 2008 at 17:10 UTC
A slice assignment would spare you walking through the columns and maintaining a counter.

```perl
#!perl
use strict;
use warnings;

my @columns = qw(col1 col2 col3);
my @data = ();
while (<DATA>) {
    chomp;
    my %tmp;
    @tmp{@columns} = split ',';    # hash slice: assign every column in one statement
    push @data, \%tmp;
}

use Data::Dumper;
print Dumper \@data;

__DATA__
one,two,three
four,five,six
```

Caution: Contents may have been coded under pressure.
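One detail of the slice assignment worth knowing is what happens when a line has fewer fields than there are columns. The sketch below uses invented sample strings; the keys with no matching field are still created, with `undef` as their value.

```perl
use strict;
use warnings;

my @columns = qw(col1 col2 col3);

# Full line: each column name is paired with the field in the same position.
my %row;
@row{@columns} = split /,/, 'one,two,three';
print join(' ', map { "$_=$row{$_}" } @columns), "\n";

# Short line: the slice still creates col3, but its value is undef.
my %short;
@short{@columns} = split /,/, 'four,five';
print "col3 exists\n"     if exists $short{col3};
print "col3 is not set\n" if !defined $short{col3};
```

Checking `defined` on the last column is therefore a cheap way to detect truncated input lines.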
Re: Array of Hashes population by EvanCarroll (Chaplain) on Mar 06, 2008 at 17:32 UTC
You're doing this the difficult way. Use DBD::CSV; at the least, that makes all of the methods in the DBI available to you.

```perl
$hash_ref = $dbh->selectall_hashref($statement, $key_field);
$hash_ref = $dbh->selectall_hashref($statement, $key_field, \%attr);
$hash_ref = $dbh->selectall_hashref($statement, $key_field, \%attr, @bind_values);
```

The $key_field parameter defines which column, or columns, are used as keys in the returned hash. It can either be the name of a single field, or a reference to an array containing multiple field names. Using multiple names yields a tree of nested hashes. Of course, with this method you have to know how to write a select statement...

Update: If you're looking for an AoH, just fetch each row as a hash reference and push it:

```perl
my @rows;
while ( my $row = $sth->fetchrow_hashref ) { push @rows, $row }
```