in reply to multidimensional hash emulation vs hash of hashes

First, anytime you are curious about a performance question, you should Benchmark. Nothing gives you truth like experiment.

#!/usr/bin/perl -w
use strict;
use Benchmark qw':all :hireswallclock';

my %the_hash = (
    empl_john_id         => 13,
    empl_john_position   => 'slave',
    empl_bob_id          => 0,
    empl_bob_position    => 'manager',
    client_fred_id       => 2,
    client_fred_phone    => 12345,
    client_goldman_id    => 0,
    client_goldman_phone => 666
);

my %the_hash2 = (
    empl => {
        john => { id => 13, position => 'slave' },
        bob  => { id => 0,  position => 'manager' }
    },
    client => {
        fred    => { id => 2, phone => 12345 },
        goldman => { id => 0, phone => 666 }
    }
);

cmpthese(10**6, {
    'emulation' => sub { $the_hash{client_goldman_phone}++ },
    'HoH'       => sub { $the_hash2{client}{goldman}{phone}++ },
});

yields

(warning: too few iterations for a reliable count)
               Rate       HoH emulation
HoH       1883239/s        --      -27%
emulation 2564103/s       36%        --

In a literal sense, multidimensional hash emulation will be faster than using a hash of hashes (at least for this test). This is mostly because the flat hash needs a single lookup, whereas the hash of hashes needs three lookups and has to dereference the nested hashes along the way (I think). However, this is highly unlikely to be the bottleneck in your code, and multidimensional hash emulation is considered poor form because it makes code harder to understand, and hence harder to debug and maintain. The technique mainly predates the introduction of proper references into Perl.
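To make the maintainability point concrete, here is a minimal sketch of one task that gets awkward with flat keys: listing everything stored under one top-level key. Note that the benchmark builds its flat keys by hand with underscores, while this sketch uses Perl's built-in emulation, where a comma-separated subscript list is joined with $; (the subscript separator). The data and variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Built-in emulation: the key list is joined with $; into one flat string key.
my %flat;
$flat{'client', 'goldman', 'phone'} = 666;
$flat{'client', 'fred',    'phone'} = 12345;

# The same (illustrative) data as a real hash of hashes.
my %hoh = (
    client => {
        goldman => { phone => 666 },
        fred    => { phone => 12345 },
    },
);

# Copy the separator into a plain variable so it is easy to use in a regex.
my $sep = $;;

# Listing the clients: the flat hash has to take every key apart again ...
my %seen;
for my $key (keys %flat) {
    my ($type, $name) = split /\Q$sep\E/, $key;
    $seen{$name} = 1 if $type eq 'client';
}
my @flat_clients = sort keys %seen;

# ... while the nested hash hands over the sub-level directly.
my @hoh_clients = sort keys %{ $hoh{client} };

print "flat: @flat_clients\n";   # prints "flat: fred goldman"
print "HoH:  @hoh_clients\n";    # prints "HoH:  fred goldman"
```

Both versions produce the same list, but the flat one scans and splits every key in the hash, and it silently breaks if a key part ever contains the separator.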

For a discussion of hash performance, see A short meditation about hash search performance.

Re^2: multidimensional hash emulation vs hash of hashes
by moritz (Cardinal) on Aug 18, 2011 at 15:52 UTC

    While certainly a good first step, this benchmark is very one-dimensional (benchmarks usually are :-)

    For example, the choice of data representation also dictates how you process the keys to access the information, and the data structure also needs to be built, which might take a significant amount of time.
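One way to fold those costs into the measurement is sketched below: each timed sub builds its structure from a list of records and assembles the key from its parts before the lookup. The @records layout and the build_flat / build_hoh helpers are assumptions made for this illustration, not something from the original post, and the numbers will depend on your data and Perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: one [ type, name, field, value ] record per entry.
my @records = (
    [ qw(empl   john    id),       13 ],
    [ qw(empl   john    position), 'slave' ],
    [ qw(empl   bob     id),       0 ],
    [ qw(empl   bob     position), 'manager' ],
    [ qw(client fred    id),       2 ],
    [ qw(client fred    phone),    12345 ],
    [ qw(client goldman id),       0 ],
    [ qw(client goldman phone),    666 ],
);

# Build the flat variant: join the three key parts with '_'.
sub build_flat {
    my %h;
    $h{ join '_', @{$_}[0 .. 2] } = $_->[3] for @records;
    return \%h;
}

# Build the nested variant: intermediate hashes are autovivified.
sub build_hoh {
    my %h;
    $h{ $_->[0] }{ $_->[1] }{ $_->[2] } = $_->[3] for @records;
    return \%h;
}

# Built once outside the timed code so the two variants can be compared.
my ($flat, $hoh) = (build_flat(), build_hoh());

# The key parts arrive separately, as they typically would in real code.
my ($type, $name, $field) = qw(client goldman phone);

# -1 means: run each sub for at least one CPU second.
cmpthese(-1, {
    emulation => sub {
        my $h = build_flat();
        $h->{ join '_', $type, $name, $field }++;
    },
    HoH => sub {
        my $h = build_hoh();
        $h->{$type}{$name}{$field}++;
    },
});
```

Using a negative count lets Benchmark pick the iteration count itself, which also avoids the "too few iterations" warning. A fuller comparison would time construction and access separately and use data of realistic size.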