Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Total beginner here. When building a 'complex data structure' out of nesting lists and hashes, is it better to do this:
@a = ( [1,2,{foo=>"foo",bar=>"bar"}], [3,4,{baz=>"baz",far=>"baz"}] );
or this:
$r = [ [1,2,{foo=>"foo",bar=>"bar"}], [3,4,{baz=>"baz",far=>"baz"}] ];
? I've seen the former more often, but I'm not sure why. To me, the latter makes it more natural to pull out elements, since it's references all the way down to the scalars ($r->[0]->[2]->{foo}). Any input welcome.

Re: when to use lists/hash vs references?
by almut (Canon) on Jul 15, 2010 at 21:01 UTC

    As long as you know what the difference is, it doesn't really matter much which one you use.

    Under some circumstances, it might be more "natural" to use a plain hash, for example when Perl syntactically requires a hash, as in keys %hash (because then you don't have to write keys %$hash to dereference a hashref).  In other cases, it might be more natural to use a hashref in the first place, e.g. when you want to pass it to a function without copying/flattening it (so you can say func($hash) instead of func(\%hash) ).  But ultimately, it's just a matter of personal preference.
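To make the two cases concrete, here is a minimal sketch. The sub name sorted_keys and the sample data are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a hash reference and returns its keys, sorted.
sub sorted_keys {
    my ($href) = @_;
    return sort keys %$href;    # a hashref has to be dereferenced for keys
}

my %hash = ( foo => 1, bar => 2 );    # plain hash
my $hash = { foo => 1, bar => 2 };    # hash reference

my @from_plain = sorted_keys( \%hash );   # take a reference explicitly
my @from_ref   = sorted_keys( $hash );    # already a reference

print "@from_plain\n";    # bar foo
print "@from_ref\n";      # bar foo
```

With the plain hash you pay the backslash at the call site; with the hashref you pay the %$ inside anything that needs a real hash.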

    P.S.: the arrows between indices are optional (because there's nothing to disambiguate — nested structures are always references), i.e. your sample case could also be written as

    $a[0][2]{foo}      # plain array
    $r->[0][2]{foo}    # array ref
Re: when to use lists/hash vs references?
by BrowserUk (Patriarch) on Jul 15, 2010 at 21:19 UTC

    Not the only reason, and maybe not the most important, but here's one: the extra dereference costs.

    c:\test>p1
    @a = map[1..3],1..1e6;;
    $r = \@a;;
    cmpthese -1, { a=>q[ ++$a[$_][1] for 0..$#a ], b=>q[ ++$r->[$_][1] for 0..$#$r ] };;
        Rate    b    a
    b 3.94/s   -- -16%
    a 4.72/s  20%   --

    Another is that the extra syntax can get unwieldy for involved expressions.
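For anyone who wants to repeat the comparison outside an interactive shell, something like the following standalone script should do. It is a sketch, not the exact test above: it assumes lexical variables and code refs instead of package variables and q[] strings, and it uses a smaller array (100_000 rows), so the rates will differ from the figures quoted.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @a = map { [ 1 .. 3 ] } 1 .. 100_000;    # smaller than the 1e6 rows above
my $r = \@a;                                 # reference to the same array

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    a => sub { ++$a[$_][1]   for 0 .. $#a  },    # direct array access
    b => sub { ++$r->[$_][1] for 0 .. $#$r },    # one extra dereference
} );
```

Both variants touch the same underlying data, so the only difference being measured is how the outer array is reached.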


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The extra dereference costs...

      So does the stack manipulation to pass or return a large list.

        Of course. So don't do that.

        Declare an array, and pass a reference.

        I also prefer to use

        sub arrayManip {
            our @a; local *a = shift;
            # do stuff with @a
        }
        my @array = (...);
        arrayManip( \@array );

        But you doubtless consider that too "complex".
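A runnable sketch of the same glob-aliasing idiom, for readers who have not met it. The sub name double_in_place and the sample data are assumptions made for the example; the point is that @a inside the sub is an alias for the caller's array, not a copy.

```perl
use strict;
use warnings;

# Hypothetical example sub: doubles every element of the caller's array in place.
sub double_in_place {
    our @a;              # package variable @main::a
    local *a = shift;    # alias @a to the array behind the passed reference
    $_ *= 2 for @a;      # plain @a syntax, but it modifies the caller's data
    return scalar @a;    # number of elements processed
}

my @array = ( 1, 2, 3 );
my $count = double_in_place( \@array );
print "$count: @array\n";    # 3: 2 4 6
```

Because of the local, the alias is undone when the sub returns, so the package variable @a is left as it was.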

Re: when to use lists/hash vs references?
by FunkyMonk (Bishop) on Jul 15, 2010 at 21:16 UTC
    I've seen the former more often, but I'm not sure why... to me, the latter makes it more natural to pull out elements since it's references all the way down to the scalars ($r->[0]->[2]->{foo})
    Actually, only the first dereferencing arrow is needed for a reference:
    # using your @a and $r
    say $r->[0]->[2]->{foo}; # foo
    say $r->[0][2]{foo};     # foo
    say $a[0]->[2]->{foo};   # foo
    say $a[0][2]{foo};       # foo

    So now I think you'll find that the array is the most "natural". As almut said, it's mostly preference.

    I think you'll find perldsc, the Perl Data Structures Cookbook, a good read.

Re: when to use lists/hash vs references?
by TomDLux (Vicar) on Jul 16, 2010 at 01:29 UTC

    Basically, the question is: should you use an array of deeply nested structures, or a reference to an array of deeply nested structures?

    I occasionally use an array, if I'm certain it will be created, used and destroyed all within one routine. If you're going to pass it around, you're going to pass the reference, anyway, so might as well create the reference and store it in a scalar.
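A small sketch of why you end up passing the reference anyway: arrays passed directly are flattened into a single argument list. The subs count_args and count_rows are hypothetical, written only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub count_args { return scalar @_ }    # number of arguments received

sub count_rows {
    my ($rows) = @_;
    return scalar @$rows;              # number of rows behind the reference
}

my @a = ( [ 1, 2 ], [ 3, 4 ] );
my @b = ( [ 5, 6 ] );

my $flat = count_args( @a, @b );      # 3: both arrays flattened into one list
my $refs = count_args( \@a, \@b );    # 2: two references, boundaries preserved
my $rows = count_rows( \@a );         # 2: rows reached through the reference

print "$flat $refs $rows\n";          # 3 2 2
```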

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

Re: when to use lists/hash vs references?
by aquarium (Curate) on Jul 16, 2010 at 03:56 UTC
    This has turned into a bit of a heated argument. I'm sure it helps us all clarify our views regarding coding preferences, as long as we don't get too personal.
    My preference for the most natural coding is without syntactic sugar or other unnatural synthetic additives.
    Seriously, my advice is that once your data structure starts requiring treatment/code as an object, the reference to the object becomes the natural way to handle it.
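As a rough illustration of that last point: a Perl object is a blessed reference, so once the structure gets its own methods you are holding a reference in any case. The class name Matrix and its methods are assumptions invented for this sketch.

```perl
use strict;
use warnings;

# Hypothetical class for illustration; the name and methods are assumptions.
package Matrix;

sub new {
    my ( $class, @rows ) = @_;
    return bless { rows => [@rows] }, $class;    # the object is a blessed hashref
}

sub cell {
    my ( $self, $i, $j ) = @_;
    return $self->{rows}[$i][$j];                # same nested access as before
}

package main;

my $m     = Matrix->new( [ 1, 2 ], [ 3, 4 ] );
my $value = $m->cell( 1, 0 );
print "$value\n";    # 3
```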
    the hardest line to type correctly is: stty erase ^H