
Re^3: Data structures benchmark(pack vs. arrays vs. hashes vs. strings)

by BrowserUk (Patriarch)
on Dec 10, 2011 at 01:08 UTC [id://942755]


in reply to Re^2: Data structures benchmark(pack vs. arrays vs. hashes vs. strings)
in thread Data structures benchmark(pack vs. arrays vs. hashes vs. strings)

the purpose is to take a string, split it and store it in memory in such a way that you can pass it around and not need to split it again when receiving it in some other part of the program.

Then nothing will be as fast as constructing an array of arrays and passing a reference to it around. It could not be so.
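A minimal sketch of that approach, assuming comma-separated records. The sample data and the sub names build_aoa and sum_column are hypothetical and chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical input: one comma-separated record per string.
my @lines = ( '1,2,3', '4,5,6', '7,8,9' );

# Split each record exactly once and keep the fields as a subarray.
sub build_aoa {
    my @records = @_;
    return [ map { [ split /,/ ] } @records ];
}

# Receives only a reference; nothing is re-split or copied.
sub sum_column {
    my( $AoA, $col ) = @_;
    my $sum = 0;
    $sum += $_->[ $col ] for @$AoA;
    return $sum;
}

my $AoA = build_aoa( @lines );
print sum_column( $AoA, 1 ), "\n";    # prints 15
```

Any other part of the program that is handed $AoA can index the fields directly, so the split cost is paid only once.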

Reading between the lines, your main problem seems to be that you are insisting on copying the subarrays to local named scalars each time before using them, rather than just using them in-situ.

I.e., you are doing something like:

    sub process {
        my( $AoA, $thingToProcess ) = @_;
        my( $v1, $v2, $v3, $v4, $v5, $v6, $v7 ) = @{ $AoA->[ $thingToProcess ] };
        my( $r1, $r2, $r3, $r4, $r5, $r6, $r7 ) = (
            ... some calculation(s) involving $v1, $v2, $v3, $v4, $v5, $v6, $v7 ...
        );
        @{ $AoA->[ $thingToProcess ] } = ( $r1, $r2, $r3, $r4, $r5, $r6, $r7 );
        return;
    }

When you could be doing:

    sub process {
        my( $AoA, $thingToProcess ) = @_;
        $AoA->[ $thingToProcess ][ 3 ] = $AoA->[ $thingToProcess ][ 1 ] * $AoA->[ $thingToProcess ][ 2 ];
        ...
        return;
    }
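To make the contrast concrete, here is a small runnable sketch of both styles side by side. The sample data and the sub names process_copy and process_in_situ are assumptions for illustration, not taken from the original benchmark:

```perl
use strict;
use warnings;

# Hypothetical data: each subarray holds seven numeric fields.
my $AoA = [
    [ 0, 2, 3, 0, 0, 0, 0 ],
    [ 0, 4, 5, 0, 0, 0, 0 ],
];

# Variant that copies the subarray out and back again.
sub process_copy {
    my( $AoA, $i ) = @_;
    my @v = @{ $AoA->[ $i ] };
    $v[ 3 ] = $v[ 1 ] * $v[ 2 ];
    @{ $AoA->[ $i ] } = @v;
    return;
}

# Variant that works on the subarray in place.
sub process_in_situ {
    my( $AoA, $i ) = @_;
    $AoA->[ $i ][ 3 ] = $AoA->[ $i ][ 1 ] * $AoA->[ $i ][ 2 ];
    return;
}

process_copy( $AoA, 0 );
process_in_situ( $AoA, 1 );
print join( ' ', $AoA->[0][3], $AoA->[1][3] ), "\n";    # prints "6 20"
```

Both variants leave the same result in the structure; the second simply skips the copy-out and copy-back steps.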

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Replies are listed 'Best First'.
Re^4: Data structures benchmark(pack vs. arrays vs. hashes vs. strings)
by spx2 (Deacon) on Dec 10, 2011 at 01:14 UTC

    This is correct. However, writing $AoA->[ $thingToProcess ][ something ] each time could lead to hard-to-understand code.

    Also, if I use the "copying the subarrays to local named scalars" step every time in the benchmarks, those costs cancel each other out, so the benchmark is still valid from this point of view. Do you agree?

      Also, if I use the "copying the subarrays to local named scalars" step every time in the benchmarks, those costs cancel each other out, so the benchmark is still valid from this point of view. Do you agree?

      Not really, no. The problem is you have equations something like:

      (call_time=1000) + (allocate_names=150) + (copy_values=250) + (extra_bit_a=15)
      versus
      (call_time=1000) + (allocate_names=150) + (copy_values=250) + (the_extra_bit=5)

      The extra bit is so small relative to the set-up and tear-down that you cannot accurately measure the differences you are interested in. They just get lost in the noise of the overheads.
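Working through those illustrative figures shows how far the overhead masks the difference. This is only arithmetic on the made-up numbers above, not a measurement; the variable names are mine:

```perl
use strict;
use warnings;

# Illustrative costs taken from the equations above (arbitrary units).
my %overhead = ( call_time => 1000, allocate_names => 150, copy_values => 250 );
my $fixed = 0;
$fixed += $_ for values %overhead;          # 1400

my( $extra_a, $extra_b ) = ( 15, 5 );       # the parts actually being compared

my $total_a = $fixed + $extra_a;            # 1415
my $total_b = $fixed + $extra_b;            # 1405

# Ratio of what we care about versus ratio of what the benchmark sees.
my $true_ratio     = $extra_a / $extra_b;   # 3
my $measured_ratio = $total_a / $total_b;   # roughly 1.007

printf "true ratio %.2f, measured ratio %.4f\n", $true_ratio, $measured_ratio;
```

A threefold difference in the code under test shows up as well under a 1% difference in the totals, which is easily smaller than run-to-run variation.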

      This is correct. However, writing $AoA->[ $thingToProcess ][ something ] each time could lead to hard-to-understand code.

      I sympathise with this. In this case I would construct the code differently. Instead of calling the subroutines as:

      sub process {
          my( $AoA, $thingToProcess ) = @_;
          $AoA->[ $thingToProcess ][ 3 ] = $AoA->[ $thingToProcess ][ 1 ] * $AoA->[ $thingToProcess ][ 2 ];
          ...
          return;
      }
      ...
      process( $AoA, 123 );

      Do it this way:

      ## Use meaningful names obviously!!
      use constant {
          FIRST => 0, SECOND => 1, THIRD => 2, FOURTH => 3,
          FIFTH => 4, SIXTH => 5, SEVENTH => 6,
      };

      sub process {
          our @s;              # make the package variable @s usable by its short name
          local *s = shift;    # alias @s to the subarray passed in by reference
          $s[ FOURTH ] = $s[ SECOND ] + $s[ THIRD ];
          ...
          return;
      }

      process( $AoA[ $thingToProcess ] );

      The first line (our) lets us refer to the package (global) variable @s by its short name inside the sub, even under strict.

      The second line (local) localises the typeglob *s and aliases @s to the subarray within the external @AoA; the alias is undone when the sub returns.

      The use constant gives us meaningful names for the subarray elements.

      The effect is direct, in-situ access to the subarrays without the need to copy and via short, meaningful names.

      • Aliasing is a very cheap operation -- just a pointer assigned.
      • All data copying is avoided.
      • Short, meaningful names.
      • Constants are resolved at compile time making access very fast.

        Real constants that is! Don't be fooled by the crass, laborious & slow, oxymoronic poor substitutes of "ReadOnly variables".

      The result is clean, safe, very readable and maintainable code that is also efficient.
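A self-contained sketch of the aliasing technique, for anyone who wants to try it. The constant names WIDTH, HEIGHT and AREA and the sample data are hypothetical, and @s is declared at file scope here rather than inside the sub, which has the same effect:

```perl
use strict;
use warnings;

# Hypothetical field names; real code would use names that describe the data.
use constant { WIDTH => 0, HEIGHT => 1, AREA => 2 };

our @s;    # package variable that process() aliases to one subarray per call

sub process {
    local *s = shift;                           # @s now aliases the caller's subarray
    $s[ AREA ] = $s[ WIDTH ] * $s[ HEIGHT ];    # writes straight into that subarray
    return;
}

my @AoA = ( [ 2, 3, 0 ], [ 4, 5, 0 ] );
process( $AoA[ $_ ] ) for 0 .. $#AoA;

print join( ' ', map { $_->[ AREA ] } @AoA ), "\n";    # prints "6 20"
```

Because local restores the glob on exit, @s is empty again once process() returns, so nothing leaks between calls.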


