in reply to Re^4: bit by overhead
in thread bit by overhead

What value or range of values would you see if you added:

sub to_cache {
    my $ticker = shift;
    my $data = shift;
    print scalar @$data;    ############# What would be printed?
    my @list;
    foreach (@$data) {
        push @list, pack("A10FFFFL", @$_);
    }
    $data_cache{$ticker} = \@list;
}

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^6: bit by overhead
by Anonymous Monk on Jan 06, 2011 at 21:06 UTC
    For the current example causing a problem, 251. However, this can be a wide variety of lengths, for long and detailed reasons.

      Making a small change to the implementation of your routines yields a 3x performance improvement:

      #! perl -slw
      use strict;
      use Benchmark qw[ cmpthese ];

      my %data_cache;

      sub to_cache {
          my $ticker = shift;
          my $data = shift;
          my @list;
          foreach (@$data) {
              push @list, pack("A10FFFFL", @$_);
          }
          $data_cache{$ticker} = \@list;
      }

      sub from_cache {
          my $ticker = shift;
          my $data = $data_cache{$ticker};
          my @rval;
          foreach (@$data) {
              my @row = unpack("A10FFFFL", $_);
              push @rval, \@row;
          }
          return \@rval;
      }

      my %cache2;

      sub to_cache2 {
          my( $ticker, $data ) = @_;
          $cache2{$ticker} = [ map pack("A10FFFFL", @$_), @$data ];
      }

      sub from_cache2 {
          my $ticker = shift;
          return [ map unpack("A10FFFFL", $_), $cache2{$ticker} ];
      }

      our @AoA = map[ '20110106', map( rand( 1e5), 1..4 ), int( rand 1000 ) ], 1 .. 251;

      cmpthese -1, {
          orig => sub {
              to_cache( $_, \@AoA ) for 1 .. 100;
              my $ref = from_cache( $_ ) for 1 .. 100;
          },
          mod1 => sub {
              to_cache2( $_, \@AoA ) for 1 .. 100;
              my $ref = from_cache2( $_ ) for 1 .. 100;
          },
      };

      __END__
      C:\test>880868-2
             Rate orig mod1
      orig 8.13/s   -- -76%
      mod1 33.4/s 310%   --

      Do you use all 251 sets of 6 values every time you retrieve the data from the cache?

      The gist of where I'm going with this is that if you don't use them all each time, you might be better off with a two-level cache, so that you unpack less data each time.
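One way the two-level idea could look is sketched below. It is only an illustration under stated assumptions: the names `to_cache_2level`, `row_from_cache`, `%packed` and `%unpacked` are hypothetical and do not appear in the thread, and it assumes callers can ask for rows by index. Rows stay packed until someone asks for them, and each row is unpacked at most once (the `//=` operator needs perl 5.10 or later).

```perl
use strict;
use warnings;

my $TEMPLATE = "A10FFFFL";    # same record layout as the thread's examples

my %packed;      # level 1: ticker => array ref of packed row strings
my %unpacked;    # level 2: ticker => { row index => unpacked row (array ref) }

# Store every row in packed form and forget any rows unpacked earlier.
sub to_cache_2level {
    my ( $ticker, $data ) = @_;
    $packed{$ticker} = [ map { pack $TEMPLATE, @$_ } @$data ];
    delete $unpacked{$ticker};
    return;
}

# Unpack only the requested row, and remember it for the next request.
sub row_from_cache {
    my ( $ticker, $i ) = @_;
    my $rows = $packed{$ticker} or return;
    return if $i < 0 || $i > $#$rows;
    return $unpacked{$ticker}{$i} //= [ unpack $TEMPLATE, $rows->[$i] ];
}
```

The trade-off is that every row kept in `%unpacked` brings back the per-array overhead the packing was meant to avoid, so this only pays off if a small share of the rows is touched.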

      Or, if you do use all the values for each ticker each time, could you not cache the results of whatever you do with them, rather than the raw data itself?
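Caching the derived result instead of the raw rows might look something like the sketch below. The thread never says what is done with the data, so `summarize` is a purely hypothetical stand-in (the mean of column 4), and `cached_summary` and `%result_cache` are likewise invented names.

```perl
use strict;
use warnings;

my %result_cache;    # ticker => derived result (hypothetical name)

# Hypothetical stand-in for "whatever you do with them": the mean of
# column 4 across all rows. The real computation is not shown in the thread.
sub summarize {
    my ($data) = @_;
    return unless @$data;
    my $sum = 0;
    $sum += $_->[4] for @$data;
    return $sum / @$data;
}

# Compute once per ticker; later calls return the stored value
# without looking at the raw rows again.
sub cached_summary {
    my ( $ticker, $data ) = @_;
    $result_cache{$ticker} = summarize($data)
        unless exists $result_cache{$ticker};
    return $result_cache{$ticker};
}
```

With this shape there is nothing to pack or unpack on a cache hit; the cost is that the cached value has to be invalidated by hand whenever a ticker's raw data changes.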



        Or, if you do use all the values for each ticker each time

        ...you could collapse the remaining array too.

        Update: It's slower though??

        my %cache3;

        sub to_cache3 {
            my( $ticker, $data ) = @_;
            $cache3{$ticker} = pack "(A10FFFFL)*", map @$_, @$data;
        }

        sub from_cache3 {
            my $ticker = shift;
            return [ map [ unpack "A10FFFFL", $_ ], $cache3{$ticker} =~ /.{46}/sg ];
        }
               Rate mod2 orig mod1
        mod2 9.43/s   --  -1% -72%
        orig 9.52/s   1%   -- -72%
        mod1 33.7/s 257% 253%   --
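A variation on the single-string idea, offered only as a sketch: the record length can be taken from `pack` itself rather than hard-coded as 46 (the size of an `F` depends on how perl was built), and the records can be split with `unpack` instead of a regex. The names `to_cache3b`, `from_cache3b` and `%cache3b` are hypothetical, and no timing has been taken, so this may well be no faster than the regex version.

```perl
use strict;
use warnings;

my $TEMPLATE = "A10FFFFL";

# Record length as produced on this perl, instead of a hard-coded 46.
my $RECLEN = length pack $TEMPLATE, '', (0) x 5;

my %cache3b;    # ticker => one packed string holding every row

# Flatten all rows and pack them back to back into a single string.
sub to_cache3b {
    my ( $ticker, $data ) = @_;
    $cache3b{$ticker} = pack "($TEMPLATE)*", map { @$_ } @$data;
    return;
}

# Cut the string into fixed-length records, then unpack each record
# into its own anonymous array.
sub from_cache3b {
    my $ticker = shift;
    return [
        map { [ unpack $TEMPLATE, $_ ] }
        unpack "(a$RECLEN)*", $cache3b{$ticker}
    ];
}
```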
        Perhaps I'm missing something here:
        sub from_cache2 {
            my $ticker = shift;
            return [ map unpack("A10FFFFL", $_), $cache2{$ticker} ];
        }
        Shouldn't that be:
        sub from_cache {
            my $ticker = shift;
            my $ref = $data_cache{$ticker};
            return [ map [unpack("A10FFFFL", $_)], @$ref ];
        }
        The other way doesn't seem to work, since map wants a list, not an array reference. And since the data is stored as an array of arrays, shouldn't each row be returned as a reference?
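The difference between the two forms can be checked with a small script like the one below, which uses two made-up rows rather than the thread's 251. In the posted form, `map` receives a single element (the array reference itself), so `unpack` runs once on the stringified reference. If that reading is right, the mod1 timing reported earlier in the thread covers far less unpacking work than orig, and the speed-up figure may not hold once the dereference is fixed.

```perl
use strict;
use warnings;

my $TEMPLATE = "A10FFFFL";
my @rows = (
    [ '20110106', 1.5, 2.5, 3.5, 4.5, 7 ],
    [ '20110107', 5.5, 6.5, 7.5, 8.5, 9 ],
);
my $packed = [ map { pack $TEMPLATE, @$_ } @rows ];

# As posted: map sees ONE element (the array reference itself), so unpack
# runs once, on the stringified reference, not on the packed rows.
my $as_posted = [ map { unpack $TEMPLATE, $_ } $packed ];

# As suggested: dereference, and wrap each row in its own anonymous array.
my $corrected = [ map { [ unpack $TEMPLATE, $_ ] } @$packed ];

printf "as posted: first element is '%s'\n", $as_posted->[0];
printf "corrected: %d rows, first row = %s\n",
    scalar @$corrected, join ',', @{ $corrected->[0] };
```

The first line of output should show a fragment of something like `ARRAY(0x...)` rather than a date, while the corrected form gives back one array reference per original row.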