in reply to Re (tilly) 3: Efficiency Question
in thread Efficiency Question

You can speed up the _clear_cache routine, which, being called once every 30 insertions, probably eats up the most cycles, like so:
sub clear_cache {
    # ids sorted from most-hit to least-hit
    my @ids = sort { $count{$b} <=> $count{$a} } keys %count;
    # drop the top $reset_size - 1 survivors from the list...
    splice @ids, 0, $reset_size - 1;
    # ...then delete everything that's left in one hash slice
    delete @cache{@ids};
    %count = ();
    @count{keys %cache} = (0) x $reset_size;
    $size = $reset_size;
}
I've found it's faster to delete elements from an existing hash than to rebuild the hash, though this of course depends mostly on the ratio of deleted to total hash entries.
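If you want to see where the crossover lies for your own data, a Benchmark sketch along these lines will show it (the sizes and contents of %cache below are made up purely for illustration):

use strict;
use warnings;
use Benchmark qw(cmpthese);

# toy data: 10_000 cached entries, 2_000 of which should go away
my %cache  = map { $_ => "value$_" } 1 .. 10_000;
my @doomed = (1 .. 2_000);

cmpthese(-2, {
    # delete the doomed keys in place, then restore them so each
    # iteration starts from the same hash
    delete_slice => sub {
        my @saved = delete @cache{@doomed};
        @cache{@doomed} = @saved;
    },
    # build a brand-new hash holding only the survivors
    rebuild => sub {
        my %keep;
        @keep{@doomed} = ();
        my %new = map { $_ => $cache{$_} } grep { !exists $keep{$_} } keys %cache;
    },
});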

Also, this isn't quite a frequency-expiration cache, since time doesn't factor into the expiration routine, and you stand a pretty good chance of getting stale cache entries that once had a bunch of hits but remain in the cache even though they are now entirely unused. I'd probably switch the count over to something like a shift-based MRU, called once every 30 fetches:

my $do_shift = 0;

sub get_elem {
    my $id = shift;
    if (++$do_shift == 30) {
        shift_mru();
        $do_shift = 0;
    }
    if (exists $count{$id}) {
        $count{$id} |= 0x80000000;   # mark as just-used by setting the high bit
        return $cache{$id};
    }
    else {
        return;
    }
}

sub shift_mru {
    # age every entry: halving the count lets old hits fade away
    while (my ($id, $count) = each %count) {
        $count{$id} >>= 1;
    }
}
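To see how the shift-based aging separates live entries from stale ones, here's a tiny standalone sketch (the hot/stale keys and the eight rounds are made up, and each round stands in for one call to shift_mru):

use strict;
use warnings;

my %count = (hot => 0, stale => 0);

for my $round (1 .. 8) {
    $count{hot}   |= 0x80000000;                  # 'hot' is fetched every round
    $count{stale} |= 0x80000000 if $round <= 2;   # 'stale' stops being fetched after round 2
    $_ >>= 1 for values %count;                   # one aging pass, as in shift_mru
    printf "round %d: hot=%08x stale=%08x\n", $round, @count{qw(hot stale)};
}

An id that keeps getting fetched stays near the top of the range, while one that stops being used halves its way down to zero within a few passes, so the next clear_cache throws it out.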
I also seem to recall that there are ways to optimize a sort when you only need the top few results rather than the whole ordering, but that's getting rather hairy :-)
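For what it's worth, a rough sketch of one such approach: a single pass over %count that keeps a running top-N list, so clear_cache never has to sort the whole thing (top_ids is a name I made up; %count and $reset_size are as above):

# return the $n most-hit ids without sorting all of %count
sub top_ids {
    my ($n) = @_;
    my @top;   # ids in descending order of count, at most $n of them
    while (my ($id, $hits) = each %count) {
        next if @top == $n && $hits <= $count{ $top[-1] };
        my $i = @top;
        $i-- while $i > 0 && $count{ $top[$i - 1] } < $hits;
        splice @top, $i, 0, $id;    # insert in place to keep the order
        pop @top if @top > $n;      # never hold more than $n ids
    }
    return @top;
}

clear_cache could then delete everything whose id isn't returned by top_ids($reset_size - 1) and skip the full sort entirely.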
   MeowChow                                   
               s aamecha.s a..a\u$&owag.print