in reply to Bitten by the worst case (or why it pays to know whats inside the black box)

Hello, first of all congratulations on finding this out. You're right that people tend to use components or data structures without understanding them at all.

The only thing that seems suspicious to me is that you have to use an 8000-value key to query your cache. Are you sure this is the only way to go? Maybe there is a shorter input query with fewer independent parameters you could use as a cache key. You could accept a rough hit and then check afterwards whether the entry is actually appropriate or not.
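A minimal sketch of that idea (the function and cache names here are made up; Digest::MD5 is a core module): collapse the long parameter list into a fixed-size digest for the lookup, and keep the full key next to the value so a "rough hit" can be verified cheaply.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my %cache;

# Collapse the long parameter list into a short, fixed-size key.
# "\0" is used as a separator on the assumption that the parameter
# values themselves never contain NUL bytes.
sub cache_key { md5_hex( join "\0", @_ ) }

sub cache_get {
    my @params = @_;
    my $entry  = $cache{ cache_key(@params) } or return;
    # Verify the rough hit: confirm the stored full key really matches.
    return unless $entry->{full_key} eq join "\0", @params;
    return $entry->{value};
}

sub cache_set {
    my ( $value, @params ) = @_;
    $cache{ cache_key(@params) } = {
        full_key => join( "\0", @params ),
        value    => $value,
    };
}
```

The digest keeps the hash keys short regardless of how many parameters feed into them; the verification step guards against the (unlikely) digest collision.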

Also, since the structure-building process is complex and costly, you might consider storing the cached results on disk or in a database and reading them back when needed. That way you don't use RAM for them, and a very simple cron job could purge the stale cache entries.
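For example, a rough sketch using the core Storable module (the cache directory and file naming are made up for illustration):

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Hypothetical cache directory; a cron job could purge stale entries
# with something like:  find $CACHE_DIR -type f -mtime +7 -delete
my $CACHE_DIR = File::Spec->tmpdir;

sub cache_file {
    my ($key) = @_;
    return File::Spec->catfile( $CACHE_DIR, 'cache-' . md5_hex($key) . '.sto' );
}

sub disk_cache_get {
    my ($key) = @_;
    my $file = cache_file($key);
    return -e $file ? retrieve($file) : undef;
}

sub disk_cache_set {
    my ( $key, $data ) = @_;
    store( $data, cache_file($key) );
}
```

Storable serializes arbitrarily nested Perl structures, so the expensive structure is built once and thereafter just deserialized from disk.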

From what I read, the process itself might take a couple of seconds, since you say it was 10x worse with sorting. I don't think reading a cached entry from disk should take more than 0.2 seconds on average (and likely much less), so that's still a 10x improvement without the risk and hassle of exhausting physical RAM or ending up in swap.

Just my $0.02


Replies are listed 'Best First'.
Re^2: Bitten by the worst case (or why it pays to know whats inside the black box)
by demerphq (Chancellor) on Jun 27, 2004 at 09:28 UTC

    Actually you may have a point regarding the key. However I'm not so sure about the on-disk versus in-RAM issue. I'm on a box with a lot of RAM, and virtually all of the complex data structures will be reused several times, so my thinking is that I will just let the OS decide whether the data should be swapped out or not. Only one of the large structures is used at a time, and it is used for a reasonable amount of time, so I suspect that avoiding swapping isn't a win at all, and possibly even a loss. BTW, the 8000x2 data elements end up being about 100k+ of RAM, but even there it's more the construction time than the memory that I'm worried about. It takes a while to build that 100k structure. (I've actually contemplated rewriting Tie::Array::PackedC in C, or recoding to not use it at all to reduce the run time, but at this point I'm overall satisfied with the performance.)
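    For anyone curious what the packed representation buys, here is a plain-Perl sketch of the underlying idea (this is not Tie::Array::PackedC's actual interface, just the pack/unpack technique it is built on): 8000 pairs of 32-bit integers stored in a single string.

```perl
use strict;
use warnings;

# 8000 pairs of signed 32-bit integers packed into one string:
# 8000 * 2 * 4 bytes = 64000 bytes, versus a Perl array of 16000
# scalars, which would cost several times that in SV overhead.
# (The doubled values are just sample data.)
my $packed = pack 'l*', map { ( $_, $_ * 2 ) } 1 .. 8000;

# Random access without unpacking the whole string:
# pair $i lives at byte offset $i * 8.
sub get_pair {
    my ($i) = @_;
    return unpack 'l2', substr( $packed, $i * 8, 8 );
}
```

    The trade-off demerphq mentions is visible here: the memory footprint is tiny and predictable, but every element access goes through substr/unpack, which is where the construction and access time goes.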

    Anyway, more brainfood to digest. Thanks.


      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi