Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re: Bitten by the worst case (or why it pays to know whats inside the black box)

by duff (Parson)
on Jun 26, 2004 at 23:23 UTC ( [id://369893]=note: print w/replies, xml ) Need Help??


in reply to Bitten by the worst case (or why it pays to know whats inside the black box)

Nice meditation!

After a bit of review I realized that the problem was due to the way I was implementing the cache lookup. The process was essentially this: produce the union of four queries each of which returned a variable number of two field records, (the first number being a telephone prefix, and the second being an ID that represented its pricing model) then join that data together to form a string key that could be used in a hash lookup to determine if the data was already in the cache.

Some other ideas:

  • use the queries themselves as the key into the cache instead of the results
  • since most of the data is sorted already, instead of just pushing onto the end, insert them at the correct place
  • and, inspired by a recent QOTW ... it shouldn't matter that the data are sorted asciibetically, just that they're sorted the same way each time. So you can do something like this:
    use List::Util qw(shuffle); my $alphabet = join '', shuffle 'a'..'z'; # do this only once and sav +e it to a file for later reading my $t; @union = map { $_->[1] } sort { $a->[0] cmp $b->[0] } map { eval "(\$t = lc \$_) =~ tr/$alphabet/a-z/"; [ $t, $_ ] +} @union;
    Not sure if there's a win there or not, but it's interesting anyway :)

Replies are listed 'Best First'.
Re^2: Bitten by the worst case (or why it pays to know whats inside the black box)
by demerphq (Chancellor) on Jun 26, 2004 at 23:41 UTC

    Hmm. Some interesting thoughts here. I can't use the queries as part of the problem is that the results for different queries can be the same. Although its only implied these are lists of Prefix=>Zone Assignments for customers. So two customers may have the same assigments. I think instead of merging the lists. (which isnt that terrible an idea BTW) I might actually rework the way the queries are executed so I can do the union of the four in a single query and let the DB sort it. The last idea certainly is an interesting approach to avoiding the shuffle.

    Thanks for the brainfood. :-)


    ---
    demerphq

      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi


      In general, if you can let the database do the sorting/merging of the four queries it is likely to be faster (after all, that's what a database server is optimized to do!).

      There may of course be situations where trying to get the database to do the right thing requires too much shoe-horning of the data to be worth it...

      Michael

        Yeah I reckon so as well. In fact between the various responses I've received I reckon there are a few different optimisations that can be had for cheap. I look forward to trying them out tomorrow.


        ---
        demerphq

          First they ignore you, then they laugh at you, then they fight you, then you win.
          -- Gandhi


Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://369893]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (5)
As of 2024-03-28 17:28 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found