Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Good afternoon, MrCromeDome. It seems you are having the same sort of problem as do a lot of search engines... needing to save cycles on many repeat queries. There are a lot of approaches to it, and it really is going to depend on the result set size, what hardware you have it on, etc. I'm going to assume that you have mySql as a database engine, as these tips will hold true for many many DBMs out there.

  1. First off, consider this, what is the upper limit size on your result set. Are you looking at 100s, 1000s, more? Because at a certain point, these caching mechanisms are not really going to help you. You'll have to do some performance tweaking to see what size category the bulk your results end up.
  2. Secondly, consider how often your data changes? This is essential in determining when your cache should drop dead, and the next search should store new data. If you have rapidly changing data (a good amount entered new each day), consider a cached time of perhaps 4 hours, or do it on off cycles, such as overnight.
In mySQL, whenever you request a large result set that has an "ORDER BY" statement in it, mySQL has to at least analyze the entire result set, then apply the limiter to it, to get the correct results. This can be quite tedious on the server, as you often pass over tons of results that you never pass back to the app. One way of caching is to get and hold all of the results that mySQL will come back to you with, and store them someplace fast to look them up in, such as another table.

A proposed table would take the form:
  • result_id = primary key int not null
  • dropdead = datetime, a future time in which to kill the result
  • searchstring = char(255) of search results.
  • resultcache = blob / text (or largeblob, etc)
Basically, you'd store the "SMITH JOHN" in the search string, and then form your result set in the resultcache, in some way that

Storage caching mechanisms can take two forms.
  • Binary column type, Storable, and freeze / thaw This method compresses the best, uncompresses fast, but is binary, and may be harder to debug.
  • A text data type (see the blob link above), Data::Dumper, and Data::Dumper->Dump / eval to store and then revive the data. This approach will store fewer results, is not as fast other approaches, but is easier to debug.
If you take a result set, and store it as an array, you can thaw (or eval), the entire result set and jump to a certain spot in the set. To look at this in psuedoperl:
my @list_of_cached_results; if($indatabase){ foreach my $result in ($search_result) { push @list_of_cached_results, result; } my $binary_cache = nfreeze(\@list_of_cached_results); #store lists of binary results } else { @list_of_cached_results = thaw($indatabase); } renderPage(@list_of_cached_results, $offset);
Basically, if you use the perl persistance modules, and a database (such as the one you're already reading from), you can pick up a good amount of speed from picking up a pre-packaged set of results (a match to your search string), and then display it to the page, saving you the hell of finding all the results again. As long as you can manage results and not have them go "stale", then you'll be all set.

Hope this made some sense to you. If I don't seem clear, I'd be happy to clarify anything in here.


In reply to Re: Database searches with persistent results via CGI by JayBonci
in thread Database searches with persistent results via CGI by MrCromeDome

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (5)
As of 2022-05-26 12:13 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (93 votes). Check out past polls.