Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

I-F the entire data structure can fit into memory, the solution to this task is obvious.   (Although, if there is known to be “no paging,” even a simple search of an array just might be good enough.)   You can “debate differences of a few microseconds” any way you wish.   We already know that Perl’s data structure implementations are fast.

Really, it’s all going to come down to ... the production environment.   Will this program, on that machine, and surrounded by all of its other active processes, be able to operate continuously with all of its virtual-memory pages in RAM?   Or, not?

And you’d better sincerely hope that the answer is always “yes,” because if not, a program designed in this way could stagger like a drunken elephant.   First of all, it’s going to have a large working-set size.   Then, when it does start having pages stolen from it, it’s going to continue making r-a-n-d-o-m memory accesses throughout its l-a-r-g-e virtual memory space.   The probability of such a process experiencing a page fault, or many page faults, is quite high ... and every page-fault that causes disk access will incur a wait of several milliseconds.   Milliseconds add up very fast, to minutes and hours.   “Big working-set” and “pure random” do not play well together in production, and tests run on unconstrained, memory-plentious developer machines will only mislead you.   (You find out when you deploy.)

These are all external influences upon the program’s behavior which have nothing to do with Perl, nor with any programming language that the algorithm might be written in.   The vulnerability lies in the algorithm, itself.   You need to reduce the exposure to page faults, or make each page fault count for more.   Even if a database is used, you want to find ways to make a single search count for more than just the satisfaction of a single request.   You might for instance find it advantageous to load a bunch of search-keys into a temporary table then do a single SQL query with a JOIN.   Iterate over the resulting cursor.   Old-hat stuff.   Slower.   Yet ... graceful, predictable degradation.

In reply to Re: Bidirectional lookup algorithm? (Updated: further info.) by sundialsvc4
in thread Bidirectional lookup algorithm? (Updated: further info.) by BrowserUk

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (2)
As of 2022-05-22 03:34 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (78 votes). Check out past polls.