Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Ok. Here's a crude demo in case someone is interested in doing benchmarking comparisons or whatever. The program is pure C, tested on linux 64-bit. Give two arguments: Nkeys and Filename. File should contain sym-value pairs, both unique in their domains. (Space-separated, one pair per line).

Update. Some additional notes regarding the program.

Portability is easily improved upon:
  • Replace mmap/mremap with malloc/realloc. This is trivial.
  • The test for direct vs indirect entries (ss offset) might be written as !(s->o >> 56), avoiding issues with endianness.
There are at least two bugs to fix:
  • Verify sym to val loop has no test for NULL; this will fault on a missing key.
  • The fetch_by_key should not proceed with a memcmp in case keylen > 8, but the slot is direct (ie <= 8).
Many further optimizations to consider:
  • Try to allocate ss strings so that cache-line boundaries aren't crossed.
  • Terminate strings with a high-bit: more compact, plus faster than using memchr.
  • Specific inlined implementations for 8-byte memcmp, etc.
  • Inplace reorder looks up every key twice. Place an (un)processed flag to halve this.
  • Choose a faster hash function for string keys, FNV perhaps.
  • A primitive hash function for values. Just a mul by constant?
There's also a wishlist for the CMPH library, but enough for now.


In reply to Re^2: Bidirectional lookup algorithm? (try perfect hashing) by oiskuu
in thread Bidirectional lookup algorithm? (Updated: further info.) by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2024-04-19 14:30 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found