in reply to Re^9: [OT] The interesting problem of comparing (long) bit-strings.
in thread [OT] The interesting problem of comparing bit-strings.

How is your data?

There's no way to say, as it is intended for general use. Some bitsets may be sparse; some very dense.

The current use-case is characterised by substantial chunks of zero data interspersed with longish runs of fairly dense stuff.

B-M can potentially be several orders of magnitude faster than the brute-force approach

You keep claiming that, but for all the reasons I've outlined elsewhere, I do not believe it.

I'd be happy to be proved wrong, because in the end, I just want the fastest search I can code, but I can't even see how you would adapt B-M to bit-string search.

I'm doing my comparisons in 64-bit chunks; but building delta tables with 64-bit indices is obviously not on.

So, you look to doing byte sized compares in order to keep the table sizes reasonable.

BUT:

  1. Doing 8 byte-byte compares instead of a single quad-quad compare costs way more than 8 times as much.

    Not only does it require 8 cmp instructions instead of 1, it also requires 8 counter increments and 8 jumps.

    Even if the compiler unrolled the loop -- which it doesn't -- or I coded it in assembler, which I won't, it would still take substantially more than 8 times longer, because loading a 64-bit register with 8-bit units means the microcode has to shuffle 7 of the 8 bytes into the low 8 bits of the register. And it has to do that for both comparands.

    So, 8 x 8-bit compares versus 1 x 64-bit compare is more than 8 times slower.

    But don't forget that for each n bits, you need to do n comparisons, with one of the comparands shifted 1 bit each time.

    So now the delta between 1 x 64-bit comparison and 8 x (unaligned) 8-bit comparisons becomes 64 x 64-bit comparisons versus 64 x 8-bit comparisons.

    And that's not to mention that the 8-bit values from which bits need to be shifted in will also need to be shuffled by the microcode, adding further overheads.

  2. Instead of 2 tables (4*needle length in bytes each) you'd need 16 tables in order to deal with the bit-aligned nature of the needle.

    For a modest-size 8192-bit needle, you're looking at 16*4*1024 = 64k of table space that needs to be randomly accessed, wiping out my 32k L1 cache in the process.
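For reference, the per-offset comparison being costed above can be sketched as follows. This is an assumed reconstruction of the 64-bit-chunk brute force, not the poster's actual code; `bitstr_find`, `get64` and the padding convention are my inventions. Bit-strings are MSB-first byte arrays, padded with at least 8 extra zero bytes so the 64-bit fetch can over-read safely.

```c
#include <stdint.h>
#include <stddef.h>

/* assemble 8 bytes MSB-first, independent of host byte order */
static uint64_t load64be(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return v;
}

/* 64 bits of s starting at bit offset off (may over-read into padding) */
static uint64_t get64(const uint8_t *s, size_t off) {
    size_t byte = off >> 3;
    unsigned r = (unsigned)(off & 7);
    uint64_t hi = load64be(s + byte);
    return r ? (hi << r) | (s[byte + 8] >> (8 - r)) : hi;
}

/* brute force: try every bit offset, comparing 64 bits at a time;
   returns the bit offset of the match, or (size_t)-1 */
size_t bitstr_find(const uint8_t *hay, size_t hbits,
                   const uint8_t *nd, size_t nbits) {
    for (size_t off = 0; off + nbits <= hbits; off++) {
        size_t done = 0;
        while (done < nbits) {
            unsigned chunk = (nbits - done < 64) ? (unsigned)(nbits - done) : 64;
            uint64_t a = get64(hay, off + done);
            uint64_t b = get64(nd, done);
            if (chunk < 64) { a >>= 64 - chunk; b >>= 64 - chunk; }
            if (a != b) break;      /* mismatch: slide window 1 bit */
            done += chunk;
        }
        if (done == nbits) return off;
    }
    return (size_t)-1;
}
```

Per candidate offset this does at most ceil(nbits/64) 64-bit compares (usually just one, since mismatches show up in the first chunk), which is the baseline any bit-level Boyer-Moore variant has to beat.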

I don't know for sure, because I haven't tried it, because I don't believe it would be beneficial. I'd be happy to be proved wrong, but I don't think I will be.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Re^11: [OT] The interesting problem of comparing (long) bit-strings.
by salva (Canon) on Mar 31, 2015 at 12:53 UTC
    I can't even see how you would adapt B-M to bit-string search.

    That reminds me of another of your questions. The trick is to consider that at every bit a new "byte" is introduced.
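That remark can be made concrete with a small helper; this is my own illustration, not salva's code, and the name `byte_at` is invented. It extracts the 8 bits that begin at an arbitrary bit offset of an MSB-first bit-string -- the "byte" that each bit position introduces. It may read one byte past the data, so the buffer needs a pad byte.

```c
#include <stdint.h>
#include <stddef.h>

/* The "byte" starting at an arbitrary bit offset of an MSB-first
 * bit-string.  Illustrative sketch; reads one byte past the last data
 * byte, so the buffer must carry a pad byte. */
static uint8_t byte_at(const uint8_t *s, size_t bitoff) {
    size_t i = bitoff >> 3;                /* containing byte      */
    unsigned r = (unsigned)(bitoff & 7);   /* bit offset within it */
    uint16_t w = (uint16_t)((s[i] << 8) | s[i + 1]);
    return (uint8_t)(w >> (8 - r));
}
```

For a needle of n bits there are n - 7 such bytes, one per bit offset; a byte-value-indexed skip table can then be built over them.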

    but building delta tables with 64-bit indices is obviously not on

    From here to the end everything you say is mostly wrong. B-M for bit-strings can be implemented using a table of fixed size, that can comfortably fit in the L1 cache (needle size doesn't matter at all).

    Even better, most of the time, all the work can be done on bytes, with very little bit-level fiddling.

    In the worst scenario, the overhead over the brute-force approach would probably be a few machine instructions per haystack bit, on L1-cached data!
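One way to read the fixed-size-table claim is a Horspool-style bad-character search at bit granularity. The sketch below is my own reconstruction of that idea, not salva's implementation; every name in it is invented. The skip table is indexed by the value of the "byte" starting at each bit offset of the needle, so it has 256 entries regardless of needle length. Bit-strings are MSB-first byte arrays with one pad byte after the data, and the needle must be at least 8 bits long.

```c
#include <stdint.h>
#include <stddef.h>

/* the 8 bits of s starting at bit offset off (may read one pad byte) */
static uint8_t byte_at(const uint8_t *s, size_t off) {
    size_t i = off >> 3;
    unsigned r = (unsigned)(off & 7);
    uint16_t w = (uint16_t)((s[i] << 8) | s[i + 1]);
    return (uint8_t)(w >> (8 - r));
}

static int bit_at(const uint8_t *s, size_t i) {
    return (s[i >> 3] >> (7 - (i & 7))) & 1;
}

static int bits_equal(const uint8_t *a, size_t ao,
                      const uint8_t *b, size_t bo, size_t n) {
    for (size_t k = 0; k < n; k++)
        if (bit_at(a, ao + k) != bit_at(b, bo + k)) return 0;
    return 1;
}

/* Horspool-style search at bit granularity: one 256-entry skip table,
   whatever the needle length.  Requires nbits >= 8. */
size_t bm_bit_find(const uint8_t *hay, size_t hbits,
                   const uint8_t *nd, size_t nbits) {
    size_t skip[256], last = nbits - 8;   /* offset of last full "byte" */
    for (int b = 0; b < 256; b++) skip[b] = last + 1;  /* byte absent   */
    for (size_t p = 0; p < last; p++)     /* rightmost occurrence wins  */
        skip[byte_at(nd, p)] = last - p;
    for (size_t off = 0; off + nbits <= hbits; ) {
        uint8_t b = byte_at(hay, off + last);
        if (b == byte_at(nd, last) && bits_equal(hay, off, nd, 0, nbits))
            return off;
        off += skip[b];                   /* always advances >= 1 bit   */
    }
    return (size_t)-1;
}
```

At each step the examined haystack byte either aligns with its rightmost occurrence in the needle or lets the window jump up to nbits - 7 bits; the table occupies a couple of KB, comfortably inside a 32k L1 cache, in contrast to the 64k estimate above.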

      I can continue to explain my reasoning; and you can continue to state your beliefs till we're both blue in the face.

      Blah! Prove it!

        # $file $needle_bit_offset $needle_bit_length $repetitions
        ./bitstrstr test.dat 100000000 2000 10
        needle found at 100000000, expected at 100000000
        in 1.6/10 = 0.16ms

        Update:

        $ for i in 16 20 30 40 60 100 200 400 1000 3000 10000; do echo $i; ./bitstrstr test.dat 100000000 $i 10; done
        16
        needle found at 164016, expected at 100000000
        in 1.1/10 = 0.11ms
        20
        needle found at 949378, expected at 100000000
        in 1.8/10 = 0.18ms
        30
        needle found at 100000000, expected at 100000000
        in 1018.4/10 = 101.84ms
        40
        needle found at 100000000, expected at 100000000
        in 38/10 = 3.8ms
        60
        needle found at 100000000, expected at 100000000
        in 924.1/10 = 92.41ms
        100
        needle found at 100000000, expected at 100000000
        in 12/10 = 1.2ms
        200
        needle found at 100000000, expected at 100000000
        in 6.2/10 = 0.62ms
        400
        needle found at 100000000, expected at 100000000
        in 3.7/10 = 0.37ms
        1000
        needle found at 100000000, expected at 100000000
        in 2.3/10 = 0.23ms
        3000
        needle found at 100000000, expected at 100000000
        in 0.9/10 = 0.09ms
        10000
        needle found at 100000000, expected at 100000000
        in 0.4/10 = 0.04ms
Re^11: [OT] The interesting problem of comparing (long) bit-strings.
by Anonymous Monk on Mar 31, 2015 at 12:58 UTC

    How could anyone ever prove YOU wrong?

    Do you really expect someone to step in and provide you with a well-coded, robust, generic, library-quality implementation of a bitstring search (analogous to strstr)?

    It is time to stop feeding the trolls.

      How could anyone ever prove YOU wrong?

      In truth, YOU probably couldn't; all you've done is snipe from the sidelines. Who's the troll now?

      On the other hand, if anyone can -- if it is possible -- then it'll be salva. He has the proven record of producing working code to solve difficult problems.


      dammit, I thought a proof of concept would be enough!
        I thought a proof of concept would be enough!

        What PoC? Where?

      Do you really expect someone to step in and provide you with a well-coded, robust, generic, library-quality implementation of a bitstring search

      No. Something more than unsubstantiated statements of opinion would be nice though.

      Something -- pseudo-code, a link, a paper (on bit-string search) -- anything more than "I think...therefore it must be so", would be good.

      I've stated what I'm doing; I've posted enough code to show how I'm doing it; I've posted a substantial table of the results.

      I've explained (ad nauseam) why I don't believe Boyer-Moore works for bit-string search; and all I've got in return is opinions. (Apart from oiskuu, who posted code that doesn't appear to work!)

      I really think that until you've tried to implement this, you do not appreciate that extrapolating byte-string search algorithms to bit-string search is fraught with problems that YOU haven't thought about. I have, because I have done it!

