in reply to Re^4: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)
in thread Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)

The first group to inspect is at position (needle_len - 8): 10111011. That does not appear in the needle, so apply the maximum shift, which is 17 (needle_len - 8 + 1). Were it present in the needle, then the shift would've been (needle_len - 8 - offset), followed by compare.

  • Comment on Re^5: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)

Replies are listed 'Best First'.
Re^6: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)
by BrowserUk (Patriarch) on Apr 05, 2015 at 20:29 UTC

    Did I go wrong some where cos it seems to miss it.

    00000110 needle offset 14 00001010 needle offset 0 00001101 needle offset 15 00010100 needle offset 1 00011010 needle offset 16 00101001 needle offset 2 00101010 needle offset 7 01000001 needle offset 12 01001010 needle offset 5 01010000 needle offset 10 01010010 needle offset 3 01010100 needle offset 8 10000011 needle offset 13 10010101 needle offset 6 10100000 needle offset 11 10100101 needle offset 4 10101000 needle offset 9 The first group to inspect is at position (needle_len - 8): 10111011. That does not appear in the needle, so apply the maximum shift, which +is 17 (needle_len - 8 + 1). Were it present in the needle, then the shift would've been (needle_le +n - 8 - offset), followed by compare. ???????? ???????? ???????? ??? +????? ???????? ???????? ???????? ???????? 0000111010011100101110110111000111101011110011111011111110000000100010 +0100001010010101000001101000110010011000101101010110011011000000 + 000010100101010000011010 Not there; max. shift 17 000010100101010000011010 Not there +17 000010100101010000011010 Not there + +17 0000101001010100000 +11010 Not there +17 00 +0010100101010000011010 In table +12 + 000010100101010000011010 Not there +17 + 000010100101010000011010 Not there +17 + 000010100101010000011010 +Not there +17

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
    code

      In table +12 => Shift by 4 (24-8-12).

        It required one small but important modification to your algorithm; that was a bitch to track down.

        The table calculation: table[ idx ] = needleLen - 8 + needleOffset; can result in a value of 0 (ie. 24 - 8 - 16 = 0 ), which means that the algorithm stalls.

        My current fix is to simply test for and replace any 0 table entries with 1 so it keeps moving forward. (I suspect that there is a better fix, but its not leaping off the page at me; and its time to sleep.)

        The upshot is that a brute force search that takes 9.72 seconds (a 220 bits needle in a 231 bits haystack, found at an offset of 2e9) takes:

        1. 8-bit table: 2.807604e-002 secs.
        2. 12-bit table: 7.613071e-003 secs.
        3. 16-bit table: 4.428185e-003 secs.

        Over 3 orders of magnitude speed up is all together a very successful outcome. Thank you anonymonk.

        Thanks also to salva & oiskuu who's persistent belief that a fast-bit-search was possible, led me to start this thread and get this solution.

        Here is a table of timings; bitstrstr1 is the brute-force; 3 is the 8-bit; 4 is the 16-bit; and 5 the 12-bit.

        The first entry (-1) is a full search for a non-existent needle:

        C:\test\C>bitstrstr "DeBruijn(2^31).bin" 1048576 1 250000000 v:00000000008E0040 fsize:0000000010000004 vsize:2000000 bitstrstr1: 0 : -1 took:1.030802e+001 secs bitstrstr1: 250000000 : 250000000 took:1.274025e+000 secs bitstrstr1: 500000000 : 500000000 took:2.420335e+000 secs bitstrstr1: 750000000 : 750000000 took:3.604043e+000 secs bitstrstr1: 1000000000 : 1000000000 took:4.826112e+000 secs bitstrstr1: 1250000000 : 1250000000 took:6.205683e+000 secs bitstrstr1: 1500000000 : 1500000000 took:7.483219e+000 secs bitstrstr1: 1750000000 : 1750000000 took:8.526418e+000 secs bitstrstr1: 2000000000 : 2000000000 took:9.723123e+000 secs bitstrstr3: 0 : -1 took:1.019202e-001 secs bitstrstr3: 250000000 : 250000000 took:3.729577e-002 secs bitstrstr3: 500000000 : 500000000 took:3.556026e-002 secs bitstrstr3: 750000000 : 750000000 took:1.154790e-002 secs bitstrstr3: 1000000000 : 1000000000 took:1.110663e-002 secs bitstrstr3: 1250000000 : 1250000000 took:1.219347e-002 secs bitstrstr3: 1500000000 : 1500000000 took:1.436291e-002 secs bitstrstr3: 1750000000 : 1750000000 took:2.335553e-002 secs bitstrstr3: 2000000000 : 2000000000 took:2.807604e-002 secs bitstrstr4: 0 : -1 took:8.661798e-003 secs bitstrstr4: 250000000 : 250000000 took:4.236888e-003 secs bitstrstr4: 500000000 : 500000000 took:4.733442e-003 secs bitstrstr4: 750000000 : 750000000 took:4.514161e-003 secs bitstrstr4: 1000000000 : 1000000000 took:4.211377e-003 secs bitstrstr4: 1250000000 : 1250000000 took:4.328529e-003 secs bitstrstr4: 1500000000 : 1500000000 took:4.249651e-003 secs bitstrstr4: 1750000000 : 1750000000 took:4.525668e-003 secs bitstrstr4: 2000000000 : 2000000000 took:4.428185e-003 secs bitstrstr5: 0 : -1 took:2.952949e-002 secs bitstrstr5: 250000000 : 250000000 took:1.516276e-002 secs bitstrstr5: 500000000 : 500000000 took:7.000594e-003 secs bitstrstr5: 750000000 : 750000000 took:6.885374e-003 secs bitstrstr5: 1000000000 : 1000000000 took:6.434349e-003 secs bitstrstr5: 1250000000 : 1250000000 took:6.177053e-003 secs bitstrstr5: 1500000000 : 1500000000 took:6.085954e-003 secs bitstrstr5: 1750000000 : 1750000000 took:8.395146e-003 secs bitstrstr5: 2000000000 : 2000000000 took:7.613071e-003 secs

        The code needs a little clean-up, but I'll post it if anyone is interested?


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
        In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

        You did say that already didn't you. Sorry for having to be spoon-fed.

        The upshot is an algorithm I understand; seems to work; and I can implement. Thank you.

        I have certain fears that it won't yield huge saving, because with a modest needle size of 1024 (my needles can be several million bits), you get 1024 - 8 + 1 8-bit patterns, thus an average of 4 candidates per pattern. With the maximum shift very unlikely, and using the minimum of the (ave.) four possibles for each pattern, the shifts are likely to be very modest I think.

        Which I guess means, increase the size of the table. 1024 - 16 +1 / 65536 = 0.0153961181640625 per pattern. Should yield better results; at the penalty of a larger table and greater competition for cache.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
        In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked