in reply to Re^2: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)
in thread Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)

Given the needle:

00001010 01010100 00011010
The bad shift table for handling eight bits at a time:
00001010                        needle offset 0
 0001010 0                      needle offset 1
  001010 01                     needle offset 2
   01010 010                    needle offset 3
    1010 0101                   needle offset 4
     010 01010                  needle offset 5
      10 010101                 needle offset 6
       0 0101010                needle offset 7
         01010100               needle offset 8
          1010100 0             needle offset 9
           010100 00            needle offset 10
            10100 000           needle offset 11
             0100 0001          needle offset 12
              100 00011         needle offset 13
               00 000110        needle offset 14
                0 0001101       needle offset 15
                  00011010      needle offset 16
The shift follows from needle offset. If some 8-bit value repeats, the smallest shift is used.

  • Comment on Re^3: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)

Replies are listed 'Best First'.
Re^4: Why Boyer-Moore, Horspool, alpha-skip et.al don't work for bit strings. (And is there an alternative that does?)
by BrowserUk (Patriarch) on Apr 05, 2015 at 18:35 UTC

    Okay? So, we have your table, the original haystack and the needle you used:

    00000110 needle offset 14 00001010 needle offset 0 00001101 needle offset 15 00010100 needle offset 1 00011010 needle offset 16 00101001 needle offset 2 00101010 needle offset 7 01000001 needle offset 12 01001010 needle offset 5 01010000 needle offset 10 01010010 needle offset 3 01010100 needle offset 8 10000011 needle offset 13 10010101 needle offset 6 10100000 needle offset 11 10100101 needle offset 4 10101000 needle offset 9 0000111010011100101110110111000111101011110011111011111110000000100010 +0100001010010101000001101000110010011000101101010110011011000000 000010100101010000011010

    We've compared the needle at the first position of the haystack and it doesn't match. Now what?

    1. Which group of 8 bits from the haystack do we lookup in the table?
    2. And how do we translate the offset at which that pattern was found in the needle; into a number of bits to shift the needle?

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

      The first group to inspect is at position (needle_len - 8): 10111011. That does not appear in the needle, so apply the maximum shift, which is 17 (needle_len - 8 + 1). Were it present in the needle, then the shift would've been (needle_len - 8 - offset), followed by compare.

        Did I go wrong some where cos it seems to miss it.

        00000110 needle offset 14 00001010 needle offset 0 00001101 needle offset 15 00010100 needle offset 1 00011010 needle offset 16 00101001 needle offset 2 00101010 needle offset 7 01000001 needle offset 12 01001010 needle offset 5 01010000 needle offset 10 01010010 needle offset 3 01010100 needle offset 8 10000011 needle offset 13 10010101 needle offset 6 10100000 needle offset 11 10100101 needle offset 4 10101000 needle offset 9 The first group to inspect is at position (needle_len - 8): 10111011. That does not appear in the needle, so apply the maximum shift, which +is 17 (needle_len - 8 + 1). Were it present in the needle, then the shift would've been (needle_le +n - 8 - offset), followed by compare. ???????? ???????? ???????? ??? +????? ???????? ???????? ???????? ???????? 0000111010011100101110110111000111101011110011111011111110000000100010 +0100001010010101000001101000110010011000101101010110011011000000 + 000010100101010000011010 Not there; max. shift 17 000010100101010000011010 Not there +17 000010100101010000011010 Not there + +17 0000101001010100000 +11010 Not there +17 00 +0010100101010000011010 In table +12 + 000010100101010000011010 Not there +17 + 000010100101010000011010 Not there +17 + 000010100101010000011010 +Not there +17

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
        In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
        code