Your skill will accomplish
what the force of many cannot
[OT] The interesting problem of comparing bit-strings.by BrowserUk (Patriarch)
|on Mar 24, 2015 at 08:08 UTC||Need Help??|
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:
The interesting problem of comparing bit-strings.
Perl's vec and bitwise string operators are very useful and powerful for manipulating (large) bitstrings; but they are limited in several ways.
Namely, there are no left & right shift operators for them; and the bitwise string operators operate with 8-bit boundaries.
One common operation needed for programming with bitstrings is the ability to search one bitstring for another bitstring, looking for matches that occur even at non-byte or integer-sized alignments.
So graphically, using 8-bit units instead of 64 for brevity, the problem is this:
Find (the first occurance of):
The specification of needle requires three values:
Similarly, the haystack requires 3 values:
The function required thus takes six arguments:
In an ideal world (Perl) the function would return the unit number(or address), and the offset into that unit, where the match is found (or 0).
As C doesn't permit multiple return values, I've opted for a single 64-bit value indicating the offset from the start of the first unit of the haystack (or -1?).
There are 128-bit shift instructions available on Intel/AMD 64-bit processors (and probably most others):
which are available on the MS compiler as intrinsics:
and probably in gcc (though I couldn't locate the names).
Using these, you can load two registers with two units of one of the bitstrings and when shifting the first register, bits from the second are shifted in.
Thus to process through the bitstrings 1 bit at a time, you use loops something like:
What have I tried so far?
In answer to the traditional question: You've just pretty much read it!
The function definition and a few local variables:
And the two separate loops above, and little else, despite many hours of though and scribbling on post-its.
I suspect that the right approach might include another subroutine that compares the needle to the haystack at a given, fixed offset; but I'm stuck for how to code that such that it doesn't repeatedly shift one or both by an increasing number of bits each time it is called?
I also thought that it might make sense to copy the needle to an aligned chunk of memory, to eliminate its offset from the problem; but it could be a very large piece of memory, which would be expensive.
Update:I suspect one or even two gotos may be needed; though I'd avoid them if I can.
What am I looking for?
Cluebats; pseudo-code; real-code of any flavour.
Basically anything that will make the penny drop as to how to code this beast?
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked