As you are the third respondent that has suggested bit-masks of one form or another, either:

  1. My question/description is unclear.
  2. Or I'm missing something fundamental.

Both distinct possibilities.

As I see it, my problem isn't how to isolate the bits of the strings that I need to compare -- __shiftleft128() does that easily and efficiently.

The problem is one of how to construct/combine the nested loops in order to perform the iteration of the bits and quads in the proper way.

Looking up a basic (ie. not the gcc version which incredibly complicated) strstr() function I found this:

char *strstr( char *haystack, char *needle ) { char *hp; char *np; if( !*needle ) return haystack; // null needle; retur +n haystack (strange POSIX specification requirement!) while( *haystack ) { // until we reach the + end of the haystack hp = haystack; // copy of the haysta +ck pointer np = needle; // copy of the needle + pointer do { if( !*np ) return haystack; // end of needle; mat +ch completed; return current value of haystack pointer } while( *hp++ == *np++ ); // increment both poi +nter (copies) while the chars match ++haystack; // Got here means a m +ismatch; base pointer to next character. } return 0; // Got here means we +hit the terminating null of the haystack; no match; return null. }

If my bitstrings where always 64-bit aligned, I could just substitute U64 for char everywhere, and that would be that; but of course it isn't.

Essentially, there are five operations I need to encapsulate:

  1. Incrementing haystack by 1-bit.
  2. Incrementing haystack by 64-bits (1 quad + offset).
  3. Incrementing needle by 64-bits (1 quad + offset).
  4. Detecting EOS of the haystack.
  5. Detecting EOS of the needle.

For 2 & 3 I thought to write a function:

U64 nextQuad( U64 **p, U8 o ) { // p: pointer to poin +ter to the 'current' quad; o: value of the unused bits at the beginni +ng. U64 hi = *( ++*p ) // increment the poin +ter at *p, and set hi to the value it points at. , lo = *( *p+1 ); // and lo to the valu +e + 1 it points at return __shiftleft128( lo, hi, o ); // return the value f +rom the higher location with o bits from lo shifted in from the right +. }

I think that would work, but for efficiency, you'd want to keep hi and lo in registers between calls, and you can't take the address of regiters, so I'd be relying on the compiler to recognise the importance of doing so.

For one, things get more complicated. Now I need a pointer to the offset as well:

U64 nextBit( U64 **p, U8 *o ) { // p: pointer to poin +ter to current quad; o: pointer to current offset U64 hi, lo; if( ++( *o ) == 64 ) { // increment the offs +et and detect transitions across quad boundaries. *o = 0; // reset the offset. ++*p; // increment the quad + pointer. } hi = **p; // Get the 'current' + (unshifted) quad value lo = *( *p + 1 ); // And the next return __shiftleft128( lo, hi, *o ); // return the value +from the higher location with *o bits from lo shifted in from the rig +ht. }

For 4 & 5, I need to calculate whether I've reached the end from the current values of the quad pointer & associated offset.

Putting it all together and getting it to compile is defeating me at the moment.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

In reply to Re^4: [OT] The interesting problem of comparing bit-strings. (More info.) by BrowserUk
in thread [OT] The interesting problem of comparing bit-strings. by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.