in reply to Re: Locating a specified number of contiguous 0 bits within a large bitstring efficiently.
in thread Locating a specified number of contiguous 0 bits within a large bitstring efficiently.

Oh wait, this is simpler than I thought. We don't even need that last \0+

Fleshing this out a bit more:

# $pre[n] matches any character that ends with exactly n zero bits @pre = map { qr{[@{[chr(1<<(7-$_))]}-@{[chr((1<<(8-$_))-1)]}]} } 0 .. +7; $pre[0] = qr(^|$pre[0]); # $suf[m] matches if we have at least m zero bits following my @suf = ('', #done map { my $m=$_; qr{[@{[join '', map { chr($_<<$m) } 0..((1<<(8-$m))-1)]}]} } 1..7 ); sub zero_bit_regexp { my $N = shift; my $p = join '|', map { qr{$pre[$_]\0{@{[($N-$_)>>3]}}$suf[($N-$_)&0 +x7]} } 0..7; return qr{$p}; }
  • Comment on Re^2: Locating a specified number of contiguous 0 bits within a large bitstring efficiently.
  • Download Code

Replies are listed 'Best First'.
Re^3: Locating a specified number of contiguous 0 bits within a large bitstring efficiently.
by wrog (Friar) on Jun 07, 2013 at 02:26 UTC
    except for the small matter that regexp quoting issues are turning out to be deadly, so better go with this instead
    # $pre[n] matches any character that ends with exactly n zero bits my @pre = map { sprintf "[\\x%02x-\\x%02x]", 1<<(7-$_), (1<<(8-$_))-1 +} 0 .. 7; $pre[0] = "(?:^|$pre[0])"; # $suf[m] matches any character that begins with at least m zero bits my @suf = ('', #m=0 -> match anything map { my $m=$_; '[' . join('', map { sprintf "\\x%02x", $_<<$m } 0..((1<<(8-$m))-1)) . ']' } 1..7); sub zero_bit_regexp { my $N = shift; my $p = join '|', map { "$pre[$_]\\0{@{[($N-$_)>>3]}}$suf[($N-$_)&0x +7]" } 0..7; return qr{$p}; }

      Thank you for all the work you've put into this.

      But whilst this finds a match (if it exists), I'm having the same problem with this as I am with my attempts to implement Dave_the_M's suggestion, that of converting the match to a bit position at which it was found?

      $-[0] tells me the byte position, but then I need to work the byte at than position back to the pattern than matched and then the bit offset within it. Thoughts?


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Put in match groups, i.e., replace
        my $p = join '|', map { "$pre[$_]\\0{@{[($N-$_)>>3]}}$suf[($N-$_)&0x7] +" } 0..7;
        with
        my $p = join '|', map { "$pre[$_](\\0{@{[($N-$_)>>3]}}$suf[($N-$_)&0x7 +])" } 0..7;
        By construction, only one will match and $#- will tell you which one. We place the leftparen so that the group always starts at the first byte that's beginning with a zero bit. Then you just have to subtract the number of zero bits at the end of the byte that $pre[$_] is matching, but that's just $#- - 1. So the answer is
        $-[$#-]*8 - $#- + 1
        Still needs a bit more work to handle the $N < 8 case.