Some inaccuracies:
It's true that a binary search in a 64 bit unsigned integer range is going to take, at worst, 64 comparisons. ... This is O(log(n))

The initial proposal is a binary search on the bits, which (as stated in the original post) takes at most 6 comparisons, not 64. Since a value n has about log(n) bits and the binary search halves that range at each step, it would be O(log(log(n))).

But a linear search through a 64-bit vector to find the most significant bit for an integer, is also O(log(n)); .... And a linear search will be very fast for such a small problem space.

Yes, linear on the number of bits, which is a loop of 64 comparisons, vs. the 6 proposed by OP.
So, slower.
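
To make the 64-versus-6 difference concrete, here is a minimal sketch that counts the comparisons each approach makes. The sub names msb_linear and msb_binary are made up for illustration, and it assumes a perl built with 64-bit integers and $n >= 1; it is not the OP's code.

```perl
use strict;
use warnings;

# Linear scan from the top bit down.
# Returns (bit index, number of comparisons made).
sub msb_linear {
    my ($n) = @_;
    my $count = 0;
    for (my $bit = 63; $bit > 0; $bit--) {
        $count++;
        return ($bit, $count) if $n >= (1 << $bit);
    }
    return (0, $count);    # only bit 0 can be set
}

# Binary search over the bit index 0..63.
# Returns (bit index, number of comparisons made).
sub msb_binary {
    my ($n) = @_;
    my ($min, $max, $count) = (0, 63, 0);
    while ($min < $max) {
        # Round the midpoint up so the range shrinks on every pass.
        my $mid = int(($min + $max + 1) / 2);
        $count++;
        if ($n < (1 << $mid)) { $max = $mid - 1 }
        else                  { $min = $mid }
    }
    return ($min, $count);
}
```

For $n = 1 the linear scan makes 63 comparisons before falling through to bit 0, while the binary search always makes exactly 6.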

Therefore, a solution that requires NO iteration at all could be ... though on paper this solution is O(1)

Hiding a loop inside a function doesn't make it not-a-loop. Since the conversation is about bits, you can't just treat log as constant-time the way you would a 64-bit hardware op. If this were an arbitrary-precision number like Math::BigInt, 'log' is definitely not a constant-time operation.

Re^3: Most Significant Set Bit
by davido (Cardinal) on Mar 22, 2024 at 19:42 UTC

    Good point on the calculation of log not being O(1).

    How do you do a binary search on '0111110100111001011101010000111101000100111011000000000000000011'? I'm not quite sure what is meant by doing a binary search on the bits. What does the comparator look like? When I suggested that the binary search must be on the integer range, it was because I couldn't envision how a binary search would be applied to efficiently discover the first non-zero bit in a bit field directly. I could see it working fairly well on an integer range, though.


    Dave

      I'm not quite sure what is meant by doing a binary search on the bits

      I mean the exact thing that OP used as an example :-) My phrasing "binary search on the bits" might not be the best name for it; maybe "a log-based binary search"?

      Written in a generic manner, it might look like

      my ($min, $max) = (0, 63);
      while ($min < $max) {
          my $mid = int(($min + $max + 1) / 2);  # round up so the range shrinks
          if ($n < (1 << $mid)) { $max = $mid - 1; } else { $min = $mid; }
      }
      # $min is now the index of the top set bit
      but since we know the range is 64 bit, it can be unrolled as
      if ($n < 0x100000000) { if ($n < 0x10000) { if ($n < 0x100) { if ($n < 0x10) { if ($n < 4) { return $n < 2 ? 1 : 2;
      and so on.
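
      For comparison, the same six halving steps can also be written with shifts instead of constants to compare against. This is only a sketch of a different but equivalent technique, not the OP's code: the sub name msb_index is made up, and it assumes $n >= 1 and a perl built with 64-bit integers.

```perl
use strict;
use warnings;
no warnings 'portable';    # allow hex literals wider than 32 bits

# Returns the index (0..63) of the most significant set bit of $n.
sub msb_index {
    my ($n) = @_;
    my $index = 0;
    for my $shift (32, 16, 8, 4, 2, 1) {
        if ($n >> $shift) {       # is anything set in the upper half?
            $n >>= $shift;        # keep only the upper half
            $index += $shift;     # and remember how far we moved
        }
    }
    return $index;
}

# Dave's 64-bit example value, written in hex.
printf "%d\n", msb_index(0x7D39750F44EC0003);    # prints 62
```

      Each pass asks one question, so it is still six tests for a 64-bit value, just rolled back up into a loop.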

      Now, I have to retract my earlier statement about analyzing log() in terms of generic-length bit strings, because this binary search does actually depend on the greater-than and less-than ops being constant-time. In a variable-length bit string, those would also be loops. Still, I think even for fixed-width 64-bit numbers the log() function is probably implemented as a loop, because it has to calculate the result to full floating-point precision, so it should be at least as expensive as floating-point division, which is notoriously slower than the other floating-point operations.