At work we have a module that does some bitmask operations to do things like check user permissions and so forth. For example, an applicaion might have a line like:

if( $user->permits( FOO ) )

Where FOO is an imported constant integer with one bit on, which is bitwise-ANDed with the user's permissions bitfield to determine if he has the appropriate bit. Additionally, you commonly see stuff like this:

if( $user->permits( FOO | BAR | BAZ ) )

Which constructs a new bitmask with three bits on, all of which must be present in the user's permissions bits.

There are lots of permissions, and many applications have to do lots of different permissions checks all over the place. This common method was chosen because it's faster than doing a hash lookup or a method call to determine if something is allowed. Internally, the permits method is implimented quite simply:

package User; ... use constant FOO => 1 << 1; use constant BAR => 1 << 2; use constnat BAZ => 1 << 3; ... sub permits { my ($self, $mask) = @_; return ($self->{perms} & $mask); }

So far so good. But as client needs grew and changed, more and more permissions were added. Eventually, someone tried 1 << 32, and was dismayed when it didn't work. Our Perl only supports 32-bit integers. So someone changed those integers to Math::BigInts, and the problem was solved. Or so they thought.

Math::BigInt is very slow. Even if you use one of the C backends instead of the pure Perl implimentation, it's still painfully phlegmatic. But nobody realized just how much overhead was caused by BigInt until some Apache::DProfing revealed that nearly 20% of the program's execution time was spent in BigInt!

A better solution was demanded. After some consideration, I figured that most of the overhead of a BigInt implimentation is in dealing with integers of unlimited size, and munging integers of different sizes together. But all of our permissions could be represented as giant integers exactly 128 bits wide. The only operations they need to do are bitwise-AND and bitwise-OR, both of which are simple. I figured I could use some overloading to impliment them as arrays of four 32 bit integers each. Here's what I came up with as a test.

package FriedoBits; use strict; use overload '|' => \&bitor; use overload '&' => \&bitand; sub bitor { my ($a, $b) = @_; my $new = [ ]; for(0..3) { $new->[$_] = ( $a->[$_] | $b->[$_] ); } return __PACKAGE__->new( $new ); } sub bitand { my ($a, $b) = @_; my $new = [ ]; for(0..3) { $new->[$_] = ( $a->[$_] & $b->[$_] ); } return __PACKAGE__->new( $new ); } sub new { return bless $_[1], ref($_[0]) || $_[0]; } use constant PERM_0000 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000001 ] ); use constant PERM_0001 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000002 ] ); use constant PERM_0002 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000004 ] ); use constant PERM_0003 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000008 ] ); use constant PERM_0004 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000010 ] ); use constant PERM_0005 => FriedoBits->new( [ 0x00000000, 0x00000000, 0x00000000, 0x00000020 ] ); ... use constant PERM_0125 => FriedoBits->new( [ 0x20000000, 0x00000000, 0x00000000, 0x00000000 ] ); use constant PERM_0126 => FriedoBits->new( [ 0x40000000, 0x00000000, 0x00000000, 0x00000000 ] ); use constant PERM_0127 => FriedoBits->new( [ 0x80000000, 0x00000000, 0x00000000, 0x00000000 ] ); 1;

Benchmarking showed just how much better this was:

Benchmark: timing 100000 iterations of BigInt... BigInt: 188 wallclock secs (168.07 usr + 0.36 sys = 168.43 CPU) @ 593.72/s (n=100000) Benchmark: timing 100000 iterations of BigIntGMP... BigIntGMP: 44 wallclock secs (38.50 usr + 0.13 sys = 38.63 CPU) @ 2588.66/s (n=100000) Benchmark: timing 100000 iterations of FriedoBits... FriedoBits: 3 wallclock secs ( 3.11 usr + 0.01 sys = 3.12 CPU) @ 32 +051.28/s (n=100000)

That's nearly 100 times faster than the pure Perl version, and more than 10 times faster than the C version. Of course these are not really integers. You can't add them or subtract them or find their square roots. But for our purposes we don't need to, so why waste the time? Coming up with ideas like this is one of the reasons I love programming. (I'm quite certain it's a solution which has been invented many, many times before now. But I will bask in the glory anyway.)

Replies are listed 'Best First'.
Re: Freed From the Tyranny of Math::BigInt
by merlyn (Sage) on Feb 19, 2005 at 23:09 UTC
    And how much even simpler it would have been to define:
    use constant BIT_0 => "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x01"; use constant BIT_1 => "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x02"; use constant BIT_2 => "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x04"; ... use constant BIT_127 => "\x80\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"; ... my_subroutine(BIT_5 | BIT_8);
    and it just all works without requiring a new type or overloads. {grin}

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Freed From the Tyranny of Math::BigInt
by Aristotle (Chancellor) on Feb 20, 2005 at 07:22 UTC

    That's quite a lot of superfluous effort. Integers are 32-bit only, but strings can have as many bits as you want. Use vec.

    use constant FOO => 0; use constant BAR => 1; use constant BAZ => 2; use constant QUUX => 452; sub permits { my $self = shift; vec( $self->{perms}, $_, 1 ) or return for @_; return 1; } # and later if( $user->permits( FOO, BAR, BAZ ) ) { # ... }

    Makeshifts last the longest.

Re: Freed From the Tyranny of Math::BigInt
by ysth (Canon) on Feb 20, 2005 at 14:46 UTC
    As merlyn points out, perl has string versions of the bitwise operators. permits becomes slightly different, since you need to test for non-null bytes:
    sub permits { my ($self, $mask) = @_; return (($self->{perms} & $mask) =~ y/\0//c); }
    To retain the feature of having permits return not just a true or false value, but a usable bitmask of those permissions that were satisfied (untested):
    sub permits { my ($self, $mask) = @_; $mask &= $self->{perms}; return ($mask =~ y/\0//c ? $mask : ""); }
Re: Freed From the Tyranny of Math::BigInt
by dragonchild (Archbishop) on Feb 20, 2005 at 23:54 UTC
    This common method was chosen because it's faster than doing a hash lookup or a method call to determine if something is allowed.

    This thread should be retitled Why you shouldn't optimize prematurely. Did anyone ever look to see how much faster bitwise operations are over a hash lookup? And, frankly, you're doing a method call every time to call your permits() function. (A function is a method and the only overhead a method has is on the first you call it after @ISA is modified.)

    If you had started with a hash or array, you would have scaled just fine. And, had you used vec (which is the Perlish way to do bitmasks, as merlyn pointed out), you would have scaled infinitely and had the speed benefits of bitmasking.

    I'm not chastising you; I'm attempting to make sure others don't fall into the same trap you did.

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      I really don't know what was tried initially, as this stuff was written long, long before I started working on it.

      [id://merlyn]++'s suggestion is clearly much better than my idea (who knew you could do bitops on strings?! Not me. :) ). Oh, well. At least I got to mess around with overloading.

Re: Freed From the Tyranny of Math::BigInt
by Anonymous Monk on Feb 21, 2005 at 12:11 UTC
    But all of our permissions could be represented as giant integers exactly 128 bits wide. The only operations they need to do are bitwise-AND and bitwise-OR, both of which are simple.
    Did you ever consider using vec?