sir p freakly has asked for the wisdom of the Perl Monks concerning the following question:

My book gives a vague description of what this is, but I couldn't comprehend exactly what it's used for and how it works. For instance, my book says if you AND together "123.45" & "234.56" you get "020.44". How this happens and what can be done w/ it is beyond me. If someone could help out a Perl newbie it'd be appreciated.

Re: bitwise operators
by dvergin (Monsignor) on May 28, 2001 at 05:23 UTC
    This is dark art and not likely to be of use to you any time soon. I've never used it but I actually do have a usage example at the end of this post.

    First, in order to make the answer to your question more clear, let's look at the question you did NOT ask. As the docs state, there is a difference between 123.45 & 234.56 and "123.45" & "234.56". In the first case, the two numbers will be truncated to integers and then the two values 123 and 234 will (in effect) be taken in their binary representation and ANDed. (Doing these in my head as I go -- I may slip a bit or two.) Thus:

    123 decimal = 01111011 binary
    234 decimal = 11101010 binary
                   ^^ ^ ^   <-- ANDing to find overlap
    106 decimal = 01101010 result

    So the result of evaluating 123.45 & 234.56 can be thought of as depending on the binary representation of (the integer portion of) each number taken as a whole.
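    As a quick check of the arithmetic above, here is a minimal sketch (the variable name $result is just for illustration) that lets Perl do the numeric AND and prints the three values in binary:

```perl
use strict;
use warnings;

# Numeric operands: both are truncated to integers (123 and 234) before the AND.
my $result = 123.45 & 234.56;

printf "%08b  (%d)\n", 123, 123;
printf "%08b  (%d)\n", 234, 234;
printf "%08b  (%d)\n", $result, $result;   # 01101010  (106)
```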

    So far so good...

    In the case of "123.45" & "234.56", the two values are evaluated character-by-character according to the ASCII values of each character. So, while the previous case only works with numbers, in this case you could just as easily say "ABCD" & "gHiJ".

    I won't reproduce the entire ASCII table, but here are the values for the numerical digit characters. And while we're at it, let's show their binary representations. We will need them in a minute.

    '0' = char 48 = 110000 binary
    '1' = char 49 = 110001 binary
    '2' = char 50 = 110010 binary
    '3' = char 51 = 110011 binary
    '4' = char 52 = 110100 binary
    '5' = char 53 = 110101 binary
    '6' = char 54 = 110110 binary
    '7' = char 55 = 110111 binary
    '8' = char 56 = 111000 binary
    '9' = char 57 = 111001 binary

    Note in each case we are talking about e.g. the character 7. Not the value seven (as in seven things).

    So... when Perl sees "123.45" & "234.56", it takes each string character-by-character comparing (so to speak) their binary representations. So, for the first character of each string:

    '1' = char 49 = 110001 binary
    '2' = char 50 = 110010 binary
                    ^^       <-- ANDing to find overlap
    '0' = char 48 = 110000 binary

    For the second character of each string:

    '2' = char 50 = 110010 binary
    '3' = char 51 = 110011 binary
                    ^^  ^    <-- ANDing to find overlap
    '2' = char 50 = 110010 binary

    And so on until we get "020.44".
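    The character-by-character process can be sketched like this. The variable names are made up for the example, and it assumes an ASCII platform and that the newer "bitwise" feature (which makes & always numeric) is not enabled:

```perl
use strict;
use warnings;

my ($x, $y) = ("123.45", "234.56");

# Both operands are strings, so & works on the character codes, position by position.
my $anded = $x & $y;
print "$anded\n";      # 020.44

# The same thing spelled out one character at a time.
my $by_hand = join '', map {
    chr( ord(substr($x, $_, 1)) & ord(substr($y, $_, 1)) )
} 0 .. length($x) - 1;
print "$by_hand\n";    # 020.44
```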

    Now the folks who designed the ASCII table did a clever thing. They set up the table so that the bottom four bits of the number characters were also the binary values for each digit. So for the character '5':

    '5' = char 53 = 110101 binary ^^^^ <--Note these bits

    and for the value five (as in five things):

    5 = 0101 binary

    so when we start ANDing and ORing these they act as if we were ANDing and ORing each digit's value (and not simply its ASCII assignment) digit-by-digit. If we do bitwise comparisons of strings of digits, we can (within certain limits) pretend we are ANDing or ORing the values of each digit and not just their ASCII assignments. Cute! But watch out. If you say "1234" & "39", Perl will start with the left-most character in each string. It will not compare the tens digit against the tens digit.
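    Here is a small sketch of both the trick and its limits (variable names are illustrative only). It pads the shorter number with zeros so the digits line up, shows what happens without padding, and shows an OR that falls outside the digit characters:

```perl
use strict;
use warnings;

# Pad the shorter number with leading zeros so digits line up by place value.
my $padded  = sprintf "%04d", 39;     # "0039"
my $aligned = "1234" & $padded;       # digit-by-digit AND: "0030"

# Without padding, Perl pairs characters from the left,
# and & truncates the result to the length of the shorter string.
my $misaligned = "1234" & "39";       # '1'&'3', '2'&'9' gives "10"

# One of the "certain limits": OR can produce a character that is not a digit.
my $not_a_digit = "5" | "8";          # 110101 | 111000 = 111101, which is '='

print "$aligned $misaligned $not_a_digit\n";
```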

    So what good is all this? Danged if I know!... Actually I can think of a few uses. Here's one example. First, some more ASCII values:

    'A' = char 65 = 01000001 binary
    'B' = char 66 = 01000010 binary
    'a' = char 97 = 01100001 binary
    'b' = char 98 = 01100010 binary

    Note that 'A' and 'a' differ by 32. And as chance would have it, ASCII character 32 (which is the space character: ' ') has a binary representation with just one single '1' bit set. This means that it makes a useful binary mask (those clever ASCII people again). And the complement of character 32 (for our purposes) is character 95 which is the underscore. So here are a pair of clumsy upper- and lower-case translators:

    print 'AbCdEfGh' | '        ', "\n";
    print 'aBcDeFgH' & '________', "\n";
    Which prints:
    abcdefgh
    ABCDEFGH
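    A slightly more general sketch of the same trick builds the mask to match the string's length, since a mask that is too short only affects the leading characters. The names are illustrative, and it only makes sense for plain ASCII letters (digits and punctuation would be mangled):

```perl
use strict;
use warnings;

my $word = 'AbCdEfGh';

# Build masks as long as the string itself.
my $lower   = $word | (' ' x length($word));   # set bit 32:    abcdefgh
my $upper   = $word & ('_' x length($word));   # clear bit 32:  ABCDEFGH
my $flipped = $word ^ (' ' x length($word));   # toggle bit 32: aBcDeFgH

# A one-character mask only changes the first character, because |
# treats the shorter string as padded with zero bits.
my $short_mask = $word | ' ';                  # abCdEfGh

print "$lower $upper $flipped $short_mask\n";
```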

    And that, I suppose, is quite enough about bitwise comparison of strings for now... HTH

    Update: Many of the responses to this thread play around with the upper/lower-case trick.

Re: bitwise operators
by wog (Curate) on May 28, 2001 at 04:24 UTC
    That happens because you're actually AND'ing together the bit representations of those strings. Without the quotes, perl would be AND'ing the bit representations of those numbers, as integers.

    In this case, under the ASCII charset the strings "123.45" and "234.56" are represented by the bits:

    1        2        3        .        4        5
    00110001 00110010 00110011 00101110 00110100 00110101
    2        3        4        .        5        6
    00110010 00110011 00110100 00101110 00110101 00110110
    
    Thus when AND'ing them together you keep all bits which are 1 in both things as 1, and everything else becomes 0 resulting in the string "020.44" represented in ASCII with:
    0        2        0        .        4        4
    00110000 00110010 00110000 00101110 00110100 00110100
    
    All this is probably not portable to systems with odd charsets like EBCDIC.

    Now, if we were to AND together numbers (not strings) like, say, 123 and 234, we'll get a completely different result. For example, the numbers 123 and 234 are represented by:

    123
    01111011
    234
    11101010
    
    Note that this is different than AND'ing together strings, since we're working with numbers, not characters. Thus the result is 106, represented by:
    01101010
    

    Another thing to note is that if you try to have perl AND together non-integers, like doing 123.45 & 234.56, perl appears to treat 123.45 and 234.56 as if we used int(123.45) and int(234.56). This avoids the odd situations you'd get if you tried to AND floating point numbers together, because they're implemented in a way that varies greatly from machine to machine.

    All bit strings here were generated with unpack "B*", ..., so they have the most significant bit first.
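    For anyone who wants to reproduce bit strings like these, a minimal sketch follows. The grouping template "(B8)*" and the pack("C", ...) step for the number are my choices for readable output, not necessarily what was used above:

```perl
use strict;
use warnings;

my $text = "123.45";

# One group of 8 bits per character, most significant bit first.
my $string_bits = join ' ', unpack "(B8)*", $text;

# For a small number, pack it into a single byte first, then unpack its bits.
my $number_bits = unpack "B*", pack("C", 123);

print "$string_bits\n";
print "$number_bits\n";
```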

Re: bitwise operators
by srawls (Friar) on May 28, 2001 at 03:55 UTC
    First, let's take an example:
    10011
    01011
    These are the binary representation of two numbers. Now, we go through bit by bit and we take the result by these rules:
  • if both bits are set (equal to 1) then the result bit is 1
  • otherwise the result bit is 0

    so, the above is this:

    10011 #from above
    01011 #from above
    00011 #result
    There are also the operators OR, XOR, and more. With OR, the result bit is 1 if either bit is set; with XOR, the result bit is 1 if one but not both of the bits are set.
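    To see all three rules applied to the same pair of numbers, here is a short sketch (the names $p and $q are arbitrary):

```perl
use strict;
use warnings;

my ($p, $q) = (0b10011, 0b01011);

my $and = $p & $q;   # 1 only where both bits are 1
my $or  = $p | $q;   # 1 where either bit is 1
my $xor = $p ^ $q;   # 1 where exactly one bit is 1

printf "AND %05b\n", $and;   # 00011
printf "OR  %05b\n", $or;    # 11011
printf "XOR %05b\n", $xor;   # 11000
```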

    The 15 year old, freshman programmer,
    Stephen Rawls
Re: bitwise operators
by mr.nick (Chaplain) on May 28, 2001 at 04:16 UTC
Re: bitwise operators
by tigervamp (Friar) on May 29, 2001 at 07:53 UTC
    I think that adequate explanation has been given to your question about how bitwise operations work, so I won't talk about that. You did ask, though, how they could be useful, and there are many different areas in which bitwise operations are important, including cryptography, internal database management, graphical rendering engines, and algorithms. I do not want to bore/overwhelm you with complex examples, but here are a couple of general ideas behind their use.

    1. Cryptography
    This is a very simple example, but still very powerful.

    Bitwise Operators can play a very important role in cryptographic/masking algorithms. For example, say that you had a piece of binary data, for simplicity, these 15 bytes:

    This is a test.

    You also have a 15 byte "mask", a string of random bytes, the same length as the string to be masked. Again, for simplicity, let's say the (obviously not very random) mask is:

    abcdefghijklmno

    To encode the original data, you would apply the mask "abcdefghijklmno" to the data "This is a test." You would do this by using the XOR bitwise operation.

    THE XOR OPERATOR

    The XOR operator works on bits in the same ways as explained in previous responses. The XOR test is true(1) if the two bits that are being compared are different(1 and 0), or false(0) if they are the same. Therefore, performing the XOR operation on the following 2 bytes would work as shown:

    10010111
    01011100
    -----------
    11001011

    Now, let's look at the last 2 bytes in the above example and XOR them:

    01011100
    11001011
    -----------
    10010111

    You get the original piece of data back!
    Getting back to our 15 byte example, if you XOR "This is a test." with "abcdefghijklmno", you will get a 15 byte piece of "masked data" that is hopelessly encoded and cannot be decoded by anyone without our incredibly secure and hard to guess mask. This new masked data (which is not shown here; it turns out to be ugly, hard to display binary) can be reverted to the original by XORing it with the mask to obtain the data, or with the data to obtain the mask. With a mask that is truly random, as long as the data, and never reused (a one-time pad), this is probably the most secure way to, well, secure data: because the mask is random, the masked data can just as easily be reverted to the string "No test here..." using another mask.
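    The whole round trip can be sketched in a few lines, using Perl's string XOR on the data and mask from this post. The variable names are mine, and the mask here is of course not random, so this is an illustration rather than real encryption:

```perl
use strict;
use warnings;

my $data = "This is a test.";
my $mask = "abcdefghijklmno";    # same length as the data (15 bytes)

my $masked         = $data ^ $mask;     # encode
my $restored       = $masked ^ $mask;   # masked XOR mask gives the data back
my $recovered_mask = $masked ^ $data;   # masked XOR data gives the mask back

# A different mask turns the same masked bytes into a different message.
my $other_mask = $masked ^ "No test here...";
my $decoy      = $masked ^ $other_mask;

print unpack("H*", $masked), "\n";      # hex dump, since the raw bytes are unprintable
print "$restored\n$decoy\n";
```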

    2. Internal data structures
    Bitwise operators can be used to manipulate data stored in bit fields very efficiently, which is useful in many areas including video games and 3D vector engines. They are also important in advanced mathematical algorithms. Instead of providing more drawn-out examples, I will provide you with the following sources, if you are still interested.

    Mastering Algorithms with Perl -- Explains the use of bit vectors in mathematical algorithms and Encrypting data using the XOR method I discussed.

    http://www.cs.cf.ac.uk/Dave/PERL/node36.html -- Discusses bitwise operators including bitwise shifts

    Programming Perl -- General information about bitwise operations, probably the book you are already reading

    CPAN also has some interesting/useful modules including Tie::VecArray and Bit::Vector.
    There is a site with some really good information on this topic, I can't find it at the moment, I will update this node with the information when I find it though.

    Hope this helps,
    tigervamp

      Wow these responses are wonderful. Makes me want to dive in..

      But to keep it simple, I always remember
      "AND and OR, multiply and plus."

      That's because with single-digit binary numbers, AND gives you the same result as multiplication, and OR behaves like addition (except that 1 OR 1 stays 1 rather than carrying). (A truth table, which shows the grid of all possible pairs and the result you get for a given operation, is helpful.)

      Understanding binary is good in Perl when you want to check the return value of a system function, since it may look like 128 for example but you know it is just the 8th bit that is set to 1. You can shift that bit left or right to simulate multiplication or division by two.
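      As a rough sketch of picking a status value apart, the snippet below decodes a made-up number of the kind Perl leaves in $? after system(). The value 256 is hypothetical; the shift and masks follow the layout documented for $? in perlvar:

```perl
use strict;
use warnings;

my $status = 256;   # hypothetical value of $? after a system() call

my $exit_code   = $status >> 8;             # high byte: the program's exit code
my $signal      = $status & 127;            # low 7 bits: signal number, if any
my $core_dumped = ($status & 128) ? 1 : 0;  # the 8th bit (value 128)

# Shifting a single set bit doubles or halves the value.
my $doubled = 128 << 1;   # 256
my $halved  = 128 >> 1;   # 64

print "exit=$exit_code signal=$signal core=$core_dumped\n";
```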

      Binary logic is also the fundamental idea behind the operators "and", "or", "|", "||", "&", and "&&". If you ever use all capitals constants for system control you are probably using binary without realizing it. You also need to know what you are doing here when you tie a database or use other routines that let you hand them a number of boolean (single bit) flags.

      Also, if you are doing Perl on Windows or talking to a C program, you may very well need to set "flags" or use a "bit mask". For example, say you have a window drawing function and it treats each bit of an argument as a display option. Perhaps the second bit means "the window has beveled edges" and the fourth bit means "vertical scrollbar should be displayed". Then you would want to set both bits to specify both options; in other words, take a string of 8 zeroes and set the second and fourth bits to ones (usually we count right to left, so the first bit is on the right). In base two (binary) you get 00001010, or 2^1 + 2^3 = 2+8 = 10 in base 10.

      Now if you got the current setting and it was 8 (scrollbar displayed) and you wanted to add beveled edges, you could combine (binary OR) that value (8, or in binary 00001000) with your bit mask (10, or in binary 00001010). Since OR is like addition, it makes sense that every digit set to one in your mask will become set. Likewise, if you had used AND, every bit set to zero in your mask would be cleared, because ANDing anything with 0 is like multiplying by zero.

      Usually these display options are given all capitals names like V_SCROLLBAR and BEVELED_EDGE. Then if you have a function DrawWindow that takes three bytes (x, y, and type) as arguments, you can draw a window at coordinates x and y with something like DrawWindow(x,y,V_SCROLLBAR|BEVELED_EDGE).
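      A minimal sketch of that idea follows. DrawWindow here is a stand-in I wrote that just returns a description, and the flag values (2 and 8) are taken from the example bit positions above, not from any real windowing API:

```perl
use strict;
use warnings;

# Hypothetical flag values matching the second and fourth bits described above.
use constant {
    BEVELED_EDGE => 0b00000010,
    V_SCROLLBAR  => 0b00001000,
};

# Stand-in for a real drawing routine: it only reports what it was asked to do.
sub DrawWindow {
    my ($x, $y, $type) = @_;
    return sprintf "window at (%d,%d), type %08b", $x, $y, $type;
}

my $type = V_SCROLLBAR;                           # current setting: 8
$type |= BEVELED_EDGE;                            # add an option: now 10
my $has_scroll = ($type & V_SCROLLBAR) ? 1 : 0;   # test an option
my $no_bevel   = $type & ~BEVELED_EDGE;           # clear an option: back to 8

print DrawWindow(5, 7, V_SCROLLBAR | BEVELED_EDGE), "\n";
```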

      So the more bits you have, the more options you can set. This is really the front end for the C or C++ code in which the function was originally written, and the byte size of each argument is predefined. This means that you can only specify up to eight window options at once if type is only one byte long. So you may see a function called something like DrawWindowEx, which might allow many more settings because it takes a 32 bit (4 byte) argument.

      Another example is this snippet from DB_File.pm:

      sub LOCK_SH { 1 }
      sub LOCK_EX { 2 }
      sub LOCK_NB { 4 }
      sub LOCK_UN { 8 }

      ...

      unless (flock (DB_FH, LOCK_EX | LOCK_NB)) { .... }

      That means a file lock on the file handle DB_FH is being requested and that the lock should be both exclusive and nonblocking. Unless flock returns a number greater than 0, the .... code is executed.
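      A runnable sketch of the same call is below. It takes the constants from the standard Fcntl module instead of defining them by hand, and locks a throwaway temporary file from File::Temp; both choices are mine, for the sake of a self-contained example:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);

# A scratch file to lock; it is removed when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);

# Ask for an exclusive lock, but fail immediately rather than wait.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB);

if ($got_lock) {
    print "locked $path\n";
    flock($fh, LOCK_UN);       # release the lock again
} else {
    print "could not lock: $!\n";
}
```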


      So now you can read this too:

      $db = tie(%db, 'DB_File', '/tmp/foo.db', O_CREAT|O_RDWR, 0644)
      || die "dbcreat /tmp/foo.db $!";

      You see someone (Paul Marquess, writer of that module) was trying to tie a hash variable (db) to a file (foo.db) using the DB_File module and opening the database file with file creation and Read/Write permission.

      In the last line of code above, you see bitwise OR (single pipemark) being used to add two flags, and you also see logical OR (double pipe, synonymous with "or") which will cause the "die" code to be executed if Perl decides it needs to look at it to evaluate the entire line. It is neat that "die" is not looked at if what comes before the double pipe is true, since in bitwise addition you already know the answer if you can tell the first value of your pair of binary digits is true (i.e. set to one). So binary logic is pretty important in Perl!
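      The difference between the single and double pipe can be seen in a small sketch. The noisy() helper is hypothetical and exists only to show whether the right-hand side was evaluated:

```perl
use strict;
use warnings;

my ($one, $two) = (1, 2);

my $bitwise = $one | $two;    # combines the bits: 01 | 10 = 11, i.e. 3
my $logical = $one || $two;   # returns the first true operand: 1

# Hypothetical helper that counts how often it gets called.
my $calls = 0;
sub noisy { $calls++; return 1 }

my $skipped = $one || noisy();   # left side is true, so noisy() is never called
my $needed  = 0    || noisy();   # left side is false, so noisy() runs once

print "bitwise=$bitwise logical=$logical calls=$calls\n";
```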

      By the way if you are curious what kind of constants are out there, you can check out Fcntl.pm, or (in unix) try "man fcntl" to see what the values are on your system. A Windows programmer's guide will also be full to the brim with these all caps flags.

Re: bitwise operators
by sir p freakly (Initiate) on May 28, 2001 at 08:25 UTC
    All your responses helped a lot. I understand the procedure, but still don't know what to do w/ it. Like someone said, I won't need it in the immediate future and they're probably right. Anyway, thanx for your help.
      Bitwise operators are a lot more useful when doing system level programming in Assembler, C, etc.

      Around the early 90's, when I was writing 3D texture mapped engines designed for 386/386sx+ systems (16 Megahertz!), I used them constantly. You can use bitshifts and bitlogic to do quick (and dirty) multiplies/divides by powers of 2, cheap modulus operations, collision detection, etc. Under those constraints, every clock cycle counts. At that time, you had to roll your own video drivers, which meant screwing around with SVGA registers (all bit-masking) and all kinds of other arcane things.
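      For the curious, the power-of-two tricks look like this in Perl (a sketch with an arbitrary non-negative value; in Perl itself there is little speed to be gained this way):

```perl
use strict;
use warnings;

my $n = 37;

my $times_8 = $n << 3;   # shift left by 3  = multiply by 2**3: 296
my $div_4   = $n >> 2;   # shift right by 2 = integer divide by 2**2: 9
my $mod_8   = $n & 7;    # mask the low 3 bits = remainder modulo 8: 5

print "$times_8 $div_4 $mod_8\n";
```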

      That said, if you are doing system level programming you will probably use them quite often.

      To be honest, I haven't ever had the need (yet) to use them in any Perl applications.



      -Lee

      "To be civilized is to deny one's nature."
        Bit fields, a component that goes along with bitwise operators, are an important part of C (and C++). You can use them to save space:

            struct { UINT flag_a : 1; UINT flag_b : 1; } struct_name;

        (where UINT is defined to be unsigned int). This saves memory by combining many variables (flag_a and flag_b, in this case) together; we can do this because we only concern ourselves with a true/false value (1/0, respectively).

        In Perl (as far as I know), you cannot do this automatically, but you can still make a scalar and use it to turn certain values on and off, and to check them. For example, assume an eight-bit integer (well, getting back to Perl, a scalar):

            00000011

        Here, we have the last two bits set. If we want to know whether the second-to-last bit (i.e., the 2nd rightmost) is set, we can say:

            if (($scalar_here >> $number_to_shift_by) & $mask)

        In this case, $number_to_shift_by would be 1, and $mask would be a value with all bits off except the rightmost, i.e., 00000001.

        To set the bit, we would say $scalar |= $mask; if we want to set the second rightmost bit, $mask should be a value with all bits off except the second rightmost, e.g., 00000010. We use OR so that if the bit is already on it stays on. If instead we want to toggle it (turn it on, unless it is already on, in which case turn it off), we substitute ^= for |=.

        Another use for bitwise operations is to swap two values without using a temporary variable, e.g.,
        # var1 = 0011, var2 = 1100 (only stating the last
        # 4 bits, of course)
        $var1 ^= $var2; # v1=1111
        $var2 ^= $var1; # v2=0011
        $var1 ^= $var2; # v1=1100
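        The test, set, toggle and clear operations on a scalar can be sketched as follows (variable names are illustrative; the mask picks the third bit from the right so that each step visibly changes the value):

```perl
use strict;
use warnings;

my $scalar = 0b00000011;

# Test: shift the wanted bit to the rightmost position, then mask with 1.
my $second_bit = ($scalar >> 1) & 0b00000001;   # 1, the bit is set

my $mask = 0b00000100;        # only the third bit from the right

$scalar |= $mask;             # set: stays on if already on
my $after_set = $scalar;      # 00000111

$scalar ^= $mask;             # toggle: on becomes off, off becomes on
my $after_toggle = $scalar;   # 00000011

$scalar |= $mask;
$scalar &= ~$mask;            # clear: AND with the inverted mask
my $after_clear = $scalar;    # 00000011

printf "%08b %08b %08b\n", $after_set, $after_toggle, $after_clear;
```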
      At least to my knowledge, bitwise operations are mostly used to store "toggles", so to speak. There are other applications, but toggles are the most practical for most programmers.

      You can take a 32 bit int and set one of the bits. Setting the 4th bit would make the integer value 8. Now why would you want to do this? A good example of this would be the data that is stored in Windows NT domain accounts. I use this example because it's the most recent thing I've used bitwise operators for. If you look at NT accounts, there are several "toggles" for things like "Account Locked Out", "User Must Change Password", "User Cannot Change Password", etc...

      Instead of storing all of these in separate variables, they are all stored in one integer. And if you want to test whether the user's account is locked out, or unlock the account, you can test for that bit or toggle the bit, respectively.

      In smaller programs, it's probably not something that you're going to want to use. But if you're talking about a bunch of simple yes/no variables that are going to be replicated over and over again (like for accounts), you're going to save space, and I believe slightly improve speed by storing them in an integer as on/off bits.
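      To give a feel for the space saving, here is a sketch that keeps one yes/no flag per account in a packed bit string using Perl's built-in vec. The account numbers and the "locked out" meaning are made up for the example and have nothing to do with how NT actually stores its flags:

```perl
use strict;
use warnings;

my $locked = '';              # one bit per account, all initially 0

vec($locked, 3, 1)   = 1;     # hypothetical account 3 is locked out
vec($locked, 999, 1) = 1;     # so is hypothetical account 999

my $is_locked_3 = vec($locked, 3, 1);   # 1
my $is_locked_4 = vec($locked, 4, 1);   # 0

vec($locked, 3, 1) = 0;       # unlock account 3
my $after_unlock = vec($locked, 3, 1);  # 0

# 1000 flags fit into 125 bytes.
my $bytes = length $locked;
print "account 3: $is_locked_3, account 4: $is_locked_4, bytes used: $bytes\n";
```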

      Hope this helps...
      Rich

        ... and I believe slightly improve speed by storing them in an integer as on/off bits.

        To my knowledge, doing that is actually slightly slower in most cases. This is because if you're storing it as an integer, one operation must be done to retrieve it: get value from location in memory; and one to store it: store value to location in memory. On the other hand, with individual bits there are more operations; to get a bit value: get the 32 (or whatever) bit value from memory, and then use AND to check the bit's value; and to store: get the value from memory, use OR or AND or XOR to set or toggle the bit in question, and then store the value in memory.

        But it's probably unlikely that the speed difference will matter in most cases. I'd expect that usually the speed tradeoff is well worth the space advantage.

        (update:) Thinking about this more makes me think it more of a toss-up. You probably would have a speed improvement if you would otherwise have to get lots of values from memory, because that is relatively "slow". If you're just doing something like going through a bunch of these numbers to check for one or two bits, or to set one or two bits, the way of using lots of numbers will probably be faster, I think. But if you're just using one number for a bunch of things you're using at runtime, I'd guess that it would be faster because you could actually keep the value in one register, or it would be easier to cache it. (With perl, we probably don't have any hope of it staying in a register, so the cache is your only hope; and the cache might just as easily cache 32 32-bit values, too.) (All this is of course IMO, and subject to being totally wrong.)