in reply to Re: Can you spot the problem?
in thread Can you spot the problem?

Assuming for the moment that we apply tye's two-stroke fix that makes the code work as intended, it still only works because of certain bit-related properties of the numbers in question (centering around the fact that 256 is a power of 2).

In this particular domain each number should be a single byte. The bit-related property isn't arbitrary or coincidental - it's the very essence of the thing we're testing for. To me it's an obvious property that or-ing them together should result in a number less than 256. To others it may not be.
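A minimal sketch of that property, assuming non-negative integer inputs. The helper name all_bytes is made up for illustration and is not the code from the original node; adding 0 forces each operand to be a number so that | does a numeric OR:

```perl
use strict;
use warnings;

# Hypothetical helper: true if every value fits in a single byte.
# Any value of 256 or more has a bit set above the low eight bits,
# so the OR of all the values is below 256 only if every value is.
sub all_bytes {
    my $or = 0;
    $or |= $_ + 0 for @_;    # "+ 0" forces a numeric OR
    return $or < 256;
}

print all_bytes(192, 168, 0, 1)   ? "ok\n" : "out of range\n";
print all_bytes(192, 168, 256, 1) ? "ok\n" : "out of range\n";
```

The first call prints "ok" and the second prints "out of range".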

You can see why someone might view this as unintuitive.

Of course. It's a completely reasonable attitude. Just not a universal one.

To others, myself included, using a bitwise numerical or is a completely intuitive solution. Whether somebody finds it obvious or not is going to depend on their experience.

I too would use Regexp::Common, but only because I loathe reinventing wheels. Are the regexes that Regexp::Common::net uses internally any more or less "intuitive" than the other solutions offered? This, of course, depends on how familiar you are with Perl's regular expressions. It also depends on the domain you're playing with.

I still think the real problem is with the behaviour of the | operator in Perl. This is one of the very few places where the difference between a string containing a number and a number makes a real difference.

With all the other numerical operations, numbers and strings containing numbers are treated the same way. It's only the bitwise operators that treat them differently. This, in my opinion, was a poor design choice and doesn't fit in with the way the rest of Perl is organised (eq vs ==, . vs +, etc.). I imagine this is one of the reasons the numerical and string bitwise operators are being separated out in Perl 6.
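To make the difference concrete, here is a small sketch, assuming a Perl 5 without the newer "bitwise" feature enabled. The same | operator gives different answers depending on whether its operands are numbers or strings:

```perl
use strict;
use warnings;

my $num = 12 | 3;        # both operands are numbers: numeric OR
my $str = "12" | "3";    # both operands are strings: character-wise OR
my $mix = "12" | 3;      # one numeric operand makes the whole operation numeric

print "$num\n";    # 15
print "$str\n";    # 32  ("1" | "3" is "3", and the trailing "2" is kept)
print "$mix\n";    # 15
```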

For example, the difference between the bitwise string and numerical operators has caught me out when I was manipulating preference values that were being represented as bit masks. Comparing different preferences sometimes worked and sometimes didn't, depending on whether preferences got injected into the system as strings or numbers. Since the injection of values happened a long way from the comparisons it took a heck of a long time to figure out what was going on.
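A sketch of how that kind of bug can look. The flag and preference values here are made up for illustration and are not the third-party application's real ones; the only thing that differs between the three checks is how the values arrived:

```perl
use strict;
use warnings;

# Hypothetical values: preferences 12 (binary 1100), flag 4 (binary 0100).
my ($prefs_a, $flag_a) = ("12", "4");    # both arrived as strings
my ($prefs_b, $flag_b) = (12,   "4");    # preferences arrived as a number
my ($prefs_c, $flag_c) = ("12", "4");    # strings, numified by hand below

my $stringy = $prefs_a & $flag_a;                # character-wise AND: "0"
my $numeric = $prefs_b & $flag_b;                # numeric AND: 4
my $forced  = ($prefs_c + 0) & ($flag_c + 0);    # numeric AND: 4

print $stringy ? "set\n" : "not set\n";    # not set
print $numeric ? "set\n" : "not set\n";    # set
print $forced  ? "set\n" : "not set\n";    # set
```

Separate variables are used for each case on purpose: once a string has been used as a number, Perl may remember its numeric value, and later bitwise operations on that same variable can then switch to the numeric behaviour.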

(For those about to chime in and say bitmasks were a poor choice for preference values - I quite agree. We were manipulating values from a third party application.)

So, in conclusion, flyingmoose is not smoking crack (or, at least, if he is, you can't tell that from the node in question).

I never accused anybody of smoking crack.

Ignoring the reinventing the wheel issue, and just comparing the bitwise and comparison solutions, on some days I might even agree that the comparison solution was a better design choice.

However I do disagree that the original solution "does something that doesn't even make sense". It makes perfect sense - it's just that Perl in this particular instance does something that most developers wouldn't expect.

Re: Re^2: Can you spot the problem?
by jonadab (Parson) on Mar 08, 2004 at 15:39 UTC
    I still think the real problem is with the behaviour of the | operator in Perl. This is one of the very few places where the difference between a string containing a number and a number makes a real difference.

    With all the other numerical operations

    Whoah, hold on. Where on earth did you get the idea that bitwise operations are numerical operations? They're not. Bitwise operations operate in an arithmetic fashion, yes, but they operate on sequences of bits; what the arrangement of the bits represents is not a key feature of the bitwise operation, and it is not reasonable to assume that they necessarily represent numbers. Often (perhaps even usually) they represent characters, either in ASCII or Unicode. Doing a bitwise operation on strings (especially XOR or a left or right shift, but sometimes an and or an or) is quite a common operation, one that we would *not* want to have randomly break if one of the strings happens to contain digit characters by some coincidence.
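As a small illustration of the string case, here is a sketch of an XOR mask applied to a string that happens to consist only of digits. The text and key values are made up, and this assumes a Perl 5 without the newer "bitwise" feature enabled:

```perl
use strict;
use warnings;

my $text = "2004";    # a string that happens to be all digits
my $key  = "perl";    # hypothetical mask of the same length

my $masked = $text ^ $key;      # character-wise XOR, not 2004 ^ 0
my $back   = $masked ^ $key;    # XOR with the same key restores the text

print length($masked), "\n";    # 4
print "$back\n";                # 2004
```

Because both operands are strings, the XOR works byte by byte and the round trip recovers the original text.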

    Sure, if you have separate bitwise operators for working on strings versus numbers, that would be fine. In Perl6, that will work out, no problem.

    But you absolutely don't want to just make the bitwise operators that we have in Perl5 magically numerify their arguments. That would be exceedingly bad. That seems to be what the person who wrote the code expected to have happen; where he came up with such an absurd idea is quite beyond me. Like I said, it took me three readings to figure out why the author expected it to work. Why it didn't work was somewhat more obvious. Figuring out why the author expected it to work took more time, significantly more time, than subsequently figuring out why his expectations were broken.


    ;$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$;[-1]->();print
      Whoah, hold on. Where on earth did you get the idea that bitwise operations are numerical operations? They're not.

      I meant numerical as in takes a number as an argument (123) as opposed to a string ("123"). I thought this was obvious from the context.

      Doing a bitwise operation on strings (especially XOR or a left or right shift, but sometimes an and or an or) is quite a common operation, one that we would *not* want to have randomly break if one of the strings happens to contain digit characters by some coincidence.

      Which is exactly my point :-) Random breakage is exactly what the current | operator's behaviour causes, just from the other side. Numerical bit based operations and string based bit operations are equally useful - but munging them together in one operator is asking for trouble in a language that transparently translates between numbers and strings in most contexts.

      But you absolutely don't want to just make the bitwise operators that we have in Perl5 magically numerify their arguments.

      I agree completely. I don't think I ever suggested it. However what we have now is just as bad since Perl programmers are used to strings containing numbers acting like numbers.

        Whoah, hold on. Where on earth did you get the idea that bitwise operations are numerical operations? They're not.
        I meant numerical as in takes a number as an argument (123) as opposed to a string ("123").

        Right. Where did you get the idea that the bitwise or operator is numerical in this sense? It's not. None of the bitwise operators are. I think this is the crux of the problem, the reason why whoever wrote the code originally (was that you? the OP didn't say who it was) thought it would work: apparently he was under the impression that the bitwise operations are exclusively numeric in nature, imposing a numeric context on their operands. I don't know where he got this notion, but it's wrong; Perl5's bitwise operators are not numeric operators; they also work on strings.

        Random breakage is exactly what the current | operator's behaviour causes

        Perhaps, but...

        However what we have now is just as bad since Perl programmers are used to strings containing numbers acting like numbers.

        No, strings don't act like numbers, normally. Whoever told you that was trying to oversimplify *way* too much (probably trying to avoid explaining context, but context is the most important thing to understand about Perl, so leaving it out of the explanation was misguided and led you into the misunderstanding we're now sorting out). Strings act like strings, normally.

        The only time strings ever get converted automagically into numbers is in numeric context. Sure, if you use a numeric operator like + they'll get converted, because the numeric operators supply a numeric context to their operands, but that's not at all the same as expecting them to get converted any time you do anything with them in any context at all. The fact that the string "looks" like a number doesn't have a great deal to do with its getting converted, either. In numeric context, the string "George" will get converted to a number (which, as it turns out, will be 0), though this will generate a warning since the conversion isn't "clean". If in any given situation you wouldn't expect "George" to be treated as a number, don't expect "123" to be treated as a number either; it won't be, because it's a string.
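A short sketch of the numeric-context behaviour described above. The warning handler is only there to capture the warning so it can be printed; the variable names are made up:

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $digits = "123";
my $name   = "George";

my $clean = $digits + 0;    # + supplies numeric context: 123, no warning
my $messy = $name + 0;      # also converted: 0, with an "isn't numeric" warning

print "$clean\n";                 # 123
print "$messy\n";                 # 0
print scalar(@warnings), "\n";    # 1
```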

        Would you expect "123" to change into 123 if you assigned it to a scalar, tested it with the defined operator, or passed it as an argument to an arbitrary user-defined function? No? Then why did you think it would get changed into a number if you performed a bitwise operation on it? Were you under the impression that the bitwise operators supplied a numeric context to their arguments? They don't, and why would they? We use them on strings at least as much as on numbers. It wouldn't make sense for them to supply numeric context. This is not "random breakage" to my way of thinking; none of the bitwise operators ever impose a numeric context on their arguments, nor should they. If we had a set of bitwise operators that _only_ work on numbers, as is planned in Perl6, then they _would_ impose a numeric context, of course, but Perl5's bitwise operators are not exclusively numeric.

        You know, when I go back and read this thread, it occurs to me that I'm probably coming across as argumentative. I honestly did not intend to be. I was just trying to understand where you were coming from, why you thought the code should work, and what expectations Perl was breaking. (And, meanwhile, if I could help someone understand Perl better, so much the better.) I think I understand now, but it's possible that I've still missed a subtle point or two. Anyway, I think the basic issue is that whoever wrote the code doesn't think in Perl, but thinks in another language and translates to Perl. This is normal; most programmers think in whatever language they know best and translate to other languages. And yes, Perl being what it is needs to be able to work for people who think in another language (more on this next paragraph), so the move to separate numeric bitwise operators in Perl6 is probably an important step (even though it won't matter for people who think in Perl and understand context implicitly). I think I now understand (well, mostly) why this is a useful change. (Before, I was ambivalent about it.) All of that to say, I wasn't just being argumentative; I was also learning something :-)

        As far as Perl's needing to work for people who think in other languages... all languages to some extent have this need, but Perl6 needs it especially, because if we *ever* hope to get all the hundreds of thousands of C programmers out there switched over to a language that supplies things like garbage collection, Perl6 is the best hope. But they won't switch over if the barrier to entry requires them to think in Perl before they can start doing stuff -- and, in fact, this is the thing that originally attracted me to Perl5; I read two chapters of the Camel book and was writing code that not only worked but was _useful_, not some Hello World exercise. So now, having spent the whole thread trying to explain why Perl5 is the way it is, I'm talking about how glad I am that Perl6 will be better.


        ;$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$;[-1]->();print