in reply to Code critique XS function for extracting a blessed regex's pattern.

That seems like an awful lot of work to go to to replace a single line in package Foo...

package Foo;
use overload '""' => sub { qr/$_[0]/ };

package main;
my $qr  = qr/^normal$/;
my $bqr = bless qr/^blessed$/, "Foo";
print "Normal : $qr\n";
print "Blessed: $bqr\n";

-sauoq
"My two cents aren't worth a dime.";

Re: Re: Code critique XS function for extracting a blessed regex's pattern.
by demerphq (Chancellor) on Feb 05, 2003 at 23:21 UTC
    Wow! I'm impressed. Where were you when I wrote the first node? :-) Very nice idea indeed.

    But unfortunately it doesn't address the question I'm trying to solve. My question is this: given an arbitrary blessed scalar ref, how does one efficiently determine if the object is in fact a regex? Your solution, which I personally think is rather ingenious, solves "How do I make a blessed ref, when stringified, return the pattern?" That is useful indeed, I think, but unfortunately not what I need. (I recognize I may not have specified the requirement sufficiently.)

    Even though this is a solution from the point of view of designing a class, its underlying concept, that of qr//ing the value, doesn't generalize. How do you detect a failure? There would be no way to determine whether the wrapped object had actually produced a regex, or just a stringified ref, or any number of other magic events.
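
    For readers on Perl 5.10 or later (well after this thread), the core function re::is_regexp() appears to answer this question directly; its documentation says it is not confused by blessing or overloading. A minimal sketch under that assumption, where the class name Foo and the helper describe() are purely illustrative:

    ```perl
    use strict;
    use warnings;
    use re ();    # re::is_regexp() is available on Perl 5.10+

    # Hypothetical helper: classify a value without stringifying it.
    sub describe {
        my ($thing) = @_;
        return 'not a reference' unless ref $thing;
        return re::is_regexp($thing) ? 'regex' : 'other reference';
    }

    my $blessed_qr  = bless qr/^blessed$/, 'Foo';    # regex reblessed into Foo
    my $blessed_ref = bless \my $scalar, 'Foo';      # ordinary blessed scalar ref

    print 'blessed qr//: ', describe($blessed_qr),  "\n";
    print 'blessed ref : ', describe($blessed_ref), "\n";
    print 'plain string: ', describe('^blessed$'),  "\n";
    ```

    Because the check never stringifies its argument, a '""' overload in Foo should not affect the result.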

    Anyway, ++ for the idea...

    --- demerphq
    my friends call me, usually because I'm late....

      My question is this: given an arbitrary blessed scalar ref, how does one efficiently determine if the object is in fact a regex?

      OK, I see the "problem" you are trying to solve now. I'm still not sure it is really worth solving though. In fact, your code might do more harm than good if its use became widespread. Why? Because it uses an undocumented "feature" of an undocumented quasi-type to provide functionality of questionable necessity to people writing ill-conceived code.

      Regexp thingies are a terrible kludge. They drift about in limbo, being neither entities of a true Perl type nor normal objects. Yes, you can play some tricks with them but that doesn't mean it is a good idea to do so. The fact that the blessed reference returned by qr// keeps its magical regular expression value after being reblessed into another class is probably not a good thing; it may even be a bug. Regardless, it is undocumented and we shouldn't rely on the behavior. (All of which begs the question of whether we should even rely on qr// returning a blessed reference in the first place.)

      If Regexp objects are elevated to a real Perl type someday, then code like

      my $r = bless qr/foo/, "MyPackage";
      probably won't even work and we'll be forced into writing code that is consistent with other types. Instead of getting a reference directly from qr// we'll have to take a reference to whatever it returns and bless that instead. There's no reason not to do that now. Code like
      my $r = bless \qr/foo/, "MyPackage";
      should continue to work even if Regexps are promoted to a real type. It does require that $$r is used when you want to get at the underlying regular expression, but dereferencing isn't that much of an inconvenience, is it?
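
      A minimal sketch of that approach, using nothing beyond core Perl. MyPackage comes from the snippet above; the constructor and the pattern() and matches() methods are made up here for illustration:

      ```perl
      use strict;
      use warnings;

      package MyPackage;

      # Store a reference *to* the qr// value rather than reblessing it.
      sub new {
          my ($class, $regex) = @_;
          return bless \$regex, $class;
      }

      # Hypothetical accessors: dereference to reach the untouched Regexp.
      sub pattern { my ($self) = @_; return $$self }
      sub matches { my ($self, $string) = @_; return $string =~ $$self ? 1 : 0 }

      package main;

      my $r = MyPackage->new(qr/foo/);
      print ref($r), "\n";             # class of the wrapper
      print ref($r->pattern), "\n";    # class of the wrapped regex
      print "matched\n" if $r->matches('foobar');
      ```

      The wrapper is an ordinary blessed reference, so the wrapped value stays in class Regexp and nothing depends on regex magic surviving a rebless.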

      The whole mess gets even stickier when you consider that strings can be used in much the same way that precompiled regexes are.

      $ perl -le 'my $r = "bar"; print "yes" if "foobarbaz" =~ $r'
      yes
      Now, keep that in mind as you reconsider the issue of whether Regexp thingies should maintain their magic after being reblessed into another class. It can lead to inconsistent behavior. For instance:
      #!/usr/bin/perl -w
      use strict;

      package P;
      use overload '""' => sub { 'stringified' };

      package main;
      local $\ = "\n";

      my $regex = qr/match/;
      bless $regex, 'P';

      my $plain = \my $t;
      bless $plain, 'P';

      print '"stringified" matched $regex' if "stringified" =~ $regex;
      print '"stringified" matched $plain' if "stringified" =~ $plain;

      __END__
      "stringified" matched $plain
      So, because of Regexps, not all references are created equal. Bummer.

      Yet another inconsistency due to the Regexp quasi-pseudo-sorta class is that you can write your own Regexp package and the things returned by qr// get access to your methods.

      #!/usr/bin/perl -w
      use strict;

      package Regexp;
      sub new { my $r; bless \$r }
      sub f   { q("I'm a Regexp.") }

      package main;
      local $\ = "\n";

      my $qr = qr/foo/;
      my $ob = Regexp->new();

      print '$qr says, ', $qr->f;
      print '$ob says, ', $ob->f;
      print '$qr isa Regexp' if $qr->isa('Regexp');
      print '$ob isa Regexp' if $ob->isa('Regexp');
      print '$qr: ', $qr;
      print '$ob: ', $ob;

      __END__
      $qr says, "I'm a Regexp."
      $ob says, "I'm a Regexp."
      $qr isa Regexp
      $ob isa Regexp
      $qr: (?-xism:foo)
      $ob: Regexp=SCALAR(0x805f148)
      That's not very nice behavior given that it isn't, AFAIK, documented that you shouldn't write a Regexp package of your own.

      All of this leads me to the conclusion that, if someone actually finds your XS code useful, they are almost certainly doing things that they ought not be doing anyway. ;-)

      -sauoq
      "My two cents aren't worth a dime.";
      
        Because it uses an undocumented "feature" of an undocumented quasi-type to provide functionality of questionable necessity to people writing ill-conceived code.

        Them's pretty strong words you are using there dude.

        First off there are many "feature"s of perl that are not properly documented. This is probably natural given that the code changes much faster than the documentation. Nevertheless you have a point. I will request Hugo make a decision on this, and if it is determined that it is a feature then I will provide a patch for perlop so that it becomes a documented feature. (As I said this is not uncommon at all.)

        Second off, it seems that you have been struck by the "since I can't see a good reason to do this there must not be a good reason" bug. One of my hobbies is writing an improved Dumper. Being able to correctly dump an object that is in fact a blessed qr// is very useful, both for data storage purposes and for development use.

        Personally I don't think that an improved dumper is ill-conceived, and the functionality is required if the dumper is going to be complete.
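
        To make the dumper case concrete, here is a rough sketch of the kind of output one would want. It leans on re::regexp_pattern(), which exists only in Perl 5.10 and later, and not on the XS function under discussion. dump_regex() is a hypothetical helper, and it makes no attempt to escape delimiters inside the pattern:

        ```perl
        use strict;
        use warnings;
        use re ();    # re::regexp_pattern() is available on Perl 5.10+

        # Hypothetical helper: render a (possibly reblessed) regex as Perl source.
        # Returns nothing for a value that is not a compiled regex.
        sub dump_regex {
            my ($obj) = @_;
            my ($pattern, $flags) = re::regexp_pattern($obj);
            return unless defined $pattern;

            my $source = sprintf 'qr/%s/%s', $pattern, $flags;
            my $class  = ref $obj;
            # A plain qr// is already in class Regexp; only mention other classes.
            return $source if $class eq 'Regexp';
            return sprintf "bless( %s, '%s' )", $source, $class;
        }

        print dump_regex(qr/^normal$/i), "\n";
        print dump_regex(bless qr/^blessed$/, 'Foo'), "\n";
        ```

        A real dumper would also have to cope with patterns containing "/" and with flags that older perls do not understand; this only shows that the pattern and class can both be recovered.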

        Now, keep that in mind as you reconsider the issue of whether Regexp thingies should maintain their magic after being reblessed into another class. It can lead to inconsistent behavior. For instance:

        I fail to see why this behaviour is inconsistent. One item is a regex, the other item is not. Since they are different, the fact that they behave differently can hardly come as a surprise to anyone. The only aspect of this that makes it seem inconsistent is that under normal circumstances you can't tell what's different. Your argument seems to amount to saying that "Since you can't distinguish a blessed scalar ref from a blessed qr// you shouldn't implement a way to do so." Which hardly seems like a logical position to take.

        (All of which begs the question of whether we should even rely on qr// returning a blessed reference in the first place.)

        I believe that it is a feature. And one that is exploited too. I think it is extremely unlikely that this behaviour will change, and if it does it will change over several versions as it must be deprecated first, then eliminated. Either way, the decision of Hugo will resolve this.

        The whole mess gets even stickier when you consider that strings can be used in much the same way that precompiled regexes are.

        Precisely the problem I am trying to address. How do I tell a string from a regex? Consider I might have a search routine. If you pass in a string it finds all the elements that equal that string exactly. If you pass in a regex it finds all the elements that match the regex. Being able to distinguish the two seems to be of obvious utility.
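
        The search routine described above might look roughly like this on Perl 5.10 or later, where re::is_regexp() is available. This is a sketch under that assumption, and the name search() and its calling convention are invented for illustration:

        ```perl
        use strict;
        use warnings;
        use re ();    # re::is_regexp() is available on Perl 5.10+

        # Hypothetical search routine: a compiled regex (blessed or not) selects
        # the matching elements; anything else is compared as an exact string.
        sub search {
            my ($needle, @haystack) = @_;
            if (re::is_regexp($needle)) {
                return grep { $_ =~ $needle } @haystack;
            }
            return grep { $_ eq $needle } @haystack;
        }

        my @items = qw(foo foobar bar);
        print join(',', search('foo',    @items)), "\n";
        print join(',', search(qr/^foo/, @items)), "\n";
        ```

        The string 'foo' and the regex qr/^foo/ take different branches even though either could be used on the right-hand side of =~.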

        That's not very nice behavior given that it isn't, AFAIK, documented that you shouldn't write a Regexp package of your own.

        I don't get it. This is exactly the behaviour I would expect given that it is not documented that you shouldn't write a Regexp package of your own.

        --- demerphq
        my friends call me, usually because I'm late....

      My question is this: given an arbitrary blessed scalar ref, how does one efficiently determine if the object is in fact a regex?

      Once you bless a Regexp into another class it isn't a Regexp anymore... try:

      <update>As demerphq kindly pointed out I lied :-) Can you spot the silly mistake in the "demonstration" below :-)</update>

      my $bqr = bless qr/^blessed$/, "Foo";
      print "no match for $bqr\n" unless "normal" =~ m/$bqr/;

      :-)

      I guess you could subclass it (although this is something I've never tried) - in which case

      UNIVERSAL::isa($qr, 'Regexp')

      would be the right solution.
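
      A small sketch of what the subclassing variant could look like. Foo is an illustrative class name, and the check only succeeds because Foo explicitly lists Regexp in @ISA:

      ```perl
      use strict;
      use warnings;

      package Foo;
      our @ISA = ('Regexp');    # declare Foo a subclass of the built-in Regexp class

      package main;

      my $qr = bless qr/^blessed$/, 'Foo';
      print "isa Regexp\n"    if UNIVERSAL::isa($qr, 'Regexp');
      print "still matches\n" if 'blessed' =~ $qr;
      ```

      Note that this tests class membership only: any object blessed into Foo passes the isa check, whether or not it actually holds a compiled regex.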

        Once you bless a Regexp into another class it isn't a Regexp anymore...

        Nope. The magic doesn't go away. As you can see.

        sub t {
            printf "%10s %s /%s/\n", $_[0], ($_[0] =~ /$_[1]/ ? "=~" : "!="), $_[1];
        }

        $bqr = bless qr/^blessed$/, "Foo";
        $qr  = qr/^normal$/;
        foreach $rex ($bqr, $qr) {
            t($_, $rex) foreach qw(normal blessed);
        }

        __END__
            normal != /Foo=SCALAR(0x1abf1d8)/
           blessed =~ /Foo=SCALAR(0x1abf1d8)/
            normal =~ /(?-xism:^normal$)/
           blessed != /(?-xism:^normal$)/
        In fact I think it's considered a feature. The possibilities are kinda interesting. :-)

        --- demerphq
        my friends call me, usually because I'm late....