Re: Possessive Quantifiers
by blakem (Monsignor) on Aug 17, 2002 at 11:30 UTC
| [reply] [d/l] |
The cut operator was only used there for performance reasons.
And I did it only because I _think_ it's faster, not because
I know for sure. It certainly didn't change the meaning of
the regex; it would still match the same set of strings.
Abigail
Re: Possessive Quantifiers
by sauoq (Abbot) on Aug 17, 2002 at 03:08 UTC
Are you suggesting it is somehow better in Java because the syntax is shorter? Or do you know something about the way this is implemented in Java that makes it somehow better than the implementation in Perl?
If it is the former, I'd argue that shorter syntax doesn't mean better, especially for such an infrequently used construct. Besides, if shorter syntax does mean better then Perl beats Java hands down overall. (Ever seen any Java golf? Me neither...)
If it is the latter, I'm interested in knowing what you do.
To answer your questions in order:
1. I don't know.
2. I don't think I've ever used it.
3. There is no cure. Soon you'll be making up regexes to match strings of cars on the freeway and other such nonsense.
-sauoq
"My two cents aren't worth a dime.";
| [reply] |
I wish I knew something (significant) about the Perl or Java implementations. In fact, quite the opposite is true: so far I've done no more than read the quotes in the Perl headers. :-P
That's part of the reason I submitted this question. I'm quite aware that Perl beats Java hands down - the book has enough examples to nail that down even if I didn't already believe it from previous experience. I'm also aware that shorter isn't necessarily better.
If there's some particular insight as to why it would be worse in this particular case, I'd be delighted to be enlightened. As it is, in my ignorance, I like the + possessive because I think it looks more readable than the (?>) alternative, and is likely a better mnemonic, since I have to keep looking at the man page to remember which odd character goes into (?>).
Java golf. That'd be a laugh. 'Look, I done it in 15!' 'Characters?' 'No, classes!' Thbbppt.
note: Thanks to whoever edits the above link to regex.info for fixing my stupidity.
| [reply] |
If there's some particular insight as to why it would be worse in this particular case, I'd be delighted to be enlightened. As it is, in my ignorance, I like the + possessive because I think it looks more readable than the (?>) alternative, and is likely a better mnemonic, since I have to keep looking at the man page to remember which odd character goes into (?>) .
I don't know enough about the Java half of this question to be strongly opinionated but I can offer some thoughts.
The construct is mostly good for optimizing the failure case of a specific subset of patterns. Consequently, it is infrequently used in Perl and presumably that's the case in Java as well. The symbol "+", however, is frequently used for a very common case. Overloading its meaning could be confusing.[1] It might be easily missed or look like a typo, particularly to someone unfamiliar with it.
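To make "the failure case" concrete, here is a rough Java sketch using java.util.regex. The class, field, and method names are made up for illustration. Both patterns reject an all-digit input with no colon, but the greedy version may retry many ways of splitting the digits before giving up, while the possessive version has nothing to give back. Timings depend on the machine and the regex engine, so none are quoted here.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the thread.
final class FailureCaseDemo {
    // Greedy inner quantifier: on failure the engine can re-split the digits.
    static final Pattern GREEDY = Pattern.compile("(?:\\d+)+:");
    // Possessive inner quantifier: the digits, once taken, are never given back.
    static final Pattern POSSESSIVE = Pattern.compile("(?:\\d++)+:");

    // Times a single full-match attempt, in nanoseconds.
    static long nanosToMatch(Pattern p, String input) {
        long start = System.nanoTime();
        p.matcher(input).matches();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String noColon = "123456789012345678"; // kept short on purpose
        System.out.println("greedy matches:     " + GREEDY.matcher(noColon).matches());
        System.out.println("possessive matches: " + POSSESSIVE.matcher(noColon).matches());
        System.out.println("greedy ns:          " + nanosToMatch(GREEDY, noColon));
        System.out.println("possessive ns:      " + nanosToMatch(POSSESSIVE, noColon));
    }
}
```

The input is deliberately short, since the greedy pattern's work on a failing input can grow very quickly with length.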
The (?>) construct allows grouping that Java's doesn't seem to handle. I'm guessing[2] that (?>a*b*) is equivalent to Java's a*+b*+. If so, and (?>a+b?c{3,7}d*e+) looks like a++b?+c{3,7}+d*+e++ in Java, then Java's representation starts to get long and messy. What about (?>a*(b|c)d*)? Can that be expressed in Java at all? Or is Java restricted to modifying quantifiers?
1. It might be a real + for obfuscation though. :-)
2. I'm not sure of this. Someone please correct me if I'm wrong.
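For what it's worth, the java.util.regex.Pattern documentation lists (?>X) as an independent, non-capturing group alongside the possessive quantifiers, so a grouped pattern such as (?>a*(b|c)d*) can be written in Java unchanged. Below is a small sketch that spot-checks the guess about (?>a*b*) and a*+b*+ on a handful of inputs. It is a sanity check, not a proof of equivalence, and the class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class AtomicVsPossessive {
    // True when the entire input matches the regex.
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] inputs = {"", "aab", "aabbb", "ba", "acd", "ad"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\""
                + "  (?>a*b*)=" + full("(?>a*b*)", s)          // atomic group
                + "  a*+b*+=" + full("a*+b*+", s)              // possessive quantifiers
                + "  (?>a*(b|c)d*)=" + full("(?>a*(b|c)d*)", s)); // atomic group with alternation
        }
    }
}
```

On these inputs the first two columns should agree, which is consistent with the guess for this particular pattern.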
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
sauoq said "3. There is no cure. Soon you'll be making up regexes to match strings of cars on the freeway and other such nonsense."
Don't forget dreaming in regexes... (I always seem to wake up with a headache after those dreams.)
Re: Possessive Quantifiers
by John M. Dlugosz (Monsignor) on Aug 17, 2002 at 04:15 UTC
I read about regex for Perl 6 in the Apocalypse. Totally redone, cool stuff, and includes easier use of this feature (but with different syntax). Even better, it has at least three "strengths" of it, if memory serves.
Re: Possessive Quantifiers
by agentv (Friar) on Aug 18, 2002 at 17:47 UTC
...I'm not sure what led you to the conclusion that Java does RE matching better than Perl. (I thought for a while you were just baiting us.)
But I will say that regular expression support in Java is fairly brand new, and for many of us IT Unitarians, regex support in Java is here AFT if you know what I mean.
If the possessive quantifier is all you can point to in support of the premise that "Java does it better than Perl," I'd have to differ with your conclusion. Benchmarks, I would consider as some mild evidence. Examples of nasty regexes that are simplified in Java somehow, I'd consider.
But in the end, why the heck do we have to have a "this one better than that one" mentality all the time? Why isn't it good enough that programmers who want to use Java can count on regex matching, and programmers who want to use Perl can do regex matching? I really don't care who was first. I really don't care about which one is incrementally faster, and I really don't care about which one has a more concise way to accomplish some arcane backwater operation.
Wanna talk about Real Life(tm)? Here's something about it. I don't choose the implementation language on the basis of something as marginal as minor differences in regular expression support. (In fact, even the absence of regex support in a language (as was the case with Java until very recently) will not sway my decision most times.)
It breaks down like this. If I have one smart guy (or gal) to do the implementation, I may very well choose Perl. If I have a team which must coordinate in the implementation, I want a more strongly typed language like Java. (Even if I have a very smart team of hotshot developers, I may still go with Java just to cut down on integration headaches and intra-team communications overhead.) If I want the most rapid prototype possible, I'll probably choose to do it in Perl. If I plan to distribute to a widespread audience, I may choose Java. If I have to consider portability and provide a GUI, I'll probably choose Java (although I may regret it later).
It's all good, people! There doesn't have to be a "weener and still champeeon!" When are we going to stop watching Highlander and believing that "there can only be one"? That's a Redmond mentality. You can use either language; they're both free with your purchase of an Internet connection. (Remember what polyamorists say: "AND, NOT OR.")
Okay, so consider me to have officially been baited. And I promised my family I'd stop. :-)
...All the world looks like -well- all the world,
when your hammer is Perl.
---v
| [reply] |
I'm sorry to have confused you, but I was definitely not intending to imply that Java did better RE matching than Perl - I am certain, particularly after reading Mastering REs, that the opposite is true.
What I was trying to say (badly, apparently) was that the possessive quantifier + was a feature in the Java RE engine which I liked better than its (?>) counterpart in Perl.
I hope that clears up any confusion you may have had. Thank you for your input, though, since I always like to see language comparisons (and find myself often treating Perl as the god of all languages).
update: Looking at this later, I realize that I have yet again said something other than what I mean, for which I feel greatly foolish.
The possessive quantifier + is a feature which I see as potentially very convenient. The (?>) construct, for some simple cases, is less readable, IMHO, but cannot be ignored for more complex cases or the must-match-exactly-once case.
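A quick sketch of the must-match-exactly-once case, in Java since java.util.regex accepts (?>) too. The group here carries no quantifier, so there is nothing to hang a possessive + on. The class and helper names are made up for illustration. Note that in this example the atomic group does change which strings match, because the engine commits to the first alternative that succeeds inside the group.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class ExactlyOnceDemo {
    // True when the entire input matches the regex.
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Ordinary group: after "a" fails to be followed by "c",
        // the engine goes back and tries the "ab" alternative.
        System.out.println("(?:a|ab)c on abc: " + full("(?:a|ab)c", "abc"));
        // Atomic group: once "a" has matched, the group is locked in,
        // so the "ab" alternative is never tried.
        System.out.println("(?>a|ab)c on abc: " + full("(?>a|ab)c", "abc"));
        // Both forms agree when the first alternative is the right one.
        System.out.println("(?>a|ab)c on ac:  " + full("(?>a|ab)c", "ac"));
    }
}
```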
Didn't mean to imply any mine's-bigger-than-yours stuff, just a hey-lookit-this-nifty-feature thing.