in reply to Re: Tokenizing and qr// <=> /g interplay
in thread Tokenizing and qr// <=> /g interplay
Re^3: Tokenizing and qr// <=> /g interplay
by japhy (Canon) on Apr 23, 2005 at 17:09 UTC
In the above code, the regex 'ab' is compiled and executed, and then the regex 'cd' is compiled and executed. Compare that with:

Here, even though $x and $y change, the ACTUAL regex ('abc') does not change, so the regex is compiled only once. The process that Perl does internally is this:
So what is qr// good for? Consider this:

This code compiles a grand total of 30 regexes. Why? Because for each string in @strings we've got three patterns to execute, and because the contents of $p have changed each time the $_ =~ $p is encountered, the regex is compared and recompiled each time. Now sure, you could reverse the order of the loops, but that would make the calls to handle() happen in a different order.

So enter the qr//. When Perl sees a regex comprised solely of a single variable, Perl checks to see if that variable is a Regexp object (what qr// returns). If it is, Perl knows that the regex has already been compiled, so it simply uses the compiled regex in the place of the regex. That means doing:

is considerably faster. There is no additional compilation happening. It's probably even better to move the qr// values into an array, but that might be moot since they're made of constant strings in this example.

The point is, the use of qr// in a looping construct is the primary benefit it offers. Yes, it helps break a regex up into pieces too, but that's just a matter of convenience.

Be warned that the benefit of qr// objects is lost if there is additional text in the pattern match. I mean that $foo =~ /^$rx_obj$/ suffers from the same problem as $foo =~ /^$text$/.
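For readers following along without the original listings, here is a minimal sketch of the looping pattern described above. It is an illustration only, not japhy's code: the contents of @strings, the three patterns, and this stand-in handle() are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical data; the thread's real @strings and patterns are not shown.
my @strings  = ('alpha 42', 'beta', 'gamma 7');

# Each qr// is compiled once, here, and yields a Regexp object.
my @patterns = (qr/\d+/, qr/^a/, qr/a$/);

# Stand-in for the handle() mentioned in the post: it just records
# which string matched which pattern index.
my @matched;
sub handle {
    my ($str, $idx) = @_;
    push @matched, "$str:$idx";
}

for my $str (@strings) {
    for my $i (0 .. $#patterns) {
        my $rx = $patterns[$i];
        # The match consists solely of $rx, which is a Regexp object,
        # so the already-compiled pattern can be used directly.
        handle($str, $i) if $str =~ $rx;
    }
}

print "$_\n" for @matched;
```

The outer loop still walks @strings first, so the calls to handle() keep the order the post cares about.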
by Tanktalus (Canon) on Apr 23, 2005 at 19:22 UTC
"When Perl sees a regex comprised solely of a single variable, Perl checks to see if that variable is a Regexp object (what qr// returns). If it is, Perl knows that the regex has already been compiled, so it simply uses the compiled regex in the place of the regex."

Really? That is very interesting. So if I have:

you're saying that in the die's unless clause, it will need to completely recompile the regex? That is not my interpretation, but I could be completely wrong here. My assumption is that both $true and $false are compiled once, and only once, and the unless modifier above would not need to recompile either one.

Even if that is the case, I use code like the above because I like to be able to reuse common criteria for truth and falseness across many expressions. Sometimes, as in the die statement above, that is for validation that the value is something (i.e., not a typo: if someone had "y]", we'd not accidentally treat that as a false value, we'd simply reject it so the user could fix the typo). At other times, such as the if statement above, it is just to see which one it was.

Which goes to the OP's question on why it's useful, somewhat in agreement with other posts here. I'm just showing a concrete example of real, live, production code where I use this construct.
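A rough sketch of the kind of construct described in this post, for readers without the original listing. The patterns behind $true and $false and the parse_bool() helper are hypothetical; only the variable names $true and $false, the die ... unless validation, and the "y]" typo come from the post.

```perl
use strict;
use warnings;

# Hypothetical criteria for truth and falseness, compiled once each.
my $true  = qr/^(?:y|yes|true|1)$/i;
my $false = qr/^(?:n|no|false|0)$/i;

# Hypothetical helper: validate first, then classify.
sub parse_bool {
    my ($value) = @_;

    # Validation: a typo such as "y]" matches neither pattern, so it is
    # rejected instead of being silently treated as false.
    die "Unrecognized value '$value'\n"
        unless $value =~ $true || $value =~ $false;

    # Classification: reuse the same criteria to see which one it was.
    return $value =~ $true ? 1 : 0;
}

print "Yes -> ", parse_bool('Yes'), "\n";
print "n   -> ", parse_bool('n'),   "\n";

my $ok = eval { parse_bool('y]'); 1 };
print "y]  -> ", ($ok ? "accepted" : "rejected"), "\n";
```

In this sketch each match uses a qr// object on its own, which is the case the quoted paragraph describes as needing no recompilation.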
by japhy (Canon) on Apr 23, 2005 at 19:25 UTC
by ff (Hermit) on Apr 25, 2005 at 11:59 UTC
However, as you can see (?) from the example above, I have lots of crummy/good switchouts to do. Is my plodding approach above the best that can be expected?

P.S. Can you clarify/update what you meant by:

I think you are saying that a precompiled/qr regex used in a follow-on regex will have to be recompiled if you snap additional text onto the qr'd variable, because the overall text of the new regex will be different. Although at least one would still have the benefit of 'concentrated regex logic' within the qr'd variable?
by japhy (Canon) on Apr 25, 2005 at 12:36 UTC
then you could do your loop as

This way, even if you end up looping over THAT code, you'd still be dealing with already-compiled regexes. As soon as you put additional text into a regex with qr// in it:

Perl has to do the "compare physical regex forms" test. Only if the qr// object is all alone will it have the full benefit it was made for.
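To make the distinction concrete, here is a small sketch contrasting a qr// object used alone with one that has extra text around it. The pattern and the sample strings are assumptions made up for this example. The last part, building the larger pattern once with its own qr//, is one possible way to keep a lone Regexp object at the match site; it is not something the post itself proposes.

```perl
use strict;
use warnings;

my $word = qr/\w+/;    # hypothetical pattern, compiled once

# Alone: the match is nothing but the Regexp object.
my $alone = 'foo bar' =~ $word;

# With additional text: $word is interpolated into a larger pattern,
# which is the case the post says goes through the comparison test.
my $inline = 'foo bar' =~ /^$word$/;

# Possible workaround (an assumption, not from the thread): wrap the
# extra text in a second qr// once, then match against that object alone.
my $anchored = qr/^$word$/;
my @whole = grep { $_ =~ $anchored } ('foo', 'foo bar', 'baz');

print "alone:    ", ($alone  ? "match" : "no match"), "\n";
print "inline:   ", ($inline ? "match" : "no match"), "\n";
print "anchored: @whole\n";
```

All three forms give the same matching results as their plain-string equivalents; the difference discussed in the thread is only in how often the pattern has to be checked or compiled.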
by ff (Hermit) on Apr 25, 2005 at 13:37 UTC
Re^3: Tokenizing and qr// <=> /g interplay
by MarkusLaker (Beadle) on Apr 23, 2005 at 16:34 UTC
Here's an example from a code-filtering assertions module (yes, another one) that's not yet tested thoroughly enough to submit to CPAN: