in reply to Re: Impact of special variables on regex match performance
in thread Impact of special variables on regex match performance

Oh, that explains the difference in behavior between the code posted above and the multi-line approach I describe at the bottom of my post.
Considering you aren't using $`, $& and $', it seems the obvious thing to do is to keep not using them.
As per the rules of this forum, I posted the smallest amount of code reproducing the issue. Not using those special variables (or rather, not having them seen by perl) isn't an option in the real project...

Re^3: Impact of special variables on regex match performance
by JavaFan (Canon) on Dec 09, 2010 at 21:44 UTC
    Not using those special variables (or rather not have them seen by perl ) isn't an option in the real project
    Of course it is.

    It doesn't mean you can do better. If you are using $` and $' on every match you make, it doesn't matter: any alternative will make you pay the price.

    But if you're using $` and friends on only some matches, then /p, @- and @+, and adding more captures to your patterns are the classical alternatives.
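    A minimal sketch of the first two alternatives, assuming perl 5.10 or later for /p. The sample string and the comment-stripping pattern are made up for illustration and are not taken from the original code:

```perl
use strict;
use warnings;

# Hypothetical input: a line with a trailing comment.
my $line = 'foo = bar # trailing comment';

# Option 1: the /p modifier. Only this match preserves the pieces,
# which are then read from ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}.
my ($pre_p, $match_p, $post_p);
if ($line =~ /\s*#/p) {
    ($pre_p, $match_p, $post_p) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

# Option 2: @- and @+ hold the start and end offsets of the last
# successful match, so substr can rebuild the same three pieces.
my ($pre_o, $match_o, $post_o);
if ($line =~ /\s*#/) {
    $pre_o   = substr($line, 0, $-[0]);
    $match_o = substr($line, $-[0], $+[0] - $-[0]);
    $post_o  = substr($line, $+[0]);
}

print "before: [$pre_p] match: [$match_p] after: [$post_p]\n";
```

    Both options give the same three strings as $`, $& and $' would, without naming those variables anywhere in the program.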

      Yes, I didn't mean to say that there are no technical alternatives to my predicament. The obstacles are more organizational in nature. Thanks for the help.
        Well, if you cannot avoid using $' and $` in your program, you cannot avoid the costs that come with them. Just wishing the costs would go away isn't going to work.
Re^3: Impact of special variables on regex match performance
by ww (Archbishop) on Dec 09, 2010 at 23:10 UTC
    I'm sure it's possible to pose a challenge ("real project") in which those variables would be needed... but it's hard to come up with one. Consider:
    $`  The text before matching -- In your code, that's $in
    $'  The text which comes after the match in an input string -- Alt: Lookarounds
    $&  The text of the match itself -- Use captures instead
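    A rough sketch of those substitutions. The name $in is borrowed from the note above; its value and the patterns are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical input string.
my $in = 'key=value;rest';

# Captures stand in for $`, $& and $': group 1 is the text before the
# match, group 2 the match itself, group 3 the text after it.
my ($before, $matched, $after) = $in =~ /\A(.*?)(=\w+)(.*)\z/s;

# A lookahead checks what follows without consuming it, so the text
# after the match never has to be extracted.
my ($name) = $in =~ /(\w+)(?==)/;

print "before=[$before] matched=[$matched] after=[$after] name=[$name]\n";
```

    The extra captures cost something too, but only on the matches that use them.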

    As mentioned, the cost (= slowdown) is well documented in the standard regex docs, and in many books, tutorials and nodes devoted to regular expressions.

    Concerning your code: given that sub uncomment_one is never explicitly called in what you posted, you may have over-reached in your diligence to follow the guidance (not exactly "rules") of this forum. /me suspects that profiling what you show would be informative; certainly, as is, the slowdown using your code is largely caused by the copying, which is expensive, as JavaFan points out above.

      Concerning your code: given that sub uncomment_one is never explicitly called in what you posted, you may have over-reached in your diligence to follow the guidance (not exactly "rules") of this forum.
      I put that code in a sub on purpose. The performance impact is triggered by the mere presence of those variables in the code, not their actual use as part of the execution of the script. And so there is no need to imagine a situation where those variables would be actually needed, only one where you'd need to load a module that contains those variables in a sub your code doesn't actually call.
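      One way to probe that claim inside a single script is sketched below. It times some matches, then uses a string eval to compile (but never call) a sub that mentions $&, then times the same matches again. The helper names, the test string and the iteration count are all made up. As far as I know, perl 5.20 and later use copy-on-write for this, so on a recent perl the two timings may barely differ:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Time a fixed number of successful matches against one string.
sub time_matches {
    my ($text, $count) = @_;
    my $hits  = 0;
    my $start = time();
    for (1 .. $count) {
        $hits++ if $text =~ /needle/;
    }
    return (time() - $start, $hits);
}

# Hypothetical input: a long string with the target in the middle.
my $text = ('x' x 100_000) . 'needle' . ('y' x 100_000);

my ($before, $hits_before) = time_matches($text, 2_000);

# The string eval delays compilation until this point, so perl has not
# seen $& during the first timing. The sub itself is never called.
eval q{ sub never_called { return $& } 1 } or die $@;

my ($after, $hits_after) = time_matches($text, 2_000);

printf "before: %.4fs  after: %.4fs\n", $before, $after;
```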

        Just a couple of nodes back you asserted that NOT using the vars "isn't an option"; then you concede that you "didn't mean to say that there are no technical alternatives." Now you're offering justification by way of a hypothetical case in which one might "load a module that contains those variables."

        1. That suggests you'd better take a look at what's in the module that you might load (and, IMO, any module that uses those needs to be scrutinized very carefully for {other} sub-optimal techniques).
        2. Some -- very possibly most -- of the "performance impact is triggered by the mere presence of those variables" but you haven't allocated any of it to the difference between calling a no-op sub (all commented) and a sub with as much as a single operation (one, uncommented). Again, profile it.
        3. You are correct, of course, that some, perhaps most, of the impact comes from the mere presence of those variables, but, as noted by numerous respondents, that's well documented. Learning facts like that is part of the reason for reading the docs.
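
        For point 2, a minimal sketch using the core Benchmark module to separate the cost of the call itself from the cost of one operation. The sub bodies and the sample text are invented and only stand in for the posted code:

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical input: repeated lines containing a C-style comment.
my $sample = "some code /* comment */ more code\n" x 100;

# Stand-in for the sub with everything commented out.
sub noop { return }

# Stand-in for the sub with a single operation left uncommented.
sub one_op {
    my ($text) = @_;
    $text =~ s{/\*.*?\*/}{};
    return $text;
}

# A negative count asks Benchmark to run each case for at least that
# many CPU seconds instead of a fixed number of iterations.
my $results = timethese(-1, {
    noop   => sub { noop($sample) },
    one_op => sub { one_op($sample) },
});
cmpthese($results);
```

        Running this once as is, and once with a sub mentioning $& added to the file, shows how much of the slowdown lands on the operation rather than on the call.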

        UPDATE: Inserted dropped negation, para 1, sentence 1.