G'day AnomalousMonk,

[Your Update just appeared as I hit [reply]. I think your post is fine where it is: htmanning gets a notification of your response with an alternative point of view and you had sent me a /msg anyway, so I was aware of it (thanks for that).]

"I tend to differ with kcott in the area of regex best practice."

While we certainly differ in some areas, I don't think the gulf is as wide as you suggest. I had originally intended to mention PBP (Perl Best Practices) in my post, but I had a very long (over an hour) interruption in the middle of typing it and, when I finally returned to it, forgot to include the PBP part. My response below covers the points I wanted to make.

I was very impressed with PBP when I first read it over a decade ago (in fact, I read it cover-to-cover twice) and started using most (if not all) of its recommendations in my code. I suspect that, 10 years ago, our views on "regex best practice" may have been perfectly aligned. I still use much of PBP although, these days, it has simply become part of my standard practice and I don't really think of it in terms of following specific recommendations. One area where I have departed from it is adding /msx to the end of every regex.

"kcott implies that one should avoid using the  /x /m /s modifiers where they are not necessary. I think they are (almost) always necessary: ..."

I wasn't trying to imply anything as strong as "should avoid"; rather, my comments were intended to convey something closer to "could avoid".

Many organisations have Perl coding standards based on PBP. These are often quite inflexible: "You must write your matches like this: m{...}msx!". On the odd occasion that I've been faced with this, especially for short-term contracts, I just take the pragmatic approach and do it. Unfortunately, many of the programmers have no idea why they're doing this: I consider this to be a real problem. So, use all of those modifiers if your pay packet relies on it, but understand what they do and which are really required for the code being written.

I think we're pretty much on the same page with /x, so I'll say no more about that.

We definitely seem to be at odds over /m and /s. Perhaps it's a function of the type of data we normally process, but I rarely need them: sometimes I need one of them; I need both far less often. There's not a lot more I can say about that: "(almost) always necessary" is not my experience.
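
For anyone unsure what the two modifiers actually change, here is a minimal sketch. The sample string and variable names are made up for illustration. /m only affects ^ and $, and /s only affects the dot, so a pattern that uses none of those behaves the same with or without them.

```perl
use strict;
use warnings;

# Illustrative data only: two lines in one string.
my $text = "alpha\nbeta\n";

# Without /m, ^ anchors only at the very start of the string.
my $caret_plain = () = $text =~ /^\w+/g;

# With /m, ^ also anchors after each embedded newline.
my $caret_multi = () = $text =~ /^\w+/mg;

# Without /s, the dot will not match a newline.
my $dot_plain = $text =~ /alpha.beta/ ? 1 : 0;

# With /s, the dot matches any character, newline included.
my $dot_single = $text =~ /alpha.beta/s ? 1 : 0;

# A pattern with no ^, $ or dot is unaffected by either modifier.
my $digits_plain = "room 1234" =~ /\d{3,4}/   ? 1 : 0;
my $digits_ms    = "room 1234" =~ /\d{3,4}/ms ? 1 : 0;

print "caret: $caret_plain vs $caret_multi\n";
print "dot: $dot_plain vs $dot_single\n";
print "digits: $digits_plain vs $digits_ms\n";
```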

Using the qr{(?mods:...)} form over the qr{...}mods form is something of a personal preference; I've only been using it for a year or two. The latter form applies the modifiers to the whole pattern: you can't get finer control such as qr{(?mo:...)(?ds:...)} or qr{(?mo:...(?ds:...)...)}. Having said that, my requirements for such fine control are exceptionally limited, and I really have no strong feelings regarding which form people choose to use. I don't find your argument against qr{(?mods:...)}, that it invites typos, particularly compelling: I'm far more likely to not release the Shift key quickly enough and terminate a statement with a colon (and that can be a much harder bug to track down).
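
As a rough sketch of the finer control mentioned above (the patterns and test strings are invented for the example), a scoped modifier can switch on /s for a single dot, while a trailing modifier changes every dot in the pattern:

```perl
use strict;
use warnings;

# Trailing form: /s applies to both dots.
my $whole  = qr{a.b.}s;

# Scoped form: only the first dot may match a newline.
my $scoped = qr{a(?s:.)b.};

my $one_newline  = "a\nbc";   # newline only where the first dot sits
my $two_newlines = "a\nb\n";  # newline under both dots

my $whole_one   = $one_newline  =~ $whole  ? 1 : 0;
my $whole_two   = $two_newlines =~ $whole  ? 1 : 0;
my $scoped_one  = $one_newline  =~ $scoped ? 1 : 0;
my $scoped_two  = $two_newlines =~ $scoped ? 1 : 0;

print "whole:  $whole_one $whole_two\n";
print "scoped: $scoped_one $scoped_two\n";
```

Both forms accept the first string; only the trailing form accepts the second, because the scoped form leaves the final dot with its default behaviour.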

Whether or not it's a good idea to include captures in qr// is a matter of context: hardly something to be "assiduously" avoided. Where it's used like I did (s/$re/.../), there's no problem. The issue with the OP code was capturing the entire match (s/($re)/.../) when only part of the match was wanted in $1.
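
The difference can be seen in a small sketch. The pattern, the "ID:" prefix and the input string here are hypothetical, not the OP's actual code. When the qr// is used directly, $1 is the group defined inside it; wrapping it in another pair of parentheses makes $1 the entire match and pushes the inner group to $2.

```perl
use strict;
use warnings;

# Hypothetical pattern: only the digits are captured, not the "ID:" prefix.
my $re = qr{ID:(\d{3,4})\b};

my $input = 'ID:123 ID:4567 ID:12345';

# Used directly: $1 is the capture group inside $re.
(my $direct = $input) =~ s/$re/[$1]/g;

# Wrapped in an extra group: $1 is now the whole match ("ID:" included).
(my $wrapped = $input) =~ s/($re)/[$1]/g;

print "$direct\n";
print "$wrapped\n";
```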

— Ken


In reply to Re^2: Recognizing 3 and 4 digit number by kcott
in thread Recognizing 3 and 4 digit number by htmanning
