in reply to Re: example of 'm / / m' related example and compare to 'm / / s'
in thread PERL regex modifiers for m//

I often get confused as to which is which myself.

This is the point of TheDamian's injunctions in Perl Best Practices (PBP) to always use the /m and /s (and /x) modifiers (BPs 148 and 151 – and 147) in every match and regex object definition: that way . ^ $ always behave the same way, and confusion is at least reduced if not eliminated.
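
As a rough illustration of that convention (a minimal sketch, not taken from PBP itself; the sample string and variable names are made up for the example), every regex carries /xms, and \A and \z are used wherever the boundaries of the whole string are meant:

```perl
use strict;
use warnings;

# Hypothetical sample data for the sketch.
my $text = "first line\nsecond line\n";

# With /xms on every regex the metacharacters keep one fixed meaning:
#   ^ and $ anchor at line boundaries, . matches any character
#   (newline included), and \A / \z anchor at the string boundaries.
my @line_starts = $text =~ m{ ^ (\w+) }xmsg;     # first word of each line
my ($whole)     = $text =~ m{ \A (.*) \z }xms;   # the entire string

print "line starts: @line_starts\n";
print "whole string captured\n" if defined $whole && $whole eq $text;
```

Because /x is also on, the whitespace inside the patterns is ignored and serves only to make them readable.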


Re^3: example of 'm / / m' related example and compare to 'm / / s'
by BrowserUk (Patriarch) on Nov 29, 2011 at 11:38 UTC

    Hm. That's like advocating always taking a swimsuit & sunblock and a raincoat & umbrella cos it saves listening to the weather forecast. More than slightly ridiculous.

    The very reason it is hard, even for long-time Perlers with scads of frequent regex user miles, to remember which (/s /m) does what, is because they are so rarely required.

    So what could possibly be wrong with advocating their use at all times?

    For a start, you're crying wolf. By using them everywhere they become the norm, and after a while people stop asking themselves why is he using that here. And that is bad.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Hm. That's like advocating always taking a swimsuit & sunblock and a raincoat & umbrella cos it saves listening to the weather forecast.

      /m and /s don't add functionality. They modify functionality.

      So using /ms is more like rewiring your oddly designed radio so that the tuning dial and the volume control actually work as you expect...thereby—for example—enabling you to successfully listen to weather forecasts.

      The very reason it is hard, even for long-time Perlers with scads of frequent regex user miles, to remember which (/s /m) does what, is because they are so rarely required.

      I'd argue that they are often required, just rarely used correctly.

      In my experience, matching start- and end-of-line is far more commonly needed than matching start- and end-of-string. The default behaviour is wrong practically every time anyone has to deal with multi-line data.
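
To make the difference concrete, here is a small sketch (the log text is invented for illustration) of how ^ and $ behave on multi-line data with and without /m:

```perl
use strict;
use warnings;

# Hypothetical multi-line input.
my $log = "INFO start\nERROR disk full\nINFO done\n";

# Default: ^ anchors only at the very start of the string, so this fails.
my $default_hit = $log =~ /^ERROR/ ? 1 : 0;

# With /m: ^ also anchors just after each embedded newline, so this succeeds.
my $multiline_hit = $log =~ /^ERROR/m ? 1 : 0;

# With /m, $ matches before every newline, so each line's last word is found.
my @last_words = $log =~ /(\w+)$/mg;

print "default: $default_hit, with /m: $multiline_hit\n";
print "last words: @last_words\n";
```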

      Likewise, the vast majority of .* instances I see in deployed code are being used as "match anything", which they don't do: without /s, the dot will not match a newline.
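
A short sketch of that pitfall, assuming a made-up BEGIN/END block that spans several lines:

```perl
use strict;
use warnings;

# Hypothetical input: the text of interest spans several lines.
my $doc = "BEGIN\nline one\nline two\nEND";

# Without /s the dot refuses to match a newline, so .* cannot cross
# the line breaks and the match as a whole fails.
my ($without_s) = $doc =~ /BEGIN(.*)END/;

# With /s the dot matches any character at all, newlines included.
my ($with_s) = $doc =~ /BEGIN(.*)END/s;

print "without /s: ", (defined $without_s ? "matched" : "no match"), "\n";
print "with /s:    ", (defined $with_s    ? "matched" : "no match"), "\n";
```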

      BTW, it's easy to remember which is which: /s alters the behaviour of a single metacharacter (.) whereas /m alters the behaviour of multiple metacharacters (^ and $).
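
The mnemonic can be checked with a sketch like the following (the two-line string is just an example), which applies each modifier on its own to show that /s leaves ^ and $ alone and /m leaves . alone:

```perl
use strict;
use warnings;

# Hypothetical two-line string.
my $str = "one\ntwo";

# /s only: . now matches the newline, but ^ still anchors only at string start.
my $s_dot   = $str =~ /one.two/s ? 1 : 0;
my $s_caret = $str =~ /^two/s    ? 1 : 0;

# /m only: ^ now anchors after the newline, but . still refuses to match it.
my $m_caret = $str =~ /^two/m    ? 1 : 0;
my $m_dot   = $str =~ /one.two/m ? 1 : 0;

print "/s: dot=$s_dot caret=$s_caret\n";
print "/m: dot=$m_dot caret=$m_caret\n";
```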

      By using them everywhere they become the norm

      Yes, that's precisely the point. The modified behaviours they provide should have been the norm from the start.

      after a while people stop asking themselves why is he using that here. And that is bad.

      Except that using them everywhere actually makes regexes work the way most people mistakenly think they already work. So even if they don't ask themselves why, they still get the "expected" behaviour.

      In other words, using /ms consistently on regexes makes the (idiosyncratic) behaviour of regexes conform to people's (reasonable) expectations, rather than vice versa. It's a simple technique that fixes an infelicity in Perl 5. And that's why PBP recommends it.

      Damian

        Sorry, but I respectfully disagree.

        In my experience, matching start- and end-of-line is far more commonly needed than matching start- and end-of-string. The default behaviour is wrong practically every time anyone has to deal with multi-line data.

        I'll bet you 100 hours of my time on any (on-line accessible) project of your choosing, that if we do a survey of the regex uses on this site, not only will most of them be targeted at single line strings, an overwhelming majority will be targeted at single line strings.

        For the sake of putting a number on "overwhelming", let's say 10 single line uses to every one multi-line. I'd probably be quite happy to go to 20 to 1 if it would sway you into accepting the bet.

        You might find a slightly reduced ratio if you searched CPAN, but I doubt it would be by much.

        And once you squash the idea that matching against multi-line strings is the norm, giving away the heads-up that seeing those options explicitly stated should give the programmer, in favour of cargo-culting a 'throw it all in there cos it probably won't cause any problems' mandate, is a really bad idea in my book. Adopting that mandate in preference to asking the programmer to look up the documentation when they need it is dangerous.

        Every time educationalists have tried to "simplify the learning process", by dumbing down, it has increased the pass rate but also wholly devalued it. There's no point in having more people pass if they don't understand how to apply what they've learnt.
