G'day harangzsolt33,

"Is it possible to "break up" a regex so that it spans multiple lines?"

The /x and /xx modifiers exist for this purpose. See "perlre: Modifiers: /x and /xx". There are quite a few gotchas associated with these.

In http://wzsn.net/perl/index.html, you say you're using "TinyPerl 5.8". I'm not familiar with this, but I'll assume it's a cut-down version of the standard "Perl 5.8"; I don't know what features or support it modifies or excludes. The following version notes refer to "Perl 5.8"; you'll need to adjust for any "TinyPerl 5.8" limitations. (Perl v5.8.0 was released over 20 years ago, see perlhist; with such an old version you're missing out on many features, bug and security fixes, and optimisations. An upgrade is recommended.)

I personally find the /x modifier very helpful, particularly for readability, and use it often (except for the simplest regexes). On the other hand, I'm not convinced that /xx offers equivalent enhancements; making changes can, on occasion, be tricky. Of course, those are my preferences, not recommendations; make your own choices.
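To make the readability point concrete, here's a minimal sketch of a regex spread over several lines with /x. The date format, $date_re and parse_date() are hypothetical names chosen for illustration. I've avoided say, /r and /xx, which as far as I know all postdate Perl 5.8, so this should run on a standard 5.8; I haven't tried it on TinyPerl.

```perl
use strict;
use warnings;

# Hypothetical example: match an ISO-style date such as 2024-01-31.
# Under /x, unescaped whitespace and "# ..." comments inside the
# pattern are ignored, so the regex can span as many lines as needed.
my $date_re = qr/
    \A
    (\d{4})     # year
    -
    (\d{2})     # month
    -
    (\d{2})     # day
    \z
/x;

# Returns (year, month, day) on a match, or an empty list otherwise.
sub parse_date {
    my ($text) = @_;
    return $text =~ $date_re ? ($1, $2, $3) : ();
}

my @parts = parse_date('2024-01-31');
print join('/', @parts), "\n";    # prints 2024/01/31
```

One thing to watch: the pattern's delimiter still ends the pattern even inside a comment, so with qr/.../x a "/" in a comment would cut the regex short.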

When using either /x or /xx, you need to be mindful of whitespace and hash characters. Here's a non-exhaustive demonstration of some of the similarities and differences:

$ perl -E '
    my $x = qq{ A B\tC # comment};

    say q{Original string: |}, $x, q{|};

    say q{s/\s//g: |}, $x =~ s/\s//gr, q{|};
    say q{s/\s//gx: |}, $x =~ s/\s//grx, q{|};
    say q{s/\s//gxx: |}, $x =~ s/\s//grxx, q{|};

    say q{s/ //g: |}, $x =~ s/ //gr, q{|};
    say q{s/ //gx: |}, $x =~ s/ //grx, q{|};
    say q{s/ //gxx: |}, $x =~ s/ //grxx, q{|};

    say q{s/[ ]//g: |}, $x =~ s/[ ]//gr, q{|};
    say q{s/[ ]//gx: |}, $x =~ s/[ ]//grx, q{|};
    #say q{s/[ ]//gxx: |}, $x =~ s/[ ]//grxx, q{|};
    say q{s/[\ ]//gxx: |}, $x =~ s/[\ ]//grxx, q{|};

    say q{s/#//g: |}, $x =~ s/#//gr, q{|};
    say q{s/#//gx: |}, $x =~ s/#//grx, q{|};
    say q{s/#//gxx: |}, $x =~ s/#//grxx, q{|};

    say q{s/[ #]//g: |}, $x =~ s/[ #]//gr, q{|};
    say q{s/[ #]//gx: |}, $x =~ s/[ #]//grx, q{|};
    say q{s/[ #]//gxx: |}, $x =~ s/[ #]//grxx, q{|};
'
Original string: | A B C # comment|
s/\s//g: |ABC#comment|
s/\s//gx: |ABC#comment|
s/\s//gxx: |ABC#comment|
s/ //g: |AB C#comment|
s/ //gx: | A B C # comment|
s/ //gxx: | A B C # comment|
s/[ ]//g: |AB C#comment|
s/[ ]//gx: |AB C#comment|
s/[\ ]//gxx: |AB C#comment|
s/#//g: | A B C comment|
s/#//gx: | A B C # comment|
s/#//gxx: | A B C # comment|
s/[ #]//g: |AB Ccomment|
s/[ #]//gx: |AB Ccomment|
s/[ #]//gxx: | A B C comment|

If I uncomment line 16 (s/[ ]//gxx), I get a single line of output:

Unmatched [ in regex; marked by <-- HERE in m/[ <-- HERE ]/ at -e line 16.

Line 17, with s/[\ ]//gxx, fixes this. It also demonstrates one of the traps for the unwary.
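For the plain /x case, here's a small sketch of ways to spell a literal space or hash so that it survives extended mode. The sample string and variable names are made up for illustration, and I've stuck to constructs that I believe work on Perl 5.8 (no /r, no /xx).

```perl
use strict;
use warnings;

my $text = "A B#C";

# Under /x a bare space or # in the pattern is not matched literally,
# so each substitution below writes the literal in a form /x leaves alone.
(my $no_space_1 = $text) =~ s/ \  //gx;     # backslash-escaped space
(my $no_space_2 = $text) =~ s/ [ ] //gx;    # bracketed class (fine under /x, not /xx)
(my $no_space_3 = $text) =~ s/ \x20 //gx;   # hex escape for a space
(my $no_hash_1  = $text) =~ s/ \# //gx;     # backslash-escaped hash
(my $no_hash_2  = $text) =~ s/ [#] //gx;    # hash inside a bracketed class

print "$no_space_1 $no_space_2 $no_space_3\n";   # prints AB#C AB#C AB#C
print "$no_hash_1 $no_hash_2\n";                 # prints A BC A BC
```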

As well as adding modifiers to the end of m// and s///, you can also embed them in "Extended Patterns". I find this is handy when used with qr//:

my $re = qr{(?x: ... multiline regex pattern here ... )};
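As a rough illustration of that pattern, the sketch below builds a commented, multiline qr// using (?x: ... ) and then interpolates it into a larger regex that does not use /x. The key=value format, $pair_re and the sample line are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical pattern: a key=value pair such as "colour = blue".
my $pair_re = qr{(?x:
    (\w+)       # key
    \s* = \s*   # separator, with optional surrounding whitespace
    (\w+)       # value
)};

# The (?x: ... ) group keeps extended syntax local to $pair_re, so the
# enclosing pattern, which has no /x, still matches its space literally.
my $line = 'set colour = blue';
if ($line =~ /\Aset $pair_re\z/) {
    print "key=$1 value=$2\n";    # prints key=colour value=blue
}
```

With qr{...} the braces are the delimiters, so any literal braces inside the pattern (or its comments) need to be balanced or escaped.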

Examples of usage crop up here fairly often. A couple of my most recent offerings: "a fairly simple example"; "a more involved, and fully commented, example". And, from many years ago, "a very long and complex example, using qr{...}msx".

— Ken


In reply to Re: Multiline regex by kcott
in thread Multiline regex by harangzsolt33
