in reply to Multiline regex
G'day harangzsolt33,
"Is it possible to "break up" a regex so that it spans multiple lines?"
The /x and /xx modifiers exist for this purpose. See "perlre: Modifiers: /x and /xx". There are quite a few gotchas associated with these.
In http://wzsn.net/perl/index.html, you say you're using "TinyPerl 5.8". I'm not familiar with this, but I'll assume it's a cut-down version of the standard "Perl 5.8"; I don't know what features or support it modifies or excludes. The following version notes refer to "Perl 5.8"; you'll need to adjust for any "TinyPerl 5.8" limitations. In particular, /xx was added in Perl 5.26, so it is not available in 5.8; the /r modifier (5.14) and say with -E (5.10), both used in the demonstration further down, are also later additions. (Perl v5.8.0 was released over 20 years ago, see perlhist; with such an old version you're missing out on many features, bug and security fixes, and optimisations; an upgrade is recommended.)
I personally find the /x modifier to be very helpful, particularly with respect to improved readability, and use it often (except for the simplest regexes). On the other hand, I'm not convinced that /xx offers equivalent enhancements; making changes can, on occasion, be tricky. Of course, those are my preferences, not recommendations; make your own choices.
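To make the basic idea concrete, here is a minimal sketch of a regex spread over several lines with /x. The date format and the sub name parse_date are purely illustrative assumptions, not something from the original question. It avoids /xx, /r and say, so it should not depend on a recent Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: split an ISO-style date into its parts.
# Under /x, unescaped whitespace in the pattern is ignored and
# "#" starts a comment that runs to the end of the line.
sub parse_date {
    my ($string) = @_;
    my @parts = $string =~ m{
        \A
        (\d{4})     # year
        -
        (\d{2})     # month
        -
        (\d{2})     # day
        \z
    }x;
    return @parts;  # empty list if the match failed
}

my @parts = parse_date('2022-12-18');
print "year=$parts[0] month=$parts[1] day=$parts[2]\n" if @parts;
```

One thing to watch with this layout: the comments inside the pattern must not contain the closing delimiter, because Perl finds the end of the pattern before it looks at /x comments.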
When using either /x or /xx, you need to be mindful of whitespace and hash characters. Here's a non-exhaustive demonstration of some of the similarities and differences:
$ perl -E '
    my $x = qq{ A B\tC # comment};

    say q{Original string: |}, $x, q{|};

    say q{s/\s//g: |}, $x =~ s/\s//gr, q{|};
    say q{s/\s//gx: |}, $x =~ s/\s//grx, q{|};
    say q{s/\s//gxx: |}, $x =~ s/\s//grxx, q{|};

    say q{s/ //g: |}, $x =~ s/ //gr, q{|};
    say q{s/ //gx: |}, $x =~ s/ //grx, q{|};
    say q{s/ //gxx: |}, $x =~ s/ //grxx, q{|};

    say q{s/[ ]//g: |}, $x =~ s/[ ]//gr, q{|};
    say q{s/[ ]//gx: |}, $x =~ s/[ ]//grx, q{|};
    #say q{s/[ ]//gxx: |}, $x =~ s/[ ]//grxx, q{|};
    say q{s/[\ ]//gxx: |}, $x =~ s/[\ ]//grxx, q{|};

    say q{s/#//g: |}, $x =~ s/#//gr, q{|};
    say q{s/#//gx: |}, $x =~ s/#//grx, q{|};
    say q{s/#//gxx: |}, $x =~ s/#//grxx, q{|};

    say q{s/[ #]//g: |}, $x =~ s/[ #]//gr, q{|};
    say q{s/[ #]//gx: |}, $x =~ s/[ #]//grx, q{|};
    say q{s/[ #]//gxx: |}, $x =~ s/[ #]//grxx, q{|};
'
Original string: | A B C # comment|
s/\s//g: |ABC#comment|
s/\s//gx: |ABC#comment|
s/\s//gxx: |ABC#comment|
s/ //g: |AB C#comment|
s/ //gx: | A B C # comment|
s/ //gxx: | A B C # comment|
s/[ ]//g: |AB C#comment|
s/[ ]//gx: |AB C#comment|
s/[\ ]//gxx: |AB C#comment|
s/#//g: | A B C comment|
s/#//gx: | A B C # comment|
s/#//gxx: | A B C # comment|
s/[ #]//g: |AB Ccomment|
s/[ #]//gx: |AB Ccomment|
s/[ #]//gxx: | A B C comment|
If I uncomment line 16 (s/[ ]//gxx), I get a single line of output:
Unmatched [ in regex; marked by <-- HERE in m/[ <-- HERE ]/ at -e line 16.
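Line 17, with s/[\ ]//gxx, fixes this. It also demonstrates one of the traps for the unwary: under /xx, unescaped spaces and tabs inside a bracketed character class are ignored, so [ ] is seen as [], which Perl treats as the start of an unterminated class.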
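As a rough sketch of the usual ways around the whitespace and hash gotchas under plain /x (not /xx), the following removes the first space-then-hash pair from a string in three equivalent ways. The sub name strip_space_hash and the sample string are illustrative assumptions only.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the first " #" (space then hash)
# using three different spellings of the same /x pattern.
sub strip_space_hash {
    my ($text) = @_;

    (my $escaped = $text) =~ s/ \  \# //x;        # backslash escapes
    (my $classed = $text) =~ s/ [ ] [#] //x;      # bracketed classes (fine under /x, not /xx)
    (my $hexed   = $text) =~ s/ \x20 \x23 //x;    # hex escapes

    return ($escaped, $classed, $hexed);
}

print "|$_|\n" for strip_space_hash('A B # C');
```

All three substitutions should give the same result. If you later switch to /xx, only the backslash and hex forms remain safe inside a bracketed character class.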
As well as adding modifiers to the end of m// and s///, you can also embed them in "Extended Patterns". I find this is handy when used with qr//:
my $re = qr{(?x:
    ... multiline regex pattern here ...
)};
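Here is one way that might look in practice: a small commented building block compiled with qr{(?x: ...)} and then interpolated into a larger pattern. The names $octet, $ipv4 and is_ipv4 are hypothetical, chosen only for illustration; treat this as a sketch rather than a vetted IPv4 validator.

```perl
use strict;
use warnings;

# Hypothetical building block: one decimal octet, 0 to 255.
# The embedded (?x: ... ) turns on extended syntax for this group only.
my $octet = qr{(?x:
    25[0-5]         # 250-255
  | 2[0-4][0-9]     # 200-249
  | 1[0-9][0-9]     # 100-199
  | [1-9]?[0-9]     # 0-99
)};

# Reuse the building block: four octets separated by literal dots.
my $ipv4 = qr{(?x: \A $octet (?: \. $octet ){3} \z )};

# Hypothetical helper returning 1 for a match, 0 otherwise.
sub is_ipv4 {
    my ($string) = @_;
    return $string =~ $ipv4 ? 1 : 0;
}

print "$_: ", is_ipv4($_), "\n" for '192.168.0.1', '256.1.1.1';
```

Because the modifier is embedded in the pattern itself, each qr// object carries its own /x setting with it, so it behaves the same wherever it is interpolated.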
Examples of usage crop up here fairly often. A couple of my most recent offerings: "a fairly simple example"; "a more involved, and fully commented, example". And, from many years ago, "a very long and complex example, using qr{...}msx".
— Ken