Regex troubles...

kepler has asked for the wisdom of the Perl Monks concerning the following question:

Re: Regex troubles... by choroba (Cardinal) on Apr 20, 2016 at 11:18 UTC
That's how repeated captures work. `$2` always refers to the group that starts at the second opening parenthesis, so a repetition can't populate `$3` (maybe it should return an array reference?). I'd solve this in two steps: first match the whole repetition, then split it into single expressions: `my @source = $text =~ /(regex1)((?:regex2)+)(regex3)/g; my @repeated = $source[1] =~ /.../g; # or split or whatever`
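A minimal sketch of the two-step idea, for illustration only. The sub-patterns `$head`, `$item` and `$tail` and the helper `split_repeats` are hypothetical stand-ins for regex1, regex2 and regex3, not anything from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sub-patterns standing in for regex1, regex2 and regex3.
my $head = qr/a/;
my $item = qr/b\d/;
my $tail = qr/d/;

# Step 1: capture the whole repetition in a single group.
# Step 2: pull the individual items out of that capture.
sub split_repeats {
    my ($text) = @_;
    my ($first, $repeated, $last) = $text =~ /($head)((?:$item)+)($tail)/
        or return;    # empty list when the overall pattern does not match
    my @items = $repeated =~ /$item/g;
    return ($first, \@items, $last);
}

my ($first, $items, $last) = split_repeats('ab1b2b3d');
print "$first | @$items | $last\n";    # a | b1 b2 b3 | d
```

The repeated part comes back as an array reference, so the caller still gets three values in the original order.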
Re^2: Regex troubles... by kepler (Scribe) on Apr 20, 2016 at 13:42 UTC
Hi, Thank you very much for the solution - I've been programming in Perl for some years and I must confess that I wasn't aware of this important detail of regexes that you kindly provided. Kind regards, Kepler
Re^3: Regex troubles... by choroba (Cardinal) on Apr 23, 2016 at 17:39 UTC
You can also store the matching parts in an array in a `(?{})` expression: `#!/usr/bin/perl use warnings; use strict; use feature qw{ say }; for my $string (qw( ab1b2b3d ab1b2x )) { my @two; my @matches = $string =~ /(a) (b.(?{ push @two, $2 if defined $2 }))+ (d)/x; say for @matches, '---', @two, '======'; }`
Re^4: Regex troubles... by AnomalousMonk (Archbishop) on Apr 23, 2016 at 19:24 UTC
Re^5: Regex troubles... by choroba (Cardinal) on Apr 23, 2016 at 19:28 UTC
Some notes below your chosen depth have not been shown here
Re^5: Regex troubles... by Anonymous Monk on Apr 23, 2016 at 19:56 UTC
Re: Regex troubles... by LanX (Saint) on Apr 20, 2016 at 11:22 UTC
It's possible to hack a recursive regex somehow, but it's much easier to put the parens around the quantifier and split the result again: `((?:regex2)+)` HTH :) Cheers Rolf
Re: Regex troubles... by AnomalousMonk (Archbishop) on Apr 23, 2016 at 22:50 UTC
Just for grins, here's another approach, although I would still recommend the two-step approach outlined above:
`c:\@Work\Perl>perl -wMstrict -le "use Data::Dump qw(pp); ;; print qq{Perl version: $]}; ;; my $ra = qr{ a }xms; my $rb = qr{ b. }xms; my $rc = qr{ c }xms; for my $string (qw(ab1b2b3b4c ab5b6b7c ab8b9c abxc ac b0)) { my @all = $string =~ m{ \G (?: $ra (?= $rb) | $rb (?= $rb | $rc) | $rc (?= \z) ) }xmsg; print qq{'$string' -> }, pp \@all; } "`
Output:
`Perl version: 5.008009`
`'ab1b2b3b4c' -> ["a", "b1", "b2", "b3", "b4", "c"]`
`'ab5b6b7c' -> ["a", "b5", "b6", "b7", "c"]`
`'ab8b9c' -> ["a", "b8", "b9", "c"]`
`'abxc' -> ["a", "bx", "c"]`
`'ac' -> []`
`'b0' -> []`
Note that this runs under 5.8.9.
Re: Regex troubles... by Anonymous Monk on Apr 20, 2016 at 13:50 UTC
http://ideone.com/QUfuUp
Re^2: Regex troubles... by LanX (Saint) on Apr 20, 2016 at 14:43 UTC
Please use code tags (`<code>`) when providing source here. Also please note that this regex matches more cases, because the order constraints are lost! Cheers Rolf
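To illustrate the point about order, here is a rough sketch. It assumes the linked code uses a plain alternation under `/g`, along the lines of `/(X)|(Y)|(Z)/g`. The helper names `loose_pieces` and `ordered_pieces` are made up for this example.

```perl
use strict;
use warnings;

# Alternation under /g: each piece may show up in any order, any number of times.
sub loose_pieces {
    my ($text) = @_;
    my @pieces = $text =~ /X|Y|Z/g;
    return @pieces;
}

# Order-enforcing version: X, then one or more Y, then Z, and nothing else.
sub ordered_pieces {
    my ($text) = @_;
    my ($x, $ys, $z) = $text =~ /\A(X)((?:Y)+)(Z)\z/
        or return;    # empty list when the sequence is violated
    my @ys = $ys =~ /Y/g;
    return ($x, @ys, $z);
}

for my $text (qw(XYYYZ ZYX)) {
    printf "%-5s loose=[%s] ordered=[%s]\n",
        $text,
        join(',', loose_pieces($text)),
        join(',', ordered_pieces($text));
}
# XYYYZ loose=[X,Y,Y,Y,Z] ordered=[X,Y,Y,Y,Z]
# ZYX   loose=[Z,Y,X] ordered=[]
```

The alternation happily tokenizes `ZYX`, while the anchored sequence rejects it.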
Re^2: Regex troubles... by beech (Parson) on May 04, 2016 at 07:46 UTC
Interesting, this ideone, if it really lets you run actual Perl code. Anyway, it's best in the future if you also post the code here in code tags. Here is the output on my machine: `#!/usr/bin/perl use strict; use warnings; use Data::Dumper; my $text = 'XYYYZ'; my @source; push @{$source[$1 ? 0 : $2 ? 1 : 2]}, $& while $text =~ /(X)|(Y)|(Z)/g; print Dumper \@source; __END__ $VAR1 = [ [ 'X' ], [ 'Y', 'Y', 'Y' ], [ 'Z' ] ];` update: yeah wrong window, pasted the wrong ideone, fixed
Re: Regex troubles... by polettix (Vicar) on Apr 24, 2016 at 22:16 UTC
Hack using `(?{ code })`, which is not experimental any more at least as of 5.22(.0+) judging from the docs, and was available at least as of 5.8(.8+), although I don't know how stable it has been over time: `#!/usr/bin/env perl use strict; use warnings; use English qw( -no_match_vars ); use Data::Dumper; my $string = 'foobarbaaaarbaz'; my @second; if (my @matches = $string =~ m{\A (fo*) (?: (?<BAR>\s*ba+r) (?{push @second, $^N}))+ (\s*baz) \z}mxs ) { $matches[1] = \@second; print Dumper \@matches; } # $VAR1 = [ # 'foo', # [ # 'bar', # 'baaaar' # ], # 'baz' # ];` Update: as noted below in a comment, the capture `BAR` is not needed. I added it while playing with `%-` and `%+`, and forgot to remove it before posting the example... I'm leaving it though, so that the comment below can still make sense. Anyway, always remember the old adage about regular expressions...
Re^2: Regex troubles... by AnomalousMonk (Archbishop) on Apr 25, 2016 at 07:04 UTC
`(?: (?<BAR>\s*ba+r) (?{push @second, $^N}))+` What is the advantage of using the `(?<BAR>\s*ba+r)` named capture group rather than an ordinary capture group? The name is never referred to anywhere.
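For what it's worth, a small sketch suggesting that an ordinary capture group does the same job here, since `$^N` refers to the most recently closed group whether or not it has a name. The helper `collect_repeats` and the package variable `@second` are names chosen for this example, and the `*` quantifiers in the pattern are assumed from the snippet quoted above.

```perl
use strict;
use warnings;

# Package variable rather than a lexical: an assumption made here to avoid
# closure quirks of (?{ }) code blocks inside subs on older perls.
our @second;

sub collect_repeats {
    my ($string) = @_;
    local @second = ();
    # Plain (...) instead of (?<BAR>...): $^N still sees the group just closed.
    my @matches = $string =~ m{\A (fo*) (?: (\s*ba+r) (?{ push @second, $^N }) )+ (\s*baz) \z}xms
        or return;
    $matches[1] = [@second];    # replace the last repetition with all of them
    return @matches;
}

my @result = collect_repeats('foobarbaaaarbaz');
print "$result[0] | @{ $result[1] } | $result[2]\n";    # foo | bar baaaar | baz
```

One caveat: pushes made inside `(?{ })` are not undone when the engine backtracks, so this only stays tidy for patterns that cannot retry an iteration after the code block has run.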