/([0-9]{2,})(?(?{index("0123456789", $1) == -1})(*FAIL))/

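It is not clear (at least, not to me) from the OP whether the consecutive sub-string in a string like '91239' should be matched or not, and asif has yet to clarify this point. The regex above will match '123' in '91239': when the whole digit run fails the test, the regex backtracks to shorter runs and then retries from later start points until it finds a consecutive piece.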

Not having much experience with the newfangled backtracking control verbs, I wanted, as an exercise, to come up with a version of JavaFan's regex that would only accept 'strictly' consecutive digit strings. Using non-digit look-arounds before and after the regex did the trick, but was not very enlightening about backtracking verbs.
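For the record, here is roughly what the look-around variant might look like. This is a reconstruction, not the exact code I ran: the placement of the look-arounds and the use of $^N are assumptions. The non-digit look-arounds force the capture to span a whole digit run, so a failed run cannot be rescued by backtracking to a shorter piece or by restarting in the middle of it.

```perl
use strict;
use warnings;

my $str = 'a1a11a9129a912a129a112a122a34a345a';

# Assumed reconstruction of the look-around approach (not from the thread).
my @whole_runs = $str =~ m{
    (?<! \d )        # no digit immediately before the run
    ( \d{2,} )       # a run of two or more digits
    (?! \d )         # no digit immediately after the run
    (?(?{ index('0123456789', $^N) == -1 }) (*FAIL) )   # reject non-consecutive runs
}xmsg;

print join(' ', map { qq{'$_'} } @whole_runs), "\n";
# prints: '34' '345'
```

It works, but (*FAIL) is the only verb involved, which is why it taught me nothing about the others.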

I spent some time trying to use possessive matching and capturing in conjunction with (*SKIP) and (*PRUNE) and (*FAIL) combinations, but without success. It slowly dawned on me that the possessiveness of possessive matching does not affect the start-point of a match, but only the potential end-point and backtracking therefrom. If an otherwise-successful possessive match is forced to fail by (*FAIL), all that happens is that the match start point advances one character and the regex tries again. What I wanted to do was to skip (hint, hint) entirely over a sequence of digits if they failed the test of consecutiveness.
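A small sketch of the possessive dead end, for anyone who wants to see it. This is an illustration I put together after the fact, not one of my actual attempts. The possessive quantifier stops the regex from giving back digits at the end of a run, but after (*FAIL) the start point still creeps forward one character at a time, so a consecutive tail of a rejected run can still match.

```perl
use strict;
use warnings;

my $str = 'a1a11a9129a912a129a112a122a34a345a';

# Possessive \d{2,}+ never gives digits back, so '912' cannot shrink to '91'.
# The start point, however, still advances by one character after each failure.
my @possessive = $str =~ m{
    ( \d{2,}+ )
    (?(?{ index('0123456789', $^N) == -1 }) (*FAIL) )
}xmsg;

print join(' ', map { qq{'$_'} } @possessive), "\n";
# prints: '12' '12' '34' '345'
```

The two '12's come from the tails of '912' and '112'. Nothing is extracted from '9129', '129' or '122' because every tail of those runs also fails the test.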

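After considerable staring at Special Backtracking Control Verbs in the FM (perlre), I finally realized that (*SKIP) did indeed control the start-point of a match just as the documentation and the specific example promised: when the regex backtracks into (*SKIP), the match fails at the current start-point and the next attempt begins at the position where (*SKIP) was executed.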

Here's my (very simple) modification to add 'strictness' to the matching. Take out the (*SKIP) verb from $skip_if_not_consecutive and a '12' will also be extracted from each of '9129', '912', '129', '112' and '122'.

>perl -wMstrict -le
"my $skip_if_not_consecutive = qr{
   (?(?{index('0123456789', $^N) == -1}) (*SKIP) (*FAIL))
   }xms;
 ;;
 my $digits = qr{ \d{2,} }xms;
 ;;
 my $str = 'a1a11a9129a912a129a112a122a34a345a';
 my @cons = $str =~
   m{ ($digits) $skip_if_not_consecutive }xmsg
   ;
 ;;
 my $q_cons = join ' ', map { qq{'$_'} } @cons;
 print qq{'$str'};
 print qq{  $q_cons};
"
'a1a11a9129a912a129a112a122a34a345a'
  '34' '345'
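To check the claim about taking out (*SKIP), here is the same match run both ways as a stand-alone script. It is only a sketch: the patterns are written inline rather than interpolated from qr// objects, and the array names are made up for the comparison.

```perl
use strict;
use warnings;

my $str = 'a1a11a9129a912a129a112a122a34a345a';

# With (*SKIP): a rejected digit run is skipped over in its entirety.
my @with_skip = $str =~ m{
    ( \d{2,} )
    (?(?{ index('0123456789', $^N) == -1 }) (*SKIP) (*FAIL) )
}xmsg;

# Without (*SKIP): backtracking and start-point creep find '12' inside rejected runs.
my @without_skip = $str =~ m{
    ( \d{2,} )
    (?(?{ index('0123456789', $^N) == -1 }) (*FAIL) )
}xmsg;

print 'with (*SKIP):    ', join(' ', map { qq{'$_'} } @with_skip),    "\n";
print 'without (*SKIP): ', join(' ', map { qq{'$_'} } @without_skip), "\n";
# prints:
# with (*SKIP):    '34' '345'
# without (*SKIP): '12' '12' '12' '12' '12' '34' '345'
```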

Learned something today.

In reply to Re^2: Regex help pls by AnomalousMonk
in thread Regex help pls by asif
