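I think one of the problems is the greedy .+ at the start of the regex: it consumes as much of the string as it can, so the alternation only gets to match near the end. It would be better to leave it off or make it non-greedy. Then simply sorting the alternatives by decreasing length makes it work again: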

>perl -wMstrict -lE "say /^.+?(s m e f|s f pl|s f).+$/ for '-s m e f s + f pl s f-';"
s m e f

Re^3: match longest sequence in an "or" RE
by AnomalousMonk (Archbishop) on Sep 24, 2011 at 19:54 UTC

    It's necessary to leave off the first .+ entirely and sort all the captured alternations by decreasing length (or use some similar approach). Just making the first .+ non-greedy is not sufficient to find the longest match anywhere in the string, because the regex engine still returns the leftmost match:

    >perl -wMstrict -lE "say q{'}, /^.+?(s m e f|s f pl|s f).+$/, q{' at offset }, $-[1] for '-s f s f pl s m e f-'; "
    's f' at offset 1
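
One way to read the "leave off the .+ and sort the captures" suggestion is sketched below. This is only an illustration under assumptions: the helper name longest_match is hypothetical and does not come from the thread, and it assumes the alternatives are literal strings (hence quotemeta). It collects every match with /g and then keeps the longest capture.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the longest of the
# given literal alternatives that occurs anywhere in $str.
sub longest_match {
    my ($str, @alternatives) = @_;

    # Longest alternative first, so that at any one position the
    # longest candidate is tried before its shorter prefixes.
    my $alt = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @alternatives;

    # No leading .+ : in list context /g returns every capture in the
    # string; sort those captures by decreasing length and keep the first.
    my @found = sort { length($b) <=> length($a) } $str =~ /($alt)/g;
    return $found[0];    # undef if nothing matched
}

print longest_match('-s f s f pl s m e f-', 's f', 's f pl', 's m e f'), "\n";
```

Note that /g only collects non-overlapping matches, so this may not be enough if the alternatives can overlap one another in the input.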