Before lookbehind assertions were invented, there already existed code to calculate the minimum and maximum possible match lengths of a subpattern. The optimiser uses it, e.g. to bail out before starting the match when the target string is too short. When support for lookbehind was added, the restriction was implemented in the simplest possible way: check that the minimum and maximum possible match lengths are the same.
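As a rough illustration of that check, the sketch below compiles patterns at run time and traps any failure. The helper name compiles is made up for this post. Newer perls relax the rule for some bounded variable-length lookbehinds, so the two cases here are ones I would expect to behave the same on old and new versions alike.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the pattern compiles, 0 otherwise.
# The pattern is passed as a string so that it is compiled at run time,
# where eval can trap the fatal error.
sub compiles {
    my ($pat) = @_;
    my $re = eval { qr/$pat/ };
    return defined $re ? 1 : 0;
}

# Both branches are two characters long: min == max, so this is accepted.
print "fixed length: ", compiles('(?<=ab|cd)e'), "\n";

# a+ has no upper bound on its length, so this is rejected.
print "unbounded:    ", compiles('(?<=a+)b'), "\n";
```

The text of the error left in $@ differs between perl versions, so the sketch only looks at whether compilation succeeded.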

The restriction could be relaxed in several ways with slightly more intelligent supporting code. Your example of alternation is one such case, and backreferences are another (the length of the captured text is always known by the time the lookbehind needs it):

"aabbaababb" =~ /(a+).*(?<=\1)b/; # should match "aabbaab"
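Until something like that is supported, the same effect can be approximated in two steps: capture first, then interpolate the captured text into a second pattern, where it is an ordinary literal of known length. This is only a sketch. The helper name match_backref_lookbehind is made up, and it tries just the first, greedy (a+) capture, whereas a real engine would also backtrack into shorter captures and later starting points.

```perl
use strict;
use warnings;

# Hypothetical helper approximating /(a+).*(?<=\1)b/ in two steps.
sub match_backref_lookbehind {
    my ($str) = @_;

    # Step 1: find the capture on its own and remember where it sits.
    return unless $str =~ /(a+)/;
    my ($start, $after, $cap) = ($-[0], $+[0], $1);

    # Step 2: the captured text is now a known literal, so a lookbehind
    # built from it has a fixed length. \G resumes right after the capture,
    # while the lookbehind can still see the characters before that point.
    pos($str) = $after;
    return unless $str =~ /\G.*(?<=\Q$cap\E)b/g;

    return substr($str, $start, $+[0] - $start);
}

print match_backref_lookbehind("aabbaababb"), "\n";   # prints "aabbaab"
```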

The regular expression engine is due for a bit of an overhaul during the development of perl-5.10.0, and we may see some improvements in this area as a part of that if they can be fitted in without slowing down patterns that don't use lookbehinds.

For your example, prepending dots may be inaccurate if the non-digits could be near the beginning of the string. I'd suggest moving the alternation outwards instead:

(?:(?<=199\d|200\d)|(?<=\D))
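Here is a small comparison of the two approaches. The trailing X is just a stand-in for whatever follows the lookbehind in the real pattern, and the dotted version is my guess at what "prepending dots" would look like, so treat both as assumptions.

```perl
use strict;
use warnings;

# Alternation moved outwards: each lookbehind has its own fixed length.
my $split  = qr/(?:(?<=199\d|200\d)|(?<=\D))X/;

# Short branch padded with dots so every branch is four characters long.
my $padded = qr/(?<=199\d|200\d|...\D)X/;

for my $s ('1995X', '2004X', '1985X', 'zzzaX', 'aX') {
    printf "%-6s split=%d padded=%d\n",
        $s, ($s =~ $split ? 1 : 0), ($s =~ $padded ? 1 : 0);
}
```

The two patterns agree on every input except 'aX': the padded version needs four characters before the X and only one is available, which is the beginning-of-string problem mentioned above.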

Hugo

In reply to Re: Not-really variable length lookbehind by hv
in thread Not-really variable length lookbehind by diotalevi
