You've found an issue that bites many programmers (including me). At first glance, you would expect the star, being greedy, to slurp up everything. However, when you have an alternation, the regex engine takes the first alternative that leads to a successful match. Thus, (ab)* successfully matches nothing, and the regex is satisfied without b* ever being tried. If you swap the order of (ab)* and b*, then b*, being greedy, matches the "b". Try the following code:
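use strict;
use warnings;

my $text = 'ab';
# (?:ab)* is tried first; it matches the empty string, so b* is never tried.
if ($text =~ /(a*)((?:ab)*|b*)/) {
    print "'$1', '$2'\n";
}
# b* is tried first; being greedy, it grabs the "b".
if ($text =~ /(a*)(b*|(?:ab)*)/) {
    print "'$1', '$2'\n";
}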
The output is as follows:
'a', ''
'a', 'b'
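To see that the engine really does move on to the second alternative when the first one makes the overall match fail, here is a small sketch. The `^` and `$` anchors and the helper name `capture_pair` are my own additions for illustration, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two captures, or an empty list on failure.
sub capture_pair {
    my ($text, $re) = @_;
    return $text =~ $re ? ($1, $2) : ();
}

# Unanchored: (?:ab)* matches the empty string and the regex is done.
my @loose = capture_pair('ab', qr/(a*)((?:ab)*|b*)/);

# Anchored: the empty match leaves "b" unconsumed, so $ fails and the
# engine backtracks into the b* alternative.
my @tight = capture_pair('ab', qr/^(a*)((?:ab)*|b*)$/);

print "'$loose[0]', '$loose[1]'\n";
print "'$tight[0]', '$tight[1]'\n";
```

So the alternation is not "first alternative or nothing": it is the first alternative that lets the whole pattern succeed. Without the anchor, an empty match is already a success.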
Incidentally, Perl uses a traditional NFA engine for regex matching. If it used the POSIX-NFA engine or a DFA engine, your regex would work as you expect because those engines try to find the longest match that satisfies the regex. If you have experience with those engines, Perl may cause you some confusion.
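If you want something like "longest alternative wins" in Perl, one rough way is to try each alternative separately and keep the attempt that consumed the most text. This is only a sketch under that assumption, not an emulation of any POSIX engine; `longest_pair` is a hypothetical name, and it relies on the leading a* so that every attempt starts matching at the same position.

```perl
use strict;
use warnings;

# Hypothetical helper: match each alternative on its own after (a*),
# then keep whichever attempt consumed the most text overall.
sub longest_pair {
    my ($text, @alternatives) = @_;
    my @best;
    my $best_len = -1;
    for my $alt (@alternatives) {
        next unless $text =~ /(a*)($alt)/;
        my $len = length($1) + length($2);
        # Strictly greater, so ties go to the earlier alternative.
        if ($len > $best_len) {
            @best     = ($1, $2);
            $best_len = $len;
        }
    }
    return @best;
}

my @pair = longest_pair('ab', qr/(?:ab)*/, qr/b*/);
print "'$pair[0]', '$pair[1]'\n";
```

With this approach the order in which the alternatives are passed no longer changes the result for 'ab'.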
Cheers,
Ovid
P.S. I'm glad to see you have a sense of humor about the flak you took :)
Update: Oops. dchetlin is right. I was typing too fast and didn't consider the DFA issue.
Join the Perlmonks Setiathome Group or just go to the link and check out our stats.