Here's the text it should be matching in:
    <!-- filename: full-000-body-cdl90 -->
    <tr>
      <td class="contentSmall" valign="top" id=bold width="5%" nowrap><strong> 020</strong></td>
      <td class="contentSmall" valign="top">|a 9780470086223 (hardback)</td>
    </tr>
    <!-- end: full-000-body-cdl90 -->
    <!-- filename: full-000-body-cdl90 -->
    <tr>
      <td class="contentSmall" valign="top" id=bold width="5%" nowrap><strong> 24510</strong></td>
      <td class="contentSmall" valign="top">|a Heads in the sand : |b how the Republicans screw up foreign policy and foreign policy screws up the Democrats / |c Matthew Yglesias</td>
    </tr>
    <!-- end: full-000-body-cdl90 -->
    <!-- filename: full-000-body-cdl90 -->
    <tr>
      <td class="contentSmall" valign="top" id=bold width="5%" nowrap><strong> 24610</strong></td>
      <td class="contentSmall" valign="top">|a How the Republicans screw up foreign policy and foreign policy screws up the Democrats</td>
    </tr>
    <!-- end: full-000-body-cdl90 -->
    <!-- filename: full-000-body-cdl90 -->
    <tr>
      <td class="contentSmall" valign="top" id=bold width="5%" nowrap><strong> 61020</strong></td>
      <td class="contentSmall" valign="top">|a Democratic Party (U.S.)</td>
    </tr>
    <!-- end: full-000-body-cdl90 -->

And here are the two regexes:

    if ($MARC_page =~ m{
            (?:020<)?       # MARC code followed by a bracket to identify
            .*?             # followed by anything
            \|a\s           # followed by a pipe and the subfield
            (\d{13})        # followed by a 13-digit ISBN code
        }xmgs) {
        my $isbn = $1;
    }

    if ($MARC_page =~ m{
            245\d{0,2}      # MARC code 245 followed by 0-2 indicators
            .*?             # followed by anything
            \|a\s           # followed by a pipe and the subfield
            (.*?)           # followed by the title
            \|              # followed by a pipe and the next subfield
        }xmgs) {
        my $title = $1;
    }
It works correctly now that I have rearranged the regexes so the ISBN one runs first. Before, when the ISBN regex came after the title regex, it would not match anything. I tried changing \d to just . to see where it would even land, and it matched "Democratic Pa", which would have been the next match after the point where the title regex matched. For the record, the correct matches should be "9780470086223" for the ISBN and "Heads in the sand : " for the title.
As far as I'm aware, a regex with the g flag should match globally, meaning it would ignore wherever another regex happened to stop searching. Is this not correct? If I am right, can someone tell me why I'm seeing this behavior, and how I might correct it? Thanks a lot.
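For reference, here is a minimal sketch of how the g flag behaves when the match is used as an if condition (scalar context). It is not the original script: `$page` is a hypothetical, shortened stand-in for the MARC page, and the regexes are trimmed-down versions of the two above. In scalar context a successful `m//g` records where it stopped in `pos()` for that string, and the next `m//g` on the same string resumes from there instead of from the start.

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the MARC page shown above.
my $page = '020 |a 9780470086223 (hardback) 24510 |a Heads in the sand : |b how';

my ($title, $isbn_g, $isbn_plain);

# Title match with /g in scalar context: on success, Perl stores the
# end offset of the match in pos($page).
if ($page =~ m{245\d{0,2}.*?\|a\s(.*?)\|}xgs) {
    $title = $1;
}
my $resume_at = pos($page);    # where the next /g match on $page will start

# ISBN match with /g on the same string: the search resumes at pos($page),
# which is already past the ISBN, so nothing is found.
if ($page =~ m{\|a\s(\d{13})}xgs) {
    $isbn_g = $1;
}

# The same ISBN match without /g always starts at the beginning of the string.
if ($page =~ m{\|a\s(\d{13})}xs) {
    $isbn_plain = $1;
}

print "title:           $title\n";
print "resume offset:   $resume_at\n";
print "ISBN with /g:    ", (defined $isbn_g ? $isbn_g : 'no match'), "\n";
print "ISBN without /g: ", (defined $isbn_plain ? $isbn_plain : 'no match'), "\n";
```

If that is what is happening here, dropping the g flag from both regexes (each only needs its first match) or assigning `pos($MARC_page) = 0;` between the two matches should make the order of the regexes irrelevant.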
P.S. This is just a random example book, and I don't mean to make any political statement by using it.
In reply to regex only matching from last match by Foxpond Hollow