You need to capture exactly three digits that are preceded by a space and followed by a non-digit (or, possibly, the end of the string). That space must itself be preceded by 'APC', 'APCs', ',' (comma) or 'and', which you can specify as an alternation of look-behinds. A look-behind has to be of a fixed length, which is why I use an alternation of four look-behinds rather than one look-behind containing four alternatives of differing lengths.

use strict;
use warnings;

# Sample text to search.
my $text = <<'TEXT';
Those APCs are APC 282, 376, 377 and 398. The APC assignments are also shown in attachment K1. In the Final Rule, we indicated that clinical characteristics and expected resource use. Procedures are sufficiently similar to those other procedures assigned to APC 282, 376, 377, and 398, and that we believe those APC assignments were appropriate. Specifically APCs 662 and APC 282. As shown in attachment K3 under option number 1, to be placed in APC 662. Our data analysis shows that combining services currently assigned to APC 662 would result in an APC median cost of about 302. The 6 CPT-Codes that would go into APC 662 are: CPT-Codes 0145T through 0150T. The two other cardiac CT codes, specifically 0144T and 0151T would be assigned to APC 282. The inclusion of the two codes into APC 282 would result in...
TEXT

# Four fixed-length look-behinds, then whitespace, three captured
# digits, and finally a non-digit or the end of the string.
my $rxExtract = qr
    {(?x)
        (?:
            (?<=APC) |
            (?<=APCs) |
            (?<=,) |
            (?<=and)
        )
        \s(\d{3})(?:\D|\z)
    };

# In list context, m//g returns the captures from every match.
my @extracts = $text =~ m{$rxExtract}g;

print qq{Match $_: $extracts[$_]\n} for 0 .. $#extracts;

The output is

Match 0: 282
Match 1: 376
Match 2: 377
Match 3: 398
Match 4: 282
Match 5: 376
Match 6: 377
Match 7: 398
Match 8: 662
Match 9: 282
Match 10: 662
Match 11: 662
Match 12: 662
Match 13: 282
Match 14: 282
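
If the fixed-length restriction on look-behinds gets in the way, one alternative (not what the code above does) is the \K escape, available from Perl 5.10. It drops everything matched so far from the reported match, so the prefix can be an ordinary variable-length alternation. The following is only a minimal sketch: $sample is a shortened, made-up stand-in for the real text, and $rxKeep and @kept are names chosen purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample, a shortened stand-in for the real text.
my $sample =
    'APC 282, 376, 377 and 398. APCs 662 and APC 282. About 302, and 0151T.';

# The prefix is matched normally, then \K discards it from the result.
# (?!\d) rejects a fourth digit without consuming the next character,
# so a following comma is still available as the next prefix.
my $rxKeep = qr{(?x) (?: APCs? | , | and ) \s \K \d{3} (?!\d) };

# With no capture groups, m//g in list context returns each whole match,
# which here is only the part after \K.
my @kept = $sample =~ m{$rxKeep}g;

print qq{@kept\n};
```

On this sample it prints 282 376 377 398 662 282, skipping both '302' (wrong prefix) and '0151T' (four digits).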

I hope this is of use.

Cheers,

JohnGG


In reply to Re: regexp match repetition breaks in Perl by johngg
in thread regexp match repetition breaks in Perl by barkingdoggy
