coldfingertips has asked for the wisdom of the Perl Monks concerning the following question:

Working on the same script as before but ran into another question/problem.

How can you set up a regex to match only X times per record? This script reads a text file containing pieces of text separated by ====\n\n, and I need each regex to match only a certain number of times per piece.

For example, running the script below can return 2 or more prices it finds but I ONLY want it to find the first one for each set. And the phone numbers can match 1 or 2 times but no more than that.

I tried wrapping it in a for ( 0 .. 1) but it errors out instead. What can be done for something like this?

    $/="====\n\n";
    open (READFROM, "$readfrom") or die "Cannot open $readfrom: $!";
    open (WRITETO, ">$writeto") or die "Cannot open $writeto: $!";

    while ( <READFROM> ) {
        chomp;
        my $price;
        for ( 0 .. 1) {
            $_ =~ m/ ( \$ (?:\d{1,3},?)+ (?:\.\d{0,2})? (?![.\d]) ) /x;
            $price = "$1";
        }
        print "$price\n";
    }

    close (WRITETO) or die "Cannot close $writeto: $!";
    close (READFROM) or die "Cannot close $readfrom: $!";

The bottom part is just a piece of the file we're reading from so you can get an idea:

    Boardman
    $157,000
    COLONIAL/1 ACRE
    WOW! We finally found the 4 bedroom home you’ve been looking for. Call our office today for details and directions for the best buy in Boardman. Appraised at $157,000. Asking ONLY $139,900
    David Realty
    330-758-8363
    330-758-8363
    ====

    Boardman
    $475
    A BARGAIN - 0 DOWN
    $475 & up/mo. 2 Bedroom, extra sharp ranch. Ready to occupy, not for rent. Easy to purchase. 2 homes available.
    7372 Oregon Trail
    7421 Siera Madre
    All Credit Considered
    Jim Rich Realty
    330-783-9300
    ====

Replies are listed 'Best First'.
Re: Specifying how many times a regex should work
by blokhead (Monsignor) on Jun 05, 2004 at 18:17 UTC
    I can see at least two things wrong. First, the inner for loop overwrites $_. Therefore your regex is matching against the strings "0" and "1" and not against the record from the input file.

    Also, without the /g modifier on the match, you will simply match the same thing twice. The /g forces the next match to start looking where the last match left off.

        while (<FH>) {
            chomp;
            my $record = $_;
            for (0 .. 1) {
                $record =~ m/ ... /xg and print "matched: $1\n";
            }
        }

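    To make the two fixes concrete, here is a minimal sketch that copies the record into a named variable and steps through matches with /g in scalar context, stopping after a fixed count. The sub name first_n_matches and the sample record are made up for illustration, and the price pattern is a simplified stand-in rather than the one from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return at most $limit
# captures of $re found in $text, scanning left to right.
sub first_n_matches {
    my ($text, $re, $limit) = @_;
    my @found;
    # In scalar context, each /g match resumes where the previous one ended.
    while (@found < $limit and $text =~ /$re/g) {
        push @found, $1;
    }
    return @found;
}

# Invented sample record; the pattern is a simplified price regex.
my $record = "Boardman \$157,000 Appraised at \$157,000. Asking ONLY \$139,900";
my @prices = first_n_matches($record, qr/(\$[\d,]+)/, 1);
print "price: $_\n" for @prices;
```

    Passing 2 instead of 1 as the limit would cover the phone-number case, where one or two matches are wanted but no more.
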
    Or you may be better off getting all matches at once by using /g in list context. Then you can just use a slice to get the ones you want, and not have to fuss around with a for loop. I think the readability is greatly improved.
        while (<FH>) {
            chomp;
            my @all_matches = m/ ... /xg;
            my @first_few = @all_matches[0 .. 1];
        }
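
    One caveat with the slice: if a record has fewer matches than the slice asks for, the missing positions come back as undef. Here is a minimal sketch of one way to cap the list without that padding, using splice. The sample record and the phone pattern are assumptions for illustration.

```perl
use strict;
use warnings;

# Invented sample record with a single phone number.
my $record = "Jim Rich Realty 330-783-9300";

# /g in list context returns every capture at once.
my @all_matches = $record =~ m/ ( \d{3}-\d{3}-\d{4} ) /xg;

# A plain array slice pads with undef when there are fewer than two matches.
my @padded = @all_matches[0 .. 1];

# splice takes up to two elements and never pads.
# Note that it also removes them from @all_matches.
my @first_few = splice @all_matches, 0, 2;

printf "slice length: %d, splice length: %d\n",
    scalar @padded, scalar @first_few;
```

    If the original list is still needed afterwards, filtering the slice with grep { defined } is an alternative that leaves @all_matches untouched.
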

    blokhead

      Thanks for your help. I was finally able to get it to work with your first example but ran into *another* problem. I spent a few hours trying different methods but nothing seems to work. The original script only saved what the regex matched to a variable, but I need the entire contents of each record stored so I can use it again.

      I'm only using regexes so I can re-sort the data. I'm trying to get phone1, phone2, price, everything else, in that order, so I really need ALL of that paragraph stored so I can apply my s///s and other regexes to it.

      Does this make any sense or am I in too deep and this won't ever work?

        I'd tackle your original problem this way. I've used simplified regexes to demonstrate the technique. They seem to work pretty well on your original sample data, but you can adapt them to your needs.

            #! perl -slw
            use strict;

            local $/= "\n====\n";

            while( <DATA> ) {
                ## Clear out residual values from global vars
                our( $no1, $no2, $price ) = ( undef ) x 3;

                tr[\n][ ]; ## Strip newlines

                ## Extract the cash value into $price
                s[ ( \$[\d,]+ (?:\.\d+)? ) (?{ $price = $^N }) ][ ]x;

                ## Grab the first 1 or 2 telephone numbers
                s[ ( [\d-]{12} ) (?{ $no1 ? $no2 = $^N : $no1 = $^N }) ][ ]xg;

                ## join them together in the requisite order and output
                print join ', ', $no1||'n/a', $no2||'n/a', $price||'n/a', $_;
            }
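
        For comparison, here is a sketch of the same reordering without embedded code blocks: each field is captured and removed by an ordinary s///, and a counter limits how many phone numbers are taken. This is a swapped-in variation, not the code above. The sub name reorder_record and the sample record are made up for illustration, and the // operator assumes perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pull the first price and up
# to two phone numbers out of a record, then return the fields in the
# order phone1, phone2, price, everything else.
sub reorder_record {
    my ($record) = @_;
    $record =~ tr/\n/ /;    # flatten the record onto one line

    # Without /g, s/// removes only the first price; $1 holds it.
    my $price = $record =~ s/ ( \$[\d,]+ (?:\.\d+)? ) / /x ? $1 : undef;

    # Remove phone numbers one at a time, stopping after two.
    my @phones;
    while (@phones < 2 and $record =~ s/ ( \b\d{3}-\d{3}-\d{4}\b ) / /x) {
        push @phones, $1;
    }

    return join ', ', $phones[0] // 'n/a', $phones[1] // 'n/a',
        $price // 'n/a', $record;
}

# Invented sample record, loosely modelled on the sample data.
print reorder_record("Boardman\n\$475\nA BARGAIN\nJim Rich Realty\n330-783-9300\n"), "\n";
```

        Because the matched fields are cut out of the record as they are found, whatever remains is the "everything else" part, and any third phone number stays in that remainder.
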

        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail