aquarium has asked for the wisdom of the Perl Monks concerning the following question:

The code below looks for lines starting with .949. and is supposed to capture all the pairs of |i barcode and |l homeloc, using the g modifier. Only it doesn't do that; it prints only the very first occurrence of the pair. Please help.
while ($line = <>) {
    chomp $line;
    if ($line =~ /^\.949\./) {
        while ($line =~ /\|i([^|]*).*\|l([^|]*)/g) {
            ($barcode, $homeloc) = ($1, $2);
            print "$barcode|$homeloc|\n";
        }
    }
}

Re: execute loop code for every occurence of regex
by Eimi Metamorphoumai (Deacon) on Sep 08, 2004 at 14:51 UTC
    I think the problem is your .*. If you look closer, I think you'll find it's finding the first |i, but the last |l. .* tries to match as much text as it can, so it scoops up all your other codes. If you use the non-greedy .*? it should do what you want.
    while($line =~ /\|i([^|]*).*?\|l([^|]*)/g) {
      thank you...i'm a silly duffer...and it's important database field updates..thanks heaps.
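    For reference, here is a minimal sketch of the non-greedy fix dropped back into the original loop logic. The sub name extract_pairs and the sample line are made up for illustration; only the two regexes come from the thread.

```perl
use strict;
use warnings;

# Return every [barcode, homeloc] pair found on a line that starts with .949.
# (extract_pairs and the sample data below are hypothetical.)
sub extract_pairs {
    my ($line) = @_;
    my @pairs;
    return @pairs unless $line =~ /^\.949\./;

    # .*? stops at the nearest |l instead of running on to the last one,
    # so each pass of the /g loop picks up the next pair.
    while ($line =~ /\|i([^|]*).*?\|l([^|]*)/g) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

my $sample = '.949.|i1234|lABCD|i5678|lEFGH';
print "$_->[0]|$_->[1]|\n" for extract_pairs($sample);
```

    With that sample line this should print 1234|ABCD| and then 5678|EFGH|, one pair per line.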
Re: execute loop code for every occurence of regex
by ikegami (Patriarch) on Sep 08, 2004 at 14:47 UTC

    Update: Note to self, don't come to PM first thing in the morning. I missed the "must match pair" part of the question. Just change .* to .*? in your regexp and ignore what follows in this post. It's still neat reading, though :)

    \|i([^|]*).*\|l([^|]*) means: match everything from the first |i barcode to the last |l homeloc, inclusive. Try replacing your while with the one below:

    $line = '|i1234|lABCD|i5678|i90|lBOO';
    while ($line =~ /\|i([^|]*)|\|l([^|]*)/g) {
        print("barcode: $1\n") if (defined($1));
        print("homeloc: $2\n") if (defined($2));
    }
    __END__
    output
    ======
    barcode: 1234
    homeloc: ABCD
    barcode: 5678
    barcode: 90
    homeloc: BOO

    Here's another nice solution that uses m/\G.../gc

    $line = '|i1234|lABCD|i5678|i90|lBOO';
    for ($line) {
        /\G \|i ([^|]*) /gcx && do { print "barcode: $1\n"; redo; };
        /\G \|l ([^|]*) /gcx && do { print "homeloc: $1\n"; redo; };
        /\G \|[^il] [^|]* /gcx && do { redo; };  # skip bad stuff
    }
    __END__
    output
    ======
    barcode: 1234
    homeloc: ABCD
    barcode: 5678
    barcode: 90
    homeloc: BOO
Re: execute loop code for every occurence of regex
by VSarkiss (Monsignor) on Sep 08, 2004 at 14:53 UTC

    It appears ikegami has the right answer above. I just wanted to express one of my favorite hobbyhorses: "Don't use a regex when a simple string compare will suffice." Determining whether $line starts with .949. can be done with a simple:

    if (substr($line, 0, 5) eq '.949.') { # and so on
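
    A small sketch comparing the string check with the anchored regex on a few made-up lines. The index variant is my own addition rather than something from this post, and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Three ways to ask "does the line start with the literal .949.?"
# (sub names and sample lines are hypothetical)
sub starts_substr { return substr($_[0], 0, 5) eq '.949.' ? 1 : 0 }
sub starts_index  { return index($_[0], '.949.') == 0     ? 1 : 0 }
sub starts_regex  { return $_[0] =~ /^\.949\./            ? 1 : 0 }

for my $line ('.949.|i1234|lABCD', 'x.949.|i1', '.949', '') {
    printf "%-22s substr=%d index=%d regex=%d\n",
        "'$line'",
        starts_substr($line),
        starts_index($line),
        starts_regex($line);
}
```

    All three should agree on every line here. The substr form has no metacharacters to escape, which is much of its appeal.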

Re: execute loop code for every occurence of regex
by Random_Walk (Prior) on Sep 08, 2004 at 14:51 UTC
    Your .* in the middle is greedy. Fix it with .*? or, if possible, something a little more specific.
    perl -e '$l="|i ba |l L1 |i b2 |l L2"; while ($l=~ /\|i([^|]*).*\|l([^|]*)/g) {print "$1 $2\n"}'
    ba L2
    perl -e '$l="|i ba |l L1 |i b2 |l L2"; while ($l=~ /\|i([^|]*).*?\|l([^|]*)/g) {print "$1 $2\n"}'
    ba L1
    b2 L2

    Cheers,
    R.
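
    One possible reading of "more specific", as a sketch only. It assumes the line is a run of |<tag><value> fields with one-letter tags, and that only fields tagged something other than i or l may sit between a barcode and its homeloc. The names $strict_pair, $lazy_pair and pairs_with are made up; the test string is the one ikegami used above.

```perl
use strict;
use warnings;

# Assumption: fields look like |<one-letter tag><value>.
# Between |i... and |l... allow only fields whose tag is neither i nor l.
my $strict_pair = qr/\|i([^|]*)(?:\|[^il|][^|]*)*\|l([^|]*)/;
my $lazy_pair   = qr/\|i([^|]*).*?\|l([^|]*)/;

# Collect "barcode,homeloc" strings for every match of the given pattern.
sub pairs_with {
    my ($re, $line) = @_;
    my @pairs;
    while ($line =~ /$re/g) {
        push @pairs, "$1,$2";
    }
    return @pairs;
}

my $line = '|i1234|lABCD|i5678|i90|lBOO';
print "lazy:   @{[ pairs_with($lazy_pair,   $line) ]}\n";
print "strict: @{[ pairs_with($strict_pair, $line) ]}\n";
```

    On that string the lazy pattern should pair 5678 with BOO, silently skipping over |i90, while the stricter one refuses to cross another |i and pairs 90 with BOO instead. Which behaviour is right depends on what the data means, so treat this as an illustration rather than a recommendation.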

Re: execute loop code for every occurence of regex
by Anonymous Monk on Sep 08, 2004 at 14:55 UTC
    What do you mean by the first occurrence?

    If your input is |i abc |l def |i ghi |l jkl, then the output will be abc|jkl, which doesn't look like the first pair to me...

    The problem with your code is that * (and +) are greedy.
    You can find more information about that, and how to fix it, in the perlre pod and/or perlretut pod. You can/should look for greedy and/or maximal match.