kcitren has asked for the wisdom of the Perl Monks concerning the following question:

I've got a file that I'm parsing through with:
while (<file>)

The format of the file is this:
blahal askdjf, asdf jdfasd"asdf"(REGEX1)
kkdjf "REGEX2", (REGEX1), (REGEX3) "teets"

I'm trying to run a function any time certain regexes appear.

if (/(regex1|regex2|regex3)/gi) { f($1); }

But this only seems to match once per line, even if multiple regexes appear on a line.
How do I get this to run for each regex found?

Re: Multiple regex catches on a line
by stephen (Priest) on Jun 20, 2001 at 04:06 UTC
    When you call m/regex/g in scalar context, it first matches the first occurrence of regex. If you call it again, it'll match the second occurrence of regex, and so on.

    By using if, you're only running the regexp once per line. Change it to this:

    while (<file>) {
        while (/(regex1|regex2|regex3)/gi) {
            f($1);
        }
    }
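To make the scalar-context behaviour concrete, here is a minimal sketch of the same loop run against a single string instead of a file. The sample line, the patterns alpha, beta and gamma, and the recording version of f() are all assumptions made for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample line and patterns, standing in for the real file and regexes.
my $line = 'kkdjf "beta", (alpha), (gamma) "teets"';
my @seen;

# Stand-in for the f() from the question: it just records its argument.
sub f { push @seen, $_[0] }

for ($line) {    # alias $_ to the sample line, as while (<file>) would
    # Each evaluation of the condition resumes where the previous match ended.
    while (/(alpha|beta|gamma)/gi) {
        f($1);
    }
}

print join(',', @seen), "\n";    # prints "beta,alpha,gamma"
```

In list context the same match returns every capture at once, so `f($_) for /(alpha|beta|gamma)/gi;` should be an equivalent way to write the inner loop.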

    Update: Added the 'g' modifier. Thanks to Hofmator for keen eyes.

    stephen

Re: Multiple regex catches on a line
by lemming (Priest) on Jun 20, 2001 at 04:06 UTC
Re: Multiple regex catches on a line
by dimmesdale (Friar) on Jun 20, 2001 at 19:31 UTC
    It's already been explained how to fix it. However, I have a suggestion that may make it go faster. Instead of using alternation within the regex, use Perl's logical or outside of it. That is:
    while(/(regex1)/gi || /(regex2)/gi || /(regex3)/gi)
    You need the /g in this example even if you don't expect the same regex to appear on the same line multiple times. The advantage of this approach is that it can significantly (well, depending on the regex) cut down on the amount of backtracking that is necessary in the NFA engine. One caveat: all three matches share a single pos() on $_, and a failed /g match resets it, so once regex2 or regex3 matches, this loop can find the same match again and never terminate; giving each regex its own while loop avoids that. You could also take that out of the while loop, and do this:
    while(<file>) { (/(regex1)/i || /(regex2)/i || /(regex3)/i) && f($1); }

    (Only do this, though, if you only expect one regex per line)
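For lines where several patterns can occur, one way to keep the patterns separate and still catch every match is to give each pattern its own while loop. The sketch below assumes a made-up sample line, made-up patterns, and a recording f(); none of these come from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample line and patterns, not taken from the original file.
my $line     = 'kkdjf "beta", (alpha), (gamma) alpha';
my @patterns = (qr/alpha/i, qr/beta/i, qr/gamma/i);
my @seen;

sub f { push @seen, $_[0] }    # stand-in for the real f()

for ($line) {                  # alias $_ to the sample line
    for my $re (@patterns) {
        # Run one pattern to exhaustion. The failed match that ends this
        # loop resets pos($_), so the next pattern starts at the beginning.
        while (/($re)/g) {
            f($1);
        }
    }
}

print join(',', @seen), "\n";  # prints "alpha,alpha,beta,gamma"
```

Note that f() is now called pattern by pattern rather than in left-to-right order of appearance, so this form only fits when the order of calls does not matter.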

    The advantage of this approach is that, if needed, you can call specific functions (or do specific actions) on a per-regex level. E.g.:

    while(<file>) {
        # put a while around the regexes if there are multiple
        # regexes per line; also add a /g
        (
            (/(regex1)/i && doSomethingForRegex1($1)) ||
            (/(regex2)/i && doSomethingForRegex2($1)) ||
            (/(regex3)/i && doSomethingForRegex3($1))
        ) && still_do_a_general_func_if_needed($1);
    }
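The chained && and || form relies on every handler returning a true value; if one returns false, the next regex is tried and the general function is skipped. A possible alternative that does not depend on return values is to pair each pattern with its handler in a small table. The patterns, handlers, and sample line below are hypothetical and only illustrate the shape.

```perl
use strict;
use warnings;

my @log;

# Hypothetical pattern/handler pairs; the real handlers would do actual work.
my @dispatch = (
    [ qr/alpha/i, sub { push @log, "one:$_[0]" } ],
    [ qr/beta/i,  sub { push @log, "two:$_[0]" } ],
    [ qr/gamma/i, sub { push @log, "three:$_[0]" } ],
);

# Hypothetical sample line standing in for one line of the file.
my $line = 'kkdjf "beta", (alpha), (gamma) "teets"';

for ($line) {                        # alias $_ to the sample line
    for my $pair (@dispatch) {
        my ($re, $handler) = @$pair;
        # One loop per pattern, so every match reaches its own handler.
        while (/($re)/g) {
            $handler->($1);
        }
    }
}

print "$_\n" for @log;
```

Adding a fourth pattern then means adding one row to the table rather than another branch in the condition.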