in reply to Regex optimization

If you're sure of the table format...
while (                  # use a loop to grab all instances
    m{                   # use braces to delimit, so no escaping / or |
        <tr              # beginning of row
        .*?              # minimal match of anything
        >(\d{5})         # > followed by 5 digits (remember digits)
        .*?              # minimal match of anything
        >(\d{2})         # > followed by 2 digits (remember digits)
        .*?              # minimal match of anything
        >(\d{3})         # > followed by 3 digits (remember digits)
        .*?              # minimal match of anything
        >(\d{3})         # > followed by 3 digits (remember digits)
        .*?              # minimal match of anything
        >(\d{2})         # > followed by 2 digits (remember digits)
        .*?              # minimal match of anything
        (&nbsp;|\w)      # &nbsp; or a word character (e.g. a letter)
        </FONT>          # followed by a closing font tag
    }isxg                # case (i)nsensitive, treat as (s)ingle line,
                         # e(x)tended comments, match (g)lobally (all)
) {
    my @row = ($1,$2,$3,$4,$5,$6);
    # now do whatever with @row
}

# condensed
while (m{<tr.*?>(\d{5}).*?>(\d{2}).*?>(\d{3}).*?>(\d{3}).*?>(\d{2}).*?(&nbsp;|\w)</FONT>}isg) {
    my @row = ($1,$2,$3,$4,$5,$6);
}
not tested, but I think it's OK :)
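
For anyone who wants to try it, here is a minimal sketch that runs the condensed pattern over two made-up rows. The `$html` sample and the `@rows` array are assumptions for illustration only; the real table markup is not shown in this thread and may well differ.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real table markup may differ.
my $html = <<'HTML';
<table>
<tr><td><FONT>12345</FONT></td><td><FONT>01</FONT></td><td><FONT>234</FONT></td><td><FONT>567</FONT></td><td><FONT>89</FONT></td><td><FONT>A</FONT></td></tr>
<tr><td><FONT>54321</FONT></td><td><FONT>10</FONT></td><td><FONT>987</FONT></td><td><FONT>654</FONT></td><td><FONT>32</FONT></td><td><FONT>&nbsp;</FONT></td></tr>
</table>
HTML

my @rows;

# Brace delimiters let the pattern contain both "/" and "|" unescaped.
while ($html =~ m{<tr.*?>(\d{5}).*?>(\d{2}).*?>(\d{3}).*?>(\d{3}).*?>(\d{2}).*?(&nbsp;|\w)</FONT>}isg) {
    push @rows, [$1, $2, $3, $4, $5, $6];
}

# One comma-separated line per matched row.
print join(',', @$_), "\n" for @rows;
```

On this sample it should print `12345,01,234,567,89,A` and `54321,10,987,654,32,&nbsp;`. Note that the pattern relies on every cell having exactly the expected number of digits, so it is only as reliable as the table format.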

cLive ;-)

Re: Re: Regex optimization
by deryni (Beadle) on May 08, 2001 at 11:21 UTC
    Thank you for the prompt response.
    First off, as I said, a lot about regexes escapes me, so reminding me that I can use (#) instead of repetition was good.
    Second, while I did remove the checks for the extra <font> tags, I did indeed forget that I needn't check for the <td> tags either.
    Third, while I'd imagine that this works for the parts involved, I do not need either of the nbsp's or the letter that may be in their place, but the data in between them is important.
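
If the trailing `&nbsp;` or letter is not wanted, one possible approach is a non-capturing group, `(?:...)`, which still anchors the match but leaves only the five numeric fields remembered. This is a sketch under the assumption that those five fields are the data that matters; `$row` and `@fields` are hypothetical names and the markup is made up.

```perl
use strict;
use warnings;

# Hypothetical single row; the real markup may differ.
my $row = '<tr><td><FONT>12345</FONT></td><td><FONT>01</FONT></td>'
        . '<td><FONT>234</FONT></td><td><FONT>567</FONT></td>'
        . '<td><FONT>89</FONT></td><td><FONT>&nbsp;</FONT></td></tr>';

# (?:...) groups without capturing, so only $1..$5 are set.
# In list context (without /g) the match returns the captured groups.
my @fields = $row =~ m{<tr.*?>(\d{5}).*?>(\d{2}).*?>(\d{3}).*?>(\d{3}).*?>(\d{2}).*?(?:&nbsp;|\w)</FONT>}is;

print "@fields\n";
```

Here `@fields` should end up holding `12345 01 234 567 89`, with the final cell used only to mark the end of the row.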

    Thank you for all the help.

    -Etan
      No! # is a comment (the x modifier allows you to do this...)
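
A small sketch of what the x modifier does with `#`: inside an `/x` pattern, unescaped whitespace is ignored and `#` starts a comment that runs to the end of the line, so a literal hash sign has to be written as `\#` (or `[#]`). The `$text` string here is just a made-up example.

```perl
use strict;
use warnings;

my $text = 'item #42';

# Under /x, "#" begins a comment, so the literal hash is escaped.
my ($num) = $text =~ m{
    \#        # a literal hash sign
    (\d+)     # one or more digits (remembered)
}x;

print "$num\n";
```

This should print `42`.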

      cLive ;-)

        Thank you, I realize that # is a comment. In this case I was using it in its more common meaning, to signify a number.

            -Etan