in reply to Optimizing a regex
No, that's not the entire loop, just a hastily edited version. What you're not seeing is an edited form of the article I mentioned. It walks through all source documents, checks their file dates, opens them into DATAFILE, does this, and then closes them. Since we're talking about several thousand documents, optimization is a concern.
As for what I want to match: the second, though the first works too. The documents assign values for page title, site category, and so on. The actual content is a here document. It's a crude form of ASP.
I'm ignoring the leading $ because the variable names don't appear in the content.
Can the expression assigned to your $qr contain an interpolated reference?
I would be happy to post my current attempts, but they don't compile and I'm sure that if I see an example, I'll understand why they're not working.
And thanks to lemming for catching the eq problem and for seeing what I was trying to accomplish. Once I've found a match, I don't want to look for more.
Thank you for replying so nicely. It's nice to see that not everyone is a jerk.
Update #1 - Just saw runrig's reply.
I think it can help because I want one regex that I call twice, where the variable portion is the name of the variable I'm searching for.
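To make the "one regex, called twice" idea concrete, here is a minimal sketch of how a pattern built with qr// can interpolate the name being searched for. The helper name make_matcher and the sample line are invented for illustration; only the pattern shape comes from the code in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: compile one pattern per variable name.
# \Q...\E escapes any regex metacharacters in the interpolated name.
sub make_matcher {
    my ($name) = @_;
    return qr/\Q$name\E.*?"(.*?)"/i;
}

my $title_re    = make_matcher('pagetitle');
my $category_re = make_matcher('category');

# Sample line, made up for this example.
my $line = '$pagetitle = "Widgets";';

# In list context a successful match returns the captured groups.
my ($pt) = $line =~ $title_re;
print "title: $pt\n" if defined $pt;
```

Each call to make_matcher compiles its pattern once, so the same compiled object can be reused across every line and every file without recompiling.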
Update #2 - Just saw what probably prompted runrig's reply. There's a mistake in the code I posted. This should be clearer:
while (<DATAFILE>) {
    if ($pt eq "") {
        if (/pagetitle.*?"(.*?)"/i) { $pt = $1; }
    }
    if ($pc eq "") {
        if (/category.*?"(.*?)"/i) { $pc = $1; }
    }
}
Okay, I cheated...just to show that I was listening. :)
Update #3 - I'm not too worried about the size of the data file, since the first several lines are variable declarations that ensure the right HTML snippets are used and brand the page. Once I have the values I'm after, I bail out of the while loop and move on to the next file. But I'll file the suggestion for later use. :)
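For anyone following along, bailing out of the loop might look roughly like the sketch below. The sample document is invented, and it is read through an in-memory filehandle (which needs Perl 5.8 or later) only so the example runs on its own; the real script opens each file into DATAFILE instead.

```perl
use strict;
use warnings;

# Sample document, made up for illustration. The single-quoted
# heredoc keeps the $ signs and the inner <<EOT as literal text.
my $doc = <<'END_DOC';
$pagetitle = "Widgets";
$category = "Tools";
$content = <<EOT;
page body goes here
EOT
END_DOC

open(my $fh, '<', \$doc) or die "Cannot open in-memory file: $!";

my ($pt, $pc) = ('', '');
my $lines_read = 0;
while (my $line = <$fh>) {
    $lines_read++;
    # Only try each pattern until its value has been found.
    $pt = $1 if $pt eq '' && $line =~ /pagetitle.*?"(.*?)"/i;
    $pc = $1 if $pc eq '' && $line =~ /category.*?"(.*?)"/i;
    # Both values found: stop reading this file.
    last if $pt ne '' && $pc ne '';
}
close($fh);

print "title=$pt category=$pc (read $lines_read lines)\n";
```

One caveat with this approach: a declaration whose value is an empty string looks the same as "not found yet", so the loop would keep reading to the end of that file.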
Re: Re: Optimizing a regex
by runrig (Abbot) on Jan 29, 2001 at 22:21 UTC