No, that's not the entire loop, just a hastily edited version. What you're not seeing is an edited form of the article I mentioned. It walks through all source documents, checks their file dates, opens them into DATAFILE, does this, and then closes them. Since we're talking several thousand documents, optimization is a concern.

As for what I want to match: the second, though the first works, too. The documents assign values for page title, site category, and so on. The actual content is a here document. It's a crude form of ASP.

I'm ignoring the leading $ because the variable names don't appear in the content.

Can the expression assigned to your $qr contain an interpolated reference?

I would be happy to post my current attempts, but they don't compile and I'm sure that if I see an example, I'll understand why they're not working.

And thanks to lemming for catching the eq problem and for seeing what I was trying to accomplish. Once I've found a match, I don't want to look for more.

Thank you for replying so nicely. It's nice to see that not everyone is a jerk.

Update #1 - Just saw runrig's reply.

I think it can help because I want one regex that I call twice, where the variable portion is the name of the variable I'm searching for.
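For what it's worth, here is a minimal sketch of that idea, assuming qr// interpolates a variable the same way a double-quoted string does. The helper name make_matcher and the sample lines are hypothetical, and the real declarations may be formatted differently.

```perl
use strict;
use warnings;

# Hypothetical helper: build one compiled pattern per variable name.
# \Q...\E quotes any regex metacharacters in the interpolated name.
sub make_matcher {
    my ($name) = @_;
    return qr/\Q$name\E.*?"(.*?)"/i;
}

# One pattern shape, compiled once for each name being searched for.
my %matcher = map { $_ => make_matcher($_) } qw(pagetitle category);

# Assumed sample lines, standing in for the top of a source document.
my @lines = (
    '$pagetitle = "Zydeco Basics";',
    '$category = "Music";',
);

my %found;
for my $line (@lines) {
    for my $name (keys %matcher) {
        next if exists $found{$name};    # already have this value
        $found{$name} = $1 if $line =~ $matcher{$name};
    }
}

print "$_: $found{$_}\n" for sort keys %found;
```

Compiling each pattern once up front means the interpolation cost is not paid again for every line, which may matter across several thousand files.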

Update #2 - Just saw what probably prompted runrig's reply. There's a mistake in the code I posted. This should be clearer:

while (<DATAFILE>) {
    if ($pt eq "") {
        if (/pagetitle.*?"(.*?)"/i) { $pt = $1; }
    }
    if ($pc eq "") {
        if (/category.*?"(.*?)"/i) { $pc = $1; }
    }
}

Okay, I cheated...just to show that I was listening. :)

Update #3 - I'm not too worried about the size of the data file, since the first several lines are variable declarations that ensure the right HTML snippets are used and brand the page. Once I have the values I'm after, I bail out of the while loop and move on to the next file. But I'll file the suggestion for later use. :)
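A rough sketch of the bail-out described above, assuming the declarations sit at the top of each file. The sample content is made up, and an in-memory filehandle stands in for DATAFILE so the snippet runs on its own.

```perl
use strict;
use warnings;

# Assumed sample file content; the single-quoted heredoc keeps it literal.
my $data = <<'END';
$pagetitle = "Zydeco Basics";
$category = "Music";
$content = <<HTML;
<p>Thousands of lines of page content...</p>
HTML
END

# Read the string through an in-memory filehandle instead of a real file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my ($pt, $pc) = ('', '');
my $lines_read = 0;
while (my $line = <$fh>) {
    $lines_read++;
    $pt = $1 if $pt eq '' && $line =~ /pagetitle.*?"(.*?)"/i;
    $pc = $1 if $pc eq '' && $line =~ /category.*?"(.*?)"/i;
    last if $pt ne '' && $pc ne '';    # both found: skip the rest of the file
}
close $fh;

print "title=$pt category=$pc lines_read=$lines_read\n";
```

With the `last` in place, the here document that holds the page content is never scanned, so the cost per file depends on how many declaration lines come before the second match rather than on the file's size.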


In reply to Re: Optimizing a regex by ZydecoSue
in thread Optimizing a regex by ZydecoSue
