Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm unable to match arbitrary expressions of this form:

BASKET

OPEC WEEKLY BASKET PRICE RISES TO 28.23 DOLLARS

That is, a single keyword followed by two newlines, followed by a single line of three or more words terminated by another newline. Thanks in advance to any respondents.

Replies are listed 'Best First'.
RE: matching multiline regular expressions
by Ovid (Cardinal) on Jun 13, 2000 at 23:14 UTC
    I reread your question and realized that I hadn't answered it. Assuming that you meant exactly what you said, I'd recommend the following regex:
    /^$key\n\n(\w+\W+){2}\w+.*\n$/;
    This assumes that your keyword is in $key. Note that the word/non-word construct is repeated only twice, even though you specified "three or more words." Going by what you wrote, I can't guarantee that any non-word character follows the third word other than the newline, which must be matched immediately before the end-of-string anchor '$'.

    If there will always be at least one non-word character between the third word and the terminating newline, you can simplify the regex just a little:

    /^$key\n\n(\w+\W+){3}.*\n$/;

    Hope this is what you were looking for.
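
The difference between the two forms above can be checked with a small script. This is only a sketch under stated assumptions: the sample strings other than the one from the question are made up, the \Q...\E quoting around $key is an addition that protects against metacharacters in the keyword, and the third pattern is a hypothetical stricter variant that is not part of the reply.

```perl
use strict;
use warnings;

my $key = 'BASKET';    # assumed keyword, taken from the question

my @patterns = (
    # First form above: two word/non-word pairs, then a third word.
    qr/^\Q$key\E\n\n(?:\w+\W+){2}\w+.*\n$/,
    # Second form above: three word/non-word pairs.
    qr/^\Q$key\E\n\n(?:\w+\W+){3}.*\n$/,
    # Hypothetical stricter variant (not from the post): the separator
    # excludes "\n", so all three words must sit on the same line.
    qr/^\Q$key\E\n\n(?:\w+[^\w\n]+){2}\w+.*\n$/,
);

my @samples = (
    "BASKET\n\nOPEC WEEKLY BASKET PRICE RISES TO 28.23 DOLLARS\n",
    "BASKET\n\nONE TWO THREE\n",     # third word directly before newline
    "BASKET\n\nONE TWO\n",           # only two words
    "BASKET\n\nONE\nTWO\nTHREE\n",   # three words spread over three lines
);

# Return one yes/no verdict per pattern for the given string.
sub verdicts {
    my ($string) = @_;
    return join ',', map { $string =~ $_ ? 'yes' : 'no' } @patterns;
}

for my $sample (@samples) {
    (my $shown = $sample) =~ s/\n/\\n/g;   # make newlines visible in output
    printf "%-60s %s\n", $shown, verdicts($sample);
}
```

The second sample shows why the first form stops at {2}: with {3}, the \W+ after the third word consumes the only newline and nothing is left for the final \n. The fourth sample shows that \W+ also matches a newline, so the first form accepts three words spread over separate lines. If the "single line" requirement matters, a separator such as [^\w\n]+ may be the safer choice.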

RE: matching multiline regular expressions
by Shendal (Hermit) on Jun 13, 2000 at 22:32 UTC
    Is this what you're looking for?
    $_ = "BASKET\n\nOPEC WEEKLY BASKET PRICE RISES TO 28.23 DOLLARS\n";
    if (/BASKET\n\nOPEC WEEKLY BASKET PRICE RISES TO (\S+) DOLLARS/) {
        # $1 holds whatever non-whitespace text sits between "TO" and "DOLLARS"
        my $price = $1;
        print "Price: $price\n";
    }
    Hope that helps!

RE: matching multiline regular expressions
by Ovid (Cardinal) on Jun 13, 2000 at 22:53 UTC
    You mention that you are unable to make such a match arbitrarily. If you truly need to do this on an arbitrary basis, you may occasionally need to use the /s switch, which allows the "." metacharacter to match a newline.
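
A minimal sketch of what the /s switch changes, using the sample text from the question. The pattern here is only an illustration and is not meant as the answer to the original problem; the variable names are made up.

```perl
use strict;
use warnings;

my $text = "BASKET\n\nOPEC WEEKLY BASKET PRICE RISES TO 28.23 DOLLARS\n";

# Without /s, "." stops at a newline, so ".*" cannot get from the
# keyword on the first line to the headline on the third line.
my $without_s = $text =~ /^BASKET.*DOLLARS/  ? 1 : 0;

# With /s, "." also matches "\n", so the same pattern spans all three lines.
my $with_s    = $text =~ /^BASKET.*DOLLARS/s ? 1 : 0;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

The leading ^ matters in this example: without it, the unanchored pattern would simply find the second "BASKET" inside the headline and match on that one line, with or without /s.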