in reply to Re: pattern matching (greedy, non-greedy,...)
in thread pattern matching (greedy, non-greedy,...)

Wow, thanks!

Processing the entire file at once is fine for what I'm trying to do.

Here's what I had written so far (I just started with Perl so be gentle):

open (IN, 'input.txt') or die "$!";
my $lines = do {local $/; <IN>};
close IN;
while ($lines =~ s/Key.+?value=(\d+).+?Screen:add.+?value=(\d+).+?Xml:sendRequest.+?value=(\d+).+?Xml:onResponse.+?value=(\d+).+?Xml:processing.+?value=(\d+)//s){
    # then I would use $1 - $5
}

I'm not sure yet how to incorporate your solution into what I have, but perhaps I should do some more reading.

Also, to clarify, the file has multiple lines but the KEY and PATTERN values don't fall on their own line as my original example illustrates. I made it a bit too simplistic. It looks more like:

BLAH BLAH BLAH KEY blah blah blah BlAH BLAH BLAH ABD KEY blah blah asdf asdf asdf asdf BLAH ASDF PATTERN blah blah

Re^3: pattern matching (greedy, non-greedy,...)
by AnomalousMonk (Archbishop) on Dec 17, 2009 at 01:52 UTC

    It looks like you are using an s/// substitution to repeatedly search from the very start of the string, snipping out already-processed substrings so that you don't encounter them again. It would be much easier (and faster, if the string/file is huge) to use the /g modifier on an m// match and deal with each substring as it is found. See Modifiers in perlre; also see perlretut and perlrequick.
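    A minimal sketch of that approach, reusing the pattern from the question with m//gs in a while loop. The sample text is invented for illustration; the real file layout is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records, each with the five tagged values.
my $text = join "\n",
    'Key down value=11 noise',
    'Screen:add value=12',
    'Xml:sendRequest value=13',
    'Xml:onResponse value=14',
    'Xml:processing value=15',
    'Key up value=21 noise',
    'Screen:add value=22',
    'Xml:sendRequest value=23',
    'Xml:onResponse value=24',
    'Xml:processing value=25';

my @records;

# With /g in scalar context, each iteration resumes where the previous
# match ended, so nothing has to be deleted from $text.
while ($text =~ m/Key.+?value=(\d+).+?Screen:add.+?value=(\d+).+?Xml:sendRequest.+?value=(\d+).+?Xml:onResponse.+?value=(\d+).+?Xml:processing.+?value=(\d+)/gs) {
    push @records, [$1, $2, $3, $4, $5];
}

print join(',', @$_), "\n" for @records;
```

    The string is left unmodified, and the regex engine keeps track of the current position itself (see pos in perlfunc).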

    A little whitespace and formatting never hurts, either. See the /x modifier in the references above.
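    For instance, the first two steps of such a pattern might be laid out under /x roughly as follows. This is only a sketch; the sample string is made up.

```perl
use strict;
use warnings;

# Hypothetical one-line sample; real input is assumed to look similar.
my $text = "Key press value=7 ... Screen:add value=42 ...";

# Under /x, literal whitespace in the pattern is ignored and a hash mark
# starts a comment, so each step can sit on its own line.
my ($key, $screen) = $text =~ m{
    Key        .+? value=(\d+)   # first number after the Key tag
    .+?
    Screen:add .+? value=(\d+)   # first number after the Screen:add tag
}xs;

print "key=$key screen=$screen\n";
```

    Note that under /x a literal space in the data has to be written as \s, [ ] or "\ " in the pattern.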

    Another suggestion is to factor out sub-patterns with a collection of qr// regex object definitions (see references above). As with code in general, such factoring allows you to better understand and control the final regex. An example of such factoring is in the code of my original reply.
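    As a rough illustration of that kind of factoring applied to the pattern in the question (the names $num, $gap and $record are hypothetical, as is the sample text):

```perl
use strict;
use warnings;

# Small named pieces, each compiled once with qr//.
my $num = qr{ value= (\d+) }x;   # a captured number
my $gap = qr{ .+? }xs;           # shortest possible filler, newlines allowed

# The full record pattern is assembled from the pieces.
my $record = qr{
    Key             $gap $num $gap
    Screen:add      $gap $num $gap
    Xml:sendRequest $gap $num $gap
    Xml:onResponse  $gap $num $gap
    Xml:processing  $gap $num
}x;

# Hypothetical sample record.
my $text = "Key a value=1 b Screen:add c value=2 d Xml:sendRequest e value=3 "
         . "f Xml:onResponse g value=4 h Xml:processing i value=5";

# In list context a successful match returns the captured groups.
my @got = $text =~ $record;
print "@got\n";
```

    Each qr// object carries its own modifiers, which is why $gap can use /s without the outer pattern needing it.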

    OTOH, since it looks like you may be trying to parse XML, the best advice might be to not use regexes at all; use one of the many fine XML parser modules from CPAN: see XML::Parser (others will be better able than I to advise you on this).

Re^3: pattern matching (greedy, non-greedy,...)
by AnomalousMonk (Archbishop) on Dec 17, 2009 at 02:15 UTC
    ... the file has multiple lines but the KEY and PATTERN values don't fall on their own line as my original example illustrates.

    No matter. Just don't use the ^ and $ embedded-newline anchors at the beginnings and ends of your start and stop patterns. (Of course, they can still be used elsewhere.) See discussions of the /m regex modifier (Modifiers) in perlre and the other cited refs. The example string in Re: pattern matching (greedy, non-greedy,...) has no newlines in it at all, anywhere!
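    A small sketch of the difference, using text modeled on the layout described in the question (where the line breaks fall is an assumption):

```perl
use strict;
use warnings;

# KEY and PATTERN appear in the middle of lines, never at the start.
my $text = "BLAH BLAH KEY blah blah\n"
         . "BLAH ABD KEY blah asdf\n"
         . "BLAH ASDF PATTERN blah\n";

# The "= () =" idiom counts matches by forcing list context.
my $anchored   = () = $text =~ /^KEY/mg;   # KEY only at the start of a line
my $unanchored = () = $text =~ /KEY/g;     # KEY anywhere in the text

print "anchored: $anchored, unanchored: $unanchored\n";
```

    With the ^ anchor and /m, KEY is found only where it begins a line, so nothing matches here; without the anchor both occurrences are found.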

    Update: Oops. This reply would have been better as an update to Re^3: pattern matching (greedy, non-greedy,...).