in reply to Re: Is there a better way to write this RegEx?
in thread Is there a better way to write this RegEx?

That seems to work, despite the fact I was under the impression that the first .* would gobble up the entire line, being greedy and all.

Also, I guess I should have said I'm attempting to avoid .* ... learning experience, I guess. But, yes, this seems like a much more elegant solution than the one I had written, by far.

I'm still trying to figure out why the .* doesn't grab the entire line ... because of the second .*, I guess?

And, I'm going to see if I can find rindex in the perldocs because, frankly, I've never seen/heard of it before. :)

- Erik
theAcolyte

P.S. I know there are log parsers on CPAN, but I'm just mucking around ... I have a full script doing what I want and working just fine ... but I noticed how ugly my regex was and wanted to see how to improve on it :)

Re: Re: Re: Is there a better way to write this RegEx?
by japhy (Canon) on Apr 20, 2004 at 13:10 UTC
    The first .* doesn't end up holding the whole string because the regex is like a persistent ex: it doesn't want to lose. Regexes try hard to match. In /.*](.*)/, the first .* initially does match as much of the string as possible (the entire line), and then the engine tries to match a bracket. When it realizes it can't, because nothing is left, it backs up to the last bracket it passed and then tries matching the rest of the regex from there. This process is called "backtracking" and is an integral part of Perl's regular expression engine.

    I could tell you that it backtracks one character at a time until it finds a bracket, but that's not true. It's optimized in a case like this to jump backwards to the bracket all at once.
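
    To see the effect for yourself, here is a minimal sketch. The log line is made up for illustration (it is not from the original post), and the non-greedy .*? variant is added only for contrast.

```perl
use strict;
use warnings;

# Hypothetical log line, invented for this example.
my $line = '[Tue Apr 20 13:10:00 2004] [error] [client 10.0.0.1] File does not exist';

# Greedy: the first .* grabs the whole line, then gives characters back
# until "]" can match, so the match settles on the LAST bracket.
my ($greedy_rest) = $line =~ /.*](.*)/;

# Non-greedy, for contrast: .*? takes as little as it can, so "]" is
# matched at the FIRST bracket instead.
my ($lazy_rest) = $line =~ /.*?](.*)/;

print "greedy: '$greedy_rest'\n";
print "lazy:   '$lazy_rest'\n";
```

    The greedy version prints ' File does not exist', while the non-greedy one prints ' [error] [client 10.0.0.1] File does not exist'. So the first .* does try to take everything; it just hands back enough for the rest of the pattern to succeed.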

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;