perlNewby has asked for the wisdom of the Perl Monks concerning the following question:

Ok, my problem is quite simple, but it is frustrating since I am a beginner and cannot get my REGEX to match like it is supposed to. I am trying to get it to match:

"sort by (CPU|USER|PID|MEM|COMMAND)" so it should match sort by followed by one of the choices of command and allow for spaces before, between and after each word entered. It works, but it will only match the last 4, but will not match the first option "CPU". I switched words around and put MEM first, then it stopped matching MEM and then matched CPU fine. Oh, and I made it case insensitive also, basically it all works perfect except the first option never matches. Any help would be greatly appreciated. This is my REGEX:

$command =~m/\s*"sort"\s+"by"\s+(CPU|USER|PID|MEM|COMMAND)/i

Replies are listed 'Best First'.
Re: regex issue
by moritz (Cardinal) on Feb 17, 2012 at 08:36 UTC

    The quotes inside the regex seem to be a problem, since there are no such quotes in the command.

    $ perl -wE '"sort by CPU" =~ m/sort\s+by\s+(CPU|USER|PID)/ and say $1'
    CPU

    Works just fine.
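For the full set of options, a minimal sketch of the same fix might look like the following. The helper name `parse_sort_command` and the sample inputs are made up for illustration, and anchoring with `^` and `$` is an assumption about wanting the whole line to be the command; the original regex was unanchored.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sort key in upper case, or undef if no match.
sub parse_sort_command {
    my ($command) = @_;
    # No literal quote characters in the pattern; \s* and \s+ absorb the spacing.
    # The ^ and $ anchors are an assumption, not part of the original regex.
    if ($command =~ m/^\s*sort\s+by\s+(CPU|USER|PID|MEM|COMMAND)\s*$/i) {
        return uc $1;
    }
    return;
}

# The last input carries literal quotes, so it is expected not to match.
for my $input ('sort by CPU', '  Sort   BY  mem ', 'sort by command', '"sort" "by" CPU') {
    my $key = parse_sort_command($input);
    printf "%-20s => %s\n", "'$input'", defined $key ? $key : '(no match)';
}
```

Bare words such as sort inside m/.../ are just characters to match; Perl does not treat them as function calls there, so no quoting is needed.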

      wow, thanks a lot! I didn't think that would matter. I only put the "" in there in the first place because the word sort turned red like it was going to try to use the sort function and I thought it would give me an error so I put them in there to try to force it to see sort as a string. Thanks again!

        Don't trust the syntax highlighting in your text editor as an authority on how perl itself will treat a script.

        Perl has been proven to be impossible to parse. (If you define "parse" to mean "determine the structure of without executing it". Clearly it is possible to determine the structure of Perl code if you actually execute it.) Text editors do their best, but sometimes fall short.

        The editors that tend to do the best highlighting for Perl in my experience are Padre and SciTE. With SciTE, the only Perl syntax that seems to consistently confuse it is:

        sub uppercase ($) { return uc $_[0]; }

        (Yes, I'm well aware that this is a useless function. It's just an example.) SciTE will highlight the $) in the prototype as if it were the $) built-in EGID variable.
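To see the two meanings side by side, here is a small sketch. The variable name `$egid` is made up for the example, and the value printed for it depends on the system.

```perl
use strict;
use warnings;

# Here ($) is a prototype: uppercase takes exactly one scalar argument.
sub uppercase ($) { return uc $_[0]; }

print uppercase('cpu'), "\n";

# prototype() returns the prototype as a string, which for this sub is a lone $.
print "prototype: ", prototype(\&uppercase), "\n";

# Outside a sub declaration, $) is the effective group ID variable.
my $egid = $);
print "EGID: ", $egid, "\n";
```

Perl itself has no trouble telling the two apart; only the highlighter is guessing.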

Re: regex issue
by choroba (Cardinal) on Feb 17, 2012 at 08:31 UTC
    Parentheses have a special meaning in regular expressions (capturing). You have to escape them to get the literal characters. Also, there are too many double quotes in your expression:
    $command =~m/\s*"sort\s+by\s+\((CPU|USER|PID|MEM|COMMAND)\)"/i
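The difference between grouping parentheses and escaped ones can be checked with a short sketch like this. The names `$grouped`, `$literal`, and `first_capture` are hypothetical, and the quotes are left out of both patterns to isolate the effect of the parentheses.

```perl
use strict;
use warnings;

# ( ) groups and captures; it matches no literal parenthesis characters.
my $grouped = qr/sort\s+by\s+(CPU|USER|PID|MEM|COMMAND)/i;
# \( \) match real parentheses in the text, around the captured word.
my $literal = qr/sort\s+by\s+\((CPU|USER|PID|MEM|COMMAND)\)/i;

# Hypothetical helper: the first capture, or undef when the pattern fails.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

for my $text ('sort by cpu', 'sort by (cpu)') {
    for my $pair ([grouped => $grouped], [literal => $literal]) {
        my ($name, $re) = @$pair;
        my $hit = first_capture($re, $text);
        printf "%-14s %-8s %s\n", $text, $name,
            defined $hit ? "captured '$hit'" : 'no match';
    }
}
```

The escaped form only matches input that actually contains ( and ) around the option, so which pattern is right depends on what the user is expected to type.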

      I was excited when I saw your post, but it still doesn't work correctly. It now only matches: sort by user, sort by pid, and sort by mem, but will not match sort by cpu or sort by command. Any other suggestions?

        OK, I probably do not understand what you are trying to match. Can you show several strings that should be matched and several that should not?
Re: regex issue
by Anonymous Monk on Feb 17, 2012 at 08:30 UTC
    Count the " characters in your regex; your source text does not have that many.