in reply to Re^2: Using regex to separate parameters
in thread Using regex to separate parameters

Quoting CreateProcess,

If you are using a long file name that contains a space, use quoted strings to indicate where the file name ends and the arguments begin; otherwise, the file name is ambiguous. For example, consider the string "c:\program files\sub dir\program name". This string can be interpreted in a number of ways. The system tries to interpret the possibilities in the following order:

c:\program.exe files\sub dir\program name
c:\program files\sub.exe dir\program name
c:\program files\sub dir\program.exe name
c:\program files\sub dir\program name.exe

So if you want to behave like CreateProcess, you'll have to include file tests in your regexp or solution.
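
As a rough illustration of what "include file tests" could look like, here is a minimal sketch. The names createprocess_candidates and resolve_command are hypothetical, the existence check is passed in as a callback so the caller can decide how to test a Windows path, and appending ".exe" only when the candidate has no extension is an assumption on my part. It covers only the space-splitting order quoted above, not everything CreateProcess does.

```perl
use strict;
use warnings;

# Hypothetical helper: list [executable, arguments] pairs in the order the
# quoted documentation describes, cutting the string at each space in turn.
sub createprocess_candidates {
    my ($cmd) = @_;
    my @cuts;
    my $pos = 0;
    while ((my $i = index($cmd, ' ', $pos)) >= 0) {
        push @cuts, [ substr($cmd, 0, $i), substr($cmd, $i + 1) ];
        $pos = $i + 1;
    }
    push @cuts, [ $cmd, '' ];    # last resort: the whole string is the program

    # Assumption: ".exe" is appended only when the name has no extension.
    return map {
        my ($prog, $args) = @$_;
        $prog .= '.exe' unless $prog =~ /\.\w+$/;
        [ $prog, $args ];
    } @cuts;
}

# Return the first candidate accepted by the caller-supplied test, or nothing.
sub resolve_command {
    my ($cmd, $exists) = @_;
    for my $candidate (createprocess_candidates($cmd)) {
        return @$candidate if $exists->($candidate->[0]);
    }
    return;
}
```

For example, resolve_command($cmd, sub { -f $_[0] }) would use a plain file test, while a lookup table of known files could stand in for it when the files live on another machine.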

But like I said above, this is a wrong answer, but it might be the wrong answer you're looking for.

It's wrong because it fails to handle "c:\program.exe files\sub" as "c:\program.exe files\sub.exe" if "c:\program.exe" exists.

It's wrong because it fails to handle "c:\program.exe files\sub" as "c:\program.exe" if "c:\program.exe" doesn't exist or is currently unavailable (say, due to network problems).

Re^4: Using regex to separate parameters
by resistance (Beadle) on Jul 06, 2008 at 18:38 UTC
    Thanks, ikegami, for your explanation. Actually, my script runs on Linux and parses data that it gets from the Windows registry. This input data doesn't follow any logic and can simply be malicious. I cannot use any of the Win32 modules that have an XS part. The command is ambiguous because it is intended to be ambiguous, to mislead users and standard programs like antivirus software.

    Right now I'm sticking with:
    sub find_parameters {
        my $input = $_[0];
        if ($input =~ /^(.+?\.\w{3})$/) {
            print "only one executable: $input\n";
        } elsif ($input =~ /^(.+?\.\w{3})( +)([\/\-].+)$/ && -f $1) {
            print "$1 with parameters $3\n";
        } elsif ($input =~ /^“(.+)”$/) {
            # strip quotes and parse again
            find_parameters($1);
        } else {
            print "not parsed: $input\n";
        }
    }
      Have you considered a 1:n solution, where one path leads to multiple possible parsings? I'm not sure what you are trying to do, but you might be able to treat all parsings as valid until proven otherwise.
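
One way a 1:n approach might look, as a minimal sketch only: every run of spaces is treated as a possible boundary between program and parameters, and a parsing is kept until some rule eliminates it. The names all_parsings and plausible_parsings are hypothetical, and the single elimination rule used here (the program part must end in a three-character extension, borrowed from the regex in the post above) is just an assumption for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: every way to cut $input into program + parameters.
sub all_parsings {
    my ($input) = @_;
    my @parsings = ({ program => $input, args => '' });    # no parameters at all
    while ($input =~ / +/g) {
        push @parsings, {
            program => substr($input, 0, $-[0]),    # text before the spaces
            args    => substr($input, $+[0]),       # text after the spaces
        };
    }
    return @parsings;
}

# Keep each parsing until it is proven wrong. The only rule applied here is
# the assumed one: the program part must end in a three-character extension.
sub plausible_parsings {
    my ($input) = @_;
    return grep { $_->{program} =~ /\.\w{3}$/ } all_parsings($input);
}

for my $p (plausible_parsings('c:\a.exe /x c:\b.exe -y')) {
    print "program: $p->{program}  parameters: $p->{args}\n";
}
```

Further rules, such as a file test or a blacklist of suspicious names, could be added as extra grep conditions, so the caller always gets the full list of parsings that have not yet been ruled out.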
        a 1:n solution

        ikegami can you provide an example please?

        Because I run this script on Linux, the problem with blanks in the file path is easily solved by substituting "\ ". Case-sensitive naming is not a problem on Linux either (although, unlike on Windows, Text.txt, TEXT.TXT and text.txt can co-exist in the same directory there).

        I can parse MS-DOS (8.3) notation too:
        c:\windows\system32\tourst~1.exe
        with
        \w{1,7}~\d+\.\w{3}


        Here is another misleading example:

        start bar, where bar is an executable without an extension. But OK, this is OT, outside the scope of my question about parameters.