in reply to Re^3: Using regex to separate parameters
in thread Using regex to separate parameters

Thanks, ikegami, for your explanation. My script actually runs on Linux and parses data that it gets from the Windows registry. The input data doesn't follow any logic and may simply be malicious. I cannot use any of the Win32 modules that have an XS part. The command is ambiguous because it is intended to be ambiguous, to mislead users and standard programs such as antivirus software.

Right now I'm sticking with:

    sub find_parameters {
        my $input = $_[0];
        if ( $input =~ /^(.+?\.\w{3})$/ ) {
            # the whole string ends in a three-character extension
            print "only one executable: $input\n";
        }
        elsif ( $input =~ /^(.+?\.\w{3})( +)([\/\-].+)$/ && -f $1 ) {
            # an existing file followed by blanks and a /switch or -switch
            print "$1 with parameters $3\n";
        }
        elsif ( $input =~ /^"(.+)"$/ ) {
            # strip quotes and parse again
            find_parameters($1);
        }
        else {
            print "not parsed: $input\n";
        }
    }

Re^5: Using regex to separate parameters
by ikegami (Patriarch) on Jul 06, 2008 at 19:03 UTC
    Have you considered a 1:n solution, where one path leads to multiple possible parsings? I'm not sure what you are trying to do, but you might be able to treat all parsings as valid until proven otherwise.
      a 1:n solution

      ikegami, can you provide an example, please?

      Because I run this script on Linux, the problem with blanks in the file path is easily solved by substituting "\ " for each blank. Case-sensitive naming is not a problem on Linux either (unlike on Windows, where Text.txt, TEXT.TXT and text.txt all name the same file).
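A minimal sketch of that substitution, in case it helps anyone following along. The helper name `escape_blanks` and the sample path are made up for illustration. Note that the escaping only matters when the path is handed to a shell; Perl's own file tests such as `-f` take the path with plain blanks.

```perl
use strict;
use warnings;

# Hypothetical helper: put a backslash in front of every blank so that
# a shell treats the path as a single word.
sub escape_blanks {
    my ($path) = @_;
    ( my $escaped = $path ) =~ s/ /\\ /g;
    return $escaped;
}

# Illustrative path only, not taken from the original post.
print escape_blanks('/mnt/win/program files/foo bar.exe'), "\n";
```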

      I can parse MS-DOS (8.3) notation too:
      c:\windows\system32\tourst~1.exe
      with
      \w{1,6}~\d+\.\w{3}
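A quick way to try such a pattern against sample paths might look like the sketch below. The quantifier bounds reflect my assumption about 8.3 short names (at most six word characters before the `~`), the helper name `has_short_name` is invented, and the second path is only there as a counterexample.

```perl
use strict;
use warnings;

# Assumed shape of an 8.3 short name: 1-6 word characters, "~",
# one or more digits, ".", and a three-character extension.
my $short_name = qr/\w{1,6}~\d+\.\w{3}/;

# Hypothetical helper: true if the last path component looks like a short name.
sub has_short_name {
    my ($path) = @_;
    return $path =~ /(?:^|\\)$short_name\z/ ? 1 : 0;
}

for my $path ( 'c:\windows\system32\tourst~1.exe',
               'c:\windows\system32\notepad.exe' ) {
    printf "%-40s %s\n", $path,
        has_short_name($path) ? 'short name' : 'long name';
}
```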


      Here is another misleading example:

      start bar, where bar is an executable without an extension. But OK, this is off topic, outside the scope of my question about parameters.
        ikegami's 1:n solution is to take all possible interpretations and work with all of them. If you don't know whether 'foo bar' is 'foo.exe bar' or 'foo\ bar.exe', then give both back as solutions. This makes sure you don't overlook a valid interpretation, but it generates more false positives.
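Since an example was asked for, here is a rough sketch of what a 1:n parse could look like. It is only one reading of the idea, not code from ikegami: the function name `candidate_parsings` is invented, it splits at every run of blanks, and it does not try to append a missing extension such as `.exe`. Pruning the list (with `-f`, for instance) would be a separate step.

```perl
use strict;
use warnings;

# Hypothetical helper: return every way to split a command line into
# [executable, parameters] at a run of blanks, plus the reading in which
# the whole string is the executable and there are no parameters.
sub candidate_parsings {
    my ($input) = @_;
    my @candidates = ( [ $input, '' ] );
    while ( $input =~ / +/g ) {
        my $exe    = substr( $input, 0, $-[0] );    # text before the blanks
        my $params = substr( $input, $+[0] );       # text after the blanks
        next if $exe eq '' || $params eq '';        # skip leading/trailing blanks
        push @candidates, [ $exe, $params ];
    }
    return @candidates;
}

# Every candidate stays valid until something proves otherwise.
for my $c ( candidate_parsings('c:\program files\foo bar.exe /s') ) {
    printf "exe=[%s] params=[%s]\n", @$c;
}
```

Keeping all candidates costs more false positives, as noted above, but no interpretation is dropped before it has been checked.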