Re: Regex - Matching prefixes of a word

Re^2: Regex - Matching prefixes of a word by SuicideJunkie (Vicar) on Jul 27, 2009 at 14:34 UTC
"Lock" can't really be a noise token, since it is the verb of the command (are you trying to lock, fire, enable, disable, etc those tubes?). "On" is definitely noise, and optional. While numbers must be numbers, the names can have numbers mixed in too. At the moment they could be entirely numbers, although I could restrict that. Still, how do you decide whether 'torpedo' is meant to be a ship component, or an object on the sensors? Lock countermeasure (on) torpedo? Or Lock (S.S.)Countermeasure (with) torpedo ? And then what happens when you come up to a port, and try to "dock lock" instead of "lock dock"? :) What I have gone with so far is a general layout of {verb} {listenerCategory {all\|list of numbers}} {parameters}. Sometimes there are no parameters ("enable all drives"), sometimes there are no listeners ("course 90"), but the order holds. The parameters are the most flexible part, but often limited to just one or two values anyways. Sometimes the noisewords might be required to disambiguate the command, such as when the parameters start with a number. I don't have any such situations in the current configuration, and I can't think of a good example. Even "load tube 1 2 3 40" would correctly match the 40 as your ordnance type and not torpedo tube #40, thanks to backtracking. Note: I don't actually check the correctness of the names or categories in the regex; provided that it looks like a specific command from a high level view, I capture the fields and check them in the addCommand() function. If the parameters are not named, then they have to be in a fixed order. ("helm 180 5" vs "helm course 180 speed 5" or "helm speed 5 course 180) The listeners are restricted to an exact match from a list of category synonyms. "Disable all R" does not shut down the reactor and the radar and the railgun in one go; you have to do those separately unless the config has defined a common category for them. 
By default, the config intends such a set of actions to cost multiple turns unless there is such a common category. Object targets, on the other hand, only select a single object from your sensor database hash, and that allows a fuzzy match, with a clarification response if there is not exactly one result.

As far as layout goes, I've got a ternary operator chain where I link each regex to a function call (addCommand()) that checks the params, munges them into a set of sub refs, and then queues the final result up for action when the turn ends. ("Belay that" is available to pop the order queue, just in case.) :)

Each command has its own regex or three to provide for alternate syntax without getting too line-noisy, and adding new commands is just a matter of tacking a new `$cmd =~ /regex/i ? addCommand('what', $1, $3, $2) :` type line onto the end of it.
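To illustrate the backtracking point about "load tube 1 2 3 40", here is a minimal sketch. The pattern, the `parse_load` helper, and the field names are assumptions made up for this example, not the game's actual code. The index list is greedy, but because a trailing ordnance number is mandatory, the regex engine gives the last number back.

```perl
use strict;
use warnings;

# Hypothetical pattern for "load <category> <indexes...> <ordnance>".
# ((?:\s+\d+)+) first grabs every number, then backtracks by one
# because \s+(\d+) still needs something to match.
my $load_re = qr/^load\s+([a-z]+)((?:\s+\d+)+)\s+(\d+)\s*$/i;

# Returns a hash ref describing the order, or nothing if the
# command does not look like a load command.
sub parse_load {
    my ($cmd) = @_;
    return unless $cmd =~ $load_re;
    my ($category, $index_text, $ordnance) = ($1, $2, $3);
    my @indexes = split ' ', $index_text;   # ' ' skips leading whitespace
    return {
        category => $category,
        indexes  => \@indexes,
        ordnance => $ordnance,
    };
}

my $order = parse_load('load tube 1 2 3 40');
print "tubes @{ $order->{indexes} } get ordnance type $order->{ordnance}\n";
```

With this pattern a command such as "load tube 40" would not match at all, since one number cannot be both an index and an ordnance type; whether that should be an error or a default is a design decision left open here.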
Re^3: Regex - Matching prefixes of a word by Marshall (Canon) on Jul 28, 2009 at 03:13 UTC
WOW! Sounds like a very cool game! I agree with CountZero that this thing doesn't appear to lend itself well to a single regex per command entry. I suspect that you will wind up doing a first pass that just identifies the type of each token on the command line: alpha, numeric, or alphanumeric. So, for example, from what I can tell an alphanumeric thing is always wrong ("Speed10", "10speed"), or maybe not? I'm guessing "speed 10" and "10 speed" would be OK, but a "run-together" thing combining 'speed' and '10' isn't valid. If you put regex code that says that "10speed" isn't valid into each command "regex", this could get to be a very big mess! I think that you are into quite a bit more than regexes and will need a parser for the command line. But depending upon the grammar, it may not be that complex and is perhaps even easier than the regex-type approach.
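A first pass of the kind suggested above might look roughly like this. It is only a sketch: the `classify_tokens` name, the token hash layout, and the 'other' bucket are assumptions for illustration, and a real grammar would decide what to do with each type afterwards.

```perl
use strict;
use warnings;

# Split a command line on whitespace and label each token as
# alpha, numeric, alphanumeric, or other (punctuation etc.).
sub classify_tokens {
    my ($line) = @_;
    my @tokens;
    for my $text (split ' ', $line) {
        my $type = $text =~ /\A[A-Za-z]+\z/    ? 'alpha'
                 : $text =~ /\A[0-9]+\z/       ? 'numeric'
                 : $text =~ /\A[A-Za-z0-9]+\z/ ? 'alphanumeric'
                 :                               'other';
        push @tokens, { type => $type, text => $text };
    }
    return @tokens;
}

my @tokens = classify_tokens('helm course 180 speed10');
print join(' ', map { "$_->{type}:$_->{text}" } @tokens), "\n";
```

A later pass could then reject or split the alphanumeric tokens in one place, instead of repeating that check inside every command regex.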
Re^4: Regex - Matching prefixes of a word by SuicideJunkie (Vicar) on Jul 28, 2009 at 13:23 UTC
I'm not sure I follow where you're coming from when you say '...put regex code that says that "10speed" isn't valid into each command "regex"...'

The idea is that if the regex matches the string, then I immediately know that the command is a $verb, with parameters $1, $2, $3, $4 etc., and can pass that right along with one function call. Only if I can't recognize anything does the doctor prompt you to use the help command. I'm not trying to filter anything out, just to figure out what they meant as best I can without too much effort. The regexes are generous with matching, although the parameters extracted are subject to more scrutiny, and nothing dangerous is actually done with the data in the end.

I've got just 39 lines, covering 16 types of command and the various ways to say them. Here is an example snippet from the fire command. I plan to compact the 'all X' vs 'X (all)?' vs 'X Indexes' into a single alternation that I can reuse as a '$componentSelection', but only after I've gotten these gunners to shoot in the right direction.

`# FIRE!!!`
`$cmd =~ /^$regexSubstringOf{fire}\s+$regexName(?:\s+(all))?\s*$/i ? setCommand($player, 'fire', [], $1, 'all') :`
`$cmd =~ /^$regexSubstringOf{fire}\s+(all)?\s+$regexName\s$/i ? setCommand($player, 'fire', [], $2, 'all') :`
`$cmd =~ /^$regexSubstringOf{fire}\s+$regexName$regexIndexes\s$/i ? setCommand($player, 'fire', [], $1, $2) :`

PS: I'll add 'replace "\s+" with "\s*" where not required' to the list, so you can punch in things like "angle45dispersion15" and "speed10". Since the pieces of that parameter subsection are known to be alpha-only and number-only, there is no ambiguity. Spaces required only where required.
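For readers wondering what something like `$regexSubstringOf{fire}` could contain, here is one possible sketch of a prefix-matching pattern feeding a ternary chain. Everything in it (`prefix_regex`, `%prefix_of`, `$name_re`, `queue_command`, `dispatch`) is a hypothetical stand-in; the thread does not show how the real hash or setCommand() are implemented.

```perl
use strict;
use warnings;

# Build a pattern matching any non-empty prefix of $word,
# e.g. "fire" becomes f(?:i(?:r(?:e)?)?)?
sub prefix_regex {
    my ($word) = @_;
    my @chars = split //, $word;
    my $pattern = '';
    # Wrap from the last letter outwards so each extra letter is optional.
    for my $ch (reverse @chars[1 .. $#chars]) {
        $pattern = '(?:' . quotemeta($ch) . $pattern . ')?';
    }
    return quotemeta($chars[0]) . $pattern;
}

my %prefix_of = map { $_ => prefix_regex($_) } qw(fire help);
my $name_re   = qr/([a-z]+)/i;    # assumed shape of a component name

my @orders;                        # stand-in for the order queue
sub queue_command {
    my @fields = @_;
    push @orders, \@fields;
    return "queued $fields[0]";
}

# Each line pairs one regex with one call; the final branch is the
# "did not recognize anything" fallback.
sub dispatch {
    my ($cmd) = @_;
    return
        $cmd =~ /^$prefix_of{fire}\s+$name_re\s*$/i ? queue_command('fire', $1, 'all') :
        $cmd =~ /^$prefix_of{help}\s*$/i            ? 'help requested' :
                                                      'unrecognized, try "help"';
}

print dispatch('fi lasers'), "\n";
print dispatch('fired lasers'), "\n";
```

Note that a bare "f" or "h" also matches with this approach, so two verbs sharing a first letter would be resolved by their order in the chain; requiring a minimum prefix length is one way around that.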