I second JohnGG's parser approach, and would go further: some people seem to have a questionable hobby of trying to solve everything with a single regexp, creating a real Godzilla of a regexp in the process that will be rather difficult to maintain. The advantage of a parser is that the code follows directly from the language rules. A parser normally needs two helpers anyway: a routine to skip past flexible whitespace and comments, and a lexer to fetch and identify language elements such as a quoted string, an operator or an identifier, since several alternatives may be allowable at each step in the parser's run.
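To make that concrete, here is a minimal sketch of the skipper / lexer / parser split. The toy language is purely my own assumption for illustration (`#` comments, double-quoted strings, identifiers, and statements of the form `name = value;`), and the names `skip`, `tokens` and `parse_assignments` are hypothetical. It still uses small regexps, but only one per token kind rather than one for the whole language.

```python
import re

# Assumed toy language, not from the original question:
# '#' line comments, double-quoted strings, identifiers, a few operators.
SKIP = re.compile(r'(?:\s+|#[^\n]*)*')           # whitespace and comments
TOKEN = re.compile(r'''
    (?P<string>"(?:[^"\\]|\\.)*")                # quoted string with escapes
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)            # identifier
  | (?P<op>[=+\-*/();])                          # single-character operator
''', re.VERBOSE)


def skip(text, pos):
    """Return the first position at or after pos that is not whitespace or a comment."""
    return SKIP.match(text, pos).end()


def tokens(text):
    """Lexer: return a list of (kind, text) pairs for the whole input."""
    out = []
    pos = skip(text, 0)
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        out.append((m.lastgroup, m.group()))     # lastgroup names the alternative that matched
        pos = skip(text, m.end())
    return out


def parse_assignments(text):
    """Parser for the rule: statement := ident '=' (string | ident) ';'"""
    toks = tokens(text) + [("eof", "")]
    i = 0
    result = []

    def expect(*kinds):
        # Several token kinds may be allowable at a given step.
        nonlocal i
        kind, value = toks[i]
        if kind not in kinds:
            raise SyntaxError(f"expected {' or '.join(kinds)}, got {kind} {value!r}")
        i += 1
        return kind, value

    def expect_op(symbol):
        _, value = expect("op")
        if value != symbol:
            raise SyntaxError(f"expected {symbol!r}, got {value!r}")

    while toks[i][0] != "eof":
        _, name = expect("ident")
        expect_op("=")
        kind, value = expect("string", "ident")
        expect_op(";")
        result.append((name, kind, value))
    return result
```

Each function maps onto one rule of the (assumed) grammar, so adding a new statement form means adding a branch in the parser rather than re-engineering one large pattern. Note that a `#` inside a quoted string is not treated as a comment, because the skipper only runs between tokens.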