I'm not sure how you would 'tokenize' it in the first place. Would that not need a regex as well?
It sure would, but the point is that it would need one small regex per possible token type, not one huge regex that solves the whole problem.
Usually I use the tokenizer from Math::Expression::Evaluator::Lexer (don't let the name fool you; it's good for more than mathematical expressions), from which you could draw inspiration.
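To make the "one regex per token type" idea concrete, here is a minimal sketch. It is not the API of Math::Expression::Evaluator::Lexer; the token names (OPEN_TAG, CLOSE_TAG, TEXT), the patterns, and the tokenize function are all made up for illustration, and real HTML needs many more cases.

```perl
use strict;
use warnings;

# Hypothetical token table: one small regex per token type, tried in order.
my @TOKEN_TYPES = (
    [ CLOSE_TAG => qr{</[A-Za-z][A-Za-z0-9]*\s*>} ],
    [ OPEN_TAG  => qr{<[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?>} ],
    [ TEXT      => qr{[^<]+} ],
);

sub tokenize {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length $input) {
        my $matched = 0;
        for my $type (@TOKEN_TYPES) {
            my ($name, $re) = @$type;
            # \G anchors at the current position; /c keeps pos() on failure.
            if ($input =~ /\G($re)/gc) {
                push @tokens, [ $name, $1 ];
                $matched = 1;
                last;
            }
        }
        die sprintf("Cannot tokenize input at offset %d\n", pos($input))
            unless $matched;
    }
    return @tokens;
}

print "$_->[0]: $_->[1]\n"
    for tokenize('<p class="x">Hello <b>world</b></p>');
```

Each pattern stays short enough to check by eye, and anything that matches none of them fails loudly at a known offset instead of being silently skipped.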
And don't use .* in your regexes; that's almost always an error, because it is greedy and tends to match far more than you intended. See Death to Dot Star!
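A small illustration of the problem, using a made-up input string (the variable names are only for this example):

```perl
use strict;
use warnings;

my $html = '<b>one</b> and <b>two</b>';

# Greedy .* runs on to the LAST </b>, swallowing everything in between.
my ($greedy) = $html =~ m{<b>(.*)</b>};

# A negated character class cannot cross into the next tag.
my ($bounded) = $html =~ m{<b>([^<]*)</b>};

print "greedy:  $greedy\n";   # one</b> and <b>two
print "bounded: $bounded\n";  # one
```

Saying what may appear between the delimiters (here, anything except '<') is usually both more correct and easier to reason about than .* or even .*?.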
In reply to Re^3: Regex within html by moritz, in thread Regex within html by ropey