I was thinking of a similar type of project. Long ago
japhy posted some suggestions for how to avoid
using regexes unless absolutely necessary, because
often index() or substr() or tr/// will do the job,
and do it faster and with less reliance on a more
cryptic mini-language.
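
To give a rough idea of what I mean, here is a minimal sketch. These are my own illustrative examples, not japhy's original suggestions, and the sample string and variable names are made up. Each line shows the regex it could stand in for.

```perl
use strict;
use warnings;

my $str = "hello world";    # made-up sample data

# Does the string contain "world"?  Regex version: $str =~ /world/
# index() returns -1 when the substring is not found.
my $has_world = index($str, "world") >= 0;

# Does it start with "hello"?  Regex version: $str =~ /^hello/
my $starts_hello = substr($str, 0, 5) eq "hello";

# How many l's and o's are there?  Regex version: my $n = () = $str =~ /[lo]/g
# With an empty replacement list and no /d, tr/// only counts
# and leaves the string unchanged.
my $count = ($str =~ tr/lo//);

print "contains world: ", ($has_world    ? "yes" : "no"), "\n";
print "starts w/hello: ", ($starts_hello ? "yes" : "no"), "\n";
print "l/o count:      $count\n";
```

Whether any of these is actually faster than the regex depends on the data and the perl you are running, so it is worth timing both with the Benchmark module before committing to one.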