I personally believe you're getting quite repetitive: you admit to having "too small a knowledge" yet still think "there should be a regex solution", and that conflicts with the fact that virtually everybody knowledgeable on the subject knows regexen are a poor tool for parsing HTML. And if your problem isn't strictly HTML parsing but is of comparable complexity, then they're just as poor a tool for it, and a good enough solution is more likely to be code comprising, e.g., several regexen plus other logic. The point is that regexen are powerful, and have long been able to match things that are far from "regular" in the formal sense, but some quite common formats are still too irregular for them.

It is my understanding that Perl 6's rules will be powerful enough to parse HTML (and to do some other cool things, the most famous of which is parsing Perl 6 itself!), so they should be able to parse your other thingy too, whatever it is. What's more, they will do so in a very clear and readable manner, organized in grammars. Until then, if you really want to, you can get some fraction of that power out of Perl 5's regexen, which may or may not be enough depending on the actual situation. But in your real-world case, I suspect the solution would come out either severely unreliable or horrible-looking, or somewhere between those two extremes.
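
To make the "too irregular" point concrete, here is a minimal sketch, written in Python rather than Perl purely for illustration. The sample string and the names `regex_div_contents`, `OuterDivText` and `parser_div_text` are all made up for this example; it is not anybody's production code. It only shows one way a plain regex can go wrong as soon as tags nest, while a real parser plus a little extra logic (here, a depth counter) copes.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: one <div> nested inside another.
SAMPLE = "<div>outer <div>inner</div> tail</div>"


def regex_div_contents(html):
    """Naive attempt: grab everything between the first <div> and the next </div>."""
    m = re.search(r"<div>(.*?)</div>", html, re.S)
    return m.group(1) if m else None


class OuterDivText(HTMLParser):
    """Collect the text found inside any <div>, tracking nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <div> elements are currently open
        self.chunks = []    # text pieces seen while inside at least one <div>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def parser_div_text(html):
    """Feed the document to the parser and join the collected text."""
    p = OuterDivText()
    p.feed(html)
    p.close()
    return "".join(p.chunks)


if __name__ == "__main__":
    # The regex stops at the first </div>, which closes the inner element.
    print(repr(regex_div_contents(SAMPLE)))
    # The parser keeps track of nesting and sees the whole outer element.
    print(repr(parser_div_text(SAMPLE)))
```

The regex version hands back a fragment that ends in the middle of the outer element, because the first `</div>` it meets belongs to the inner one. You can patch that particular case, but each patch tends to break on the next bit of nesting, attributes or comments, which is the "severely unreliable or horrible-looking" trade-off in miniature.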