In the HTML parsers I've written, I've always gone loose, for several reasons.
- I've seen a lot of malformed HTML.
- I see a lot of odd "HTMLish" tags embedded for processing / templating.
- I've embedded such tags myself.
Personally, I would just make the "strictness" a method you could call so you can have it both ways (carp or croak). The other thing I've done in the parsers I've written is to allow tags to be specified as 'tag' and as '/tag' as well, so the problem can be circumvented altogether. The latter fits my thinking well.
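
To make that concrete, here is a rough sketch of both ideas. It is in Python rather than Perl, so warnings.warn stands in for carp and a raised ValueError stands in for croak. The names LooseParser, set_strict and on are made up for illustration, and reading 'tag' / '/tag' as separate handler keys is only my guess at one way to do it. It is not taken from any existing module.

```python
import re
import warnings

# Deliberately loose tag matcher: an optional slash, a name, then anything up to '>'.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w:-]*)[^>]*>")


class LooseParser:
    """Hypothetical parser: loose by default, strict on request."""

    def __init__(self, strict=False):
        self.strict = strict
        self.handlers = {}   # keyed by 'tag' or '/tag'
        self.open_tags = []

    def set_strict(self, flag=True):
        # Strictness is just a method call, so callers can have it both ways.
        self.strict = flag

    def on(self, name, callback):
        # Register a handler for 'tag' (open) or '/tag' (close) independently.
        self.handlers[name.lower()] = callback

    def _complain(self, message):
        if self.strict:
            raise ValueError(message)   # croak-like: fatal
        warnings.warn(message)          # carp-like: warn and carry on

    def parse(self, text):
        self.open_tags = []
        for match in TAG_RE.finditer(text):
            closing, name = match.group(1), match.group(2).lower()
            if closing:
                if name in self.open_tags:
                    # Pop back to the most recent matching open tag,
                    # silently dropping anything left open in between.
                    while self.open_tags.pop() != name:
                        pass
                else:
                    self._complain(f"unexpected </{name}>")
            elif not match.group(0).endswith("/>"):
                self.open_tags.append(name)
            handler = self.handlers.get(closing + name)
            if handler:
                handler(name)
        for name in self.open_tags:
            self._complain(f"unclosed <{name}>")


if __name__ == "__main__":
    parser = LooseParser()
    parser.on("b", lambda name: print("open", name))
    parser.on("/b", lambda name: print("close", name))
    parser.parse("<b>bold</b></i>")   # warns about the stray </i>
    parser.set_strict(True)           # the same input would now raise
```

Since the open and close handlers are registered separately, a caller who only cares about 'tag' never has to depend on a matching '/tag' turning up. Note that this sketch knows nothing about void elements such as br, so it would report them as unclosed.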
-Lee
"To be civilized is to deny one's nature."