The precondition for this code is that all input has already been validated and cleaned by an HTML::TokeParser filter. So the only tags it will see are well-formed, correctly nested, and free of tricky stuff like '"\>"'.
Given that, is this a sound approach to removing empty tags? Are there better ideas? I know it could be done within the TokeParser routines, but those are already a bit complex, with two big named loops, and I'd rather do it the easy way, if that's reasonable, than add more logic or shuffle tokens back and forth with unget_token().
1 while $body =~ s,<(\w+)[^>]*>\s*</\1\s*>,,g;

In reply to Empty tags cleaner regex (for pre-validated XHTML) by Your Mother
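To make the behaviour of the `1 while ... s///g` idiom concrete, here is a minimal sketch that wraps the one-liner from the post, unchanged, in a helper and runs it on a small nested sample. The name `strip_empty_tags` and the sample markup are made up for illustration; the input is assumed to be pre-validated as described in the post.

```perl
use strict;
use warnings;

# Hypothetical helper name; the substitution is the one-liner from the post.
sub strip_empty_tags {
    my ($body) = @_;
    # Each pass removes the innermost empty pairs. Removing them can leave
    # their parents empty, so repeat until a pass makes no substitution.
    1 while $body =~ s,<(\w+)[^>]*>\s*</\1\s*>,,g;
    return $body;
}

# Illustrative sample: <b> is empty, which in turn leaves the first <p> empty.
my $sample = '<div><p class="x"> <b></b> </p><p>kept</p></div>';
print strip_empty_tags($sample), "\n";   # prints <div><p>kept</p></div>
```

A single `s///g` pass would only remove the inner `<b></b>` here; the surrounding `1 while` is what lets the now-empty `<p class="x">` be removed on the next pass.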