in reply to Regex bafflement

You're modifying $text, but looping over the initial set of matches. From the way you described the problem, this might not matter, but I'd put it in a while loop anyway, so that each pass matches against the current contents of $text, and see whether that makes a difference (e.g. while (my ($match) = $text =~ /(foo)/s) { ... }).
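
To make the idea concrete, here is a minimal sketch of that kind of loop. The sample string, the /(foo)/ pattern and the "qux" replacement are all placeholders I've made up, since I don't know your real data; the point is only that the match is re-run against the modified $text on every pass.

```perl
use strict;
use warnings;

# Hypothetical input and pattern; substitute your own.
my $text  = "foo bar foo baz foo";
my $count = 0;

# A list assignment in scalar (boolean) context yields the number of
# values the match returned, so the loop ends once the pattern fails.
while (my ($match) = $text =~ /(foo)/s) {
    # Each pass has to change $text so that this match cannot recur,
    # otherwise the loop never terminates.
    $text =~ s/\Q$match\E/qux/;
    $count++;
}

print "$count replacements: $text\n";   # 3 replacements: qux bar qux baz qux
```

Note that the loop only terminates because the replacement text no longer matches the pattern. If your replacement can itself match, you'd need a different stopping condition.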

I also note, though, that the HTML you've shown us isn't necessarily invalid, if we assume a "transitional" doctype and a suitable container so that the dangling paragraph gets auto-closed. If you haven't yet seen what an HTML parser actually does with it, that might still be an option.