in reply to 2Re: Removing underline tags with regexp (is a good idea)
in thread Removing underline tags with regexp

Using an XML module to parse HTML is an exercise in long, excruciating, ultimately futile pain. There is a phenomenal amount of HTML that just isn't well-formed, so you aren't going to get anywhere useful this way with anything other than HTML you generate yourself.
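
To make that concrete, here is a minimal sketch in Python (not the Perl modules discussed in this thread) that feeds the same scrap of tag soup to a strict XML parser and to a lenient HTML parser. The sample markup and the names `TAG_SOUP`, `parses_as_xml`, `TextCollector`, and `text_via_html_parser` are made up for illustration; only the standard-library `xml.etree.ElementTree` and `html.parser` modules are real.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: misnested <u>/<b>, an unclosed <p>, and a bare <br>.
TAG_SOUP = "<p>Some <u>under<b>lined</u> text</b><br>more text"


def parses_as_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Lenient HTML parser that ignores tags and keeps the text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_via_html_parser(markup):
    """Extract the text content without requiring well-formed input."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered
    return "".join(collector.chunks)


if __name__ == "__main__":
    print("XML parser accepts it:", parses_as_xml(TAG_SOUP))
    print("HTML parser text:", repr(text_via_html_parser(TAG_SOUP)))
```

The XML parser gives up at the first mismatched tag, while the HTML parser keeps going and still hands back the text, which is the whole argument for using a tool built for HTML as it exists in the wild.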

4Re: Removing underline tags with regexp (is a good idea)
by jeffa (Bishop) on Sep 02, 2003 at 13:46 UTC
That code'll break quite a few web pages, making them render incorrectly (where "incorrectly" means they don't function any more) and in some cases changing the semantics of the page. (Not that you could necessarily intuit the semantics without knowing the various broken ways that each browser interprets the page, but...) There are a depressing number of pages that intentionally break standards because that's the only way to get them to render properly.