in reply to Re^2: Regular Expression
in thread Regular Expression

And how do you find the closing tag?

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re^4: Regular Expression
by Transient (Hermit) on Jun 28, 2005 at 19:36 UTC
    Even though I believe this is a rhetorical question...
    m!</body>!
    Yes, with multiple closing body tags it will be icky, and with one enclosed in comments it will be double super secret fudgy icky. But the OP wants a regexp to find the opening tag (and actually didn't even ask about the closing tag!), so like Burger King, "Have it Your Way".
      Multiple closing body elements are illegal, so multiple closing body tags may not be a problem. The real problem is that </body> could appear in a comment or in quotes.
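
      A minimal sketch of that failure mode, assuming a made-up sample string ($html is hypothetical, not from the OP's data). It only counts where the naive pattern fires; it does not try to fix anything:

      ```perl
      use strict;
      use warnings;

      # Hypothetical sample: only the last </body> is a real closing tag.
      # The first sits in a comment, the second in a quoted attribute value.
      my $html = '<body><!-- </body> --><p title="</body>">text</p></body>';

      # Record the start offset of every place the naive pattern matches.
      my @offsets;
      while ($html =~ m!</body>!g) {
          push @offsets, $-[0];
      }

      print scalar(@offsets), " matches at offsets @offsets\n";
      ```

      The pattern reports three hits here although the document has one closing tag, which is why a parser is usually the safer tool once comments and quoting matter.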
      Nope, not rhetorical, just a trap!

      This is valid XHTML:

      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
          <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type" />
          <title>test</title>
        </head>
        <body/>
      </html>

      No </body> tag, so a naive regex breaks again.
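
      A quick sketch of the check, assuming a shortened version of the document above ($xhtml is a hypothetical one-line copy without the DOCTYPE and meta tag). It only shows that the naive closing-tag pattern finds nothing while the self-closed form is there:

      ```perl
      use strict;
      use warnings;

      # Hypothetical one-line copy of the sample document.
      my $xhtml = '<html xmlns="http://www.w3.org/1999/xhtml">'
                . '<head><title>test</title></head><body/></html>';

      # The naive closing-tag pattern has nothing to match.
      my $has_close = $xhtml =~ m!</body>! ? 1 : 0;

      # The body element is present, but only in its self-closed form.
      my $has_empty = $xhtml =~ m!<body\s*/>! ? 1 : 0;

      print "closing tag found: $has_close\n";
      print "self-closed body found: $has_empty\n";
      ```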

      Of course you are right, the OP only wanted to match the <body> tag, not the body element.

      CountZero

        Funny you should mention that (we were discussing this with the <p> tag) - it's not valid because body isn't defined with an EMPTY content model (as br, hr, etc. are).

        w3c specs shown here
Re^4: Regular Expression
by ikegami (Patriarch) on Jun 28, 2005 at 19:36 UTC
    The OP didn't ask for that. He asked to match the opening tag, no matter what it had for attributes.
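
    For what was actually asked, one possible sketch follows. The pattern and the name $open_body are my own assumptions, not something posted in this thread. It allows any attributes, including quoted values that contain ">", but it will still happily match inside comments or CDATA:

    ```perl
    use strict;
    use warnings;

    # Hypothetical pattern: "<body", a word boundary, then any run of
    # non-quote/non-">" characters or complete quoted strings, then ">".
    my $open_body = qr{<body\b(?:[^>"']|"[^"]*"|'[^']*')*>}i;

    my @samples = (
        '<body>',
        q{<BODY class="main" onload="check(a > b)">},
        '<bodyguard>',    # different element name, should not match
    );

    for my $tag (@samples) {
        if ($tag =~ /($open_body)/) {
            print "matched:  $1\n";
        }
        else {
            print "no match: $tag\n";
        }
    }
    ```

    The \b keeps longer element names such as the made-up <bodyguard> from matching, and the quoted-string alternatives stop a ">" inside an attribute value from ending the tag early.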