in reply to Re^3: Keep quotes around numerical attributes after parsing with HTML::Treebuilder?
in thread Keep quotes around numerical attributes after parsing with HTML::Treebuilder?

Well "...ish".

You can't write HTML 4.01 that conforms to XML standards, but XHTML 1.0 is a language that reimplements HTML 4.01 in XML.

The obvious change to give as an example is that <br> in HTML is <br/> in XHTML - but if you tried to use <br/> in HTML then it would mean the same as <br/>> (or a line break followed by a greater than symbol).

So you can't just output XHTML and then slap an HTML Doctype on it. (Heck, it's not really safe to serve XHTML as text/html, despite what it says in Appendix C of the XHTML 1.0 Spec.)

Re^5: Keep quotes around numerical attributes after parsing with HTML::Treebuilder?
by CountZero (Bishop) on Jul 19, 2005 at 16:07 UTC
    if you tried to use <br/> in HTML then it would mean the same as <br/>> (or a line break followed by a greater than symbol)
    Are you sure? I tried <br/> in a small HTML-file and had it validated by the "official" W3C-validator at http://validator.w3.org.
    The uploaded file was tentatively found to be Valid. That means it would validate as HTML 4.01 Strict if you updated the source document to match the options used (typically this message indicates that you used either the Document Type override or the Character Encoding override).

    Source Listing

    Below is the source input I used for this validation:

        <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
        <html>
        <head><title>TITLE</title></head>
        <body><p>TEST<br/>TEST</p></body>
        </html>
    I don't think you can "escape" characters with special meaning in (X)HTML by using a slash. You must use entities for that.
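For illustration, here is a minimal sketch of what "use entities" amounts to in practice. It uses Python's standard `html.escape`; the sample string `text` is made up for this example and is not from the thread.

```python
from html import escape

# Hypothetical sample text containing a literal greater-than sign.
text = "TEST > TEST"

# escape() replaces &, < and > (and, by default, quotes) with entities,
# so the > arrives in the browser as character data, not as markup.
print(escape(text))  # TEST &gt; TEST
```

The point is only that a literal > is written as &gt; in the markup; a slash plays no part in escaping it.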

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      if you tried to use <br/> in HTML then it would mean the same as <br/>> (or a line break followed by a greater than symbol)
      Are you sure? I tried <br/> in a small HTML-file and had it validated by the "official" W3C-validator at http://validator.w3.org.

      Oh, it's valid - it just means something different - a line break followed by a greater than sign (rather than just a line break). Anywhere you can have a line break you can have character data, and a greater than character is character data.

      I did make a typo in my previous comment though.

      In HTML <br/> means the same as <br>>, not <br/>> (because <br/ means the same as <br> - SGML is complicated)
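To make the reading described above concrete, here is a toy sketch of it. It is not an SGML parser: it only rewrites the single pattern under discussion (an attribute-less empty element written with a trailing slash), and the function name `sgml_reading` and the short element list are assumptions made for this example.

```python
import re

# Toy model only: under the reading described in this post, "<br/" already
# closes the tag, so it is equivalent to "<br>", and whatever follows
# (including the next ">") is ordinary character data.
EMPTY_ELEMENTS = ("br", "hr")  # illustrative subset, chosen for this sketch

_NET_START = re.compile(r"<(%s)/" % "|".join(EMPTY_ELEMENTS), re.IGNORECASE)

def sgml_reading(markup: str) -> str:
    """Rewrite '<br/' as '<br>', leaving the rest of the input untouched."""
    return _NET_START.sub(lambda m: "<%s>" % m.group(1), markup)

print(sgml_reading("TEST<br/>TEST"))  # TEST<br>>TEST
```

So the XHTML-style `<br/>` comes out as `<br>` followed by a stray `>`, which is the "line break followed by a greater than sign" reading. Whether a given browser actually renders it that way is a separate question.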

        <br/> doesn't show a line break followed by a greater than symbol in version 1.0.5 of Firefox. To do that I have to effectively double up the > symbol.

        <br/ slurps everything until the next > into the line break, so it is definitely not the same as <br/>!

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law