Mine verify doctypes and even convert from a doctype to another (for instance from html 4.01 transitional to strict, or to xhtml 1.0 strict, etc). It also corrects character encodings IIRC.

Tidy does some nice things, but it doesn't do the checks that the W3C validator does - it's not doing a full parse of the HTML based on the docs.

For example the W3C Markup Validation Service will catch the error in this broken XHTML:

<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-s +trict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>A random test file</title> </head> <body> <P>Oops - XHTML tags must be in lower case</p> </body> </html>

but tidy thinks it's fine:

% tidy -e test.html Info: Doctype given is "-//W3C//DTD XHTML 1.0 Strict//EN" Info: Document content looks like XHTML 1.0 Strict No warnings or errors were found. To learn more about HTML Tidy see http://tidy.sourceforge.net Please send bug reports to html-tidy@w3.org HTML and CSS specifications are available from http://www.w3.org/ Lobby your company to join W3C, see http://www.w3.org/Consortium

I'm not saying tidy isn't useful. It's a great tool from basic checks and fixes on HTML. But it doesn't validate (X)HTML.


In reply to Re^5: Wanted, more simple tutorials on testing by adrianh
in thread Wanted, more simple tutorials on testing by punkish

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.