What kind of errors do you want? If you want to find malformed HTML, HTML::Tidy is the better choice, because HTML::TreeBuilder will automatically correct much of the HTML.

Re^4: How to parse HTML5?
by NRan (Novice) on Mar 08, 2016 at 12:20 UTC

    No, I don't want auto-correction.

    I want to make the corrections manually.

    Suppose a <p> is missing: it should only give me an error log, not correct it.

    Thanks
    Nikhil Ranjan

      If you just want the errors, then maybe use tidy directly:

      #!perl
      use strict;
      use warnings;

      # tidy messages matching any of these words are suppressed
      my $text = join '|', qw(DOCTYPE html meta header);
      my $re   = qr/$text/;

      my $filename = 'd:/perl/test.xhtml';
      my $tidy     = '..../tidy/bin/tidy.exe'; # change to your path

      # -e: show only errors and warnings, -q: quiet; 2>&1 also captures stderr
      my @msg = qx"$tidy -eq -utf8 $filename 2>&1";
      for (@msg) {
          print $_ unless /$re/;
      }
      # line 10 column 1 - Warning: missing </section>
      poj
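
If the report is needed as data instead of printed text, the message format shown in the script's last comment can be split into fields. The following is only a sketch: the helper names parse_tidy_line and filter_messages and the sample lines are hypothetical, and it assumes every diagnostic follows the "line N column M - Type: text" shape. Unlike the script above, it applies the skip pattern to the message text only, not to the whole line.

```perl
#!perl
use strict;
use warnings;

# Parse one tidy diagnostic of the assumed form
#   line 10 column 1 - Warning: missing </section>
# Returns a hash ref, or undef if the line has a different shape.
sub parse_tidy_line {
    my ($line) = @_;
    return unless $line =~ /^line (\d+) column (\d+) - (\w+): (.*)$/;
    return { line => $1, column => $2, type => $3, text => $4 };
}

# Keep only the parsed messages whose text does not match $skip_re.
sub filter_messages {
    my ($skip_re, @lines) = @_;
    my @kept;
    for my $raw (@lines) {
        (my $l = $raw) =~ s/\r?\n\z//;          # strip the line ending
        my $msg = parse_tidy_line($l) or next;  # ignore non-diagnostic lines
        next if $msg->{text} =~ $skip_re;
        push @kept, $msg;
    }
    return @kept;
}

# Hypothetical sample input standing in for the output of qx"...".
my @sample = (
    "line 1 column 1 - Warning: missing <!DOCTYPE> declaration\n",
    "line 10 column 1 - Warning: missing </section>\n",
);
my $skip = qr/DOCTYPE|html|meta|header/;

printf "%d:%d %s: %s\n", @{$_}{qw(line column type text)}
    for filter_messages($skip, @sample);
# prints: 10:1 Warning: missing </section>
```

In a real run, @sample would be replaced by the @msg list captured from tidy, and the hash refs could then be sorted by line number or grouped by type before fixing the file by hand.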

        Thanks for this one

        But when I tried it, I got an error.

        The error message is:

        The program can't start because VCRUNTIME140.dll is missing from your computer. Try reinstalling the program to fix this problem.

        Then I downloaded "https://www.microsoft.com/en-us/download/details.aspx?id=48145" and installed it on my Windows 7 machine, but I still get the same problem. Do you have any idea? Thanks
        Nikhil Ranjan