astronogun has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I would like to know how to arrange or align the text parsed from an HTML file so that it is in order.

I have this "samplehtml.txt" with the following contents:

<hr><div >
<h1>Dave</h1>
<h2>PASS</h2>
<table border=0>
<tr><td><b>Remarks:</b></td><td></td></tr>
<tr><td>Good Grade!</td></tr>
</table>
</div>
<hr><div class="fail">
<h1>Bryan</h1>
<h2>FAIL</h2>
<table border=0>
<tr><td><b>Remarks:</b></td><td></td></tr>
<tr><td>Bad Grade!</td><td>
</table>
</div>
<hr><div >
<h1>Dan</h1>
<h2>PASS</h2>
<table border=0>
<tr><td><b>Remarks:</b></td><td></td></tr>
<tr><td>Good Grade!</td></tr>
</table>
</div>
<hr><div class="fail">
<h1>Val</h1>
<h2>FAIL</h2>
<table border=0>
<tr><td><b>Remarks:</b></td><td></td></tr>
<tr><td>Bad Grade!</td><td>
</table>
</div>

This is my parsing code, which outputs only the "FAIL" entries:

use HTML::TokeParser;

open($a, "samplehtml.txt") or die("cannot open infile: $!");
$p = HTML::TokeParser->new($a);

while (my $token = $p->get_tag("div")) {
    $text = $p->get_text("/div");
    foreach ($text) {
        @body = grep /FAIL/, $text;
        print @body, "\n";
    }
}

The output in cmd shows the FAIL entries with their contents:

Bryan FAIL Remarks: Bad Grade! Val FAIL Remarks: Bad Grade!

What I want to achieve is to align them for better viewing. I hope you can help me with this.

The result that I want is something like this:

Bryan FAIL Remarks: Bad Grade! Val FAIL Remarks: Bad Grade!

Thank you monks!

Re: How to arrange or align parsed HTML
by tobyink (Canon) on May 23, 2012 at 06:57 UTC
    use HTML::HTML5::Parser;
    use HTML::HTML5::ToText;

    print HTML::HTML5::ToText -> new -> process(
        HTML::HTML5::Parser->load_html(IO => \*DATA)
    );
    __DATA__
    <hr><div >
    <h1>Dave</h1>
    <h2>PASS</h2>
    <table border=0>
    <tr><td><b>Remarks:</b></td><td></td></tr>
    <tr><td>Good Grade!</td></tr>
    </table>
    </div>
    <hr><div class="fail">
    <h1>Bryan</h1>
    <h2>FAIL</h2>
    <table border=0>
    <tr><td><b>Remarks:</b></td><td></td></tr>
    <tr><td>Bad Grade!</td><td>
    </table>
    </div>
    <hr><div >
    <h1>Dan</h1>
    <h2>PASS</h2>
    <table border=0>
    <tr><td><b>Remarks:</b></td><td></td></tr>
    <tr><td>Good Grade!</td></tr>
    </table>
    </div>
    <hr><div class="fail">
    <h1>Val</h1>
    <h2>FAIL</h2>
    <table border=0>
    <tr><td><b>Remarks:</b></td><td></td></tr>
    <tr><td>Bad Grade!</td><td>
    </table>
    </div>
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: How to arrange or align parsed HTML
by Anonymous Monk on May 23, 2012 at 04:50 UTC

      Hi

      The "trim leading & trailing whitespace" did the trick. Thanks for the help.
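
For anyone landing here later, the whitespace trimming mentioned above can be sketched roughly as follows. This is only a minimal sketch, assuming the goal is one field per line: `trim_lines` is a hypothetical helper name (not from the thread), and the sample `$text` is a stand-in for whatever `$p->get_text("/div")` returns for one div, whose exact whitespace depends on how the HTML file is laid out.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a chunk of text into
# lines, trim leading and trailing whitespace, and drop empty lines.
sub trim_lines {
    my ($text) = @_;
    my @lines;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;              # trim both ends
        push @lines, $line if length $line;   # skip lines that are now empty
    }
    return @lines;
}

# Stand-in for the text of one <div>; the real value would come from
# $p->get_text("/div") inside the while loop of the original script.
my $text = "\n  Bryan\n   FAIL\n\n     Remarks:\n  Bad Grade!\n ";

if ($text =~ /FAIL/) {
    print "$_\n" for trim_lines($text);
    print "\n";    # blank line between records
}
```

In the original loop, the same idea would presumably replace the `foreach`/`grep` part: test `$text` against `/FAIL/` once per div and print the trimmed lines only when it matches, which also avoids printing an empty line for every PASS entry.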