monkfan has asked for the wisdom of the Perl Monks concerning the following question:
The HTML file (myfile.html) that I want to parse to obtain the TOTAL result looks like this:

    <html>
    <head>
    <title>
    scrPage
    </title>
    </head>
    <!-- -->
    <!-- jsp:setProperty name="manager" property="*" /-->
    <body bgcolor="#ffffff">
    <h1>
    Assessment Score
    </h1>
    <b>
    Here is your confirmation ID: SP.A91389F67D1C79B4157818A8EDF2A6C2
    </b>
    <br>
    <form method="get" action="http://wingless.cs.washington.edu:8080/assessment/servlet">
    <input type="hidden" value="submission/SP.A91389F67D1C79B4157818A8EDF2A6C2" name="filenameID"/>
    <input type="hidden" name="pageType" value="visualizationForm"/>
    <br>
    <INPUT TYPE=submit name="action" value="Visualize It">
    <input type=submit name="action" value="Get Excel Spreadsheet"/>
    <a href=http://bio.cs.washington.edu/assessment/statistics.html>statistics explanation
    </form>
    <Table border = 3>
    <tr><th>Data set<td>nTP<td>nFP<td>nFN<td>nTN<td>sTP<td>sFP<td>sFN<td> <td>nSn<td>nPPV<td>nSp<td>nPC<td>nCC<td>sSn<td>sPPV<td>sASP<tr><th>dm01g<td>0<td>80<td>125<td>5795<td>0<td>8<td>7<td> <td>0<td>0<td>0.986383<td>0<td>-0.0169565<td>0<td>0<td>0
    <tr><th>
    <tr><th>Fly <td>0<td>80<td>125<td>5795<td>0<td>8<td>7<td> <td>0<td>0<td>0.986383<td>0<td>-0.0169565<td>0<td>0<td>0
    <tr><th>Human <td>0<td>0<td>0<td>0<td>0<td>0<td>0<td> <td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN
    <tr><th>Mouse <td>0<td>0<td>0<td>0<td>0<td>0<td>0<td> <td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN
    <tr><th>Yeast <td>0<td>0<td>0<td>0<td>0<td>0<td>0<td> <td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN<td>NaN
    <tr><th>Total <td>0<td>80<td>125<td>5795<td>0<td>8<td>7<td> <td>0<td>0<td>0.986383<td>0<td>-0.0169565<td>0<td>0<td>0
    </table>
    </body>
    </html>

This is my code:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;
    use Carp;
    use HTML::TableExtract;

    my $temp_file = do {
        open my $in, '<', 'myfile.html' or carp "Can't open in $!\n";
        local $/ = undef;
        <$in>;
    };

    #--------------------------------------------------
    # Extract Element of HTML Table
    #--------------------------------------------------
    #print Dumper $temp_file ;
    ( my $id ) = $temp_file =~ /([\w]+\.[\w\d]+)/ms;
    print "$id\n";

    my $te = HTML::TableExtract->new(
        headers => [
            'Data set','nTP','nFP', 'nFN','nTN','sTP',
            'sFP','sFN',' ','nSn', 'nPPV','nSp','nPC',
            'nCC','sSn','sPPV', 'sASP',
        ]
    );
    $te->parse($temp_file);
    my @all_table_content = $te->tables;

    # Here to extract the 'last' row
    my @total = @{ $all_table_content[0]->[-1] };
    print Dumper \@all_table_content ;
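For the question above, here is a minimal sketch of one way the last row could be pulled out. It is not taken from any of the replies. It assumes that `tables` returns table objects whose `rows` method yields one array reference per row, as the HTML::TableExtract documentation describes, so a table object cannot be indexed directly with `->[-1]`. The embedded HTML is a hypothetical, trimmed stand-in for myfile.html with every cell explicitly closed; whether the original markup with its unclosed cells parses the same way has not been checked here. The table is selected by position (`depth` and `count`) rather than by `headers`, which is a substitution for the approach in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical, simplified stand-in for myfile.html: three columns only,
# and every cell explicitly closed (an assumption, not the real file).
my $html = <<'HTML';
<html><body>
<table border="3">
<tr><th>Data set</th><th>nTP</th><th>nFP</th></tr>
<tr><th>Fly</th><td>0</td><td>80</td></tr>
<tr><th>Total</th><td>0</td><td>80</td></tr>
</table>
</body></html>
HTML

# Select the first table at nesting depth 0 instead of matching headers.
my $te = HTML::TableExtract->new( depth => 0, count => 0 );
$te->parse($html);

# tables() returns table objects; rows() returns one array ref per row.
my ($table) = $te->tables;
die "No table found\n" unless $table;

my @rows = $table->rows;

# Take the last row and replace undefined cells with empty strings.
my @total = map { defined $_ ? $_ : '' } @{ $rows[-1] };

print join( ',', @total ), "\n";
```

Because the table is selected by position, the header row is part of `@rows` too; the Total row is simply the last element. If the real page ever contains more than one table, the `depth` and `count` values would need adjusting.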
Re: Problem Parsing with HTML::TableExtract
by Fang (Pilgrim) on Dec 07, 2005 at 09:18 UTC

Re: Problem Parsing with HTML::TableExtract
by johnnywang (Priest) on Dec 07, 2005 at 06:32 UTC
by monkfan (Curate) on Dec 07, 2005 at 07:11 UTC

Re: Problem Parsing with HTML::TableExtract
by kulls (Hermit) on Dec 07, 2005 at 06:48 UTC