If what you are trying to do is extract the data from the table, then the following code, using HTML::TreeBuilder and HTML::ElementTable, may be a good starting point:

    use strict;
    use warnings;
    use LWP::Simple;
    use HTML::TreeBuilder;
    use HTML::ElementTable;

    # Fetch the page and parse it into an element tree
    my $page = get('http://www.ovt.ncsu.edu/cotton_soy/2004/table_11.html');
    die "Could not fetch page" if !defined $page;
    my $root = HTML::TreeBuilder->new_from_content($page);

    # Locate the first table and wrap it for row/column access
    my $theTable = $root->find('table');
    die "Table not found" if !defined $theTable;
    $theTable = HTML::ElementTable->new_from_tree($theTable);

    # Skip row 0 and the last two rows, which have missing cells
    for my $row (1 .. $theTable->maxrow() - 2) {
        for (0 .. $theTable->maxcol()) {
            my $cellText = $theTable->cell($row, $_)->as_text();
            print "$cellText ";
        }
        print "\n";
    }

Prints:

    LINT PLANT PERCENT UHM
    VARIETY OR YIELD LINT HEIGHT BOLLS S.L. UNIFORMITY T1
    BRAND VARIETY LB/ACRE % INCHES OPENED (IN.) INDEX (G/TEX) MIKE ELONGATION
    FiberMax 991BR 855** 38.8 32 31 1.17 85.0 34.8 5.3 4.3
    Stoneville ST5599BR 848* 39.9 31 30 1.14 82.7 31.5 5.0 3.6
    Deltapine DP555BG/RR 816* 43.0 34 25 1.19 83.1 32.9 4.6 4.0
    Deltapine DP449BG/RR 815* 38.2 32 38 1.16 83.4 32.0 4.8 4.4
    Deltapine DP488BG/RR 778* 40.3 31 36 1.25 85.5 34.6 4.9 4.8
    Deltapine DP 445 BG/RR 775* 41.8 28 22 1.18 84.4 32.7 5.2 6.0
    Deltapine DP 543 BGII/RR 772* 39.5 32 36 1.15 83.3 31.7 5.0 4.0
    FiberMax 989B2R 761* 39.3 25 21 1.17 83.2 33.4 5.2 3.3
    Deltapine DP 455 BG/RR 751* 40.4 36 34 1.17 84.8 34.3 4.3 4.1
    FiberMax 989BR 721 38.7 32 12 1.17 83.2 31.9 5.0 3.8
    Stoneville ST5454B2R 667 36.9 30 35 1.12 82.7 30.5 5.2 5.5
    FiberMax 991B2R 665 36.5 27 33 1.22 85.0 35.8 4.6 3.9
    Stoneville ST5242BR 630 40.3 25 27 1.13 85.5 28.3 4.9 5.6
    Deltapine DP451B/RR 629 36.2 29 60 1.18 85.1 30.5 4.8 5.2
    Stoneville ST6636BR 594 37.5 30 34 1.18 84.4 32.8 4.8 4.3
    Deltapine DP493 492 40.2 36 46 1.18 83.9 33.1 4.2 4.1
    Stoneville ST5303R 477 39.1 33 61 1.10 85.2 32.4 4.7 4.8
    Deltapine DP 5415RR 458 38.4 35 33 1.17 85.0 31.6 4.1 5.3
    BCG 24R 445 39.7 34 36 1.12 85.4 29.6 4.7 6.4
    BCG 295 428 38.4 26 41 1.22 84.5 32.1 4.6 4.2
    Deltapine DP491 426 40.5 33 41 1.26 84.9 38.7 4.4 4.3
    Stoneville ST6848R 379 37.9 33 38 1.19 86.0 35.5 4.5 4.4
    Deltapine DPLX02T57R 353 37.5 32 55 1.14 84.3 28.3 4.1 7.0
    FiberMax 989R 345 39.4 30 27 1.20 87.2 36.3 4.8 4.5
    Deltapine DP494RR 328 40.1 34 37 1.20 84.1 34.5 4.2 4.8
    Deltapine DP 5690RR 307 37.2 34 44 1.19 84.9 34.2 4.7 5.0
    Mean 581 39.2 31 36 1.17 84.5 32.9 4.7 4.6
    Adj.R2 (%) 78
    C.V.(%) 19
    BLSD(K-50) 115
    s.e. 51
    Error d.f. 108

Note that the row loop runs from 1 to $theTable->maxrow()-2. The last two rows are skipped to avoid a problem with missing cells in those rows, and the first row (row 0) is skipped for the same reason.
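
If you would rather keep the short rows than skip them, one option is to walk the <tr> elements directly with HTML::TreeBuilder instead of addressing cells by row and column. This is a different technique from the HTML::ElementTable code above, not a fix to it. The following is only a minimal sketch: table_rows is a hypothetical helper name, the sample HTML is made up for illustration, and it assumes the page contains no nested tables (look_down would pick up their rows too).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the first table's rows as array refs of
# cell text. Each row holds one entry per <td>/<th> actually present,
# so a short row simply yields fewer entries.
sub table_rows {
    my ($html) = @_;
    my $root  = HTML::TreeBuilder->new_from_content($html);
    my $table = $root->find('table') or die "Table not found";

    my @rows;
    for my $tr ($table->look_down(_tag => 'tr')) {
        # Keep only direct children that are cell elements; skip text nodes.
        my @cells = grep { ref($_) && $_->tag =~ /^t[dh]$/ } $tr->content_list;
        push @rows, [ map { $_->as_text } @cells ];
    }
    $root->delete;    # release the parse tree
    return @rows;
}

# Made-up sample data: the last row has a single cell.
my $html = <<'HTML';
<table>
<tr><th>BRAND</th><th>VARIETY</th><th>LB/ACRE</th></tr>
<tr><td>FiberMax</td><td>991BR</td><td>855</td></tr>
<tr><td>s.e. 51</td></tr>
</table>
HTML

# Tab-separated output, one table row per line.
print join("\t", @$_), "\n" for table_rows($html);
```

Because each row is built from the cells it really contains, there is no need to guess how many rows to drop at either end of the table. The trade-off is that you lose the row/column addressing that HTML::ElementTable gives you.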


DWIM is Perl's answer to Gödel

In reply to Re: Should I use; Html Parser, table extract, Extractor by GrandFather
in thread Should I use; Html Parser, table extract, Extractor by a_non_moose
