skoney has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I'm a newbie so please bear with me. I'm using HTML::TableExtract to get data out of some tables and I'm trying to figure out how to put it into a hash.

As an example, let's say that every day Bob, Carol, Ted, & Alice each publish what they plan on eating for breakfast, lunch, and dinner for the day in a table on their respective websites. I want to go to each website, get the data from each table and store it in a hash so I can then create a table on my own web page which will summarize what each person is eating that day. The reason for the hash is that it's going to be tied to a DBM to which I will be adding other data from other scripts, and this way everything is stored in one DBM for easy access.

For example, here's what Bob's table would look like:

    <TABLE>
    <TR><TH></TH><TH>Breakfast</TH><TH>Lunch</TH><TH>Dinner</TH></TR>
    <TR><TH>Fruit</TH><TD>Banana</TD><TD>Orange</TD><TD>Apple</TD></TR>
    <TR><TH>Meat</TH><TD>Bacon</TD><TD>Baloney</TD><TD>Burger</TD></TR>
    <TR><TH>Beverage</TH><TD>Coffee</TD><TD>Soda</TD><TD>Beer</TD></TR>
    </TABLE>

And the hash keys would be:

    bob_fruit_breakfast       carol_fruit_breakfast
    bob_fruit_lunch           carol_fruit_lunch
    bob_fruit_dinner          carol_fruit_dinner
    bob_meat_breakfast        carol_meat_breakfast
    bob_meat_lunch            carol_meat_lunch
    bob_meat_dinner           carol_meat_dinner
    bob_beverage_breakfast
    bob_beverage_lunch        and so on....
    bob_beverage_dinner

Using the sample code in the HTML::TableExtract documentation I can print the data on the screen:

    use HTML::TableExtract;
    $te = new HTML::TableExtract (headers=>[qw(Breakfast Lunch Dinner)]);
    $te->parse ($html_string);
    foreach $ts ($te->tables) {
        foreach $row ($ts->rows) {
            print join(',', @$row), "\n";
        }
    }

but I can't figure out how to store it in a hash. It's obvious to me that @$row contains the data, but I haven't managed to get it into the hash so that everything lines up, i.e. so that the key "bob_fruit_breakfast" holds the value "Banana", and so on.

Any help would be greatly appreciated!

Re: Getting data from HTML::TableExtract into a hash
by Krambambuli (Curate) on Oct 13, 2007 at 18:58 UTC
    The following should help you out for now:
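
        #!/usr/bin/perl
        use strict;
        use warnings;

        use Data::Dumper;
        use HTML::TableExtract;

        # $html_string is assumed to already hold the HTML of one person's page.
        my $te = HTML::TableExtract->new();
        $te->parse($html_string);

        my $name = 'bob';
        my %results;

        foreach my $ts ($te->tables) {
            # The first row holds the meal names; the rest are data rows.
            my ($header_row, @content_rows) = $ts->rows;
            # Drop the empty top-left cell.
            my ($placeholder, @headers) = @$header_row;

            foreach my $data_row (@content_rows) {
                # The first cell of each data row is the food type (Fruit, Meat, ...).
                my ($type, @real_data) = @$data_row;
                foreach my $i (0 .. $#headers) {
                    # Builds keys in the requested order, e.g. bob_fruit_breakfast.
                    my $hash_key = join( '_', lc($name), lc($type), lc($headers[$i]) );
                    $results{ $hash_key } = $real_data[$i];
                }
            }
        }

        print Dumper( \%results );
        exit;
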
    There is probably a lot more that could be added; for now I'll only say that learning to use Data::Dumper early can be a tremendous help in getting off to a good start.

    Good luck.
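
For anyone new to Data::Dumper, here is a minimal sketch of the kind of inspection suggested above. The three hash entries are hypothetical sample data in the key format from the question, not output from a real page; `$Data::Dumper::Sortkeys` and `$Data::Dumper::Indent` are standard settings of the module, used here only to keep the dump stable and compact.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data, keyed the way the question asks for.
my %results = (
    bob_fruit_breakfast => 'Banana',
    bob_meat_lunch      => 'Baloney',
    bob_beverage_dinner => 'Beer',
);

# Sort the keys so the dump is in a predictable order,
# and use the more compact indentation style.
local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Indent   = 1;

# Pass a reference so the whole hash is dumped as one structure.
my $dump = Dumper( \%results );
print $dump;
```

Dumping a structure right after building it is a quick way to check that keys and values line up before anything is written to permanent storage.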
      Thanks for the quick response, Krambambuli! It took me a little time to see what you were doing with this code (I'm still very new at this) but now I get it. Creating the hash keys on the fly is a very elegant solution, and one that hadn't occurred to me! I was planning on starting with a pre-formed hash but I can see now there's really no need to. That's why I like coming here - so many great ideas!

      Thanks for the heads-up on Data::Dumper. I'm going to go download the POD for it and see if I can figure out how to use it.

      Thanks again!

      Tom
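
The on-the-fly key idea carries over to several people and to the DBM-tied hash mentioned in the question. The following is only a sketch under stated assumptions: `%rows_for` is a hypothetical stand-in for the rows that `$ts->rows` would return for each person's page (Carol's entries are made up for illustration), SDBM_File is chosen simply because it ships with Perl, and the file lives in a temporary directory rather than a real location.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # DBM implementation bundled with Perl
use File::Temp qw(tempdir);

# Hypothetical stand-in for the parsed rows of each person's table:
# a header row followed by one row per food type.
my %rows_for = (
    bob => [
        [ '',         'Breakfast', 'Lunch',   'Dinner' ],
        [ 'Fruit',    'Banana',    'Orange',  'Apple'  ],
        [ 'Meat',     'Bacon',     'Baloney', 'Burger' ],
        [ 'Beverage', 'Coffee',    'Soda',    'Beer'   ],
    ],
    carol => [
        [ '',      'Breakfast', 'Lunch', 'Dinner' ],
        [ 'Fruit', 'Pear',      'Grape', 'Plum'   ],
    ],
);

# Tie the hash to a DBM file in a scratch directory (illustrative path).
my $dir = tempdir( CLEANUP => 1 );
tie my %meals, 'SDBM_File', "$dir/meals", O_RDWR | O_CREAT, 0666
    or die "Cannot tie DBM file: $!";

for my $name ( sort keys %rows_for ) {
    my ( $header_row, @content_rows ) = @{ $rows_for{$name} };
    my ( undef, @headers ) = @$header_row;    # skip the empty corner cell

    for my $data_row (@content_rows) {
        my ( $type, @cells ) = @$data_row;
        for my $i ( 0 .. $#headers ) {
            # Key layout: name_type_meal, e.g. bob_fruit_breakfast
            my $key = join '_', map { lc $_ } $name, $type, $headers[$i];
            $meals{$key} = $cells[$i];
        }
    }
}

# Read everything back from the DBM before releasing it.
my @lines = map { "$_ => $meals{$_}" } sort keys %meals;
print "$_\n" for @lines;

untie %meals;
```

Because the hash is tied, each assignment goes straight to the DBM file, so other scripts that tie the same file would see the same keys.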