Re: Parser for Html

This is just kinda funny ... my current project (which is starting to take too long) is to grab a bunch of data from a Lotus Notes email and work out from it a set of rules by which I can replicate a table. So I copied the email into a Notes database on a Domino server, connected to it via a browser, saved the HTML, and started using HTML::Parser to extract the data. (I wish there were an easier way, but Notes is on a Windows box, and while there is Perl on that box, I prefer my Linux environment, and I didn't want to start learning OLE for this ;-}) Then I remembered something I had seen months ago on PM and installed HTML::TableExtract. What a difference that module made.

Since I just need stuff from a couple of tables, this made it quite trivial. Now I'm massaging the data this way and that, and have found a number of inconsistencies in the table as a result.
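
For anyone who hasn't tried the module, here is a minimal sketch of the kind of extraction I mean. The sample HTML, the column names, and the extract_rows helper are all made up for illustration (my real Domino output looks nothing like this), so treat it as a starting point rather than the exact code I'm running:

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample standing in for the HTML saved from the browser.
my $html = <<'HTML';
<html><body>
<table>
<tr><th>Name</th><th>Owner</th><th>Status</th></tr>
<tr><td>alpha</td><td>jdoe</td><td>open</td></tr>
<tr><td>beta</td><td>asmith</td><td>closed</td></tr>
</table>
</body></html>
HTML

# Hypothetical helper: return the data rows of every table whose header
# row contains all of the given header names. Only the matching columns
# are kept.
sub extract_rows {
    my ($html, @headers) = @_;

    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);
    $te->eof;    # inherited from HTML::Parser; signals end of document

    my @rows;
    for my $table ($te->tables) {
        for my $row ($table->rows) {
            # Empty cells can come back undef; normalise them to ''.
            push @rows, [ map { defined $_ ? $_ : '' } @$row ];
        }
    }
    return @rows;
}

# Pull just two of the three columns and print them tab-separated.
my @rows = extract_rows($html, qw(Name Status));
print join("\t", @$_), "\n" for @rows;
```

Selecting tables by header text rather than by position is what makes this less brittle than hand-rolled HTML::Parser handlers: if the page grows another layout table, the headers still pick out the right one.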

In the past, I've done something very similar: the data was in web pages (again on Domino servers), and I used HTML::Parser to pull it out and loaded it all into a DB2 database. Had I known about HTML::TableExtract at the time, I would probably have saved about 4 hours of work, and the result would have been much less fragile.
