
Re: another regex Question

by athomason (Curate)
on May 24, 2000 at 08:15 UTC

in reply to another regex Question

First, read this FAQ. In a nutshell, parsing HTML by hand, especially analyzing arbitrary tables, is pretty difficult: tags nest, attributes can appear in any order and any case, and a single cell can span several lines. There are modules designed especially for this, though, so check out HTML::Parser and HTML::TokeParser. Also see the answers to a similar question here.
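To give a feel for the module route, here is a minimal sketch using HTML::TokeParser to pull cell text out of a table. The sample HTML and the helper name table_rows are made up for illustration, and the sketch assumes a single, non-nested table; treat it as a starting point, not a general solution.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input; note the mixed-case tags and attribute.
my $html = <<'HTML';
<TABLE cellPadDing="2">
  <TR><TD>apple</TD><TD>1</TD></TR>
  <tr><td>pear</td><td>2</td></tr>
</TABLE>
HTML

# table_rows is an illustrative helper, not part of the module.
# It returns a list of array refs, one per <tr>, holding the cell text.
sub table_rows {
    my ($doc) = @_;
    my $p = HTML::TokeParser->new(\$doc) or die "cannot parse document";
    my @rows;
    # get_tag skips ahead to the next start tag with one of the given names;
    # the parser reports tag names in lowercase.
    while (my $tag = $p->get_tag('tr', 'td', 'th')) {
        if ($tag->[0] eq 'tr') {
            push @rows, [];
            next;
        }
        push @rows, [] unless @rows;    # tolerate a cell before any <tr>
        # Collect the text up to the matching end tag, whitespace trimmed.
        push @{ $rows[-1] }, $p->get_trimmed_text("/$tag->[0]");
    }
    return @rows;
}

print join(' | ', @$_), "\n" for table_rows($html);
```

Nested tables would need extra bookkeeping here, which is one more reason to reach for a purpose-built module when the input is not under your control.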

RE: Re: another regex Question
by perlcgi (Hermit) on May 24, 2000 at 16:15 UTC
    There's a ready-made subclass of HTML::Parser which should help.
    Check out HTML::TableExtract
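For what it's worth, a rough sketch of how HTML::TableExtract is typically used. The table contents, the header names Fruit and Count, and the helper name extract_rows are invented for the example; it assumes the table you want can be identified by its column headers.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample document with one table.
my $html = <<'HTML';
<table cellpadding="2">
  <tr><th>Fruit</th><th>Count</th></tr>
  <tr><td>apple</td><td>1</td></tr>
  <tr><td>pear</td><td>2</td></tr>
</table>
HTML

# extract_rows is an illustrative helper: it returns the data rows of every
# table whose header row contains the requested column names.
sub extract_rows {
    my ($doc, @headers) = @_;
    my $te = HTML::TableExtract->new(headers => \@headers);
    $te->parse($doc);
    my @rows;
    for my $table ($te->tables) {
        push @rows, $table->rows;    # each row is an array ref of cell text
    }
    return @rows;
}

for my $row (extract_rows($html, 'Fruit', 'Count')) {
    print join(',', map { defined $_ ? $_ : '' } @$row), "\n";
}
```

Selecting by header means you don't have to count tables or care where on the page the one you want sits.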
RE: Re: another regex Question
by Anonymous Monk on May 24, 2000 at 16:02 UTC
    Yes, you need to be very careful with HTML, because attribute names are case-insensitive: cellpadding, cellPadDing, and ceLLpaddinG are all the same. I actually saw a table-extract module which may be of use to you. Personally, I would not advise doing any HTML parsing yourself. Use modules; their authors know their stuff!
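To illustrate the case problem, here is a small sketch in core Perl comparing a naive pattern with a slightly more tolerant one. The sample tags and the helper names naive_padding and looser_padding are made up for the example, and even the looser pattern is nowhere near a real parser.

```perl
use strict;
use warnings;

# Three spellings of the same attribute, as a browser would see them.
my @variants = (
    '<table cellpadding="2">',
    '<TABLE cellPadDing=2>',
    "<table ceLLpaddinG = '2'>",
);

# Naive: exact case, double quotes only.
sub naive_padding {
    my ($tag) = @_;
    return $tag =~ /cellpadding="(\d+)"/ ? $1 : undef;
}

# Looser: case-insensitive, optional quotes, optional spaces around "=".
sub looser_padding {
    my ($tag) = @_;
    return $tag =~ /cellpadding\s*=\s*["']?(\d+)/i ? $1 : undef;
}

for my $tag (@variants) {
    my $naive  = naive_padding($tag);
    my $looser = looser_padding($tag);
    printf "%-28s naive=%-5s looser=%s\n",
        $tag,
        defined $naive  ? $naive  : 'undef',
        defined $looser ? $looser : 'undef';
}
```

The naive pattern only catches the first spelling. The looser one catches all three, but it still knows nothing about comments, attributes split across lines, or nested tables, which is where the modules earn their keep.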