Re: Parsing HTML tags with regex

Parsing HTML code (correctly) with hand-crafted regexes is not a feat to be undertaken lightly. It has been known to cause chronic headaches in hobbyists and professionals alike. And then, you have to worry about parsing erroneous HTML code...
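
To make the headache concrete, here is a minimal sketch of the classic first attempt. It uses core Perl only. The sample markup and the variable names are my own illustration, not anything from the original question. The naive pattern assumes a tag ends at the first '>', which stops being true as soon as an attribute value contains one.

```perl
use strict;
use warnings;

# Hypothetical input: one link whose title attribute contains a '>'.
my $html = '<a href="next.html" title="page 2 > page 1">next</a>';

# Naive pattern: a tag is '<', then anything that is not '>', then '>'.
my @tags = $html =~ /(<[^>]*>)/g;

# The first match is cut off inside the title attribute,
# because the pattern cannot tell a quoted '>' from the end of the tag.
print "$_\n" for @tags;
```

The first "tag" this prints ends in the middle of the title attribute. You can patch the pattern to skip quoted strings, but then comments, script blocks, and unclosed quotes each need their own patch, and that is roughly where the headaches start.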

There's a very good reason why people recommend you use an HTML::* module. But I suppose you should go ahead. You'll learn more than just a bit about regexes; you'll also learn why CPAN is so important to the community.

--
perl: code of the samurai

Re: Re: Parsing HTML tags with regex by tfrayner (Curate) on Oct 03, 2002 at 13:27 UTC
...and once you're tired of it, check out HTML::TableExtract, which was practically written with your exact problem in mind :-)

HTH,
Tim

Update: Okay, I'm lazy, I didn't post the code to actually do the job (mainly because I think it really is that trivial). But the wonderful blakem submitted this node to another thread describing what I was thinking. So go upvote him instead :-)