in reply to In HTML , I Want to process only Data and Not tags

No. No way. Never ever (well, not often) try to use simple rexen for parsing markup. Life is too short to spend the months you would likely need to get the bugs out when others have already done it for you. For HTML see HTML::TreeBuilder. For XHTML or XML see XML::Twig.

See some of the answers to how to eliminate all html tags in a given string ??, and in particular Re: how to eliminate all html tags in a given string ??, for sample code and other related suggestions.
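For a rough idea of what "process only the data" can look like with HTML::TreeBuilder, here is a minimal sketch. It is not taken from the linked nodes. The sub name upcase_text and the uppercase transformation are placeholders I made up for illustration; swap in whatever processing you actually need. It assumes HTML::TreeBuilder is installed from CPAN.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: transform only the text segments of an HTML
# string and leave the tags alone.
sub upcase_text {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Wrap every text segment in a pseudo-element whose tag is '~text',
    # so text can be found with the normal tree-searching methods.
    $tree->objectify_text;

    for my $node ( $tree->look_down( _tag => '~text' ) ) {
        my $text = $node->attr('text');
        $node->attr( text => uc $text );    # placeholder transformation
    }

    # Turn the pseudo-elements back into plain text segments.
    $tree->deobjectify_text;

    # The empty hash ref asks as_HTML not to omit optional end tags.
    my $out = $tree->as_HTML( undef, undef, {} );

    $tree->delete;    # free the tree explicitly
    return $out;
}

print upcase_text('<p>Hello <b>world</b></p>'), "\n";
```

If all you want is the text with the markup thrown away, $tree->as_text is probably enough on its own. Note that the parser fills in implied html, head and body elements, so the output is a complete document rather than the bare fragment you passed in.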


DWIM is Perl's answer to Gödel

Re^2: In HTML , I Want to process only Data and Not tags
by revdiablo (Prior) on Jul 25, 2006 at 21:33 UTC

    I second the vote for HTML::TreeBuilder, but I also would like to recommend XML::TreeBuilder. It uses the same handy API, which just makes my life so much simpler. There are most likely cases where other modules -- such as XML::Twig -- make more sense, but I don't know of them off the top of my head.