in reply to Extracting a substring from HTML

The right way of doing almost anything with HTML is to use the appropriate module. The appropriate module depends somewhat on the task. In this case I'd guess HTML::TreeBuilder is what you want.

Life is too short to reinvent complicated wheels, and regexen for parsing HTML are complicated wheels indeed.
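
For what it's worth, here is a minimal sketch of the sort of thing I mean. The sample markup, the `price` class and the variable names are all made up for illustration, since I don't know what your HTML actually looks like; adjust the `look_down` criteria to suit.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup (an assumption, not the data from the question).
my $html = '<html><body><p>Price: <span class="price">42.00</span> USD</p></body></html>';

# Parse the string into a tree of HTML::Element objects.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In scalar context look_down returns the first element matching all criteria.
my $span  = $tree->look_down(_tag => 'span', class => 'price');
my $price = $span ? $span->as_text : undef;

print defined $price ? "$price\n" : "not found\n";

# Release the tree explicitly once done with it.
$tree->delete;
```

With the made-up markup above this should print 42.00. The point is that you ask for the element you want by tag and attribute and let the parser worry about nesting, quoting and whitespace.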

If you need help using TreeBuilder, show us what you have tried: a very small (but complete) code sample that demonstrates the issue, plus just enough sample data to reproduce it.


DWIM is Perl's answer to Gödel

Re^2: Extracting a substring from HTML
by richill (Monk) on Sep 10, 2006 at 10:17 UTC
    Thank you. I'll look at the HTML::Treebuilder now.

    I know it was a basic question, but with so many ways of doing things in Perl, the experience found here is very valuable.

    I could have spent days on a clumsy solution.

      The benefit of experience is actually in CPAN. Always look there first before coding anything yourself. Get the Perl Cookbook (ISBN 0-596-00313-7) to get _productive_ right away with Perl. It's going to be your best-spent $50 if you are going to work with Perl, and it has many cool ideas not only on using Perl itself but also on many modules for specific tasks.

      Then get what I call the trilogy: Learning Perl, Intermediate Perl, and Advanced Perl.

      And then, of course, the Camel Book. But that's just to say you have it and have read it.
      Please be careful. Package names are case-sensitive in Perl. That's HTML::TreeBuilder