cayenne has asked for the wisdom of the Perl Monks concerning the following question:

In all the HTML::Parser examples I can find, the subroutines it calls print text or modify variables.

What I'd really like is for it to return a value. For example, if I had a parser that removed tags, I'd like to pass it a string containing HTML and get back the string that results from removing the tags.

I can think of ways to achieve the same result, but they all seem inelegant. Does anyone have suggestions, or know of a good way to do this? I'd like to save myself the trouble of doing it one way, deciding I did it all wrong, changing it, and then later realizing that was all wrong too.

Thanks,
Cayenne

Re: Getting strings from HTML::Parser
by gav^ (Curate) on Mar 11, 2002 at 04:05 UTC
    An example using HTML::Parser is:
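    use strict;
    use warnings;
    use HTML::Parser;

    print strip("<b>Hello</b>!\n");

    sub strip {
        my $text = '';    # accumulates the text found between tags
        my $p = HTML::Parser->new(
            api_version => 3,
            # append each chunk of text the parser reports
            text_h => [ sub { $text .= shift }, 'text' ],
        );
        $p->parse(shift);
        $p->eof;          # signal end of input so buffered text is flushed
        return $text;
    }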
    Another example can be found here.

    gav^
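
A possible variation on the closure approach above, as a sketch only. The name strip_tags and the sample input are made up for illustration. It assumes you want entities decoded (the 'dtext' argspec instead of 'text') and the contents of script and style elements dropped (ignore_elements); neither is something the original question asked for.

```perl
use strict;
use warnings;
use HTML::Parser;

# strip_tags is a hypothetical helper name, not part of HTML::Parser.
# It takes a string of HTML and returns the text with the tags removed.
sub strip_tags {
    my ($html) = @_;
    my $text = '';
    my $p = HTML::Parser->new(
        api_version => 3,
        # 'dtext' hands the handler text with entities such as &amp; decoded
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );
    # suppress <script> and <style> elements along with their contents
    $p->ignore_elements(qw(script style));
    $p->parse($html);
    $p->eof;    # flush any text still buffered by the parser
    return $text;
}

print strip_tags("<b>Hello</b> &amp; <script>x()</script>bye\n");
```

Because the handler is a closure over a lexical declared inside the sub, each call starts with a fresh accumulator, so the function can be called repeatedly without any global state.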

(bbfu) (HTML::PullParser) Re: Getting strings from HTML::Parser
by bbfu (Curate) on Mar 11, 2002 at 03:24 UTC

    Try using HTML::PullParser, like so:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use HTML::PullParser;

    print strip("<b>Hello</b>!\n");

    sub strip {
        my $html = shift;
        my $parser = HTML::PullParser->new(
            doc  => $html,
            text => 'text',
        );
        my $result = '';
        while (my $t = $parser->get_token) {
            $result .= $t->[0];
        }
        return $result;
    }

    bbfu
    Seasons don't fear The Reaper.
    Nor do the wind, the sun, and the rain.
    We can be like they are.