PerlMonks  

Re: How to remove HTML tags from text

by gellyfish (Monsignor)
on Feb 04, 2005 at 12:22 UTC ([id://428033])


in reply to How to remove HTML tags from text

Personally I would go with HTML::Parser:

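#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $data = 'abcd efgh<img src="http://test.com/image.gif">ijklmn';

# Only text events are handled, so tags are dropped; the text is
# accumulated in the parser object itself under the key _data.
my $parser = HTML::Parser->new(
    # "dtext" passes the text with HTML entities already decoded.
    text_h           => [ sub { $_[0]->{_data} .= $_[1]; }, "self,dtext" ],
    # Reset the buffer at the start of each document.
    start_document_h => [ sub { $_[0]->{_data} = ''; }, "self" ],
);

$parser->parse($data);
$parser->eof;    # signal end of document so any buffered trailing text is flushed

print $parser->{_data};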

/J\

Re^2: How to remove HTML tags from text
by holli (Abbot) on Feb 04, 2005 at 13:01 UTC
    Alternative using HTML::TokeParser:
    use strict;
    use HTML::TokeParser;

    # from file
    my $p = HTML::TokeParser->new("test.html") or die "Can't open: $!";

    # from string
    #my $p = HTML::TokeParser->new(\"text1 <b> text2 </b> text3");

    my $t;
    while (my $token = $p->get_token) {
        $t .= $token->[1] if $token->[0] eq "T";
    }
    print $t;

    holli, regexed monk
