simple replace question

jck has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: simple replace question by graff (Chancellor) on Jul 06, 2009 at 13:55 UTC
If the string does not have `<p>` or `</p>` tags in the middle (or if it does and you want to get rid of those as well), you could do it like this: `$testtext =~ s{</?p>}{}g;` (But when removing these tags from the middle of a string, it might be better to replace them with some sort of whitespace, to avoid creating run-on words.)
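A minimal sketch of the whitespace point above. The sample string and the variable names `$deleted` and `$spaced` are made up for illustration and are not from the original post:

```perl
use strict;
use warnings;

# Hypothetical sample string with tags in the middle.
my $testtext = '<p>first paragraph</p><p>second paragraph</p>';

# Deleting the tags outright glues neighbouring words together.
(my $deleted = $testtext) =~ s{</?p>}{}g;

# Replacing each tag with a space, then tidying up, keeps words apart.
(my $spaced = $testtext) =~ s{</?p>}{ }g;
$spaced =~ s/\s+/ /g;         # collapse runs of whitespace
$spaced =~ s/^\s+|\s+$//g;    # trim both ends

print "$deleted\n";    # first paragraphsecond paragraph
print "$spaced\n";     # first paragraph second paragraph
```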
Re: simple replace question by JavaFan (Canon) on Jul 06, 2009 at 13:47 UTC
It removes `<p>` at the beginning, or `</p>` at the end of the string, but not both. You might want to use the `/g` modifier. Or what I would do: `$testtext =~ s/^<p>//; $testtext =~ s!</p>$!!;`
Re^2: simple replace question by jck (Scribe) on Jul 09, 2009 at 22:51 UTC
JavaFan - thanks! In your second line, the "!"s should be "/"s, right?
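On the delimiter question: Perl lets `s///` use other delimiter characters, so the `!` form should work as written, and it avoids escaping the `/` inside `</p>`. A small sketch with made-up variable names, comparing the two spellings:

```perl
use strict;
use warnings;

# Hypothetical sample strings, identical on purpose.
my $with_bang  = 'some text</p>';
my $with_slash = 'some text</p>';

$with_bang  =~ s!</p>$!!;     # "!" delimiters: the "/" in </p> needs no escape
$with_slash =~ s/<\/p>$//;    # "/" delimiters: the "/" in </p> must be escaped

print "$with_bang\n";     # some text
print "$with_slash\n";    # some text
```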
Re: simple replace question by poolpi (Hermit) on Jul 06, 2009 at 19:32 UTC
If you need to deal with an HTML document:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Simple example:
my $doc = <<END;
<html>
<head>
</head>
<body>
<p>FooOooo barbar baz</p>
<p>Babar Foofoo zba zba</p>
<p>oofOOoof zzbb aarr</p>
</body>
</html>
END

my $p = HTML::TokeParser->new( \$doc );

while ( $p->get_tag("p") ) {
    my $text = $p->get_trimmed_text;
    print "Text: $text\n";
}
```

hth, PooLpi

'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: simple replace question by morgon (Priest) on Jul 06, 2009 at 17:02 UTC
Rather than capturing the tags I would capture the content: `$testtext =~ s|^<p>(.*?)</p>$|$1|;`
Re^2: simple replace question by jck (Scribe) on Jul 09, 2009 at 23:21 UTC
morgon, this looks like the best suggestion!! Given the greediness of the regex, would this approach ignore any internal `<p>` tags? That's what I want: just to strip off the first and last tag. Thanks - janaki
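A quick check of that behaviour, as a sketch with invented sample strings. Note that `.*?` is actually non-greedy. It is the `^` and `$` anchors that force the match to span the whole string, so internal tags are left alone. Without the `/s` modifier, `.` does not match a newline, so a multi-line string is left unchanged:

```perl
use strict;
use warnings;

# Hypothetical input with <p> tags inside the outer pair.
my $nested = '<p>one</p><p>two</p>';
(my $stripped = $nested) =~ s|^<p>(.*?)</p>$|$1|;
print "$stripped\n";    # one</p><p>two

# Hypothetical multi-line input.
my $multi = "<p>line one\nline two</p>";
(my $no_s   = $multi) =~ s|^<p>(.*?)</p>$|$1|;     # no match: string unchanged
(my $with_s = $multi) =~ s|^<p>(.*?)</p>$|$1|s;    # /s lets . match "\n"

print $no_s eq $multi ? "unchanged without /s\n" : "changed without /s\n";
print "$with_s\n";
```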
Re: simple replace question by stevemayes (Scribe) on Jul 06, 2009 at 17:19 UTC
Text::Trim's function is to strip whitespace off the ends of strings (although I think that I really like morgon's approach).
Re: simple replace question by Marshall (Canon) on Jul 07, 2009 at 08:27 UTC
There are a number of regexes that would work here. The basic idea is to replace stuff that looks like `<p>` or `<Xp>` with nothing, using a global match.

```perl
#!/usr/bin/perl -w
use strict;

my $test = '<p>this is some<1p> paragraph</p>';
$test =~ s/<.?p>//g;
#s/<.??p>//g;   # also ok
print "$test\n";

__END__
prints:
this is some paragraph
```
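One thing to keep in mind with `<.?p>`: the `.?` accepts any single character (or none), not just `/`. The sketch below, using a made-up sample string, compares it with the stricter `</?p>` from graff's reply. Which one is right depends on whether tags like `<1p>` are meant to go as well:

```perl
use strict;
use warnings;

# Hypothetical input containing other three-character tags ending in "p".
my $test = '<p>see <bp> and <ip> markers</p>';

(my $loose  = $test) =~ s/<.?p>//g;     # also removes <bp> and <ip>
(my $strict = $test) =~ s{</?p>}{}g;    # removes only <p> and </p>

print "[$loose]\n";     # [see  and  markers]
print "[$strict]\n";    # [see <bp> and <ip> markers]
```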