PerlMonks  

simple replace question

by jck (Scribe)
on Jul 06, 2009 at 13:43 UTC ( [id://777545]=perlquestion )

jck has asked for the wisdom of the Perl Monks concerning the following question:

I need to strip the paragraph tags from the beginning and end of a string. I thought this would work:
$testtext =~ s/(^<p>|<\/p>$)//;
It does replace the beginning tag, but not the ending tag. I know this should be incredibly simple, but I can't figure out what I'm doing wrong.

Replies are listed 'Best First'.
Re: simple replace question
by graff (Chancellor) on Jul 06, 2009 at 13:55 UTC
    If the string does not have <p> or </p> tags in the middle (or if it does and you want to get rid of those as well), you could do it like this:
    $testtext =~ s{</?p>}{}g;
    (but when removing these tags from the middle of a string, it might be better to replace them with some sort of whitespace, to avoid creating run-on words)
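A minimal sketch of the whitespace idea mentioned above, assuming it is acceptable to collapse the resulting runs of whitespace (including newlines) into single spaces. The helper name `tags_to_space` and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Replace every <p> or </p> with a space, then tidy up the whitespace.
# Assumption: collapsing all whitespace runs to one space is acceptable.
sub tags_to_space {
    my ($text) = @_;
    $text =~ s{</?p>}{ }g;      # a space instead of nothing keeps words apart
    $text =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    $text =~ s/\s{2,}/ /g;      # collapse runs of whitespace
    return $text;
}

print tags_to_space('<p>first</p><p>second</p>'), "\n";    # first second
```

With a plain `s{</?p>}{}g` the same input would come out as `firstsecond`, which is the run-on word problem described above.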
Re: simple replace question
by JavaFan (Canon) on Jul 06, 2009 at 13:47 UTC
    It removes <p> at the beginning, or </p> at the end of the string, but not both.

    You might want to use the /g modifier. Or what I would do:

    $testtext =~ s/^<p>//;
    $testtext =~ s!</p>$!!;
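To make the difference concrete, here is a small sketch comparing the original attempt, the /g variant, and the two-step version. The sample string and the sub names are hypothetical; only the patterns come from this thread.

```perl
use strict;
use warnings;

# Hypothetical sample string, for illustration only.
my $sample = '<p>some <b>bold</b> text</p>';

# Original attempt: without /g, s/// stops after the first match.
sub strip_once {
    my ($text) = @_;
    $text =~ s/(^<p>|<\/p>$)//;
    return $text;
}

# Same alternation with /g: both the leading and the trailing tag go.
sub strip_global {
    my ($text) = @_;
    $text =~ s/^<p>|<\/p>$//g;
    return $text;
}

# Two separate substitutions, one per end of the string.
sub strip_two_step {
    my ($text) = @_;
    $text =~ s/^<p>//;
    $text =~ s!</p>$!!;
    return $text;
}

print strip_once($sample),     "\n";    # some <b>bold</b> text</p>
print strip_global($sample),   "\n";    # some <b>bold</b> text
print strip_two_step($sample), "\n";    # some <b>bold</b> text
```

The two-step form is arguably the easiest to read, since each statement says exactly which end of the string it touches.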
      JavaFan - thanks! In your second line, the "!"s should be "/"s, right?
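On the delimiter question: the "!"s work as written. Perl lets s/// use other punctuation as its delimiter, which saves escaping the "/" in `</p>`. A short sketch with a made-up string, showing three equivalent spellings:

```perl
use strict;
use warnings;

# Three copies of the same hypothetical string.
my $with_slash = 'text</p>';
my $with_bang  = 'text</p>';
my $with_brace = 'text</p>';

$with_slash =~ s/<\/p>$//;    # default delimiter: the inner "/" must be escaped
$with_bang  =~ s!</p>$!!;     # "!" as delimiter: no escaping needed
$with_brace =~ s{</p>$}{};    # bracketing delimiters work as well

print "$with_slash $with_bang $with_brace\n";    # text text text
```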
Re: simple replace question
by poolpi (Hermit) on Jul 06, 2009 at 19:32 UTC

    If you need to deal with a HTML document:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTML::TokeParser;

    # Simple example :
    my $doc = <<END;
    <html>
    <head>
    </head>
    <body>
    <p>FooOooo barbar baz</p>
    <p>Babar Foofoo zba zba</p>
    <p>oofOOoof zzbb aarr</p>
    </body>
    </html>
    END

    my $p = HTML::TokeParser->new( \$doc );

    while ( $p->get_tag("p") ) {
        my $text = $p->get_trimmed_text;
        print "Text: $text\n";
    }


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: simple replace question
by morgon (Priest) on Jul 06, 2009 at 17:02 UTC
    Rather than capturing the tags I would capture the content:

    $testtext =~ s|^<p>(.*?)</p>$|$1|;
      morgon,

      This looks like the best suggestion! Given the greediness of regexes, would this approach ignore any internal <p> tags? That's what I want: just to strip off the first and last tag.

      thanks -
      janaki
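A hedged sketch of how the capturing approach behaves with internal tags. Note that `.*?` is actually non-greedy; internal tags survive because the `^` and `$` anchors force the match to span the whole string. The sub name `unwrap_p` and the /s flag (so that "." also matches newlines) are additions for illustration, not part of the suggestion above.

```perl
use strict;
use warnings;

# Strip only the outermost <p>...</p> pair; leave the string alone otherwise.
sub unwrap_p {
    my ($text) = @_;
    # /s is an added assumption: it lets "." match newlines, in case the
    # paragraph spans several lines.
    $text =~ s|^<p>(.*?)</p>$|$1|s;
    return $text;
}

print unwrap_p('<p>one</p><p>two</p>'), "\n";    # one</p><p>two
print unwrap_p('<p>no closing tag'),    "\n";    # <p>no closing tag
```

One difference from the two-step substitution: here nothing is removed unless both the opening and the closing tag are present.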
Re: simple replace question
by stevemayes (Scribe) on Jul 06, 2009 at 17:19 UTC
    Text::Trim's function is to strip whitespace off the ends of strings (although I think that I really like morgon's approach).
Re: simple replace question
by Marshall (Canon) on Jul 07, 2009 at 08:27 UTC
    There are a number of regexes that would work here. The basic idea is to replace stuff that looks like <p> or <Xp> with nothing, using a global match.

    #!/usr/bin/perl -w
    use strict;

    my $test = '<p>this is some<1p> paragraph</p>';
    $test =~ s/<.?p>//g;    # s/<.??p>//g; also ok
    print "$test\n";

    __END__
    prints:
    this is some paragraph
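One thing to keep in mind with `<.?p>`: the "." accepts any single character, so tags such as `<xp>` are removed too. That is the intent for `<1p>` above, but if only real paragraph tags should go, a narrower pattern is an option. A small comparison, using a made-up input string:

```perl
use strict;
use warnings;

my $loose  = '<p>keep <xp>custom</xp> tags</p>';    # hypothetical input
my $strict = $loose;

$loose  =~ s/<.?p>//g;     # "." matches any one character, so <xp> goes too
$strict =~ s{</?p>}{}g;    # only <p> and </p>

print "$loose\n";     # keep custom</xp> tags
print "$strict\n";    # keep <xp>custom</xp> tags
```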

Node Type: perlquestion [id://777545]
Approved by Corion