ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm probably going to get scalded for this, but I've gotta ask ;) I'm trying to remove the following contents from an HTML string:

<!-- #BeginEditable "body" -->
<table width="100%" border="0"><tr><td width="45%" align="center"><span class="title"></span></td>
<td width="5%" rowspan="3" align="center"><img src="http://www.backyardgardener.com/images/spacer.gif" width="20" height="20"></td>
<td width="50%" rowspan="3" valign="top"> <!-- Begin: Sponsor Table --> </td> </tr>
<tr><td width="45%" align="left"> <img src="http://www.backyardgardener.com/images/annualpop1.jpg" alt="Maybe I can get those ducks to come to my water garden!" width="180" height="245" align="right">
<p class="main"> &nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#article">Articles</a> :: <br>
&nbsp;&nbsp; <a href="./#link">Links</a> ::<br>
&nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#design">Design</a> ::<br>
&nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#use">How to Use</a> ::</p> </td> </tr></table>
The contents inside the table can vary, so I'm trying to find it using a non-greedy match. The code is:
my $code = qq|
<td width="100%" class="main12" bgcolor="#FFFFFF" valign="top"> </tr><tr><td width="50%" valign="top">
<table width="100%" border="0" cellspacing="0" cellpadding="12"><tr><td><!-- #BeginEditable "body" -->
<table width="100%" border="0"><tr><td width="45%" align="center"><span class="title"></span></td>
<td width="5%" rowspan="3" align="center"><img src="http://www.backyardgardener.com/images/spacer.gif" width="20" height="20"></td>
<td width="50%" rowspan="3" valign="top"> <!-- Begin: Sponsor Table --> </td> </tr>
<tr><td width="45%" align="left"> <img src="http://www.backyardgardener.com/images/annualpop1.jpg" alt="Maybe I can get those ducks to come to my water garden!" width="180" height="245" align="right">
<p class="main"> &nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#article">Articles</a> :: <br>
&nbsp;&nbsp; <a href="./#link">Links</a> ::<br>
&nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#design">Design</a> ::<br>
&nbsp;&nbsp; <a href="../garden-themes/annual-garden/annual-flowers-and-garden-designs-25/#use">How to Use</a> ::</p> </td> </tr></table>
<table width="100%" border="0"><tr><td colspan="2"><span class="main"><strong><a href="/garden-themes/annual-garden/annual-flower-information-what-to-know-and-how-to-use/">Information on 50+ annual flowers</a></strong></span></td>
<a name="article" id="article"></a><b>Articles </b>
<p class="main">&nbsp;&nbsp; <a href="/garden-themes/annual-garden/118/">Try growing them form seed</a><br>
&nbsp;&nbsp; <a href="/garden-themes/annual-garden/growing-annuals-for-quick-color/">Grow annuals for quick color</a> </p>
<p class="main"><a name="design" id="design"></a><b>Design</b></p>
<p class="main"> &nbsp;&nbsp; <a href="/garden-themes/annual-garden/annual-garden-design/">Design 1 </a> - <a href="/garden-themes/annual-garden/here-is-design-you-can-plant/">Design 2</a> - <a
href="/garden-themes/annual-garden/annaul-flower-garden-design-8-plants/">Design 3 </a><br>
&nbsp;&nbsp; <a href="http://www.backyardgardener.com/images/annual1a.gif" target="_blank">Design 4 </a> - <a href="http://www.backyardgardener.com/images/annual3.gif" target="_blank">Design 5 </a> - <a href="http://www.backyardgardener.com/images/annual4.gif" target="_blank">Design 6 </a><br>
|;

$code =~ s|\Q<!-- #BeginEditable "body" -->\E.*?\Q</table>||im;

print "NEW: $code\n";


However, this doesn't seem to work :( I've gotta be missing something stupid. Can anyone point me in the right direction?

Thanks!

Andy

Re: Really basic regex question!
by AnomalousMonk (Archbishop) on Mar 24, 2016 at 06:13 UTC

    Your $code string seems to have newlines in it, and by default, . (dot) (update: in the .*? part of the regex) does not match newlines. Try using the /s (dot matches all) regex modifier on your substitution.

    Update: Please see Modifiers in perlre, and s/// in Regexp Quote-Like Operators in perlop.


    Give a man a fish:  <%-{-{-{-<

      Ah man - now I feel stupid! That worked perfectly, thanks! :)
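
For anyone landing here later, the fix suggested above can be sketched roughly as follows. This is only a minimal illustration: the shortened $html string and the variable names $without_s and $with_s are made up for the example and are not the original page or code. Note that /m would not help here, since it only changes where ^ and $ match; it is /s that lets . match a newline.

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the original $code string.
# What matters is that newlines sit between the marker and </table>.
my $html = <<'END_HTML';
<td><!-- #BeginEditable "body" -->
<table width="100%" border="0">
<tr><td>first table</td></tr>
</table><table width="100%" border="0">
<tr><td>second table</td></tr>
</table>
END_HTML

# Without /s: . cannot cross the newline right after the marker,
# so .*? never reaches </table> and nothing is replaced.
(my $without_s = $html) =~ s|\Q<!-- #BeginEditable "body" -->\E.*?\Q</table>\E||i;

# With /s: . also matches newlines; the non-greedy .*? still stops
# at the first </table>, so only the first table is removed.
(my $with_s = $html) =~ s|\Q<!-- #BeginEditable "body" -->\E.*?\Q</table>\E||is;

print "without /s: ", ($without_s eq $html ? "unchanged" : "changed"), "\n";
print "with /s:    ", ($with_s    eq $html ? "unchanged" : "changed"), "\n";
print "NEW: $with_s";
```

With /s in place, the second table is left alone because .*? is non-greedy. For HTML that varies more than this, a regex like this stays fragile, so treat it as a quick fix rather than a general solution.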