in reply to Regular Expression Problem

I think what you were trying to accomplish is:

$remainder =~ s/<a[^>]*>//i;

That won't remove the end tag or the stuff in between though.
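
To make that concrete, here is a minimal sketch. The sub name and the sample string are made up for illustration; only the substitution itself comes from above.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the first opening <a ...> tag and nothing else.
sub strip_first_open_a {
    my ($text) = @_;
    $text =~ s/<a[^>]*>//i;
    return $text;
}

# Made-up sample input.
print strip_first_open_a('See <a href="/docs">the docs</a> for details.'), "\n";
# prints: See the docs</a> for details.
```

The link text and the closing </a> are both still there afterwards.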

You might want to look at HTML::LinkExtor, HTML::Parser, and HTML::TokeParser to do this kind of thing reliably.

Update: due to the discussion below, it dawned on me that I need a \b in order to avoid removing abbr tags (and one or two others that start with an "a".)

$remainder =~ s/<a\b[^>]*>//i;
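
A quick way to see the difference, as a sketch only. The sub names and the sample markup are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two substitutions being compared.
sub strip_without_boundary {
    my ($text) = @_;
    $text =~ s/<a[^>]*>//i;      # also matches <abbr ...>, <address>, etc.
    return $text;
}

sub strip_with_boundary {
    my ($text) = @_;
    $text =~ s/<a\b[^>]*>//i;    # \b requires the tag name to end after "a"
    return $text;
}

# Made-up sample: an abbr element followed by a real anchor.
my $html = '<abbr title="HyperText Markup Language">HTML</abbr> and <a href="/x">a link</a>';

print strip_without_boundary($html), "\n";
# prints: HTML</abbr> and <a href="/x">a link</a>

print strip_with_boundary($html), "\n";
# prints: <abbr title="HyperText Markup Language">HTML</abbr> and a link</a>
```

Without the \b the first thing eaten is the abbr tag; with it, the anchor's opening tag is the one removed.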

the_pusher_robot++ for the clue.

-sauoq
"My two cents aren't worth a dime.";

Re: Re: Regular Expression Problem
by the pusher robot (Monk) on Aug 29, 2002 at 03:15 UTC
    Actually, you want:
    $remainder =~ s/<a\s[^>]*>//i;
    (makes sure you only get a, not abbr, acronym, etc.)

    Update: now that I think about it, why not just: $remainder =~ s/<a\s.*?>//i; ?

    Update the second: d'oh... good catch, sauoq. How about $remainder =~ s/<a(>|\s.*?>)//i; ? Or is there a better way to do it?
      Actually, you want: . . .

      Actually, that's not what I want. Yours misses an anchor tag without attributes. As far as I know <A></A> is legal even if it isn't particularly useful. If you can show me that it isn't, I'll add the \s next time.

      I will concede that I need a \b though. :-)
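
      For anyone who wants to check the three variants side by side, here is a rough sketch. The labels, the sub name, and the sample tags are mine, purely for illustration.

      ```perl
      use strict;
      use warnings;

      # The three patterns from this thread, keyed by made-up labels.
      my %pattern = (
          whitespace  => qr/<a\s[^>]*>/i,
          boundary    => qr/<a\b[^>]*>/i,
          alternation => qr/<a(>|\s.*?>)/i,
      );

      # Hypothetical helper: does the named pattern match the given tag?
      sub tag_matches {
          my ($name, $tag) = @_;
          return $tag =~ $pattern{$name} ? 1 : 0;
      }

      # Made-up samples: a bare anchor, an anchor with an attribute, a non-anchor.
      my @samples = ('<a>', '<A name="top">', '<abbr title="x">');

      for my $name (sort keys %pattern) {
          for my $tag (@samples) {
              printf "%-12s %-18s %s\n", $name, $tag,
                  tag_matches($name, $tag) ? 'match' : 'no match';
          }
      }
      ```

      On these samples the \s version misses the bare <a>, the other two catch it, and none of the three touches the abbr tag.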

      -sauoq
      "My two cents aren't worth a dime.";