in reply to Re: Regex grabs too much
in thread Regex grabs too much

It sounds like you could express what you want as follows:
If one of my special (~ ... ~) tags is found inside an HTML tag, replace that whole HTML tag with my special tag. Otherwise leave it alone.
In that case the following code should do the trick:
$data =~ s/<[^<>]*?(\(~ .*? ~\))[^<>]*?>/$1/g;

comment added in response to the following message

[^<>]
That is a character class that matches any one character other than < or >. The perlre documentation has more details on how this works. My RE works by looking for the opening < of an HTML tag, then 0 or more characters that are neither < nor >, then the special tag, then 0 or more such characters again, then the > that closes the original HTML tag.
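
Here is a small sketch of that substitution run on made-up input. The sample HTML and the contents of the special tag are invented for illustration and are not from the original question:

```perl
use strict;
use warnings;

# Hypothetical sample data: one HTML tag that wraps a special tag,
# followed by ordinary HTML tags that should be left alone.
my $data = '<a href="(~ link ~)">text</a> <b>bold</b>';

# Same substitution as above: an HTML tag containing a (~ ... ~) tag
# is replaced by the special tag alone.
$data =~ s/<[^<>]*?(\(~ .*? ~\))[^<>]*?>/$1/g;

print "$data\n";   # prints: (~ link ~)text</a> <b>bold</b>
```

Only the <a ...> tag is replaced, because it is the only place where a < can reach the special tag without first crossing another < or >.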

RE: RE: Re: Regex grabs too much
by raflach (Pilgrim) on Jun 05, 2000 at 21:02 UTC
    That worked perfectly! Can you explain what
    [^<>]
    actually is doing? That seems to be the only difference between your code and mine, and it confuses me considerably.
      It means "not one of those two characters"

      viva el perl libre

      I am not an lhoward and I do not play one on TV, but I can explain that bit of regex.

      lhoward has defined a set, as indicated by the [ ]. When perl's regex engine sees this, any one character within the brackets will be considered a match. However, lhoward was tricky and made the first character a ^. When the first character of a set is a ^, it negates the set ( mathematicians call it the complement, but I can't spell 'complement' ) and tells the regex to match any single character except those in the set.

      [^<>]
      is saying to match anything that isn't a < or a >
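
      A quick way to see this is to test a few single characters against the class. This is only a sketch; the list of test characters is arbitrary:

```perl
use strict;
use warnings;

# Test each single character against the negated class [^<>].
# Note: the ^ outside the brackets is an anchor (start of string);
# only the ^ just inside the brackets negates the set.
my @results;
for my $char ('a', ' ', '<', '>') {
    my $verdict = $char =~ /^[^<>]$/ ? 'matches' : 'does not match';
    push @results, "'$char' $verdict";
}
print "$_\n" for @results;
```

      The letter and the space match; the < and the > do not.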

      HTH,
      mikfire