Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks
I have this to remove any tag between < and >, including the delimiters:
$question =~ s/<[^<>]*>//g;

I want to remove any comment between * delimiters, including the delimiters themselves. I tried the same regexp using /* instead of <>, but it only removes the * and not what is in between. What is happening?
Thanks

Replies are listed 'Best First'.
Re: Remove comments between delimiters
by swiftone (Curate) on Sep 20, 2002 at 13:24 UTC
    First, do you realize that your <> regex may not work as intended if you have nested or broken pairs? For example:

    <This works>
    <<This will break>>
    <<As will this>
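To see what "break" means for those three inputs, here is a minimal sketch. The helper name strip_tags is made up for illustration; the regex itself is the one from the question. Since s///g does not rescan its own output, the nested and broken cases leave stray brackets behind.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the tag-stripping regex from the question.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/<[^<>]*>//g;
    return $text;
}

# Show what is left of each example after the substitution.
for my $input ('<This works>', '<<This will break>>', '<<As will this>') {
    printf "%-22s => '%s'\n", $input, strip_tags($input);
}
```

The first input is removed completely, the second leaves '<>' and the third leaves '<'.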

    Second, I'm not sure I understand your question. Are you trying to remove /*This*/
    Or turn it into //
    Or are you just trying to get rid of *This*?

    Assuming the latter, s/\*[^*]*\*//g should do what you want.

    The trick is that with distinct opening and closing delimiters (<>), you can use the presence of a starting tag (<) to stop a greedy match from eating the rest of your string. With identical delimiters (*foo*), a greedy dot star will consume everything up to the final matching element. So we either switch to non-greedy matching with a dot star (.*?), or stay greedy without consuming our delimiter (the [^*]).
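A small sketch of the difference, assuming a sample string with two *...* comments on one line. The helper strip_with and the sample text are made up for illustration; the three patterns are the greedy dot star, the non-greedy dot star, and the negated character class discussed above.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every match of the given pattern.
sub strip_with {
    my ($text, $re) = @_;
    $text =~ s/$re//g;
    return $text;
}

my $sample = 'a *one* b *two* c';

my @patterns = (
    [ 'greedy .*'     => qr/\*.*\*/    ],  # runs from the first * to the last *
    [ 'non-greedy .*?' => qr/\*.*?\*/  ],  # stops at the nearest closing *
    [ 'negated [^*]*'  => qr/\*[^*]*\*/ ], # greedy, but cannot cross a *
);

for my $entry (@patterns) {
    my ($label, $re) = @$entry;
    printf "%-15s => '%s'\n", $label, strip_with($sample, $re);
}
```

The greedy version also swallows the " b " between the two comments, while the other two keep it.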

    I hope that answers your question.

    Update: Death to Dot Star! is an excellent discussion of greediness, the greedy dot-star, and why non-greediness isn't always the best solution.
    Update2: Removed my non-greedy match that was in defiance of my own advice :)

Re: Remove comments between delimiters
by Anonymous Monk on Sep 20, 2002 at 13:20 UTC
    Thanks Monks, I did this and it works:
    $question =~ s/\*(.*?)\*//g;

      It might not matter in your case, but this will only catch comments that start and end on the same line, since '.' does not match newlines by default.

      You can try these:

      $question =~ s/\*.*?\*//sg; # the s modifier causes . to match newlines

      or

      $question =~ s/\*[^*]*\*//sg; # [^*] matches the newlines too

      Update: added a * in the second regexp (thanks swiftone) and removed the (); you don't need to capture the comments.
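A quick way to check the newline behaviour, assuming a made-up sample string whose comment spans two lines. The helper strip_with is hypothetical; the patterns are the ones from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every match of the given pattern.
sub strip_with {
    my ($text, $re) = @_;
    $text =~ s/$re//g;
    return $text;
}

# The comment between the two * characters spans a line break.
my $sample = "keep *multi\nline* this";

my $no_s    = strip_with($sample, qr/\*.*?\*/);    # '.' stops at the newline
my $with_s  = strip_with($sample, qr/\*.*?\*/s);   # /s lets '.' match newlines
my $negated = strip_with($sample, qr/\*[^*]*\*/);  # [^*] matches newlines anyway

print "without /s: ", ($no_s eq $sample ? "unchanged" : "changed"), "\n";
print "with /s:    '$with_s'\n";
print "negated:    '$negated'\n";
```

Without /s the string comes back unchanged; the other two both remove the two-line comment.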

Re: Remove comments between delimiters
by Anonymous Monk on Sep 20, 2002 at 13:45 UTC
    Thank you again for your suggestions. I have already tried them all :), taking the check for newlines into account.
Re: Remove comments between delimiters
by rinceWind (Monsignor) on Sep 21, 2002 at 10:51 UTC