in reply to Removing nested comments with regexes

The problem is that the dot "." does not match "\n". So after the first match, $1 = "--This Comment continues here", without the "\n" and the following line. If you use the /s modifier, the dot will match "\n", but then you have to be careful not to match too much: a greedy "^--.*\n" would then run across many lines, up to the last newline in the string. There are several ways around this, such as /^--.*?\n(.*)/s or /^--[^\n]*\n(.*)/s, but it might be simpler to do just this:
while ($string =~ s/^--.*\n$//m) {
    print "string = $string\nDeleted $1\ncount=$count\n";
    $count++;
}
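
To make the point about "." and "\n" concrete, here is a small sketch that compares the captures. It assumes the original pattern looked something like /^--.*\n(.*)/ (the question itself is not quoted here), and the sample string and the variable names $no_s, $greedy_s and $lazy_s are only for illustration.

```perl
use strict;
use warnings;

# Sample data (an assumption for illustration): two comment lines, then one ordinary line.
my $string = "--This is a comment\n--This Comment continues here\nNOT a comment;\n";

# Without /s, "." stops at a newline, so (.*) captures only the second line.
my ($no_s) = $string =~ /^--.*\n(.*)/;

# With /s and a greedy ".*", the match runs to the last newline,
# so nothing is left for (.*) to capture.
my ($greedy_s) = $string =~ /^--.*\n(.*)/s;

# With /s and a non-greedy ".*?", only the first line is consumed
# and (.*) captures everything after it, newlines included.
my ($lazy_s) = $string =~ /^--.*?\n(.*)/s;

print "no /s:     [$no_s]\n";
print "greedy /s: [$greedy_s]\n";
print "lazy /s:   [$lazy_s]\n";
```

The /^--[^\n]*\n(.*)/s variant should capture the same text as the non-greedy one, since "[^\n]*" can never cross a line boundary.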

Replies are listed 'Best First'.
Re: Re: Removing nested comments with regexes
by crenz (Priest) on May 19, 2003 at 00:31 UTC

    However, your code will stop working if you consider

    my $string = "--This is a comment\nNot a comment\n" . "--This Comment continues here\nNOT a comment";

    Comments do not necessarily occur at the beginning of the (rest of the) string. (At least that's how I understood the question.) I propose to simply do

    my $string = "...";
    $string =~ s/(^|\n)--[^\n]+//g;
    print "No comments: $string\n";

    If you're a perfectionist, you will notice that this leaves a leading newline if the string starts with a comment. You can fix that by using

    my $string = "...";
    $string =~ s/\n--[^\n]+//g;
    $string =~ s/^--[^\n]+\n//;
    print "No comments: $string\n";
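
For anyone who wants to try the two-step version, here is a minimal sketch that wraps the two substitutions in a helper and runs it on the example string from this reply. The name strip_comments is hypothetical, not something from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the two substitutions in the same order as above.
sub strip_comments {
    my ($string) = @_;
    $string =~ s/\n--[^\n]+//g;   # comments that follow a newline; that newline goes too
    $string =~ s/^--[^\n]+\n//;   # a comment at the very start of the string
    return $string;
}

my $input = "--This is a comment\nNot a comment\n"
          . "--This Comment continues here\nNOT a comment";

print "No comments: ", strip_comments($input), "\n";
```

Note that "[^\n]+" needs at least one character after the dashes, so a line consisting of a bare "--" would be left in place; changing "+" to "*" is one way to cover that case, if it matters for your data.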
Re: Re: Removing nested comments with regexes
by Anonymous Monk on May 18, 2003 at 21:12 UTC
    Thanks for the substitution idea!

    I removed the "$" from the regex since it wouldn't match the data correctly. Also, your example no longer has a capture group, so $1 is never set. Here is the fixed code:

    my $count = 1;
    my $string = "--This is a comment\n--This Comment continues here\nNOT a comment;\n";
    print "Original: $string\n";
    while ($string =~ s/^--.*\n//m) {
        print "string = $string\ncount=$count\n";
        $count++;
    }
    print "Final: $string\n";
    exit;
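
If the per-iteration printing is not needed, the loop can probably be collapsed into a single substitution with /g. This is only a sketch on the same sample string, and it relies on s///g returning the number of substitutions made.

```perl
use strict;
use warnings;

my $string = "--This is a comment\n--This Comment continues here\nNOT a comment;\n";

# /g replaces every match in one pass; /m lets ^ match after each newline.
# The substitution returns how many replacements were made (false if none).
my $count = ($string =~ s/^--.*\n//mg);

print "Removed $count comment line(s)\n";
print "Final: $string";
```

Like the loop, this only removes comment lines that end in a newline, so a comment on an unterminated last line would stay.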