in reply to strip perl comment lines

merlyn beat me to some of these comments (I'm only on my second cup of coffee), but I would add one thing: if you think it unlikely that here-docs (or multiline quoted strings) will contain lines that look like comments, I know I have such programs. It is also conceivable that a # character could be used as the delimiter for one of the quoting or regex operators:

$string =~ m# (some pattern) #x;
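To make those two failure modes concrete, here is a minimal sketch. The helper name strip_comment_lines and the sample text are made up for illustration, and the multi-line layout of the m# ... #x match is an assumption about how such code might be written:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop every line that starts with optional
# whitespace and '#', except "#!" lines.
sub strip_comment_lines {
    my ($source) = @_;
    # split /^/m keeps the trailing newline on each line
    return join '', grep { !/^\s*#(?!!)/ } split /^/m, $source;
}

# Sample program text in a single-quoted here-doc, so nothing interpolates.
my $sample = <<'END_SAMPLE';
print <<EOT;
# this line is data inside a here-doc, not a comment
EOT
$string =~ m#
    (some pattern)
#x;
END_SAMPLE

print strip_comment_lines($sample);
```

The here-doc loses its only line of data, and the match loses its closing delimiter, so the "stripped" program no longer compiles.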

If I were to do this I would take merlyn's suggestion of using '-' (but also allow the second argument to be optional), open the output handle up front (using $! in the error message) and take care of output right in the while loop so we don't need to build up the output in memory -- something along the lines of:

#!/usr/bin/perl -w
use strict;

die <<USAGE unless @ARGV and @ARGV <= 2;
$0 strips comment lines beginning with # from perl code
usage: perl $0 infile [outfile]
(output to stdout if no outfile given)
USAGE

my $infile  = shift;
my $outfile = shift || '-';    # ">-" opens STDOUT

open(IN,  "< $infile") or die "Couldn't open $infile: $!";
open(OUT, ">$outfile") or die "Couldn't open $outfile: $!";

my ($code, $comments) = (0, 0);
while (<IN>) {
    # Skip comment lines but keep "#!" lines. Count and skip in separate
    # statements: "$comments++ and next" would let the first comment
    # through, because the post-increment returns 0 (false) the first time.
    if (/^\s*#[^!]/) {
        $comments++;
        next;
    }
    print OUT;
    $code++;
}
close IN;
close OUT;

my $total = $code + $comments;
print <<SUMMARY;
$total lines read from $infile
$comments comment lines detected in $infile
$code lines written to $outfile
SUMMARY

But, in reality, I wouldn't do this, because it is destined to fail on some Perl code for the reasons already given, and we haven't even mentioned accidentally stripping things that look like comments in POD sections.

Re: Re: strip perl comment lines
by quinkan (Monk) on Mar 05, 2001 at 16:14 UTC
    Meanwhile, back at the ranch, there's a one-liner to be built around:
    use Regexp::Common;
    and
    s/$RE{comment}{Perl}//;
    if you want one.. Another Conway special.

      On the off chance that you aren't merely kidding around and haven't looked at the module in question, I feel compelled to point out a couple of things. Not only does Regexp::Common's Perl comment matcher (which is essentially this re: /#[^\n]*\n/) suffer from the various problems listed previously, but your example use also doesn't come close to what epoptai was originally trying to do (which was to just strip lines beginning with optional whitespace and a # character, with the exception of the shebang line). Your example of:

      s/$RE{comment}{Perl}//;

      would delete everything from *any* # character (comment or not) through the end of the line (including the newline character) and turn the following code:

      #!/usr/bin/perl -w
      use strict;
      $_ = "blah # blah";
      s#blah #boog#gx;
      print;

      into:

      use strict;
      $_ = "blah s print;

      Which is useful only insofar as it demonstrates the problems inherent in simple attempts to strip Perl comments.
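For anyone who wants to reproduce this without installing the module, here is a rough sketch that uses the plain regex quoted above as a stand-in for $RE{comment}{Perl}. That substitution is an assumption: the module's actual pattern may differ between versions. The name strip_with_regex is made up for the example.

```perl
use strict;
use warnings;

# Stand-in for $RE{comment}{Perl}, per the approximation quoted above.
my $perl_comment = qr/#[^\n]*\n/;

# Hypothetical helper: apply the substitution once to each line,
# the way a line-by-line one-liner would.
sub strip_with_regex {
    my (@lines) = @_;
    s/$perl_comment// for @lines;    # edits the copies held in @lines
    return join '', @lines;
}

# The sample program from above, in a single-quoted here-doc.
my $program = <<'END_PROGRAM';
#!/usr/bin/perl -w
use strict;
$_ = "blah # blah";
s#blah #boog#gx;
print;
END_PROGRAM

print strip_with_regex(split /^/m, $program);
```

Run this way, the shebang line disappears entirely, and each affected line loses its tail together with its newline, so the surviving fragments run into one another.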