Try using B::Deparse (see perldoc B::Deparse to find out how it works) to compile and then decompile the Perl into a form that no longer has comments. The output source code will not match the input, but it should be equivalent.
They say that the only thing that can parse Perl is the Perl parser itself. Why? Well, a # can appear inside quoted strings and regular expressions, and you can also change the delimiters around these things with q(), qq(), m(), s()(), tr()() and so on. My suggestion is to try the deparser:
perl -MO=Deparse file.pl
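The same round trip can be tried from inside a script. This is only a minimal sketch: the sub and its comments are made up for illustration, and the exact text B::Deparse prints varies between Perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical example sub; the comments inside it exist only to show
# that they do not survive the round trip through the compiler.
my $code = sub {
    my ($n) = @_;    # take one argument
    # double it
    return $n * 2;
};

# coderef2text returns the body of the sub as regenerated source text.
my $deparser = B::Deparse->new;
my $text     = $deparser->coderef2text($code);

print "$text\n";
```

The printed body should still contain the multiplication, but the comments are gone, because perl never kept them once the code was compiled.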
You'll discover this soon enough, but you will probably have to leave at least one # line alone:
#!/usr/bin/perl -w
I'm not gonna write the regexes here, but I'll give my .02. Create a script that opens a file, runs some regexes over it and spits the result out to a different file (in case we break something).
If a line starts with #!, leave it alone. If the line starts with a #, it's a comment, so delete it. If you find a # elsewhere, and it is preceded by a ; (possibly with whitespace in between), hack off the end of the line. That should take care of 99% of your problems. I'd write it up, but I'm not at home now and don't have access to perl to test, so I won't confuse things with bad code. Maybe I'll drop back in and write it tomorrow though....
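For reference, the three rules above could be sketched roughly like this. The name strip_naive is made up for illustration, and this is just one literal reading of the description, not tested against real-world scripts.

```perl
use strict;
use warnings;

# strip_naive is a hypothetical helper that applies the three rules
# described above to a list of source lines.
sub strip_naive {
    my @lines = @_;
    my @kept;
    for my $line (@lines) {
        # Rule 1: leave the shebang line alone.
        if ( $line =~ /^#!/ ) {
            push @kept, $line;
            next;
        }
        # Rule 2: a line that starts with # is a comment; drop it.
        next if $line =~ /^#/;
        # Rule 3: after a semicolon, cut everything from the # onward.
        $line =~ s/;\s*#.*$/;/;
        push @kept, $line;
    }
    return @kept;
}
```

To follow the advice about writing to a different file, the returned lines would be printed to a new filehandle rather than back over the original.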
If a line starts with #!, leave it alone. If the line starts with a #, it's a comment, so delete it. If you find a # elsewhere, and it is preceded by a ; (possibly with whitespace in between), hack off the end of the line. That should take care of 99% of your problems.
This doesn't even begin to solve the problem. That won't handle this very legal example:
#!perl -w
use strict; # Always!
# Call method foo like this:
# my @results = foo( $arg1, \%hash ); # foo in array context
sub foo
{
my ( $arg, $hashref ) = @_;
$arg =~ m/some regex # with true embedded comments
on multiple lines/x;
$arg =~ m/some regex with a # sign in it./;
$arg =~ m/some regex with a ; # combo in it./;
my $result1 = "a string with; # in it";
my $result2 = q; # nasty!;;
 # this comment has ' ' as the first char.
return ( $result1, $result2 ) # no semi-colon!
}
And it gets worse, much worse. I didn't even mention the backslash escape problems.
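To make the damage concrete, here is a small sketch that applies only the "cut at ; #" rule to three lines from the example. The helper name chop_after_semicolon is hypothetical, and the substitution is just one way to read that rule.

```perl
use strict;
use warnings;

# chop_after_semicolon is a hypothetical one-line version of the
# "cut at ; #" rule, used only to show what it does to the example.
sub chop_after_semicolon {
    my ($line) = @_;
    $line =~ s/;\s*#.*$/;/;
    return $line;
}

# Three lines taken from the example above.
my @samples = (
    'my $result1 = "a string with; # in it";',
    '$arg =~ m/some regex with a ; # combo in it./;',
    'return ( $result1, $result2 ) # no semi-colon!',
);

for my $line (@samples) {
    print "before: $line\n";
    print "after:  ", chop_after_semicolon($line), "\n";
}
```

The first two lines come out truncated in the middle of a string and a regex, so the result no longer compiles, while the third keeps its comment because no semicolon precedes the #.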
I now have a similar problem in front of me: recognizing comments within Perl scripts. Reading my own Perl code, I see that only a small fraction of my comments (I have many; I make sure my code is readable) are preceded by a ';'.