Re: run-time syntax checking

As you asked about syntax checking and not security, I'll assume that you have the security aspect covered.

There's probably a gotcha I haven't thought about, but if you put a return at the very front of the code to be checked, that will prevent any further code being executed (except, of course, for BEGIN{...} and END{...} blocks; see below) while still allowing syntax checking to occur.

I believe (quite possibly wrongly) that if you remove any newlines, BEGIN and END blocks can be matched with a regex of /(?:BEGIN|END)\s*\{/. Using this to prefix them with something like sub syntax_check_ will prevent them from being run, but will still allow them to be syntax checked.

#! perl -slw
use strict;

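while(<DATA>) {
    chomp;
    my $code = $_;      # keep the original text for the report
    tr/\n//d;
    # Rename BEGIN/END blocks to ordinary subs so they compile but never run.
    s[((?:BEGIN|END)\s*\{)][sub syntax_check_$1]g;
    # The leading return stops execution; the rest is still compiled.
    eval 'return;' . $_;
    print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
}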
__END__
INIT { print '*** GOTCHA!!! ***'; }
BEGIN{ print '*** GOTCHA!!! ***'; }
END  { print '*** GOTCHA!!! ***'; }
my $a = 1;
my $a = cool;
my $

Whether this level of syntax checking is enough to satisfy your requirements only you will know.

Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Re: run-time syntax checking by ihb (Deacon) on Feb 01, 2003 at 16:57 UTC
Beware that `BEGIN`, `END`, `CHECK`, `INIT`, `AUTOLOAD`, and `DESTROY` can have a `sub` keyword in front of them. They can also be prototyped. `AUTOLOAD` and `DESTROY` are not relevant for this post, though. However, `CHECK` and `INIT` are relevant! Try this one-liner:

    perl -wle'BEGIN { require "browserUk.pl" }'

where browserUk.pl is your program above, but with `__END__` changed to `__DATA__` (why did you use `__END__` over `__DATA__` anyway?), some code in `DATA` changed, and with a true return value:

    use strict;

    while(<DATA>) {
        chomp;
        my $code = $_;
        tr/\n//d;
        s[((?:BEGIN|END)\s*\{)][sub syntax_check_$1]g;
        eval 'return;' . $_;
        print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
    }
    1;
    __DATA__
    INIT      { print '*** GOTCHA!!! ***'; } # Executes.
    CHECK     { print '*** GOTCHA!!! ***'; } # Executes.
    sub BEGIN { print '*** GOTCHA!!! ***'; } # Fails to compile.
    END ()    { print '*** GOTCHA!!! ***'; } # Executes.
    my $a = 1;
    my $a = cool;
    my $

Another issue is `use()` statements, as they're compile-time statements. But if you get rid of the `use` statements, you might also create compilation errors, since a `use()` statement might import prototyped subroutines. For instance, perhaps a `&cool` subroutine prototyped with `()` was imported in the code above.

And I wouldn't be surprised if there are more related issues...

ihb
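The `CHECK`/`INIT` point can be reproduced in a single file, without the `require` setup. This is only a sketch: the `@ran` array and the snippet text are made up for illustration, and it assumes the check itself happens inside a `BEGIN` block, i.e. while the main program is still compiling, which is the situation the one-liner above arranges.

```perl
use strict;
use warnings;

our @ran;    # hypothetical log of which special blocks fired

BEGIN {
    # Hypothetical user code to be "syntax checked".
    my $snippet = q{
        CHECK { push @main::ran, 'CHECK' }
        INIT  { push @main::ran, 'INIT'  }
        push @main::ran, 'body';
    };

    # The thread's trick: a leading return stops the body from running,
    # but CHECK and INIT are queued at compile time regardless.
    eval "return; $snippet";
    die $@ if $@;
}

print "blocks that ran: @ran\n";
```

This prints `blocks that ran: CHECK INIT`: the body is skipped, but both special blocks fire once the main program finishes compiling. If the same `eval` is done at ordinary run time instead, perl has already passed those phases and, with warnings on, reports "Too late to run CHECK block" and "Too late to run INIT block".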
Re: Re: Re: run-time syntax checking by BrowserUk (Patriarch) on Feb 01, 2003 at 20:44 UTC
I knew there were other INIT/BEGIN/END type compile-time subs but couldn't remember what they were, and both perldoc and a grep of the html files failed to turn up where these are documented before I posted.

Adding `CHECK` (and the others, if need be) to the regex is trivial, as is handling those same keywords prefixed with `sub`.

Dealing with `use` statements seems to be equally simple: switch `use` for `no`. The statement remains syntactically valid but does not cause any processing of the module. E.g., for the following test I replaced the `1;` at the end of POSIX.pm with `print 'POSIX.pm processed'.$/;`

    C:\test>perl -c
    use POSIX qw[ceil floor];
    print ceil($a), floor($b);
    POSIX.pm processed
    no POSIX qw[ceil floor];
    print ceil($a), floor($b);
    ^Z
    - syntax OK

    C:\test>

As you can see, this change allows the syntax of the statement to be checked without the module it refers to being processed.

However, a sub defined in and exported from a module with a prototype of `()` is a problem.

    C:\test>perl -c
    sub cool(){ print 'cool',$/; }
    my $a = cool;
    ^Z
    - syntax OK

I can't see a manoeuvre around that one other than adding a restriction to the code that subs must be invoked with either & or (). Depending on what the OP was trying to achieve, that might be acceptable, but as a general facility, it would suck a lot.

With regard to __DATA__ versus __END__: dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference.

On which note :), why did you use `perl -e'BEGIN{ require "prog.pl" }'` instead of `prog.pl` or `perl prog.pl`? I thunk and thunk and thunk some more and can't see the logic behind that one :).
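The prototype problem can be shown with the same `eval 'return;' . $code` trick. In this sketch `compile_error` and the `Demo::` package names are hypothetical helpers, not anything from the programs above; the only assumption is that the snippet is checked under `use strict`.

```perl
use strict;
use warnings;

# Compile a snippet without running its body and return the
# compile error, or an empty string if it compiled cleanly.
sub compile_error {
    my ($code) = @_;
    local $@;
    eval "return; $code";
    my $err = $@;
    return $err;
}

# No sub named cool is visible: strict rejects the bareword.
my $without = compile_error('package Demo::Without; my $x = cool;');

# A sub with a () prototype is visible: the same statement compiles.
my $with = compile_error(
    'package Demo::With; sub cool() { "cool" } my $x = cool;'
);

print "without cool(): ", ( $without ? "fails" : "passes" ), "\n";
print "with cool():    ", ( $with    ? "fails" : "passes" ), "\n";
```

The first snippet fails and the second passes, so whether `my $a = cool;` counts as valid depends entirely on what the stripped `use` would have imported.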
Re: Re: Re: Re: run-time syntax checking by ihb (Deacon) on Feb 02, 2003 at 01:28 UTC
> a grep of the html files failed to turn up where these are documented before I posted

perlmod - Package Constructors and Destructors

> Dealing with use statements seems to be equally simple. Switch use for no. The statement remains syntactically valid but does not cause any processing of the module.

Since `no` is supposed to call the `&unimport` method on the package, the module must be loaded. The reason your example seems to prove the opposite is that the module had already been loaded by the `use` statement. Add another `use`, and you'll see that it's not loaded a second time. Or remove the `use` statement and you'll see that the module is indeed loaded by the `no` statement if it hasn't been loaded before.

Changing `use` to `no` can also have nasty side-effects. E.g. `use 5.006;` or `use charnames ':short'; print "\N{greek:Sigma}";`, and of course, all stricture will be turned off, so the fact that constants are no longer constants but barewords shouldn't cause any trouble anyway; same with `no vars qw/.../`, etc, etc. You get the point.

> With regard to the __DATA__ versus __END__. Dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference

I occasionally find myself putting data in the module itself, e.g. a Parse::RecDescent grammar. I sometimes use the `DATA` filehandle for that. I can't use `__END__` since that opens a `DATA` filehandle only if it's in the top-level file. That's why I had to change `__END__` to `__DATA__` when doing `require` on your program file. This is documented in perldata.

> why did you use perl -e'BEGIN{ require "prog.pl" }' instead of prog.pl or perl prog.pl?

If you do `perl prog.pl`, the `CHECK` and `INIT` blocks will be defined after top-level compilation, and thus it will be too late to run them. But by doing `BEGIN { require 'prog.pl' }` I define the `CHECK` and `INIT` blocks before top-level run-time, and thus make them execute.

You've updated your pattern, but it still doesn't cover `BEGIN () { }`. (Carefully note that the parentheses in the prototype are balanced. The prototype `((){)` will compile.) The pattern has some other issues. It'll make `subsub BEGIN { }` compile, and `sub myBEGIN { }` not compile, etc. And what about attributes?

My point with this and the previous post is that it's not as trivial to do this as it might seem. There are a lot of Perl quirks to remember -- or more likely -- forget.

ihb
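The claim that `no` loads a module can be checked by looking at `%INC` before and after the statement. This is a minimal sketch, assuming nothing else in the process has already pulled in the module; Text::ParseWords is picked only because it is a small core module, and the `@seen` array is made up for the example.

```perl
use strict;
use warnings;

our @seen;    # hypothetical log of the module's load state

# BEGIN blocks run in source order during compilation, so these two
# probes bracket the compile-time effect of the 'no' statement.
BEGIN { push @seen, exists $INC{'Text/ParseWords.pm'} ? 'loaded' : 'not loaded' }
no Text::ParseWords;
BEGIN { push @seen, exists $INC{'Text/ParseWords.pm'} ? 'loaded' : 'not loaded' }

print "before 'no': $seen[0]\n";
print "after 'no':  $seen[1]\n";
```

Run on its own, this reports `not loaded` before the statement and `loaded` after it, so any code in the module's file is executed by `no` just as it would be by `use`.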
Re: Re: Re: Re: Re: run-time syntax checking by BrowserUk (Patriarch) on Feb 02, 2003 at 04:33 UTC
Re: Re: Re: Re: Re: Re: run-time syntax checking by ihb (Deacon) on Feb 28, 2003 at 23:43 UTC