in reply to run-time syntax checking

As you asked about syntax checking and not security, I'll assume that you have the security aspect covered.

There's probably a gotcha I haven't thought about, but if you put a return at the very front of the code to be checked, that will prevent any further code from being executed (except, of course, for BEGIN{...} and END{...} blocks; see later) while still allowing syntax checking to occur.

I believe (quite possibly wrongly) that, if you remove any newlines, BEGIN and END blocks can be matched with a regex of /(?:BEGIN|END)\s*\{/. Using this to prefix them with something like sub syntax_check_ will prevent them from being run, while still allowing them to be syntax checked.

    #! perl -slw
    use strict;

    while(<DATA>) {
        chomp;
        my $code = $_;
        tr/\n//d;
        # Rename BEGIN/END blocks so they are compiled but never run.
        s[((?:BEGIN|END)\s*\{)][sub syntax_check_$1]g;
        # The leading return stops the code from running once it has compiled.
        eval 'return;' . $_;
        print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
    }

    __END__
    INIT { print '*** GOTCHA!!! ***'; }
    BEGIN{ print '*** GOTCHA!!! ***'; }
    END { print '*** GOTCHA!!! ***'; }
    my $a = 1;
    my $a = cool;
    my $

Whether this level of syntax checking is enough to satisfy your requirements only you will know.


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Re: run-time syntax checking
by ihb (Deacon) on Feb 01, 2003 at 16:57 UTC

    Beware that BEGIN, END, CHECK, INIT, AUTOLOAD, and DESTROY can have a sub keyword in front of them. They can also be prototyped. AUTOLOAD and DESTROY are not relevant to this post, though.

    However, CHECK and INIT are relevant! Try this one-liner:   perl -wle'BEGIN { require "browserUk.pl" }' where browserUk.pl is your program above, but with __END__ changed to __DATA__ (why did you use __END__ over __DATA__ anyway?), some code in DATA changed, and with a true return value:

        use strict;

        while(<DATA>) {
            chomp;
            my $code = $_;
            tr/\n//d;
            s[((?:BEGIN|END)\s*\{)][sub syntax_check_$1]g;
            eval 'return;' . $_;
            print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
        }

        1;

        __DATA__
        INIT { print '*** GOTCHA!!! ***'; }       # Executes.
        CHECK { print '*** GOTCHA!!! ***'; }      # Executes.
        sub BEGIN { print '*** GOTCHA!!! ***'; }  # Fails to compile.
        END () { print '*** GOTCHA!!! ***'; }     # Executes.
        my $a = 1;
        my $a = cool;
        my $

    Another issue would be to take care of use() statements, as they're compile-time statements. But if you get rid of use statements, then you might also create compilation errors, since the use() statement might import prototyped subroutines. For instance, perhaps a &cool subroutine prototyped with () was imported in the code above.

    And I wouldn't be surprised if there are more related issues...
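
    A minimal sketch of the point about the leading sub keyword, assuming the same eval 'return;' trick as in the parent post. The @ran log and the two sample strings are made up for illustration. Both spellings are compiled as BEGIN blocks, so both fire during compilation and the return does nothing to stop them:

```perl
use strict;
use warnings;

our @ran;    # hypothetical log of blocks that actually executed

# Two spellings of the same compile-time block (illustrative samples).
my %samples = (
    'plain BEGIN' => q{BEGIN { push @main::ran, 'plain BEGIN' }},
    'sub BEGIN'   => q{sub BEGIN { push @main::ran, 'sub BEGIN' }},
);

for my $name (sort keys %samples) {
    # The leading return only stops run-time code; BEGIN fires while compiling.
    eval 'return;' . $samples{$name};
    print "$name: ", ($@ ? "failed: $@" : 'compiled'), "\n";
}

print "Executed anyway: @ran\n";
```

    So any rewriting step has to recognise the sub-prefixed form as well as the bare one.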

    ihb

      I knew there were other INIT/BEGIN/END type compile-time subs but couldn't remember what they were, and both perldoc and a grep of the html files failed to turn up where these are documented before I posted. Adding CHECK (and the others, if need be) to the regex is trivial, as is handling those same keywords prefixed with sub.

      Dealing with use statements seems to be equally simple. Switch use for no. The statement remains syntactically valid but does not cause any processing of the module. E.g., for the following test I replaced the 1; at the end of POSIX.pm with print 'POSIX.pm processed'.$/;

          C:\test>perl -c
          use POSIX qw[ceil floor];
          print ceil($a), floor($b);
          POSIX.pm processed
          no POSIX qw[ceil floor];
          print ceil($a), floor($b);
          ^Z
          - syntax OK

          C:\test>

      As you can see, this change allows the syntax of the statement to be checked without the module it refers to being processed.

      However, the sub defined in and exported from a module with a prototype of () is a problem.

          C:\test>perl -c
          sub cool(){ print 'cool',$/; }
          my $a = cool;
          ^Z
          - syntax OK

      I can't see a way around that one other than adding a restriction to the code that subs must be invoked with either & or (). Depending on what the OP was trying to achieve, that might be acceptable, but as a general facility, it would suck a lot.
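
      A rough sketch of why the declaration matters, assuming the checked code is compiled under strict. The helper compiles() and the sub names frosty and cool are made up for illustration. A bareword call only compiles if perl has already seen a declaration for it, which is what the stripped use statement would have supplied:

```perl
use strict;
use warnings;

# Returns 1 if $code compiles under strict, 0 otherwise. The leading return
# keeps ordinary statements from running (compile-time blocks still would).
sub compiles {
    my ($code) = @_;
    my $ok = eval "return 1; $code";
    return $ok ? 1 : 0;
}

# 'frosty' is never declared; 'cool' is declared with a () prototype first.
my $without = compiles('my $x = frosty;');
my $with    = compiles('sub cool() { "cool" } my $x = cool;');

print "bareword call without a declaration: ", ($without ? 'compiles' : 'fails'), "\n";
print "bareword call after sub cool():      ", ($with    ? 'compiles' : 'fails'), "\n";
```

      Without strict the undeclared bareword would quietly become a string instead, so the check would pass while meaning something different.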

      With regard to the __DATA__ versus __END__. Dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference.

      On which note:), why did you use perl -e'BEGIN{ require "prog.pl" }' instead of prog.pl or perl prog.pl?

      I thunk and thunk and thunk some more and can't see the logic behind that one:).

      test prog as it currently stands

      Output



        a grep of the html files failed to turn up where these are documented before I posted

        perlmod - Package Constructors and Destructors

        Dealing with use statements seems to be equally simple. Switch use for no. The statement remains syntactically valid but does not cause any processing of the module.

        Since no is supposed to call the unimport method on the package, the module must be loaded. Your example only seems to prove the opposite because the module has already been loaded by the use statement. Add a second use and you'll see that it's not loaded a second time. Or remove the use statement and you'll see that the module is indeed loaded by the no statement if it hasn't been loaded before.

        Changing use to no can also have nasty side-effects, e.g. with use 5.006; or use charnames ':short'; print "\N{greek:Sigma}";. And of course, all strictures will be turned off, so the fact that constants are no longer constants but barewords shouldn't cause any trouble anyway. The same goes for no vars qw/.../, etc, etc. You get the point.
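
        A small sketch of that last case. File::Basename is just a stand-in for a core module that nothing else in the script has loaded yet; it is not part of the original test. The no statement is wrapped in the same eval 'return;' check, and %INC shows whether the module file was processed:

```perl
use strict;
use warnings;

# File::Basename is only an example of a module not loaded so far.
my $before = exists $INC{'File/Basename.pm'} ? 1 : 0;

# Compile-only check of a 'no' statement. The return blocks run-time code,
# but 'no' is handled at compile time, just like 'use'.
eval 'return; no File::Basename;';
die $@ if $@;

my $after = exists $INC{'File/Basename.pm'} ? 1 : 0;

print "loaded before: $before, loaded after: $after\n";
```

        The module ends up in %INC even though it defines no unimport method of its own.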

        With regard to the __DATA__ versus __END__. Dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference

        I occasionally find myself putting data in the module itself, e.g. a Parse::RecDescent grammar. I sometimes use the DATA filehandle for that. I can't use __END__ since that opens a DATA filehandle only if it's in the top-level file. That's why I had to change __END__ to __DATA__ when doing require on your program file. This is documented in perldata.
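
        A sketch that checks this, under the assumption that two throwaway files written to a temporary directory are a fair stand-in for real modules. The package names WithData and WithEnd are invented for the example. Each file is loaded with require, and the script then looks for an IO slot in the package's DATA glob:

```perl
use strict;
use warnings;
no warnings 'once';    # the DATA globs below are mentioned only once
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical package names; each file ends with a different marker.
my $dir    = tempdir(CLEANUP => 1);
my %marker = (WithData => '__DATA__', WithEnd => '__END__');

for my $pkg (sort keys %marker) {
    my $path = File::Spec->catfile($dir, "$pkg.pl");
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} "package $pkg;\n1;\n$marker{$pkg}\nsome payload\n";
    close $fh or die "Cannot close $path: $!";
    require $path;
}

# The IO slot is filled only if perl opened a DATA handle for that package.
my $data_ok = defined(*WithData::DATA{IO}) ? 1 : 0;
my $end_ok  = defined(*WithEnd::DATA{IO})  ? 1 : 0;

print "required file with __DATA__ has a DATA handle: $data_ok\n";
print "required file with __END__ has a DATA handle:  $end_ok\n";
```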

        why did you use perl -e'BEGIN{ require "prog.pl" }' instead of prog.pl or perl prog.pl?

        If you do perl prog.pl the CHECK and INIT blocks will be defined after top-level compilation, and thus it will be too late to run them. But by doing BEGIN { require 'prog.pl' } I define the CHECK and INIT blocks before top-level run-time, and thus make them execute.
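
        A sketch of the timing difference, with one substitution: it uses a string eval inside a BEGIN block in place of the BEGIN { require ... } one-liner, which I assume behaves the same way for this purpose. The @ran log is made up for illustration, and the script is meant to be run directly as the top-level program:

```perl
use strict;
use warnings;

our @ran;    # hypothetical log of blocks that actually executed

# Compiled while the main program is still being compiled, so the INIT
# block is queued in time and runs before the main run phase starts.
BEGIN {
    eval q{ return; INIT { push @main::ran, 'INIT seen during compilation' } };
}

# Compiled at run time, when the INIT phase is already over, so the block
# never runs. The 'void' category silences "Too late to run INIT block".
eval q{ no warnings 'void'; return; INIT { push @main::ran, 'INIT seen at run time' } };

print "Blocks that ran: @ran\n";
```

        Only the block compiled inside BEGIN shows up in the log, which is why a plain perl prog.pl hides the problem.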

        You've updated your pattern, but it still doesn't cover BEGIN () { }. (Carefully note that the parentheses in the prototype are balanced. The prototype ((){) will compile.) The pattern has some other issues. It'll make subsub BEGIN { } compile, and sub myBEGIN { } not compile, etc. And what about attributes?
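
        To make that kind of problem concrete, here is a sketch using the original pattern from the first post, since the updated pattern is not shown in this thread. The three sample strings are made up. It only prints what the substitution does to each one:

```perl
use strict;
use warnings;

# The original pattern from the first post (left brace escaped).
my $naive = qr/((?:BEGIN|END)\s*\{)/;

my @samples = (
    'BEGIN { 1 }',          # the intended target
    'sub myBEGIN { 1 }',    # ordinary sub whose name merely ends in BEGIN
    'END () { 1 }',         # special block with an empty prototype
);

my %rewritten;
for my $src (@samples) {
    (my $out = $src) =~ s/$naive/sub syntax_check_$1/g;
    $rewritten{$src} = $out;
    printf "%-20s => %s\n", $src, $out;
}
```

        The ordinary sub is mangled into something that no longer compiles, while the prototyped END block slips through untouched and would still run.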

        My point with this and the previous post is that it's not as trivial to do this as it might seem. There are a lot of Perl quirks to remember -- or, more likely, forget.

        ihb