Pardus has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Oh holy men

Is it possible to do a syntax check on a block of perl code from within a perl program?

Of course one could do qx/perl -wc "my code"/, but I don't want to shell out. Another possibility is to eval the code and check $@, but then it has already been evaluated, and that could have unforeseen consequences.

Basically I'm looking for a kind of local "-c" switch.

Any suggestions?
--
Jaap Karssenberg || Pardus (Larus)? <pardus@cpan.org>
>>>> Zoidberg: So many memories, so many strange fluids gushing out of patients' bodies.... <<<<

Re: run-time syntax checking
by PodMaster (Abbot) on Jan 31, 2003 at 10:53 UTC
    It's not really possible. The question has come up before, and your best bet is to use Safe or simply shell out.

    Note that there are ways to get around Safe, but there are no examples (Is Safe.pm unsafe?)

    Also, the -c switch is not perfect: BEGIN {} blocks will still be executed.

    update:

    Auditing BEGIN blocks? might be of interest


    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      And my conclusion from Auditing BEGIN blocks? is that it won't be possible to do this until perl6 when the grammar can be subverted.


      Added: On further thought, what you can do is use the Safe module and then go to varying lengths to protect yourself from side effects. I might prepend CHECK { return undef $@; } to the string to be tested. It's guaranteed to run before anything else in the block except a BEGIN block. You then just test $@ as normal to see if it's got a value. Also, when using Safe, either be sure it's the fixed version or create a new compartment each time you want to test. Also use one of the ALRM tricks from the perlipc page, so that if you get caught in an infinite or absurdly long loop you'll still get execution back. I don't think you can do this and actually be paranoid, because the code could also allocate a whole lot of RAM or do something else to violate a user limit. My OS (OpenBSD) has limits preventing user processes from going hog wild, so that's the sort of user limit I'm thinking of.

      If you want to hear more then respond.
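
      A minimal sketch of the Safe-plus-alarm idea above. The helper name check_in_safe and the timeout value are made up for illustration, and note that this still runs the code (inside the compartment), so it is a contained eval rather than a pure syntax check. It also relies on alarm, so it assumes a Unix-like system.

      ```perl
      use strict;
      use warnings;
      use Safe;

      # Hypothetical helper: evaluate $code in a fresh Safe compartment and
      # give up after $timeout seconds. Returns the error text ('' on success).
      sub check_in_safe {
          my ($code, $timeout) = @_;
          my $err = '';
          my $ok = eval {
              local $SIG{ALRM} = sub { die "timed out\n" };
              alarm $timeout;
              my $cpt = Safe->new;     # a new compartment for every check
              $cpt->reval($code);      # compile and run inside the compartment
              $err = $@;               # reval leaves any error in $@
              alarm 0;
              1;
          };
          unless ($ok) {
              alarm 0;                 # make sure no alarm is left pending
              $err = $@ || 'unknown failure';
          }
          return $err;
      }

      my $err = check_in_safe('my $x = 1 + ;', 5);
      print $err ? "Fails with: $err" : "Passes\n";
      ```

      As said above, this does nothing about memory or other resource limits; those have to come from the OS.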


      Seeking Green geeks in Minnesota

Re: run-time syntax checking
by integral (Hermit) on Jan 31, 2003 at 13:03 UTC
    If this is run in a safe environment, an easy way to eval the code without it actually running could be to put it into a closure (this technique is used by HTML::Mason, which can load a component separately from executing it):
    eval "sub { $code }"; die "Compilation failure: $@" if $@;
    Of course this simple piece of code could be improved, since a comment at the end of $code could break things.
    my $closure = eval "sub {\n$code\n}";
    You can then invoke &$closure (or $closure->()) when you want to actually execute the code.
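
    Put together, the two steps might look like the sketch below. The helper name compile_only is made up here, and remember that BEGIN blocks and use statements inside the snippet still run at compile time.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: compile $code as the body of an anonymous sub
    # without calling it. Returns (coderef, undef) or (undef, error text).
    sub compile_only {
        my ($code) = @_;
        # The newlines keep a trailing comment in $code from swallowing
        # the closing brace.
        my $sub = eval "sub {\n$code\n}";
        return defined $sub ? ($sub, undef) : (undef, $@);
    }

    my ($sub, $err) = compile_only('print "hello\n"; # trailing comment');
    print defined $sub ? "compiled, not yet run\n" : "failed: $err";
    $sub->() if $sub;    # only now does the snippet execute
    ```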

    --
    integral, resident of freenode's #perl
    

      Some more thought is required, as one could pass

      } do nasty stuff; ... my $foo = <<"}";
      as $code.

      This way, one could close the sub, execute whatever and keep the syntax correct by using the outer } as heredoc terminator.
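
      A harmless variation on the same idea, as a sketch: instead of a heredoc, this snippet opens a new anonymous sub so that the wrapper's closing brace still has something to close. The variable $side_effect is only there to show that a statement ran outside the sub.

      ```perl
      use strict;
      use warnings;

      our $side_effect = 0;

      # Closes the wrapper's opening brace, runs a statement, then opens a
      # new anonymous sub for the wrapper's closing brace to end.
      my $code = '}; $main::side_effect = 1; sub {';

      my $sub = do {
          # The first, discarded anonymous sub may otherwise draw a
          # "useless use" warning.
          no warnings 'void';
          eval "sub {\n$code\n}";
      };

      print "compiled without error\n" unless $@;
      print "a statement outside the sub ran during eval\n" if $side_effect;
      ```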

      Antonio

      The stupider the astronaut, the easier it is to win the trip to Vega - A. Tucket

        There is no protection against malicious code without Safe and even then it can have unwanted side-effects: never returning and other DoS conditions.


        Seeking Green geeks in Minnesota

      You're in for a surprise if $code contains one closing curly too many, though.

      Makeshifts last the longest.

        root@Captain:/home/pardus# perl
        my $string = "print 'some string'; }";
        my $sub = eval "sub { $string }";
        print $@;
        Unmatched right curly bracket at (eval 1) line 1, at end of line
        syntax error at (eval 1) line 1, near "} }"

        So where is the surprise?
        --
        Jaap Karssenberg || Pardus (Larus)? <pardus@cpan.org>
        >>>> Zoidberg: So many memories, so many strange fluids gushing out of patients' bodies.... <<<<
Re: run-time syntax checking
by BrowserUk (Patriarch) on Feb 01, 2003 at 08:33 UTC

    As you asked about syntax checking and not security, I'll assume that you have the security aspect covered.

    There's probably a gotcha I haven't thought about, but if you issue a return immediately at the front of the code to be checked, that will prevent any further code being executed--except of course for BEGIN{...} and END{...} blocks, see later--but still allow syntax checking to occur.

    I believe (quite possibly wrongly) that, if you remove any newlines, BEGIN and END blocks can be matched with a regex of /(?:BEGIN|END)\s*\{/. Using this to prefix them with something like sub syntax_check_ will prevent them from being run, but will still allow them to be syntax checked.

    #! perl -slw
    use strict;

    while(<DATA>) {
        chomp;
        my $code = $_;
        tr/\n//d;
        s[((?:BEGIN|END)\s*{)][sub syntax_check_$1]g;
        eval 'return;' . $_;
        print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
    }

    __END__
    INIT { print '*** GOTCHA!!! ***'; }
    BEGIN{ print '*** GOTCHA!!! ***'; }
    END { print '*** GOTCHA!!! ***'; }
    my $a = 1;
    my $a = cool;
    my $

    Whether this level of syntax checking is enough to satisfy your requirements only you will know.


    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

      Beware that BEGIN, END, CHECK, INIT, AUTOLOAD, and DESTROY can have a sub keyword in front of them. They can also be prototyped. AUTOLOAD and DESTROY are not relevant for this post, though.

      However, CHECK and INIT are relevant! Try this one-liner:   perl -wle'BEGIN { require "browserUk.pl" }' where browserUk.pl is your program above, but with __END__ changed to __DATA__ (why did you use __END__ over __DATA__ anyway?), some code in DATA changed, and with a true return value:

      use strict;

      while(<DATA>) {
          chomp;
          my $code = $_;
          tr/\n//d;
          s[((?:BEGIN|END)\s*{)][sub syntax_check_$1]g;
          eval 'return;' . $_;
          print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
      }

      1;

      __DATA__
      INIT { print '*** GOTCHA!!! ***'; } # Executes.
      CHECK { print '*** GOTCHA!!! ***'; } # Executes.
      sub BEGIN { print '*** GOTCHA!!! ***'; } # Fails to compile.
      END () { print '*** GOTCHA!!! ***'; } # Executes.
      my $a = 1;
      my $a = cool;
      my $

      Another issue would be to take care of use() statements as they're compile-time statements. But if you get rid of use statements, then you might also create compilation errors, since the use() statement might import prototyped subroutines. For instance, perhaps a &cool subroutine prototyped with () was imported in the code above.

      And I wouldn't be surprised if there are more related issues...

      ihb

        I knew there were other INIT/BEGIN/END type compile-time subs but couldn't remember what they were, and both perldoc and a grep of the html files failed to turn up where these are documented before I posted. Adding CHECK (and the others if need be) to the regex is trivial, as is handling those same keywords prefixed with sub.
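
        Roughly what I have in mind, as a sketch only. The helper name syntax_check is made up here, and the regex is plain text matching, so it will also rewrite these keywords where they appear inside strings or comments.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: rename special blocks so they compile as
        # ordinary named subs, then compile the snippet behind a leading
        # return. Returns the error text ('' when the snippet compiles).
        sub syntax_check {
            my ($code) = @_;
            $code =~ s/
                (?: \b sub \s+ )?                    # optional explicit sub keyword
                \b (BEGIN|CHECK|INIT|END) \b         # name of the special block
                (?= \s* (?: \( [^)]* \) \s* )? \{ )  # optional prototype, then the opening brace
            /sub syntax_check_$1/gx;
            no warnings 'redefine';    # the same name may get defined more than once
            eval "return;\n$code";
            return $@;
        }

        for my $snippet ('INIT { print "GOTCHA\n" }', 'sub BEGIN { print "GOTCHA\n" }', 'my $x = ;') {
            my $err = syntax_check($snippet);
            print "'$snippet'\n\t: ", $err ? "Fails with\n$err" : "Passes syntax check\n";
        }
        ```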

        Dealing with use statements seems to be equally simple. Switch use for no. The statement remains syntactically valid but does not cause any processing of the module. E.g., for the following test I replaced the 1; at the end of POSIX.pm with print 'POSIX.pm processed'.$/;

        C:\test>perl -c
        use POSIX qw[ceil floor]; print ceil($a), floor($b);
        POSIX.pm processed
        no POSIX qw[ceil floor]; print ceil($a), floor($b);
        ^Z
        - syntax OK

        C:\test>

        As you can see, this change allows the syntax of the statement to be checked without the module it refers to being processed.

        However, the sub defined in and exported from a module with a prototype of () is a problem.

        C:\test>perl -c
        sub cool(){ print 'cool',$/; }
        my $a = cool;
        ^Z
        - syntax OK

        I can't see a manoeuvre around that one other than adding a restriction to the code that subs must be invoked with either & or (). Depending on what the OP was trying to achieve, that might be acceptable, but as a general facility, it would suck a lot.
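
        To make the problem concrete, here is a small sketch (the helper name compile_error is made up, and I'm assuming the check runs under strict). The bare call only compiles when the checker has already seen the sub's declaration, which is exactly what is lost if the module that exports it is never loaded.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: compile a snippet inside an anonymous sub
        # without running it. Returns the error text ('' on success).
        sub compile_error {
            my ($code) = @_;
            my $sub = eval "sub {\n$code\n}";
            return defined $sub ? '' : $@;
        }

        # Checked first on purpose: a named sub defined by one snippet stays
        # defined for every later check in the same process.
        my $without = compile_error('my $x = cool;');
        my $with    = compile_error('sub cool() { 42 } my $x = cool;');

        print "without declaration: ", ($without ? "fails: $without" : "ok\n");
        print "with declaration: ",    ($with    ? "fails: $with"    : "ok\n");
        ```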

        With regard to the __DATA__ versus __END__. Dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference.

        On which note:), why did you use perl -e'BEGIN{ require "prog.pl" }' instead of prog.pl or perl prog.pl?

        I thunk and thunk and thunk some more and can't see the logic behind that one:).


        Examine what is said, not who speaks.

        The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.