run-time syntax checking

Pardus has asked for the wisdom of the Perl Monks concerning the following question:

Re: run-time syntax checking
by PodMaster (Abbot) on Jan 31, 2003 at 10:53 UTC

Note that there are ways to get around Safe, but there are no examples (Is Safe.pm unsafe?)

Also, the -c switch is not perfect: BEGIN {} blocks will still be executed.
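
The same caveat applies to string eval, which most of this thread relies on. Here is a minimal sketch (not from the original post) showing a BEGIN block running during compilation even though the rest of the string never compiles. The variable $ran and the sample string are made up for illustration.

```perl
use strict;
use warnings;

our $ran = 0;

# Hypothetical input: a BEGIN block followed by a deliberate syntax error.
my $code = 'BEGIN { $main::ran = 1 } my $x = ;';

# Compiling the string is enough to trigger the BEGIN block.
eval $code;

print "BEGIN block ran: ", ($ran ? "yes" : "no"), "\n";
print "compile error: $@" if $@;
```

So a failed compile is no guarantee that nothing was executed.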

update:

Auditing BEGIN blocks? might be of interest

MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
** The Third rule of perl club is a statement of fact: pod is sexy.

Re^2: run-time syntax checking

by diotalevi (Canon) on Jan 31, 2003 at 14:56 UTC

And my conclusion from Auditing BEGIN blocks? is that it won't be possible to do this until perl6 when the grammar can be subverted.

Added: On further thought, what you can do is use the Safe module and then go to varying lengths to protect yourself from side effects. I might prepend CHECK { return undef $@; } to the string to be tested. It's guaranteed to run before anything else in the block except a BEGIN block. You then just test $@ as normal to see if it has a value. Also, when using Safe, either be sure it's the fixed version or create a new compartment each time you want to test. Also use one of the ALRM (alarm) tricks from the perlipc page: if you get caught in an infinite or absurdly long loop, you'll still get execution back. I don't think you can do this and actually be paranoid, because the code could also allocate a whole lot of RAM or do something else to violate a user limit. My OS (OpenBSD) has limits preventing user processes from going hog wild, so that's the sort of user limit I'm thinking of.
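
A rough sketch of the Safe-plus-alarm idea, assuming a Unix-like system where alarm delivers SIGALRM. The helper name check_in_safe and the timeout values are made up for illustration, and the CHECK-prepending trick is left out. Note that reval runs the code inside the compartment, it does not only compile it.

```perl
use strict;
use warnings;
use Safe;

# Hypothetical helper: evaluate $code in a fresh Safe compartment and give
# up after $timeout seconds. Returns '' on success, otherwise the error text.
sub check_in_safe {
    my ($code, $timeout) = @_;
    my $cpt = Safe->new;    # a new compartment for every check
    my $err = '';
    eval {
        # alarm idiom from perlipc: turn the signal into an exception
        local $SIG{ALRM} = sub { die "timed out\n" };
        alarm $timeout;
        $cpt->reval($code);
        $err = $@ if $@;    # compile or run-time error inside the compartment
        alarm 0;
        1;
    } or $err = $@ || 'unknown failure';
    alarm 0;                # make sure no alarm is left pending
    return $err;
}

for my $code ('my $x = 1;', 'my $x = ;') {
    my $err = check_in_safe($code, 2);
    print $err eq '' ? "ok:     $code\n" : "failed: $code\n";
}
```

This does nothing about memory use or other resource limits, which is the point made at the end of the paragraph above.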

If you want to hear more then respond.

Seeking Green geeks in Minnesota

Re: run-time syntax checking
by integral (Hermit) on Jan 31, 2003 at 13:03 UTC

eval "sub { $code }";
die "Compilation failure: $@" if $@;

my $closure = eval "sub {\n$code\n}";
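
The two snippets above can be folded into a small helper. This is only a sketch: the name compile_error and the sample snippets are made up here, and BEGIN blocks or use statements inside $code would still execute at compile time.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap $code in an anonymous sub so that it is
# compiled but never called. Returns '' if it compiles, else the error.
sub compile_error {
    my ($code) = @_;
    my $sub = eval "sub {\n$code\n}";
    return defined $sub ? '' : $@;
}

for my $snippet ('print "hello\n";', 'my $x = ;', "print 'some string'; }") {
    my $err = compile_error($snippet);
    print $err eq '' ? "ok:     $snippet\n" : "broken: $snippet\n";
}
```

The newlines around $code keep a trailing comment in the snippet from swallowing the closing brace.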

--
integral, resident of freenode's #perl

Re: Re: run-time syntax checking

by abell (Chaplain) on Jan 31, 2003 at 13:55 UTC

Some more thought is required, as one could pass

}
do nasty stuff;
...
my $foo = <<"}";

This way, one could close the sub, execute whatever and keep the syntax correct by using the outer } as heredoc terminator.

The stupider the astronaut, the easier it is to win the trip to Vega - A. Tucket

Re^3: run-time syntax checking

by diotalevi (Canon) on Jan 31, 2003 at 19:49 UTC

There is no protection against malicious code without Safe and even then it can have unwanted side-effects: never returning and other DoS conditions.

Seeking Green geeks in Minnesota

Re^2: run-time syntax checking

by Aristotle (Chancellor) on Jan 31, 2003 at 13:12 UTC

$code

Makeshifts last the longest.

Re: Re^2: run-time syntax checking

by Pardus (Pilgrim) on Feb 01, 2003 at 14:17 UTC

root@Captain:/home/pardus# perl
my $string = "print 'some string'; }";
my $sub = eval "sub { $string }";
print $@;

Unmatched right curly bracket at (eval 1) line 1, at end of line
syntax error at (eval 1) line 1, near "} }"

Re^4: run-time syntax checking

by Aristotle (Chancellor) on Feb 01, 2003 at 16:28 UTC

Re^5: run-time syntax checking

by Pardus (Pilgrim) on Feb 01, 2003 at 16:42 UTC

Re: run-time syntax checking
by BrowserUk (Patriarch) on Feb 01, 2003 at 08:33 UTC

As you asked about syntax checking and not security, I'll assume that you have the security aspect covered.

There's probably a gotcha I haven't thought about, but if you put a return at the very front of the code to be checked, that will prevent any further code from being executed (except, of course, for BEGIN{...} and END{...} blocks; see later) while still allowing syntax checking to occur.

I believe (quite possibly wrongly) that if you remove any newlines, BEGIN and END blocks can be matched with a regex of /(?:BEGIN|END)\s*\{/. Using this to prefix them with something like sub syntax_check_ will prevent them from being run, but will still allow them to be syntax checked.

#! perl -slw
use strict;

while(<DATA>) {
    chomp;
    my $code = $_;
    tr/\n//d;
    s[((?:BEGIN|END)\s*{)][sub syntax_check_$1]g;
    eval 'return;' . $_;
    print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
}
__END__
INIT { print '*** GOTCHA!!! ***'; }
BEGIN{ print '*** GOTCHA!!! ***'; }
END  { print '*** GOTCHA!!! ***'; }
my $a = 1;
my $a = cool;
my $

Whether this level of syntax checking is enough to satisfy your requirements only you will know.

Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Re: run-time syntax checking

by ihb (Deacon) on Feb 01, 2003 at 16:57 UTC

Beware that BEGIN, END, CHECK, INIT, AUTOLOAD, and DESTROY can have a sub keyword in front of them. They can also be prototyped. AUTOLOAD and DESTROY are not relevant for this post, though.

However, CHECK and INIT are relevant! Try this one-liner: perl -wle'BEGIN { require "browserUk.pl" }' where browserUk.pl is your program above, but with __END__ changed to __DATA__ (why did you use __END__ over __DATA__ anyway?), some code in DATA changed, and with a true return value:

use strict;

while(<DATA>) {
    chomp;
    my $code = $_;
    tr/\n//d;
    s[((?:BEGIN|END)\s*{)][sub syntax_check_$1]g;
    eval 'return;' . $_;
    print "'$code' \n\t: ", $@ ? "Fails with\n$@" : 'Passes syntax check';
}

1;

__DATA__
INIT      { print '*** GOTCHA!!! ***'; } # Executes.
CHECK     { print '*** GOTCHA!!! ***'; } # Executes.
sub BEGIN { print '*** GOTCHA!!! ***'; } # Fails to compile.
END ()    { print '*** GOTCHA!!! ***'; } # Executes.
my $a = 1;
my $a = cool;
my $

use()

&cool

()

And I wouldn't be surprised if there are more related issues...

ihb

Re: Re: Re: run-time syntax checking

by BrowserUk (Patriarch) on Feb 01, 2003 at 20:44 UTC

I knew there were other INIT/BEGIN/END type compile-time subs but couldn't remember what they were, and both perldoc and a grep of the HTML files failed to turn up where these are documented before I posted. Adding CHECK (and the others, if need be) to the regex is trivial, as is handling those same keywords prefixed with sub.
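
One way the extended regex might look, as a sketch only. The pattern, the helper name syntax_error, and the counter $hits are assumptions made for this example; like the original, it will also rewrite matches that happen to sit inside strings or comments.

```perl
use strict;
use warnings;

our $hits = 0;

# Assumed pattern: optional "sub", one of the special block names,
# an optional empty prototype, then the opening brace (lookahead only).
my $special = qr/\b(?:sub\s+)?(BEGIN|END|CHECK|INIT)\b(?=\s*(?:\(\s*\))?\s*\{)/;

# Hypothetical helper: rename special blocks to ordinary named subs,
# then compile the result behind a leading "return;".
sub syntax_error {
    my ($code) = @_;
    (my $renamed = $code) =~ s/$special/sub syntax_check_$1/g;
    no warnings 'redefine';    # repeated checks redefine the renamed subs
    eval 'return;' . $renamed;
    return $@;
}

my $sample = join ' ',
    'BEGIN { $main::hits++ }',
    'CHECK { $main::hits++ }',
    'sub INIT { $main::hits++ }',
    'END () { $main::hits++ }',
    'my $n = 1;';

my $err = syntax_error($sample);
print "error: ", ($err eq '' ? 'none' : $err), "\n";
print "special blocks run: $hits\n";
```

Each special block becomes a plain named sub, so its body is still compiled but never called.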

Dealing with use statements seems to be equally simple. Switch use for no. The statement remains syntactically valid but does not cause any processing of the module. Eg. For the following test I replaced the 1; at the end of POSIX.pm with print 'POSIX.pm processed'.$/;

C:\test>perl -c
use POSIX qw[ceil floor]; print ceil($a), floor($b);
POSIX.pm processed
no POSIX qw[ceil floor]; print ceil($a), floor($b);
^Z
- syntax OK
C:\test>

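As you can see, this change allows the syntax of the statement to be checked without the module it refers to being processed. One caveat: perlfunc describes no Module as doing the same require as use, and require loads a file only once, so the single 'POSIX.pm processed' line above may simply mean that POSIX had already been loaded by the first statement.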
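
Whether no really skips loading can be tested in isolation. The sketch below writes a throwaway module to a temporary directory and counts how often its body runs; the module name NoDemo and the counter $loaded are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a throwaway module whose body bumps a counter when it is loaded.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/NoDemo.pm" or die "cannot write module: $!";
print {$fh} "package NoDemo; \$main::loaded++; 1;\n";
close $fh or die "cannot close module: $!";
unshift @INC, $dir;

our $loaded = 0;

# Compile a "no" statement for a module that has not been loaded yet.
eval 'no NoDemo; 1' or die $@;

print "module body ran $loaded time(s)\n";
```

If the count is nonzero, swapping use for no has not kept the module from being processed, and a different approach would be needed for use statements.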

However, the sub defined in and exported from a module with a prototype of () is a problem.

C:\test>perl -c
sub cool(){ print 'cool',$/; }
my $a = cool;
^Z
- syntax OK

I can't see a maneuver around that one other than adding a restriction to the code that subs must be invoked with either & or (). Depending on what the OP was trying to achieve, that might be acceptable, but as a general facility it would suck a lot.
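
The problem is easy to reproduce with the return-prefix check under use strict. In this sketch the helper name syntax_error is made up, and the forward declaration is a possible workaround for cases where the checker knows which subs a module would export; it is not something proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: compile $code behind a leading "return;".
sub syntax_error {
    my ($code) = @_;
    eval 'return;' . $code;
    return $@;
}

# With no declaration in sight, the bare call is a strict-subs error.
my $without_decl = syntax_error('my $v = cool;');

# A forward declaration with the () prototype lets the same call compile.
my $with_decl = syntax_error('sub cool(); my $v = cool;');

print "without declaration: ", ($without_decl eq '' ? 'passes' : 'fails'), "\n";
print "with declaration:    ", ($with_decl eq '' ? 'passes' : 'fails'), "\n";
```

The order of the two checks matters, because the forward declaration stays in effect for later evals in the same package.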

With regard to the __DATA__ versus __END__. Dunno, sometimes I use one, sometimes the other. For all 'normal' uses it seems to make no difference.

On which note:), why did you use perl -e'BEGIN{ require "prog.pl" }' instead of prog.pl or perl prog.pl?

I thunk and thunk and thunk some more and can't see the logic behind that one:).

test prog as it currently stands