Running user-provided JavaScript code

Sometimes you have to let the end user provide some (server-side) program code, but you don't want to give them system access. This could be anything: custom formatting, smart contracts, whatever. The usual solution is a sandbox. Perl itself is a little too powerful and flexible to do that somewhat safely, but you can use something like the Duktape JavaScript engine.

In our case, let's take a look at JavaScript::Duktape. One thing i wanted to implement is "simulated persistence", meaning the JavaScript is written as if it stays in memory, yet the Perl program can unload and reload it whenever needed. For this, we define a "memory" object, which the JavaScript can use to keep its data.

Our JavaScript program looks like this:

    function initMemory() {
        memory.counter = 0;
    }

    function incCounter(amount) {
        memory.counter = memory.counter + amount;
    }

    function printCounter() {
        log("Current count is " + memory.counter);
    }

To test this, let's write a small test program that uses PageCamel::Helpers::JavaScript. Don't worry about the "PageCamel" part, it's just a helper module in my framework and i didn't have the time to pull it out into a standalone thing. It's pretty much self-contained though, and the code is included in this post. Because the PageCamel framework requires a ReportingHandler, for simplicity our test program will just bring its own.

#!/usr/bin/env perl

package ReportingHandler;
use strict;
use warnings;

sub new {
    my ($proto, %config) = @_;
    my $class = ref($proto) || $proto;

    my $self = bless \%config, $class;

    return $self;
}

sub debuglog {
    my ($self, @debugtext) = @_;

    print "DEBUG: ", join(' ', @debugtext), "\n";
}


package main;
use strict;
use warnings;

use PageCamel::Helpers::JavaScript;
use Data::Dumper;

my $reph = ReportingHandler->new();
my $memory;

#print Dumper($reph);

my $jscode = qq{
    function initMemory() {
        memory.counter = 0;
    }

    function incCounter(amount) {
        memory.counter = memory.counter + amount;
    }

    function printCounter() {
        log("Current count is " + memory.counter);
    }
};

{ # First run
    print "First run\n";
    my $jsh = PageCamel::Helpers::JavaScript->new(reph => $reph, timeout => 10, code => $jscode);
    $jsh->initMemory();
    $jsh->call('printCounter');
    $jsh->call('incCounter', 42);
    $jsh->call('printCounter');
    $memory = $jsh->getMemory();
}

print "Memory contents: ", $memory, "\n";

{ # Second run
    print "Second run\n";
    my $jsh = PageCamel::Helpers::JavaScript->new(reph => $reph, timeout => 10, code => $jscode);
    $jsh->setMemory($memory);
    $jsh->call('printCounter');
    $jsh->call('incCounter', 23);
    $jsh->call('printCounter');
}

Basically, the "first run" initializes a new instance of the script, increases the counter, reads out the memory into a variable, then throws away the handler. The "second run" then makes a new instance, sets it to the previous memory content and does some more counter incrementing to show that the persistent memory works.
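
To make the two runs easier to follow without the Perl side, here is a minimal sketch of the same round trip in plain JavaScript. It is not the module's code: makeInstance() is a hypothetical stand-in for "create a fresh interpreter and load the script", and the log lines are collected in an array instead of going to a Perl callback. The only parts taken from the post are the three script functions and the idea of moving the memory object around as a JSON string.

```javascript
// Hypothetical stand-in for one sandbox instance. Every call starts with a
// fresh, empty "memory" object, like a newly created interpreter would.
function makeInstance() {
    var memory = {};
    var logLines = [];

    // The same three functions as in the user script from the post.
    var script = {
        initMemory: function () { memory.counter = 0; },
        incCounter: function (amount) { memory.counter = memory.counter + amount; },
        printCounter: function () { logLines.push("Current count is " + memory.counter); }
    };

    return {
        call: function (name, arg) { return script[name](arg); },
        // Serialize the memory object so the host can store it anywhere.
        getMemory: function () { return JSON.stringify(memory); },
        // Replace the memory object with a previously saved snapshot.
        setMemory: function (txt) { memory = JSON.parse(txt); },
        log: logLines
    };
}

// First run: initialize, count, take a snapshot, then drop the instance.
var first = makeInstance();
first.call("initMemory");
first.call("incCounter", 42);
var saved = first.getMemory();

// Second run: a brand new instance continues from the snapshot.
var second = makeInstance();
second.setMemory(saved);
second.call("incCounter", 23);
second.call("printCounter");

console.log(saved);          // {"counter":42}
console.log(second.log[0]);  // Current count is 65
```

The point of the sketch is that the script never notices it was unloaded: as long as everything it cares about lives in memory, the host only has to keep one JSON string per script between runs.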

And here is the PageCamel::Helpers::JavaScript module:

package PageCamel::Helpers::JavaScript;
#---AUTOPRAGMASTART---
use 5.030;
use strict;
use warnings;
use diagnostics;
use mro 'c3';
use English;
use Carp qw[carp croak confess cluck longmess shortmess];
our $VERSION = 4.0;
use autodie qw( close );
use Array::Contains;
use utf8;
use Data::Dumper;
use PageCamel::Helpers::UTF;
#---AUTOPRAGMAEND---

BEGIN {
    mkdir '/tmp/pagecamel_helpers_javascript_inline';
    $ENV{PERL_INLINE_DIRECTORY} = '/tmp/pagecamel_helpers_javascript_inline';
};
use JavaScript::Duktape;
use JSON::XS;

sub new {
    my ($class, %config) = @_;
    my $self = bless \%config, $class;

    if(!defined($self->{reph})) {
        croak('PageCamel::Helpers::JavaScript needs reph reporting handler');
    }

    if(!defined($self->{timeout})) {
        croak('PageCamel::Helpers::JavaScript needs timeout (default timeout value)');
    }

    my $js = JavaScript::Duktape->new(timeout => $self->{timeout});
    $self->{js} = $js;

    $self->{js}->set('log' => sub {
        $self->_logfromjs($_[0]);
    });

    $self->{js}->eval(qq{
        var memory = new Object;
        function __encode(obj) {
            return JSON.stringify(obj);
        }

        function __decode(txt) {
            return JSON.parse(txt);
        }

        function __setmemory(txt) {
            memory = __decode(txt);
        }

        function __getmemory() {
            return __encode(memory);
        }

        function __getKeys(obj) {
            var keys = Object.keys(obj);
            return keys;
        }
        
    });

    if(defined($self->{code})) {
        $self->load();
    }


    return $self;
}

sub _logfromjs {
    my ($self, $text) = @_;

    $self->{reph}->debuglog($text);

    return;
}

sub load {
    my ($self, $code) = @_;

    if(defined($code)) {
        $self->{code} = $code;
    }
 
    $self->{js}->eval($self->{code});

    return;
}

sub call {
    my ($self, $name, @arguments) = @_;

    my $func = $self->{js}->get_object($name);
    if(!defined($func)) {
        # Report the requested name; $func itself is undef in this branch
        print STDERR "Function $name does not exist!\n";
        return;
    }
    return $func->(@arguments);
}

sub registerCallback {
    my ($self, $name, $func) = @_;

    $self->{js}->set($name, $func);

    return;
}

sub encode {
    my ($self, $data) = @_;

    return encode_json $data;
}

sub decode {
    my ($self, $json) = @_;

    return decode_json $json;
}

sub toArray {
    my ($self, $object) = @_;

    my @arr;
    $object->forEach(sub {
        my ($value, $index, $ar) = @_;
        push @arr, $value;
    });

    return @arr;


}

sub getKeys {
    my ($self, $object) = @_;

    my $rval = $self->call('__getKeys', $object);
    
    return $self->toArray($rval);
}

sub toHash {
    my ($self, $object) = @_;

    my @keys = $self->getKeys($object);
    my %hash;

    foreach my $key (@keys) {
        $hash{$key} = $object->$key;
    }

    return %hash;
}

sub setMemory {
    my ($self, $memory) = @_;

    $self->call('__setmemory', $memory);

    return;
}

sub getMemory {
    my ($self) = @_;

    return $self->call('__getmemory');
}

sub initMemory {
    my ($self) = @_;

    $self->call('initMemory');
    return;
}

1;

On startup, this defines some additional JavaScript functions to handle the memory object. The rest is mostly wrapper functions to make life a bit easier.
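
One consequence of going through JSON.stringify/JSON.parse is worth spelling out: only JSON-compatible data survives a save/restore cycle. The sketch below shows what standard JSON handling does with a few typical values. It is plain JavaScript run outside the module, and the field names are made up for illustration. I would expect Duktape's standard JSON object to behave the same way, but that is an assumption, not something tested here.

```javascript
// Hypothetical memory contents, mixing JSON-friendly and JSON-unfriendly values.
var memory = {
    counter: 42,
    label: "demo",
    nested: { list: [1, 2, 3] },
    lastRun: new Date(0),       // Date objects are serialized as ISO strings
    callback: function () {},   // functions are dropped
    pending: undefined,         // undefined properties are dropped
    ratio: NaN                  // NaN has no JSON representation
};

// Same round trip as __getmemory followed by __setmemory.
var restored = JSON.parse(JSON.stringify(memory));

console.log(restored.counter);             // 42
console.log(restored.nested.list.length);  // 3
console.log(typeof restored.lastRun);      // string
console.log("callback" in restored);       // false
console.log("pending" in restored);        // false
console.log(restored.ratio);               // null
```

So user scripts should treat memory as plain data (numbers, strings, booleans, arrays, nested objects) and rebuild anything else, such as Date objects or helper functions, after each load.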

Output of the program:

First run
DEBUG: Current count is 0
DEBUG: Current count is 42
Memory contents: {"counter":42}
Second run
DEBUG: Current count is 42
DEBUG: Current count is 65

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'


Replies are listed 'Best First'.
Re: Running user-provided JavaScript code by aitap (Curate) on Apr 17, 2022 at 09:00 UTC
Years ago, when I was an active participant in the `#linux` channel of the local IRC network, we had a bot that could evaluate arbitrary shell commands, really handy to test one-liners. I don't know exactly how it was implemented, but one of the operators told me it was running on a custom Linux build inside QEMU running inside another QEMU instance because occasionally, despite the commands running as `nobody` and with various `ulimit`s set, people would find a way to crash something or escape one sandbox... but never all of them.

Running user-provided JavaScript safely is hard. Browser vendors pour a lot of money into it, and JavaScript is still one of the leading exploitation vectors, the first step from navigating to a maliciously crafted web page to arbitrary code execution. Duktape has the benefit of not having to run JavaScript really fast and so sidesteps a whole class of JIT-related vulnerabilities (not needing to allocate writeable memory and turn it executable later helps a lot), but it does have its share of issues labelled "security".

Lua is a much simpler language than JavaScript, with a whole chapter on sandboxing in PIL, and yet another double-free was found in it just a few days ago, which might potentially lead to type confusion and sandbox escape. SQLite is probably some of the finest C code there can be, and yet they still fix crashing bugs, mostly from corrupted database files, but occasionally from SQL input too. Ethereum itself has an occasional vulnerability in its virtual machine too. For Perl, there's Safe, but with the warning at the end of the documentation page, I wouldn't want to use it alone to run arbitrary code over the Internet, either.

Admittedly, not all crashing bugs can be exploited, and not all issues labelled "security" will be exploited in the wild, and yet it feels like a fundamental problem: once we get a computer to run some code, it becomes hard to prevent it from running some other code that we don't want it to run but attackers do. I don't have a good solution, but I wish you good luck, and to proceed carefully!
Re^2: Running user-provided JavaScript code by cavac (Parson) on Apr 19, 2022 at 07:59 UTC
I'm aware that there are security considerations with any of these solutions. But i don't see much of an alternative these days, unless i want to write everything on my own and then hope and pray that i don't have a stupid fail somewhere in my code that lets the user exploit it anyway. Frankly, these days a user with enough smarts can probably exploit something like Unicode handling. Oh wait, that stuff is already happening. Realistically, i don't have any answers, either.
Re^3: Running user-provided JavaScript code by afoken (Chancellor) on Apr 21, 2022 at 07:23 UTC
Using a Turing-complete language like JavaScript for user-provided code begs for trouble. Two of the most trivial ways to cause trouble are allocating all available memory and infinite loops. Yes, there are countermeasures for both, but why allow getting into trouble at all? A reduced, domain-specific language that is intentionally not Turing-complete might do the trick.

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^4: Running user-provided JavaScript code by cavac (Parson) on Apr 21, 2022 at 09:32 UTC
Re: Running user-provided JavaScript code by etj (Deacon) on Apr 13, 2022 at 21:09 UTC
This looks to me a bit like a different approach to a slightly similar problem to the one in this post: Making a reloadable module, allowing "live" edits
Re^2: Running user-provided JavaScript code by cavac (Parson) on Apr 14, 2022 at 07:10 UTC
It's a different problem entirely, i'm afraid, although the solution is somewhat related. The JS is designed to run in a sandbox, something that would be very hard to pull off with Perl. You'd basically have to run some sort of virtualization (KVM, Virtualbox, Containers, etc) for every single script, one that not only allows you to limit file access but also network access. Now imagine you have hundreds or even thousands of scripts that need to run every few minutes.

One of the reasons i'm playing around with Duktape is for my XPD project: XPD - Do more with your PerlMonks XP. Basically, i want to enable the users to create "smart contracts". Those could, for example, buy or bid on NFTs and then try to resell them at a profit. So they would have to run every so often for a very short amount of time, just to check the market and make any decision based on that.

If i used Perl, i'd be constantly starting and stopping Containers or Virtualboxes. But with a language interpreter that's designed to be a sandbox, all i'd be doing is loading scripts. You might be able to pull this off with Perl if you have enough hardware resources. But i'm not Jeff Bezos with my own data centers. I own a single 10 year old quad core server, so i'm somewhat more limited in my choices.

