PerlMonks  

Running user-provided JavaScript code

by cavac (Parson)
on Apr 11, 2022 at 15:31 UTC ( [id://11142919]=CUFP )

Sometimes you have to allow the end user to provide some (server side) program code, but you don't want to give them system access. This could be for anything: custom formatting, smart contracts, whatever. The solution is usually a sandbox. Now, Perl itself is a little too powerful and flexible to allow you to do that somewhat safely, but you can use something like the Duktape JavaScript engine.

In our case, let's take a look at JavaScript::Duktape. One thing i wanted to implement is "simulated persistence", meaning the JavaScript is written as if it stays in memory, yet the Perl program can unload and reload it whenever needed. For this, we will define a "memory" object, which the JavaScript can use to keep its data.

Our JavaScript program looks like this:

function initMemory() {
    memory.counter = 0;
}

function incCounter(amount) {
    memory.counter = memory.counter + amount;
}

function printCounter() {
    log("Current count is " + memory.counter);
}

To test this, let's write a small test program that uses PageCamel::Helpers::JavaScript. Don't worry about the "PageCamel" part, it's just a helper module in my framework and i didn't have the time to pull it out into a standalone thing. It's pretty much self-contained though, code included in this post. Because the PageCamel framework requires a ReportingHandler, for simplicity our test program will just bring its own.

#!/usr/bin/env perl

package ReportingHandler;

use strict;
use warnings;

sub new {
    my ($proto, %config) = @_;
    my $class = ref($proto) || $proto;

    my $self = bless \%config, $class;

    return $self;
}

sub debuglog {
    my ($self, @debugtext) = @_;

    print "DEBUG: ", join(' ', @debugtext), "\n";
}

package main;

use strict;
use warnings;

use PageCamel::Helpers::JavaScript;
use Data::Dumper;

my $reph = ReportingHandler->new();
my $memory;

#print Dumper($reph);

my $jscode = qq{
    function initMemory() {
        memory.counter = 0;
    }

    function incCounter(amount) {
        memory.counter = memory.counter + amount;
    }

    function printCounter() {
        log("Current count is " + memory.counter);
    }
};

{
    # First run
    print "First run\n";
    my $jsh = PageCamel::Helpers::JavaScript->new(reph => $reph, timeout => 10, code => $jscode);
    $jsh->initMemory();
    $jsh->call('printCounter');
    $jsh->call('incCounter', 42);
    $jsh->call('printCounter');
    $memory = $jsh->getMemory();
}

print "Memory contents: ", $memory, "\n";

{
    # Second run
    print "Second run\n";
    my $jsh = PageCamel::Helpers::JavaScript->new(reph => $reph, timeout => 10, code => $jscode);
    $jsh->setMemory($memory);
    $jsh->call('printCounter');
    $jsh->call('incCounter', 23);
    $jsh->call('printCounter');
}

Basically, the "first run" initializes a new instance of the script, increases the counter, reads out the memory into a variable, then throws away the handler. The "second run" then makes a new instance, sets it to the previous memory content and does some more counter incrementing to show that the persistent memory works.

And here is the PageCamel::Helpers::JavaScript module:

package PageCamel::Helpers::JavaScript;
#---AUTOPRAGMASTART---
use 5.030;
use strict;
use warnings;
use diagnostics;
use mro 'c3';
use English;
use Carp qw[carp croak confess cluck longmess shortmess];
our $VERSION = 4.0;
use autodie qw( close );
use Array::Contains;
use utf8;
use Data::Dumper;
use PageCamel::Helpers::UTF;
#---AUTOPRAGMAEND---

BEGIN {
    mkdir '/tmp/pagecamel_helpers_javascript_inline';
    $ENV{PERL_INLINE_DIRECTORY} = '/tmp/pagecamel_helpers_javascript_inline';
};

use JavaScript::Duktape;
use JSON::XS;

sub new {
    my ($class, %config) = @_;
    my $self = bless \%config, $class;

    if(!defined($self->{reph})) {
        croak('PageCamel::Helpers::JavaScript needs reph reporting handler');
    }

    if(!defined($self->{timeout})) {
        croak('PageCamel::Helpers::JavaScript needs timeout (default timeout value)');
    }

    my $js = JavaScript::Duktape->new(timeout => $self->{timeout});
    $self->{js} = $js;

    # Expose a log() function to the JavaScript side
    $self->{js}->set('log' => sub {
        $self->_logfromjs($_[0]);
    });

    # Helper functions for moving the "memory" object in and out as JSON
    $self->{js}->eval(qq{
        var memory = new Object;

        function __encode(obj) {
            return JSON.stringify(obj);
        }

        function __decode(txt) {
            return JSON.parse(txt);
        }

        function __setmemory(txt) {
            memory = __decode(txt);
        }

        function __getmemory() {
            return __encode(memory);
        }

        function __getKeys(obj) {
            var keys = Object.keys(obj);
            return keys;
        }
    });

    if(defined($self->{code})) {
        $self->load();
    }

    return $self;
}

sub _logfromjs {
    my ($self, $text) = @_;

    $self->{reph}->debuglog($text);

    return;
}

sub load {
    my ($self, $code) = @_;

    if(defined($code)) {
        $self->{code} = $code;
    }

    $self->{js}->eval($self->{code});

    return;
}

sub call {
    my ($self, $name, @arguments) = @_;

    my $func = $self->{js}->get_object($name);
    if(!defined($func)) {
        # Report the requested name; $func itself is undefined here
        print STDERR "Function $name does not exist!\n";
        return;
    }
    return $func->(@arguments);
}

sub registerCallback {
    my ($self, $name, $func) = @_;

    $self->{js}->set($name, $func);
    return;
}

sub encode {
    my ($self, $data) = @_;

    return encode_json $data;
}

sub decode {
    my ($self, $json) = @_;

    return decode_json $json;
}

sub toArray {
    my ($self, $object) = @_;

    my @arr;

    $object->forEach(sub {
        my ($value, $index, $ar) = @_;
        push @arr, $value;
    });

    return @arr;
}

sub getKeys {
    my ($self, $object) = @_;

    my $rval = $self->call('__getKeys', $object);
    return $self->toArray($rval);
}

sub toHash {
    my ($self, $object) = @_;

    my @keys = $self->getKeys($object);

    my %hash;

    foreach my $key (@keys) {
        $hash{$key} = $object->$key;
    }

    return %hash;
}

sub setMemory {
    my ($self, $memory) = @_;

    $self->call('__setmemory', $memory);
    return;
}

sub getMemory {
    my ($self) = @_;

    return $self->call('__getmemory');
}

sub initMemory {
    my ($self) = @_;

    $self->call('initMemory');
    return;
}

1;

On startup, this defines some additional JavaScript functions to handle the memory object. The rest is mostly wrapper functions to make life a bit easier.
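To try the save/restore flow without Perl or Duktape, here is a rough sketch in plain Node.js. It assumes Node's built-in node:vm module as a stand-in for a fresh engine instance, and the helper names newInstance and call are made up for this illustration. Node's own documentation says node:vm is not a security mechanism, so this only shows the persistence idea, not the sandboxing.

```javascript
// Sketch only: node:vm stands in for a fresh Duktape engine instance.
// node:vm is NOT a security sandbox; never use it for untrusted code.
const vm = require("node:vm");

// The user script from the post, unchanged.
const userCode = `
function initMemory() { memory.counter = 0; }
function incCounter(amount) { memory.counter = memory.counter + amount; }
function printCounter() { log("Current count is " + memory.counter); }
`;

// Reduced version of the memory helpers the Perl module evals on startup.
const prelude = `
var memory = new Object;
function __setmemory(txt) { memory = JSON.parse(txt); }
function __getmemory() { return JSON.stringify(memory); }
`;

// Collected log lines, so the result can be inspected afterwards.
const logLines = [];

// Hypothetical helper: build an empty context, then load prelude and script.
function newInstance() {
  const context = vm.createContext({
    log: (text) => {
      logLines.push(text);
      console.log("DEBUG: " + text);
    },
  });
  vm.runInContext(prelude, context);
  vm.runInContext(userCode, context);
  return context;
}

// Hypothetical helper, loosely mirroring $jsh->call($name, @arguments).
// Arguments are passed as JSON literals, so only JSON-safe values work.
function call(context, name, ...args) {
  const argList = args.map((arg) => JSON.stringify(arg)).join(", ");
  return vm.runInContext(`${name}(${argList})`, context);
}

// First run: initialise, count, then export the memory as a JSON string.
let instance = newInstance();
call(instance, "initMemory");
call(instance, "printCounter");
call(instance, "incCounter", 42);
call(instance, "printCounter");
const saved = call(instance, "__getmemory");
instance = null; // throw the instance away

// Second run: a brand new context, restored from the saved string.
instance = newInstance();
call(instance, "__setmemory", saved);
call(instance, "incCounter", 23);
call(instance, "printCounter");
```

The only thing that survives between the two instances is the string in saved, which is the same contract the Perl wrapper has with getMemory() and setMemory().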

Output of the program:

First run
DEBUG: Current count is 0
DEBUG: Current count is 42
Memory contents: {"counter":42}
Second run
DEBUG: Current count is 42
DEBUG: Current count is 65
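Since the memory travels between runs as a JSON string, anything a script stores in it has to survive a JSON.stringify/JSON.parse round trip. The following sketch (plain JavaScript, with made-up field names) shows what standard JSON serialization keeps and what it silently changes; it illustrates general JSON behaviour and is not something specific to Duktape or the module above.

```javascript
// Hypothetical memory contents; the field names are invented for this example.
const memory = {
  counter: 42,                 // numbers survive
  tags: ["a", "b"],            // arrays and nested objects survive
  lastSeen: new Date(0),       // Date objects come back as ISO strings
  ratio: NaN,                  // NaN and Infinity become null
  missing: undefined,          // undefined properties are dropped
  callback: function () {},    // functions are dropped
};

// Same round trip that __getmemory / __setmemory perform.
const restored = JSON.parse(JSON.stringify(memory));

console.log(restored);
console.log(typeof restored.lastSeen);   // "string"
console.log("callback" in restored);     // false
```

In practice this means user scripts should keep only plain data (numbers, strings, booleans, arrays, plain objects) in memory.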

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'

Re: Running user-provided JavaScript code
by aitap (Curate) on Apr 17, 2022 at 09:00 UTC

    Years ago, when I was an active participant in the #linux channel of the local IRC network, we had a bot that could evaluate arbitrary shell commands, really handy to test one-liners. I don't know exactly how it was implemented, but one of the operators told me it was running on a custom Linux build inside QEMU running inside another QEMU instance because occasionally, despite the commands running as nobody and with various ulimits set, people would find a way to crash something or escape one sandbox... but never all of them.

    Running user-provided JavaScript safely is hard. Browser vendors pour a lot of money into it, and JavaScript is still one of the leading exploitation vectors, the first step from navigating to a maliciously crafted web page to arbitrary code execution. Duktape has the benefit of not having to run JavaScript really fast and so sidesteps a whole class of JIT-related vulnerabilities (not needing to allocate writeable memory and turn it executable later helps a lot), but it does have its share of issues labelled "security". Lua is a much simpler language than JavaScript, with a whole chapter on sandboxing in PIL, and yet another double-free was found in it just a few days ago, which might potentially lead to type confusion and sandbox escape. SQLite is probably some of the finest C code there can be, and yet they still fix crashing bugs, mostly from corrupted database files, but occasionally from SQL input too. Ethereum itself has an occasional vulnerability in its virtual machine too. For Perl, there's Safe, but with the warning at the end of the documentation page, I wouldn't want to use it alone to run arbitrary code over the Internet, either.

    Admittedly, not all crashing bugs can be exploited, and not all issues labelled "security" will be exploited in the wild, and yet it feels like a fundamental problem: once we get a computer to run some code, it becomes hard to prevent it from running some other code that we don't want it to run but attackers do. I don't have a good solution, but I wish you good luck, and to proceed carefully!

      I'm aware that there are security considerations with any of these solutions. But i don't see much of an alternative these days, unless i want to write everything on my own and then hope&pray that i don't have a stupid fail somewhere in my code that lets the user exploit it anyway.

      Frankly, these days a user with enough smarts can probably exploit something like unicode handling. Oh wait, that stuff is already happening.

      Realistically, i don't have any answers, either.


        Using a Turing-complete language like JavaScript for user-provided code begs for trouble. Two of the most trivial ways to cause trouble are allocating all available memory and infinite loops. Yes, there are countermeasures for both, but why allow getting into trouble at all? A reduced, domain-specific language that is intentionally not Turing-complete might do the trick.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Running user-provided JavaScript code
by etj (Deacon) on Apr 13, 2022 at 21:09 UTC

      It's a different problem entirely, i'm afraid, although the solution is somewhat related. The JS is designed to run in a sandbox, something that would be very hard to pull off with Perl. You'd basically have to run some sort of virtualization (KVM, Virtualbox, Containers, etc) for every single script, one that not only allows you to limit file access but also network access. Now imagine you have hundreds or even thousands of scripts that need to run every few minutes.

      One of the reasons i'm playing around with Duktape is for my XPD project: XPD - Do more with your PerlMonks XP. Basically, i want to enable the users to create "smart contracts". Those could, for example, buy or bid on NFTs and then try to resell them at a profit. So they would have to run every so often for a very short amount of time, just to check the market and make decisions based on that. If i used Perl, i'd be constantly starting and stopping Containers or Virtualboxes. But with a language interpreter that's designed to be a sandbox, all i'd be doing is loading scripts.

      You might be able to pull this off with Perl if you have enough hardware resources. But i'm not Jeff Bezos with my own data centers. I own a single 10-year-old quad core server, so i'm somewhat more limited in my choices.

