Refactoring Perl5 with Lua
by rje (Deacon) on Oct 21, 2014 at 18:31 UTC
WARNING: It may be that I'm simply thinking about Parrot in a different way...
If you've read my previous post on microperl, then you're sufficiently prepared to take this post with a grain of salt. As a brief summary, I'll re-quote something Chromatic wrote to start me thinking about this problem in general:
"If I were to implement a language now, I'd write a very minimal core suitable for bootstrapping. ... Think of a handful of ops. Think very low level. (Think something a little higher than the universal Turing machine and the lambda calculus and maybe a little bit more VMmy than a good Forth implementation, and you have it.) If you've come up with something that can replace XS, stop. You're there. Do not continue. That's what you need." (Chromatic, January 2013)
Warning: I've never written a VM or a bytecode interpreter from scratch. I have written interpreters and worked with bytecodes before (okay, a 6502 emulator, but that's basically a bytecode interpreter, right?). Just remember that I'm not posting from a position of strength.
So I found the Lua opcode set, and it seems a good starting point for talking about a small, though perhaps not minimal, Turing machine that seems to do much of what Chromatic was thinking about... except for XS, which I still haven't wrapped my head around.
Lua has a register-based VM with 35 opcodes, flat closures, threads, coroutines, incremental garbage collection... and it manages to shoehorn in a tail call, a "for" loop, and a CLOSURE for goodness' sake. And some of those opcodes could be "macros" built on top of other opcodes rather than atomic opcodes (but only if speed were unimportant): SUB, MUL, DIV, POW, LE.
Again, a disclaimer: I haven't been in a compiler construction class for 25 years, and my career has typically been enterprise coding, data analysis, and tool scripting. Regardless, a small opcode set seems to me to be important for portability. And... 35 codes... well, that's dinky.
I don't assume that Lua's codes are sufficient for Perl... things are likely missing or just not quite right for Perl. But I have to start somewhere, right? And I figure some of you have the right Domain Knowledge to shed some light on the subject. Right?
There are lots of neat notes in the aforementioned Lua design doc, written in a clear and concise manner. And now for a brief glance at Lua's opcodes:
Interesting note: "Tables" were originally just hashes, and arrays were Tables with integer keys. With Lua 5, the Table is a two-part structure: a hash part AND an array part. Either part may be empty, and Lua keeps the two coordinated. This is for the sake of a pure array representation: in an array no keys need to be stored, so sequential values indexed from 1 get thrown into the array part. Then when someone puts something ridiculous in the table, like an index of 50_000_000, Lua may elect to put that in the hash half. This is all hidden from the user, though.
"if a table is being used as an array, it performs as an array, as long as its integer keys are dense."
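Here's a rough Python sketch of that two-part idea, as I understand it. The class name HybridTable and its set/get methods are made up for illustration, and the placement rule is deliberately naive: a key goes into the array part only if it extends the dense run 1..n. I'm assuming real Lua uses a smarter resizing policy than this.

```python
class HybridTable:
    """Hypothetical two-part table: dense integer keys 1..n live in a list,
    everything else lives in a dict."""

    def __init__(self):
        self.array = []   # values for keys 1..len(self.array)
        self.hash = {}    # all other keys

    def set(self, key, value):
        n = len(self.array)
        if isinstance(key, int) and 1 <= key <= n:
            self.array[key - 1] = value          # overwrite inside the dense run
        elif isinstance(key, int) and key == n + 1:
            self.array.append(value)             # extend the dense run
            # Migrate any keys that have now become contiguous.
            nxt = len(self.array) + 1
            while nxt in self.hash:
                self.array.append(self.hash.pop(nxt))
                nxt += 1
        else:
            self.hash[key] = value               # sparse or non-integer key

    def get(self, key):
        if isinstance(key, int) and 1 <= key <= len(self.array):
            return self.array[key - 1]
        return self.hash.get(key)
```

With this sketch, setting keys 1, 2, 3 fills the array part, while setting key 50_000_000 lands in the hash part; the caller only ever sees set() and get().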
UPDATE: Compare with Lorito's proposed opcode set:
WARNING. I've interpolated how *I* think each opcode might perform its function. Caveat emptor.