I've been noticing that in a number of japhy's posts, he gives opcode listings to support his points. I think that's an amazing thing to do. However, I am clueless when it comes to the opcodes, what they do, and how they're generated. Yes, I know I should go look at the interpreter code, but I'm lazy. :)

So, I was wondering the following:

  1. Is there a master list of opcodes, as well as a design for the algorithm(s) used to generate those listings?
  2. Is there a program that just does the execution phase? In other words, if I generate my own opcode listing, is there something that could run it?
  3. How on earth do people generate those opcode listings?

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

(Ovid) Re: Opcodes explained ... ?
by Ovid (Cardinal) on Sep 08, 2001 at 00:09 UTC

    To understand the opcodes, you can pick up a copy of Advanced Perl Programming (the panther book) and read chapter 20. If you want to generate an opcode listing of a Perl program, you'll need a perl binary that was itself compiled with the -DDEBUGGING option. Then, to dump the syntax tree (with the opcodes), you'd do something like the following:

    perl -Dx somescript.pl
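
If rebuilding perl is more trouble than you want, a possible alternative is the B::Concise backend, which as far as I know ships with core perl from 5.8 onward and works on an ordinary, non-debugging build. The sketch below is only an illustration: `add_one` and `concise_listing` are made-up names, and the exact listing format differs between perl versions.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical sample sub whose op tree we want to look at.
sub add_one {
    my $n = shift;
    return $n + 1;
}

# Render the op tree of a code ref into a string instead of STDOUT.
sub concise_listing {
    my ($code_ref) = @_;
    my $listing = '';
    B::Concise::walk_output(\$listing);                    # send output to the string
    my $walker = B::Concise::compile('-basic', $code_ref); # build the tree walker
    $walker->();                                           # walk the tree and render it
    return $listing;
}

print concise_listing(\&add_one);
```

For a whole script, the command-line form `perl -MO=Concise somescript.pl` should give a similar listing without writing any code.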

    In japhy's posts, I have generally noticed that he lists the regex opcodes. You can get these with the re pragma.

    use strict;
    use Data::Dumper;
    use re 'debug';
    'abc123def456' =~ /(?<=f)(\d+)/;
    print "$1\n";

    The above code will output how the regular expression compiled and will print out the exact steps the regex engine takes to match. Here's the output from 5.6.1 (ActiveState):

    Compiling REx `(?<=f)(\d+)'
    size 13 first at 1
    synthetic stclass `ANYOF[0-9]'.
       1: IFMATCH[-1](7)
       3:   EXACT <f>(5)
       5:   SUCCEED(0)
       6:   TAIL(7)
       7: OPEN1(9)
       9:   PLUS(11)
      10:     DIGIT(0)
      11: CLOSE1(13)
      13: END(0)
    stclass `ANYOF[0-9]' minlen 1
    Matching REx `(?<=f)(\d+)' against `abc123def456'
      Setting an EVAL scope, savestack=3
       3 <abc> <123def456>    |  1:  IFMATCH[-1]
       2 <ab> <c123def456>    |  3:    EXACT <f>
                                    failed...
                                  failed...
      Setting an EVAL scope, savestack=3
       4 <abc1> <23def456>    |  1:  IFMATCH[-1]
       3 <abc> <123def456>    |  3:    EXACT <f>
                                    failed...
                                  failed...
      Setting an EVAL scope, savestack=3
       5 <abc12> <3def456>    |  1:  IFMATCH[-1]
       4 <abc1> <23def456>    |  3:    EXACT <f>
                                    failed...
                                  failed...
      Setting an EVAL scope, savestack=3
       9 <abc123def> <456>    |  1:  IFMATCH[-1]
       8 <abc123de> <f456>    |  3:    EXACT <f>
       9 <abc123def> <456>    |  5:    SUCCEED
                                    could match...
       9 <abc123def> <456>    |  7:  OPEN1
       9 <abc123def> <456>    |  9:  PLUS
                               DIGIT can match 3 times out of 32767...
      Setting an EVAL scope, savestack=3
      12 <abc123def456> <>    | 11:    CLOSE1
      12 <abc123def456> <>    | 13:    END
    Match successful!
    456
    Freeing REx: `(?<=f)(\d+)'

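    The first part (the lines numbered 1 to 13) is a breakdown of the compiled regex: each numbered line is one node, and the number in parentheses is the node it continues to. The rest is a trace of the match attempt at each starting position.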
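
The trace gets long quickly, so it can help to switch it on for just one regex. A minimal sketch, assuming perl 5.10 or later (where, as I understand it, `use re 'debug'` is lexically scoped); `digits_after_f` is a made-up helper name:

```perl
use strict;
use warnings;

# Return the run of digits that directly follows an "f", or undef if none.
sub digits_after_f {
    my ($text) = @_;
    # Only regexes compiled inside this sub body are traced.
    use re 'debug';
    return $text =~ /(?<=f)(\d+)/ ? $1 : undef;
}

# The match result goes to STDOUT; the debugging trace goes to STDERR.
print digits_after_f('abc123def456'), "\n";
```

Running it as `perl script.pl 2>trace.txt` keeps the trace in a file, so it does not get mixed into the program's normal output.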

    Cheers,
    Ovid

    Vote for paco!

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: Opcodes explained ... ?
by japhy (Canon) on Sep 08, 2001 at 00:23 UTC
    Most of my opcode talk is about regexes. Here's a regex opcode primer for you:
    • regnodes.h lists all the regex opcodes, their "names" (like STAR and SUSPEND and IFMATCH), their size (how many "nodes" does each opcode take up), and some other useful info
    • pod/perldebguts.pod has a section on regex debugger output (available via -Dr if you compiled Perl with debugging, and via use re 'debug' otherwise)
    • regcomp.c and regexec.c hold the meat of the regex engine -- these two beasts are the compiler and executor of regexes, so they're probably very daunting (trust me)
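
If you'd rather skim the names in regnodes.h than read the header itself, something along these lines might do. It is a sketch under two assumptions that may not hold on your system: that the installation ships the core headers under `$Config{archlibexp}/CORE`, and that each opcode appears as a `#define NAME number` line. `parse_regnode_define` is a made-up helper, and the pattern may also pick up a few constants that are not opcodes.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Pull (name, number) out of one "#define NAME number" line.
# Returns an empty list for any other kind of line.
sub parse_regnode_define {
    my ($line) = @_;
    return $line =~ /^#define\s+([A-Z][A-Z0-9_]*)\s+(\d+)\b/ ? ($1, $2) : ();
}

# Assumed location of the installed header; not every installation has it.
my $header = File::Spec->catfile($Config{archlibexp}, 'CORE', 'regnodes.h');

if (open my $fh, '<', $header) {
    while (my $line = <$fh>) {
        my ($name, $number) = parse_regnode_define($line) or next;
        printf "%3d  %s\n", $number, $name;
    }
    close $fh;
}
else {
    print "No regnodes.h at $header; look in the perl source tree instead.\n";
}
```

If the header is not installed, the copy in the perl source distribution is the thing to read.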

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;