Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I happened to go through the Perl 5 interpreter code on GitHub and was completely overwhelmed. What is the best way to get started with it? I don't have a degree in CS, so I'm not familiar with compilers and interpreters. Are there any resources or reading that would help, something like a book?

Replies are listed 'Best First'.
Re: Understanding the Perl Interpreter
by jethro (Monsignor) on Apr 08, 2010 at 10:12 UTC

    Looking at a compiler without some CS knowledge would be almost impossible, I suspect, but an interpreter is somewhat easier.

    Still, there is the parser, where some knowledge of grammars (the CS kind), LR parsers, LL parsers, LALR parsers and so on could be useful. Reading about them on Wikipedia might give you some background. Then you could take a look at lex, yacc and bison (lex generates tokenizers; yacc and its open-source counterpart bison generate parsers from a grammar) to see how a parser works, and even build a simple example with them. But I suspect that Perl's "Do what I mean" combined with highly optimized code left the Perl parser far removed from theory.

    Another introductory text you might read is http://compilers.iecc.com/crenshaw/. Even though it is about compilers, it gives you a lot of background information.

    Then you should take a look, from the outside, at the internal representation/intermediate language used in perl. Take a minimal script and run it with the different debugging switches ("-Dx"; see perlrun). For example, "perl -D1 <yourminimalscript>" shows how the tokens are translated to terms, expressions, scalars, blocks and so on, and -D8 shows every step of the execution of the intermediate language.
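The -D switches described in perlrun only work when perl itself was compiled with -DDEBUGGING, so it can save some confusion to check that first. The following is a minimal sketch; the helper name `built_with_debugging` is made up for illustration, and looking at `$Config{ccflags}` is a heuristic, not an official test.

```perl
use strict;
use warnings;
use Config;

# Heuristic: a debugging perl normally has -DDEBUGGING among its C flags.
# (built_with_debugging is a hypothetical helper name.)
sub built_with_debugging {
    my $ccflags = $Config{ccflags} // '';
    return $ccflags =~ /-DDEBUGGING\b/ ? 1 : 0;
}

print built_with_debugging()
    ? "This perl looks like a debugging build; -D switches should work.\n"
    : "No -DDEBUGGING in ccflags; -D switches will probably be rejected.\n";
```

If the check fails, building a separate perl with debugging enabled is the usual way to get the -D output.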

    With that background knowledge, the source should make much more sense.

      But I suspect that Perl's "Do what I mean" combined with highly optimized code left the Perl parser far removed from theory.
      It is my understanding that the scary part isn't in the parsing - Perl is parsed by yacc/bison.

      The scary part is the context-aware tokenizer (tokenizing is the step that comes before parsing, and it is hardly looked at when parsers and compilers are introduced in CS studies).
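To give a feel for what "context-aware" means here, below is a toy sketch (not how perl's real tokenizer works) in which the meaning of `/` depends on the previous token: after an operator it starts a regex, after an operand it is division. The function name `tokenize` and the token labels TERM, OP and REGEX are invented for this example.

```perl
use strict;
use warnings;

# Toy tokenizer: decides whether '/' starts a regex or is a division
# operator by remembering whether an operand is expected next.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    my $expect_term = 1;    # true when an operand is expected next
    pos($src) = 0;
    while (pos($src) < length($src)) {
        if ($src =~ /\G\s+/gc) {
            next;                               # skip whitespace
        }
        elsif ($src =~ /\G(\d+|\$\w+)/gc) {
            push @tokens, [ TERM => $1 ];       # number or scalar variable
            $expect_term = 0;
        }
        elsif ($expect_term && $src =~ m{\G/((?:\\.|[^/\\])*)/}gc) {
            push @tokens, [ REGEX => $1 ];      # '/' where an operand fits
            $expect_term = 0;
        }
        elsif ($src =~ m{\G(=~|[-+*/=])}gc) {
            push @tokens, [ OP => $1 ];         # '/' here means division
            $expect_term = 1;
        }
        else {
            die "Unexpected character at offset " . pos($src) . "\n";
        }
    }
    return @tokens;
}

for my $code ('$x / 2', '$x =~ /a b/') {
    print join(' ', map { "$_->[0]($_->[1])" } tokenize($code)), "\n";
}
```

The real tokenizer has to track far more state than a single flag, which is a large part of why it is hard to read.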

      Thanks, I will take it one step at a time and dig deeper into whatever I don't understand. I just realized how easy it is to get overwhelmed by the depth and breadth of the topic.
Re: Understanding the Perl Interpreter
by moritz (Cardinal) on Apr 08, 2010 at 08:50 UTC
Re: Understanding the Perl Interpreter
by ikegami (Patriarch) on Apr 08, 2010 at 15:20 UTC

    Devel::Peek shows you the internal details of scalars.

    >perl -e"use Devel::Peek; $x='123'; Dump($x); 0+$x; Dump($x);"
    SV = PV(0x2369cc) at 0x182a22c
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x23fee4 "123"\0
      CUR = 3
      LEN = 4
    SV = PVIV(0x182005c) at 0x182a22c
      REFCNT = 1
      FLAGS = (IOK,POK,pIOK,pPOK)
      IV = 123
      PV = 0x23fee4 "123"\0
      CUR = 3
      LEN = 4

    illguts explains this area.
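The same flag change can also be observed from inside a script through the core B module, which is handy when you want to test for it and not just read a dump. This is a rough sketch; `ok_flags` is a made-up helper, and it only looks at the three public "OK" flags.

```perl
use strict;
use warnings;
use B ();

# Return the names of the public "OK" flags currently set on a scalar.
# (ok_flags is a hypothetical helper, not part of B.)
sub ok_flags {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return @set;
}

my $x = '123';
my @before = ok_flags(\$x);
my $n = $x + 0;    # numeric use lets perl cache an integer value in $x
my @after = ok_flags(\$x);

print "before: @before\n";
print "after:  @after\n";
```

Devel::Peek remains the better tool for exploring, since it also shows the body type and buffer details.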

    And then there's -MO=Concise to show you the opcode tree.

    >perl -MO=Concise -e"for (1..2) { print qq{Hello World\n} }"
    g  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter ->2
    2     <;> nextstate(main 2 -e:1) v ->3
    f     <2> leaveloop vK/2 ->g
    7        <{> enteriter(next->c last->f redo->8) lKS/8 ->d
    -           <0> ex-pushmark s ->3
    -           <1> ex-list lK ->6
    3              <0> pushmark s ->4
    4              <$> const[IV 1] s ->5
    5              <$> const[IV 2] s ->6
    6           <#> gv[*_] s ->7
    -        <1> null vK/1 ->f
    e           <|> and(other->8) vK/1 ->f
    d              <0> iter s ->e
    -              <@> lineseq vK ->-
    8                 <;> nextstate(main 1 -e:1) v ->9
    b                 <@> print vK ->c
    9                    <0> pushmark s ->a
    a                    <$> const[PV "Hello World\n"] s ->b
    c                 <0> unstack v ->d
    -e syntax OK

    >perl -MO=Concise,-exec -e"for (1..2) { print qq{Hello World\n} }"
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <$> const[IV 1] s
    5  <$> const[IV 2] s
    6  <#> gv[*_] s
    7  <{> enteriter(next->c last->f redo->8) lKS/8
    d  <0> iter s
    e  <|> and(other->8) vK/1
    8      <;> nextstate(main 1 -e:1) v
    9      <0> pushmark s
    a      <$> const[PV "Hello World\n"] s
    b      <@> print vK
    c      <0> unstack v
               goto d
    f  <2> leaveloop vK/2
    g  <@> leave[1 ref] vKP/REFC
    -e syntax OK
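B::Concise can also be driven from inside a program, which makes it easy to dump the ops of one particular sub. A small sketch, assuming the `walk_output` and `compile` functions described in the B::Concise documentation; `hello_loop` and `concise_exec` are names invented here.

```perl
use strict;
use warnings;
use B::Concise ();

# The sub whose ops we want to look at (same loop as the one-liner).
sub hello_loop { for (1 .. 2) { print "Hello World\n" } }

# Render the ops of a code reference in execution order into a string.
sub concise_exec {
    my ($code) = @_;
    my $out = '';
    B::Concise::walk_output(\$out);               # send output to $out
    B::Concise::compile('-exec', $code)->();      # build and run the walker
    return $out;
}

my $listing = concise_exec(\&hello_loop);
print $listing;
```

The exact op names and flags in the listing vary between perl versions, so expect differences from the output shown in this post.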

    You can locate the code for each op by searching for pp_op_name. It'll be in one of the pp*.c files. For example, the code for enteriter is in pp_enteriter in pp_ctl.c.
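If you have the perl source unpacked, the search can be scripted. In the pp*.c files each op function is introduced with the PP() macro, e.g. `PP(pp_enteriter)`, so a sketch like the one below finds the definition. The helper name `find_pp` and the command-line handling are made up for this example, and some ops share an implementation, so not every op name gets a hit.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Search the pp*.c files under $src_dir for "PP(pp_<op_name>)" and
# return "file:line" strings. (find_pp is a hypothetical helper.)
sub find_pp {
    my ($src_dir, $op_name) = @_;
    my @hits;
    for my $file (sort(bsd_glob("$src_dir/pp*.c"))) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            push @hits, join(':', $file, $.)
                if $line =~ /^PP\(pp_\Q$op_name\E\)/;
        }
        close $fh;
    }
    return @hits;
}

# Usage: perl find_pp.pl /path/to/perl/source enteriter
if (@ARGV == 2) {
    print "$_\n" for find_pp(@ARGV);
}
```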

    This provides a good entry into perl since you're looking at code whose function you already know.

Re: Understanding the Perl Interpreter
by Anonymous Monk on Apr 08, 2010 at 08:42 UTC
Re: Understanding the Perl Interpreter
by Marshall (Canon) on Apr 08, 2010 at 10:40 UTC
    The distinction between compiled and interpreted computer languages shouldn't matter to you one bit. Basically, you instruct the computer what to do in a "language" such as Perl, C or Java. When you use a spreadsheet like Excel and tell it to add column A to column B, you are giving a program instructions about what to do. Excel is a very high-level "language".

    Perl is not a "beginner language". For a CS-bound student, I would recommend C as a first language. Otherwise, I would start with BASIC or Java. I guess this depends on where you are headed and what you want to be able to do with software. Often all that is needed is some knowledge of, say, SQL for database queries and some spreadsheet voodoo.

    Start with the idea of what you want to accomplish. If you can get this down to "I wanna be able to do X", then you have a goal. What is your first goal?
