
Perl Internals - references and symbol table

by nothingmuch (Priest)
on Nov 16, 2002 at 06:45 UTC

nothingmuch has asked for the wisdom of the Perl Monks concerning the following question:

I was wondering... How does perl know where variables are in memory?

My assumption about package (dynamic) variables is that perl looks up the name in the symbol table (whose location is presumably known) and gets back a reference. I assume a reference is a structure containing a reference count and a pointer. The reference's location in memory does not change until it is destroyed, and when the variable is reallocated, the reference's pointer is updated.

If I'm right (the above is really just a guess), how does this work for lexical variables? A separate symbol table? And if not, how does it work?

Thanks in advance

-nuffin
zz zZ Z Z #!perl

Replies are listed 'Best First'.
Re: Perl Internals - references and symbol table
by sauoq (Abbot) on Nov 16, 2002 at 07:12 UTC
    I assume references are a structure containing a counter of refs, and a pointer.

    Internally, variables are associated with their own refcounts. This is true of lexicals as well as global (i.e. package) variables. If you have $variable and later assign \$variable to $reference then the refcount associated with $variable increases. The refcount is not associated with $reference. (Unless, of course, there is a reference to the reference floating around somewhere.) A variable's storage is not deallocated until its refcount has gone to zero.
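A rough way to watch this happen, assuming the core B module is available: the sketch below reads the internal refcount of $variable before and after a reference to it is stored. The helper name refcount_of is made up for illustration. Only the differences between the numbers are meaningful, since the temporary \$variable passed to the helper is itself counted.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: read the internal refcount of whatever $_[0] refers to.
sub refcount_of { return B::svref_2object($_[0])->REFCNT }

my $variable = "some value";
my $before   = refcount_of(\$variable);

my $reference = \$variable;               # one more reference to $variable
my $after     = refcount_of(\$variable);

undef $reference;                         # drop that reference again
my $dropped   = refcount_of(\$variable);

# The count belongs to $variable, not to $reference: it rises by one
# while $reference points at it and falls back once $reference is cleared.
printf "before=%d after=%d dropped=%d\n", $before, $after, $dropped;
```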

    -sauoq
    "My two cents aren't worth a dime.";
    
      I assume lexical refs and dynamic refs are no different, as they are just a means of finding the data and freeing its space when unused... What I meant by reference is actually not the referencing part of it, but rather the resolving of a name to a value. I should not have used the word reference, as it is not correct...

      In terms of my question, I guess I need to be clearer: how are lexical variables found via their name, and does the symbol table -> value path work like I assume?

      -nuffin
      zz zZ Z Z #!perl

        I think you're conflating the two senses of "reference" here. What we call a "variable" is a name for a structure in memory which holds a C pointer to the actual location in memory of the variable's value [1]. So $user = "diotalevi" has two components: the variable "$user" and the value it points to. Normally we never care about this, but if you want more you should read Gisle Aas' PerlGuts Illustrated, perlguts, broquaint's Of Symbol Tables and Globs, and the other handful of man pages, followed by the headers, sv.h in particular.

        Reference variables are a variation on that in that they are still variables in the sense I just mentioned. The difference is that instead of pointing to a C array they point to another variable. Perl keeps track of what sort of things a variable points to via a FLAGS bitmask.

        The other difference you seem to be hung up on is between package and lexical variables. If you'd read some of the documents I mentioned you'd know that your symbol table is just a hash of reference variables. In that case it's explicitly name->variable. shotgunefx brought up the lexical variables' scratchpad. Those AoAs (known as a PADLIST) are pointed to by various code references (glossing over CV for those of you who know better). The compiler has the opportunity to convert each use of a my $foo variable into a direct use of its slot in the array (i.e., no looking up of the variable by name), but extensions like PadWalker exploit the fact that the variable's name is kept around (and probably used by things that are beyond my ken) and can fetch the lists of lexical names to build a user-friendly hash. Clearer?

        [1] "Number" variables can store their own value without needing to point to somewhere else.

        __SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;
Re: Perl Internals - references and symbol table
by shotgunefx (Parson) on Nov 16, 2002 at 08:11 UTC
    Paraphrasing, but the gist is that lexicals are stored in scratchpads, which are basically AoAs. Scratchpads are associated with { scopes }

    The first element is an array of the lexical names in that pad
    The second element is an array of the values
    If the subroutine recurses, it populates subsequent elements with new values so each call has its own set of lexical values. Because most of this is known at compile time, I believe perl usually optimizes the names away and goes directly to the lexical's index instead of searching through the pad's lexical names for the index. Of course perlguts has a much more gory (and correct) description of the workings.
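The "each call gets its own set of values" part can be observed from plain Perl. This is only a sketch: the sub name addresses_by_depth is invented for the example, and it uses Scalar::Util's refaddr to compare where the same lexical lives at each recursion depth.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical sub: collect the address of its own lexical at every depth.
sub addresses_by_depth {
    my ($depth) = @_;
    my $mine = "level $depth";
    my @seen = (refaddr(\$mine));          # where this call's $mine lives
    push @seen, addresses_by_depth($depth - 1) if $depth > 0;
    return @seen;
}

my @addrs  = addresses_by_depth(2);        # three nested calls: depth 2, 1, 0
my %unique = map { $_ => 1 } @addrs;

# Every call that is active at the same time has separate storage for $mine.
printf "%d calls, %d distinct addresses\n", scalar @addrs, scalar keys %unique;
```

The source says "$mine" once, but while the calls are nested there are three separate scalars behind that one name.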

    -Lee

    "To be civilized is to deny one's nature."
      Scratchpads are associated with { scopes }
      Not quite. Scratchpads are associated with subs, not with blocks. Yes, from a language level it looks like there's a scratchpad per block, but under the hood there really isn't.
        Thanks Elian, I'm trying to improve my understanding of the internals (just got Embedding and Extending Perl). In the following example, where do the lexicals for the bare block live then?

        #!/usr/bin/perl
        use PadWalker qw(peek_my);
        use strict;
        use warnings;
        use Data::Dumper;

        # Dump the lexicals visible in the caller's scope.
        sub peeker { my $l = shift; print Dumper(peek_my(++$l)) }

        my $outmost = "outmost";
        {
            my $inner = "outer";
            peeker(0);    # inside the bare block
        }
        peeker(0);        # back at file scope


        -Lee

        "To be civilized is to deny one's nature."
Re: Perl Internals - references and symbol table
by Elian (Parson) on Nov 16, 2002 at 20:45 UTC
    That's pretty simple. The perl compiler phase is what translates your source to an internal representation that gets executed. If it lost track of your variables as it did that, it'd be a pretty poor compiler. ;)

    Seriously, variables are in two places, as folks have more or less already pointed out. First are the global variables, the ones you can look up by name. They're stored in the %main:: hash, or a hash hanging off of it. Perl's got a handle on the %main:: hash stuck away in the interpreter structure, and can find it whenever it wants.
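The by-name lookup for globals can be poked at from Perl itself, since the symbol table for package main is visible as the hash %main::. A minimal sketch (the variable names are made up for the example) showing that a package variable is reachable through that hash while a lexical is not:

```perl
use strict;
use warnings;

our $greeting = "hello";          # package variable, i.e. $main::greeting
my  $secret   = "lexical only";   # lexical: not entered in %main:: at all

# The values in %main:: are typeglobs; the SCALAR slot of the glob
# is a reference to the scalar variable of that name.
my $via_stash = *{ $main::{greeting} }{SCALAR};

print "same scalar\n" if $via_stash == \$greeting;
print "value: $$via_stash\n";
print "greeting in stash: ", (exists $main::{greeting} ? "yes" : "no"), "\n";
print "secret in stash:   ", (exists $main::{secret}   ? "yes" : "no"), "\n";
```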

    Subs have a scratchpad associated with them, where the lexicals are stored. A pointer to this structure is stored in the subroutine's internal bits, and when you enter a sub the interpreter instantiates the scratchpad for the sub.

    Finding lexicals is actually easy. The compiler knows what lexicals you're using, as they're compile time constant things. When it goes and builds a sub's scratchpad it just stuffs all the unique lexicals for the sub into it. Since it keeps track of which lexical's gone in which slot, when it generates the "fetch a lexical" code for the interpreter, it just fills in which lexical slot the variable needs to come from.

    There's nothing particularly revolutionary, or even cutting-edge, about perl's compiler code. You might want to find a book on the fundamentals of compilers (but not the Dragon book! It sucks) and read through it for a good idea of how this stuff all works.

    Update: File-scoped lexicals are, interestingly enough, not a special case. Somewhere in the docs there's an off-hand comment about the entire file being wrapped in a set of braces. Well, as far as the compiler's concerned that's true: the whole file is, if there are any file-scoped lexicals, living inside an anonymous subroutine. (Which makes all the subs in a file with file-scoped lexicals really closures.) Makes everything work out nicely.
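Seen from the language level, the effect is simply that named subs in a file share any file-scoped lexical they mention. A small sketch, with invented names, of two subs acting as closures over the same variable:

```perl
use strict;
use warnings;

my $hits = 0;                       # file-scoped lexical

# Both subs refer to the same $hits, much as closures would.
sub record_hit { return ++$hits }
sub hit_count  { return $hits }

record_hit() for 1 .. 3;
print "hits so far: ", hit_count(), "\n";   # prints "hits so far: 3"
```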

    If anyone's really interested in the way things are stored internally, I'd recommend starting in gv.c and going from there. Otherwise just assume it's all magic, which is as good an explanation as any.

      If you're wondering how code like:
      sub bar { my $foo; { my $foo; } }
      can possibly have only one scratchpad, that's easy--perl knows the two $foo slots are different, and makes sure it gets the right one when it goes looking. (When lexicals are accessed they're all accessed by slot number, not by name, so the actual name is pretty irrelevant at runtime)
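You can see the two slots with the core B module. This is a sketch only: it assumes a perl recent enough that B exposes the pad name list with an ARRAY method and name entries with a PV method, and it skips any entry that has no PV method.

```perl
use strict;
use warnings;
use B ();

sub bar { my $foo; { my $foo; } }

# The first element of a sub's padlist is the list of lexical names.
my ($names) = B::svref_2object(\&bar)->PADLIST->ARRAY;

my @slots;
my $index = 0;
for my $entry ($names->ARRAY) {
    # Unnamed slots may come back as objects without a PV method.
    my $name = $entry->can('PV') ? $entry->PV : undef;
    push @slots, $index if defined $name && $name eq '$foo';
    $index++;
}

# One sub, one scratchpad, but two separate slots named '$foo'.
print "\$foo occupies pad slots: @slots\n";
```

The exact slot numbers vary between perl versions, so the only thing worth relying on here is that there are two of them.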

      The only place that this could get interesting is with string eval, but since perl also tracks the lines that a particular lexical is active for, it's not a problem.

      book on the fundamentals of compilers (but not the Dragon book! It sucks)

      Why? Some people think that it is the "classic compiler text", what are they missing? Is this another holy war I'm not aware of?

      Let me clarify that I'm not challenging you (You're writing Parrot for goodness sake.), but I like to understand differences.

        The Dragon book is a classic because it was the first book in the field of any significance, not because it's actually any good. (Consider it a classic compiler book in the same way that "Plan 9 from Outer Space" is a classic SciFi movie) Everything you need for a simple compiler is in there, but you'll get it out with far more pain and confusion than the subject warrants. Newer books in the field, such as Modern Compiler Design by Grune, Bal, Jacobs & Langendoen (just off the top of my bookshelf), are much better and far easier to read.
Re: Perl Internals - references and symbol table
by Aristotle (Chancellor) on Nov 16, 2002 at 21:11 UTC
    You will probably enjoy Gisle Aas' PerlGuts Illustrated, which, btw, I find a very worthwhile read for any ambitious Perl programmer who doesn't yet know how perl works on the inside. Beware, it's a pretty hefty read, but it will answer this and many more questions.

    Makeshifts last the longest.
