in reply to Circular reference testing.

SVs are in two parts: the head, which is fixed size and is 12 bytes on a typical 32-bit processor, and which may contain a pointer to the body, which is variable sized depending on the type (integer, string, reference, hash, etc). The address of the head is what is displayed by ref() etc. SV heads are allocated from arenas: malloc'd chunks of memory approx 1K in size. Within arenas, the addresses of SVs will be offset from neighbours by 12 bytes (again, only for a typical 32-bit system), but you can't make any assumptions about SVs allocated from different arenas.
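
To make the arena layout concrete, here is a rough C sketch. It is not perl's actual allocator: the names `sv_head`, `arena`, `ARENA_BYTES` and `head_distance` are made up for illustration, and the 1024-byte chunk is only an assumption standing in for "approx 1K". It just shows that heads carved from one chunk sit a fixed `sizeof` apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an SV head: a body pointer plus two 32-bit words. */
struct sv_head {
    void    *sv_any;     /* pointer to the body, if any */
    uint32_t sv_refcnt;  /* reference count */
    uint32_t sv_flags;   /* type and status flags */
};

/* Assumed chunk size; the post only says "approx 1K". */
#define ARENA_BYTES     1024
#define HEADS_PER_ARENA (ARENA_BYTES / sizeof(struct sv_head))

/* Hypothetical arena: one malloc-sized chunk carved into heads. */
struct arena {
    struct sv_head heads[HEADS_PER_ARENA];
};

/* Byte distance from head i to head j within the same arena. */
static ptrdiff_t head_distance(const struct arena *a, size_t i, size_t j)
{
    assert(i < HEADS_PER_ARENA && j < HEADS_PER_ARENA);
    return (const char *)&a->heads[j] - (const char *)&a->heads[i];
}
```

For some `struct arena a`, `head_distance(&a, 0, 1)` equals `sizeof(struct sv_head)`, which is 12 with 4-byte pointers and 16 with 8-byte pointers on the usual ABIs. Nothing comparable can be said about heads in two separately malloc'd arenas.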

Of course this might change completely in a future release of perl.

Dave.

Re^2: Circular reference testing.
by BrowserUk (Patriarch) on Mar 09, 2005 at 00:59 UTC

    Thanks for supplying the full SP on SVs :)

    ...but you can't make any assumptions about SVs allocated from different arenas.

    Would it be accurate to say that reference values stringified within Perl

  • Are always quad-aligned?
  • Will never be closer than 12 bytes apart?
  • Even if two SV headers are within different arenas, and the arenas are adjacent, the last SV in the lower address arena will still be at least 12 bytes from the end and therefore at least 12 bytes from the first one in the higher addressed arena?
  • If we're on a 64-bit machine, the SV headers are always more than 12 bytes, so using 12 as my divisor is safe?

    (Do they become 16 or 24 bytes on 64-bit systems?)

  • A future release of Perl 5 is unlikely to reduce the size of an SV header?
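
    The reasoning behind the divisor question can be sketched in C. This is only an illustration under the assumptions in the list above (heads never closer together than the divisor, and a known lower bound); `seen_set`, `slot_for` and `seen_before` are hypothetical names, not anything from the perl source. If any two head addresses are at least `divisor` bytes apart, integer division maps them to different slots, so a divisor of 12 stays collision-free when the real spacing is 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical visited-set keyed on head addresses at or above 'low'. */
struct seen_set {
    uintptr_t      low;      /* assumed lower bound on head addresses */
    size_t         divisor;  /* must not exceed the real head spacing */
    size_t         nbits;    /* number of slots available in 'bits' */
    unsigned char *bits;     /* caller-supplied, zeroed bit-vector */
};

/* Slot index for an address; distinct heads give distinct slots
 * as long as they are at least 'divisor' bytes apart. */
static size_t slot_for(const struct seen_set *s, uintptr_t addr)
{
    assert(s->divisor > 0);
    return (size_t)((addr - s->low) / s->divisor);
}

/* Marks addr as seen. Returns 1 if it was already marked (a repeat
 * visit, so a possible cycle), 0 if it is new, -1 if out of range. */
static int seen_before(struct seen_set *s, uintptr_t addr)
{
    size_t slot;
    unsigned char mask;
    int was_set;

    if (addr < s->low)
        return -1;
    slot = slot_for(s, addr);
    if (slot >= s->nbits)
        return -1;

    mask = (unsigned char)(1u << (slot % 8));
    was_set = (s->bits[slot / 8] & mask) != 0;
    s->bits[slot / 8] |= mask;
    return was_set;
}
```

    With `low = 0x1000` and a divisor of 12, addresses 0x1000, 0x1010 and 0x1020 (16 bytes apart) land in slots 0, 1 and 2. The cost of a too-small divisor is only some unused slots; a divisor larger than the real spacing would risk two heads sharing a slot.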

    Any thoughts on getting a rough value for the lowest and highest address that can be a reference at any given moment in time?

    For the lowest, I've been looking through the source trying to discover when the various system globals come into being. The thought being that maybe, say, $^X or $^O would be likely to be allocated pretty early, and addresses below that are unlikely to come up in user data structures, leastwise not as self-references? Suggestions for the best choice?

    For the top end, I've been thinking along the lines that any data structure I am going to traverse already exists, so if I could force the allocation of a new SV, its address would form an upper bound for my circular reference testing.

    The fly in that ointment is that if I simply declare a new variable, I am quite likely to re-use an old one that has gone out of scope.

    Do you know of any way that I could force the allocation of a completely new SV (header)?

    I thought about allocating a largish array, big enough that it would force a new allocation from the OS and then use the address of the highest as my upper bound. I also thought that if the largish array was immediately undef'd and I then pre-allocated my bit-vector, it might get to re-use the same space.

    From what you are saying, and what little I understand about the way Perl allocates memory, the space allocated for a long string is unlikely to re-use space allocated for a large array because they would be allocated from different pools?


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      If we're on a 64-bit machine, the SV headers are always more than 12 bytes, so using 12 as my divisor is safe?

      (Do they becomes 16 or 24 bytes on 64-bit systems?)

      SVs (and their friends AVs, HVs, etc) are defined in sv.h. They look like:

      struct STRUCT_SV {      /* struct sv { */
          void*   sv_any;     /* pointer to something */
          U32     sv_refcnt;  /* how many references to us */
          U32     sv_flags;   /* what we are */
      };

      (AVs, HVs, etc, look the same, except they have a differently typed pointer as the first member of the struct). So their size is 8 bytes (assuming 8 bit bytes), plus the size of a pointer. On a system with 64 bits of addressable memory, pointers will be 8 bytes, giving a struct size of 16 bytes.
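
      If you want to check that arithmetic on your own compiler, something like the following should do. It is a sketch, not the real sv.h: `sv_head` is a hypothetical copy of the struct above with `uint32_t` standing in for perl's `U32`, and the two helper functions are made up for the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the head layout; uint32_t stands in for U32. */
struct sv_head {
    void    *sv_any;     /* pointer to something */
    uint32_t sv_refcnt;  /* how many references to us */
    uint32_t sv_flags;   /* what we are */
};

/* Size predicted by the reasoning above: two 32-bit words plus a pointer. */
static size_t expected_head_size(void)
{
    return 2 * sizeof(uint32_t) + sizeof(void *);
}

/* Size the compiler actually gives the struct, including any padding. */
static size_t actual_head_size(void)
{
    size_t n = sizeof(struct sv_head);
    assert(n >= expected_head_size());  /* padding can only add bytes */
    return n;
}
```

      On the common 32-bit and 64-bit ABIs the two functions agree (no padding is needed), which gives 12 and 16 bytes respectively. A platform with unusual alignment rules could in principle pad the struct, so it is worth checking rather than assuming.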

      As for your other memory related questions, I think you have to dive into malloc.c in the Perl source to get your answer. I looked in the file, and it looks like it's written by someone not familiar with the rest of the perl source code: it actually has large comment sections.

        Thanks for the info on the 64-bit headers.

        With respect to malloc.c: I've been there before, but the picture gets a little confused if you are not able to use Perl's malloc(), as is the case if you want USE_IMP_SYS, which is a prerequisite for USE_MULTI, which is a prerequisite for both threads and fork. Which basically translates into: if you want threads, you have to use the CRT malloc (on win32 at least).

        That gets further confused by the presence of the routines in win32\vmem.h, which use the CRT memory routines, but then wrap other stuff around them which I won't pretend to understand.

        For my lower bound, I'm still looking to find a system global that gets allocated early and is unlikely to change.

        For the upper bound, I'm still looking at the idea of allocating a largish array and then hoping that undefing it and immediately allocating my bitstring will re-use the space.

        Earlier I was afraid that an array and a string would be allocated from different pools, but I now remember that Perl's arrays are not linked lists, but a contiguous block of SVs. So, in theory, once the array is freed, its space would be eligible for reuse by the string--but there is a lot of supposition in that. I was looking for confirmation/dissuasion.

