in reply to Re: Circular reference testing.
in thread Circular reference testing.

Thanks for supplying the full SP on SVs :)

...but you can't make any assumptions about SVs allocated from different arenas.

Would it be accurate to say that reference values stringified within Perl:

  • Are always quad-aligned?
  • Will never be closer than 12 bytes apart?
  • Even if two SV headers are within different arenas, and the arenas are adjacent, the last SV in the lower address arena will still be at least 12 bytes from the end and therefore at least 12 bytes from the first one in the higher addressed arena?
  • If we're on a 64-bit machine, the SV headers are always more than 12 bytes, so using 12 as my divisor is safe?

    (Do they become 16 or 24 bytes on 64-bit systems?)

  • A future release of Perl 5 is unlikely to reduce the size of an SV header?

    Any thoughts on getting a rough value for the lowest and highest address that can be a reference at any given moment in time?

    For the lowest, I've been looking through the source trying to discover when the various system globals come into being. The thought is that maybe, say, $^X or $^O would be allocated pretty early, and addresses below that are unlikely to come up in user data structures, leastwise not as self references. Suggestions for the best choice?

    For the top end, I've been thinking along the lines that any data structure I am going to traverse already exists, so if I could force the allocation of a new SV, its address would form an upper bound for my circular reference testing.

    The fly in that ointment is that if I simply declare a new variable, I am quite likely to re-use an old one that has gone out of scope.

    Do you know of any way that I could force the allocation of a completely new SV (header)?

    I thought about allocating a largish array, big enough that it would force a new allocation from the OS and then use the address of the highest as my upper bound. I also thought that if the largish array was immediately undef'd and I then pre-allocated my bit-vector, it might get to re-use the same space.

    From what you are saying, and what little I understand about the way Perl allocates memory, the space allocated for a long string is unlikely to re-use space allocated for a large array because they would be allocated from different pools?
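
A minimal sketch, in C, of the address-to-bit mapping the questions above are circling around. Everything here is an assumption for illustration: the names `seen_set`, `seen_init`, `seen_test_and_set` and `seen_free` are hypothetical, and the lower bound, upper bound and stride are simply passed in, since how to obtain them is exactly what is being asked. The stride plays the role of the "divisor" (12 in the 32-bit case): two addresses at least one stride apart always land on different bits, which is why the minimum spacing between SV headers matters.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical "seen" set: one bit per possible header slot in [lo, hi]. */
typedef struct seen_set {
    uintptr_t      lo;      /* assumed lowest address of interest      */
    size_t         stride;  /* assumed minimum spacing between headers */
    size_t         nbits;   /* number of slots covered                 */
    unsigned char *bits;    /* the bit-vector itself                   */
} seen_set;

/* Returns 1 on success, 0 if the bit-vector could not be allocated. */
int seen_init(seen_set *s, uintptr_t lo, uintptr_t hi, size_t stride)
{
    assert(stride > 0 && hi >= lo);
    s->lo     = lo;
    s->stride = stride;
    s->nbits  = (size_t)((hi - lo) / stride) + 1;
    s->bits   = calloc((s->nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
    return s->bits != NULL;
}

/* Returns 1 if addr was already marked, 0 if it is newly marked,
 * and -1 if addr falls outside the range the set was built for. */
int seen_test_and_set(seen_set *s, uintptr_t addr)
{
    if (addr < s->lo)
        return -1;
    size_t i = (size_t)((addr - s->lo) / s->stride);
    if (i >= s->nbits)
        return -1;
    unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));
    int was_set = (s->bits[i / CHAR_BIT] & mask) != 0;
    s->bits[i / CHAR_BIT] |= mask;
    return was_set;
}

void seen_free(seen_set *s)
{
    free(s->bits);
    s->bits = NULL;
}
```

The -1 return is the reason the bounds matter: an address outside the range cannot be recorded, so a bound that is too tight would make the cycle check miss references rather than report them wrongly.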


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
    Re^3: Circular reference testing.
    by Anonymous Monk on Mar 09, 2005 at 09:32 UTC
      If we're on a 64-bit machine, the SV headers are always more than 12 bytes, so using 12 as my divisor is safe?

      (Do they become 16 or 24 bytes on 64-bit systems?)

      SVs (and their friends AVs, HVs, etc) are defined in sv.h. They look like:

      struct STRUCT_SV {      /* struct sv { */
          void* sv_any;       /* pointer to something */
          U32   sv_refcnt;    /* how many references to us */
          U32   sv_flags;     /* what we are */
      };

      (AVs, HVs, etc, look the same, except they have a differently typed pointer as the first member of the struct). So their size is 8 bytes (assuming 8 bit bytes), plus the size of a pointer. On a system with 64 bits of addressable memory, pointers will be 8 bytes, giving a struct size of 16 bytes.
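
The arithmetic can be checked on any given machine with a few lines of C. This is only a sketch: `sketch_sv` is a stand-in that mirrors the layout quoted above, with `uint32_t` assumed in place of perl's `U32`, and it is not perl's actual header. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the SV head shown above (assumption: U32 == uint32_t). */
typedef struct sketch_sv {
    void    *sv_any;     /* pointer to something       */
    uint32_t sv_refcnt;  /* how many references to us  */
    uint32_t sv_flags;   /* what we are                */
} sketch_sv;

/* Size the compiler actually gives the struct on this platform. */
size_t sketch_sv_size(void)
{
    return sizeof(sketch_sv);
}

/* Size predicted by "8 bytes plus the size of a pointer". */
size_t sketch_sv_expected(void)
{
    return sizeof(void *) + 2 * sizeof(uint32_t);
}
```

On the usual 32-bit and 64-bit ABIs the two functions should agree (12 and 16 bytes respectively), because the pointer comes first and the two 32-bit members leave no room for padding.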

      As for your other memory related questions, I think you have to dive into malloc.c in the Perl source to get your answer. I looked in the file, and it looks like it's written by someone not familiar with the rest of the perl source code: it actually has large comment sections.

        Thanks for the info on the 64-bit headers.

        With respect to malloc.c: I've been there before, but the picture gets a little confused if you are not able to use Perl's malloc(), as is the case if you want USE_IMP_SYS, which is a prerequisite for USE_MULTI, which is a prerequisite for both threads and fork. That basically translates into: if you want threads, you have to use the CRT malloc (on win32 at least).

        That gets further confused by the presence of the routines in win32\vmem.h, which use the CRT memory routines, but then wrap other stuff around them which I won't pretend to understand.

        For my lower bound, I'm still looking to find a system global that gets allocated early and is unlikely to change.

        For the upper bound, I'm still looking at the idea of allocating a largish array and then hoping that undefing it and immediately allocating my bitstring will re-use the space.

        Earlier I was afraid that an array and a string would be allocated from different pools, but I now remember that Perl's arrays are not linked lists, but a contiguous block of SV pointers. So, in theory, once the array is freed, its space would be eligible for reuse by the string, but there is a lot of supposition in that. I was looking for confirmation/dissuasion.


          For the upper bound, I'm still looking at the idea of allocating a largish array and then hoping that undefing it and immediately allocating my bitstring will re-use the space.

          That should be easy to try and find out? Devel::Peek will tell you the memory locations.