In reply to Circular reference testing.

Interesting technique. In Data::Compare I use something very different: I just keep a hash whose keys are stringified representations of everywhere that I have been. Your method wins if you have huge amounts of data, and I might adopt it myself, but I'd be worried that it would break: it looks awfully fragile.
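The seen-hash approach described above can be sketched as follows. This is an illustrative Python translation, not Data::Compare's actual code: `id()` stands in for Perl's stringified reference addresses, and the function name is made up.

```python
def has_cycle(node, path=None):
    """Detect circular references by keeping a hash/set of the
    addresses of every container on the current descent path.
    Revisiting one of them means the structure refers back to
    itself.  Memory cost: one set entry per container on the path."""
    if path is None:
        path = set()
    if isinstance(node, (list, dict)):
        if id(node) in path:                 # been here before: a cycle
            return True
        path.add(id(node))
        kids = node.values() if isinstance(node, dict) else node
        result = any(has_cycle(k, path) for k in kids)
        path.discard(id(node))               # leaving this node's subtree
        return result
    return False

a = [1, 2]
a.append(a)                                  # self-referential structure
print(has_cycle(a))                          # True
print(has_cycle([1, [2, 3]]))                # False
```

Note that the set holds one entry per visited container, which is where the memory cost discussed below comes from.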

Re^2: Circular reference testing.
by BrowserUk (Patriarch) on Mar 09, 2005 at 11:02 UTC

    I'm specifically trying to address the memory consumption of existing solutions when applied to huge data structures. Data::Dumper also uses a hash, and regularly uses upwards of 50% of the memory consumed by the data structure being traversed, for a combination of reference checking and output accumulation. That's okay for small and even medium-sized structures, but it sucks for big stuff.

    This technique reduces the memory required for the checking (without all the extra information for all the other things that Data::Dumper does) to around 1% (one byte of bit-flags represents 96 bytes of data).
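    The arithmetic behind that figure: at 1 bit per 12-byte SV head, each byte of the bitmap covers 8 × 12 = 96 bytes, roughly a 1% overhead. A quick check (the constant is the assumed 32-bit SV head size from this thread):

```python
SV_HEAD_BYTES = 12                        # assumed size of a 32-bit perl SV head
bytes_covered_per_bitmap_byte = SV_HEAD_BYTES * 8   # one bit per head
overhead = 1 / bytes_covered_per_bitmap_byte
print(f"{bytes_covered_per_bitmap_byte} bytes tracked per bitmap byte "
      f"-> {overhead:.2%} overhead")      # 96 bytes ... 1.04% overhead
```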

  • if the code was moved from a 32-bit to a 64-bit machine
  • on different versions of perl, especially 5.0 vs 5.8
  • using the system malloc() vs perl's own malloc()

    It looks awfully fragile.

    These are the very things that I am attempting to find out through this thread. My current belief is that it should be solid across all those variations if I stick to the current formulation of 1-bit : 12-bytes and make no attempt to trim it further by applying the bounds.

    My (current) understanding is that no two references visible to Perl will ever coexist within a given 12-byte space regardless of the build/system/platform/malloc() employed. My understanding is obviously open to correction, but that is certainly the way it looks to me at this point.
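    The scheme that understanding licenses can be sketched like so. This is a Python illustration of the idea only (integer addresses stand in for real SV head addresses, and the class name is invented): since no two SV heads can coexist within one 12-byte slot, each slot needs only a single bit in a flat bit vector.

```python
class AddrBitmap:
    """One bit per 12-byte slot of address space: mark and test
    whether an address (an SV head, in the Perl case) was visited."""
    SLOT = 12                                # assumed SV head size

    def __init__(self, base, span):
        self.base = base                     # lowest address covered
        self.bits = bytearray(span // self.SLOT // 8 + 1)

    def _index(self, addr):
        slot = (addr - self.base) // self.SLOT
        return slot >> 3, 1 << (slot & 7)    # byte offset, bit mask

    def seen(self, addr):
        """Test-and-set: True if this address was visited before."""
        byte, mask = self._index(addr)
        was = bool(self.bits[byte] & mask)
        self.bits[byte] |= mask
        return was

bm = AddrBitmap(base=0x1000, span=12 * 1024)
print(bm.seen(0x1000))   # False -- first visit
print(bm.seen(0x1000))   # True  -- revisit: circular reference
print(bm.seen(0x100C))   # False -- next 12-byte slot, its own bit
```

    The fragility worries above all live in the `SLOT = 12` assumption: a 64-bit build, a different perl version, or a different malloc() could change the head size or alignment.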

    Thorough testing will obviously be necessary, but I think that the speed, simplicity and small footprint are worth pursuing.

    It has crossed my mind that if it works as I think it will, then it could be used internally by Perl to alleviate the "circular references preventing release of memory" problem. At least where these are self-referential rather than co-referential.


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      You are of course not interested in all the memory locations used by perl, just the SVs (AVs, HVs, etc.), because that's where the reference counting is, and if they get cleaned up, they take what's behind them away as well.

      Is it possible to write a piece of XS code that gives you the memory pools used for the SVs? You could then use a bitvector per SV memory pool.

        Indeed. It's just the headers that I need to consider.

        You could then use a bitvector per SV memory pool.

        That would make sense for an internal implementation, but it's way beyond my needs.
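        For what it's worth, the per-pool bitvector idea can be sketched like this. The pool geometry here is entirely hypothetical (a real implementation would discover the SV arenas via XS, as suggested above); the point is just one small bitmap per pool, looked up by pool base address.

```python
POOL_SIZE = 4096          # hypothetical arena size
SLOT = 12                 # assumed SV head size

pools = {}                # pool base address -> its own bit vector

def mark(addr):
    """Test-and-set the visited bit for addr in its pool's bitmap."""
    base = addr - (addr % POOL_SIZE)         # which pool owns addr
    bits = pools.setdefault(base, bytearray(POOL_SIZE // SLOT // 8 + 1))
    slot = (addr - base) // SLOT
    byte, mask = slot >> 3, 1 << (slot & 7)
    was = bool(bits[byte] & mask)
    bits[byte] |= mask
    return was

print(mark(0x2000))   # False -- first visit
print(mark(0x2000))   # True  -- revisit
print(mark(0x7008))   # False -- different pool, its own bitmap
```

        Keying the bitmaps by pool avoids reserving bits for address ranges that hold no SVs at all, at the cost of one hash lookup per mark.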
