in reply to Re: 2GB limit to vecs
in thread 2GB limit to vecs

Did you? It appears to me that Bit::Vector's new() takes the number of bits as an argument of type N_int, which is defined as "unsigned int". That is likely 32 bits, which means at most 2**32 bits, or 2**32/8 bytes, or 512MB: not even 2GB.
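
For anyone who wants to check that arithmetic, here is a minimal sketch. The 32-bit width is an assumption about what "unsigned int" is on a given platform, not something read out of Bit::Vector itself, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: N_int is an "unsigned int" that is 32 bits wide here.
my $int_bits  = 32;
my $max_bits  = 2**$int_bits;      # number of distinct bit counts a 32-bit value can hold
my $max_bytes = $max_bits / 8;     # 8 bits per byte
my $max_mb    = $max_bytes / 2**20;

print "$max_bytes bytes = $max_mb MB\n";
```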

I tried Bit::Vector but ended up killing it before it finished allocating what appears to be 512MB of memory.

- tye        

Re^3: 2GB limit to vecs (Bit::Vector?)
by samtregar (Abbot) on Jun 23, 2008 at 17:10 UTC
    Too bad it didn't work. I think at the very least you'll need to be on a 64-bit platform where an int is 64 bits. That might let even vec() work since Perl's "32 bit" type might end up being 64 bits anyway.

    I suppose the obvious fix here is to split your data into chunks and make access a two step process - pick the right chunk and then vec() into it.
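
A rough sketch of that two-step access might look like the following. The chunk size and the names set_bit, get_bit and @chunks are hypothetical, picked only for illustration; the one requirement is that every offset handed to vec() stays far below whatever limit vec() has on your build.

```perl
use strict;
use warnings;

# Hypothetical chunk size, in bits. Any value that keeps the
# per-chunk offset small enough for vec() would do.
my $CHUNK_BITS = 2**20;

my @chunks;    # each element is a string used as a bit vector

# Split a global bit position into (chunk index, offset within chunk).
# Subtraction is used rather than %, so positions that only fit in a
# floating point value are still handled exactly (up to 2**53).
sub locate {
    my ($pos)  = @_;
    my $chunk  = int($pos / $CHUNK_BITS);
    my $offset = $pos - $chunk * $CHUNK_BITS;
    return ($chunk, $offset);
}

# Step 1: pick the right chunk. Step 2: vec() into it.
sub set_bit {
    my ($pos, $value) = @_;
    my ($chunk, $offset) = locate($pos);
    $chunks[$chunk] = '' unless defined $chunks[$chunk];
    vec($chunks[$chunk], $offset, 1) = $value;
    return;
}

sub get_bit {
    my ($pos) = @_;
    my ($chunk, $offset) = locate($pos);
    return 0 unless defined $chunks[$chunk];    # untouched chunk: all zeros
    return vec($chunks[$chunk], $offset, 1);
}

# Usage: a bit position beyond 2**32.
set_bit(5_000_000_000, 1);
print get_bit(5_000_000_000), "\n";
```

Chunks are only created when a bit in them is first set, so sparse data does not pay for the whole range up front.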

    -sam

      > That might let even vec() work since Perl's "32 bit" type might end up being 64 bits anyway.

      Seems the I32 type will still be 32 bits even if the int is 64 bits. (almut mentioned this earlier in this thread - though, admittedly, I don't know if it's actually *proven* to be so.)

      > I suppose the obvious fix here is to split your data into chunks and make access a two step process

      Yes, I think that should work. In this particular case it would only be necessary to increase the number of chunks to 2.

      The OP's problem could also be handled with Math::GMP or Math::GMPz, since the GMP library measures the bit position with an unsigned long, thus allowing a maximum bit position of 2 ** 32 - 1 (for 32-bit longs). The tricky part is getting the string into the GMP object without doubling the program's memory requirement. The GMP library does have functions that read from and write to a file directly, thus, I assume, avoiding that "doubling up" problem. Math::GMP doesn't provide access to those functions, but Math::GMPz-0.26 does (in the form of Rmpz_out_raw and Rmpz_inp_raw).

      Note: I'm not actually recommending that Math::GMPz-0.26 (which I submitted to CPAN just last night) be the solution that the OP ought to adopt. I'm just pointing out that it could be used. (Earlier versions of Math::GMPz don't provide Rmpz_out_raw and Rmpz_inp_raw .... due to oversight :-)

      Cheers,
      Rob