Re: 2GB limit to vecs
by almut (Canon) on Jun 20, 2008 at 07:14 UTC
The prototype of the respective function (doop.c, line 741) suggests the
offset is in fact 32-bit:
UV Perl_do_vecget(pTHX_ SV *sv, I32 offset, I32 size)
at least if the type I32 is always 32 bits wide, which it seems to be, as
it turned out in a recent thread.
(As it's a signed int, the largest positive value is 2**31-1, or 2,147,483,647.)
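To put that ceiling in terms of string size, here is a minimal sketch of the arithmetic. It assumes the I32 offset is the only limit in play and that the offset counts elements of BITS width, as documented for vec. The helper name max_vec_bytes is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: number of bytes reachable through a signed 32-bit
# element offset (0 .. 2**31-1) for a given element width in bits.
# Assumes the offset is the only limiting factor.
sub max_vec_bytes {
    my ($bits) = @_;
    return ( 2**31 * $bits ) / 8;
}

# vec accepts power-of-two widths from 1 to 32 (64 on some builds).
for my $bits ( 1, 2, 4, 8, 16, 32 ) {
    printf "%2d-bit elements: %.0f bytes reachable\n",
        $bits, max_vec_bytes($bits);
}
```

With 8-bit elements this works out to 2**31 bytes, which is the 2GB figure in the thread title; narrower elements hit the ceiling with a proportionally smaller string.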
Re: 2GB limit to vecs
by salva (Canon) on Jun 20, 2008 at 07:32 UTC
Perl_do_vecget(pTHX_ SV *sv, I32 offset, I32 size)
I32 is Perl's type alias for 32-bit signed integers.
I don't know whether there is any reason to use I32 for indexes instead of IVs (native ints or bigger). Vectors are not the only data structure with that limitation: the I32 type is used pervasively inside the perl source for index values. For instance, arrays and substr also use 32-bit indexes.
The workaround is to write your own XS vector module with support for 64-bit indexes. Either way, file a bug report with perlbug, and it may be fixed for 5.12!
Eliminating all the I32 indexes in the interpreter would break binary compatibility, so it is very unlikely to happen on the 5.10 branch. Fixing just vec, though, might be possible.
Re: 2GB limit to vecs
by BrowserUk (Patriarch) on Jun 20, 2008 at 10:03 UTC
Until a better version of vec is available, you might try something like this:
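sub myvec(\$$$) :lvalue {
    use constant TWO_GB => 2**31;
    my( $ref, $offset, $bits ) = @_;
    # Offset beyond the signed 32-bit range: step past the first 2**31 elements
    if( $offset > TWO_GB - 1 ) {
        $offset -= TWO_GB;
        # Reference an lvalue substr rather than copying the huge string
        $ref = \substr $$ref, ( TWO_GB * $bits ) / 8;
    }
    CORE::vec( $$ref, $offset, $bits );
}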
This should be reasonably efficient, as it avoids copying the huge string. If you wanted to get fancy in anticipation of the fixed version, you could stick it in a module and export it as CORE::GLOBAL::vec.
The above is untested at the transition limit as I don't have enough memory to create strings that big. You might want to look closely at that TWO_GB - 1...
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Dear Monks,
thank you for all your help! I used the same workaround (working with different chunks), albeit nowhere near as elegantly as that. I will use this from now on.
(I am on a 64-bit machine. I had just assumed that with a 64-bit build of perl everything would run on 64 bits.)
thank you,
bop
sub myvec(\$$$) :lvalue {
    use constant TWO_GB => 2**31;
    my( $ref, $offset, $bits ) = @_;
    # Repeat until the offset fits below 2**31
    while( $offset > TWO_GB - 1 ) {
        $offset -= TWO_GB;
        # Each pass steps the reference past another 2**31 elements
        $ref = \substr $$ref, ( TWO_GB * $bits ) / 8;
    }
    CORE::vec( $$ref, $offset, $bits );
}
but again that's untested, so check it and convince yourself.
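One way to check the rebasing arithmetic without a multi-gigabyte string is to run the same loop against a small stand-in limit. The sketch below is read-only and rests on assumptions: the helpers rebase_offset and rebased_vec_get are hypothetical, and a limit of 8 elements stands in for 2**31. It does not exercise the lvalue substr reference or the real 2GB boundary, so it only confirms that the offset and byte-skip bookkeeping agree with a direct vec call.

```perl
use strict;
use warnings;

# Hypothetical helper: split an element offset into a byte count to skip
# and a remaining offset below $limit. $limit stands in for 2**31 and
# must be a multiple of 8 so the byte count stays whole for every width.
sub rebase_offset {
    my ( $offset, $bits, $limit ) = @_;
    my $skip_bytes = 0;
    while ( $offset > $limit - 1 ) {
        $offset     -= $limit;
        $skip_bytes += ( $limit * $bits ) / 8;
    }
    return ( $skip_bytes, $offset );
}

# Read-only lookup through the rebased offset (copies the tail of the
# string, which is fine for a small test but not for real data).
sub rebased_vec_get {
    my ( $string, $offset, $bits, $limit ) = @_;
    my ( $skip_bytes, $rest ) = rebase_offset( $offset, $bits, $limit );
    return vec( substr( $string, $skip_bytes ), $rest, $bits );
}

# 64 bytes of known data, compared element by element against plain vec.
my $string = pack 'C*', 0 .. 63;
for my $bits ( 1, 2, 4, 8, 16, 32 ) {
    my $elements = 64 * 8 / $bits;
    for my $offset ( 0 .. $elements - 1 ) {
        my $want = vec( $string, $offset, $bits );
        my $got  = rebased_vec_get( $string, $offset, $bits, 8 );
        die "mismatch at offset $offset, width $bits\n"
            unless $got == $want;
    }
}
print "rebased lookups agree with vec\n";
```

If the loop dies, the skip or offset calculation is off; if it passes, the remaining risk lies in the behaviour at the real boundary, which still needs a machine with enough memory.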
Re: 2GB limit to vecs
by samtregar (Abbot) on Jun 22, 2008 at 18:01 UTC
Did you? It appears to me that Bit::Vector's new() takes the number of bits as an argument of type N_int, which is defined as "unsigned int" and is likely 32 bits wide. That allows at most 2**32 bits, or 2**32/8 bytes, which is 512MB, not even 2GB.
I went to try Bit::Vector but ended up killing it before it finished allocating the (it appears) 512MB of memory.
Re: 2GB limit to vecs
by salva (Canon) on Jun 26, 2008 at 13:46 UTC