Theoretically, but given that you'll never be able to allocate a single string >2**31 on a 32-bit machine, it seems to be overkill.
I was not aware of that. What about other machines? Does it generalize to "the high/sign bit of IV is never used for valid string lengths"? If so, the code can definitely be simplified.
I've been searching. Do you have any documentation on that? STRLEN is Size_t, which is size_t in my build, and size_t is (usually? always? in my case) an unsigned type.
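On the "usually? always?" question: the C standard defines size_t as an unsigned integer type, so that part does not vary by build. Whether the top bit is ever set for a real string length is a separate matter. A minimal sketch of both checks follows; the helper names are made up for illustration and are not part of the perl API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* size_t is unsigned by definition in C, so -1 converts to SIZE_MAX,
   which compares greater than zero. */
static int size_t_is_unsigned(void)
{
    return (size_t)-1 > 0;
}

/* Hypothetical helper: returns 1 if the top (sign) bit of a size_t
   length is clear, i.e. the value would survive a cast to a signed
   integer of the same width. */
static int len_fits_signed(size_t len)
{
    return len <= (SIZE_MAX >> 1);
}
```

If every valid length passes len_fits_signed(), the high bit is free and the simplification discussed above is safe on that platform.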
Do you have any documentation on that?
Across all OSs and hardware, no. I cannot give that guarantee.
However, even if there are 32-bit hardware/OS combinations that allow the allocation of a single contiguous entity of greater than 2GB, I don't believe that perl memory allocation routines would allow it because of the math that is done in a macro (something like MEMORY_WRAP_CHECK(*) or similar). That's from memory [sic], subject to my having interpreted the code correctly; and could have changed subsequent to my last looking at it.
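For anyone unfamiliar with that kind of check, here is a generic sketch of the sort of arithmetic a wrap-check performs before an allocation. This is not the perl source; the function name is hypothetical and the exact limit perl enforces should be confirmed against the real macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a wrap check, not the actual perl macro.
   Returns 1 if count * elem_size would exceed SIZE_MAX and therefore
   wrap around, 0 otherwise. The division avoids performing the
   multiplication that might overflow. */
static int alloc_would_wrap(size_t count, size_t elem_size)
{
    assert(elem_size != 0);
    return count > SIZE_MAX / elem_size;
}
```

An allocator would call something like this first and refuse the request (perl croaks) rather than hand back a buffer smaller than the caller believes it to be.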
What I can say is that normally,
- Win32 only allows user-space processes access to 2GB of RAM;
- Linux (without kernel patches; circa 2.4.23) only allows 1GB. With patches this can be extended to 2GB.
There are two methods (for either OS on x86) for extending this reach.
- /LARGEADDRESSAWARE & /3GB (called ZONE_HIGHMEM on Linux, I believe!).
In my experiments with this on my old machine, whilst I could allocate up to 3GB per process, I could not allocate any single entity greater than 2GB.
The way LAA works is that it maps chunks of the memory above the 2GB limit through a "window" in the process' normal address space. Hence, no single allocation greater than 2GB (or even close to it, once you take the process' normal code, data and stack requirements into consideration) is possible.
- Physical Address Extension (PAE).
This works by mapping multiple physical addresses, in the 36-bit physical address space, into a single window within the process' 32-bit address space. Again, this has to be mapped within the process' 2GB (1GB on Linux) user space limits, so no single entity of greater than 2GB is possible.
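To put numbers on those address widths, the arithmetic can be sketched as below. The helper name is made up for illustration; it only computes powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes addressable with the given
   number of address bits. Only valid for bits < 64. */
static uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}
```

So 36 physical address bits cover 64GB, a 32-bit pointer covers 4GB, and a 31-bit user-space limit is the 2GB figure discussed above. However much physical memory sits behind the window, a single contiguous entity still has to fit inside what the pointer can reach.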
I admit that this doesn't cover off all the exotica (hardware and software) where Perl can run, but I would very much opt for the pragmatic solution: don't slow down the commonplace case in order to cater for the potential of unknown exotica.
That is, I would code for the assumption that no single string can be greater than 2GB and allow those porting to such exotica to handle the case of >2GB if, as, and when the need arises.
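As a rough sketch of what coding for that assumption might look like: one cheap comparison guards the common path, and anything larger is reported to the caller to deal with. The names and the 2GB - 1 limit here are my assumptions for illustration, not anything taken from the perl source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ceiling on a single string: 2GB - 1 bytes. */
#define MAX_ASSUMED_LEN ((size_t)INT32_MAX)

/* Hypothetical helper: narrows an unsigned length to a signed 32-bit
   value. Returns 1 and stores the result in *out when the assumption
   holds; returns 0 and leaves *out untouched when it does not. */
static int len_to_i32(size_t len, int32_t *out)
{
    if (len > MAX_ASSUMED_LEN)
        return 0;
    *out = (int32_t)len;
    return 1;
}
```

Anyone porting to a platform where larger strings really are possible then has a single, obvious place to extend.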
References: win32 & Linux
(*)Update: MEM_WRAP_CHECK(), MEM_WRAP_CHECK_1(), MEM_WRAP_CHECK_2() etc.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.