http://qs1969.pair.com?node_id=11143548


in reply to Re^5: Reduce RAM required
in thread Reduce RAM required

Is it "wrapping in-place"? Even if, because of COW, no data is moved in memory with this assignment:

${ $pdl->get_dataref } = $scalar_eg_5_Gb_long;

it still requires a pre-existing 5 Gb ndarray, zero- or garbage-filled, or am I wrong? At some point during execution, RAM usage would peak at 10 Gb.
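
A minimal sketch of the pattern in question, scaled down to 16 bytes so it can be run anywhere. The variable names and the byte type are assumptions made for illustration; the point is only that the ndarray has to be created at full size before the assignment can happen, so both buffers exist at once.

```perl
use strict;
use warnings;
use PDL;

# Stand-in for the large scalar: 16 bytes here instead of 5 Gb.
my $scalar = pack 'C*', 0 .. 15;

# The ndarray must already exist at its full size, so at this point
# both its buffer and $scalar are allocated.
my $pdl = zeroes(byte, length($scalar));

# Assign the scalar's bytes to the SV that holds the ndarray's data,
# then tell PDL that the underlying data has changed.
${ $pdl->get_dataref } = $scalar;
$pdl->upd_data;

print $pdl, "\n";
```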

Re^7: Reduce RAM required
by etj (Deacon) on May 03, 2022 at 21:16 UTC
    That's a great point! Currently, there would indeed be an instant where both the ndarray and the input SV need to have the full amount allocated, in part because get_dataref physicalises its ndarray (i.e. allocates its memory). COW semantics do open a bit of a can of worms, because PDL currently assumes it has full ownership of the block of memory pointed at by the PV. I am just going to hope that it all works fine.

    PDL's File::Map-using code (via PDL::IO::FastRaw) is an alternative approach, which would avoid the use of RAM entirely.

    That this is a long-standing issue suggests to me there isn't a huge demand for it. However, I am very open to adding a PDL->from_sv method that does what your code does (and would also set the datasv member to the passed-in SV rather than use a char * and SvREFCNT_inc it). It would also need to deal with the COW situation correctly, which I don't know how to do. Would that help? I think it would actually provide a more generalised implementation of the File::Map stuff in any case.