In reply to Memory efficient way to deal with really large arrays?

> Now in C, to do this I only need 5GB of RAM: For the pairs: 500M * 4 (32-bit int) * 2 (pairs) = 4GB. For the occurrences: 1G * 1 (8-bit int) = 1GB

That's an illusion: if you index a C array of bytes with a 32-bit int, there are 2**32 potential slots, i.e. 4,294,967,296 bytes ≈ 4.3GB for the occurrence counts in @d, not 1GB.

But if you think it's that easy in C, you can easily reproduce its arrays with pack, substr and unpack in Perl.

It will of course be slower, but you haven't told us yet what your goal is.
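
A minimal sketch of that emulation (all names and sizes here are illustrative, not from the OP's code): a Perl string serves as the flat C-style array, pack/unpack convert between numbers and bytes, and substr does the indexing.

    use strict;
    use warnings;

    # The pair array: two 32-bit unsigned ints per element,
    # stored back-to-back in one string (8 bytes per pair).
    my $pairs = '';
    $pairs .= pack 'L2', 123, 456;                        # append pair #0
    my ($x, $y) = unpack 'L2', substr($pairs, 0 * 8, 8);  # read pair #0

    # The occurrence counters: one byte per slot, as in C.
    my $slots = 1_000_000;    # toy size; 2**32 slots would need ~4.3GB
    my $d = "\0" x $slots;    # all counters start at 0

    # Increment counter $i, saturating at 255 (it is only one byte).
    sub bump {
        my $i = shift;
        my $c = ord substr($d, $i, 1);
        substr($d, $i, 1, chr($c + 1)) if $c < 255;
    }

    bump(42) for 1 .. 3;
    print ord(substr $d, 42, 1), "\n";    # prints 3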

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery


Re^2: Memory efficient way to deal with really large arrays?
by salva (Canon) on Dec 13, 2020 at 18:18 UTC
    That makes me think that one could actually calculate the counters without using any additional memory at all: you only have to sort the array (or arrays) and then traverse it, counting consecutive equal elements.

    ... well, obviously, unless you need to keep the counters in memory for some further processing, but as you have said, the OP hasn't told us yet what his final goal is!
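
    A minimal sketch of that sort-and-count idea, assuming each pair is packed into a fixed-width string ('N2' is big-endian, so plain string order equals numeric order on (x,y)):

        use strict;
        use warnings;

        # Toy data: 20 random pairs, each packed into 8 bytes.
        my @pairs = map { pack 'N2', int rand 4, int rand 4 } 1 .. 20;

        # A plain lexicographic sort puts equal pairs next to each other.
        @pairs = sort @pairs;

        # One pass: count runs of consecutive equal elements.
        my ($prev, $count);
        for my $p (@pairs) {
            if (defined $prev && $p eq $prev) {
                $count++;
            }
            else {
                report($prev, $count) if defined $prev;
                ($prev, $count) = ($p, 1);
            }
        }
        report($prev, $count) if defined $prev;    # flush the last run

        sub report {
            my ($key, $n) = @_;
            my ($x, $y) = unpack 'N2', $key;
            print "($x,$y) seen $n times\n";
        }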

      If you keep the pairs, i.e. 2-dimensional points, how would you want to sort them in one dimension?

      > the OP hasn't told us yet what his final goal is!

      Exactly!

      For instance, if he wanted to process the data sequentially, he could also sort them into HoHs or AoHs, since the OS will swap out unused memory pages.

      That way only the currently processed second-level structure would need to be in RAM, and the performance overhead for swapping would be negligible.
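
      A minimal sketch of that HoH variant (the @input lines are made-up stand-ins for the OP's real data):

          use strict;
          use warnings;

          my @input = ('1 2', '1 2', '3 4');    # stand-in input pairs

          # First-level key = x, second level = y => count. Processing
          # grouped by x keeps only the current second-level hash hot;
          # the cold ones can stay swapped out.
          my %d;
          for my $line (@input) {
              my ($x, $y) = split ' ', $line;
              $d{$x}{$y}++;
          }

          for my $x (sort { $a <=> $b } keys %d) {
              my $row = $d{$x};    # the currently processed second level
              for my $y (sort { $a <=> $b } keys %$row) {
                  print "($x,$y) => $row->{$y}\n";
              }
          }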

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery