in reply to Re^5: Unpacking and converting
in thread Unpacking and converting

My assumption is that the data initially comes from text: a file or some other external source.

Thus, even though it is numeric data, when it is split into an array, each SV in the array is an SvPV with the numbers stored as ASCII strings.
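
You can see this with the core Devel::Peek module; a minimal sketch (the sample input is made up):

    use strict;
    use warnings;
    use Devel::Peek qw(Dump);

    # Hypothetical input line; after split, each element is a string SV.
    my @fields = split ' ', "17 42 99";

    # Dump shows FLAGS = (POK,pPOK) and a PV slot holding "17" --
    # the number exists only as ASCII text at this point.
    Dump( $fields[0] );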

But if he runs over the array, forcing the ASCII to be converted and stored in the IV or NV slot, then when Storable packs it, it packs only the IV/NV and not the PV, and so the Storable image size is reduced.
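
A minimal sketch of that effect, using made-up sample data (the exact savings will depend on his real data):

    use strict;
    use warnings;
    use Storable qw(freeze);

    # Simulate numbers arriving as text and being split into an array.
    my @data = split ' ', join ' ', 1 .. 10_000;

    my $as_strings = freeze( \@data );

    # Numify in place: the assignment replaces each PV with an IV,
    # so Storable serialises integers rather than strings.
    $_ += 0 for @data;

    my $as_numbers = freeze( \@data );

    printf "strings: %d bytes, numbers: %d bytes\n",
        length($as_strings), length($as_numbers);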

Of course, if the size of the transmission is really significant, then transmitting the original source text would probably be more efficient, especially if he zipped it first.
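
For example, the raw text could be compressed with the core IO::Compress::Gzip module before transmission; a sketch, with a hypothetical file name:

    use strict;
    use warnings;
    use IO::Compress::Gzip qw(gzip $GzipError);

    # Slurp the original report text (hypothetical file name).
    my $text = do {
        open my $fh, '<', 'report.txt' or die $!;
        local $/;
        <$fh>;
    };

    gzip \$text => \my $gzipped
        or die "gzip failed: $GzipError";

    printf "raw: %d bytes, gzipped: %d bytes\n",
        length($text), length($gzipped);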


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^7: Unpacking and converting
by ikegami (Patriarch) on Feb 16, 2011 at 17:35 UTC
    Yeah, I understood that. That's why I made the same suggestion (minus zipping). Zipping is a good idea, though: it would slow down pre-processing, but it might speed things up overall by reducing the number of packets to send.

      Zipping is crucial, and it really solves the problem. I have already tested it with Net::OpenSSH, and it works beautifully with the default SSH compression. The gzip algorithm turned out to be fiendishly clever with this report-like data: it was able to shrink a 26 MB sample report down to about 640 KB. That's what I call efficiency.
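
      For anyone curious, enabling that looks something like this (host and paths are placeholders):

          use strict;
          use warnings;
          use Net::OpenSSH;

          # Placeholder host; Compression=yes is passed straight to OpenSSH.
          my $ssh = Net::OpenSSH->new(
              'reports.example.com',
              master_opts => [ -o => 'Compression=yes' ],
          );
          $ssh->error and die "Connection failed: " . $ssh->error;

          # Push the report over the compressed connection (placeholder paths).
          $ssh->scp_put( 'report.txt', '/remote/incoming/report.txt' )
              or die "scp failed: " . $ssh->error;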

      The theoretical maximum amount of data the system could emit is approx. 486 MB per second; compressed down to ~5%, that would take only ~25 MB. A 1 Gbit/s link can reliably transfer about 100 MB per second, leaving decent spare capacity. Processing all this stuff is a completely different question altogether, but that side I can control.
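
      Spelled out: 486 MB/s at ~5% compressed size is about 24 MB/s, against roughly 100 MB/s of usable link capacity, i.e. about 4x headroom. A trivial sanity check:

          use strict;
          use warnings;

          my $peak       = 486;           # MB/s, theoretical maximum output
          my $compressed = $peak * 0.05;  # ~24.3 MB/s at ~5% compressed size
          my $link       = 100;           # MB/s, realistic 1 Gbit/s throughput

          printf "%.1f MB/s over a %d MB/s link: %.1fx headroom\n",
              $compressed, $link, $link / $compressed;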

      Thanks for the input anyway!

      Regards,
      Alex.