in reply to Fast(er) serialization in Perl

What do the keys and values look like?

With "millions of keys" chances are high that they are highly uniform.

That means you might be able to transform it into an array or array-like structure.

You might even be able to store/load this array with the help of pack/unpack, which would speed things up significantly.
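A minimal sketch of that idea, assuming the values are small integers (the value names here are made up for illustration): instead of serializing millions of individual hash entries, pack the whole value array into one binary string that can be written and read in a single I/O operation.

```perl
use strict;
use warnings;

# Hypothetical example values (small integer counts).
my @values = (1, 3, 1, 2, 1);

# Pack all values into one binary string ('N*' = 32-bit big-endian unsigned).
my $packed = pack('N*', @values);

# ... write $packed to a file with one print, slurp it back later ...

# Unpack restores the whole array at once -- far cheaper than
# deserializing each entry separately.
my @restored = unpack('N*', $packed);
```

The win comes from doing one big memory copy rather than per-key bookkeeping, at the cost of giving up arbitrary value types.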

If it's "web based", does that mean it's a CGI? You might want to consider FastCGI or mod_perl to keep the data in memory between different queries.

Replies are listed 'Best First'.
Re^2: Fast(er) serialization in Perl
by mrguy123 (Hermit) on Apr 11, 2010 at 12:23 UTC
    Thanks
    Because most of the logic of the program is based on the hashes (and I'm not sure I want to change that just yet), I want to keep the hashes functional, so I'm not sure I can use the array option.
    Regarding the web part: because of University guidelines, the web based part is actually in PHP :(.
    However, it is not the bottleneck that needs fixing most urgently.
    The question is: can I keep my giant hashes and still save time (mostly on loading them)?
      >I want to keep the hashes functional

      as I told you further down, you can use Tie::Hash to keep the interface functional.

      IMHO there are no general solutions faster than Storable! *

      You need to provide more info.

      If your PHP is calling your Perl script, you should check whether you can hold the data structure in memory.

      You may also check that you're not running into RAM problems causing massive swapping.

      I once sped up a program just by transforming a huge hash into a hash of hashes (by halving the keys). Since the system only retrieved the sub-hashes actually needed from disk swap, I got a fantastic speed gain.
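      A rough sketch of that key-halving idea, using made-up accession keys in the style shown later in this thread: each long key is split into a prefix and a suffix, so a lookup only touches the (much smaller) sub-hash for that prefix.

```perl
use strict;
use warnings;

# Hypothetical flat hash with long keys.
my %flat = ('NM_009309' => 1, 'NM_133983' => 1, 'NM_175563' => 1);

# Split each key into prefix + suffix to build a hash of hashes.
my %nested;
for my $key (keys %flat) {
    my $prefix = substr($key, 0, 6);   # e.g. 'NM_009'
    my $suffix = substr($key, 6);      # e.g. '309'
    $nested{$prefix}{$suffix} = $flat{$key};
}

# A lookup now goes through two small hashes instead of one huge one.
my $val = $nested{'NM_009'}{'309'};
```

Whether this helps depends on how uniformly the keys split and on the memory access pattern; the gain in the anecdote above came from only paging in the sub-hashes that were actually used.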

      Whether this is transferable to your case is unknown, since you don't provide enough info...

      Footnote: (*)

      from Storable

      "SPEED
      The heart of Storable is written in C for decent speed. Extra low-level optimizations have been made when manipulating perl internals, to sacrifice encapsulation for the benefit of greater speed."

      IMHO it's evident that you need to invest some brainpower to achieve further speed gains!
        This is how the hash basically looks (it goes on for about 6 million lines (genes)):
        $VAR1 = { 'microT' => { 'mmu-miR-704' => { 'NM_009309' => '1', 'NM_133983' => '1', 'NM_175563' => '1', 'NM_010889' => '1', 'NM_008302' => '1', 'NM_022023' => '1', 'NM_009567' => '1', 'NM_172938' => '1', 'NM_029777' => '3', 'NM_134189' => '1', 'NM_175025' => '1', 'NM_177327' => '1', 'NM_026807' => '1', 'NM_178779' => '3', 'NM_010770' => '1', 'NM_031998' => '1', 'NM_145584' => '2', 'NM_207682' => '1', 'NM_001005525' => '1', 'NM_080853' => '1', 'NM_145519' => '1', 'NM_031249' => '1', 'NM_172923' => '1', 'NM_001008700' => '1', 'NM_198617' => '1', 'NM_027400' => '1', 'NM_026406' => '2', 'NM_021296' => '2', 'NM_027652' => '1', 'NM_001045530' => '1', 'NM_018830' => '1', 'NM_025314' => '1', 'NM_009041' => '1', 'NM_026829' => '3', 'NM_026618' => '1', 'NM_027472' => '1', 'NM_027870' => '1', 'NM_001033239' => '1', 'NM_026348' => '1', 'NM_008223' => '1', 'NM_009595' => '2', 'NM_146094' => '1', 'NM_144945' => '1', 'NM_019510' => '1', 'NM_001033251' => '1', 'NM_001081213' => '3', 'NM_008031' => '1', 'NM_028719' => '1', 'NM_133352' => '1', 'NM_008133' => '1', 'NM_008317' => '1', 'NM_021327' => '1', 'NM_178751' => '1', 'NM_010260' => '1', 'NM_025683' => '1', 'NM_026383' => '1', 'NM_001081367' => '1', 'NM_001033354' => '2', 'NM_026034' => '1', 'NM_173395' => '1', 'NM_010762' => '1', 'NM_024432' => '1', 'NM_175113' => '1', 'NM_001077425' => '1', 'NM_026374' => '1', 'NM_026655' => '1', 'NM_177345' => '1', 'NM_027412' => '1', 'NM_183187' => '1', 'NM_016687' => '1', 'NM_175640' => '1', 'NM_007559' => '1', 'NM_011269' => '1', 'NM_010252' => '1', 'NM_019657' => '1',
        I'm not very familiar with Tie::Hash, so if you could give me a quick heads-up on how to use it to save time, that would be great
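        A minimal sketch of what the tie suggestion above might look like (the class name and keys are invented for illustration): by subclassing Tie::StdHash you keep the normal `$h{key}` syntax everywhere in your program, while the overridden FETCH is free to pull values from a packed array, a disk file, or whatever faster backing store you move to.

```perl
package LazyHash;
use strict;
use warnings;
use Tie::Hash;                 # provides Tie::StdHash
our @ISA = ('Tie::StdHash');

# Override FETCH: this is where custom retrieval logic (unpacking,
# on-demand loading, etc.) would go. Here it just delegates.
sub FETCH {
    my ($self, $key) = @_;
    # ... fetch/decode the value on demand here instead of up front ...
    return $self->{$key};
}

package main;
tie my %h, 'LazyHash';
$h{'NM_009309'} = 1;           # STORE handled by Tie::StdHash
my $v = $h{'NM_009309'};       # goes through our FETCH
```

The point is that the callers never change: all the hash-based logic in the program keeps working while the storage strategy behind it is swapped out.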
      the web based part is actually in PHP

      you should benchmark whether Storable is really your bottleneck; starting a non-persistent Perl process takes some time ...
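      One way to check, sketched with a tiny demo hash and a made-up file name: time the retrieve() call by itself with Time::HiRes, so Storable's share of the startup cost is separated from process launch and everything else.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use Storable qw(store retrieve);

# Hypothetical demo data; in practice this would be the real giant hash.
my %demo = (a => 1, b => 2);
store(\%demo, 'demo.storable');

# Time only the deserialization step.
my $t0      = [gettimeofday];
my $href    = retrieve('demo.storable');
my $elapsed = tv_interval($t0);
printf "retrieve took %.4f s\n", $elapsed;

unlink 'demo.storable';
```

If the retrieve() time dominates the total, serialization is the right thing to attack; if not, the fix lies elsewhere (process startup, swapping, ...).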

        The retrieval of the hash takes 12 seconds (out of less than 20 secs overall) so if I can take that number down a bit, I'm happy