For instance, removing everything not in the hot path for your script:
while( @fhs ) { printf "\r%d\t", scalar( @fhs ); my $r = int rand @fhs; 57 unless( length( $bufs[ $r ] ) ); my $nextRec = substr( $bufs[ $r ], -16, 16, '' ); }
Perl compiles it into 40 OPs, 35 inside the loop:
My estimation is that every OP performs between 50 and 100 machine instructions with a good proportion of performance-unfriendly conditional branches. So, roughly, every run of the loop above would require at least 2000 instructions and may be reaching the capacity of the L1 code cache.$ perl -MO=Concise,-exec buk.pl 1 <0> enter 2 <;> nextstate(main 5 buk.pl:1) v:{ 3 <{> enterloop(next->z last->13 redo->4) v 10 <#> gv[*fhs] s 11 <1> rv2av[t2] sK/1 12 <|> and(other->4) vK/1 4 <;> nextstate(main 1 buk.pl:2) v 5 <0> pushmark sM 6 <$> const[PV "\r%d\t"] sM 7 <#> gv[*fhs] s 8 <1> rv2av[t5] sK/1 9 <@> prtf vK a <;> nextstate(main 1 buk.pl:3) v b <#> gv[*fhs] s c <1> rv2av[t8] sK/1 d <1> rand[t9] sK/1 e <1> int[t10] sK/1 f <0> padsv[$r:1,3] sRM*/LVINTRO g <2> sassign vKS/2 h <;> nextstate(main 2 buk.pl:4) v:{ i <#> gv[*bufs] s j <1> rv2av sKR/1 k <0> padsv[$r:1,3] s l <2> aelem sK/2 m <1> length[t12] sKP/1 n <|> or(other->o) vK/1 o <;> nextstate(main 2 buk.pl:5) v:{ p <#> gv[*bufs] s q <1> rv2av sKR/1 r <0> padsv[$r:1,3] s s <2> aelem sKM/2 t <$> const[IV -16] s/FOLD u <$> const[IV 16] s v <$> const[PV ""] s w <@> substr[t16] sK/4 x <0> padsv[$nextRec:2,3] sRM*/LVINTRO y <2> sassign vKS/2 z <0> unstack v goto 10 13 <2> leaveloop vKP/2 14 <@> leave[1 ref] vKP/REFC
On the other hand, an N-way merge in C, on the hot path of the inner loop, requires just to perform a couple of operations plus the key comparisons log2N times. That may sum up to 50 instructions for every loop cycle (=sorted record)!
update: Oops, I forgot to say, in theory!
1 hour to sort 100GB, actually, seems too fast to me. But I think the analysis is sound, and unless I am overlooking some hidden bottleneck, the order or magnitude should be right!
In reply to Re^5: Can you improve upon my algorithm.
by salva
in thread Can you improve upon my algorithm.
by BrowserUk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |