in reply to Re^4: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
in thread Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?

Are you sure these speedups are due to Perl's hash improvements, and not improvements in Perl's IO handling? Because the latter would have been my first guess. A more interesting comparison might be to time the script under two modes, one with a simple counter increment and one with the hash addition. The difference between these two running times would be more illuminating, I think.
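A minimal sketch of that two-mode timing follows. The script under discussion is not shown in this post, so the input is a stand-in: a hypothetical temp file of 200,000 short lines, and the names `timed_pass`, `$count` and `%seen` are made up for illustration. Both passes do identical IO; only the per-line work differs.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempfile);

# Hypothetical input: 200_000 short lines in a temp file, standing in
# for whatever data the real script reads.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "key", $_ % 5000, "\n" for 1 .. 200_000;
close $fh or die "close: $!";

# Read every line of the file and call $per_line once per chomped line.
# Returns the elapsed wall-clock time in seconds.
sub timed_pass {
    my ($per_line) = @_;
    open my $in, '<', $path or die "open $path: $!";
    my $start = time;
    while (my $line = <$in>) {
        chomp $line;
        $per_line->($line);
    }
    my $elapsed = time - $start;
    close $in;
    return $elapsed;
}

my $count = 0;
my %seen;

my $t_counter = timed_pass(sub { $count++ });          # IO + loop overhead only
my $t_hash    = timed_pass(sub { $seen{ $_[0] }++ });  # same IO plus a hash update

printf "counter: %.3fs  hash: %.3fs  difference: %.3fs\n",
    $t_counter, $t_hash, $t_hash - $t_counter;
```

Running this under each perl and comparing the "difference" column, rather than the totals, should separate the hash cost from the IO cost. Wall-clock timings are noisy, so it is worth repeating each run a few times.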

Re^6: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
by dave_the_m (Monsignor) on Dec 23, 2016 at 13:59 UTC
    There's a tool in the perl source repository which uses cachegrind behind the scenes to accurately measure how many CPU instructions, data reads, etc. a small snippet of code uses. With the following initial setup (so the hash already exists and has some keys):
    my %h = qw(a 1 b 2 c 3 d 4); my $key = "foo";
    Running the following benchmark (using a non-constant key so the key's hash gets recalculated each time):
    $h{$key} = 1; delete $h{$key}
    Shows the following results on various perls:
    Key:
        Ir     Instruction read
        Dr     Data read
        Dw     Data write
        COND   conditional branches
        IND    indirect branches
        _m     branch predict miss
        _m1    level 1 cache miss
        _mm    last cache (e.g. L3) miss
        -      indeterminate percentage (e.g. 1/0)

    The numbers represent raw counts per loop iteration.

            perl589o perl5101o perl5125o perl5144o perl5163o perl5184o perl5203o perl5222o perl5240o perl5258o
            -------- --------- --------- --------- --------- --------- --------- --------- --------- ---------
    Ir        1348.0    1340.4    1378.0    1383.0    1423.0    1453.0    1466.0    1368.0    1356.0    1300.0
    Dr         414.0     403.0     411.0     404.0     408.0     403.0     411.0     379.0     373.0     362.0
    Dw         226.0     214.0     222.0     227.0     228.0     231.0     231.0     208.0     206.0     196.0
    COND       202.0     210.1     210.0     204.0     213.0     204.0     210.0     199.0     197.0     188.0
    IND         16.0      16.0      17.0      18.0      18.0      18.0      17.0      14.0      12.0      14.0
    COND_m       2.0       1.0       4.0       2.0       3.0       3.0       1.0       2.0       2.0       3.0
    IND_m        9.0       9.0      11.0       9.0       9.0      11.0       9.0       5.0       5.0       5.0
    Ir_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0      -0.1       0.0       0.0
    Dr_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
    Dw_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
    Ir_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
    Dr_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
    Dw_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
    Which shows everything being much the same before 5.22 (and in particular no significant slowdown in 5.16), and things getting better since.

    Dave.
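For anyone without cachegrind to hand, a rough wall-clock analogue of the same insert/delete snippet can be sketched with the core Benchmark module. This is not the tool Dave used and it measures CPU time rather than instruction counts, so the figures are much noisier and include the overhead of calling the anonymous sub on every iteration. Treat it only as a way to compare the same snippet across locally installed perls.

```perl
use strict;
use warnings;
use Benchmark qw(timethis timestr);

# Same setup as in the post: the hash already exists and has some keys.
my %h   = qw(a 1 b 2 c 3 d 4);
my $key = "foo";    # key held in a variable, not a literal, as in the post

# A negative count asks Benchmark to run for at least that many CPU seconds.
my $t = timethis(-1, sub { $h{$key} = 1; delete $h{$key} }, 'insert+delete');

# $] is the version of the perl running this script.
print "perl $]: ", timestr($t), "\n";
```

Run the file once under each perl binary of interest and compare the iterations-per-second figures that Benchmark reports.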