in reply to Re^3: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
in thread Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?

I believe (perhaps wrongly) that the first is a space for speed tradeoff
Except for edge cases and possible bugs, COW is intended on average to use less memory and less CPU.

The second is an (IMO) unnecessary fix for a non-problem
A non-problem that allows you to trivially DoS any web server where input from the client (such as headers or parameters) is fed into a perl hash.
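To make the general idea concrete, here is a minimal sketch of a collision attack against a toy bucket scheme. Everything in it is hypothetical: `weak_hash`, `longest_chain` and the bucket count are invented for illustration, and the byte-sum hash is deliberately weak and is not perl's real hash function. It only shows why keys chosen by a client can pile into one bucket when the hash is predictable.

```perl
use strict;
use warnings;

# Toy model only: a fixed number of buckets and a deliberately weak,
# unseeded hash (sum of character codes). NOT perl's real hash function.
my $NBUCKETS = 8;

sub weak_hash {
    my ($key) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $NBUCKETS;
}

# Return the number of keys in the fullest bucket.
sub longest_chain {
    my @keys  = @_;
    my @count = (0) x $NBUCKETS;
    $count[ weak_hash($_) ]++ for @keys;
    my $max = 0;
    for my $n (@count) { $max = $n if $n > $max }
    return $max;
}

my @benign  = map { "key$_" } 1 .. 8;         # ordinary-looking keys
my @crafted = qw(abc acb bac bca cab cba);    # anagrams: identical byte sums

printf "benign:  longest chain %d of %d keys\n",
    longest_chain(@benign), scalar @benign;
printf "crafted: longest chain %d of %d keys\n",
    longest_chain(@crafted), scalar @crafted;
```

With this toy hash the anagram keys all land in the same bucket, so every lookup has to walk the whole chain, while the ordinary keys spread out. Seeding and randomising the hash is meant to stop a client from predicting which keys collide.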

Anyway, perl's hash handling has been getting faster, not slower in recent years. This trivial code (read 0.5M words from a dictionary file and store in a hash):

    open my $fh, "</usr/share/dict/words" or die;
    my %h;
    $h{$_}++ while <$fh>;

consumes the following number of CPU Mcycles under various perls:

    5.8.9   1,245
    5.18.0  1,143
    5.20.0  1,113
    5.22.0  1,163
    5.24.0  1,089

Dave.

Re^5: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
by BrowserUk (Patriarch) on Dec 22, 2016 at 10:25 UTC

    First: I did say "Beyond those guesses,". The information provided by the OP so far consists solely of the build parameters for the two builds. I compared those two sets and attempted to reason about possibilities.

    A non-problem that allows you to trivially DoS any web server where input from the client

    Hm. That problem was addressed way back in 2003/5.8.1 with something akin to this:

    From the 5.8.1 delta:

    Mainly due to security reasons, the "random ordering" of hashes has been made even more random. Previously while the order of hash elements from keys(), values(), and each() was essentially random, it was still repeatable. Now, however, the order varies between different runs of Perl.

    Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.

    The added randomness may affect applications.

    One possible scenario is when output of an application has included hash data. For example, if you have used the Data::Dumper module to dump data into different files, and then compared the files to see whether the data has changed, now you will have false positives since the order in which hashes are dumped will vary. In general the cure is to sort the keys (or the values); in particular for Data::Dumper to use the Sortkeys option. If some particular order is really important, use tied hashes: for example the Tie::IxHash module which by default preserves the order in which the hash elements were added.

    More subtle problem is reliance on the order of "global destruction". That is what happens at the end of execution: Perl destroys all data structures, including user data. If your destructors (the DESTROY subroutines) have assumed any particular ordering to the global destruction, there might be problems ahead. For example, in a destructor of one object you cannot assume that objects of any other class are still available, unless you hold a reference to them. If the environment variable PERL_DESTRUCT_LEVEL is set to a non-zero value, or if Perl is exiting a spawned thread, it will also destruct the ordinary references and the symbol tables that are no longer in use. You can't call a class method or an ordinary function on a class that has been collected that way.

    The hash randomisation is certain to reveal hidden assumptions about some particular ordering of hash elements, and outright bugs: it revealed a few bugs in the Perl core and core modules.

    To disable the hash randomisation in runtime, set the environment variable PERL_HASH_SEED to 0 (zero) before running Perl (for more information see PERL_HASH_SEED in the perlrun manpage), or to disable the feature completely in compile time, compile with -DNO_HASH_SEED (see INSTALL).
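    As a small illustration of the cure the quoted delta recommends (sort the keys, or use the Sortkeys option of Data::Dumper), here is a minimal sketch. The `%config` hash and its contents are made up for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data, purely for illustration.
my %config = (host => 'localhost', port => 8080, user => 'guest', mode => 'test');

# keys() may return a different order from run to run,
# so sort explicitly whenever the order matters:
my @ordered = sort keys %config;
print join(",", @ordered), "\n";

# Data::Dumper can sort for you, which makes dumps comparable across runs.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;    # compact layout, cosmetic only
my $dump = Dumper(\%config);
print $dump;
```

    Code written this way does not depend on the hash seed, so there is no need to set PERL_HASH_SEED just to get repeatable output.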

    So what new problem was addressed by the 5.17 changes? (And has anyone ever seen a plausible demonstration of that "new problem"? Has there ever been a reported sighting of anyone exploiting that new problem in the field? If the change is so critical, why wasn't it back-ported to 5.10 and other earlier versions that are still being shipped with 95% of *nix distributions?)

    Anyway, perl's hash handling has been getting faster, not slower in recent years.

    Agreed. Not just hash handling, but just about every aspect of Perl (save maybe string handling) has gotten faster in recent builds. Congratulations.

    However, over the years there have been some weird behaviours that only affected windows builds.

    Once again I'll remind you that I was attempting to help the OP on the basis of the minimal information supplied; whilst asking him to provide more.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.
      So what new problem was addressed by the 5.17 changes?
      I can't remember the full details off the top of my head, but among other issues, there was a bug in the 5.8.1 implementation that, with a suitably crafted set of keys, could trigger the hash code into doubling the bucket size for every added key, making it trivial to exhaust a web server's memory. It was also shown that the ordering of keys extracted from a hash (like a web server returning unsorted headers) could be used to determine the server's hash seed.
      And has anyone ever seen a plausible demonstration of that "new problem"?
      On the security list I've seen simple code (that puts a particular sequence of keys into a hash) that can crash the perl process.
      Has there ever been a reported sighting of anyone exploiting that new problem in the field?
      That shouldn't be the criterion for fixing security issues.
      If the change is so critical, why wasn't it back-ported to 5.10 and other earlier versions that are still being shipped with 95% of *nix distributions?
      We backported the relevant changes to all maintained perl versions. It's up to vendors whether they patch old unsupported perl versions if they still ship them.

      Dave.

        It was also shown that the ordering of keys extracted from a hash (like a web server returning unsorted headers) could be used to determine the server's hash seed.

        That's a demonstration I would like to see. As in, someone actually deducing it from the returned headers of a system they otherwise have no visibility into, rather than just theoretical speculation that it might be possible.

        Basically, I don't believe that this theoretical possibility could ever be actually exploited. (But I did also say: (IMO) above.)


Re^5: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
by dcmertens (Scribe) on Dec 23, 2016 at 12:03 UTC
    Are you sure these speedups are due to Perl's hash improvements, and not improvements in Perl's IO handling? Because the latter would have been my first guess. A more interesting comparison might be to time the script under two modes, one with a simple counter increment and one with the hash addition. The difference between these two running times would be more illuminating, I think.
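    A rough sketch of that two-mode comparison, under some assumptions: it measures wall-clock time with Time::HiRes rather than CPU cycles, `time_pass` is a helper name invented here, and the default path is the dictionary file used earlier in the thread (pass another file as the first argument).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Default path is the one from the thread; override it on the command line.
my $file = @ARGV ? shift @ARGV : '/usr/share/dict/words';

# Hypothetical helper: call $per_line for every line of $path and
# return the elapsed wall-clock seconds.
sub time_pass {
    my ($path, $per_line) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $start = time();
    $per_line->($_) while <$fh>;
    close $fh;
    return time() - $start;
}

if (-r $file) {
    my ($count, %h) = (0);
    my $io_only   = time_pass($file, sub { $count++ });        # IO + counter
    my $with_hash = time_pass($file, sub { $h{ $_[0] }++ });   # IO + hash store

    printf "IO + counter: %.4fs\n", $io_only;
    printf "IO + hash:    %.4fs\n", $with_hash;
    printf "difference:   %.4fs\n", $with_hash - $io_only;
}
else {
    warn "Cannot read $file; pass a word list as the first argument\n";
}
```

    Both modes pay the same sub-call overhead, so it should mostly cancel in the difference. The first pass may also pay for a cold file cache, so run the script a few times before trusting the numbers.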
      There's a tool in the perl src repository which uses cachegrind behind the scenes to accurately measure how many CPU instructions, data reads etc a small snippet of code uses. With the following initial setup (so the hash already exists and has some keys):
      my %h = qw(a 1 b 2 c 3 d 4); my $key = "foo";
      Running the following benchmark (using a non-constant key so the key's hash gets recalculated each time):
      $h{$key} = 1; delete $h{$key}
      Shows the following results on various perls:
      Key:
          Ir   Instruction read
          Dr   Data read
          Dw   Data write
          COND conditional branches
          IND  indirect branches
          _m   branch predict miss
          _m1  level 1 cache miss
          _mm  last cache (e.g. L3) miss
          -    indeterminate percentage (e.g. 1/0)

      The numbers represent raw counts per loop iteration.

              perl589o perl5101o perl5125o perl5144o perl5163o perl5184o perl5203o perl5222o perl5240o perl5258o
              -------- --------- --------- --------- --------- --------- --------- --------- --------- ---------
      Ir        1348.0    1340.4    1378.0    1383.0    1423.0    1453.0    1466.0    1368.0    1356.0    1300.0
      Dr         414.0     403.0     411.0     404.0     408.0     403.0     411.0     379.0     373.0     362.0
      Dw         226.0     214.0     222.0     227.0     228.0     231.0     231.0     208.0     206.0     196.0
      COND       202.0     210.1     210.0     204.0     213.0     204.0     210.0     199.0     197.0     188.0
      IND         16.0      16.0      17.0      18.0      18.0      18.0      17.0      14.0      12.0      14.0
      COND_m       2.0       1.0       4.0       2.0       3.0       3.0       1.0       2.0       2.0       3.0
      IND_m        9.0       9.0      11.0       9.0       9.0      11.0       9.0       5.0       5.0       5.0
      Ir_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0      -0.1       0.0       0.0
      Dr_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
      Dw_m1        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
      Ir_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
      Dr_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0
      Dw_mm        0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0       0.0

      Which shows everything being much the same before 5.22 (and in particular no significant slowdown in 5.16), and things getting better since.

      Dave.