in reply to Re^2: Performance oddity when splitting a huge file into an AoA
in thread Performance oddity when splitting a huge file into an AoA

Even stranger, because I'm also using AS1004. The only difference is that I'm using the 64-bit version:

c:\test>perl -V
Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=MSWin32, osvers=5.2, archname=MSWin32-x64-multi-thread
    ...
Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                        PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
                        PERL_MALLOC_WRAP PL_OP_SLAB_ALLOC USE_64_BIT_INT
                        USE_ITHREADS USE_LARGE_FILES USE_PERLIO
                        USE_SITECUSTOMIZE
  Locally applied patches:
        ActivePerl Build 1004 [287188]
        33741 avoids segfaults invoking S_raise_signal() (on Linux)
        33763 Win32 process ids can have more than 16 bits
        32809 Load 'loadable object' with non-default file extension
        32728 64-bit fix for Time::Local
  Built under MSWin32
  Compiled at Sep 3 2008 12:22:07
  @INC:
    C:/Perl64/site/lib
    C:/Perl64/lib
    .

And I don't see the problem with 5.8.9/32-bit either:

c:\test>\perl32\bin\perl5.8.9.exe junk.pl
Records: 400000 in 6.732 seconds
Records: 300000 in 4.329 seconds
Records: 400000 in 4.537 seconds
Records: 300000 in 3.365 seconds
Records: 400000 in 4.495 seconds

I think you should look closer at what is going on there. (I'll try to grab a copy of AS1004 32-bit and run it for comparison purposes.)


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^4: Performance oddity when splitting a huge file into an AoA
by Xenofur (Monk) on May 05, 2009 at 22:32 UTC
    The -V of my ASPerl:
    d:\Web-Dev\arrays>perl -V
    Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
      Platform:
        osname=MSWin32, osvers=5.00, archname=MSWin32-x86-multi-thread
    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                            PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
                            PERL_MALLOC_WRAP PL_OP_SLAB_ALLOC USE_ITHREADS
                            USE_LARGE_FILES USE_PERLIO USE_SITECUSTOMIZE
      Locally applied patches:
            ActivePerl Build 1004 [287188]
            33741 avoids segfaults invoking S_raise_signal() (on Linux)
            33763 Win32 process ids can have more than 16 bits
            32809 Load 'loadable object' with non-default file extension
            32728 64-bit fix for Time::Local
      Built under MSWin32
      Compiled at Sep 3 2008 13:16:37
      @INC:
        C:/Perl/site/lib
        C:/Perl/lib
        .
    I'm not sure how I can look closer here. Suggestions as to what tests I can run are welcome. Meanwhile, here are snapshots of both, taken with Procmon and NYTProf: http://drop.io/perl_performance
      Suggestions as to what tests i can run are welcome.

      The profiling you've done doesn't get into enough detail in the critical areas.

      The first thing I would try is isolating whether the extra time is spent reading from the file or shuffling memory. To that end, I'd see what happens to the timings if I just read and split the data but didn't store it:

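      #! perl -slw
      #use 5.010;
      use strict;
      use Time::HiRes qw[ time ];

      sub x{
          open my $fh, '<', shift or die $!;
      #   my @AoA;
          # Split each line but throw the result away, so nothing accumulates in memory.
          my $dummy = [ split ',' ] while <$fh>;
          my $records = $.;    # save the line count first: close() resets $.
          close $fh;
          return $records;
      }

      for ( 1 .. 5 ) {
          my $start = time;
          # Alternate between junk1.dat and junk2.dat on successive passes.
          printf "Records: %d in %.3f seconds\n",
              x( sprintf 'junk%d.dat', 1 + ( $_ & 1 ) ), time() - $start;
      }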

        Alright, I broke the script up a bit and ran 3 different benchmarks in ActivePerl, Cygwin and Strawberry Perl. Here are the results: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-rar

        It's really weird. If the script pushes the split data into the AoA, the splitting itself takes a long time. However, if it doesn't push, the splits go fast.
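
A minimal sketch of the push versus no-push comparison described above, for anyone who wants to try it on their own build. It is not the script from the thread: it assumes synthetic in-memory data instead of the junk*.dat files, and the sub names `split_only` and `split_and_push`, the row count and the field count are all hypothetical choices for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Assumed synthetic input: 100_000 lines of 10 comma-separated numbers.
my @lines = map { join ',', map { int rand 1000 } 1 .. 10 } 1 .. 100_000;

# Split every line, but let each row be freed straight away.
sub split_only {
    my ($lines) = @_;
    my $count = 0;
    for my $line (@$lines) {
        my $fields = [ split /,/, $line ];
        $count++;
    }
    return $count;
}

# Split every line and keep each row in an array of arrays.
sub split_and_push {
    my ($lines) = @_;
    my @AoA;
    for my $line (@$lines) {
        push @AoA, [ split /,/, $line ];
    }
    return scalar @AoA;
}

# Time both variants over the same input.
for my $case ( [ 'split only', \&split_only ], [ 'split and push', \&split_and_push ] ) {
    my ( $label, $code ) = @$case;
    my $start   = time;
    my $records = $code->( \@lines );
    printf "%-15s %d records in %.3f seconds\n", $label, $records, time() - $start;
}
```

Because the file is taken out of the picture, any large gap between the two timings here would point at memory allocation rather than I/O. Running it under each Perl build in turn gives a like-for-like comparison.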