in reply to Performance oddity when splitting a huge file into an AoA

Are you running this under mod_perl or FastCGI? 'Cos I can't reproduce your findings using straight perl.

#! perl -sw
use 5.010;
use strict;
use Time::HiRes qw[ time ];;

sub x{
    open my $fh, '<', shift or die $!;
    my @AoA;
    push @AoA, [ split ',' ] while <$fh>;
    close $fh;
    return scalar @AoA;;
}

for ( 1 .. 5 ) {
    my $start = time;
    printf "Records: %d in %.3f seconds\n",
        x( sprintf 'junk%d.dat', 1+ ($_ & 1) ),
        time() - $start;
}
__END__
c:\test>junk
Records: 400000 in 5.884 seconds
Records: 300000 in 4.752 seconds
Records: 400000 in 4.599 seconds
Records: 300000 in 3.473 seconds
Records: 400000 in 4.569 seconds

c:\test>junk
Records: 400000 in 4.826 seconds
Records: 300000 in 3.408 seconds
Records: 400000 in 4.613 seconds
Records: 300000 in 3.481 seconds
Records: 400000 in 4.557 seconds

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Performance oddity when splitting a huge file into an AoA
by Xenofur (Monk) on May 05, 2009 at 21:05 UTC
    You're right. It is dependent on which version of Perl is used. I'm utterly confused now:

    ActivePerl:
    d:\Web-Dev\arrays>perl -v
    This is perl, v5.10.0 built for MSWin32-x86-multi-thread
    (with 5 registered patches, see perl -V for more detail)
    Copyright 1987-2007, Larry Wall
    Binary build 1004 [287188] provided by ActiveState http://www.ActiveState.com
    Built Sep 3 2008 13:16:37
    [snip]

    D:\Web-Dev\arrays>perl test.pl
    Records: 308273 in 5.641 seconds
    Records: 279997 in 98.281 seconds
    Records: 308273 in 128.656 seconds
    Records: 279997 in 96.953 seconds
    Records: 308273 in 129.188 seconds

    Cygwin:
    bash-3.2$ /bin/perl -v
    This is perl, v5.10.0 built for cygwin-thread-multi-64int
    (with 6 registered patches, see perl -V for more detail)
    Copyright 1987-2007, Larry Wall
    [snip]

    bash-3.2$ /bin/perl test.pl
    Records: 308273 in 6.719 seconds
    Records: 279997 in 5.875 seconds
    Records: 308273 in 6.484 seconds
    Records: 279997 in 5.906 seconds
    Records: 308273 in 6.515 seconds

      Even stranger, cos I'm using AS1004 also. The only difference is that I'm using the 64-bit version:

      c:\test>perl -V
      Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
        Platform:
          osname=MSWin32, osvers=5.2, archname=MSWin32-x64-multi-thread
        ...
      Characteristics of this binary (from libperl):
        Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                              PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
                              PERL_MALLOC_WRAP PL_OP_SLAB_ALLOC USE_64_BIT_INT
                              USE_ITHREADS USE_LARGE_FILES USE_PERLIO
                              USE_SITECUSTOMIZE
        Locally applied patches:
          ActivePerl Build 1004 [287188]
          33741 avoids segfaults invoking S_raise_signal() (on Linux)
          33763 Win32 process ids can have more than 16 bits
          32809 Load 'loadable object' with non-default file extension
          32728 64-bit fix for Time::Local
        Built under MSWin32
        Compiled at Sep 3 2008 12:22:07
        @INC:
          C:/Perl64/site/lib
          C:/Perl64/lib
          .

      And I don't see the problem with 5.8.9/32-bit either:

      c:\test>\perl32\bin\perl5.8.9.exe junk.pl
      Records: 400000 in 6.732 seconds
      Records: 300000 in 4.329 seconds
      Records: 400000 in 4.537 seconds
      Records: 300000 in 3.365 seconds
      Records: 400000 in 4.495 seconds

      I think you should look closer at what is going on there. (I'll try to grab a copy of AS1004 32-bit and run it for comparison purposes.)


        The -V of my ASPerl:
        d:\Web-Dev\arrays>perl -V
        Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
          Platform:
            osname=MSWin32, osvers=5.00, archname=MSWin32-x86-multi-thread
        Characteristics of this binary (from libperl):
          Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                                PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
                                PERL_MALLOC_WRAP PL_OP_SLAB_ALLOC USE_ITHREADS
                                USE_LARGE_FILES USE_PERLIO USE_SITECUSTOMIZE
          Locally applied patches:
            ActivePerl Build 1004 [287188]
            33741 avoids segfaults invoking S_raise_signal() (on Linux)
            33763 Win32 process ids can have more than 16 bits
            32809 Load 'loadable object' with non-default file extension
            32728 64-bit fix for Time::Local
          Built under MSWin32
          Compiled at Sep 3 2008 13:16:37
          @INC:
            C:/Perl/site/lib
            C:/Perl/lib
            .

        I'm not sure how I can look closer here. Suggestions as to what tests I can run are welcome. Meanwhile, here are snapshots of both, done with Procmon and NYTProf: http://drop.io/perl_performance