
Suggestions as to what tests i can run are welcome.

The profiling you've done doesn't get into enough detail in the critical areas.

The first thing I would try is to isolate whether the extra time is spent reading from the file or shuffling memory. To that end, I'd see what happens to the timings if I just read and split the data but didn't store it:

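#! perl -slw
#use 5.010;
use strict;
use Time::HiRes qw[ time ];;

sub x{
    open my $fh, '<', shift or die $!;
#    my @AoA;
    ## Split each line, but throw the result away instead of storing it
    my $dummy = [ split ',' ] while <$fh>;
    my $records = $.;   ## capture the line count first; close() resets $.
    close $fh;
    return $records;
}

## Alternate between junk2.dat and junk1.dat, timing each pass
for ( 1 .. 5 ) {
    my $start = time;
    printf "Records: %d in %.3f seconds\n",
        x( sprintf 'junk%d.dat', 1+ ($_ & 1) ), time() - $start;
}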
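For a side-by-side comparison, here is a minimal sketch that times the same loop with and without storing the results. It is only an illustration, not anyone's actual benchmark: the sub names `split_only` and `split_and_store`, the temporary file, and the generated test data (100,000 lines of 10 fields) are all assumptions made up for the example, so substitute your own file to get meaningful numbers.

```perl
#! perl
use strict;
use warnings;
use Time::HiRes qw[ time ];
use File::Temp qw[ tempfile ];

# Read and split every line, but discard the result (no storage).
sub split_only {
    my( $path ) = @_;
    open my $fh, '<', $path or die "$path: $!";
    my $count = 0;
    while( my $line = <$fh> ) {
        chomp $line;
        my $dummy = [ split ',', $line ];
        ++$count;
    }
    close $fh;
    return $count;
}

# Same loop, but keep every split line in an array of arrays.
sub split_and_store {
    my( $path ) = @_;
    open my $fh, '<', $path or die "$path: $!";
    my @AoA;
    while( my $line = <$fh> ) {
        chomp $line;
        push @AoA, [ split ',', $line ];
    }
    close $fh;
    return scalar @AoA;
}

# Hypothetical test data: 100_000 lines of 10 comma-separated fields.
my( $out, $path ) = tempfile( UNLINK => 1 );
print {$out} join( ',', 1 .. 10 ), "\n" for 1 .. 100_000;
close $out or die "$path: $!";

# Time each variant on the same file.
for my $test (
    [ 'split only'      => \&split_only      ],
    [ 'split and store' => \&split_and_store ],
) {
    my( $name, $code ) = @$test;
    my $start   = time;
    my $records = $code->( $path );
    printf "%-16s %d records in %.3f seconds\n",
        $name, $records, time() - $start;
}
```

If the two timings are close, the cost is mostly in reading and splitting; if only the storing variant is slow, memory allocation is the more likely suspect.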

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^6: Performance oddity when splitting a huge file into an AoA
by Xenofur (Monk) on May 07, 2009 at 09:40 UTC
    Alright, I broke the script up a bit and ran 3 different benchmarks in ActivePerl, Cygwin and Strawberry Perl. Here are the results: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-rar

    It's really weird. If it pushes the data into the AoA, the splitting takes a long time. However, if it doesn't push, the splits go fast.

      Hm. I would have taken a look, but when I try to unrar your archive, there are multiple copies of files all with the same name and no path information, so each overwrites the last.

      (I have to say that all those (2.25 MB of) HTML, PNG & CSS files seem like overkill as a way of presenting information that could be conveyed in three short text files. And with the latter, I could manipulate the numbers programmatically rather than having to constantly chase my tail around several dozen HTML pages.)


        I have absolutely no idea what could cause this, as I use completely standard rar.exe from the original producer to make these archives with nothing special about them whatsoever. In case this works better on your system, here's a zip file: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-zip

        Next time I'll just upload the nytprof.out file. I kinda assumed that with what little information there is, looking at it normally would be all that's required. And as far as the format itself goes, ask the nytprof guys, I only used their tools. ;) (I should mention that you're supposed to start by opening the "index.html" file.)
      (Feel free to ignore since I just started reading this thread.) The link requires too much work (possibly allowing the drop.io domain to run JavaScript, followed by taking a quiz) to see the results.
        Drop.io works fine on the four most-used browsers. Feel free to suggest other options?