in reply to Re^5: Performance oddity when splitting a huge file into an AoA
in thread Performance oddity when splitting a huge file into an AoA

Alright, I broke the script up a bit and ran 3 different benchmarks under ActivePerl, Cygwin and Strawberry Perl. Here are the results: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-rar

It's really weird. If the loop pushes the data into the AoA, the split statements take a long time. If it doesn't push, the splits go fast.
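For anyone who wants to reproduce the comparison without the profiler, a minimal sketch along these lines could time the two variants side by side. It assumes a small in-memory list of synthetic CSV lines instead of the real huge file, and the names `split_only` and `split_and_push` are made up for illustration; they are not from my script.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic data standing in for the real file (an assumption for illustration):
# 10,000 lines of 20 comma-separated fields each.
my @lines = map { join ',', map { "field$_" } 1 .. 20 } 1 .. 10_000;

# Split every line but discard the result.
sub split_only {
    my $count = 0;
    for my $line (@lines) {
        my @line_arr = split ',', $line;
        $count += @line_arr;
    }
    return $count;
}

# Split every line and keep a reference to each row in the AoA.
sub split_and_push {
    my @AoA;
    for my $line (@lines) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return \@AoA;
}

# Run each variant 50 times and print a comparison table.
cmpthese( 50, {
    split_only     => \&split_only,
    split_and_push => \&split_and_push,
} );
```

The numbers from this will not match a profiler run on the real data, but it should show whether the push alone changes the cost of the loop on a given Perl build.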

Re^7: Performance oddity when splitting a huge file into an AoA
by BrowserUk (Patriarch) on May 07, 2009 at 10:26 UTC

    Hm. I would have taken a look, but when I try to unrar your archive, there are multiple copies of files all with the same name and no path information, so each overwrites the last.

    (I have to say that all those (2.25 MB of) htmls, pngs & csss seem like overkill as a way of presenting information that could be conveyed in three short text files. And with the latter, I could manipulate the numbers programmatically rather than having to constantly chase my tail around several dozen html pages.)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I have absolutely no idea what could cause this, as I use completely standard rar.exe from the original producer to make these archives with nothing special about them whatsoever. In case this works better on your system, here's a zip file: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-zip

      Next time I'll just upload the nytprof.out file. I kinda assumed that with what little information there is, looking at it normally would be all that's required. And as far as the format itself goes, ask the nytprof guys, I only used their tools. ;) (I should mention you're supposed to start by opening the "index.html" file.)

        The zip worked fine.

        I cannot make much sense of the statistics either. There is something weird going on. The splitting seems to be taking an inordinate amount of time.

        Part of the problem is that with all three files calling the same subroutine, all the statistics get lumped in together and averaged out, so you cannot see if there are any significant differences between the first run and the other two.

        To address that, I'd create C&P copies of the X() subroutine (say x1() x2() & x3()), and call a different version for each file. That will break out the timings for each file and perhaps highlight run to run differences.
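        A rough sketch of that layout follows. The sub bodies are my guess at what X() does based on the loop discussed in this thread, and taking the three file names from the command line is an assumption for illustration.

```perl
use strict;
use warnings;

# Three deliberately identical copies, so that a profiler reports
# the time spent on each file under a separate subroutine name.
sub x1 {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @AoA;
    while ( my $line = <$fh> ) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;    # number of rows read
}

sub x2 {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @AoA;
    while ( my $line = <$fh> ) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;
}

sub x3 {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @AoA;
    while ( my $line = <$fh> ) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;
}

# Hypothetical usage: up to three file names on the command line,
# each handled by its own copy of the subroutine.
my @subs = ( \&x1, \&x2, \&x3 );
for my $i ( 0 .. $#subs ) {
    last unless defined $ARGV[$i];
    printf "%s: %d rows\n", $ARGV[$i], $subs[$i]->( $ARGV[$i] );
}
```

        Run under the profiler as before (perl -d:NYTProf script.pl file1 file2 file3, then nytprofhtml), the report should then list x1, x2 and x3 separately.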

        Also, from the numbers presented, it looks like end-of-loop overhead is getting lumped in with the last statement in the loop. To counter that, I'd stick a dummy statement at the bottom of the loop:

        my $dummy;
        sub x1 {
            open my $fh, '<', shift or die $!;
            my @AoA;
            while ( my $line = <$fh> ) {
                my @line_arr = split ',', $line;
                #push @AoA, \@line_arr;
                $dummy = 1;
            }
        }

        Next time i'll just upload the nytprof.out file.

        That would certainly be easier (I assume by "upload", you mean here to PM!). Especially for anyone attempting to follow along.


Re^7: Performance oddity when splitting a huge file into an AoA
by parv (Parson) on May 07, 2009 at 10:21 UTC
    (Feel free to ignore since I just started reading this thread.) The link requires too much work (possibly allowing the drop.io domain to run JavaScript, followed by taking a quiz) to see the results.
      Drop.io works fine on the four most-used browsers. Feel free to suggest other options.
        I looked at this thread only out of mere curiosity, so I am not really interested in doing anything out of the ordinary. (And no, I am not withholding information about any other host.)