in reply to Re^7: Performance oddity when splitting a huge file into an AoA
in thread Performance oddity when splitting a huge file into an AoA

I have absolutely no idea what could cause this, as I use completely standard rar.exe from the original producer to make these archives with nothing special about them whatsoever. In case this works better on your system, here's a zip file: http://drop.io/perl_performance/asset/ap-vs-cw-vs-sb-zip

Next time I'll just upload the nytprof.out file. I kind of assumed that, with what little information there is, looking at it normally would be all that's required. As far as the format itself goes, ask the nytprof guys; I only used their tools. ;) (I should mention that you're supposed to start by opening the "index.html" file.)

Re^9: Performance oddity when splitting a huge file into an AoA
by BrowserUk (Patriarch) on May 07, 2009 at 18:33 UTC

    The zip worked fine.

    I cannot make much sense of the statistics either. There is something weird going on. The splitting seems to be taking an inordinate amount of time.

    Part of the problem is that with all three files calling the same subroutine, all the statistics get lumped in together and averaged out, so you cannot see whether there are any significant differences between the first run and the other two.

    To address that, I'd create copy-and-paste copies of the X() subroutine (say x1(), x2() & x3()), and call a different version for each file. That will break out the timings for each file and perhaps highlight run-to-run differences.
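A minimal sketch of that idea follows. It assumes the profiler attributes time per source line, which is why the three bodies are literal copies rather than one shared routine. The make_sample() helper and its temp-file data are hypothetical stand-ins for the real input files, and the row-count return value is added only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for a real input file: a temp file of comma-separated lines.
sub make_sample {
    my ($rows) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} join(',', 1 .. 10), "\n" for 1 .. $rows;
    close $fh or die $!;
    return $name;
}

# Three deliberately identical copies. Each has its own source lines,
# so a line-level profiler reports the split for each file separately.
sub x1 {
    my $path = shift;
    open my $fh, '<', $path or die $!;
    my @AoA;
    while (my $line = <$fh>) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;    # number of rows read
}

sub x2 {
    my $path = shift;
    open my $fh, '<', $path or die $!;
    my @AoA;
    while (my $line = <$fh>) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;
}

sub x3 {
    my $path = shift;
    open my $fh, '<', $path or die $!;
    my @AoA;
    while (my $line = <$fh>) {
        my @line_arr = split ',', $line;
        push @AoA, \@line_arr;
    }
    return scalar @AoA;
}

# One sub per file, in the same order as the original runs.
my @files = map { make_sample(1000) } 1 .. 3;
printf "x1 read %d rows\n", x1($files[0]);
printf "x2 read %d rows\n", x2($files[1]);
printf "x3 read %d rows\n", x3($files[2]);
```

Run under the profiler, each of x1, x2 and x3 should then show up with its own per-line timings instead of one averaged figure.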

    Also, from the numbers presented, it looks like end-of-loop overhead is getting lumped in with the last statement in the loop. To counter that, I'd stick a dummy statement at the bottom of the loop:

    my $dummy;
    sub x1 {
        open my $fh, '<', shift or die $!;
        my @AoA;
        while (my $line = <$fh>) {
            my @line_arr = split ',', $line;
            #push @AoA, \@line_arr;
            $dummy = 1;
        }
    }

    Next time i'll just upload the nytprof.out file.

    That would certainly be easier (I assume by "upload", you mean here to PM!). Especially for anyone attempting to follow along.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      As the nytprof files are binary, I can't upload them here. Instead I've made an effort to make the next batch of benchmarks more useful: http://drop.io/perl_performance/asset/arrays-zip

      I've taken your suggestions and implemented them. Also, there are only two variants this time: one shows the performance issue, and the other has a tiny change which resolves the issue to an extent.

      Sidenote: Are you available on IRC? This stuff is very time-consuming and I'm way out of my depth here, so the only thing I can do is provide info in the hope that it helps others, but with the delays involved in this form of communication I'm getting incredibly frustrated.

        Having spent a couple of hours flicking between the two sets of results in your latest zip, I am at a loss to explain or reproduce the problem. It doesn't make any sense at all to me why

        my @line_arr = split ',', $line;

        would take 8 times longer in one run relative to another. Assuming the same interpreter is being used for both runs.
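One way to cross-check a figure like that without the profiler is to time the split loop directly. The following is only a sketch under assumptions: the in-memory sample data and the time_split() helper are hypothetical and are not taken from the benchmark scripts in the zip.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical sample data: 100_000 comma-separated lines held in memory,
# so file I/O does not contribute to the measurement.
my @lines = map { join(',', 1 .. 20) . "\n" } 1 .. 100_000;

# Time only the split loop; return elapsed seconds and the number of fields seen.
sub time_split {
    my ($lines) = @_;
    my $fields = 0;
    my $start  = time;
    for my $line (@$lines) {
        my @line_arr = split ',', $line;
        $fields += @line_arr;
    }
    return (time - $start, $fields);
}

# Repeat the identical work several times and compare the wall-clock figures.
for my $run (1 .. 3) {
    my ($elapsed, $fields) = time_split(\@lines);
    printf "run %d: %.3fs for %d fields\n", $run, $elapsed, $fields;
}
```

If the runs differ widely here too, the effect is not an artifact of the profiler; if they agree, the profiler's attribution becomes the more likely suspect.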

        As I cannot reproduce it, and nobody else has spoken up to say that they can, it would appear to be confined to your system. If you have a workaround, and are not concerned that the underlying problem will adversely affect your other programs, then simply drop the issue.

        Quite frankly, I find this NYT prof output very difficult to use. Pretty, but otherwise essentially useless. But, once again, I seem to be in a minority here.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        You've had the change that seems to (inexplicably) resolve the performance issue for a week now. Does your frustration come from not being able to use it for some reason? Or are you, like me, impatient to find out what the heck this one is about? :)