in reply to timethese, and pushing array values

No, timethese doesn't do any cleanup for you. You're using a special global variable (@_), and it isn't going to be cleaned up automatically. One option would be to use a different variable, one you could declare lexically with my; that way your array would go out of scope after each iteration of the benchmark, and you'd be starting from scratch each time, as it were.

If you do use lexicals, make sure that you declare the variable in the code that you're benchmarking; otherwise Benchmark won't be able to "see" your variable, because it won't be in the correct scope.
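
For contrast, here's a minimal sketch (not the code from the original question) of what goes wrong when the array is declared outside the benchmarked subs: both closures share the same array and nothing empties it between iterations, so every run starts with whatever the earlier runs left behind.

use Benchmark qw(timethese);

my @foo;    # shared by both closures below; never reset between iterations

timethese(100, {
    first  => sub { push @foo, qw/bar foo/; my $last = pop @foo },
    second => sub { push @foo, qw/bar foo/; my $last = $foo[-1] },
});

print scalar(@foo), " elements left over\n";    # grows with the iteration count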

You could use something like this:

timethese(100, {
    first  => sub { my @foo = qw/bar foo/; my $last = pop @foo },
    second => sub { my @foo = qw/bar foo/; my $last = $foo[-1] },
});
Or whatever you're benchmarking.

Is that the behavior I'd expect? I think so: frankly, it wouldn't make much sense for Benchmark to mess about with the @_ you're playing with. It should do as little as possible; for example, maybe you don't *want* your variables cleaned up. This gives you more control over what's happening.

RE: Re: timethese
by husker (Chaplain) on Jul 08, 2000 at 00:58 UTC
    I guess I disagree. The value of benchmarking is greatly reduced if each iteration of the code we are benchmarking starts off with even a slightly different state than the previous iteration. Even more dangerous is that some people probably aren't even AWARE that this is happening, and thus don't fully understand what's going on behind those benchmark numbers. (I also ran into the case where my benchmarking died because I ran out of memory ... I was running 1,000,000 iterations ... )

    As far as perhaps NOT wanting my variables cleared ... in what realistic benchmarking case would I want that behavior? I can't think of one myself ... but maybe you have run across that need before?

    To make things even less predictable, let's say I have two functions A and B that I am using in timethese, and that I use @_ in both like I described (pushing and popping). timethese clears @_ the FIRST time it runs A, but not on subsequent runs, and it clears @_ for the FIRST run of B, but not after that.

    I don't think this is desired behavior ... at least not in my case.

      Consider @_ as standing in for any variable whose scope is wider than that of the my'd variables declared within the timethese code.

      Now consider the case where I don't want to time 1,000,000 iterations of the exact same case. Rather, I would like to time 1,000,000 iterations of my code on various cases.

      In fact, for real-world benchmarking (rather than simply using a benchmark to profile a particular (simple) piece of code), you would probably want to run your timethese over a statistical sample, or even a real sample, of test cases.

      If timethese could somehow magically (after all, how is it going to know which global variables you want to reset? isn't that what my is for anyway?) reset the code to its initial state (with zero time cost to boot!), then you would not be able to do a simple benchmark such as:

      (warning: pseudocode)
      srand; timethese(1_000_000, { factorial => sub { factorial(int rand 1000) } });
      This would give you a sort of average performance of your code over certain input ranges. However, you could not do this benchmark if timethese magically reset your initial state (because then you would pick the same random number a million times).
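
      For what it's worth, here is a self-contained version of that sketch (the factorial sub below is just a stand-in I made up so the snippet will run): rand is called inside the benchmarked code, so every iteration exercises a fresh input, which is exactly the state that a "magic reset" would throw away.

      use Benchmark qw(timethese);

      # Stand-in routine to benchmark; any real-world function would do.
      sub factorial {
          my $n = shift;
          my $f = 1;
          $f *= $_ for 2 .. $n;
          return $f;
      }

      srand;    # seed once, outside the timed code

      timethese(1_000_000, {
          # A new random input is drawn on every iteration, so the numbers
          # average over the 0..999 input range rather than one fixed case.
          factorial => sub { factorial(int rand 1000) },
      });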

      Please pardon me if I'm less than coherent; I wanted to make sure I posted something semi-useful today, but I didn't manage it until the wee hours of the morning.

      Ciao,
      Gryn
        It is appropriate to have initialization code in your benchmarking routines. Having initialization code will affect the absolute performance of the code through its added overhead. However, when you use Benchmark/timethese you are looking to determine the relative performance of a piece of code. As long as you use the same initialization routine in all pieces of code being benchmarked, your results will be valid and useful. If you really want to know the absolute performance, you can also run a dummy example alongside your real code that contains only the initialization routine, and factor out the initialization routine's effect when analyzing the results.
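
        A made-up example of that suggestion, reusing the arrays from the code further up the thread: every case carries the same initialization, and the "baseline" case contains only the initialization, so its time can be subtracted when analyzing the results.

        use Benchmark qw(timethese);

        timethese(100_000, {
            # Dummy case: initialization only, so its cost can be factored out later.
            baseline   => sub { my @foo = qw/bar foo/ },

            # Real cases: identical initialization plus the code actually being compared.
            pop_last   => sub { my @foo = qw/bar foo/; my $last = pop @foo },
            index_last => sub { my @foo = qw/bar foo/; my $last = $foo[-1] },
        });
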
        OK I can see your point ... in some benchmarking endeavors, you might not want to initialize variables on each iteration.

        However, to me this seems more like a "side effect" or "artifact" of Benchmark's behavior, not an explicitly designed mechanism to provide the kind of facility you are describing. It's this lack of a formal design that makes it dangerous ... I would bet you that half the people who use Benchmark aren't aware this is what happens and so they can't understand the numbers they are seeing.

        If this behavior *were* part of an intentional design decision, then I would think there would be a way to turn it off. For instance, the code I was working on originally was for use in the thread over at Sieve of Eratosthenes. The behavior of @_ in that case really punishes one of the algorithms being used (not mine, actually) because @_ grows and grows with each iteration. I doubt maverick knew that when he wrote his code, but it's his code that gets bitten by this. In any case, I think this global variable behavior of Benchmark is either an accident, in which case it should be formalized somehow so everyone understands what is happening, or it *is* formalized, in which case there should be an option to defeat it.