Kc12349 has asked for the wisdom of the Perl Monks concerning the following question:

I have been attempting some optimization involving array assignment and iteration. While doing so I found something that confused me, though it turned out to be unrelated to my production code, as it only occurs when I print some debug output.

I have boiled down a loss of almost an order of magnitude to the below test case. I have removed all the test code save for copying an array into another array from my for loop.

I have commented out three different lines of code which I tested one at a time and got a time interval value for. Running with the #<nothing> line, i.e., with no extra line, I see similar performance to the line say @array. What has me confused is the performance I get after printing an interpolated copy of the array with the line say "@array".

If anyone can enlighten me, even if I am missing something elementary, it would be greatly appreciated.

use feature 'say';
use Time::HiRes qw(gettimeofday tv_interval);

my @array = (1..100);

# say "@array";   # 0.332776 s
# say @array;     # 0.052496 s
# #<nothing>      # 0.052361 s

my $t0 = [gettimeofday];

for (1..10000) {
    my @array2 = @array;
}

say tv_interval($t0);

Replies are listed 'Best First'.
Re: Strange performance loss after interpolating an array and then copying to another array;
by SuicideJunkie (Vicar) on Aug 19, 2011 at 18:01 UTC

    I've limited knowledge of the internals, but as I understand it, interpolating into a string will force the array elements to calculate a string value for all the numbers.

    I would then expect the array copying to be duplicating the *strings* rather than only the integers, and that's a lot more bytes to copy.

      Let's test that hypothesis:

      use feature 'say';
      use Time::HiRes qw(gettimeofday tv_interval);

      my @array = (1..100);

      # say "@array";   # 0.332776 s
      # say @array;     # 0.052496 s
      # #<nothing>      # 0.052361 s

      my $t0 = [gettimeofday];

      for (1..10000) {
          my @array2 = @array;
      }

      say tv_interval($t0);

      __END__
      SV = IV(0xe47dd8) at 0xe47de8
        REFCNT = 1
        FLAGS = (IOK,pIOK)
        IV = 1
      SV = PVIV(0xe512d8) at 0xe47de8
        REFCNT = 1
        FLAGS = (IOK,POK,pIOK,pPOK)
        IV = 1
        PV = 0xe56050 "1"\0
        CUR = 1
        LEN = 16

      So yes, interpolating the array into the string adds a PV (string) representation to the scalar.
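
For anyone who wants to check this without reading a full dump, here is a minimal sketch using the core B module. The helper name slots is made up for illustration; it only reports whether a scalar currently holds an integer value, a cached string value, or both.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which value slots a scalar currently holds.
# The private (SVp_*) flags are checked so the result does not depend on
# whether a given perl also sets the public POK flag.
sub slots {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @slots;
    push @slots, 'integer' if $flags & B::SVp_IOK();
    push @slots, 'string'  if $flags & B::SVp_POK();
    return join '+', @slots;
}

my @array  = (1 .. 3);
my $before = slots(\$array[0]);
my $joined = "@array";          # same kind of interpolation as in the test case
my $after  = slots(\$array[0]);

print "before: $before\n";
print "after:  $after\n";
```

This should agree with the two dumps shown earlier: only an integer slot before the interpolation, and both an integer and a string slot afterwards.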

        So what causes the performance gain, though small, going from this:

        use feature 'say';
        use Time::HiRes qw(gettimeofday tv_interval);

        my @array = (1..100);

        for my $element (@array) {
            $element .= '';
        }

        for my $element (@array) {
            $element += 0;
        }

        my $t0 = [gettimeofday];

        for (1..10000) {
            my @array2 = @array;
        }

        say tv_interval($t0);

        To this:

        use feature 'say';
        use Time::HiRes qw(gettimeofday tv_interval);

        my @array = (1..100);

        my $t0 = [gettimeofday];

        for (1..10000) {
            my @array2 = @array;
        }

        say tv_interval($t0);

      That makes sense, though I am curious if perl bothers to copy both internal scalar representations. I ran my test again preceded by the below. I force each element into numeric context again after interpolating. I get almost all the performance back with this approach, but not quite all of it.

      for my $element (@array) { $element += 0; }
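
A rough sketch of what that numeric pass does to one element, again using the core B module. The helper name has_string is hypothetical, and the closing remark about the scalar's body type is my own reading, not something measured in this thread.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: 1 if the scalar currently has a valid cached string.
sub has_string {
    my ($ref) = @_;
    return B::svref_2object($ref)->FLAGS & B::SVp_POK() ? 1 : 0;
}

my @array  = (1 .. 3);
my $joined = "@array";                        # caches a string in each element
my $after_interpolation = has_string(\$array[0]);

for my $element (@array) { $element += 0; }   # numeric assignment
my $after_numeric = has_string(\$array[0]);

# The class of the B object reflects the scalar's internal body type.
my $body_type = ref B::svref_2object(\$array[0]);

print "string cached after interpolation: $after_interpolation\n";
print "string cached after += 0:          $after_numeric\n";
print "body type after += 0:              $body_type\n";
```

The string flag goes away after the numeric assignment, so the copy no longer has a string to duplicate. As far as I can tell the scalar does not shrink back to a plain integer body, which might account for the small remaining difference, but I have not confirmed that.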
Re: Strange performance loss after interpolating an array and then copying to another array;
by Kc12349 (Monk) on Aug 19, 2011 at 18:03 UTC

    Having thought this through a bit more, I think I understand what is going on. It comes down to how perl represents scalars internally. After interpolation, each array element holds both its original numeric representation and the string representation created when it was evaluated in the string context of the interpolation. This means that the data footprint to be copied from one array to the other is now significantly larger.
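
To compare the two cases side by side in one run, something like the following sketch with the core Benchmark module should work. The array names and the iteration count are arbitrary choices for illustration, and the timings will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

my @ints    = (1 .. 100);
my @strung  = (1 .. 100);
my $discard = "@strung";   # interpolation caches a string in each element

my %result;
for my $case ([ints => \@ints], [stringified => \@strung]) {
    my ($name, $src) = @$case;

    # Time 10,000 full copies of the source array.
    my $t = timeit(10_000, sub { my @copy = @$src; });

    $result{$name} = $t;
    print "$name: ", timestr($t), "\n";
}
```

Keeping two separate arrays avoids having to comment lines in and out between runs, at the cost of both copies living in the same process.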