how can I speed this up?

rbi has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have translated a program from FORTRAN to perl.
When I run it as FORTRAN-compiled executable, it is at least 4 times faster than perl, on input files of about 120000 records.
Since I'm a beginner, I'm sure the perl I've written is not very good... In particular, in (1) I prepare a $tmp string through a loop to be printed. the subroutine decode might have problems too... I've replicated the FORTRAN code and there I compute the exponential, but maybe string manipulation can do much better...
Thanks in advance for any corrections/suggestions
Roberto


# here some initialisation or operations on small
# vectors

     ....
            $nvar = 9;
     ....
            foreach $k (1..$nvar) {
                $record = <IN>;
                chomp($record);
                @values = &decode($record,$nt,$val_m);
                foreach $it (1..$nt) {
                  if ($values[$it-1] < 0 and $values[$it-1] != $val_m)
+ {
                     die "error: found negative value.\n";
                  }
                  $field[$it-1][$k-1] = $values[$it-1];
                }
            }
            foreach $it (1..$nt) {
                $tmp  = sprintf "%s", $some_vector[$it-1];
                foreach $k (1..$nvar) {
                   $tmp = $tmp.sprintf " %11.4E", $field[$it-1][$k-1];

  #   (1)  in Fortran this should be a single write intruction with
           FORMAT ('i4.4,4i2.2,9(1x,1pe11.4))')
                }
               $tmp = $tmp."\n";
               print OUT $tmp;
            }

#     (2) this is seems to go the same speed as the one above ...
#
#           foreach $it (1..$nt) {
#               printf OUT "%s", $some_vector[$it-1];
#               foreach $k (1..$nvar) {
#                  printf OUT " %11.4E", $field[$it-1][$k-1];
#               }
#               print OUT "\n";
#           }
               ....


sub decode() {

# this routine should decode the $record where there are a series of
# $nt decimal fractions in the first part of the string 
# (5*$nt characters),and exponents in the last decimal fractions are
# written as 5-digit integers, and must be multiplied by 10.**(-5),
# negative exponents are coded as 2-digit odd numbers, while positive
# exponents are even numbers
# 12345...01... where 01 are characters $nt*5 and $nt*5+1 will give
# 1.234E-01

   my $record = shift();
   my $nt = shift();
   my $val_missing = shift();

   $lmina = 0;
   $lminb = 5 * $nt;
   foreach $it (1..$nt) {
      $ia = substr($record,$lmina,5);
      $ib = substr($record,$lminb,2);
      if ($ia ==  0 and $ib == 99) {
         $values[$it-1] = $val_missing;
      } else {
         $x1 = $ia  * 10.**(-5);
      if (($ib % 2) == 0) {
            $x2 = 10.**($ib/2);
         } else {
            $x2 = 10.**(-($ib+1)/2);
         }
         $values[$it-1] = $x1 * $x2;
      }
      $lmina = $lmina + 5;
      $lminb = $lminb + 2;
   }
   return @values;
}
[download]

Comment on how can I speed this up? Select or Download Code

Replies are listed 'Best First'.
Re: how can I speed this up? by TheoPetersen (Priest) on Feb 23, 2001 at 22:51 UTC
Wow, there's nothing quite like FORTRAN in Perl. :) My sympathies, I have to do a lot of this too. First of all, let me say that if your goal is to make the Perl code run as fast as the FORTRAN code, then stick with FORTRAN. If your goal is to get away from FORTRAN's miserable excuses for text handling, then welcome to Perl and you're on the right track. As for what could be better: the first thing that jumped out to me was to replace the loop in decode with combinations of unpack and map. The only tricky bit is that you have to build the template string for unpack on the fly, since we don't know the number of values in advance. If I read the code correctly, something like this would work: `sub decode { my ($record, $nt, $val_missing) = @_; my $template = ('A5' x $nt) . ('A2' x $nt); my @values = unpack($template, $record); my @fractions = @values[0 .. $nt-1]; my @exponents = @values[$nt .. $#values];` [download] unpack did the equivalent of all the substr calls in one go, so that should save some time. Look at perlman:perlop if it's not clear how the string is being constructed. At this point you could process the two arrays in parallel in a for loop. Note that the C-style loop is appropriate here since we are using the index value in multiple arrays: `@values = (); for (my $i = 0; $i < @fractions; $i++) { if ( == 0 and $exponents[$i] == 99) { $values[$i] = $val_missing; } else { $fractions[$i] = 10.(-5); if (($exponents[$i] % 2) == 0) { $exponents[$i] = 10.($exponents[$i]/2); } else { $exponents[$i] = 10.(-($exponents[$i]+1)/2); } $values[$i] = $fractions[$i] $exponents[$i]; } } return @values; }` [download] I had thought there might be a faster way to do this with map, but all of my answers seem contrived and convoluted, especially considering how much conditional code there is per item. If you're learning your way around Perl, try getting this or something similar going first, then look at a more idiomatic approach. Is it possible to change the format of the data? If so, a simpler algorithm could greatly improve the speed.	[reply] [d/l] [select]
(boo) Dissecting PerlTRAN by boo_radley (Parson) on Feb 23, 2001 at 23:48 UTC
This was quite a sizable snippet, so I downloaded it, and commented everything. My first concern would be the number of loops you have, and their placement: # for each of the 9 $nvars, # for each of the $nts # For each of the $nts... # And the for each of the $nvars... # for each of the $nts... you loop through 2 variables 5 times. I believe that it'd be possible to compress this to one nested loop, or to use (look away, JAPHY :) an intermediary variable to store the results of parts of your loops. Consider the following snippet ( I have removed my comments : `foreach $it (1..$nt) { $tmp = sprintf "%s", $some_vector[$it-1]; foreach $k (1..$nvar) { $tmp = $tmp.sprintf " %11.4E", $field[$it-1][$k-1]; } $tmp = $tmp."\n"; print OUT $tmp;` [download] This loop follows one similar to it, though the inner and outer loops are swapped. Consider printing $tmp as the last step of the first`foreach $k (1..$nvar)` loop. Also, I might try a little trickery with perl's definition of scalar values when computing the powers. The following snippet "reproduces" 10n without ever using . Though, I confess I don't know what the benchmark would be. The following snippet shows how to get the same results. This is only good for powers of ten though : `#--- powers $ib=10; $x2 = 10.($ib/2); print "$x2\n"; $x2 = "1". "0"x($ib/2); print "$x2\n"; #--- negative power $ib=3; $x2 = 10.(-($ib+1)/2); print "$x2\n"; $x2 = "0." . "0"x int($ib-1)/2 ."1"; print "$x2\n";` [download] I'd like to hear from some other monks on whether this may be speedier, though. Finally, whenever I need to speed up code, I'll document it, and more often than not, something will jump out at me. I've included the comments I made while running through your code, and I hope they prove useful. --- .... $nvar = 9; .... foreach $k (1..$nvar) { # # for each of the 9 $nvars, # $record = <IN>; chomp($record); # # I will read in a record and clear trailing newline. # @values = &decode($record,$nt,$val_m); # # I will 'decode' a record, using $nt and $val_m. # # # for each of the $nts # foreach $it (1..$nt) { # # if the $it'th-1 element of $values is less than one # and is not equal to $val_m, I will die. if ($values[$it-1] < 0 and $values[$it-1] != $val_m) + { die "error: found negative value.\n"; } $field[$it-1][$k-1] = $values[$it-1]; # # I will put $values[$it-1] into $field[$it-1][k-1]. # } } foreach $it (1..$nt) { # # For each of the $nts... # $tmp = sprintf "%s", $some_vector[$it-1]; # # I will format $some_vector[$it-1]. # foreach $k (1..$nvar) { # # And the for each of the $nvars... # $tmp = $tmp.sprintf " %11.4E", $field[$it-1][$k-1]; } $tmp = $tmp."\n"; print OUT $tmp; # # I'll format it and print it. # } # (2) this is seems to go the same speed as the one above ... # # foreach $it (1..$nt) { # printf OUT "%s", $some_vector[$it-1]; # foreach $k (1..$nvar) { # printf OUT " %11.4E", $field[$it-1][$k-1]; # } # print OUT "\n"; # } .... sub decode() { # this routine should decode the $record where there are a series of # $nt decimal fractions in the first part of the string # (5$nt characters),and exponents in the last decimal fractions are # written as 5-digit integers, and must be multiplied by 10.(-5), # negative exponents are coded as 2-digit odd numbers, while positive # exponents are even numbers # 12345...01... where 01 are characters $nt5 and $nt5+1 will give # 1.234E-01 my $record = shift(); my $nt = shift(); my $val_missing = shift(); $lmina = 0; $lminb = 5 $nt; foreach $it (1..$nt) { # # for each of the $nts... # $ia = substr($record,$lmina,5); $ib = substr($record,$lminb,2); # # I'll set ia and ib to be certain substrings of $record # if ($ia == 0 and $ib == 99) { $values[$it-1] = $val_missing; # # if $ia is exactly zero, and $ib is exactly 99, set $values[$it-1] +to $val_missing. # } else { $x1 = $ia * 10.(-5); # # otherwise, set $x1 to $ia times 10 to the negative fifth power. # if (($ib % 2) == 0) { $x2 = 10.($ib/2); } else { $x2 = 10.*(-($ib+1)/2); } # # if $ib is evenly divisible by two, set $x2 to # 10 to the (half of $ib'th) power, # otherwise, set $x2 to the negative (half of $ib'th+1) power. # $values[$it-1] = $x1 $x2; # # set $values [$it-1] to the sum of $x1 and $x2. # } $lmina = $lmina + 5; $lminb = $lminb + 2; # increase the position of $lmina and $lminb. } return @values; } [download]	[reply] [d/l] [select]
Re: (boo) Dissecting PerlTRAN by Albannach (Monsignor) on Feb 24, 2001 at 01:50 UTC
Just a quick note on the avoidance of the operator: I think as a general rule string manipulations are always far slower than math. I actually did try a variation on what you suggest myself (in a desperate quest for a better overall decode sub) and the benchmarks indicate the string route is very much slower (using 5 CPU sec run I went from 4800/sec to 4000/sec). Update:** tilly kindly offered this clarification: "Math will normally be faster if it already is a number. But converting strings to numbers to use a math operation, then converting back can go either way because of the conversion operation. (Usually better though.)" Since this problem does boil down to getting strings, doing some math on them and spitting out strings, trying the all string manipulation approach could work if the math were just right. -- I'd like to be able to assign to an luser	[reply]
Re: how can I speed this up? by lemming (Priest) on Feb 23, 2001 at 23:03 UTC
TheoPetersen seems to be on the right path. One thing he showed, but didn't mention was that in your code you tend to build your loops with `foreach $it (1..$nt)` and then subtract one each time you use $it. A lot of calculations can be saved just by `foreach $it <0..$nt-1)` That will save some time, but getting away from nested loops will save you more.	[reply] [d/l] [select]
Re: how can I speed this up? by Albannach (Monsignor) on Feb 24, 2001 at 02:28 UTC
I quite liked this problem at first, and like TheoPetersen my reaction was to try `unpack` and `map`. Adding `unpack` was easy, and I even squeezed in a `map` or two, but not avoiding a `for` loop. Now the strange thing is that all my efforts are actually slower than rbi's original. I'll add code if anyone wants but it's basically what was posted here: Alba_1 is almost exactly TheoPetersen's solution (I didn't bother posting after seeing his) Alba_2 is a variant that does the unpacking this way: `my @fractions = unpack('A5'x$nt, substr($record, 0, 5$nt)); my @exps = unpack('A2'x$nt, substr($record, 5$nt, 2*$nt));` [download] rbi_fortran is rbi's code (though made strict compliant). And the result are (based on 7 fields in $record): `Benchmark: running Alba_1, Alba_2, rbi_fortran, each for at least 5 CP +U seconds... Alba_1: 6 wallclock secs ( 5.31 usr + 0.00 sys = 5.31 CPU) @ 39 +94.16/s (n=21217) Alba_2: 6 wallclock secs ( 5.41 usr + 0.00 sys = 5.41 CPU) @ 47 +89.68/s (n=25893) rbi_fortran: 5 wallclock secs ( 5.26 usr + 0.00 sys = 5.26 CPU) @ 5 +211.78/s (n=28266)` [download] So what gives? I could have sworn `unpack` was faster. One thing I tried is the effect of the number of fields to extract, so here it is with 28 fields in $record: `Benchmark: running Alba_1, Alba_2, rbi_fortran, each for at least 5 CP +U seconds... Alba_1: 5 wallclock secs ( 5.39 usr + 0.00 sys = 5.39 CPU) @ 12 +14.06/s (n=6545) Alba_2: 6 wallclock secs ( 5.41 usr + 0.00 sys = 5.41 CPU) @ 14 +25.64/s (n=7707) rbi_fortran: 5 wallclock secs ( 5.37 usr + 0.00 sys = 5.37 CPU) @ 1 +540.09/s (n=8278)` [download] Basically no effect. Any other ideas? -- I'd like to be able to assign to an luser	[reply] [d/l] [select]
Re: Re: how can I speed this up? by rbi (Monk) on Feb 24, 2001 at 03:00 UTC
Thanks for the answers and comments so far. I did test all the suggestions too, and I found that the things were worsening about 20% (no profiles, I still have to learn that, just used ps...) with: 1. use of `unpack` (quite clean coding though) 2. extracting the strings from `$record` and "printing" the exponetial format with, more or less, this code: `... foreach $it (1..$nt) { $x11 = substr($record,$lmina,1); $x12 = substr($record,$lmina+1,4); $ib = substr($record,$lminb,2); if ($x11 == 0 and $x12 == 0 and $ib == 99) { $values[$it-1] = $val_missing; } else { if (($ib % 2) == 0) { $x2 = $ib/2; } else { $x2 = -($ib+1)/2; } $values[$it-1] = sprintf "%u.%4.4uE%3.3d\n",$x11,$x12,$x2; } $lmina = $lmina + 5; $lminb = $lminb + 2; } return @values; }` [download] There is a 20% improvement by removing the subtractions when indexing (avoiding things like `$values[$it-1]`) There is no gain by preparing a `@tmp` array and make just one printing per record Roberto	[reply] [d/l] [select]