G'day Mano,
Welcome to the Monastery.
In general, calling print 20,000 times with individual records will be slower than calling it once with all records.
I ran the following Benchmark several times.
#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

use constant {
    LINES  => 20_000,
    RECORD => 'X' x 100 . "\n",
};

use Benchmark 'cmpthese';

open my $fh, '>>', '/dev/null';

cmpthese 0 => {
    singly => sub { print $fh RECORD for 1 .. LINES; },
    concat => sub { print $fh join '', (RECORD) x LINES; },
    list   => sub { print $fh +(RECORD) x LINES; },
    string => sub { print $fh RECORD x LINES; },
};
Here's a representative result:
         Rate singly   list concat string
singly  437/s     --   -64%   -71%   -91%
list   1205/s   176%     --   -20%   -74%
concat 1497/s   243%    24%     --   -68%
string 4720/s   981%   292%   215%     --
You didn't give any indication of record size (error messages can vary wildly in length): I just used 100 'X's (plus a newline). If that's a reasonable guess, the full output is 20,000 × 101 bytes, or about 2MB; I don't imagine you'd have any problem with that much data (either holding it in memory or passing it to print).
As you can see, printing every record singly was slower than the other methods. A single print with concatenated records appears a little faster than using a list; however, that wasn't the case in all runs: I'd consider these too close to call. Also bear in mind that, because I've used constant values, Perl may have performed some optimisations at compile time. Consider what other code is involved as you capture records and add them to a string or use them to populate an array.
There are some other factors to take into consideration. Is this a one-off run? If not, how frequently is it run? How long does the entire process take to run? Is it being run by multiple processes at the same time? Are there other users on the system? How might this affect them?
Although printing records individually may be slower in the benchmark scenario I present, this method, if done correctly, should have a substantially smaller memory footprint. In addition, spreading the printing tasks over the life of the process may mean it plays more nicely with other, concurrent processes.
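Here's a rough sketch of that trade-off. The make_source() routine is purely hypothetical: it stands in for whatever code actually produces your records, and the record text is made up. Both print routines write the same output; they differ only in how much data they hold at any one time. It assumes a Unix-like system with /dev/null.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical record source (an assumption, not your real code):
# returns a closure that yields $count records, one per call,
# then undef when exhausted.
sub make_source {
    my ($count) = @_;
    my $n = 0;
    return sub {
        return if $n >= $count;
        return sprintf "record %d\n", ++$n;
    };
}

# Print each record as it arrives: only one record is held at a time.
sub print_singly {
    my ($fh, $next) = @_;
    my $printed = 0;
    while (defined(my $rec = $next->())) {
        print {$fh} $rec;
        ++$printed;
    }
    return $printed;
}

# Collect everything first, then print once: a single print call,
# but the whole output sits in memory until the end.
sub print_buffered {
    my ($fh, $next) = @_;
    my $buffer = '';
    my $count  = 0;
    while (defined(my $rec = $next->())) {
        $buffer .= $rec;
        ++$count;
    }
    print {$fh} $buffer;
    return $count;
}

open my $null, '>', '/dev/null' or die "Cannot open /dev/null: $!";
print_singly($null, make_source(20_000));
print_buffered($null, make_source(20_000));
close $null;
```

Note that Perl buffers file output anyway, so printing singly doesn't necessarily mean one write to disk per record; the cost measured in the benchmark is largely the overhead of the print calls themselves.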
There's a fair amount to think about. I'd recommend writing your own benchmark, using more representative data, and running it in an environment that's closer to one in which the code will actually be run.
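As a possible starting point, here's a variation on my benchmark that builds the records at runtime, so compile-time optimisation of constants shouldn't skew the results. The record count and the length range are guesses on my part: substitute figures taken from your real data. It also assumes /dev/null exists; point $fh at a real file if you want to include disk I/O.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

use Benchmark 'cmpthese';

# Assumed, illustrative values: replace with figures from your own data.
my $lines   = 20_000;
my $min_len = 20;
my $max_len = 200;

# Build records of varying length at runtime.
srand 42;    # fixed seed, so repeated runs use the same data
my @records = map {
    my $len = $min_len + int rand($max_len - $min_len + 1);
    ('X' x $len) . "\n";
} 1 .. $lines;

open my $fh, '>>', '/dev/null';

# A negative count means: run each case for at least that many CPU seconds.
cmpthese -1 => {
    singly => sub { print $fh $_ for @records; },
    list   => sub { print $fh @records; },
    concat => sub { print $fh join '', @records; },
    append => sub {
        my $buffer = '';
        $buffer .= $_ for @records;
        print $fh $buffer;
    },
};
```

The append case is there because, in real code, you'd probably be adding to a string as records are captured rather than joining a ready-made list.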
See also: "perlperf - Perl Performance and Optimization Techniques".
— Ken
In reply to Re: Performance In Perl by kcott, in thread Performance In Perl by Mano_Man