in reply to Re: Performance In Perl
in thread Performance In Perl
There are really two questions to answer: which way of printing gives better I/O speed, and whether Perl limits how large a string can get.
To answer the second question first, Perl has no built-in limits for string size. Try this simple test program using the splendid Devel::Size module:
    use strict;
    use warnings;
    use Devel::Size qw(size);

    my $str = '';
    for (1 .. 1_000_000) {
        $str .= "x" x 80;
    }
    print "ASCII string has @{[ length($str) ]} characters and consumes @{[ size($str) ]} bytes\n";

    $str = '';
    for (1 .. 1_000_000) {
        $str .= "\N{U+1234}" x 80;
    }
    print "Unicode string has @{[ length($str) ]} characters and consumes @{[ size($str) ]} bytes\n";
On my machine, the output is
    ASCII string has 80000000 characters and consumes 89779352 bytes
    Unicode string has 80000000 characters and consumes 273984424 bytes
So even with one million 80-character lines, you're only using about 90 MB of RAM for the ASCII string and about 274 MB for the Unicode one.
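The gap between the two numbers is mostly down to encoding. As a rough sketch (utf8_bytes is a hypothetical helper, not part of the test program above, and it uses the core Encode module rather than Devel::Size), you can check how many bytes one 80-character line needs as UTF-8:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of bytes a string occupies when encoded as UTF-8.
sub utf8_bytes {
    my ($str) = @_;
    return length(encode('UTF-8', $str));
}

my $ascii   = "x" x 80;
my $unicode = "\N{U+1234}" x 80;

printf "ASCII line:   %d characters, %d bytes as UTF-8\n",
    length($ascii), utf8_bytes($ascii);
printf "Unicode line: %d characters, %d bytes as UTF-8\n",
    length($unicode), utf8_bytes($unicode);
```

U+1234 takes three bytes in UTF-8, so one million such lines carry about 240 MB of character data. The remainder of what size() reports is, as far as I can tell, bookkeeping and spare buffer space that Perl allocates as the string grows.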
To answer the I/O speed question, you can benchmark it with a program like this one:
    use strict;
    use warnings;
    use feature "say";
    use Benchmark;
    use Devel::Size qw(size);

    my $t0 = Benchmark->new;
    my $total_bytes_chunked = 0;
    for (1 .. 1_000_000) {
        my $str = 'x' x 80;
        #$total_bytes_chunked += size($str);
        say STDERR $str;
    }
    my $t1 = Benchmark->new;

    my $str = '';
    for (1 .. 1_000_000) {
        $str .= 'x' x 80 . "\n";
    }
    my $total_bytes_lump = 0;
    #$total_bytes_lump = size($str);
    print STDERR $str;
    my $t2 = Benchmark->new;

    say "Printing in small chunks ($total_bytes_chunked bytes): @{[ timestr(timediff($t1, $t0)) ]}";
    say "Printing one big chunk ($total_bytes_lump bytes): @{[ timestr(timediff($t2, $t1)) ]}";
If you run it, redirect STDERR or the comparison is meaningless: perl test.pl 2>/dev/null. Be warned that it may be a false comparison nonetheless. On my machine, printing one big lump is faster than printing one million small chunks. However, if you uncomment the size() calls to see how much the total string sizes differ, you'll find the first loop suddenly takes four times longer, because it's doing a lot more calculation at each loop iteration.
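One way to make the comparison a bit less fragile is to let Benchmark repeat both variants and tabulate the rates. This is only a sketch: the sub names print_chunked and print_lump and the 100_000-line workload are my own assumptions, and the output goes to the null device, so it measures Perl's side of the I/O rather than a real disk or terminal.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Spec;

my $LINES = 100_000;    # assumed workload size; adjust to match your data

# Print each line as soon as it is built.
sub print_chunked {
    my ($fh, $lines) = @_;
    for (1 .. $lines) {
        print {$fh} 'x' x 80, "\n";
    }
    return;
}

# Build the whole output in memory, then print it once.
sub print_lump {
    my ($fh, $lines) = @_;
    my $str = '';
    $str .= 'x' x 80 . "\n" for 1 .. $lines;
    print {$fh} $str;
    return;
}

open my $null, '>', File::Spec->devnull
    or die "Cannot open null device: $!";

# A negative count asks Benchmark to run each case for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    chunked => sub { print_chunked($null, $LINES) },
    lump    => sub { print_lump($null, $LINES) },
});

close $null;
```

Both subs write exactly the same bytes, so any difference in the table comes from how the output is assembled and handed to print, not from what is printed.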
Probably the only right way to answer your question is to try both in your program and with your input and see which one performs faster. It really depends on how much you can afford to keep in memory and how much computation you need to do for each individual chunk to print.
Re^3: Performance In Perl
by Mano_Man (Acolyte) on Mar 15, 2017 at 09:45 UTC