in reply to Performance In Perl

Hello Mano_Man and welcome to the Monastery!

You'll certainly receive detailed answers, but my feeling is that the maximum length of a string is limited by your RAM: see Maximum string length

Then, in terms of performance, I suspect it is much better to print such messages as soon as possible, without accumulating them in a variable.

In fact, $toPrint will keep growing, consuming more RAM with every append you make to it.

In the other scenario, you do not even need a variable: you just print a string and it is gone.

With modern hardware, though, I suspect 20k lines are an affordable task, both to print and to read.
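To make the two approaches concrete, here is a minimal sketch. The variable name $toPrint comes from the original question; the loop body is made up purely for illustration:

```perl
use strict;
use warnings;

my $n = 20_000;    # roughly the workload discussed in the thread

# Variant 1: accumulate everything, then print once.
# Peak memory usage is the entire output held in $toPrint.
my $toPrint = '';
for my $i (1 .. $n) {
    $toPrint .= "line $i\n";
}
print $toPrint;

# Variant 2: print each line immediately.
# Peak memory usage is a single line, regardless of $n.
for my $i (1 .. $n) {
    print "line $i\n";
}
```

Both variants produce identical output; they differ only in how much of it is held in memory at once.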

L*

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Replies are listed 'Best First'.
Re^2: Performance In Perl
by vrk (Chaplain) on Mar 15, 2017 at 09:40 UTC

    There are really two questions to answer:

    1. Is it faster to call print/say once with several megabytes of data, or many more times with small amounts of data each time?
    2. How big a string can you build before you run out of memory?

    To answer the second question first, Perl has no built-in limits for string size. Try this simple test program using the splendid Devel::Size module:

    use strict;
    use warnings;
    use Devel::Size qw(size);

    my $str = '';
    for (1 .. 1_000_000) {
        $str .= "x" x 80;
    }
    print "ASCII string has @{[ length($str) ]} characters and consumes @{[ size($str) ]} bytes\n";

    $str = '';
    for (1 .. 1_000_000) {
        $str .= "\N{U+1234}" x 80;
    }
    print "Unicode string has @{[ length($str) ]} characters and consumes @{[ size($str) ]} bytes\n";

    On my machine, the output is

    ASCII string has 80000000 characters and consumes 89779352 bytes
    Unicode string has 80000000 characters and consumes 273984424 bytes

    So even with one million 80-character lines, you're only using a couple hundred megabytes of RAM.

    To answer the I/O speed question, you can benchmark it with a program like the one below:

    use strict;
    use warnings;
    use feature "say";
    use Benchmark;
    use Devel::Size qw(size);

    my $t0 = Benchmark->new;
    my $total_bytes_chunked = 0;
    for (1 .. 1_000_000) {
        my $str = 'x' x 80;
        #$total_bytes_chunked += size($str);
        say STDERR $str;
    }

    my $t1 = Benchmark->new;
    my $str = '';
    for (1 .. 1_000_000) {
        $str .= 'x' x 80 . "\n";
    }
    my $total_bytes_lump = 0;
    #$total_bytes_lump = size($str);
    print STDERR $str;
    my $t2 = Benchmark->new;

    say "Printing in small chunks ($total_bytes_chunked bytes): @{[ timestr(timediff($t1, $t0)) ]}";
    say "Printing one big chunk ($total_bytes_lump bytes): @{[ timestr(timediff($t2, $t1)) ]}";

    If you run it, redirect STDERR or the comparison is meaningless: perl test.pl 2>/dev/null. Be warned that it may be a false comparison nonetheless. On my machine, printing one big lump is faster than printing one million small chunks. However, if you uncomment the size() calls to see how much the total string sizes differ, you'll find the first loop suddenly takes four times longer, because it's doing a lot more calculation at each loop iteration.

    Probably the only right way to answer your question is to try both in your program and with your input and see which one performs faster. It really depends on how much you can afford to keep in memory and how much computation you need to do for each individual chunk to print.
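For that kind of try-it-yourself comparison, the core Benchmark module's cmpthese() is convenient, since it runs both variants and reports them side by side. A sketch, assuming a Unix-like system (output goes to /dev/null so terminal speed does not dominate the numbers; the counts and line width are arbitrary):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Discard the output; we only want to time the printing itself.
open my $fh, '>', '/dev/null' or die "open: $!";

cmpthese(-2, {    # negative count = run each variant for at least 2 CPU seconds
    chunked => sub {
        # Print each line as soon as it is built.
        print {$fh} 'x' x 80, "\n" for 1 .. 10_000;
    },
    lump => sub {
        # Accumulate everything, then print once.
        my $str = '';
        $str .= 'x' x 80 . "\n" for 1 .. 10_000;
        print {$fh} $str;
    },
});
```

cmpthese() prints a rate table with a percentage comparison, which makes the relative difference easier to read than raw timings.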

      Thank you for the speedy replies. I've just checked it - the difference is about 30% in performance, which is a lot. Of course, for small prints this is negligible. Thank you!