Re: Re: how do I line-wrap while copying to stdout?

Re: Re: Re: how do I line-wrap while copying to stdout? by merlyn (Sage) on Apr 20, 2001 at 16:41 UTC
You might also benchmark a version that's not destructive to the string, which could possibly win for long strings: `print "$1\n" while /\G(.{1,80})/gs;` The advantage here is that the long string in `$_` can remain idle and yet scanned, while the substr version requires constant shifting and shortening. And just to factor out the multiple prints, also try: `print map "$_\n", /\G(.{1,80})/gs;` -- Randal L. Schwartz, Perl hacker
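Before timing anything, it may be worth checking that the two approaches really produce the same chunks. The following is only a minimal sketch: the sub names `wrap_regex` and `wrap_substr` and the `$width` parameter are made up for illustration and do not come from the thread. Like the one-liners above, both ignore newlines already present in the text.

```perl
use strict;
use warnings;

# Non-destructive: scan the string with \G and leave it untouched.
sub wrap_regex {
    my ($text, $width) = @_;
    return map { "$_\n" } $text =~ /\G(.{1,$width})/gs;
}

# Destructive: repeatedly chop the first $width characters off a copy.
sub wrap_substr {
    my ($text, $width) = @_;
    my @lines;
    push @lines, substr($text, 0, $width, '') . "\n" while length $text;
    return @lines;
}

my $text = "abcde " x 30;    # 180 characters of sample input
print wrap_regex($text, 80);
```

Both subs work on their own copy of the argument, so the caller's string is never modified; the difference being benchmarked is only how the copy is consumed.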
Re:{4} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 20, 2001 at 17:22 UTC
Well, why not?

    use Benchmark;
    undef $/;
    open DATA, "/home/jeroen/texs/review/reviewnew.tex" or die $!;  # just some lengthy manuscript
    $str = <DATA>;
    open DUMP, ">/dev/null";
    timethese( -1, {
        'regex'  => sub { $a = $str; print DUMP map "$_\n", $a=~/\G(.{1,80})/gs; },
        'substr' => sub { $a = $str; $b=''; $b .= substr( $a, 0, 80, '')."\n" while length($a) >80; print DUMP "$b$a"; }
    });

This gives:

    Benchmark: running regex, substr, each for at least 1 CPU seconds...
         regex:  1 wallclock secs ( 1.05 usr +  0.01 sys =  1.06 CPU) @ 287.74/s (n=305)
        substr:  1 wallclock secs ( 1.26 usr +  0.01 sys =  1.27 CPU) @ 556.69/s (n=707)

This changes a bit when leaving the map out:

    Benchmark: running regex, substr, each for at least 1 CPU seconds...
         regex:  2 wallclock secs ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 396.23/s (n=420)
        substr:  2 wallclock secs ( 1.04 usr +  0.02 sys =  1.06 CPU) @ 547.17/s (n=580)

I left the print out of the substr version at first (I didn't want to test print), but putting it back in gives:

    Benchmark: running regex, substr, each for at least 1 CPU seconds...
         regex:  1 wallclock secs ( 1.30 usr +  0.01 sys =  1.31 CPU) @ 366.41/s (n=480)
        substr:  1 wallclock secs ( 1.06 usr +  0.02 sys =  1.08 CPU) @ 473.15/s (n=511)

I added a better comparison:

             Rate regmap  regex fixreg substr
    regmap  285/s     --   -19%   -22%   -56%
    regex   350/s    23%     --    -4%   -46%
    fixreg  364/s    28%     4%     --   -44%
    substr  650/s   128%    86%    78%     --

Apparently, substr is simply too efficient compared to the regex. And the print is only a bit inefficient compared to storing stuff in memory.

Jeroen
"We are not alone"(FZ)
Re (tilly) 5: how do I line-wrap while copying to stdout? by tilly (Archbishop) on Apr 20, 2001 at 18:18 UTC
As merlyn mentioned, his change is only a win on strings which are so long that repeated editing of the string is a loss. Most text files will be a loss. Try a large file full of lines that are a few thousand characters long on average and see if the substr solution isn't getting slowed down...
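One hedged way to try this without hunting for a suitable file is to build a synthetic string and rerun the comparison on it. This is a rough sketch only: the sizes (500 lines of 4000 characters) are arbitrary assumptions, the lexical names are mine, and no timings are claimed here since they will depend on the perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input: 500 lines of 4000 characters each (sizes chosen arbitrarily).
my $str = join '', map { ('x' x 4000) . "\n" } 1 .. 500;

# Assumes a Unix-like system where /dev/null exists, as in the benchmarks above.
open my $dump, '>', '/dev/null' or die "Cannot open /dev/null: $!";

cmpthese(-1, {
    regex => sub {
        my $copy = $str;
        # Scan the copy in place; the string itself is never edited.
        print {$dump} "$1\n" while $copy =~ /\G(.{1,80})/gs;
    },
    substr => sub {
        my $copy = $str;
        my $out  = '';
        # Remove 80 characters from the front of the copy on every pass.
        $out .= substr($copy, 0, 80, '') . "\n" while length($copy) > 80;
        print {$dump} "$out$copy";
    },
});
```

Raising the line count or line length in the `map` is the knob to turn when looking for the point where repeated editing of the string starts to hurt.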
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 20, 2001 at 18:36 UTC
Re (tilly) 8: how do I line-wrap while copying to stdout? by tilly (Archbishop) on Apr 20, 2001 at 19:20 UTC
Re: Re:{4} how do I line-wrap while copying to stdout? by Rhandom (Curate) on Apr 20, 2001 at 17:58 UTC
Too efficient? Execution efficiency, programming efficiency, or maintenance efficiency? There are many types. A regex may not always be as fast (though in many cases it is faster -- try using index and substr to find the spaces between words), but in most instances regexes are more maintainable, more readable, and faster to write.

    use Benchmark qw(cmpthese);
    my $n = 1000;
    open(STDERR,">/dev/null");
    cmpthese (1000, {
      match => sub {
        local $_ = "abcde " x $n;
        print STDERR "$1\n" while /\G(.{1,80})/gs;
      },
      swap => sub {
        local $_ = "abcde " x $n;
        s/\G(.{1,80})/$1\n/gs;
        print STDERR $_;
      },
      subst => sub {
        local $a = "abcde " x $n;
        $b='';
        $b .= substr( $a, 0, 80, '')."\n" while length($a) >80;
        print STDERR "$b$a";
      },
    });

Produces

    Benchmark: timing 1000 iterations of match, subst, swap...
         match:  1 wallclock secs ( 0.97 usr +  0.00 sys =  0.97 CPU) @ 1030.93/s (n=1000)
         subst:  1 wallclock secs ( 0.58 usr +  0.00 sys =  0.58 CPU) @ 1724.14/s (n=1000)
          swap:  1 wallclock secs ( 0.75 usr +  0.00 sys =  0.75 CPU) @ 1333.33/s (n=1000)
            Rate match  swap subst
    match 1031/s    --  -23%  -40%
    swap  1333/s   29%    --  -23%
    subst 1724/s   67%   29%    --

The substr method is faster this time, but if the task gets any more complex than this, a regex will do just fine. If it is done once per script, will you notice the difference between 1700 per second and 1300 per second? Maybe.
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 20, 2001 at 18:09 UTC
Re: Re:{6} how do I line-wrap while copying to stdout? by Rhandom (Curate) on Apr 20, 2001 at 18:26 UTC
Re: Re:{4} how do I line-wrap while copying to stdout? by petral (Curate) on Apr 24, 2001 at 00:40 UTC
Just to make a little more trouble -- what about: `local $,=$\;` and either `print /(.{1,80})/g` or `print grep /./, split /(.{1,80})/`

p
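For anyone unfamiliar with the two variables: `$,` is inserted between the items of a `print` list and `$\` is appended after it. The snippet above copies `$\` into `$,`, so it presumably relies on `$\` already being a newline (for example via the `-l` switch); that is my assumption, not something stated in the post. A minimal sketch that sets both explicitly, with a made-up sub name and an in-memory filehandle so the result can be inspected:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap at 80 columns using print's own separators.
sub wrap_with_separators {
    my ($text) = @_;
    my $out = '';
    open my $fh, '>', \$out or die "Cannot open in-memory handle: $!";
    {
        local $\ = "\n";    # appended after each print statement
        local $, = $\;      # inserted between the items being printed
        # No /s here, so '.' stops at existing newlines.
        print {$fh} $text =~ /(.{1,80})/g;
    }
    close $fh;
    return $out;
}

print wrap_with_separators("a" x 100);
```

The appeal of this variant is that no per-chunk string such as `"$_\n"` is built; `print` adds the separators while writing the list.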
Re: Re: Re:{4} how do I line-wrap while copying to stdout? by merlyn (Sage) on Apr 24, 2001 at 00:43 UTC
Re: Re: Re: Re:{4} how do I line-wrap while copying to stdout? by petral (Curate) on Apr 24, 2001 at 01:04 UTC
Re: Re:{4} how do I line-wrap while copying to stdout? by chipmunk (Parson) on Apr 20, 2001 at 23:25 UTC
Note that neither of these solutions wraps the text properly, because neither one accounts for the newlines already present in the text. Suppose you were wrapping the following at 10 characters:

    abcdefghijklm
    nopqrstuvwxyz

The proper result is:

    abcdefghij
    klm
    nopqrstuvw
    xyz

However, the solutions in the Benchmark will produce:

    abcdefghij
    klm
    nopqrs
    tuvwxyz

The regex solution is easy to fix: `print DUMP map "$_\n", $a=~/(.{1,80})/g;` Without the /s modifier, `.` no longer matches a newline, so each existing line is wrapped on its own. The substr() solution requires more work to get right, such as splitting on newlines and wrapping each line separately, or sticking any partial lines back onto the beginning of the string after each substr().
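The first of those two substr() repairs, splitting on newlines and wrapping each line separately, might look roughly like this. It is a sketch under my own assumptions: the sub name `wrap_lines_substr` and the `$width` parameter are invented for illustration and are not from any post in this thread.

```perl
use strict;
use warnings;

# Wrap each existing line separately, so that a newline already in the
# text always starts a fresh output line.
sub wrap_lines_substr {
    my ($text, $width) = @_;
    my $out = '';
    for my $line (split /\n/, $text) {
        my $rest = $line;    # work on a copy of this line only
        # Chop full-width pieces off the front of the current line.
        $out .= substr($rest, 0, $width, '') . "\n" while length($rest) > $width;
        $out .= "$rest\n";   # whatever is left, possibly an empty line
    }
    return $out;
}

print wrap_lines_substr("abcdefghijklm\nnopqrstuvwxyz\n", 10);
```

For the 10-character example above this prints the four lines given as the proper result. One behavioural difference from the regex fix: blank lines in the middle of the input are kept here, whereas `/(.{1,80})/g` skips them because `.{1,80}` needs at least one character to match.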
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 21, 2001 at 02:15 UTC
Re: Re: Re: Re: how do I line-wrap while copying to stdout? by Rhandom (Curate) on Apr 20, 2001 at 17:46 UTC
As above (in merlyn's code) but with a swap: `s/\G(.{1,80})/$1\n/gs; print;` Maybe I should benchmark that. This loses the advantage of leaving the long string untouched, and it puts multiple lines into one variable, but that might be OK. Couldn't help but benchmark this thing...

    use Benchmark qw(cmpthese);
    open(STDERR,">/dev/null");
    cmpthese (10000, {
      match => sub {
        local $_ = "abcde " x 100;
        print STDERR "$1\n" while /\G(.{1,80})/gs;
      },
      swap => sub {
        local $_ = "abcde " x 100;
        s/\G(.{1,80})/$1\n/gs;
        print STDERR $_;
      },
    });

Produces

    Benchmark: timing 10000 iterations of match, swap...
         match:  1 wallclock secs ( 1.26 usr +  0.01 sys =  1.27 CPU) @ 7874.02/s (n=10000)
          swap:  1 wallclock secs ( 1.01 usr +  0.01 sys =  1.02 CPU) @ 9803.92/s (n=10000)
            Rate match  swap
    match 7874/s    --  -20%
    swap  9804/s   25%    --

So the swap will save you time (if you are nitpicky about speed).