Re:{4} how do I line-wrap while copying to stdout?

Re (tilly) 5: how do I line-wrap while copying to stdout? by tilly (Archbishop) on Apr 20, 2001 at 18:18 UTC
As merlyn mentioned, his change is only a win on strings so long that repeatedly editing the string becomes expensive. On most text files it will be a loss. Try a large file full of lines that are a few thousand characters long on average and see if the substr solution isn't getting slowed down...
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 20, 2001 at 18:36 UTC
By undefining $/, the file is read in as one huge string. Of course I checked, and with an n=1000 string I get:

                Rate  regex substr
    regex   21775/s     --   -27%
    substr  29748/s    37%     --

Which makes sense, as my file has some 50k chars.

Jeroen
"We are not alone"(FZ)

Update: (I'm not going to make a Re:{9} post) I'd say, start checking the source {grin}

    $str='a'x1E6; =>
               Rate  regex substr
    regex    15.1/s     --   -16%
    substr   18.0/s    19%     --

    $str='a'x1E7; =>
               Rate  regex substr
    regex    1.41/s     --   -21%
    substr   1.79/s    27%     --

At 100M, I'm testing my swap ;-)..... I tried it nevertheless, but now I'm waiting for my box to stop swapping... /me is afraid that may take a while... :-)... finally, I had to use that reboot button :-<

25M still went OK:

            s/iter  regex substr
    regex     1.80     --   -21%
    substr    1.42    27%     --

At 50M, Benchmark produced a division by zero .....
Re (tilly) 8: how do I line-wrap while copying to stdout? by tilly (Archbishop) on Apr 20, 2001 at 19:20 UTC
Oops, missed that. Still, the principle that merlyn stated is correct, and ignoring it is a common performance mistake in parsing. Try it with a 1 MB string. If substr still wins, then I guarantee you that someone implemented a buffering strategy for strings and substr where a destructive substr at the beginning of a string just moves indexes around and does not recopy. Perl plays a lot of games like that; for instance, that is why push, pop and friends are fast. I am just a little surprised to see it played on strings...
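To try the 1 MB experiment on your own perl, here is a minimal sketch. The helper names `wrap_regex` and `wrap_substr` are made up for illustration and are not from the thread's benchmark. It first confirms that both approaches give the same output, then lets Benchmark compare them. Which one wins may depend on the perl version and build, so no result is quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helpers: both wrap a newline-free string at $width columns.
sub wrap_regex {
    my ($text, $width) = @_;
    my $out = '';
    # Scalar-context /g walks the string via pos() and leaves $text unmodified.
    $out .= "$1\n" while $text =~ /\G(.{1,$width})/gs;
    return $out;
}

sub wrap_substr {
    my ($text, $width) = @_;
    my $out = '';
    # 4-argument substr removes the first $width characters and returns them.
    $out .= substr($text, 0, $width, '') . "\n" while length($text) > $width;
    $out .= "$text\n" if length $text;
    return $out;
}

my $str = 'a' x 1_000_000;    # the 1 MB string suggested above
die "outputs differ" unless wrap_regex($str, 80) eq wrap_substr($str, 80);

# Run each candidate for at least 1 CPU second and print the comparison table.
cmpthese(-1, {
    regex  => sub { wrap_regex($str, 80) },
    substr => sub { wrap_substr($str, 80) },
});
```

Raising the string length in steps (1E6, 1E7, ...) shows whether the gap between the two grows, shrinks or stays flat on a given machine.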
(tye)Re2: how do I line-wrap while copying to stdout? by tye (Sage) on Apr 20, 2001 at 20:34 UTC
Re: Re:{4} how do I line-wrap while copying to stdout? by Rhandom (Curate) on Apr 20, 2001 at 17:58 UTC
Too efficient? Execution efficiency, programming efficiency, or maintenance efficiency? There are many types. A regex may not always be as fast (in many cases it is faster -- try using index and substr to find word spaces), but in most instances regexes are more maintainable, more readable and faster to write.

    use Benchmark qw(cmpthese);   # cmpthese is not exported by default
    my $n = 1000;
    open(STDERR,">/dev/null");
    cmpthese (1000, {
      match => sub {
        local $_ = "abcde " x $n;
        print STDERR "$1\n" while /\G(.{1,80})/gs;
      },
      swap => sub {
        local $_ = "abcde " x $n;
        s/\G(.{1,80})/$1\n/gs;
        print STDERR $_;
      },
      subst => sub {
        local $a = "abcde " x $n;
        $b='';
        $b .= substr( $a, 0, 80, '')."\n" while length($a) >80;
        print STDERR "$b$a";
      },
    });

Produces

    Benchmark: timing 1000 iterations of match, subst, swap...
         match:  1 wallclock secs ( 0.97 usr +  0.00 sys =  0.97 CPU) @ 1030.93/s (n=1000)
         subst:  1 wallclock secs ( 0.58 usr +  0.00 sys =  0.58 CPU) @ 1724.14/s (n=1000)
          swap:  1 wallclock secs ( 0.75 usr +  0.00 sys =  0.75 CPU) @ 1333.33/s (n=1000)
            Rate match  swap subst
    match 1031/s    --  -23%  -40%
    swap  1333/s   29%    --  -23%
    subst 1724/s   67%   29%    --

A substr method is faster this time, but if it gets any more complex than that, a regex will do just fine. If done once per script, will you notice the difference between 1700 per second and 1300 per second? Maybe.
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 20, 2001 at 18:09 UTC
Oh well, you are right about that: the difference is too small to be noticeable in normal use. I was just picking up merlyn's glove ;-}. I just added the cmpthese stats when I saw your posting. Take a look at them. Some nitbits: I would call the swap 'insert'. But that can be done by substr as well... benchmarks coming up....

             Rate inssub regmap  regex fixreg substr
    inssub 13.7/s     --   -95%   -97%   -97%   -97%
    regmap  260/s  1800%     --   -34%   -35%   -47%
    regex   395/s  2784%    52%     --    -1%   -20%
    fixreg  398/s  2809%    53%     1%     --   -19%
    substr  491/s  3483%    89%    24%    23%     --

That insert must be really inefficient :-)

Jeroen
"We are not alone"(FZ)

Let me add the new code:

    use Benchmark;
    undef $/;
    open DATA, "/home/jeroen/texs/review/reviewnew.tex" or die $!;
    $str = <DATA>;
    open DUMP, ">/dev/null";
    $result = timethese( -5, {
      'regex' => sub {
        $a = $str;
        $b = '';
        $b .= "$1\n" while $a=~/\G(.{1,80})/gs;
        print DUMP "$b";
      },
      'regmap' => sub {
        $a = $str;
        $b = '';
        print DUMP map "$_\n", $a=~/\G(.{1,80})/gs;
      },
      'fixreg'=> sub {
        $a = $str;
        $b = '';
        $b .= "$1\n" while $a=~/\G(.{1,80})/gos;
        print DUMP "$b";
      },
      'substr' => sub {
        $a = $str;
        $b='';
        $b .= substr( $a, 0, 80, '')."\n" while length($a) >80;
        print DUMP "$b$a";
      },
      'inssub' => sub {
        $a = $str;
        $idx = 0;
        substr( $a, $idx+=81, 0)="\n" while $idx< (length( $a) - 80 );
        print DUMP "$a";
      }
    }, 'none');
    Benchmark::cmpthese($result);
Re: Re:{6} how do I line-wrap while copying to stdout? by Rhandom (Curate) on Apr 20, 2001 at 18:26 UTC
No worries. I've just had lots of people rant about how fast substr and index are (and they are). That is true for simple cases (and for some complex cases, but you really don't want to be writing that kind of code, do you?).
Re: Re:{4} how do I line-wrap while copying to stdout? by petral (Curate) on Apr 24, 2001 at 00:40 UTC
Just to make a little more trouble -- what about: `local $,=$\;` and either `print /(.{1,80})/g` or `print grep /./, split /(.{1,80})/`

p
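A minimal sketch of the first variant, assuming `$\` already holds "\n" (for example because the script runs under `-l`). The helper name `wrap_with_separators` and the in-memory filehandle are illustrative additions, used here only so the printed result can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: prints the chunks through print's own separators
# and returns what was printed.
sub wrap_with_separators {
    my ($text, $width) = @_;
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
    {
        local $\ = "\n";    # assumption: output record separator is a newline
        local $, = $\;      # the trick above: field separator = record separator
        # List-context /g returns every captured chunk; print joins them with $,
        # and appends $\ after the last one.
        print {$fh} $text =~ /(.{1,$width})/g;
    }
    close $fh;
    return $out;
}

print wrap_with_separators('abcdefghij', 4);
# prints:
# abcd
# efgh
# ij
```

The appeal of this form is that no intermediate string is built by hand: print does the joining.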
Re: Re: Re:{4} how do I line-wrap while copying to stdout? by merlyn (Sage) on Apr 24, 2001 at 00:43 UTC
`print grep $_, split /(.{1,80})/`

I don't like that one. It throws away any line that comes out as "0".

-- Randal L. Schwartz, Perl hacker
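The objection can be reproduced with a short sketch. The helper names `chunks_by_truth` and `chunks_by_match` are hypothetical. Both filter the fields that split returns, one by truth value and one by pattern match.

```perl
use strict;
use warnings;

# split with a capturing group returns the captured chunks plus the
# (empty) fields between them, so the empties have to be filtered out.
sub chunks_by_truth {
    my ($text) = @_;
    my @fields = split /(.{1,80})/, $text;
    return grep { $_ } @fields;      # drops '' but also the string "0"
}

sub chunks_by_match {
    my ($text) = @_;
    my @fields = split /(.{1,80})/, $text;
    return grep { /./ } @fields;     # drops only fields with no characters
}

for my $line ('0', 'abc') {
    my @truth = chunks_by_truth($line);
    my @match = chunks_by_match($line);
    printf "%-5s truth=%d match=%d\n", "'$line'", scalar @truth, scalar @match;
}
# prints:
# '0'   truth=0 match=1
# 'abc' truth=1 match=1
```

A line consisting of the single character "0" is false in Perl, so the truth test silently discards it while the `/./` test keeps it.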
Re: Re: Re: Re:{4} how do I line-wrap while copying to stdout? by petral (Curate) on Apr 24, 2001 at 01:04 UTC
Right, changed it back to "/./".

p
Re: Re: Re: Re: Re:{4} how do I line-wrap while copying to stdout? by merlyn (Sage) on Apr 24, 2001 at 01:52 UTC
Re: Re:{4} how do I line-wrap while copying to stdout? by chipmunk (Parson) on Apr 20, 2001 at 23:25 UTC
Note that neither of these solutions wraps the text properly, because neither one accounts for the newlines already present in the text. Suppose you were wrapping the following at 10 characters:

    abcdefghijklm
    nopqrstuvwxyz

The proper result is:

    abcdefghij
    klm
    nopqrstuvw
    xyz

However, the solutions in the Benchmark will produce:

    abcdefghij
    klm
    nopqrs
    tuvwxyz

The regex solution is easy to fix: `print DUMP map "$_\n", $a=~/(.{1,80})/g;` The substr() solution requires more work to get right, such as splitting on newlines and wrapping each line separately, or sticking any partial lines back onto the beginning of the string after each substr().
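Here is a sketch of the three behaviours described above, using the same 10-character example. The helper names are invented for illustration. The substr variant takes the "split on newlines and wrap each line separately" route, which is only one of the two options mentioned.

```perl
use strict;
use warnings;

# Benchmarked behaviour: /s lets . match "\n", so existing newlines
# are counted as ordinary characters inside a chunk.
sub wrap_ignoring_newlines {
    my ($text, $width) = @_;
    return join '', map "$_\n", $text =~ /\G(.{1,$width})/gs;
}

# The regex fix: without /s and \G, a chunk can never span a newline.
sub wrap_per_line_regex {
    my ($text, $width) = @_;
    return join '', map "$_\n", $text =~ /(.{1,$width})/g;
}

# A substr variant: split on newlines first, then wrap each line.
sub wrap_per_line_substr {
    my ($text, $width) = @_;
    my $out = '';
    for my $line (split /\n/, $text) {
        $out .= substr($line, 0, $width, '') . "\n" while length($line) > $width;
        $out .= "$line\n";
    }
    return $out;
}

my $text = "abcdefghijklm\nnopqrstuvwxyz";
print wrap_ignoring_newlines($text, 10), "--\n";
print wrap_per_line_regex($text, 10),    "--\n";
print wrap_per_line_substr($text, 10);
```

The two per-line versions agree on this input, but they differ on blank lines: the regex fix skips an empty line entirely (there is nothing for `.{1,N}` to match), while the split-based substr version emits it.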
Re:{6} how do I line-wrap while copying to stdout? by jeroenes (Priest) on Apr 21, 2001 at 02:15 UTC
Look at the root node. There, the lines were coming in one by one, but some of them were too long. The benchmarked routines all work on one string. I took a file for it because that makes up a nice long string. Somewhere deeper I just took $str='a'x 1e7. Boils down to the same.

Jeroen
"We are not alone"(FZ)