kr123 has asked for the wisdom of the Perl Monks concerning the following question:

I've got a PERL script that gets its data on STDIN from another program. My PERL script takes varying amounts of time to process each line of data, while the program writing to my script is very finicky and requires that its data be written out quickly.

Is there some way to increase the size of the STDIN buffer (or the buffer of filehandles in general) to accommodate this, or must I create an intermediary program to act as the buffer?

  • Comment on Increasing Buffer size of STDIN on PERL Program

Replies are listed 'Best First'.
Re: Increasing Buffer size of STDIN on PERL Program
by sgifford (Prior) on Sep 09, 2003 at 22:45 UTC
    Sure you can.
    #!/usr/bin/perl -w
    use strict;
    use FileHandle;
    use constant BUFSIZE => 12000;
    use vars qw($buf);
    $buf = "A" x BUFSIZE;
    STDIN->setvbuf($buf, _IOFBF, BUFSIZE)
        or die "Couldn't setvbuf\n";
    while (<>) {
        print "This is a line.\n";
    }
    produces:
    $ strace -e read -f perl /tmp/t101 </etc/passwd
    ...
    read(0, "root:x:0:0:root:/root:/bin/bash\n"..., 12000) = 1482
    This is a line.
    ...
    read(0, "", 12000) = 0
Re: Increasing Buffer size of STDIN on PERL Program
by Zaxo (Archbishop) on Sep 09, 2003 at 22:42 UTC

    The libc function setvbuf() does what you want. Perl exposes that function in the IO::Handle module. There is a note in the docs saying that setvbuf is not imported by default in Perl 5.8.0 and later, because of PerlIO's bypassing of stdlib.

    You should test for whatever portability matters to you.
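A minimal run-time probe along those lines, assuming nothing about the platform (the 16K buffer size here is an arbitrary choice for illustration):

```perl
use strict;
use warnings;
use IO::Handle;

# Probe for setvbuf at run time rather than assuming it exists; on
# perls built with PerlIO (5.8 and later) it may simply be unavailable,
# in which case the method call or the _IOFBF constant will die.
my $bufsize = 16384;            # arbitrary size for the example
my $buffer  = "\0" x $bufsize;
my $can_setvbuf = eval {
    STDIN->setvbuf($buffer, IO::Handle::_IOFBF(), $bufsize);
} ? 1 : 0;
print $can_setvbuf
    ? "setvbuf succeeded\n"
    : "setvbuf unavailable here; relying on default buffering\n";
```

Wrapping the call in eval means the script degrades gracefully instead of dying on perls where setvbuf was compiled out.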

    After Compline,
    Zaxo

Re: Increasing Buffer size of STDIN on PERL Program
by Roger (Parson) on Sep 10, 2003 at 00:08 UTC
    Try the following code....

    {
        local $/;    # turn off input record separator
        $input = <>;
    }
    This reads everything from <STDIN> into the variable $input in one go. I believe this is the quickest method of reading input.
      Hi O Monks, Changing $/ looks like a cool way of getting your data in. However, are there any occasions where you could run into problems? I notice you use "local"; would "my" be even safer within those brackets? Thank you for your reply.
        The reason I use local $/ rather than my $/ comes down to the difference between how local and my work.

        The keyword local has been around for a long time, well before my (which was only introduced in Perl 5).

        The difference is that local gives run-time (dynamic) scoping, while my gives compile-time (lexical) scoping. local $/ first saves the current value of $/, and when execution leaves the scope it automatically restores that previous value. my $/ will not work here at all; it is a compile-time error, because punctuation variables such as $/ cannot be declared with my and must be localized with local. That's a limitation (or feature?) of Perl: local reuses a variable that already exists in the global context.

        This method is not efficient if you have lots of data (hundreds of MB) pouring out of the other process, because it has to allocate enough memory to hold the entire input.
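A tiny demonstration of that save-and-restore behaviour (the variable names are made up for the example):

```perl
use strict;
use warnings;

$/ = "\n";                 # the usual input record separator
my $inside;
{
    local $/;              # undef inside this block: slurp mode
    $inside = defined($/) ? "defined" : "undef";
}
# back outside the block, $/ has been restored automatically
my $outside = (defined($/) && $/ eq "\n") ? "restored" : "clobbered";
print "inside: $inside, outside: $outside\n";
```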

      This is only advisable if the input data will fit into the available memory without compromising other processes at the same time.
      You don't want the system to start paging heavily or exhaust memory just for the sake of having all data in memory at once.

      Slurping files can sometimes be advantageous.
      Other times using read() or sysread() to read large blocks of data might do the trick.
      Personally I still find myself using a simple  while (<FH>){...} for most jobs, from logfiles to multi-gigabyte datasets.
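For the block-read approach, here is one hedged sketch using read() on an in-memory handle (sysread() works the same way on a real stream); the 8-byte chunk size is deliberately tiny so the demo exercises the carry-over logic:

```perl
use strict;
use warnings;

# Sketch: read fixed-size chunks and split lines yourself, carrying any
# partial trailing line over to the next chunk.
my @lines;
my $carry = '';
open my $fh, '<', \"alpha\nbeta\ngamma\n" or die $!;  # in-memory handle for the demo
while (my $n = read($fh, my $chunk, 8)) {
    $carry .= $chunk;
    while ($carry =~ s/^(.*?\n)//) {
        push @lines, $1;            # one complete line at a time
    }
}
push @lines, $carry if length $carry;   # trailing data without a newline
print scalar(@lines), " lines\n";
```

With a real pipe you would swap the in-memory handle for STDIN and pick a much larger chunk size.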

      Cheers,

      BazB


      If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong.
      That way everyone learns.

Re: Increasing Buffer size of STDIN on PERL Program
by Abigail-II (Bishop) on Sep 09, 2003 at 22:29 UTC
    That would be the buffer size of the pipe you are using. What the limit of that buffer is, and how to modify it, is system dependent. On some OSes it might require a reboot, or even a kernel rebuild.

    Oh, and it's Perl, P-e-r-l, just one capital.

    Abigail

Current Solution
by kr123 (Novice) on Sep 10, 2003 at 12:40 UTC

    Thanks for all the quick responses.

    Unfortunately slurping the data won't work in my application since the data is a neverending stream -- I'm getting data straight from syslog.

    I followed the initial suggestions of using setvbuf and found that on my Solaris 8 machine with Perl 5.6 my maximum usable buffer size is 10,240 characters.

    Though this isn't as large as I'd have liked (something bordering on 500K would be nice), the average length of my incoming syslog messages is 136 characters, so a buffer of this size lets me hold about 75 lines.
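If a bigger cushion than setvbuf allows is ever needed, the intermediary-program route from the original question can be sketched roughly like this: a relay that reads eagerly from the producer, queues the bytes in memory, and drains them whenever the slow consumer will accept a write. This is only a sketch under assumed requirements (the function name and chunk size are made up), not production code:

```perl
use strict;
use warnings;
use IO::Select;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

# relay($in, $out): read from $in as eagerly as possible, queue the
# bytes in memory, and drain them to $out whenever it is writable.
sub relay {
    my ($in, $out) = @_;
    for my $fh ($in, $out) {
        my $flags = fcntl($fh, F_GETFL, 0) or die "fcntl: $!";
        fcntl($fh, F_SETFL, $flags | O_NONBLOCK) or die "fcntl: $!";
    }
    my $pending = '';          # the unbounded in-memory buffer
    my $eof     = 0;
    my $rsel    = IO::Select->new($in);
    my $wsel    = IO::Select->new($out);
    while (!$eof || length $pending) {
        my ($r, $w) = IO::Select->select(
            $eof            ? undef : $rsel,
            length $pending ? $wsel : undef,
            undef,
        );
        if ($r && @$r) {
            # append up to 64K at the end of the queued data
            my $n = sysread($in, $pending, 65536, length $pending);
            $eof = 1 if defined $n && $n == 0;
        }
        if ($w && @$w) {
            # write whatever the consumer will take, trim it off the queue
            my $n = syswrite($out, $pending);
            substr($pending, 0, $n) = '' if defined $n && $n > 0;
        }
    }
}
```

In real use this would sit between the two programs (producer | relay | worker) with $in = \*STDIN and $out = \*STDOUT. Note the queue is unbounded, so a runaway producer can still exhaust memory.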

Slurp it all
by jonadab (Parson) on Sep 10, 2003 at 03:08 UTC

    Will the whole thing fit in memory? Just read from the filehandle in list context...

    my @input = <STDIN>;
    for (@input) {
        process_line($_);
    }

    Now your Perl script can take as long as it likes to process the data; the other program has finished and probably exited, but so what?


    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/