Re: Reading from file, not to memory
by steves (Curate) on Feb 20, 2003 at 03:10 UTC
Ouch. You don't have to read all the lines in at once first. This line is doing exactly that, since the angle-bracket operator is invoked in list context:
my @file = <FH>;
To read a line at a time do this instead:
local *FH;
open(FH, "/path/to/file") || die "Can't open file: $!";
while (<FH>)
{
    # line is in $_
    if ($_ eq $external_info)
    {
        print "great!";
        last;
    }
}
close(FH);
I got what I needed -- thanks for the help!
Re: Reading from file, not to memory
by Zaxo (Archbishop) on Feb 20, 2003 at 03:21 UTC
You may be getting into swap by reading the whole file into an array. A while loop on the filehandle will read one line at a time into $_, keeping memory requirements to a minimum.
open my $fh, '<', '/path/to/file'
    or die $!;
while (<$fh>) {
    if ( $_ eq $external_info ) {
        print 'great!', $/;
        last;
    }
}
close $fh or die $!;
You will need to chomp at the top of the while loop if $external_info does not have a newline at the end.
I converted to 3-arg open, and used a lexical file handle for that. You can convert back to suit previous versions of perl.
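To make the chomp point concrete, here is a small, hedged sketch of the pattern, assuming $external_info holds the target line without a trailing newline (the file path and contents below are invented so the example runs standalone):

```perl
use strict;
use warnings;

# Hypothetical values for illustration only.
my $path          = '/tmp/chomp_demo.txt';
my $external_info = 'needle';

# Build a small test file so the sketch is self-contained.
open my $out, '>', $path or die $!;
print $out "hay\nneedle\nstack\n";
close $out or die $!;

open my $fh, '<', $path or die $!;
my $found = 0;
while (<$fh>) {
    chomp;    # strip the record separator before comparing
    if ($_ eq $external_info) {
        $found = 1;
        last;
    }
}
close $fh or die $!;

print $found ? "great!\n" : "no match\n";
```

Without the chomp, $_ would be "needle\n" and the eq comparison would never succeed.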
After Compline, Zaxo
Re: Reading from file, not to memory
by rob_au (Abbot) on Feb 20, 2003 at 10:30 UTC
Another option which I'm surprised hasn't been mentioned yet is the use of the Tie::File module - This module, part of the standard 5.8.0 distribution, allows lines of a file to be manipulated via a tied-array without requiring the entire file to be read into memory. Additionally, the amount of memory used for read caching and write buffering can be controlled by the memory argument to the tie constructor.
Using this module, your code might look like:
# Tie the array @file to /path/to/file using a maximum of 1Mb of memory
tie my @file, 'Tie::File', '/path/to/file', 'memory' => 1_000_000
    or die "Cannot open file - $!";

foreach my $line ( @file )
{
    # ... Your code follows ...
}
This module is quite stable, despite the beta label given to it by its author, and works exceptionally well in a production environment (even under 5.005.03).
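As a small, hedged sketch of the tied-array behaviour described above (the file name and contents are invented for the demonstration), reads are fetched on demand and writes land directly in the file:

```perl
use strict;
use warnings;
use Tie::File;

# Throwaway file name and contents, made up for this illustration.
my $path = '/tmp/tie_file_demo.txt';
open my $out, '>', $path or die $!;
print $out "alpha\nbeta\ngamma\n";
close $out or die $!;

# Tie the file to an array; lines are fetched and stored on demand,
# with at most ~1MB used for read caching.
tie my @lines, 'Tie::File', $path, memory => 1_000_000
    or die "Cannot tie file - $!";

print scalar(@lines), " lines\n";    # line count without slurping the file
$lines[1] = 'BETA';                  # rewrites the second line on disk
untie @lines;
```

Note that Tie::File strips the record separator when fetching, so comparisons against tied elements do not need a chomp.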
perl -le 'print+unpack("N",pack("B32","00000000000000000000001000110010"))'
Re: Reading from file, not to memory
by Anonymous Monk on Feb 20, 2003 at 08:23 UTC
You don't need the array, you can just use a while statement on the filehandle.
e.g.
while (<FH>)
{
    if ($_ eq $external_info) {
        print "great!";
        last;
    }
}
Re: Reading from file, not to memory
by physi (Friar) on Feb 20, 2003 at 09:09 UTC
You can set the INPUT_RECORD_SEPARATOR $/ to your $external_info.
$/ = $external_info;
open(FH, "/path/to/file") || die "Can't open file: $!";
while (<FH>) {
    if (length($_) != (stat FH)[7]) {
        print "great\n";
        last;
    }
}
The bad thing about this approach is that, again, the whole file goes into memory if $external_info is not in the file at all...
So the other solutions might be a bit better ;-)
-----------------------------------
--the good, the bad and the physi--
-----------------------------------
Re: Reading from file, not to memory
by pfaut (Priest) on Feb 20, 2003 at 03:14 UTC
When you reference a file handle in list context (by assigning to an array), you read the whole file into memory at once. You can instead read the file line by line by using the file handle in scalar context (by assigning to a scalar). Instead of this...
open(FH, "/path/to/file") || die "Can't open file: $!";
my @file = <FH>;
close FH;
...try this...
open(FH, "/path/to/file") || die "Can't open file: $!";
foreach my $file (<FH>) {
    if ($file eq $external_info) {
        print "great!";
        last;
    }
}
close FH;
---
print map { my ($m)=1<<hex($_)&11?' ':'';
$m.=substr('AHJPacehklnorstu',hex($_),1) }
split //,'2fde0abe76c36c914586c';
You have to be careful here, because you are doing exactly the same thing. The foreach statement is also accessing <FH> in list context, reading the whole file into an anonymous array and then iterating over it. You have to use a while loop for this to work correctly.
Try out the following bit of code to test this:
use Benchmark;

timethese(1, {
    'Trial1 While' => sub {
        open (FILE, "file2") or die "Can't open file: $!\n";
        while (<FILE>) {
            last;    # read one line and exit
        }
        close FILE;
    },
    'Trial2 Foreach' => sub {
        open (FILE, "file1") or die "Can't open file: $!\n";
        foreach my $line (<FILE>) {
            last;    # read one line and exit
        }
        close FILE;
    },
});
Make sure that file1 and file2 are identical (I used two files so that we know there is no caching going on), and that they are large text files. I got the following results with two 50MB files:
Benchmark: timing 1 iterations of Trial1 While, Trial2 Foreach...
Trial1 While:    0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
Trial2 Foreach: 17 wallclock secs (10.77 usr +  1.56 sys = 12.33 CPU) @ 0.08/s (n=1)
my $file = $0;
open(FH, $file);
foreach my $line (<FH>) {
    my $tel = tell(FH);
    print "$tel>> $line";
}
close FH;
OK way:
my $file = $0;
open(FH, $file);
while ( my $line = <FH> ) {
    my $tel = tell(FH);
    print "$tel>> $line";
}
close FH;
You can see that in the first way the file position was always at the end, because the whole file had already been loaded!
Graciliano M. P.
"The creativity is the expression of the liberty". | [reply] [d/l] [select] |
Let this note stand as testimony to the dangers of cut-n-paste and cargo culting. In my own code I always use while when reading from files (honest!), but in my laziness I copied and pasted from the base node into my reply without looking the code over well enough.
---
print map { my ($m)=1<<hex($_)&11?' ':'';
$m.=substr('AHJPacehklnorstu',hex($_),1) }
split //,'2fde0abe76c36c914586c';