Re: reading 100 line at a one time
by chromatic (Archbishop) on Mar 01, 2005 at 07:46 UTC
I have a file of 10,000 lines and I want to read 100 lines at a time instead of line by line, as it will reduce my disk I/O.
Why would it reduce your disk I/O? You'll have to read the same amount of information to read a hundred lines regardless of whether you do it in a hundred pieces or one piece, and disks care absolutely nothing about line lengths.
If your hard drive or operating system has a disk cache, make sure it's enabled. That's the right place to optimize this for almost every application.
open(my $handle, '<', 'c:/file.log') or die "Can't open file: $!";
while (<$handle>)
{
    # here I use a regular expression to parse the line
    # here I print the parsed information
}
Now I want it to work as follows:
function(filepath, number_of_lines)  # returns that many lines
while(<returned lines>)
{
    # here I use a regular expression to parse each line
    # here I print the parsed information
}
Is there any solution for this? It's very urgent for me.
Edit by BazB. Add code tags. Remove excess br tags.
sub read_n_lines
{
    my ($fh, $count) = @_;
    my $buffer = '';
    $buffer .= <$fh> for 1 .. $count;
    return $buffer;
}
Use it something like this:
open(my $handle, '<', 'c:/file.log') or die "Can't read file: $!\n";
while (not eof $handle)
{
    my $chunk  = read_n_lines( $handle, 100 );
    my $parsed = parse_chunk( $chunk );
    print_parsed( $parsed );
}
You might need a little more logic in read_n_lines() to check for eof, but if you look at the documentation and play around a little bit, you'll figure it out. One possible version is sketched below.
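For reference, here is one way that eof check might look; this is a sketch of one possible completion, not necessarily what the author had in mind:
sub read_n_lines
{
    my ($fh, $count) = @_;
    my $buffer = '';
    for (1 .. $count)
    {
        last if eof $fh;    # stop early instead of appending undef
        $buffer .= <$fh>;
    }
    return $buffer;
}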
The special variable $. contains the line number of the file you are reading. If you only want the first 1000 lines, put "last if $. > 1000;" just after the start of the while loop. If you want a group from the middle or the end, just keep track of $. and only start using the lines when you get to where you want to start, as sketched below. If you want the last 1000 lines of a big file, you should probably use File::Tail.
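A minimal sketch of the $. approach, assuming you want lines 901 through 1000 of a hypothetical c:/file.log:
open my $fh, '<', 'c:/file.log' or die "Can't read file: $!";
while (<$fh>)
{
    next if $. <= 900;     # skip everything before the range we want
    last if $. > 1000;     # stop once we are past it
    print;                 # lines 901 .. 1000 arrive here
}
close $fh;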
Re: reading 100 line at a one time
by bart (Canon) on Mar 01, 2005 at 08:13 UTC
#!/usr/local/bin/perl -w
use Time::HiRes 'time';
my $file = 'mshtmlc.h';
my @t = time;
open IN, '<', $file or die "Can't read file: $!";
while (<IN>) {
    # nada
}
close IN;
push @t, time;
open IN, '<', $file or die "Can't read file: $!";
while (read IN, $_, 1024) {
    # nada
}
close IN;
push @t, time;
printf <<'--', $t[1]-$t[0], $t[2]-$t[1];
line by line: %.3f s
1k blocks: %.3f s
--
The file, an include file that comes with lcc.exe, is a text file of 1.67 MB and close to 28,000 lines. The results of this test:
- desktop PC, 600 MHz, Win98, 6 GB disk:
  line by line: 0.490 s
  1k blocks: 0.060 s
- XP laptop, 2.4 GHz, 30 GB 2in disk:
  line by line: 0.026 s
  1k blocks: 0.200 s
As you can see, I even get conflicting results.
Re: reading 100 line at a one time
by brian_d_foy (Abbot) on Mar 01, 2005 at 08:56 UTC
Have you specifically identified this as a problem for your application? If you have, I'm curious about this behavior, including the technical details of the disks you are using. Can you elaborate?
If the amount of time that it takes you to get something off a disk is very large compared to whatever you are doing, something seems odd.
If that really is a problem, though, you might want to look into a design that buffers lines from the file and fetches more lines when it runs low. You're still paying the penalty for reading lines, though. Something like sysread might give you a little boost, but then you have to spend time figuring out where the line breaks are. A rough sketch of that buffering idea follows.
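A rough sketch of that buffered design, assuming a fixed 8KB block size; the refill() and next_lines() names are illustrative, not from this thread:
my @lines;          # complete lines waiting to be consumed
my $partial = '';   # trailing fragment left over from the last block

sub refill
{
    my ($fh) = @_;
    sysread($fh, my $block, 8192) or return 0;
    $partial .= $block;
    my @pieces = split /(?<=\n)/, $partial;    # keep the newlines
    $partial = $pieces[-1] =~ /\n\z/ ? '' : pop @pieces;
    push @lines, @pieces;
    return 1;
}

sub next_lines
{
    my ($fh, $count) = @_;
    while (@lines < $count)
    {
        refill($fh) or last;   # refill fails at end of file
    }
    return splice(@lines, 0, $count);
}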
--
brian d foy <bdfoy@cpan.org>
Re: reading 100 line at a one time
by inman (Curate) on Mar 01, 2005 at 09:20 UTC
This question was previously discussed in this thread.
Perl uses buffered I/O to read from any file. This means that a while (<FILE>) {} loop will read the file line by line from start to finish as fast as possible. Furthermore, because you want 100 lines of data, your code still has to scan every character in the data to find 100 linefeeds.
Reading the file into memory to process it is useful if you need to process the same data multiple times. You can do this by slurping the data straight into an array, as sketched below.
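A minimal sketch of slurping and then working through the array 100 lines at a time; the chunk size matches the original question, and the file path is illustrative:
open my $fh, '<', 'c:/file.log' or die "Can't read file: $!";
my @data = <$fh>;    # one array element per line
close $fh;

while (my @chunk = splice @data, 0, 100)
{
    # parse and print each group of up to 100 lines here
}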
Re: reading 100 line at a one time
by jpeg (Chaplain) on Mar 01, 2005 at 07:48 UTC
I'm not sure what I/O savings you'll get. Your program will still have to read every single line and do the same amount of disk I/O. If you want to process data one hundred lines at a time, well, you're still processing the same number of characters no matter what.
Second, as you know, the read and sysread functions read a fixed number of bytes rather than stopping at a delimiter; the diamond operator <> reads a line at a time. A quick illustration follows.
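A quick contrast of the two, assuming an already-opened handle $fh:
read $fh, my $block, 4096;    # up to 4096 bytes, ignoring line breaks
my $line = <$fh>;             # one line, however many bytes that takes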
Hope this clarifies a little.
Re: reading 100 line at a one time
by TedPride (Priest) on Mar 01, 2005 at 08:38 UTC
If you want to save on input time, the proper way to do it is to read some number of bytes, then read on to the end of the current line, and split on line breaks, as sketched below. Though to be honest, with a 10,000-line file I'd probably just read the entire thing into memory, since it's unlikely to take up more than maybe 10 MB of space.
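A sketch of that block-plus-line-tail approach; the 64KB block size and file path are arbitrary choices for illustration:
open my $fh, '<', 'c:/file.log' or die "Can't read file: $!";
while (read $fh, my $block, 65536)
{
    $block .= <$fh> unless eof $fh;   # read on to the end of the current line
    my @lines = split /\n/, $block;
    # parse @lines here
}
close $fh;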
Re: reading 100 line at a one time
by jkva (Chaplain) on Mar 01, 2005 at 07:22 UTC
Update 2: It has come to my attention that the code below is very wrong. Ignore it!
Update: This should do the trick.
It opens "C:/test.txt", reads the lines into an array, loops through the array, and prints every 100th line, if that element is defined.
my @lines = ();
open THISFILE, "c:/test.txt";
while (<THISFILE>)
{
    push @lines, $_;
}
close THISFILE;
for (my $i = 0; $i <= 1000; $i += 100)
{
    print $lines[$i] || "";
}
If the file contains 10_000 lines, you are loading them all. What about 10_000_000 lines? Can your system handle that?
The original poster wanted to read 100 lines at a time. I doubt they'll notice any real advantage in terms of I/O, because Perl already does buffering behind the scenes, but answers here should fit the algorithm they're requesting, because any other assumptions are likely bad.
-- [ e d @ h a l l e y . c c ]
I get your point. My first mistake was not understanding the original poster's question, even after a few tries. My second mistake was replying whilst knowing damn well that I cannot code very well yet. I must say, though, that the question specified a file of 10,000 lines. If the file were very much larger, or of unknown length, slurping it would have been inadvisable.
Hope I cleared things up ;)
Re: reading 100 line at a one time
by pearlie (Sexton) on Mar 01, 2005 at 08:08 UTC
Another way to read the whole file into an array is:
open my $fh, '<', 'c:/test.txt' or die "Can't read file: $!";
my @arr = <$fh>;
close $fh;