in reply to How does the while works in case of Filehandle when reading a gigantic file in Perl

raj4489,

You can check this yourself with the following untested code:

use strict;
use warnings;
use Time::HiRes qw( gettimeofday );

# Pass 1: time the bare read loop.
open( my $fh, "<", "./something.txt" ) || die "! open $!\n";
my $stime = gettimeofday();
while (<$fh>) { }
print gettimeofday() - $stime, "\n";
close $fh;

# Pass 2: time the same loop with your processing in it.
open( $fh, "<", "./something.txt" ) || die "! open $!\n";
$stime = gettimeofday();
while (<$fh>) {
    # 'do something' is your actual code!
}
print gettimeofday() - $stime, "\n";
close $fh;

Now you have the time for the 'while' loop with and without your 'do something' actual code.

I suspect your exponential growth comes from something you're doing in the 'do something' part of the script. Post it and we may be able to help improve the process.
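
If you want to see whether the time per line really grows as the file is read, a rough sketch like the following (also only lightly checked) records the elapsed time for each batch of lines instead of one total. The helper name time_batches, the batch size, and the in-memory test data are assumptions made for illustration; replace the empty sub with your own processing.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Hypothetical helper: reads $fh line by line, calls $work on each line,
# and records the elapsed seconds for every batch of $batch lines.
sub time_batches {
    my ( $fh, $batch, $work ) = @_;
    my @elapsed;
    my $count = 0;
    my $start = time;
    while ( my $line = <$fh> ) {
        $work->($line);
        if ( ++$count % $batch == 0 ) {
            my $now = time;
            push @elapsed, $now - $start;
            $start = $now;
        }
    }
    return \@elapsed;
}

# Demo on an in-memory file so the sketch runs without any input file.
my $data = join '', map { "line $_\n" } 1 .. 10_000;
open( my $fh, '<', \$data ) || die "! open $!\n";
my $times = time_batches( $fh, 1_000, sub { } );
close $fh;

printf "batch %2d: %.6f s\n", $_ + 1, $times->[$_] for 0 .. $#$times;
```

If the batch times stay roughly flat with the empty sub but climb once your processing is plugged in, the growth is in the processing and not in the read loop.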

Regards...Ed

"Well done is better than well said." - Benjamin Franklin

Re^2: How does the while works in case of Filehandle when reading a gigantic file in Perl
by raj4489 (Acolyte) on Feb 06, 2015 at 12:39 UTC

    @perlholic has posted my code, and I have checked the timings for each step separately, but all are linear. The problem lies with the 'while' loop, because the time required goes up with every new line.

      There is nothing about a while (<>) { ... } loop in itself that would make the time per line go up. Maybe some of your processing is consuming memory, or accumulating data in an array that gets larger and larger without ever being cleared. We will need to see more accurate code than the reduced version that was posted.
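
To illustrate the kind of accumulation meant here, the following is a made-up sketch and not the code from this thread. The function names and the sample data are hypothetical. The first version keeps every line in an array and rescans it for each new line, so the cost per line grows with the number of lines already read; the second gets the same answer with a hash lookup, so the cost per line stays roughly constant.

```perl
use strict;
use warnings;

# Hypothetical 'do something' #1: keeps every line and rescans them all,
# so the work per line grows with the number of lines already read.
sub count_dups_slow {
    my ($lines) = @_;
    my @seen;
    my $dups = 0;
    for my $line (@$lines) {
        $dups++ if grep { $_ eq $line } @seen;    # scans the whole array
        push @seen, $line;
    }
    return $dups;
}

# Hypothetical 'do something' #2: same result with a hash lookup,
# so the work per line stays roughly constant.
sub count_dups_fast {
    my ($lines) = @_;
    my %seen;
    my $dups = 0;
    for my $line (@$lines) {
        $dups++ if $seen{$line}++;    # true only if the line was seen before
    }
    return $dups;
}

# Made-up sample data: 2000 lines drawn from 500 distinct values.
my @lines = map { "id" . ( $_ % 500 ) . "\n" } 1 .. 2_000;
my $slow  = count_dups_slow( \@lines );
my $fast  = count_dups_fast( \@lines );
print "slow: $slow, fast: $fast\n";
```

Both functions count the same duplicates; only the first slows down as the input grows, which is the pattern to look for in the real processing code.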

      If the time taken per line gets larger and larger, maybe you can post a small/short XML example and some code more to the point so we can try to replicate the problem? Please also make sure that the problem appears with the code you post.

      If you do other work, for example inserting the data into a database instead of writing it to a file, that could get slower with each new row that gets added.