sesemin has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I need to read a tab-delimited file twice during one run of my script, each time extracting lines that match different criteria. The first while loop runs through the file, but the second one does not. In other words, if I have two while loops over the same file, the second one does not execute. Any suggestions?

Pedro

Replies are listed 'Best First'.
Re: Having Access to a file two times
by davido (Cardinal) on Sep 21, 2008 at 06:14 UTC

    seek, or close and re-open the file. ...or make your filehandle lexically scoped to a block just outside your while loop (which will handle closing your file for you). In other words:

    {
        open my $fh, '<', $filename or die $!;
        while ( <$fh> ) {
            # do your stuff.
        }
        # Note, your $fh is about to pass out of scope, which
        # will close the filehandle in cleanup.
    }
    {
        open my $fh, '<', $filename or die $!;
        while ( <$fh> ) {
            # do your other stuff.
        }
    }
    # Now the second filehandle fell out of scope and closed too.

    Dave

      Hi Dave,

      I checked the perldoc for seek. After the first while loop, but before closing the file, I added:

      seek , 0, 1;

      while <FH> ...

      It did not work. Is this the right way of using seek?

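        No: you need to specify a filehandle. Also, to rewind to the beginning of the file the third argument (WHENCE) must be 0, not 1. A WHENCE of 1 means "relative to the current position", so an offset of 0 would leave you at the end of the file.
        seek FH, 0, 0;
        Take another look at the seek docs.

        It's also nice to check the return status of the file operation for success:

        seek FH, 0, 0 or die "seek failed: $!";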
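
For illustration, here is a minimal sketch of the seek approach with two passes over the same filehandle. The sample data, the temporary file, and the two criteria (second column below 10, then 10 or more) are assumptions made up for this example and are not from the original post. Note that rewinding needs a WHENCE of 0 (SEEK_SET from the Fcntl module).

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical sample data: a small tab-delimited file created just for this sketch.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "apple\t3\n", "pear\t12\n", "plum\t7\n";
close $tmp or die "close failed: $!";

open my $fh, '<', $filename or die "Cannot open $filename: $!";

# First pass: collect names whose second column is below 10.
my @small;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $count) = split /\t/, $line;
    push @small, $name if $count < 10;
}

# Rewind to the start of the file so the second loop has something to read.
seek $fh, 0, SEEK_SET or die "seek failed: $!";

# Second pass: collect names whose second column is 10 or more.
my @large;
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $count) = split /\t/, $line;
    push @large, $name if $count >= 10;
}
close $fh or die "close failed: $!";

print "small: @small\n";
print "large: @large\n";
```

Without the seek call the second loop would see end-of-file immediately and its body would never run, which matches the symptom described in the question.
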
Re: Having Access to a file two times
by Perlbotics (Archbishop) on Sep 21, 2008 at 08:44 UTC

    Do you really need to read the TSV file twice in order to process your data? For large TSV files it might be faster to read the file only once line-by-line and perform the two operations - currently distributed over two different while-loops - within the same loop.

    On the other hand, if processing time doesn't matter and the code stays cleaner that way, seek might be the better option.
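
A rough sketch of the single-pass idea, where both selections happen inside one loop. The in-memory sample data and the two criteria are hypothetical, chosen only to make the example runnable.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited input, read through an in-memory filehandle for the demo.
my $data = "apple\t3\npear\t12\nplum\t7\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my (@small, @p_names);
while ( my $line = <$fh> ) {
    chomp $line;
    my ($name, $count) = split /\t/, $line;

    # First criterion: second column below 10.
    push @small, $name if $count < 10;

    # Second criterion: name starts with "p".
    push @p_names, $name if $name =~ /^p/;
}
close $fh or die "close failed: $!";

print "small: @small\n";
print "p names: @p_names\n";
```

This only works when the second selection does not depend on results that are known only after the first pass has finished.
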

      Yeah, that was basically my first thought, too. If the file is small enough to easily fit in memory, read it all into an array and then walk the array for each operation. If it's too big for that, then read through it once doing both operations as you go. Either way, you'll get better performance out of it, since disk access is almost always the slowest operation.
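
The read-it-into-an-array variant might look something like the sketch below, assuming the file comfortably fits in memory. Again, the sample data and the criteria are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited input, read through an in-memory filehandle for the demo.
my $data = "apple\t3\npear\t12\nplum\t7\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# Read every line once and strip the trailing newlines.
chomp( my @lines = <$fh> );
close $fh or die "close failed: $!";

# First walk over the array: names whose second column is below 10.
my @small;
for my $line (@lines) {
    my ($name, $count) = split /\t/, $line;
    push @small, $name if $count < 10;
}

# Second walk over the same array: names whose second column is 10 or more.
my @large;
for my $line (@lines) {
    my ($name, $count) = split /\t/, $line;
    push @large, $name if $count >= 10;
}

print "small: @small\n";
print "large: @large\n";
```

The file is read from disk only once, and the two loops stay separate, which keeps the structure of the original two-while-loop code.
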