in reply to Reading files, skipping very long lines...

I am not sure I see what you mean, but just as an idea: would the following do what you want?

perl -lne '$max = 79; print unless length > $max' file

If this does what you want apart from the memory issue, then write the while loop properly with open first...
if that still fails, consider using Tie::File.
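
Untested, but roughly what I mean by writing the loop properly with open, assuming the input file is simply called 'file' and keeping the cap of 79 characters:

#!/usr/bin/perl
use strict;
use warnings;

my $max = 79;
open my $fh, '<', 'file' or die "Cannot open 'file': $!";
while ( my $line = <$fh> ) {
    chomp $line;                                 # drop the newline before measuring
    print "$line\n" unless length $line > $max;
}
close $fh;

Note that this still pulls each complete line into memory before measuring it, which is exactly where very long lines hurt; hence the Tie::File fallback.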

Cheers, Sören


Re^2: Reading files, skipping very long lines...
by Limbic~Region (Chancellor) on Sep 29, 2005 at 17:14 UTC
    Happy-the-monk,
    I do not believe either of these approaches will work (if I understand the problem correctly). Some lines are too long to read into a single variable, so it is not possible to use length to determine whether a line is too long. Using Tie::File would help, since it only indexes where the newlines in the file begin, but you would still need to read the whole line to determine whether it is too long (length $file[42] > 1024 * 1024).

    I can see one way it might work, though. If there is a way to get at the byte offsets of the newlines, you would only have to subtract the two offsets to determine whether the line is too long.

    Cheers - L~R

    Update: The following is an untested proof of concept.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Tie::File;

    my $obj = tie my @file, 'Tie::File', 'file.big'
        or die "Unable to tie 'file.big': $!";

    my $big = 1024 * 1024;    # skip lines longer than 1MB

    for ( 0 .. $#file - 1 ) {
        my $beg = $obj->offset($_);        # byte offset where record $_ begins
        my $end = $obj->offset($_ + 1);    # byte offset where the next record begins
        next if $end - $beg > $big;
        # process $file[$_];
    }

    # Handle the last line as a special case
    my $beg = $obj->offset($#file);
    my $end = -s 'file.big';
    if ( $end - $beg <= $big ) {
        # process $file[-1];
    }

    # Cleanup
    undef $obj;
    untie @file;
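
    Along the same lines, here is an untested sketch that avoids Tie::File entirely; the file name, cap, and block size are assumptions. Setting $/ to a scalar reference makes the readline operator return fixed-size blocks, so the script never holds more than one block plus one short line in memory:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file  = 'file.big';
    my $max   = 1024 * 1024;   # skip lines longer than 1MB
    my $chunk = 64 * 1024;     # read the file 64KB at a time

    open my $fh, '<', $file or die "Unable to open '$file': $!";

    my $buf      = '';         # partial line carried across block reads
    my $skipping = 0;          # true while inside a line already known too long

    $/ = \$chunk;              # a scalar ref makes <$fh> return fixed-size blocks
    while ( my $block = <$fh> ) {
        # split after each newline: every piece but possibly the last is a full line
        for my $piece ( split /(?<=\n)/, $block ) {
            my $complete = $piece =~ s/\n\z//;   # true if the piece ended a line
            if ($skipping) {
                $skipping = 0 if $complete;      # the oversized line finally ended
                next;
            }
            $buf .= $piece;
            if ( length $buf > $max ) {          # line grew past the cap: drop it
                $buf      = '';
                $skipping = !$complete;
            }
            elsif ($complete) {
                # process $buf here; it is a complete line under the cap
                $buf = '';
            }
        }
    }
    # if the file has no final newline, $buf now holds the last (short) line
    close $fh;

    Since a line is discarded the moment it grows past the cap, memory use stays bounded by the block size plus $max, no matter how long the offending lines are.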