gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks

This is my first time trying to parse a file which has lines that are apparently longer than what

open(FIL,$file); while(<FIL>){dosomething()}

can read in. In fact, some lines are just over 8kB in length and I am only reading in 512-1024 bytes (I'm guessing based on what I can see - I didn't actually count it). Is there some system variable which dictates a maximum for how much to glob at a time?

Re: Reading a file with 8kB long lines
by samtregar (Abbot) on Jun 08, 2007 at 20:21 UTC
    I don't think your diagnosis is correct. My perl can read long lines with no problem:

    $ perl -e 'for (1 .. 10) { print("A" x (9000 + $_) . "\n") }' > file.txt
    $ perl -le 'open FOO, "file.txt"; print length $_ for (<FOO>);'
    9002
    9003
    9004
    9005
    9006
    9007
    9008
    9009
    9010
    9011

    What version of Perl are you running? What OS? Can we see the real code, please?

    -sam

      D'oh. You are right. I thought that the wrapping of the long lines was because of the text editor. I opened up my file in a hex-viewer and that confirmed it - this text file has newlines sprinkled throughout the "long lines". So, now I have to figure out how to parse this file.

      I'd like to hear your suggestions. The file is kind of C-code like (but it isn't - it's actually an SVF JTAG Boundary scan file) and looks like this

      // comments
      some code;
      more code;
      // more comments
      some very
      long long
      long code;

      Basically, I want to read in every statement that does not begin with "//" and that ends with a ";". Should I just set $/ to ';' and then process the strings I read in to drop everything between // and \n?
        You can't set the input delimiter unless you know more than you've told us about comment lines.

        I'd do something like:

        my $full_line = '';
        while ( my $line = <FH> ) {
            chomp $line;
            next if substr( $line, 0, 2 ) eq '//';
            $full_line .= " $line";    # space wanted? your call!
            next unless substr( $line, -1, 1 ) eq ';';
            #do something with $full_line
            print "$full_line\n";
            $full_line = '';
        }

        Update: replace the $full_line .= line with

        $full_line .= $full_line eq '' ? $line : " $line";

        Update^2: I should have said this has been tested and produced the following with your sample data:

        some code;
        more code;
        some very long long long code;
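
For comparison, here is a minimal sketch of the $/ = ';' idea raised in the question. It only works under an assumption the thread does not confirm: that no comment ever contains a ';' and that "//" never appears inside a statement. The helper name read_statements and the in-memory sample data are hypothetical, made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read ';'-terminated statements from a filehandle,
# dropping '//' comments that run to the end of their line.
# ASSUMPTION: comments never contain ';' and '//' never occurs in a statement.
sub read_statements {
    my ($fh) = @_;
    local $/ = ';';    # each read returns text up to and including the next ';'
    my @statements;
    while ( my $chunk = <$fh> ) {
        $chunk =~ s{//[^\n]*}{}g;       # strip comments up to the newline
        $chunk =~ s/\s+/ /g;            # fold embedded newlines into single spaces
        $chunk =~ s/^ | $//g;           # trim one leading/trailing space
        next unless $chunk =~ /;\z/;    # skip leftover text with no terminator
        push @statements, $chunk;
    }
    return @statements;
}

# Sample data modelled on the snippet in this thread (line breaks assumed).
my $sample = "// comments\nsome code;\nmore code;\n"
           . "// more comments\nsome very\nlong long\nlong code;\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @statements = read_statements($fh);
close $fh;

print "$_\n" for @statements;
```

If a comment did contain a ';', the record would be cut in the middle of the comment and the rest of it would leak into the next statement, which is the reason for the caution about comment lines above.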

      First of all: ++. Then (I wanted to /msg you, but it turned out to be longer than expected, so:)

      $ perl -le 'open FOO, "file.txt"; print length $_ for (<FOO>);'

      OK, it's only ten lines, so it makes no difference here, but we keep recommending that people not slurp whole files in at once when they can avoid it, and since we're talking about one-liners anyway, I would rewrite them like this:

      $ perl -le 'print "A" x (9000 + $_) for 1..10' > file.txt
      $ perl -lpe '$_=length' file.txt
      9001
      9002
      9003
      9004
      9005
      9006
      9007
      9008
      9009
      9010
Re: Reading a file with 8kB long lines
by BrowserUk (Patriarch) on Jun 08, 2007 at 21:04 UTC
Re: Reading a file with 8kB long lines
by FunkyMonk (Bishop) on Jun 08, 2007 at 20:19 UTC
    Which OS? Which perl?

    @ARGV = 'x';
    print length <>;
    #output:
    #67831

    This is perl, v5.8.8 built for x86_64-linux-gnu-thread-multi on Debian Testing

Re: Reading a file with 8kB long lines
by bart (Canon) on Jun 09, 2007 at 06:30 UTC
    Rest assured: a line in Perl can be megabytes long, even as long as you can fit into your memory.
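
A quick way to check this on your own system is sketched below. It writes a single 5 MB line to a scratch file and reads it back with an ordinary line read. The 5 MB size is an arbitrary choice for illustration, and the scratch file comes from the core File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Size of the test line; 5 MB is an arbitrary value picked for this sketch.
my $size = 5 * 1024 * 1024;

# Write one very long line to a temporary file that is removed on exit.
my ( $out, $path ) = tempfile( UNLINK => 1 );
print {$out} 'A' x $size, "\n";
close $out or die "Cannot close $path: $!";

# Read it back with a plain line read, as in a while (<FH>) loop.
open my $in, '<', $path or die "Cannot open $path: $!";
my $line = <$in>;
close $in;

chomp $line;
print length($line), "\n";
```

If the printed length matches $size, the whole line came back in one read, so a truncated line points to newlines in the data rather than to a limit in Perl.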