Brethren,

I've got a bit of an odd one here. After attempting, several times, to utilize the good tool csplit on a 4.6 Gig file, and having it seg fault on me, I resorted to rolling my own short, sweet perl script.

Essentially I've discovered a kernel log file that hasn't been rotated since September of 2005, and I'd like to keep only the data for the year 2007; I'd done a grep with line numbers and discovered that the first line for 2007 was located at line 19,035,437.

#!/usr/bin/perl

$skip = 19035437;

while(<>) {
    if(! ($. % 1000000)) {
        printf STDERR "processed line $.\n"
    }
    last if $. == $skip;
}

while(<>) {
    printf STDOUT $_;
}

And when run (with a different skip value) on a different 2.5 Gig log file, it worked beautifully, but when run on the 4.6 Gig file, I'm getting this:
processed line 1000000
processed line 2000000
processed line 3000000
processed line 4000000
processed line 5000000
processed line 6000000
processed line 7000000
processed line 8000000
processed line 9000000
processed line 10000000
processed line 11000000
processed line 12000000
processed line 13000000
processed line 14000000
processed line 15000000
processed line 16000000
processed line 17000000
processed line 18000000
processed line 19000000
Modification of a read-only value attempted at mysplit.pl line 13, <> line 20804003.

Line 13 corresponds to the "printf STDOUT $_;" line.

Thinking this might be an overflow of some sort, I checked the binary representation of 20804003, and it happens to be: 1001111010111000110100011, so it's certainly not what I would think would be near an overflow.
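
For anyone who wants to repeat that check, here is a minimal sketch of how I looked at the bit pattern. The variable names are mine and are not part of mysplit.pl; the only assumption is that a signed 32-bit counter would top out at 2**31 - 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Line number reported in the error message.
my $line_no = 20804003;

# %b formats an integer as a binary string.
my $bits = sprintf "%b", $line_no;

print "$line_no = $bits (", length($bits), " bits)\n";

# Compare against the largest signed 32-bit value.
print "well below 2**31 - 1\n" if $line_no < 2**31 - 1;
```

The value needs only 25 bits, so a 32-bit wraparound of $. looks unlikely to me.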

Does anyone have any ideas of what might be happening here?

This is perl 5.8.5 running on dual 32-bit Intel Xeon 2.4 GHz chips.

-Scott


In reply to line number ($.) problem ? by 5mi11er
