peokai has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I have a text file containing a load of junk I don’t want, followed by a part that I do want. The part I want is the text after a heading. Say, for example, the heading is “Sentences” and it is followed by sentences until the end of the file: I simply want all the sentences after the heading. So far everything works except one step: once an incoming line matches the heading (“Sentences” in this example), I don’t know how to make the script take _only_ what comes after the match, up to EOF. Halp? <3

Replies are listed 'Best First'.
Re: Reading lines from a specific location in a text file.
by grizzley (Chaplain) on Sep 28, 2009 at 12:57 UTC
    This is well explained in perlfaq6 under topic "How can I pull out lines between two patterns that are themselves on different lines?"
Re: Reading lines from a specific location in a text file.
by BioLion (Curate) on Sep 28, 2009 at 13:01 UTC

    Let's have a look at what code you have got so far!

    If you only want the text after the matching line, then the easiest approach is probably to set a 'flag' on a successful match:

        use strict;
        use warnings;

        ## flag
        my $flag = 0;

        while (<DATA>) {
            my $line = $_;
            if ($flag) {
                ## we have already seen the flag, so print the line
                print $line;
            }
            else {
                ## haven't yet seen the flag
                ## so let's see if this line contains it
                if ($line =~ m/^Sentence/) {
                    ## if we do see it, set the flag,
                    ## so we no longer look for it, but just spew out the lines
                    $flag = 1;
                }
            }
        }
    Just a something something...
Re: Reading lines from a specific location in a text file.
by jakobi (Pilgrim) on Sep 28, 2009 at 13:10 UTC

    perl -lne '/Sentences/ and ++$f or $f and print' FILE

    Or just '/Sentences/ and ++$f; $f and print' to print both heading and lines.

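    A more idiomatic version of the latter uses the range operator in scalar context:
    'print if /THIS-IS-NOT-MY-HOMEWORK-AFTER-THE-Sentences/ .. (0 and "or is it?")'.
    The right-hand side is always false, so the range never closes once it has opened and printing continues to the end of the file.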

    :>
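
    For a script rather than a one-liner, the same range-operator idea can be sketched with eof as the right-hand side, so the range closes by itself on the last line. This is only a sketch: the sub name from_heading, the pattern and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Return the heading line and every line after it.
# $heading_re is an assumed pattern; adjust it to the real heading.
sub from_heading {
    my ($fh, $heading_re) = @_;
    my @kept;
    while (my $line = <$fh>) {
        # The flip-flop switches on at the heading and switches off
        # again only once the last line of the file has been read.
        push @kept, $line if $line =~ $heading_re .. eof($fh);
    }
    return @kept;
}

# Demo on an in-memory file (made-up sample data).
my $text = "junk\nmore junk\nSentences\nFirst one.\nSecond one.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print from_heading($fh, qr/^Sentences/);
close $fh;
```

    Like the second one-liner, this keeps the heading line itself; drop the first element of the returned list if it is not wanted.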
Re: Reading lines from a specific location in a text file.
by Marshall (Canon) on Sep 28, 2009 at 21:06 UTC
    "Hello. I have a text file"

    Ok, you have a series of text characters that periodically have a "\n" in the data stream.

    The disk and the file system hide some "ugliness" from you, but basically you can think of a disk file as a stream of bytes from 0 to N.

    A "line" has some number of characters followed by "\n". There is no general way to "jump over 5 lines" because the number of characters contained within those lines is variable and unknown. You have to read sequentially through the bytes and count the number of "\n" chars to decide that you've skipped over say 5 lines.
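
    To make that concrete, here is a rough sketch of "jumping over" N lines the only way that works in general: read them one at a time and throw them away. The sub name lines_after_skipping and the sample data are invented for the example.

```perl
use strict;
use warnings;

# Skip the first $n lines by reading and discarding them one by one,
# then return whatever is left in the file.
sub lines_after_skipping {
    my ($fh, $n) = @_;
    my $skipped = 0;
    while ($skipped < $n) {
        # Stop early if the file has fewer than $n lines.
        defined(my $discard = <$fh>) or last;
        $skipped++;
    }
    my @rest = <$fh>;
    return @rest;
}

# Demo on an in-memory file (made-up sample data).
my $text = join '', map { "line $_\n" } 1 .. 7;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print lines_after_skipping($fh, 5);
close $fh;
```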

    Some types of disk data structures have fixed-length records. In that case it is possible to calculate "the starting byte address" of, say, record 213 and "seek" (in other words, "jump") straight there.
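
    A small sketch of that calculation, assuming records of exactly 8 bytes (the record length, the sub name read_record and the sample data are all made up for illustration):

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

my $RECORD_LEN = 8;    # assumed record size in bytes, for illustration only

# Fetch record number $index (counting from 0) without reading
# any of the records that come before it.
sub read_record {
    my ($fh, $index) = @_;
    seek($fh, $index * $RECORD_LEN, SEEK_SET) or die "seek failed: $!";
    my $got = read($fh, my $record, $RECORD_LEN);
    die "read failed: $!" unless defined $got;
    # A short read means there is no complete record at that position.
    return $got == $RECORD_LEN ? $record : undef;
}

# Demo: five space-padded records in an in-memory file.
my $data = join '', map { sprintf '%-8s', "rec$_" } 0 .. 4;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print '[', read_record($fh, 3), "]\n";
close $fh;
```

    This only works because every record has the same length; with ordinary text lines there is nothing to multiply by.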

    The disk system reads sequential characters very quickly. Even a couple of thousand characters of "header" is normally of no consequence.

    Now if you have say 1MB of stuff to "plow through" at the beginning, then maybe you have the wrong data structure?

    But it appears to me that you just need to "read and throw away" the lines that don't matter until you get to the "real stuff".
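
    A minimal sketch of "read and throw away", done as two phases on the same filehandle instead of a flag. The sub name lines_after_heading, the pattern and the sample data are assumptions for the example.

```perl
use strict;
use warnings;

sub lines_after_heading {
    my ($fh, $heading_re) = @_;

    # Phase 1: read and throw away everything up to and including the heading.
    while (my $line = <$fh>) {
        last if $line =~ $heading_re;
    }

    # Phase 2: whatever remains is the "real stuff"
    # (an empty list if the heading was never found).
    my @wanted = <$fh>;
    return @wanted;
}

# Demo on an in-memory file (made-up sample data).
my $text = "junk\nmore junk\nSentences\nFirst one.\nSecond one.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print lines_after_heading($fh, qr/^Sentences/);
close $fh;
```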

Re: Reading lines from a specific location in a text file.
by vitoco (Hermit) on Sep 28, 2009 at 14:49 UTC

    Unless you know the exact position where your wanted data begins before you open the text file, you must read the file line by line and ignore everything from the first line up to the one you recognize as the end of the heading.
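
    If the same file is read more than once and does not change in between, the position can be found once with tell and reused with seek. This is a sketch under that assumption; the sub names and the sample data are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# One-time scan: return the byte offset just past the heading line,
# or undef if no line matches.
sub offset_after_heading {
    my ($fh, $heading_re) = @_;
    while (my $line = <$fh>) {
        return tell($fh) if $line =~ $heading_re;
    }
    return;
}

# Later reads: jump straight to the saved offset and read from there.
sub lines_from_offset {
    my ($fh, $offset) = @_;
    seek($fh, $offset, SEEK_SET) or die "seek failed: $!";
    my @lines = <$fh>;
    return @lines;
}

# Demo on an in-memory file (made-up sample data).
my $text = "junk\nSentences\nFirst one.\nSecond one.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $offset = offset_after_heading($fh, qr/^Sentences/);
print lines_from_offset($fh, $offset) if defined $offset;
close $fh;
```

    The saved offset is only valid while the header stays byte-for-byte the same, so the first pass still has to read through it.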

    I hope that your header is much less than your data... ;-)