Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone, I am trying to grep for a word and delete the lines below the matched word. Is there any way I could do that with seek and truncate? Thank you.

Replies are listed 'Best First'.
Re: grep and delete
by davido (Cardinal) on Jan 21, 2006 at 05:22 UTC

    This sounds like a good job for Tie::File. (The following is untested.)

    use strict;
    use warnings;
    use Tie::File;

    my @array;
    tie @array, 'Tie::File', 'filename.txt' or die $!;

    foreach ( 0 .. $#array ) {
        if ( $array[$_] =~ /word/ ) {
            $#array = $_;    # keep lines 0 .. $_, drop everything below
            last;
        }
    }

    untie @array;

    This works by iterating over the array tied to the file. When 'word' is found, the array is shrunk by setting $#array to the index of the matching line, which removes every line below it from the file. That index can never exceed the old $#array, so there is no risk of resizing the array larger than it previously was. The one inefficiency here is that behind the scenes Tie::File must learn how many lines there are in the array, and that means scanning through the file internally first. In practice it doesn't seem to slow things down noticeably, but I suppose for very large files it could.

    I often overlook Tie::File, thinking of it as mostly a novelty; not really something you would think of actually using. But I shouldn't think of it that way; it works, it's fast, and it makes some simple tasks even simpler.


    Dave

Re: grep and delete
by davidrw (Prior) on Jan 21, 2006 at 05:16 UTC
    What size of file are you talking about? (i.e., is it acceptable to slurp it into memory?)

    One way is to read a line at a time, skip a chunk after your marker, and then write the whole thing back out:

    my @lines;
    open FILE, "<", "foo.txt" or die $!;
    while ( <FILE> ) {
        push @lines, $_;
        next unless /\byourWord\b/;
        scalar <FILE> for 1 .. 5;    # read and discard the next 5 lines
    }
    close FILE;
    open FILE, ">", "foo.txt" or die $!;
    print FILE @lines;
    close FILE;
    Hmm, that removes the 5 lines after each match. Re-reading the OP, it seems like it might be asking to delete all the lines after the match. Using a similar method:

    my @lines;
    open FILE, "<", "foo.txt" or die $!;
    while ( <FILE> ) {
        push @lines, $_;
        last if /\byourWord\b/;      # stop reading at the first match
    }
    close FILE;
    open FILE, ">", "foo.txt" or die $!;
    print FILE @lines;
    close FILE;
    Another option is Tie::File with an approach similar to the above, or loop through the array and just splice off everything after the line where your word is found.
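
    A minimal sketch of that splice idea, assuming the file is line-oriented text and that Tie::File is installed (it ships with core Perl). The helper name truncate_after_match is made up for illustration, and the code has not been run against the OP's data:

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper (name invented for this sketch): keep everything up to
# and including the first line matching $pattern, and drop all lines after it.
# Returns the number of lines removed (0 if nothing matched or nothing
# followed the match).
sub truncate_after_match {
    my ($path, $pattern) = @_;

    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";

    my $removed = 0;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $pattern;
        $removed = $#lines - $i;
        # splice with only an offset removes from that offset to the end
        splice(@lines, $i + 1) if $removed > 0;
        last;
    }

    untie @lines;
    return $removed;
}
```

    Called as truncate_after_match('foo.txt', qr/\byourWord\b/), this rewrites the file in place, so try it on a copy first.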
Re: grep and delete
by BrowserUk (Patriarch) on Jan 21, 2006 at 05:59 UTC

    If the files aren't huge, or the word appears fairly early in the file (within the first few hundred MB), then you can do this with a "one-liner" (wrapped for viewing; adjust quoting to your OS needs):

    perl -e"BEGIN{$/=qq[the word or phrase]}" -ne"$n=tell ARGV; close ARGV; truncate $ARGV, $n" path\to\the\file

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Here's another one-liner that doesn't use a BEGIN block:

      perl -ni.bak -e "print; /word/ && last;" file.txt

      When the thought came to me, I was surprised at how simple the solution turned out to be.


      Dave

        That doesn't quite match the OP's stated requirements, in that it will preserve the entire 'line' containing the keyword (which could be the entire file if this was a binary file) rather than truncating immediately after the keyword.

        It has the advantage that it doesn't matter how big the file is (assuming non-binary), or how far into the file the truncation point is.

        Has the disadvantage that it will consume more disc space rather than reducing it--if that was the original intent.


Re: grep and delete
by ambrus (Abbot) on Jan 21, 2006 at 11:35 UTC

    You can, but these simpler tasks are easier to do with an editor unless the file is very large:

    ed file <<<$'/\<word\>/+,$d\nw'
    Alternatively, replace ed with ex; the same command will work with it.

    Update: Sure, you can do it with truncate too, like

    perl -we 'open $F, "+<", $ARGV[0] or die; while (<$F>) { /\bword\b/ and truncate $F, tell $F; }' file
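
    Spelled out as a script, the same open/tell/truncate idea might look like the sketch below. It assumes a platform where truncate is implemented and stops at the first match; the helper name truncate_file_after_line is invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): truncate the file in
# place right after the first line matching $pattern. Returns the new size in
# bytes, or undef if no line matched (the file is then left untouched).
sub truncate_file_after_line {
    my ($path, $pattern) = @_;

    # "+<" opens for reading and writing without clobbering the file
    open my $fh, '+<', $path or die "Cannot open $path: $!";

    my $new_size;
    while (my $line = <$fh>) {
        next unless $line =~ $pattern;
        $new_size = tell $fh;    # byte offset just past the matching line
        truncate $fh, $new_size or die "Cannot truncate $path: $!";
        last;
    }

    close $fh or die "Cannot close $path: $!";
    return $new_size;
}
```

    Nothing is copied or rewritten here, so the cost is just reading up to the match. It still scans from the start of the file, though.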

    Using an external grep program might also be possible, but beware that the command below truncates before the line containing the word. (I only include it because this is the first use I've found for the bash <> redirection operator.)

    perl -we 'length($n = `grep -wbm1 word`) and truncate STDIN, $n;' <>file

      Hi everyone, wow! I got so many replies. I am trying out the suggestions you gave. My situation is that I am grepping a large XML file (I don't have any hassles with memory ;-]). I have to grep for a word near the tail end of the file and remove the line below that word. Thank you kindly.