grep and delete

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: grep and delete by davido (Cardinal) on Jan 21, 2006 at 05:22 UTC
This sounds like a good job for Tie::File. (The following is untested.) `use strict; use warnings; use Tie::File; my @array; tie @array, 'Tie::File', 'filename.txt' or die $!; foreach ( 0 .. $#array ) { if( $array[$_] =~ /word/ ) { $#array = ( $_ < $#array ) ? $_ + 1 : $_; last; } } untie @array;` This works by iterating over the array tied to the file. When 'word' is found, the array is resized (by setting $#array to a smaller value), which truncates the file. An extra check ensures that if 'word' is found on the last line of the array, we're not resizing the array to be larger than it previously was. The one inefficiency here is that behind the scenes Tie::File must learn how many lines there are in the file, and that means scanning through the file internally first. In practice it doesn't seem to slow things down noticeably, but I suppose for very large files it could. I often overlook Tie::File, thinking of it as mostly a novelty, not really something you would think of actually using. But I shouldn't think of it that way; it works, it's fast, and it makes some simple tasks even simpler. Dave
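For comparison, here is a minimal sketch of the same Tie::File idea wrapped in a sub. The name `truncate_after_match` and its arguments are made up for illustration. Unlike the snippet above, which keeps one extra line after the match (`$_ + 1`), this variant assumes you want the matching line to become the last line of the file.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: keep everything up to and including the first
# line matching $pattern, and drop the rest of the file.
# Returns 1 if a match was found, 0 otherwise.
sub truncate_after_match {
    my ($path, $pattern) = @_;

    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";

    my $found = 0;
    for my $i (0 .. $#lines) {
        if ($lines[$i] =~ $pattern) {
            $#lines = $i;    # shrinking the tied array truncates the file
            $found  = 1;
            last;
        }
    }

    untie @lines;
    return $found;
}
```

It would be called as `truncate_after_match('filename.txt', qr/word/)`. Setting `$#lines = $i + 1` instead (with the same last-line check as above) keeps the line following the match as well.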
Re: grep and delete by davidrw (Prior) on Jan 21, 2006 at 05:16 UTC
What size of file are you talking about (i.e., is it acceptable to slurp it into memory)? One way is to read a line at a time, skip a chunk after your marker, and then write the whole thing back out: `my @lines; open my $in, "<", "foo.txt" or die $!; while( <$in> ){ push @lines, $_; next unless /\byourWord\b/; <$in> for 1 .. 5; } close $in; open my $out, ">", "foo.txt" or die $!; print $out @lines; close $out or die $!;` Hmm, that removes the 5 lines after each match. Re-reading the OP, it seems like it might be asking to delete all the lines after the match. Using a similar method: `my @lines; open my $in, "<", "foo.txt" or die $!; while( <$in> ){ push @lines, $_; last if /\byourWord\b/; } close $in; open my $out, ">", "foo.txt" or die $!; print $out @lines; close $out or die $!;` Another option is Tie::File with a similar approach to the above, or loop through the array and just `splice` off everything after your word is found.
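The `splice` idea mentioned at the end could look roughly like this. It is only a sketch: `splice_after_match` is a hypothetical name, and it works on an array reference so that it does not care where the lines came from.

```perl
use strict;
use warnings;

# Hypothetical helper: given an array ref of lines and a compiled
# pattern, remove (in place) every line after the first matching line.
# Returns the number of lines removed.
sub splice_after_match {
    my ($lines, $pattern) = @_;

    for my $i (0 .. $#{$lines}) {
        next unless $lines->[$i] =~ $pattern;

        # splice with only an offset removes everything from there on
        my @removed = splice @{$lines}, $i + 1;
        return scalar @removed;
    }

    return 0;    # no match, nothing removed
}
```

With the lines slurped into `@lines`, `splice_after_match(\@lines, qr/\byourWord\b/)` trims the array before it is written back out. Since Tie::File implements `splice` for tied arrays, the same sub should also work on an array tied to the file, though I have not tried that here.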
Re: grep and delete by BrowserUk (Patriarch) on Jan 21, 2006 at 05:59 UTC
If the files aren't huge, or the word appears fairly early (within the first few hundred MB) in the file, then you can do this with a "one-liner" (adjust quoting to your OS needs): `perl -e"BEGIN{$/=qq[the word or phrase]}" -ne"$n=tell ARGV; close ARGV; truncate $ARGV, $n" path\to\the\file` Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
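The one-liner works by setting the input record separator `$/` to the phrase, so the first "record" read ends exactly at the phrase and `tell` gives the byte offset to truncate at. A sketch of the same technique as a sub follows. The name `truncate_after_phrase` is invented for illustration, and the explicit check that the phrase was actually found is my addition, not part of the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: cut the file immediately after the first
# occurrence of $phrase. Returns the new length in bytes, or undef
# if the phrase was not found (the file is then left untouched).
sub truncate_after_phrase {
    my ($path, $phrase) = @_;
    die "phrase must not be empty\n" unless length $phrase;

    open my $fh, '+<', $path or die "Cannot open $path: $!";
    binmode $fh;

    # With $/ set to the phrase, one read returns everything up to
    # and including its first occurrence (or the whole file if absent).
    my $record = do { local $/ = $phrase; <$fh> };

    my $new_length;
    if (defined $record && $record =~ /\Q$phrase\E\z/) {
        $new_length = tell $fh;
        truncate($fh, $new_length) or die "Cannot truncate $path: $!";
    }

    close $fh or die "Cannot close $path: $!";
    return $new_length;
}
```

As with the one-liner, everything up to the phrase is held in memory as a single record, so this suits files where the phrase appears reasonably early.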
Re^2: grep and delete by davido (Cardinal) on Jan 21, 2006 at 06:25 UTC
Here's another one-liner that doesn't use a BEGIN block: `perl -ni.bak -e "print; /word/ && last;" file.txt` When the thought came to me, I was surprised at how simple the solution turned out to be. Dave
Re^3: grep and delete by BrowserUk (Patriarch) on Jan 21, 2006 at 07:48 UTC
That doesn't quite match the OP's stated requirements, in that it will preserve the entire 'line' containing the keyword (which could be the entire file if this were a binary file) rather than truncating immediately after the keyword. It has the advantage that it doesn't matter how big the file is (assuming non-binary), or how far into the file the truncation point is. It has the disadvantage that it will consume more disc space rather than reducing it, if that was the original intent.
Re^4: grep and delete by davido (Cardinal) on Jan 21, 2006 at 16:31 UTC
Re: grep and delete by ambrus (Abbot) on Jan 21, 2006 at 11:35 UTC
You can, but these simpler tasks are easier to do with an editor unless the file is very large: `ed file <<<$'/\<word\>/+,$d\nw'` Alternatively, replace `ed` with `ex`; the same command will work with it. Update: Sure, you can do it with truncate too, like `perl -we 'open $F, "+<", $ARGV[0] or die; while (<$F>) { /\bword\b/ and truncate $F, tell $F; }' file` Using an external grep program might also be possible, but beware that the command below truncates before the line containing the word. (I only include it because this is the first use I've found for the bash `<>` redirection operator.) `` perl -we 'length($n = `grep -wbm1 word`) and truncate STDIN, $n;' <>file ``
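The "truncate before the matching line" behaviour of the grep variant can also be had in plain Perl by remembering where each line starts. This is only a sketch; `truncate_before_match` is a hypothetical name, and it assumes the match should be tested line by line.

```perl
use strict;
use warnings;

# Hypothetical helper: cut the file just before the first line that
# matches $pattern, so the matching line itself is removed too.
# Returns 1 if the file was truncated, 0 if nothing matched.
sub truncate_before_match {
    my ($path, $pattern) = @_;

    open my $fh, '+<', $path or die "Cannot open $path: $!";

    my $line_start = 0;    # byte offset where the current line begins
    my $truncated  = 0;
    while (my $line = <$fh>) {
        if ($line =~ $pattern) {
            truncate($fh, $line_start) or die "Cannot truncate $path: $!";
            $truncated = 1;
            last;
        }
        $line_start = tell $fh;    # the next line starts here
    }

    close $fh or die "Cannot close $path: $!";
    return $truncated;
}
```

Passing `tell $fh` taken after the read, as in the truncate one-liner above, keeps the matching line; using the offset saved before the read drops it.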
Re^2: grep and delete by Anonymous Monk on Jan 21, 2006 at 14:54 UTC
Hi everyone, wow, I got so many replies! I am trying out the suggestions you gave. The problem I face now is that I am grepping a large XML file (I don't have any problems with memory ;-]). I have to grep for a word near the tail end of the file and remove the line below that word. Thank you kindly,