grep -v is an acceptable Unix command-line method to exclude a line, but it does not qualify as a Perl solution ... at least not as a GOOD Perl solution. Besides which, that would drop only one line from the file. Imagine a worst case in which you wind up excluding every line in a 1000-line file ... You would have to copy the file, sans one line, 1000 times.
What you want is to go through the file line by line, using open(), while(), and close(): test each line and, if it is acceptable, copy it to the output file. That way the file is copied only once, whether you drop 0 lines or a million.
You say "the absolute value of column 2 minus column 2 (I guess you mean column 3) is greater than or equal to 1". Without the absolute value bit, I would simply test for $col2 > $col3. But it makes a significant difference whether you mean abs( $col2 ) > abs( $col3 ) or abs( $col2 - $col3 ) >= 1.
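To make the difference between the two readings concrete, here is a small sketch. The values -5 and 5 are made up for illustration; they are not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical sample values for columns 2 and 3.
my ( $col2, $col3 ) = ( -5, 5 );

# Reading 1: compare the magnitudes of the two columns.
my $compare_magnitudes = abs($col2) > abs($col3) ? 'true' : 'false';

# Reading 2: take the absolute value of the difference.
my $abs_of_difference = abs( $col2 - $col3 );

print "abs(col2) > abs(col3): $compare_magnitudes\n";
print "abs(col2 - col3):      $abs_of_difference\n";
```

With these values the first reading sees two equal magnitudes, while the second sees a difference of 10, so the two tests can disagree about the same row.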
As Occam said: Entia non sunt multiplicanda praeter necessitatem.
Thank you for the reply. I do mean abs(column 2 - column 3). I would like the script to remove those rows in which that absolute value is greater than or equal to 1.
The poster above has given you a very good logic flow and design for your program. I'll provide a little more on the functions you might want to use.
use strict;
use warnings;

# Three-argument open with a lexical filehandle. 'infile' is a
# placeholder; substitute the name of your own file.
open( my $fh, '<', 'infile' ) or die "Cannot open infile: $!";
while (<$fh>)
{
    my $line = $_;
    # I always prefer to copy $_ into an actual named
    # variable. Personal preference. Some other monk please
    # correct me if there is a best practice for this.
}
close($fh);
With the variable $line, you will want to look at the split() function. This will help you separate out the columns in each line. Also take a look at chomp if one of the columns is at the end of the line. Once you have the columns, abs will give you the absolute value. Finally, if the line matches your criteria, just print the variable $line. Afterwards, run your program and redirect its output to the text file of your choice.
Happy coding!
N.B. The code I posted has not been tested and is thus prone to typos.
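For what it is worth, here is a minimal sketch of that loop written out in full. It assumes whitespace-separated columns with numeric values in columns 2 and 3, and it passes lines with fewer than three columns through unchanged; both are assumptions on my part, not something stated in the thread. The names keep_line and filter_file are made up for this example.

```perl
use strict;
use warnings;

# True when a line should be kept, i.e. when the absolute difference
# between column 2 and column 3 is less than 1.
sub keep_line {
    my ($line) = @_;
    my @cols = split ' ', $line;    # assumed delimiter: whitespace
    return 1 if @cols < 3;          # assumption: short lines pass through
    return abs( $cols[1] - $cols[2] ) < 1;
}

# Copy the input file to the output file in a single pass,
# writing only the lines that keep_line() accepts.
sub filter_file {
    my ( $in_name, $out_name ) = @_;
    open( my $in,  '<', $in_name )  or die "Cannot open $in_name: $!";
    open( my $out, '>', $out_name ) or die "Cannot open $out_name: $!";
    while ( my $line = <$in> ) {
        print {$out} $line if keep_line($line);
    }
    close($in);
    close($out) or die "Cannot close $out_name: $!";
}

# Runs only when both file names are given on the command line.
filter_file(@ARGV) if @ARGV == 2;
```

Saved under a name such as filter.pl (hypothetical), it would be run as: perl filter.pl infile outfile. Because the line is never chomped, it is printed with its original line ending.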
perl -anle"$F[1]==$F[2] and print" infile > outfile
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Here's a short script that I think does what you're after. I didn't spend any time optimizing it, so it is just there to show a quick and dirty strategy. It uses Perl references to create two-dimensional matrices and the arrow notation to simplify and clarify what is going on. The subroutine, printMatrix(), is just for convenience so that you can better see what the 'before' and 'after' situation looks like. The output from the little script is:
Good luck; welcome to Perl.
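The script from that post is not reproduced in this thread. What follows is an independent sketch of the strategy it describes, not the poster's code: the sample rows are invented, and only the name printMatrix() comes from the post.

```perl
use strict;
use warnings;

# Hypothetical data. Each row is an array reference, so $matrix is a
# reference to an array of array references (a two-dimensional matrix).
my $matrix = [
    [ 'r1',  4, 4.2 ],
    [ 'r2',  1, 9   ],
    [ 'r3', -6, -6  ],
];

# Print a label, then the matrix, one tab-separated row per line.
sub printMatrix {
    my ( $label, $m ) = @_;
    print "$label\n";
    print join( "\t", @{$_} ), "\n" for @{$m};
}

# Build the 'after' matrix: keep rows whose columns 2 and 3
# differ by less than 1. Arrow notation reaches into each row.
my $filtered = [];
for my $i ( 0 .. $#{$matrix} ) {
    my $diff = abs( $matrix->[$i][1] - $matrix->[$i][2] );
    push @{$filtered}, $matrix->[$i] if $diff < 1;
}

printMatrix( 'before', $matrix );
printMatrix( 'after',  $filtered );
```

Holding the whole file in memory like this is fine for small inputs; for large files the line-by-line approach avoids it.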
perl -lane 'print unless abs($F[1] - $F[2]) >= 1' infile > outfile
There is, in fact, a grep function.
See:
- perldoc perlfunc
- perldoc -f grep
- perldoc -f map
Incidentally, since lists often hold references to the things that they “contain,” I often design filtering routines so that they scan through the input list, selecting what they want to keep and pushing those onto an output list, which is then returned. Since we’re only moving references around, we aren’t burning up memory. And the process is non-destructive: at the end of the day, we have the output list, but the input list hasn’t actually been touched. We can now, if we choose, discard the one and keep the other, or we can keep both.
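As a rough illustration of that kind of routine, here is a sketch in which grep does the scanning and selecting. The sample rows and the name keep_close_rows are hypothetical, and each row is assumed to be stored as an array reference.

```perl
use strict;
use warnings;

# Hypothetical rows, each stored as a reference to an array of columns.
my @rows = (
    [ 'a', 10, 10.5 ],
    [ 'b',  3,  7   ],
    [ 'c', -2, -2   ],
    [ 'd',  5,  4   ],
);

# Return the rows whose columns 2 and 3 differ by less than 1.
# Only references are copied; the input list is left as it was.
sub keep_close_rows {
    my @input = @_;
    return grep { abs( $_->[1] - $_->[2] ) < 1 } @input;
}

my @kept = keep_close_rows(@rows);

print join( ' ', @{$_} ), "\n" for @kept;
printf "%d of %d rows kept\n", scalar @kept, scalar @rows;
```

Both @rows and @kept are available afterwards, and the entries in @kept point at the same row arrays as the matching entries in @rows.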