in reply to Re: Efficient way to sum columns in a file
in thread Efficient way to sum columns in a file
Also, I was curious to see the relative performance of cut and Perl's split.
So I tested the two commands below on the 500K file generated using the R code, except that I output 25 columns instead of 4 and extracted 15 columns for the test.
[sk]% time cut -d, -f"1-15" numbers.csv > out.csv
5.670u 0.340s 0:06.27 95.8%
[sk]% time perl -lanF, -e 'print join ",", @F[0..14];' numbers.csv > out.csv
31.950u 0.200s 0:32.26 99.6%
I am surprised that Perl's split is *very* slow compared to the Unix cut utility. Is this because Perl's split does a lot more work than cut? I see a lot of use cases for Perl in handling large files, but if parsing is the bottleneck then I need to be careful about when to use it.
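One untested idea (not something I have timed, just a sketch against the same numbers.csv): split accepts a LIMIT argument, so you can tell it to stop splitting once it has produced the fields you actually need, instead of breaking up all 25 columns:

    perl -lne 'print join ",", (split /,/, $_, 16)[0..14]' numbers.csv > out.csv

With a LIMIT of 16, the 16th field soaks up the remaining columns unsplit, and the slice [0..14] keeps only the first 15. Whether that narrows the gap with cut is something I have not measured.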
Thanks again everyone! I enjoyed reading the replies. Especially, I liked the explanations of eof and eof() (a very good example to demonstrate the difference) and also the END {} idea :)
cheers
SK