in reply to One Liner to strip crlf
Hmmm ... it feels wrong to me, and here's why:
Just for grins, I built a reasonably large file (465MB) and tried several filters on it:
$ # Do nothing but count lines
$ time perl -i -pe '++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m2.641s
user 0m2.218s
sys 0m0.375s

$ # Your original filter
$ time perl -i -pe 's/\r//g; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m6.298s
user 0m5.703s
sys 0m0.421s

$ # Don't do it globally, end at the first one
$ time perl -i -pe 's/\r//; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m3.439s
user 0m2.937s
sys 0m0.390s

$ # Do it only at the end of the line
$ time perl -i -pe 's/\r$//; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m3.188s
user 0m2.781s
sys 0m0.359s
So you can gain a bit of performance by tweaking your regular expression. After I did so, the search-and-replace overhead was roughly 20% of the entire runtime; the rest is just reading and writing the file. So you can't really get a big win here. Or, if 20% is enough time to be significant, then I'd suggest changing your processing so that rather than always using a filter, you instead write a small perl script that simply checks the first line of the file. If it has "\r" then filter it, otherwise process the original file as-is. That way you could save nearly all of the I/O time when you don't have a "\r" in the file.
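Something along these lines is what I have in mind for the "check the first line" script. It's only a rough sketch: the sub names `first_line_has_cr` and `strip_cr` and the `.tmp` suffix are made up for illustration, and it assumes the first line is representative of the whole file (a file with mixed line endings would slip through).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the first line of $path contains a carriage return, else 0.
# Only one line is read, so this is cheap even for a huge file.
sub first_line_has_cr {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $first = <$fh>;
    close $fh;
    return (defined $first && $first =~ /\r/) ? 1 : 0;
}

# Strip the CR at the end of each line, writing to a temporary file
# ("$path.tmp" is an assumed name) and then renaming it over the original.
sub strip_cr {
    my ($path) = @_;
    my $tmp = "$path.tmp";
    open my $in,  '<:raw', $path or die "Cannot open $path: $!";
    open my $out, '>:raw', $tmp  or die "Cannot open $tmp: $!";
    while (my $line = <$in>) {
        $line =~ s/\r$//;    # same end-of-line regex as the last timing run
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}

# Filter only the files that look like they need it; leave the rest alone.
for my $file (@ARGV) {
    strip_cr($file) if first_line_has_cr($file);
}
```

You'd run it as something like `perl stripcr.pl floop.cr` (the script name is just a placeholder). Files whose first line is already clean are opened, one line is read, and nothing gets rewritten.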
...roboticus
When your only tool is a hammer, all problems look like your thumb.
Re^2: One Liner to strip crlf
by toolic (Bishop) on Sep 04, 2014 at 16:19 UTC
Re^2: One Liner to strip crlf
by dirtdog (Monk) on Sep 04, 2014 at 16:10 UTC