dirtdog:

Hmmm ... it feels wrong to me, and here's why:

Just for grins, I built a reasonably large file (465MB) and tried several filters on it:

$ # Do nothing but count lines
$ time perl -i -pe '++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m2.641s
user 0m2.218s
sys 0m0.375s

$ # Your original filter
$ time perl -i -pe 's/\r//g; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m6.298s
user 0m5.703s
sys 0m0.421s

$ # Don't do it globally, end at the first one
$ time perl -i -pe 's/\r//; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m3.439s
user 0m2.937s
sys 0m0.390s

$ # Do it only at the end of the line
$ time perl -i -pe 's/\r$//; ++$cnt; END {print STDERR $cnt}' floop.cr
10000000
real 0m3.188s
user 0m2.781s
sys 0m0.359s

So you can gain some performance by tightening your regular expression. After I did so, the search-and-replace overhead was roughly 20% of the entire runtime, so there isn't a big win left to be had here. If that 20% is still enough time to be significant, then I'd suggest changing your processing: rather than always running a filter, write a small perl script that simply checks the first line of the file. If it has "\r", filter the file; otherwise process the original file as is. That way you could save nearly all of the I/O time when there is no "\r" in the file.

...roboticus

When your only tool is a hammer, all problems look like your thumb.


In reply to Re: One Liner to strip crlf by roboticus
in thread One Liner to strip crlf by dirtdog
