I'm curious how (and how often) others deal with files with foreign linebreaks in their scripts. I personally have to deal with this quite a lot, and not just the Windows (CRLF) <--> Linux (LF) issues. We also have a few Mac users (CR), which keeps us on our toes.
 
For instance, let's say we have a clueless user who FTPs a configuration file onto the Linux server in binary mode, and everything comes crashing down. I've dealt with this many a time, so I've grown accustomed to opening ALL my config/data files in binmode and splitting them with /\x0D?\x0A|\x0D/ (that is, if I'm not using a module to read them in).
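
A minimal sketch of that approach, assuming the file is small enough to slurp into memory. The helper name `read_lines_any_eol` is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read a file in binmode and split it on any of
# CRLF, LF, or CR. Assumes the whole file fits comfortably in memory.
sub read_lines_any_eol {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't read $path: $!";
    binmode $fh;                        # no CRLF translation by the I/O layer
    my $data = do { local $/; <$fh> };  # slurp the whole file
    close $fh;
    return () unless defined $data && length $data;
    # CRLF or bare LF is tried first, then a lone CR (old Mac style).
    return split /\x0D?\x0A|\x0D/, $data;
}
```

One caveat: split drops trailing empty fields, so blank lines at the very end of the file are discarded.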
 
How about my fellow monks? Thoughts? Comments? Insults?

__________
Build a man a fire, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life.
- Terry Pratchett

Re: Different linebreaks for different folks...
by fireartist (Chaplain) on May 09, 2006 at 08:18 UTC
Re: Different linebreaks for different folks...
by radiantmatrix (Parson) on May 10, 2006 at 17:16 UTC

    I've used something like this little stub on the off times I've needed to "guess" at filetypes:

    sub guess_newline {
        my $file = shift;
        open my $F, '<', $file or return undef;
        my ($buf, $sep);
        until ( eof($F) or defined $sep ) {
            # read a 1-k chunk + a random few bytes
            read( $F, $buf, 1024 + int(rand(5)) );
            # the trailing dot below is important, in case part of a newline is
            # truncated in the read!
            $sep = $1 if $buf =~ m/(\x0A|\x0D|\x0D\x0A)./;
        }
        close $F;
        return $sep;
    }

    The random length change is a workaround for a couple of files I've run across where the lines (including newline) were exactly 1024 bytes. Every chunk then ended on the newline, leaving nothing for the trailing dot to match, so my regex never matched. ;-)

    It works very well when used like this:

    {
        local $/ = guess_newline($filename) || die "Can't guess sep for $filename";
        open my $IN, '<', $filename or die "Can't read $filename: $!";
        while (<$IN>) {
            ...
        }
    }
    <radiant.matrix>
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: Different linebreaks for different folks...
by spiritway (Vicar) on May 10, 2006 at 16:25 UTC

    I rarely find it a problem. I only have to deal with Unix<->Windows issues, and it's a simple matter to convert between the two. If I forget, the results are usually noticeable enough to remind me right away.