Re: Different linebreaks for different folks...

I've used something like this little stub on the odd occasions when I've needed to "guess" a file's line endings:

sub guess_newline {
  my $file = shift;
  open my $F, '<', $file or return undef;
  binmode $F;  # raw bytes, so CRLF isn't translated to LF on Windows
  
  my ($buf, $sep);
  until ( eof($F) or defined $sep) {
    # read a 1-k chunk + a random few bytes
    read( $F, $buf, 1024+int(rand(5)) );
    
    # the trailing dot below is important, in case part of a newline
    # is truncated in the read!
    $sep = $1 if $buf=~m/(\x0A|\x0D|\x0D\x0A)./;
  }
  
  close $F;
  return $sep;
}

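The random read length is a workaround for a couple of files I've run across whose lines (including the newline) were exactly 1024 bytes long. With a fixed 1024-byte read, every newline landed at the very end of the buffer, with nothing after it for the trailing dot to match. As a result, my regex never matched. ;-)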
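
For comparison, the same boundary problem can also be handled without randomness, by carrying any newline bytes left at the end of one chunk over into the next read. What follows is only a sketch under that assumption, not the stub above: the name guess_newline_carry is made up for illustration. It also differs in one respect: at end of file it accepts a newline that has nothing after it, since nothing can be truncated at that point.

```perl
use strict;
use warnings;

# Hypothetical variant: fixed-size reads, with trailing newline bytes
# of each chunk carried over and re-examined with the next chunk.
sub guess_newline_carry {
    my $file = shift;
    open my $fh, '<', $file or return;
    binmode $fh;    # raw bytes, so CRLF is not translated on Windows

    my ($sep, $tail) = (undef, '');
    until (defined $sep) {
        my $got = read($fh, my $chunk, 1024);
        last unless $got;    # undef on error, 0 at end of file

        my $buf = $tail . $chunk;
        # Without /s the dot never matches "\n", so a CR that is really
        # the first half of a CRLF cannot match on its own.
        if ($buf =~ /(\x0D\x0A|\x0A|\x0D)./) {
            $sep = $1;
        }
        else {
            # Keep at most one trailing CR, LF or CRLF for the next pass.
            ($tail) = $buf =~ /(\x0D?\x0A?)\z/;
        }
    }
    close $fh;

    return $sep if defined $sep;
    # At end of file a leftover newline is complete, so it counts.
    return length($tail) ? $tail : undef;
}
```

Like the original, it returns undef when the file can't be opened or contains no newline at all, so it can be dropped into the same `|| die` idiom.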

guess_newline() works very well when used like this:

{
   local $/ = guess_newline($filename) || die "Can't guess sep for $filename";
   open my $IN, '<', $filename or die "Can't read $filename: $!";
   while (<$IN>) { ... }
}
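
To make the effect of localizing `$/` concrete, here is a small self-contained sketch. The helper read_lines and the sample data are invented for illustration, and an in-memory filehandle stands in for a real CR-terminated file. It shows that both the readline operator and chomp follow whatever separator `$/` holds, and that the old value comes back once the scope ends.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from an open handle using the
# given separator, returning the lines with that separator removed.
sub read_lines {
    my ($fh, $sep) = @_;
    local $/ = $sep;      # restored automatically when the sub returns
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;      # chomp strips whatever $/ currently holds
        push @lines, $line;
    }
    return @lines;
}

# In-memory handle standing in for a CR-terminated (classic Mac) file.
my $data = "one\rtwo\rthree\r";
open my $in, '<', \$data or die "Can't open in-memory handle: $!";
my @lines = read_lines($in, "\r");
close $in;

print join('|', @lines), "\n";    # prints one|two|three
```

In real use the separator would come from guess_newline($filename) rather than being hard-coded.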

<–radiant.matrix–>
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet
