in reply to Re: Reading CSV Files Containing UTF8 Characters
in thread Reading CSV Files Containing UTF8 Characters
I wrote this code to try to figure out what encoding each file uses, but for any file with the "special" characters I just get the "Didn't work" message.
This file was generated on Windows by exporting from Outlook. The Windows machine is set up for American English, but the keyboard is Danish. :-/ All the files that DO work are reported with "ascii" encoding (as I expect).

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode::Guess;

    # Directory to scan: first argument, or the current directory
    my $dir = @ARGV ? $ARGV[0] : '.';

    opendir my $dh, $dir or die "Can't opendir '$dir': $!\n";
    my @files = grep { /\.csv$/i } readdir $dh;
    closedir $dh;

    Encode::Guess->add_suspects(qw(latin1 cp1252));    # What else?

    foreach my $file (@files) {
        open my $fh, '<:raw', "$dir/$file" or die "Can't open '$dir/$file': $!\n";
        my $data = do { local $/; <$fh> };    # slurp the whole file
        close $fh;

        my $enc = guess_encoding($data);
        if (ref $enc) {
            print "$file: " . $enc->name . "\n";
        }
        else {
            # On failure guess_encoding returns an error string, not an object
            print "Didn't work for: $file ($enc)\n";
        }
    }
    exit;
Re^3: Reading CSV Files Containing UTF8 Characters
by graff (Chancellor) on Nov 09, 2007 at 04:31 UTC