Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
I have to import text files into a SQLite database. The files are generally provided in UTF-8. However, I cannot rule out that a file is in another encoding. I would like to check whether each text file is UTF-8 and, if it is not, discard it with an error message. I have the script below, but something does not seem to work. For now I am checking every line, although it might be better to check the file as a whole.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Encode;
    use Encode::Guess;

    open (DATA, "<:utf8", "a.txt") or die $!;
    binmode DATA, ":utf8";
    my $line = <DATA>;
    while($line){
        my $decoder = guess_encoding($line);
        if (ref($decoder) eq 'Encode::utf8'){
            print "File is in UTF-8\n";
            #doing something
        }
        $line = <DATA>;
    }
    __END__
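One thing worth noting about the script above: the `:utf8` layer decodes the data before `guess_encoding` sees it, so the guess is made on characters rather than on the raw bytes of the file. A minimal sketch of the whole-file check mentioned in the question might read the file through a `:raw` layer and attempt a strict decode. The helper names `is_valid_utf8` and `file_is_utf8` are hypothetical, and this is only one possible approach, not necessarily what the replies suggested. It can only tell whether the bytes are well-formed UTF-8; a plain ASCII file passes too, since ASCII is a subset of UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the given byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # work on a copy; decode() may modify its argument when a check is set
    # 'UTF-8' (with hyphen) is the strict variant; FB_CROAK makes decode() die on malformed input.
    return eval { decode('UTF-8', $bytes, FB_CROAK); 1 } ? 1 : 0;
}

# Hypothetical helper: slurp a file as raw bytes and validate it as a whole.
sub file_is_utf8 {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;            # slurp mode: read the entire file in one go
    my $bytes = <$fh>;
    close $fh;
    $bytes = '' unless defined $bytes;
    return is_valid_utf8($bytes);
}

# Check every file named on the command line; does nothing without arguments.
for my $path (@ARGV) {
    if (file_is_utf8($path)) {
        print "$path: valid UTF-8\n";
    }
    else {
        print STDERR "$path: not valid UTF-8, skipping\n";
    }
}
```

Run as, for example, `perl check_utf8.pl a.txt` (the script name is made up). Slurping the file keeps the sketch short; for very large files a chunked read would need extra care with multi-byte sequences that straddle chunk boundaries.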
Replies are listed 'Best First'.
Re: Guessing encode text file
by McA (Priest) on Jan 20, 2014 at 15:44 UTC
Re: Guessing encode text file
by aitap (Curate) on Jan 20, 2014 at 19:18 UTC
by karlgoethebier (Abbot) on Jan 20, 2014 at 19:39 UTC