in reply to What's the best way to detect character encodings, Windows-1252 v. UTF-8?
String::UTF8
Search::Tools::UTF8
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Search::Tools::UTF8;
    use String::UTF8 qw(:all);

    my $text = 'There are those of you out there stuck with Latin-1.';

    print is_utf8($text),           "\n",   # well-formed check (String::UTF8)
          is_valid_utf8($text),     "\n",   # Search::Tools::UTF8
          is_ascii($text),          "\n",   # Search::Tools::UTF8
          looks_like_cp1252($text), "\n";   # Search::Tools::UTF8

It outputs:

    1
    1
    1
    0

The four lines correspond to the four tests, in order. The string is well-formed, valid UTF-8. It is also ASCII, but it does not look like cp1252. The well-formed test comes from String::UTF8, while the other functions come from Search::Tools::UTF8. Does this help?
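If you'd rather not pull in either module, here is a rough sketch of the same kind of test using only the core Encode module. This is not how String::UTF8 or Search::Tools::UTF8 do it internally; guess_encoding is a hypothetical helper name, and it assumes the input is a raw byte string (no decoding done yet). It can only say "not valid UTF-8", so the cp1252 answer is a guess, not proof.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from either module above): classify a raw
# byte string as 'ascii', 'utf8', or 'cp1252-or-other'.
sub guess_encoding {
    my ($bytes) = @_;

    # No bytes above 0x7F: plain ASCII, which is valid in both encodings.
    return 'ascii' unless $bytes =~ /[\x80-\xFF]/;

    # Strict decode: FB_CROAK makes decode() die on malformed input.
    # Work on a copy, because a CHECK value may modify its argument.
    my $copy = $bytes;
    my $ok = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
    return 'utf8' if $ok;

    # High bytes that are not valid UTF-8: assume a single-byte encoding
    # such as cp1252 (an assumption, it could also be Latin-1 etc.).
    return 'cp1252-or-other';
}

print guess_encoding('stuck with Latin-1'), "\n";   # ASCII only
print guess_encoding("caf\xC3\xA9"),        "\n";   # e-acute as UTF-8 bytes
print guess_encoding("caf\xE9"),            "\n";   # e-acute as one cp1252 byte
```

The order of the checks matters: ASCII first, since it is a subset of both encodings, then strict UTF-8, and only then the single-byte fallback. A short cp1252 string can, in rare cases, also happen to be valid UTF-8, so treat the result as a heuristic.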
Re^2: What's the best way to detect character encodings, Windows-1252 v. UTF-8?
by ikegami (Patriarch) on Jun 17, 2011 at 18:50 UTC

Re^2: What's the best way to detect character encodings, Windows-1252 v. UTF-8?
by Jim (Curate) on Jun 17, 2011 at 17:38 UTC