To check whether the input contains any non-ASCII bytes at all:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Count input lines that contain any byte outside the 7-bit ASCII range.
    my $non_ascii = 0;
    while (<>) {
        $non_ascii++ if /[^\x00-\x7f]/;
    }
    warn "input contains non-ASCII\n" if $non_ascii;

Beyond that, if there is non-ASCII content, working out what it actually is (which character encoding, which language) might require some guessing. Encode::Guess could be helpful, depending on what language and character encoding are actually present.
People who know enough to use XML with non-ASCII data usually also know to use utf8 encoding, and if your data falls into this category, Encode::Guess will work fine to confirm that (byte patterns in utf8 are quite distinctive and unmistakable). But if it's one or another single-byte encoding (any of the cp125* or iso-8859-* character sets), you'll need to know what the intended language is in order to help Encode::Guess come up with the right answer.
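As a rough sketch of the above (not a definitive recipe): try the default utf8 guess first, and only if that fails, offer Encode::Guess exactly one single-byte suspect. The helper name guess_name and the choice of 'cp1252' as the fallback are assumptions for illustration; you would pick the fallback based on what language you expect the data to be in.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding()

# Hypothetical helper: returns an encoding name, or undef if no guess works.
# $fallback is an assumption supplied by the caller (e.g. 'cp1252' for
# Western European text); it is tried only when the default guess fails.
sub guess_name {
    my ( $bytes, $fallback ) = @_;

    # With no extra suspects, the candidates are ascii and utf8
    # (plus UTF-16/32 when a BOM is present).
    my $enc = guess_encoding($bytes);
    return $enc->name if ref $enc;

    # On failure guess_encoding() returns an error string, not an object.
    return undef unless defined $fallback;

    # Offer a single single-byte suspect; listing several of them
    # tends to produce an ambiguous "X or Y" answer.
    $enc = guess_encoding( $bytes, $fallback );
    return ref $enc ? $enc->name : undef;
}

# Sample byte strings (made up for this example).
for my $sample ( "plain text", "caf\xc3\xa9", "na\xefve caf\xe9" ) {
    my $name = guess_name( $sample, 'cp1252' );
    print defined $name ? $name : "no guess", "\n";
}
```

The utf8 check goes first because any utf8 byte sequence is also a legal (if nonsensical) string in most single-byte encodings, so passing both at once would leave Encode::Guess unable to choose.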
In reply to Re: How to check the encoding format of an XML
by graff
in thread How to check the encoding format of an XML
by rellaboyina