in reply to reading unicode files
There is no such thing as "Unicode files".
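Unicode is a character set: it assigns a number (a code point) to each character, and says nothing by itself about how those numbers are stored as bytes.
The things of interest here are the transformation formats (encodings); the best known are utf-8 and utf-16.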
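To make the difference between the two formats concrete, here is a minimal sketch that encodes one code point both ways and dumps the bytes. It assumes the Encode module (bundled with Perl 5.8); the helper name hex_bytes is my own, made up for this example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+0905 DEVANAGARI LETTER A: a single Unicode code point.
my $char = "\x{0905}";

# The same code point serialized in two different transformation formats.
my $utf8  = encode('UTF-8',    $char);
my $utf16 = encode('UTF-16BE', $char);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

print "UTF-8:    ", hex_bytes($utf8),  "\n";   # E0 A4 85
print "UTF-16BE: ", hex_bytes($utf16), "\n";   # 09 05
```

One character, three bytes in one format and two in the other, which is why you need to know the encoding before you can read the file.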
In the following, I'll assume you have utf-8-encoded files.
For Perl 5.005, you just have to handle them as binary files, since it has no support for Unicode strings.
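As a rough sketch of what "handle them as binary files" means: read the bytes untouched and remember that Perl is counting bytes, not characters. The file name sample-utf8.txt is hypothetical, and the script writes the file itself only so that it can run on its own.

```perl
use strict;

# Hypothetical file name, used only for this sketch.
my $file = 'sample-utf8.txt';

# Write the three UTF-8 bytes of U+0905 so the example is self-contained.
open(OUT, "> $file") or die "Cannot write $file: $!";
binmode(OUT);
print OUT "\xE0\xA4\x85\n";
close(OUT);

# Read the line back as raw bytes: no decoding takes place.
open(IN, "< $file") or die "Cannot read $file: $!";
binmode(IN);
my $line = <IN>;
close(IN);
unlink $file;

chomp $line;
# One character on disk, but Perl sees three separate bytes.
print length($line), "\n";   # 3
```

Anything that depends on character boundaries (length, substr, regex character classes) would have to be done by hand on the byte sequences.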
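Perl 5.8 does have support; you just have to tell it which encoding your files are in.
This script opens the file assuming it is in utf-8, and prints a message for each line in which it finds a character in the Devanagari script:

    # Open with the :utf8 layer so lines are read as characters, not bytes
    open UTF8FILE, '<:utf8', 'filename' or die "Cannot open file: $!";
    while (<UTF8FILE>) {
        # \p{Devanagari} matches any character in the Devanagari script
        /\p{Devanagari}/ and print "A Devanagari character!\n";
    }
    close UTF8FILE;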
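If you want to try the :utf8 layer without having a suitable file at hand, something like the following should work. It is only a sketch: it assumes File::Temp (bundled with 5.8), and the sample text and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file holding one ASCII line and one Devanagari line.
my ($tmp, $name) = tempfile(UNLINK => 1);
binmode($tmp, ':utf8');
print {$tmp} "plain ascii\n";
print {$tmp} "\x{0928}\x{092E}\x{0938}\x{094D}\x{0924}\x{0947}\n";
close $tmp;

# Read it back through the :utf8 layer and count the matching lines.
open my $in, '<:utf8', $name or die "Cannot open $name: $!";
my $hits = 0;
while (my $line = <$in>) {
    $hits++ if $line =~ /\p{Devanagari}/;
}
close $in;

print "$hits line(s) contain Devanagari\n";   # 1 line(s) contain Devanagari
```

Without the ':utf8' on the read side, the second line would come back as a string of bytes and the property match would not see Devanagari characters.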
Things to look at (in the 5.8 docs):
And also The UTF-8 and Unicode FAQ
--
dakkar - Mobilis in mobile
Re: Re: reading unicode files
by Anonymous Monk on Mar 13, 2003 at 15:06 UTC
by dakkar (Hermit) on Mar 13, 2003 at 15:18 UTC