in reply to Getting Data from an Excel File
Not exactly sure I understood how you extracted the textual data from the Excel file... but for converting Windows Unicode (UTF-16, little-endian) plain text files into UTF-8, the following should do the trick:
use strict;
use warnings;

open IN, "<:encoding(utf16le)", "test.utf16le" or die $!;
open OUT, ">:encoding(utf8)", "test.utf8" or die $!;

while (my $line = <IN>) {
    print OUT $line;
}

close IN;
close OUT;
The idea is essentially to tell Perl what your existing input encoding and your desired output encoding are, and to let Perl do the rest.
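To make "the rest" a bit more concrete: the I/O layers do roughly what the core Encode module does by hand, decoding bytes into characters on input and encoding characters back into bytes on output. A minimal sketch of that step in memory, with made-up sample data (the sub name utf16le_to_utf8 is only for illustration, it is not part of any module):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn a UTF-16LE byte string into a UTF-8 byte string.
sub utf16le_to_utf8 {
    my ($bytes) = @_;
    my $chars = decode('UTF-16LE', $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);           # characters -> UTF-8 bytes
}

# Sample data (assumed): e-acute plus newline in UTF-16LE is E9 00 0A 00.
my $utf8 = utf16le_to_utf8("\xE9\x00\x0A\x00");

# Show the resulting bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
# prints "C3 A9 0A"
```

For whole files the :encoding layers are usually more convenient, since they do this chunk by chunk as you read and write.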
Update: BTW, if the input file contains a BOM (which it almost always does on Windows), it would have been sufficient to specify :encoding(utf16). Perl can then figure out by itself that the file is in little-endian format. Interestingly though, the output file does not contain a UTF-8 BOM when doing it that way. I never really understood the reasoning behind that behaviour... When you convert it as shown above, however, the output file will have a BOM (presumably because it's then converted just like any other codepoint), which is recommended on Windows.
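If you go the :encoding(utf16) route and still want a BOM in the output, one option is to print U+FEFF yourself before copying the data. A rough sketch, assuming the input really does start with a BOM (the UTF-16 layer dies otherwise); the sub name, the $want_bom flag and the temp file names are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: convert a BOM-marked UTF-16 file to UTF-8,
# optionally writing a UTF-8 BOM at the start of the output.
sub utf16_to_utf8_file {
    my ($in_path, $out_path, $want_bom) = @_;

    # :encoding(UTF-16) uses the BOM to pick the byte order and drops it.
    open my $in,  '<:encoding(UTF-16)', $in_path  or die "$in_path: $!";
    open my $out, '>:encoding(UTF-8)',  $out_path or die "$out_path: $!";

    print {$out} "\x{FEFF}" if $want_bom;   # U+FEFF is written as EF BB BF
    print {$out} $_ while <$in>;

    close $in;
    close $out or die "$out_path: $!";
}

# Sample input (made up): BOM FF FE followed by "hi\n" in UTF-16LE.
my $dir = tempdir(CLEANUP => 1);
open my $raw, '>:raw', "$dir/in.txt" or die $!;
print {$raw} "\xFF\xFE" . "h\x00i\x00\x0A\x00";
close $raw;

utf16_to_utf8_file("$dir/in.txt", "$dir/out.txt", 1);
```

Passing a false third argument gives the BOM-less output described above.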
Re^2: Getting Data from an Excel File
by Jim (Curate) on Feb 27, 2008 at 19:19 UTC
by almut (Canon) on Feb 27, 2008 at 21:54 UTC
Re^2: Getting Data from an Excel File
by mrguy123 (Hermit) on Feb 27, 2008 at 12:40 UTC