I'm not exactly sure I understood how you extracted the textual data from the Excel file... but for converting Windows "Unicode" (UTF-16) plain text files into UTF-8, the following should do the trick:
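    use strict;
    use warnings;

    # Decode the input as UTF-16LE, encode the output as UTF-8.
    open my $in,  "<:encoding(utf16le)", "test.utf16le" or die $!;
    open my $out, ">:encoding(utf8)",    "test.utf8"    or die $!;

    # Copy line by line; the I/O layers do the actual conversion.
    while (my $line = <$in>) {
        print $out $line;
    }

    close $in;
    close $out or die $!;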
The idea is essentially to tell Perl what your existing input encoding and your desired output encoding are, and then let Perl do the rest.
Update: BTW, if the input file contains a BOM (which it almost always does on Windows), it would have been sufficient to specify :encoding(utf16). In this case Perl can figure out by itself that the file is in little-endian format. Interestingly though, the output file does not contain a UTF-8 BOM when doing it that way. I never really understood the reasoning behind that behaviour... When you convert it with :encoding(utf16le), however, the output file will have a BOM (presumably because the BOM is then converted just like any other codepoint), which is recommended on Windows.
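To see the BOM difference for yourself, here is a minimal sketch. It assumes a Perl with the core Encode module; the file names (sample.utf16le, kept_bom.utf8, no_bom.utf8) and the helper subs convert and has_utf8_bom are made up for illustration. It writes a tiny UTF-16LE file that starts with a BOM, converts it once with each input layer, and reports whether the result begins with the UTF-8 BOM bytes EF BB BF. The :raw in the layer lists is my addition, so that no newline translation gets in the way on Windows.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Convert $src to UTF-8 in $dst, decoding the input with encoding $enc.
sub convert {
    my ($enc, $src, $dst) = @_;
    open my $in,  "<:raw:encoding($enc)",  $src or die "Cannot read $src: $!";
    open my $out, ">:raw:encoding(UTF-8)", $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {
        print $out $line;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
}

# True if the file starts with the UTF-8 encoded BOM (bytes EF BB BF).
sub has_utf8_bom {
    my ($path) = @_;
    open my $fh, "<:raw", $path or die "Cannot read $path: $!";
    read($fh, my $head, 3);
    close $fh;
    return defined $head && $head eq "\xEF\xBB\xBF";
}

# Build a small sample file: BOM (U+FEFF) followed by "hi\n", as UTF-16LE.
open my $fh, ">:raw", "sample.utf16le" or die "Cannot write sample: $!";
print $fh encode("UTF-16LE", "\x{FEFF}hi\n");
close $fh or die "Cannot close sample: $!";

convert("UTF-16LE", "sample.utf16le", "kept_bom.utf8");  # BOM treated as data
convert("UTF-16",   "sample.utf16le", "no_bom.utf8");    # BOM used up by the decoder

printf "UTF-16LE layer: BOM %s\n", has_utf8_bom("kept_bom.utf8") ? "kept" : "dropped";
printf "UTF-16 layer:   BOM %s\n", has_utf8_bom("no_bom.utf8")   ? "kept" : "dropped";
```

If you want the BOM-detecting UTF-16 layer and still need a BOM in the output, one option would be to print "\x{FEFF}" to the output handle yourself before copying the lines.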
In reply to Re: Getting Data from an Excel File
by almut
in thread Getting Data from an Excel File
by mrguy123