Rather than trying to come up with explanations for why you get what
you get... let me just recommend that, when debugging encoding
problems, you start by looking at the raw data[1] with as few
tools in between as possible. Every tool (terminal, putty,
grep, vi, browser, whatever) or processing step might have its own
problems or automagic-isms with one or the other encoding, so what you
see may not necessarily be what you really have.
My preferred tool in cases like these is a classic hexdump or, as
ikegami suggested, a Devel::Peek dump (when in Perl). Together with
some knowledge of how the various encodings represent data, this
usually allows you to figure out what's going wrong, eventually.
In this particular case, I would start by looking at a hexdump of
the CSV file (e.g. with the command line tool hexdump, which is
installed/available on most distros). Then we would reliably know what
the "Déjà" is represented as in the CSV file.
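
To make that concrete, here is a minimal sketch of what to look for. It does not touch your CSV file; it just feeds the two most likely byte sequences for "Déjà" through od (used here instead of hexdump only because od is specified by POSIX; `hexdump -C` on the real file gives the same information). The variable names are made up for illustration.

```shell
# "Déjà" encoded as UTF-8: é is the two bytes C3 A9, à is C3 A0.
utf8_hex=$(printf 'D\303\251j\303\240' | od -An -tx1 | tr -d ' \n')

# "Déjà" encoded as Latin-1 (ISO-8859-1): é is the single byte E9, à is E0.
latin1_hex=$(printf 'D\351j\340' | od -An -tx1 | tr -d ' \n')

echo "UTF-8:   $utf8_hex"
echo "Latin-1: $latin1_hex"

# On the real file you would run something like this instead
# (data.csv is a hypothetical file name):
#   hexdump -C data.csv | head
```

If the dump of the CSV file shows c3 a9 where the é should be, the file is UTF-8; a lone e9 points to Latin-1. Something like c3 83 c2 a9 would suggest the text was encoded to UTF-8 twice.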
___
[1] Though Juerd might disagree, saying that
you shouldn't, unless you're an expert already...
What are the values you get when typing
:set encoding
:set fileencoding
in vim?
What do you get when you simply enter locale on the command line?
root@slarti:/media/usbdisk/music_org/scripts/test# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
:set encoding
encoding=utf-8
:set fileencoding
fileencoding=latin1