in reply to encoding issues
First of all, it'll help a lot to get a look at your data in terms of hex code-point numbers -- here's a tool you can use for that: tlu -- TransLiterate Unicode
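If you just want a quick look before fetching the tool, the same idea can be sketched in a few lines. This is not tlu itself, only a rough Python stand-in; the function name `codepoints` is made up for illustration, and it assumes the data has already been decoded into a character string.

```python
import unicodedata

def codepoints(text):
    """Return one (hex code point, Unicode name) pair per character."""
    # unicodedata.name() raises for unnamed characters (e.g. controls),
    # so a placeholder default is supplied.
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# Hypothetical sample string: "caf" followed by e-acute
for cp, name in codepoints("caf\u00e9"):
    print(cp, name)
```

Anything at U+0100 or above is outside the latin1 range, so those are the lines to watch for in the listing.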
Next, for grepping particular Unicode characters out of your data, there's this: grepp -- Perl version of grep
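To give a feel for what searching by code point looks like, here is a minimal sketch in Python. It is not how grepp works internally; `grep_codepoints` and the sample lines are hypothetical, and the input is assumed to be already-decoded text.

```python
def grep_codepoints(lines, lo, hi):
    """Yield (line_number, line) for lines holding a character in lo..hi inclusive."""
    for number, line in enumerate(lines, start=1):
        if any(lo <= ord(ch) <= hi for ch in line):
            yield number, line

# Hypothetical sample data: pure ASCII, a latin1 character, and a euro sign
sample = ["plain ascii", "caf\u00e9", "\u20ac5"]

# Lines with anything non-ASCII
non_ascii = list(grep_codepoints(sample, 0x80, 0x10FFFF))

# Lines with anything beyond the latin1 range (U+0100 and up)
beyond_latin1 = list(grep_codepoints(sample, 0x100, 0x10FFFF))
```

For a real file you would pass in an open file handle instead of the list, e.g. one opened with `open(path, encoding="utf-8")`, assuming the file really is UTF-8.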
Apart from that, in terms of getting things into the database properly, do you have the ability to create or alter tables? If so, you should be able to find the means to (re)define tables or columns to use utf8 encoding rather than the server's default latin1 encoding; that way, you won't need to worry about whether your data contains anything outside the latin1 range.