Please pardon the Shameless Plug for My Own Nodes, but I hope these will be useful...
First of all, it'll help a lot to get a look at your data in terms of hex code-point numbers -- here's a tool you can use for that: tlu -- TransLiterate Unicode
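If you just want a quick look before fetching the tool, here is a minimal sketch of the same idea. It is not the tlu script itself: the `codepoints` helper is a hypothetical name of mine, and it assumes your data is UTF-8 encoded bytes.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of tlu): return one "U+XXXX" label
# for each character in a UTF-8 encoded byte string.
sub codepoints {
    my ($bytes) = @_;
    # FB_CROAK makes malformed UTF-8 die instead of being silently replaced.
    my $chars = decode('UTF-8', $bytes, Encode::FB_CROAK());
    return map { sprintf 'U+%04X', ord($_) } split //, $chars;
}

# "caf" followed by the two-byte UTF-8 sequence for e-acute.
print join(' ', codepoints("caf\xC3\xA9")), "\n";
# prints: U+0063 U+0061 U+0066 U+00E9
```

Seeing U+00E9 as a single code point (rather than U+00C3 U+00A9) tells you the bytes really were UTF-8 and were decoded only once.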
Next, in terms of grepping for particular unicode characters in data, there's this: grepp -- Perl version of grep
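The core of that kind of search can be sketched in a few lines. Again, this is not the grepp script: `grep_unicode` is a hypothetical helper, and it assumes each input line is a UTF-8 encoded byte string that must be decoded before character-level matching makes sense.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of grepp): return the original byte
# lines whose decoded text matches the given pattern.
sub grep_unicode {
    my ($pattern, @byte_lines) = @_;
    return grep { decode('UTF-8', $_) =~ $pattern } @byte_lines;
}

my @lines = (
    "plain ascii",
    "caf\xC3\xA9",        # U+00E9, still inside the latin1 range
    "\xE2\x82\xAC 5",     # U+20AC EURO SIGN, outside latin1
);

# Match any character above U+00FF, i.e. anything latin1 cannot hold.
my @hits = grep_unicode(qr/[^\x00-\xFF]/, @lines);
print scalar(@hits), " line(s) contain non-latin1 characters\n";
# prints: 1 line(s) contain non-latin1 characters
```

The same helper takes any pattern, so `qr/\x{20AC}/` or a property class such as `qr/\p{Greek}/` would work in place of the range test.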
Apart from that, as for getting things into the database properly: do you have the ability to create or alter tables? If so, you should be able to (re)define tables or columns to use utf8 encoding rather than the server's default latin1 encoding; that way, you won't need to worry about whether your data contains anything outside the latin1 range.
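As a rough illustration of the table change, the sketch below only builds the statement text; it does not connect to anything. It assumes a MySQL or MariaDB server (a guess on my part, based on the latin1 default), and `convert_table_sql` and the table name are hypothetical. Check your own server's documentation before running the result.

```perl
use strict;
use warnings;

# Hypothetical helper: build a MySQL-style statement that converts an
# existing table and its text columns to a UTF-8 character set.
# Assumption: MySQL/MariaDB syntax; other servers differ.
sub convert_table_sql {
    my ($table, $charset) = @_;
    # In MySQL, "utf8" stores at most 3 bytes per character;
    # "utf8mb4" covers the full Unicode range, so it is the default here.
    $charset //= 'utf8mb4';
    die "unsafe table name: $table\n"  unless $table   =~ /\A[A-Za-z0-9_]+\z/;
    die "unsafe charset: $charset\n"   unless $charset =~ /\A[A-Za-z0-9_]+\z/;
    return "ALTER TABLE `$table` CONVERT TO CHARACTER SET $charset";
}

print convert_table_sql('my_table'), "\n";
# prints: ALTER TABLE `my_table` CONVERT TO CHARACTER SET utf8mb4
```

Back up the table first: converting rewrites the stored data, and text that was already mis-stored as latin1 will not be repaired by the conversion.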