PerlMonks  

So, I know it's all confusing. It took me forever. But it's actually really simple. A string of bytes means nothing on its own; it's just binary data. You have to know what encoding it's supposed to be in and tell your code so: decode when the bytes come in, encode when the characters go back out. The raw data doesn't know (well, some encodings can carry a BOM, but it's not something you can rely on here). Your DBI/DBD driver can do the encode/decode two-step for you automatically, as I suggested (it might work even if the table definition is wrong, but it's best to make sure the two agree). :P Examples of the setting to check include:

  • DBD::mysql -> mysql_enable_utf8
    • This attribute determines whether DBD::mysql should assume strings stored in the database are utf8. This feature defaults to off.
  • DBD::SQLite -> sqlite_unicode
    • If the attribute $dbh->{sqlite_unicode} is set, strings coming from the database and passed to the collation function will be properly tagged with the utf8 flag; but this only works if the attribute is set before the first call to a perl collation sequence. The recommended way to activate unicode is to set the sqlite_unicode parameter at connection time.
  • DBD::Pg -> pg_enable_utf8 (integer)
    • DBD::Pg specific attribute. The behavior of DBD::Pg with regards to this flag has changed as of version 3.0.0. The default value for this attribute, -1, indicates that the internal Perl utf8 flag will be turned on for all strings coming back from the database if the client_encoding is set to 'UTF8'. Use of this default is highly encouraged. If your code was previously using pg_enable_utf8, you can probably remove mention of it entirely. :\
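
To make the two-step concrete, here is a minimal sketch rather than a definitive recipe. The first half does the decode/encode by hand with the core Encode module; the second half lets the driver do it by passing sqlite_unicode at connection time, as the DBD::SQLite note above recommends. The in-memory database, the table t, and the sample string are made up for illustration, and it assumes DBD::SQLite is installed. For DBD::mysql the same attribute hash would carry mysql_enable_utf8 instead.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use DBI;

# Manual two-step: bytes -> characters coming in, characters -> bytes going out.
my $bytes = "caf\xC3\xA9";              # 5 raw bytes: "cafe" with e-acute, in UTF-8
my $chars = decode('UTF-8', $bytes);    # now a 4-character Perl string
my $back  = encode('UTF-8', $chars);    # the same 5 bytes again

printf "%d bytes, %d characters\n", length $bytes, length $chars;

# Driver-assisted: set the attribute when connecting (hypothetical in-memory DB).
my $dbh = DBI->connect(
    'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, sqlite_unicode => 1 },
);
$dbh->do('CREATE TABLE t (name TEXT)');
$dbh->do('INSERT INTO t (name) VALUES (?)', undef, $chars);

# The value comes back as characters, so no manual decode is needed here.
my ($name) = $dbh->selectrow_array('SELECT name FROM t');
printf "round trip %s\n", $name eq $chars ? 'ok' : 'FAILED';
```

The point of the comparison at the end is that, with the attribute set, what you read back is already a character string; without it you would get the raw bytes and have to decode them yourself.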

Update: s/simply/simple/;


In reply to Re^6: UTF8 issue when getting website via LWP::UserAgent in Perl by Your Mother
in thread UTF8 issue when getting website via LWP::UserAgent in Perl by ultranerds
