By "a while ago" I meant "six months ago". Since it's a forum DB, there have been many changes and new posts, so redoing the import is not an option.
Exporting the text and repairing the encoding is what I am trying to do right now. The text is now UTF8, so simply changing the encoding does not work.
Exporting the text and repairing the encoding is what I am trying to do right now.
Export using current database tools, not perl. This makes sure you won't get encoding problems at that level, and a bulk export / import is often faster than working on table rows.
The text is now UTF8, so simply changing the encoding does not work.
There is no text, there are only database columns, and they contain a mix of properly UTF-8 encoded characters and mojibake.
Exporting, fixing mojibake, and importing takes DBI, DBD::mysql, and the MySQL client libraries used by DBD::mysql out of the picture. Only a UTF-8 encoded text file is left, which is much easier to handle. For fresh postings, the export should show properly encoded characters in a UTF-8 capable editor. After fixing it (maybe using perl or some other tool), the entire export should show properly encoded characters in the editor. Importing it back into the DB will fix the problem.
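For the "fix" step, here is a minimal sketch in Python (the same idea ports to perl). It rests on an assumption the thread does not confirm: that the mojibake was produced by reading UTF-8 bytes as cp1252 / Latin-1 and encoding the result as UTF-8 once more. The function names and the file names are made up for illustration. Only sequences that look like a misread UTF-8 character are touched, so properly encoded text from fresh postings should pass through unchanged.

```python
import re


def _cp1252_char(byte):
    """Character that one byte decodes to under cp1252, with a Latin-1 fallback."""
    try:
        return bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # cp1252 leaves a few bytes (0x81, 0x8D, ...) undefined.
        return bytes([byte]).decode("latin-1")


# Reverse mapping: character -> the byte it was (presumably) misread from.
CHAR_TO_BYTE = {_cp1252_char(b): b for b in range(0x80, 0x100)}


def _chars(lo, hi):
    """Regex character-class body for the characters of bytes lo..hi."""
    return "".join(re.escape(_cp1252_char(b)) for b in range(lo, hi + 1))


_CONT = _chars(0x80, 0xBF)  # UTF-8 continuation bytes, as misread characters

# A UTF-8 lead byte followed by exactly the number of continuation bytes it needs.
MOJIBAKE = re.compile(
    f"[{_chars(0xC2, 0xDF)}][{_CONT}]"
    f"|[{_chars(0xE0, 0xEF)}][{_CONT}]{{2}}"
    f"|[{_chars(0xF0, 0xF4)}][{_CONT}]{{3}}"
)


def _repair(match):
    """Turn the matched characters back into bytes and decode them as UTF-8."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in match.group(0))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return match.group(0)  # not a misread character after all, keep it


def fix_mojibake(text):
    """Repair double-encoded characters, leaving everything else alone."""
    return MOJIBAKE.sub(_repair, text)


def fix_file(src, dst):
    """Read a UTF-8 export line by line and write a repaired copy."""
    with open(src, encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(fix_mojibake(line))
```

Something like `fix_file("forum_export.sql", "forum_export_fixed.sql")` would then produce the file to import. A pattern like this can hit false positives, for example a genuine accented letter directly followed by a no-break space and a closing guillemet, so diff the two files and check the changes before importing anything.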
Use a second DB (and a second forum installation) to develop and test the process. Once the process works, shut down the forum, export, fix, import, restart the forum.
(This is what you should have done six months ago.)
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Aww, now I understand. Yes, that's very damn simple.
Export, use search & replace in my text editor, and import. That'll do the trick. Thanks a lot. :)
And yes... I should've done this six months ago. Well, it didn't bother me much, but we have some texts in the forum that we want to reuse now.