I was scraping from that site because I needed Russian text. Someone complained that about 1% of the e-mails being sent out were wrong when Russian characters were entered into the form. I already tried the print (with non-Cyrillic text), and the program itself is not introducing HTML entities. I have also been testing in Thunderbird and Gmail. Both end up with the odd subjects.

From the sound of it, I need to work out which charset each submission is using. I guess it could be any charset, since people around the globe use the form.

I will just start debugging more and maybe I will find something. Thanks for your help, and if you can think of anything else let me know.

Re^9: Problem with russian / cyrillic in e-mail program.
by Corion (Patriarch) on Apr 05, 2010 at 08:06 UTC

    So, to recap, you don't control which character set your HTML form submissions arrive in. This has very little to do with the way you're sending mail, and everything to do with the fact that HTTP does not specify the character set in which form data is sent. The usual approach to solving that problem is to declare an explicit encoding on every HTML page containing a form and hope that the browser will use that encoding when sending the data. Some browsers also send a header indicating the character set.
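    To make the idea concrete, here is a rough sketch of the receiving side, written in Python purely for illustration (the program in this thread is not shown, so none of this is its actual code). The helper names `decode_form_value` and `encode_subject` are made up, and the fallback order (declared charset, then UTF-8, then windows-1251) is only an assumption that suits a form receiving mostly Russian text. The sketch decodes the raw form bytes to text first, then encodes the mail subject explicitly as UTF-8 so the mail client does not have to guess.

```python
from email.header import Header, decode_header


def decode_form_value(raw, declared_charset=None):
    """Decode raw form bytes, trying the declared charset first.

    The fallback list is an assumption for this sketch, not a standard.
    """
    candidates = []
    if declared_charset:
        candidates.append(declared_charset)
    candidates += ["utf-8", "windows-1251"]
    for charset in candidates:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or a charset name Python does not know: try the next one.
            continue
    # Last resort: keep what can be decoded and mark the rest as replacement characters.
    return raw.decode("utf-8", errors="replace")


def encode_subject(text):
    """Return the subject as an RFC 2047 encoded-word string in UTF-8."""
    return Header(text, "utf-8").encode()


if __name__ == "__main__":
    raw = "Привет".encode("windows-1251")  # what a browser might send
    subject = decode_form_value(raw)
    print(encode_subject(subject))
```

    Trying UTF-8 before the single-byte charset matters: Cyrillic text in windows-1251 is almost never valid UTF-8, so a wrong UTF-8 guess fails loudly, whereas windows-1251 will accept nearly any byte sequence and silently produce garbage.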