
Sounds like a nice trick, especially since the official ways of handling request character encoding are neither well supported nor very well thought out (as far as I know, a user agent is only required to send charset information for multipart forms).

I'm just wondering what string to use, though. There are dozens of encodings in use around the world, and ideally you should be able to recognize each one. Is there any "standard" way of doing this? (A CPAN module would be wonderful, of course.)

Re^3: Encoding confusion with CGI forms
by borisz (Canon) on Oct 22, 2004 at 19:48 UTC
    I live in Germany, so my string is only 'äöü', which is enough to distinguish between ISO-8859-1, UTF-8 and unknown. But you can extend this to all the encodings you support: just find characters whose byte representations differ between them.
    Boris
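
A minimal sketch of the check described above, written in Python purely for illustration (the thread itself is about Perl and CGI.pm). It assumes the form carries a hidden field holding the probe string and that the server can see the raw, undecoded bytes the browser submitted for it. The names `PROBE`, `CANDIDATES` and `detect_form_encoding` are hypothetical and not part of any module.

```python
# Probe string placed in a hidden form field (hypothetical setup): "äöü".
PROBE = "\u00e4\u00f6\u00fc"

# Encodings the application is prepared to accept, tried in this order.
CANDIDATES = ("utf-8", "iso-8859-1")


def detect_form_encoding(raw_probe, probe=PROBE, candidates=CANDIDATES):
    """Return the first candidate encoding whose encoding of `probe`
    equals the raw bytes received, or None if nothing matches."""
    for name in candidates:
        try:
            expected = probe.encode(name)
        except UnicodeEncodeError:
            # This encoding cannot represent the probe at all; skip it.
            continue
        if raw_probe == expected:
            return name
    return None


if __name__ == "__main__":
    print(detect_form_encoding(b"\xc3\xa4\xc3\xb6\xc3\xbc"))  # UTF-8 bytes
    print(detect_form_encoding(b"\xe4\xf6\xfc"))              # ISO-8859-1 bytes
    print(detect_form_encoding(b"???"))                       # unknown
```

Note that 'äöü' alone cannot tell apart encodings that give these three characters the same bytes (ISO-8859-1, ISO-8859-15 and Windows-1252 all do), so supporting more encodings means adding characters to the probe whose bytes differ between them.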
Re^3: Encoding confusion with CGI forms
by davistv (Acolyte) on Oct 22, 2004 at 21:05 UTC
    I really like this approach as well; I'm just having trouble coming up with a string that degrades in some predictable manner for different encodings.

    A Perl module for this would be awesome, btw! It would be even more automagical if it were integrated into CGI.pm behind the scenes!

    Cheers,
    Troy