in reply to Re^12: somethign wrong with the sumbit
in thread somethign wrong with the sumbit
I must apologize -- I should have seen that. But you seem to be saying "I haven't tried to figure out why it doesn't run or why I get this error message". The first thing you could try, after downloading the code, is this command line (in a shell window):
    perl -T -cw testmenu.pl
If there were a syntax error in the script, this would tell you what lines in the script have problems. If there are no errors, you can just run it as a shell command, and it should print out valid HTML to the shell window.
The "premature end of headers" message just means that the script exited before it got very far in writing HTML data to the browser. Maybe the web server doesn't think the file is executable? If you have a working cgi script, does the file name for that script end in ".pl" or something else? (Often, a web server is configured so that it will only execute a script if it has a specific filename extension; ".pl" might not be right for your server, and you may need to rename testmenu.pl to something else.)
I can assure you that it runs correctly for me (perl 5.8.8 on macosx, running an apache2 web server), and I don't think there's anything in the script that depends on the particular OS, web server, browser or perl version (so long as it's perl 5.8.0 or later). Please keep trying to see if you can make it work on your machine. In any case, the important thing is the "decode()" line. Use that in your own script, and see what happens.
... the char 'n' in greek-iso will be 10010000 while the char 'n' in utf-8 will be stored as 10101010 01011010 ? Which means we are using different bytes in each case for storing?
First off, let's use hex digits, ok? It's just easier. I'm sorry, but 10010000 (0x90 in hex) is not "n" -- in 8859-7 that byte falls in the control-code range and is not a letter at all. (0xB0, which is 10110000 in binary, is not "n" either: it is the "DEGREE SIGN" character, a small raised circle.) The Greek "CAPITAL LETTER NU" in 8859-7 is 0xCD and "SMALL LETTER NU" is 0xED (likewise in cp1253). Those single-byte (non-unicode) codepoints represent the same letters as the unicode codepoints U+039D and U+03BD, respectively. You can look those up at www.unicode.org.
The Unicode Standard, in its divine wisdom, provides several ways of storing those 16-bit codepoints -- here are the various byte sequences for those two unicode characters, depending on which encoding you choose:
    UTF-16LE: 9D 03 and BD 03
    UTF-16BE: 03 9D and 03 BD
    utf8:     CE 9D and CE BD
(Note that the binary numbers you gave for "greek-iso 'n' in utf8" were also wrong. You must have been making them up.)
Each of those byte pairs, when interpreted correctly, is linguistically equivalent to the single-byte characters 0xCD and 0xED used in 8859-7 and cp1253 (so obviously unicode uses more bytes per Greek character than the non-unicode encodings). That relationship between different byte values for the "same letter" is what character encoding conversion (the Encode module) is all about.
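If you want to check byte sequences like these yourself rather than take my word for it, something along these lines should do it. This is only a sketch using the core Encode module; the helper name hex_bytes and the labels in %chars are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the bytes of a string as space-separated uppercase hex pairs.
# (hex_bytes is a hypothetical helper, not part of Encode.)
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# The two characters under discussion, given as Unicode codepoints.
my %chars = (
    'CAPITAL NU' => "\x{039D}",
    'SMALL NU'   => "\x{03BD}",
);

# Encode each character in each encoding and show the resulting bytes.
for my $name (sort keys %chars) {
    for my $enc (qw(iso-8859-7 cp1253 UTF-16LE UTF-16BE UTF-8)) {
        printf "%-10s %-10s %s\n",
            $name, $enc, hex_bytes(encode($enc, $chars{$name}));
    }
}
```

Run it in a shell window and compare the hex values line by line; the single-byte Greek encodings give one byte per letter, the unicode encodings give two.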
If you are handling character data that you know is Greek, and it ends up looking like Chinese when you display it, this means that the byte stream is being misinterpreted. As you should know by now, there are lots of ways to interpret the bytes incorrectly, and only one correct interpretation.
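To see how that happens, here is a small sketch (again using only the core Encode module; the sub name codepoints is made up for illustration). It takes one fixed byte string, the utf8 bytes for capital and small nu, and decodes it under three different assumptions. Only the first assumption is right. Read as UTF-16LE, the first byte pair becomes U+9DCE, which sits in the CJK ideograph range, and that is how Greek ends up looking like Chinese.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as they might arrive from a file or a form:
# capital nu followed by small nu, stored as utf8.
my $bytes = "\xCE\x9D\xCE\xBD";

# Decode the octets under a given assumption about their encoding
# and return the resulting codepoints in U+XXXX notation.
# (codepoints is a hypothetical helper, not part of Encode.)
sub codepoints {
    my ($encoding, $octets) = @_;
    my $text = decode($encoding, $octets);
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $text;
}

# Same bytes, three interpretations: one correct, two wrong.
for my $guess (qw(UTF-8 iso-8859-7 UTF-16LE)) {
    printf "%-10s => %s\n", $guess, codepoints($guess, $bytes);
}
```

The bytes never change; only the interpretation does.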
But if the client's browser aint smart enough to send form strings back to the sending script using the same encoding as the sending script does, HOW am i supposed to add a line to my index.pl telling perl to take the unknown encoded string and convert it to 'utf-8'?
It's up to the person using the browser to make sure that the browser is using the correct character encoding in order to display the page you send. You control the character set being used, so the browser has to conform to your usage.
In any case, the form data sent back to your server from the browser is determined by you when you create the form. Assuming the browser user is being cooperative, you will get back the byte sequences that you provided in the form that you sent out. (Of course, non-cooperative users will try to spoof you by sending requests with strings that you never put into your forms; that is what taint checking is all about).
In other words, when you send a form to a browser, and the user clicks things on the form and submits it, the values sent back are exactly the parameter values that you provided in the form -- the browser is not supposed to do anything to change those values (not even anything like changing the character encoding); it just provides a way for the user to make selections, and it sends back the information you requested about those selections.
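So the conversion you are asking about is not a conversion from an "unknown" encoding. You know which charset you served the page in, and that is the one to decode from. A minimal sketch, assuming the page with the form was sent as iso-8859-7 (the variable $page_charset and the sub form_value_to_utf8 are invented for this example, and this is not the code from testmenu.pl):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the page containing the form was served as iso-8859-7,
# so submitted values are expected to come back in that encoding.
my $page_charset = 'iso-8859-7';

# Convert a raw form value (bytes) into utf8 bytes.
# FB_CROAK makes decode() die on bytes that are not valid in the
# page's charset, which is one way to catch spoofed input.
sub form_value_to_utf8 {
    my ($raw) = @_;
    my $text = decode($page_charset, $raw, Encode::FB_CROAK);
    return encode('UTF-8', $text);
}

# Hypothetical submitted value: bytes 0xCD 0xED (capital and small nu).
my $utf8 = form_value_to_utf8("\xCD\xED");
print join(' ', map { sprintf '%02X', ord($_) } split //, $utf8), "\n";
```

If your page is served in some other charset, change $page_charset to match; the point is that the script decides, not the browser.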