in reply to Re^8: somethign wrong with the sumbit
in thread somethign wrong with the sumbit
Programming based on assumptions is just like programming based on beliefs -- not good enough. When you lack reliable or coherent documentation, you need to do tests. They can be simple tests, like "do the file names display correctly on the web page when I use encode() this way?" So do the tests, and spend less time asking us to do them for you.
I quoted many other things you said in my previous post and asked about them so I could understand what's going on, but you didn't answer those, so I still don't have a clear picture of what went wrong and why the values returned from the form aren't matching one of the values of array @display_files.
Did you try to run the little test script that I posted? Save that script as a file in your cgi-bin directory (call it "testmenu.pl" or something like that), make it executable, and point your browser at it. When you see that it works, try to set your own application so it handles the menu strings and parameter values the same way. If it does not work for you, try to be as explicit and clear as possible when you report what actually happens (error messages, web page content); if you made changes in the code before running it (though this should be unnecessary), show the code that you actually ran.
What do you mean by "can be read as if it were anything at all"?
Let's see if I can explain it better. Here's a sequence of four bytes, expressed as hex numbers:
ce a6 ce a5
If you treat those bytes as an ISO-8859-1 string, it's four characters, where the first and third are "capital-I-with-circumflex", the second is "broken-bar", and the fourth is "yen-sign". If you treat it as ISO-8859-7 (Greek), the "a6" byte is still the "broken-bar" char, but "a5" is the "drachma-sign" and "ce" is "Greek-capital-letter-Xi". If interpreted as utf8, it's just two Unicode Greek letters (capital-Phi / U+03A6, capital-Upsilon / U+03A5). If treated as UTF-16BE (big-endian), it's two other Unicode characters: U+CEA6 and U+CEA5 (Hangul syllables); treated as UTF-16LE, it's U+A6CE and U+A5CE, which were unassigned when this was written (later Unicode versions put Bamum and Vai characters at those code points). Many other non-Unicode character sets could be used to get even more interpretations of the string as two or four characters.
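If you want to see this for yourself, here is a minimal sketch that decodes those same four bytes under each of the encodings mentioned and prints the resulting code points. It assumes the core Encode module; the helper name codepoints_for is made up for this example. What you get for the "a5" byte under ISO-8859-7 depends on which revision of that table your Encode ships with, so check the output rather than trusting my description.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The four bytes from the example: ce a6 ce a5
my $bytes = "\xce\xa6\xce\xa5";

# Hypothetical helper: decode $octets as $encoding and return the
# resulting characters as a space-separated list of U+XXXX code points.
sub codepoints_for {
    my ($encoding, $octets) = @_;
    my $chars = decode($encoding, $octets);
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $chars;
}

# Same bytes, five different readings.
for my $enc (qw(ISO-8859-1 ISO-8859-7 UTF-8 UTF-16BE UTF-16LE)) {
    printf "%-10s => %s\n", $enc, codepoints_for($enc, $bytes);
}
```

The output contains only ASCII, so it displays the same way no matter how your terminal or browser is set up.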
Those same four bytes could even be interpreted as a four-byte integer or as a pair of two-byte integers (signed or unsigned, big- or little-endian) -- that is, you could use Perl's "unpack" function to get a variety of numeric values from that same byte sequence.
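A rough sketch of the numeric readings, using nothing but the built-in unpack. The helper name unpack_as and the list of templates are just my choices for illustration, and the signed "n!" template needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# The same four bytes again: ce a6 ce a5
my $bytes = "\xce\xa6\xce\xa5";

# Each entry pairs an unpack template with a description of the reading.
my @templates = (
    [ 'N'   => 'one unsigned 32-bit integer, big-endian'     ],
    [ 'V'   => 'one unsigned 32-bit integer, little-endian'  ],
    [ 'n2'  => 'two unsigned 16-bit integers, big-endian'    ],
    [ 'v2'  => 'two unsigned 16-bit integers, little-endian' ],
    [ 'n!2' => 'two signed 16-bit integers, big-endian'      ],  # Perl 5.10+
);

# Hypothetical helper: return the list of numbers that $template
# extracts from $octets, as an array reference.
sub unpack_as {
    my ($template, $octets) = @_;
    return [ unpack $template, $octets ];
}

for my $t (@templates) {
    my ($template, $label) = @$t;
    printf "%-4s %-45s %s\n", $template, $label,
        join ', ', @{ unpack_as($template, $bytes) };
}
```

For instance, the "n2" template reads the bytes as 52902 and 52901, which are just 0xCEA6 and 0xCEA5 written in decimal.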
The point is this: there is nothing intrinsic to the byte stream that says "this is utf8 text" or anything else. You have to know what it's supposed to be, treat it accordingly, and handle the cases when there are problems with the data source that cause the data to be something different from what it's supposed to be.
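One way to "treat it accordingly" is to decode strictly and deal with failure yourself, instead of letting bad bytes slip through. This is only a sketch, assuming the input is supposed to be utf8: it uses Encode's decode with the FB_CROAK check value, and the helper name decode_utf8_or_undef is invented for this example, not something from your code.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: return the decoded character string if $octets
# is valid UTF-8, or undef if it is not.
sub decode_utf8_or_undef {
    my ($octets) = @_;
    my $copy = $octets;    # decode() with a CHECK value may modify its input
    # FB_CROAK makes decode() die on malformed data; eval traps that
    # and yields undef instead.
    return eval { decode('UTF-8', $copy, FB_CROAK) };
}

my $good = decode_utf8_or_undef("\xce\xa6\xce\xa5");   # well-formed utf8
my $bad  = decode_utf8_or_undef("\xa6\xce\xa5");       # starts mid-character

print defined $good ? "good: decoded " . length($good) . " characters\n"
                    : "good: not valid utf8\n";
print defined $bad  ? "bad: decoded " . length($bad) . " characters\n"
                    : "bad: not valid utf8\n";
```

What to do when the helper returns undef (reject the request, log it, fall back to another encoding) depends on the application; the point is that the decision gets made on purpose.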