Re^2: somethign wrong with the sumbit

Re^3: somethign wrong with the sumbit by graff (Chancellor) on Dec 30, 2007 at 00:00 UTC
Ah. Well, it looks like maybe you need to tell perl explicitly that the value of "param('select')" must be treated as a utf8 string -- either that or else the string comparison in your grep statement needs to be done with explicitly declared byte semantics.

In other words, either do this (flag the param value as utf8):

`use Encode;  # add this if you don't have it already
if ( param('select') ) {   # If user selected an item from the drop-down menu
    my $selected_file = decode('utf8', param('select'));
    unless ( grep { $_ eq $selected_file } @display_files )
    # If user selection doesn't match one of the passages then it's a fraud!
    {
    ...`

(updated to fix spelling of "selected_file")

Or else this (do the grep with byte semantics):

`if ( param('select') ) {
    use bytes;
    unless ( grep { $_ eq param('select') } @display_files )
    # If user selection doesn't match one of the passages then it's a fraud!
    {
    ...`

Note that the "use bytes" pragma is lexically scoped: it applies within the block where you put it. Hope that helps...
Re^4: somethign wrong with the sumbit by Nik (Initiate) on Dec 30, 2007 at 17:29 UTC
I tried both ways, but I got this error in both cases: `Software error: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 182.`

Also, I believe there is no need to explicitly tell perl to handle param('select') as utf8; it must do this by default, I think. In the past this code used to work without any utf8 conversion of param('select'). The only conversion needed was this: `Encode::from_to($_, 'ISO-8859-7', 'utf8') for @display_files;`
Re^5: somethign wrong with the sumbit by graff (Chancellor) on Dec 30, 2007 at 20:03 UTC
"I tried both ways but i got this error in both cases as well: `Software error: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 182.`"

You should show, for both cases, the contents of line 182 of your script. As it is, I can only guess that you did not actually follow my suggestions, because only one of the two approaches uses the Encode::decode function, so only that one case would issue this particular error message. If the "decode()" call was being used in both cases, then you didn't understand my second suggestion, and you probably didn't do it right.

In any case, at this point, I've lost track of what problem you are actually having. The form returns a utf8 string value for the "select" param, and this value comes from the "@display_files" array, which you use both to create the menu and to test the return value of the "select" param. The array contains file names that are read from your directory as iso-8859-7 strings, and you convert them to utf8 before putting them into the popup menu, and you are confident that the strings being returned by the form are being correctly handled as utf8 strings. And despite all this being true, your testing of the "param('select')" value never succeeds?

Try an experiment. Reduce the process to just the bare minimum, where your cgi script dummies up a list of Greek "name" strings, puts up a form with a popup menu, and checks the param value that comes in when the form is submitted. With the process reduced to just this activity, you can focus more carefully on a variety of diagnostics, if/when it fails. If you can't figure out a diagnostic that reveals the problem, the test script should be small enough to post here in its entirety, and it would be "self-contained" (runnable anywhere), so others can try it out and help find the problem. If it doesn't fail, then the task is to figure out what the difference is between this simple test script and the logic you used in the larger application.
One last thing to check about the encoding issue. Suppose the client browser's form submission includes a value for the "select" parameter that is four bytes long, and those four bytes (expressed in hex) are: `ce a6 ce a5`

That would be the utf8 byte stream for a two-character string containing the letters "PHI" and "UPSILON". Let's suppose further that there actually was a file with this name in your directory, and @display_files contains this very same 2-character utf8 string. There's a chance that something in the handling of the input parameter string is doing an improper conversion of the original 4-byte sequence into a perl-internal utf8 string. The result of this improper conversion might be an 8-byte string, consisting of: `c3 8e c2 a6 c3 8e c2 a5`

That's what you get if the original four-byte string is assumed to be non-utf8 (e.g. iso-8859-1) and is then "converted" to utf8 based on that false assumption. You would be able to check this with a suitable test script where the strings for the menu are all the same length. If the string coming back from the form is twice as long, it's a problem with interpreting the form data correctly as utf8 characters.
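The byte arithmetic in that scenario can be reproduced in a few lines. This is only a sketch: `hex_bytes` is a hypothetical helper introduced here, and the ISO-8859-1 misreading is simulated on purpose rather than taken from the script under discussion.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The four raw bytes a browser would send for PHI + UPSILON in UTF-8.
my $raw = "\xce\xa6\xce\xa5";

# Correct handling: decode the bytes as UTF-8, giving two characters.
my $chars = decode('UTF-8', $raw);

# Faulty handling (simulated): read the same bytes as ISO-8859-1,
# then encode the result to UTF-8.
my $double = encode('UTF-8', decode('ISO-8859-1', $raw));

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02x', ord($_) } split //, $_[0] }

printf "raw bytes:      %s (%d bytes)\n", hex_bytes($raw), length $raw;
printf "decoded length: %d characters\n", length $chars;
printf "double-encoded: %s (%d bytes)\n", hex_bytes($double), length $double;
# The last line reports: c3 8e c2 a6 c3 8e c2 a5 (8 bytes)
```

If the value coming back from a form shows the doubled pattern, the input was treated as single-byte text somewhere before it was compared.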
Re^6: somethign wrong with the sumbit by Nik (Initiate) on Dec 30, 2007 at 21:31 UTC
Re^7: somethign wrong with the sumbit by graff (Chancellor) on Dec 30, 2007 at 23:32 UTC
Re^3: somethign wrong with the sumbit by graff (Chancellor) on Jan 04, 2008 at 06:10 UTC
I'll try one last time (update: because I missed a relevant clue in one of your replies, and this might make a difference for you -- see the "last update" at the bottom of this post). I will repeat the advice I have given 3 times already in this thread. I do so with some reluctance, because I suspect that once you do try this, you'll discover (or create) some other bone-headed mistake in your code, and will start another lengthy sub-dialog... oh well, here it goes anyway.

Let's go back to the code at the very beginning of this train-wreck -- I'll add some commentary, and make the changes that should get you over the particular hump that started it all:

`my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @display_files = map m{([^/]+)\.txt}, @files;
Encode::from_to($_, 'ISO-8859-7', 'utf8') for @display_files;

# SO FAR, SO GOOD. The same @display_files array is used later to create
# the popup menu, which shows up correctly/as intended in the browser,
# so this use of Encode::from_to() is correct and necessary.

if ( param('select') ) {   # If User selected an item from the drop down menu
    my $selected_file = decode('utf8', param('select'));  ## ADD THIS LINE
    ### UPDATED 3 days after initial post: it was originally
    ### "encode" which, as Nik points out below, was wrong
    unless ( grep /^\Q$selected_file\E$/, @display_files )
    # If User Selection doesn't match one of the passages then its a Fraud!
    {
        ## REPORT AN INVALID SUBMISSION (you don't need to worry about saying
        ## what properties it has that make it invalid -- it doesn't match any
        ## known file name, and that is all that matters.
        ## ... but before exiting, send some kind of error page back to the browser
        exit;
    }
    ## IF YOU GET HERE, YOU HAVE A VALID MATCH
    ## so you can see where your next coding mistake is...`
The first time I suggested using the "decode()" function on the "select" param value, you said:

"i beleive there is no need to explicitly tell perl to handle param('select') as utf8 it must do this by default i think."

I was able to prove (to my own satisfaction, at least) that your belief here was wrong. So try the suggestion and see what happens.

I gather that you don't read documentation much at all, but if you could do that, and spend some time looking at the man page for Encode, you might be able to learn this important concept:

There is a difference between a "perl-internal utf8 string" and a "raw string containing utf8". The first thing is a byte sequence that stores valid utf8 characters and is flagged in perl's internal storage as being a utf8 string; in contrast, the latter thing is a byte sequence that happens to come from some external source of utf8 data, but has not been flagged as a perl-internal utf8 string.

As explained in the Encode man page, a "perl-internal utf8 string" and a "raw string" will never match, even if the actual byte sequences in the two strings are identical. The "utf8 flag" being different (set vs. not set) makes the strings different, regardless of anything else.

That is why the "decode('utf8', ...)" function is used on the parameter value -- if it really came from your cgi web form, then it really is a byte sequence for a valid utf8 string, but perl won't consider it to be the same as a "perl-internal utf8 string", even when the actual sequence of bytes is identical. The utf8 flag must be set on both strings, or not set on both strings (in addition to the bytes being the same), for a match to succeed, and setting the utf8 flag is one of the things that the "decode()" function does.

(Nit-picky details:) In complementary fashion, doing "encode('utf8', ...)" on a perl-internal utf8 string will produce a "raw" string (the utf8 flag is turned off).
But the "difference" between "perl-internal utf8" and "raw" only applies when "wide" characters are involved -- i.e. those that lie outside the 7-bit ascii range -- note the following command lines using different versions of a one-line script:

`perl -MEncode -le '$a="\x{0341}"; $b=encode("utf8",$a); print "a:b ", (($a eq $b) ? "same":"different")'
# prints "different" -- $a is perl-internal utf8, $b is raw
perl -MEncode -le '$a="foo"; $b=encode("utf8",$a); print "a:b ", (($a eq $b) ? "same":"different")'
# prints "same"
perl -MEncode -le '$a="foo"; $b=decode("utf8",$a); print "a:b ", (($a eq $b) ? "same":"different")'
# also prints "same"
perl -MEncode -le '$a=decode("utf8","foo"); $b=encode("utf8","foo"); print "a:b ", (($a eq $b) ? "same":"different")'
# still prints "same"`

Regarding the last example: note that running "decode('utf8',$a)" would be an error if $a were already flagged as a perl-internal utf8 value and contained wide characters.

If all this confuses you, get over it. That's the reality.

(updated to fix a typo and add clarification in the last paragraph)

LAST UPDATE: Okay, I know that you have tried adding the "decode()" line before, and you reported the error message you got as a result, which was "Cannot decode string with wide characters at ... line 182". I didn't make the connection until after I updated that last paragraph above. The point is, at line 182 (wherever that was in your script -- you didn't make that clear) you are running "decode()" on a string that already has the utf8 flag set, and contains a wide character.

If line 182 is the decode line that I told you to add, then I'm really puzzled, because it would mean that this cgi parameter string is already flagged as utf8 (though I can't imagine how), and if that's true, and the string came from the popup menu, then it should match.
(In this case, try opening a separate text file for output -- make sure to set the mode to ">:utf8" -- and print the parameter and @display_files strings to that file, so you can inspect them manually, with a hex-dump tool if need be.)

But if line 182 is somewhere else, it's probably just the next bone-headed programming error in your script, and you had not seen it before because the script had never gotten that far before.

It's really frustrating when you leave out relevant details like this. Even after I told you days ago that you should have shown us that line, you didn't do it. It's tiresome. Think harder before you post again -- read what you write before you hit the "create" button, and try to imagine that you are someone else, and think what questions this other person would ask about the information in the post. Then add the answers to those questions. Better yet, try to imagine what advice this other person would give you, and try it out before posting. Take your time, don't rush it. Only create the node when you have included a clear description of what you have tried (code, inputs and outputs).
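For the manual inspection suggested above, a small dump routine can stand in for a hex-dump tool. The sketch below is illustrative only: `describe` is a hypothetical helper, and the two sample strings are stand-ins for the form value and a menu entry, not data from the real script. It reports the flag state alongside the code point of every character, since `eq` compares strings character by character.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: summarise a string by its UTF8 flag, its length,
# and the code point of each character.
sub describe {
    my ($s) = @_;
    return sprintf '%s len=%d [%s]',
        (utf8::is_utf8($s) ? 'flagged' : 'unflagged'),
        length($s),
        join(' ', map { sprintf 'U+%04X', ord($_) } split //, $s);
}

# Assumed inputs: raw octets as a CGI parameter might arrive,
# and a decoded character string as a menu entry might be stored.
my $from_form = "\xce\xa6\xce\xa5";
my $from_list = decode('UTF-8', "\xce\xa6\xce\xa5");

print 'form value: ', describe($from_form), "\n";
print 'list value: ', describe($from_list), "\n";
print 'eq as-is:   ', ($from_form eq $from_list ? 'match' : 'no match'), "\n";
print 'eq decoded: ', (decode('UTF-8', $from_form) eq $from_list ? 'match' : 'no match'), "\n";
```

With these stand-in values the raw string shows four characters (U+00CE U+00A6 U+00CE U+00A5) and the decoded one shows two (U+03A6 U+03A5), so the first comparison fails and the second succeeds. Running the same dump on the real parameter and on each element of @display_files would show which representation each side actually holds.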
Re^4: somethign wrong with the sumbit by Nik (Initiate) on Jan 06, 2008 at 21:29 UTC
First of all, I am VERY GLAD to have solved the problem myself, 2 days ago, before I read your last post. It looks like it wasn't an encoding problem at all. I'll tell you later on what I did.

Your explanation today was very insightful and helped me understand even more about encodings. I have also managed to run your test cgi script and saw that the value before submission and the value returned were the same, so indeed the browser returned the value the user selected intact, exactly the same as the original.

What I tried 2 days ago was this:

`print header( -charset=>'utf8' );
my $article = param('select') || "Αρχική Σελίδα!";
my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @menu_files = map m{([^/]+)\.txt}, @files;
Encode::from_to($_, 'ISO-8859-7', 'utf8') for @menu_files;

if ( param('select') ) {   # If user selected an item from the drop down menu
    $article = decode( 'utf8', $article );
    unless ( grep /^\Q$_\E$/, @menu_files )
    # Unless user selection doesn't match one of the valid filenames within @display_files
    {
    ......`

But as a result I got this: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 182.

Line 182 is completely irrelevant to "decode()" and I have no idea why Perl refers to it. It's obvious the problem was on line 35, which is this one: `$article = decode( 'utf8', $article );`

At the time I had no clue what that error meant, but after your reply today I now know that I was running "decode()" on a string that already had the utf8 flag set and contained a wide character, and as you said, Perl returns an error for that.

But what does that error tell us now? If my thinking is correct, that error tells us that the parameter the script (index.pl) got back from the browser was utf8 flagged already!! Why, you ask?!
Because this line of code, `Encode::from_to($_, 'ISO-8859-7', 'utf8') for @menu_files;`, has created an array full of well-defined "utf8 flagged" items, since the Perl script itself created this array. So when the user selects one of them and submits it, the browser takes this "utf8 flagged" item and sends it back to the script UNTOUCHED, as it is supposed to do. The error we got above proves this, because otherwise we wouldn't have gotten that error. It also proves you were correct in a previous post in this thread, where you said that a browser should not alter a string in any way (not even its encoding).

So now we DO know for sure that the browser isn't sending the string back malformed in any way. If it were, then this line of code, `$article = decode( 'utf8', $article );`, would have run without a problem, perhaps because the browser would have removed the "internal utf8 flag" Perl uses to mark "utf8" data.

Do you agree with this logic, or have I misunderstood? If the above is TRUE (the original and returned strings are identical), then no conversion has to be made, neither by encoding nor by decoding.
My script works now as intended with no alteration of encodings. Here is the code:

`print header( -charset=>'utf8' );
my $article = param('select') || "Αρχική Σελίδα!";
my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @menu_files = map m{([^/]+)\.txt}, @files;
Encode::from_to($_, 'ISO-8859-7', 'utf8') for @menu_files;

if ( param('select') ) {   # If user selected an item from the drop down menu
    # No alteration to utf8 encoding or decoding is needed here....the returned
    # value has the utf8 flag and contains wide characters, as the original
    unless ( grep /^\Q$_\E$/, @menu_files )
    # Unless user selection doesn't match one of the valid filenames within @display_files
    {
        if( param('select') =~ /\0/ ) {
            $article = "Null Byte Injection attempted & logged!";
            print br() x 2, h1( {class=>'big'}, $article );
        }
        if( param('select') =~ /\/\.\./ ) {
            $article = "Backwards Directory Traversal attempted & logged!";
            print br() x 2, h1( {class=>'big'}, $article );
        }
        $select = $db->prepare( "UPDATE guestlog SET article=?, date=?, counter=counter+1 WHERE host=?" );
        $select->execute( $article, $date, $host );
        exit 0;
    }
    Encode::from_to($article, 'utf8', 'ISO-8859-7');   # Convert user selected filename to greek-iso so it can be opened
    open FILE, "<$ENV{'DOCUMENT_ROOT'}/data/text/$article.txt" or die $!;
    local $/;
    $data = <FILE>;
    close FILE;
    Encode::from_to($article, 'ISO-8859-7', 'utf8');   # Convert user selected filename back to utf8 before inserting into db
    $select = $db->prepare( "UPDATE guestlog SET article=?, date=?, counter=counter+1 WHERE host=?" );
    $select->execute( $article, $date, $host );
}
else {`

The only thing I corrected was the $data variable before sending the contents of the file to the javascript:

`for ($data) {   # Replace special chars like single & double quotes with their literal values
    s/\n/\\n/g;
    s/'/\\'/g;
    s/"/\\"/g;   # the replacement needs a doubled backslash; \" alone leaves the quote unescaped
    tr/\cM//d;
}`

because single and double quotes were incorrectly interpolated as special chars.
If you visit my page now, http://nikos.no-ip.org, and test it by selecting something, you'll notice it works normally.

Also, your last suggestion still doesn't work:

`print header( -charset=>'utf8' );
my $article = param('select') || "Αρχική Σελίδα!";
my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @menu_files = map m{([^/]+)\.txt}, @files;
Encode::from_to($_, 'ISO-8859-7', 'utf8') for @menu_files;

if ( param('select') ) {   # If user selected an item from the drop down menu
    $article = encode( 'utf8', $article );
    unless ( grep /^\Q$_\E$/, @menu_files )
    # Unless user selection doesn't match one of the valid filenames within @display_files
    {
    ......`

I get this error: Invalid argument at D:\www\cgi-bin\index.pl line 57.

Line 57 is the correct line this time, the one trying to `open FILE, "<$ENV{'DOCUMENT_ROOT'}/data/text/$article.txt" or die $!;`. The encoding must have messed the variable up somehow....

ps1: Your test cgi script required me to turn taint mode (-T) off in order to run.

ps2: I don't yet understand what the difference is between `$article = encode( 'utf8', $article );` and `$article = decode( 'utf8', $article );`

ps3: I can't run the one-liners: I get Can't find string terminator "'" anywhere before EOF at -e line 1. I tried to switch single with double quotes but I am still getting errors.
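One thing worth checking in the code above is the test `grep /^\Q$_\E$/, @menu_files`. Inside grep, `$_` is aliased to each list element in turn, so that pattern compares every element with itself and never looks at the submitted value. The sketch below uses made-up stand-in data (the two menu names and the hostile string are assumptions) to contrast it with a direct membership test.

```perl
use strict;
use warnings;

my @menu_files = ('alpha', 'beta');     # hypothetical stand-ins for the menu entries
my $article    = '../../etc/passwd';    # hypothetical hostile submission

# $_ is each list element here, so every element matches itself
# and $article is never consulted.
my $self_matches = grep { /^\Q$_\E$/ } @menu_files;

# Comparing each element with the submitted value is a real membership test.
my $real_matches = grep { $_ eq $article } @menu_files;

print "self-match count: $self_matches\n";   # 2
print "real match count: $real_matches\n";   # 0
```

If the self-matching form is what the live script runs, the `unless` branch is skipped for any non-empty menu, which could explain why the script appears to work regardless of how the parameter is encoded.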
Re^5: somethign wrong with the sumbit by graff (Chancellor) on Jan 07, 2008 at 06:32 UTC
"Also you last suggestion still doesn't work:

`...
if ( param('select') ) {   # If user selected an item from the drop down menu
    $article = encode( 'utf8', $article );
...`"

The thing that I find astonishing here is that this snippet is NOT what I was suggesting. Look again at my previous reply and focus carefully on the line that has the comment "## ADD THIS LINE". Can you see the difference between the code I suggested and your failed attempt that I quoted just now? It's an important difference.

My suggestion was to create a new utf8 string by decoding the value returned by "param('select')", so that you could compare this utf8 version of the parameter to the contents of the utf8 filename array. What you did instead was something else entirely, and quite brainless.

The evidence is pointing more heavily to the conclusion that you are a troll, trying some novel techniques to waste everyone else's time and get people angry. Why else would you make up something stupid that obviously won't work, and assert that this is what I suggested you should do?

If you are not a troll, then you are simply incompetent beyond belief. Either way, if this is what you do when people try to help you, people will stop trying, and simply won't take you seriously anymore. Personally, I'm already laughing out loud at everything you post.

(update: In fact this last reply of yours is really hilarious. It's like you are the Three Stooges, all by yourself! But then, why do I keep replying? Good question... I guess sometimes it's good for a laugh, and sometimes your approach to trolling falls flat, and actually sparks some useful explanations that might be helpful to others, even though it does no good at all to give the information to you.)
"ps1: Your test cgi script required me to turn taint mode(-T) off in order to run"

So your conclusion is that your particular perl/web-server installation is unable to run anything with taint-checks turned on, and in order to make things work, you make sure taint-checks are turned off... Thanks for letting us know (and thanks also for the link to your web site) -- that's very helpful information for everyone who reads PerlMonks.

(update: I just noticed... that site doesn't seem to be working at the moment. Maybe Nik pulled the plug on it? Or tripped over the power cord, or when the cream pie missed his face it hit the motherboard. I don't suppose someone would have hacked it already..)

"ps2: I don't yet understand whats the difference of `$article = encode( 'utf8', $article );` opposed to `$article = decode( 'utf8', $article );`"

I always have to think twice about the names myself -- here's how I keep them straight: think of "perl internal utf8" as "normal" and everything else as "coded" (like encrypted to keep it mysterious and secret and obscure); in order to turn a perl-internal utf8 string into one of these mysterious external strings, you have to encode (like encryption), and to make one of those mysterious external strings readable as perl-internal utf8, you have to decode (like decryption) -- and the Encode module is your "secret encoder/decoder" tool, your "Enigma machine".

Just remember: "encode" returns something that is external (not flagged as perl-internal utf8); "decode" returns perl-internal utf8 (except when you pass it something that is already perl-internal utf8, which causes it to throw an error).

"ps3. I cant run the one-linears: i get Can't find string terminator "'" anywhere before EOF at -e line 1. Tried to switch single with double quotes but iam still getting errors."

That's because you are using the "standard" MS-DOS Prompt shell (command.com or cmd.exe). Try using a unix-style shell instead (bash.exe).
It's available for windows from numerous sources (cygwin is probably the most popular), and it fully supports unix-style quotes and escapes for command lines. No doubt, this advice will open up whole new worlds of potential errors you can make -- have fun with that, but don't post those problems here, because they wouldn't be perl questions.
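The encode/decode mnemonic from the ps2 answer above can be tried out directly. This is a minimal sketch under stated assumptions: `to_chars` is a hypothetical helper, and checking the flag with `utf8::is_utf8` is used here only as a diagnostic shortcut to avoid decoding a string a second time; it is not a substitute for knowing where each string came from.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: return a character string whether the input is
# raw UTF-8 octets or has already been decoded.
sub to_chars {
    my ($s) = @_;
    return utf8::is_utf8($s) ? $s : decode('UTF-8', $s);
}

my $octets = "\xce\xa6\xce\xa5";        # external ("coded") form: 4 bytes
my $chars  = to_chars($octets);         # decode: internal form, 2 characters
my $again  = to_chars($chars);          # already decoded, returned unchanged
my $back   = encode('UTF-8', $chars);   # encode: external form again

printf "octets=%d chars=%d again=%d back=%d\n",
    length $octets, length $chars, length $again, length $back;
# prints: octets=4 chars=2 again=2 back=4
print "round trip ", ($back eq $octets ? "preserved" : "changed"), " the bytes\n";
```

Decoding at the point where data enters the program and encoding at the point where it leaves keeps every comparison in between on character strings.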