in reply to Re^2: somethign wrong with the sumbit
in thread somethign wrong with the sumbit

Ah. Well, it looks like you may need to tell perl explicitly that the value of "param('select')" must be treated as a utf8 string. Either that, or the string comparison in your grep statement needs to be done with explicitly declared byte semantics.

In other words, either do this (flag the param value as utf8):

    use Encode;   # add this if you don't have it already

    if ( param('select') ) {   # If user selected an item from the drop-down menu
        my $selected_file = decode('utf8', param('select'));
        unless ( grep { $_ eq $selected_file } @display_files )
        # If user selection doesn't match one of the passages then it's a fraud!
        {
            ...
(updated to fix spelling of "selected_file")

Or else this (do the grep with byte semantics):

    if ( param('select') ) {
        use bytes;
        unless ( grep { $_ eq param('select') } @display_files )
        # If user selection doesn't match one of the passages then it's a fraud!
        {
            ...
Note that the "use bytes" pragma is lexically scoped: it applies within the block where you put it.
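To illustrate the scoping point, here is a minimal sketch (not the code from the original script). It assumes a hypothetical two-letter Greek name, held once as raw utf8 bytes and once as a decoded character string, and compares the two outside, inside, and after a block that uses the pragma:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical data: the same name as raw utf8 bytes and as decoded characters.
my $bytes = "\xce\xa6\xce\xa5";
my $chars = decode('UTF-8', $bytes);

# Character semantics: a 4-character byte string vs. a 2-character string.
my $outside = ( $chars eq $bytes ) ? 1 : 0;

my $inside;
{
    use bytes;   # byte semantics apply only inside this block
    $inside = ( $chars eq $bytes ) ? 1 : 0;
}

# Back to character semantics once the block has ended.
my $after = ( $chars eq $bytes ) ? 1 : 0;

print "outside=$outside inside=$inside after=$after\n";
```

The comparison should only succeed inside the block, which is why the pragma has to sit in the same block as the grep.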

Hope that helps...

Re^4: somethign wrong with the sumbit
by Nik (Initiate) on Dec 30, 2007 at 17:29 UTC
    I tried both ways, but I got this error in both cases:
    Software error: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 182.
    Also, I believe there is no need to explicitly tell perl to handle param('select') as utf8; it must do this by default, I think.

    In the past this code used to work without any utf8 conversion of param('select'). The only conversion needed was this:

    Encode::from_to($_, 'ISO-8859-7', 'utf8') for @display_files;
      I tried both ways, but I got this error in both cases:
      Software error: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 182.

      You should show, for both cases, the contents of line 182 of your script. As it is, I can only guess that you did not actually follow my suggestions, because only one of the two approaches uses the Encode::decode function, so only that one case would issue this particular error message. If the "decode()" call was being used in both cases, then you didn't understand my second suggestion, and you probably didn't do it right.
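For reference, one situation that is known to make decode() die is handing it a string that has already been decoded into characters. The sketch below is an assumption about what might be going on, not a diagnosis of the actual script. The exact wording of the error depends on the installed Encode version, so the code only reports that the call died:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xce\xa6\xce\xa5";           # raw utf8 bytes, as sent by a browser
my $chars = decode('UTF-8', $bytes);      # first decode: 2 characters

# Decoding the already-decoded string again is expected to die,
# because it now holds characters above 0xFF rather than bytes.
my $ok    = eval { decode('UTF-8', $chars); 1 };
my $error = $ok ? '' : $@;

print $ok ? "second decode succeeded\n" : "second decode died: $error";
```

If something like this is the cause, it would mean the parameter value is already a character string by the time decode() sees it.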

      In any case, at this point, I've lost track of what problem you are actually having. The form returns a utf8 string value for the "select" param, and this value comes from the "@display_files" array, which you use both to create the menu and to test the return value of the "select" param. The array contains file names that are read from your directory as iso-8859-7 strings, and you convert them to utf8 before putting them into the popup menu, and you are confident that the strings being returned by the form are being correctly handled as utf8 strings. And despite all this being true, your testing of the "param('select')" value never succeeds?
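The conversion step described above can be sketched on a single value. This assumes a hypothetical file name made of the letters PHI and UPSILON, which are the bytes d6 d5 in iso-8859-7, and applies the same from_to() call that the script uses on @display_files:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical file name as iso-8859-7 bytes: PHI (0xd6), UPSILON (0xd5).
my $name = "\xd6\xd5";

# from_to() rewrites the bytes in place and returns the new length in bytes.
my $new_length = Encode::from_to($name, 'ISO-8859-7', 'utf8');

my $hex = join ' ', map { sprintf('%02x', ord($_)) } split //, $name;
print "bytes after conversion: $hex (length $new_length)\n";
```

Note that from_to() leaves a string of utf8 bytes, not a decoded character string, which matters when that value is later compared with something else.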

      Try an experiment. Reduce the process to just the bare minimum, where your cgi script dummies up a list of Greek "name" strings, puts up a form with a popup menu, and checks the param value that comes in when the form is submitted. With the process reduced to just this activity, you can focus more carefully on a variety of diagnostics, if/when it fails. If you can't figure out a diagnostic that reveals the problem, the test script should be small enough to post here in its entirety, and it would be "self-contained" (runnable anywhere), so others can try it out and help find the problem.

      If it doesn't fail, then the task is to figure out what the difference is between this simple test script, and the logic you used in the larger application.

      One last thing to check about the encoding issue. Suppose the client browser's form submission includes a value for the "select" parameter that is four bytes long, and those four bytes (expressed in hex) are:

      ce a6 ce a5
      That would be the utf8 byte stream for a two-character string containing the letters "PHI" and "UPSILON". Let's suppose further that there actually was a file with this name in your directory, and @display_files contains this very same 2-character utf8 string. There's a chance that something in the handling of the input parameter string is doing an improper conversion of the original 4-byte sequence into a perl-internal utf8 string. The result of this improper conversion might be an 8-byte string, consisting of:
      c3 8e c2 a6 c3 8e c2 a5
      That's what you get if the original four-byte string is assumed to be non-utf8 (e.g. iso-8859-1) and is then "converted" to utf8 based on that false assumption. You would be able to check this with a suitable test script where the strings for the menu are all the same length. If the string coming back from the form is twice as long, it's a problem with interpreting the form data correctly as utf8 characters.
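The improper conversion described above can be reproduced in a few lines. This is a minimal sketch using only the Encode module. The variable names are made up for illustration, and the mangling is simulated deliberately rather than taken from the real script:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The four raw bytes a browser would send for PHI UPSILON in utf8.
my $raw = "\xce\xa6\xce\xa5";

# Correct handling: decode the bytes as utf8, giving 2 characters.
my $good = decode('UTF-8', $raw);

# Improper handling: treat each byte as an iso-8859-1 character,
# then encode those 4 characters as utf8, giving 8 bytes.
my $bad_bytes = encode('UTF-8', decode('iso-8859-1', $raw));

# Render a byte string as space-separated hex pairs.
sub hex_dump { join ' ', map { sprintf('%02x', ord($_)) } split //, shift }

printf "correct length: %d characters\n", length $good;
printf "mangled length: %d bytes\n",      length $bad_bytes;
printf "mangled bytes:  %s\n",            hex_dump($bad_bytes);
```

Comparing length() of the value coming back from the form against the length of the menu string is the quickest way to spot this doubling.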
        Hello Graff, a few things to make clear about encodings before I try to make the test script.

        The array contains file names that are read from your directory as iso-8859-7 strings,...
        How can we be sure that their encoding is 'iso-8859-7', since we don't know what encoding Windows uses to save filenames? How do we know, for example, that the encoding wasn't 'cp1253' or 'utf8'?

        And also, can something that is natively in one encoding be read as another encoding?

        ...and you convert them to utf8 before putting them into the popup menu,
        I didn't want to, but I had to, because otherwise Firefox wouldn't display the filenames correctly as readable Greek text, and I really don't know why. Is it because the print header was in utf8?
        ...and you are confident that the strings being returned by the form are being correctly handled as utf8 string.
        Well, it seemed the correct thing to believe. Since the items in the popup menu, after the conversion was made, were 'utf8', wasn't it logical to believe that the item the user selected would also be stored in param('select') and handled in a 'utf8' manner? I mean, if it's a utf8 thing, why not be "grabbed" as a utf8 thing and handled as a utf8 thing?
        There's a chance that something in the handling of the input parameter string is doing an improper conversion of the original 4-byte sequence into a perl-internal utf8 string. The result of this improper conversion might be an 8-byte string, consisting of: c3 8e c2 a6 c3 8e c2 a5
        Up to this point I understand how utf8 encoding stores 1 char as 2 bytes and hence 2 chars as 4 bytes, but after that I didn't understand...
        Which is the "input parameter string"? Do you mean param('select')?
        What conversion are you referring to? Why change the 4-byte string to a perl-internal utf8 string?
        That's what you get if the original four-byte string is assumed to be non-utf8 (e.g. iso-8859-1) and is then "converted" to utf8 based on that false assumption.
        You mean the initial filenames, which were 'iso-8859-7', that I re-encoded to 'utf8' in order to display them properly in the browser?

        Why is this wrong? The content is still the same (the name of the file); only the storage size changes. Sorry for so many questions, but this encoding concept is distorted in my head and I have to ask you to help me clear it up, because I believe we are at the heart of this weird problem.