kaloyan_iliev has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

What I am trying to do is verify a password sent from a web form using CHAP.
The password is sent as an md5_hex digest and is built in JavaScript like this:

document.formname.password.value = hexMD5('\027' + document.formname.password.value + '\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163');

When I 'alert' the string above in JavaScript, I see that \340 and the other escapes are actually displayed as UTF characters.

For the example above with the password 'AAA', the result is:
18ed4d4f656255182e771016deb7d23a
But in Perl there is almost no way I can generate the same MD5.

For example:
#!/usr/bin/perl
use strict;
use Digest::MD5;

#Example 1====================
my $chapid = "\027";
my $chapchalange = "\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163";
print "\nExample1=". Digest::MD5::md5_hex($chapid."AAA".$chapchalange);

#Example2=====================
my $chapid2= '\027';
my $chapchalange2= '\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163';
print "\nExample2=".Digest::MD5::md5_hex($chapid2."AAA".$chapchalange2);

The result is:
Example1=18ed4d4f656255182e771016deb7d23a
Example2=3f40ad10610755cefd8223a3a1a566db
As you can see, in Example2 the values are not interpreted as UTF-8 characters, and this messes up the MD5 sum.
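The difference between the two examples comes down to Perl's quoting rules rather than to any encoding. The following is a minimal sketch (the variable and sub names are only illustrative) that compares the lengths of the two strings; the numbers should line up with the CUR values in the Devel::Peek dumps.

```perl
use strict;
use warnings;

# Double quotes: Perl turns each \NNN octal escape into a single character.
my $interpolated = "\027" . "AAA"
    . "\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163";

# Single quotes: the backslash and the three digits stay as four literal characters.
my $literal = '\027' . 'AAA'
    . '\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163';

# Returns (length of the double-quoted string, length of the single-quoted string).
sub quoting_lengths {
    return (length $interpolated, length $literal);
}

printf "double-quoted: %d characters, single-quoted: %d characters\n",
    quoting_lengths();
```

So Example2 hashes 71 characters of escape text, while Example1 hashes the 20 bytes that JavaScript also hashes.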
The other thing I have noticed is that when I dump the string above with Devel::Peek ('use Devel::Peek qw(Dump);'), the result is:
Example1=
------------------------
SV = PV(0x80120d070) at 0x801269f18
  REFCNT = 1
  FLAGS = (PADTMP,POK,pPOK)
  PV = 0x801209f58 "\27AAA\340|\n\314\21488\231\377\5\16Z\270\24\241s"\0
  CUR = 20
  LEN = 24
------------------------------
Example2=
-------------------------------
SV = PV(0x80120d650) at 0x8012c5810
  REFCNT = 1
  FLAGS = (PADTMP,POK,pPOK)
  PV = 0x80124ec58 "\\027AAA\\340\\174\\012\\314\\214\\070\\070\\231\\377\\005\\016\\132\\270\\024\\241\\163"\0
  CUR = 71
  LEN = 72
And I see no utf8 flag for the internal representation?
So, back to my real example: I receive this chapid, chapchalange and password through CGI. This is what I get as CGI params from CGI::Vars():
{
  'username' => 'Ptestuser',
  'password' => '18ed4d4f656255182e771016deb7d23a',
  'chap-id' => '\\027',
  'chap-challenge' => '\\340\\174\\012\\314\\214\\070\\070\\231\\377\\005\\016\\132\\270\\024\\241\\163',
};
I have tried almost everything I could find on the internet about reading UTF-8 CGI params, and nothing helped. I tried:
-------------------
Encode::decode('UTF-8'...);
-------------------
utf8::decode();
-------------------
binmode STDIN, ":encoding(utf8)";
-------------------
use CGI qw( -utf8 );
------------
use Encode qw(decode);
use URI::Escape::XS qw(decodeURIComponent);
$_ = decode('UTF-8', decodeURIComponent($_), Encode::FB_CROAK);
-------------
s{%([a-fA-F0-9]{2})}{ pack ("C", hex ($1)) }eg; # Kept from existing code
s{%u([0-9A-F]{4})}{ pack ('U*', hex ($1)) }eg; # Added
utf8::decode $_;
---------------
and many others

Currently I see no way to verify the password.
I should add that the encoding of the HTML page is windows-1251, and so is the encoding of my Perl environment (CP1251).
Any help will be appreciated, as I have already lost almost 2 days on this piece of code.
Thanks in advance,
Kaloyan Iliev

Replies are listed 'Best First'.
Re: Perl CGI UTF8 AND CHAP PASSWORD VERIFICATION
by ikegami (Patriarch) on Feb 03, 2011 at 17:26 UTC

    You are starting with a variable that contains a JavaScript literal minus the quotes (e.g. the four characters "\", "3", "4", "0"), and you want code to generate the string JavaScript would produce from that literal (the single byte 0xE0).

    This has nothing to do with UTF-8 or Unicode. The byte may be part of the UTF-8 encoding of some string, but md5_hex doesn't care about that.

    my $chapid3= '\027';
    my $chapchalange3= '\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163';
    s/\\(\d{3})/chr(oct($1))/eg for $chapid3, $chapchalange3;
    print "\nExample2=".Digest::MD5::md5_hex($chapid3."AAA".$chapchalange3);
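The same substitution can be wrapped into a check against the digest submitted by the form. This is only a sketch: the sub names unescape_octal and chap_response_ok are hypothetical, the cleartext password is assumed to be available on the server, and the pattern is narrowed to octal digits ([0-7]) as a small variation on the \d used above.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Turn the text form of the escapes (backslash + three octal digits) into raw bytes.
sub unescape_octal {
    my ($text) = @_;
    $text =~ s/\\([0-7]{3})/chr(oct($1))/eg;
    return $text;
}

# Hypothetical helper: recompute the CHAP digest and compare it with the submitted one.
sub chap_response_ok {
    my ($chap_id, $challenge, $cleartext, $submitted_hex) = @_;
    my $expected = md5_hex(
        unescape_octal($chap_id) . $cleartext . unescape_octal($challenge)
    );
    return lc($submitted_hex) eq $expected;
}

# Values as they appear in the CGI parameters of the question.
my $ok = chap_response_ok(
    '\027',
    '\340\174\012\314\214\070\070\231\377\005\016\132\270\024\241\163',
    'AAA',
    '18ed4d4f656255182e771016deb7d23a',
);
print $ok ? "password verified\n" : "password mismatch\n";
```

With the values from the question this should report a match, assuming the digest quoted there is accurate.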

    And I see no utf8 flag for the internal representation?

    That flag should never matter. It only does in the presence of bugs. md5_hex is not buggy.

    perl -MTest::More=tests,1 -MDigest::MD5=md5_hex -e'
       $_ = chr(0xE9);
       utf8::downgrade( $dn = $_ );
       utf8::upgrade( $up = $_ );
       is(md5_hex($up), md5_hex($dn));
    '
    1..1
    ok 1

    If anything, I would suspect a problem if the flag was on.
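One reason a set flag can hint at trouble is that a decoded string may hold characters above 255, which Digest::MD5 refuses to hash. The sketch below illustrates that; the helper try_md5 is hypothetical, and encoding to cp1251 is just an assumption based on the page encoding mentioned in the question.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode);

# Returns the digest, or undef if md5_hex refused the input.
sub try_md5 {
    my ($string) = @_;
    my $digest = eval { md5_hex($string) };
    return $digest;
}

my $wide = "\x{0416}";   # CYRILLIC CAPITAL LETTER ZHE, a character above 255

# MD5 is defined on bytes only, so a wide character makes md5_hex croak.
print defined try_md5($wide) ? "hashed\n" : "md5_hex croaked: $@";

# Encoding the text to bytes first gives md5_hex something it can hash.
print "cp1251 bytes: ", try_md5(encode('cp1251', $wide)), "\n";
```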

      Thank you,
      You really saved me a lot of time.
      Your solution works perfectly.

      Thank you again,
      Best regards,
      Kaloyan Iliev
Re: Perl CGI UTF8 AND CHAP PASSWORD VERIFICATION
by Corion (Patriarch) on Feb 03, 2011 at 17:07 UTC

    Browsers will usually encode the data they send back in the same encoding as the original page was in. So I would check whether you get your parameters in Windows-1251. If so, just decode from there to utf-8.

      Hi, as I said, I tried decoding the CGI params:
      use Encode;
      my $chapid = CGI::param('chap-id');
      my $chapchalange = CGI::param('chap-challenge');
      $chapid = Encode::decode('CP1251', $chapid, Encode::FB_CROAK);
      $chapchalange = Encode::decode('CP1251', $chapchalange, Encode::FB_CROAK);
      The only thing that changes is this:
      ---------print chapid to STDERR-----------------
      \027
      ---------Dump chapid before decode----------
      SV = PVMG(0x80d0988e8) at 0x80d3234e0
        REFCNT = 1
        FLAGS = (PADMY,POK,pPOK)
        IV = 0
        NV = 0
        PV = 0x80d35f118 "\\027"\0
        CUR = 4
        LEN = 8
      ---------Dump chapid after decode----------
      SV = PVMG(0x80d0988e8) at 0x80d3234e0
        REFCNT = 1
        FLAGS = (PADMY,POK,pPOK,UTF8)
        IV = 0
        NV = 0
        PV = 0x80d35f5d8 "\\027"\0 [UTF8 "\\027"]
        CUR = 4
        LEN = 8
      If I change the encoding in 'decode' from 'CP1251' to 'UTF-8' nothing really happens.
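That decoding changes only the flag is what one would expect here, because the parameter consists purely of ASCII characters (a backslash and digits), and those map to the same characters in both encodings. A minimal sketch, with the hypothetical helper decode_is_noop:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The parameter as received: a backslash followed by three ASCII digits.
my $raw = '\027';

# Returns true if decoding with the given encoding leaves the characters unchanged.
sub decode_is_noop {
    my ($encoding, $bytes) = @_;
    my $copy = $bytes;   # work on a copy; decode may modify its input when CHECK is set
    return decode($encoding, $copy) eq $bytes;
}

for my $encoding ('cp1251', 'UTF-8') {
    printf "%-6s decode changes nothing: %s\n",
        $encoding, decode_is_noop($encoding, $raw) ? 'yes' : 'no';
}
```

The escape text itself still has to be converted to bytes separately, as in ikegami's substitution.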
      Best regards,
      Kaloyan Iliev