in reply to Decode umlauts on CGI-parameters

CGI.pm gives you binary (undecoded) values unless you tell it to return decoded values. See https://metacpan.org/pod/CGI#utf8
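
A minimal sketch of what that looks like, assuming the script can load CGI.pm with the `-utf8` pragma. The hash reference passed to `new` is only a stand-in for a real request, and `$q` and `$action` are illustrative names, not taken from the original script:

```perl
use strict;
use warnings;
use CGI qw(-utf8);   # ask CGI.pm to decode parameter values as UTF-8

# Stand-in for a real request: the value holds the raw UTF-8 bytes
# of "Benutzer löschen" as a browser would send them.
my $q = CGI->new( { action => "Benutzer l\xC3\xB6schen" } );

# param() now returns a character string, so no extra decode() call is needed.
my $action = $q->param('action');

# length() counts characters here, not bytes.
print length($action), "\n";
```

With the pragma in place, calling decode() on the value a second time would be a mistake. The CGI.pm documentation also cautions about combining `-utf8` with binary file uploads, so check that before enabling it globally.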

Re^2: Decode umlauts on CGI-parameters
by Yaerox (Scribe) on Jul 16, 2015 at 09:05 UTC
    I made four attempts:

    my $p_sAction = $oCGI->param( "action" );
    $p_sAction = decode( 'UTF-8', $p_sAction );
    if ( !defined( $p_sAction ) ){
        print $oCGI->redirect( "access-denied.pl?reason=110" );
        exit;
    }
    print "Content-type: text/html\n\n";
    print "<html><head><meta charset=\"UTF-8\"></head>";
    print "p_sAction: #" . $p_sAction . "#<br>";
    print "1510 --- #$p_sAction# eq #$aText{'1510'}#<br><br>";

    Output:
    p_sAction: #Benutzer l&#65533;schen#
    1510 --- #Benutzer l&#65533;schen# eq #Benutzer löschen#

    my $p_sAction = $oCGI->param( "action" );
    $p_sAction = decode( 'UTF-8', $p_sAction );
    if ( !defined( $p_sAction ) ){
        print $oCGI->redirect( "access-denied.pl?reason=110" );
        exit;
    }
    print "Content-type: text/html\n\n";
    print "<html><head><meta charset=\"UTF-8\"></head>";
    print "p_sAction: #" . $p_sAction . "#<br>";
    $aText{'1530'} = decode( 'UTF-8', $aText{'1510'} );
    print "1510 --- #$p_sAction# eq #$aText{'1510'}#<br><br>";

    Output:
    p_sAction: #Benutzer l&#65533;schen#
    1510 --- #Benutzer l&#65533;schen# eq #Benutzer löschen#

    my $p_sAction = $oCGI->param( "action" );
    $p_sAction = decode( 'UTF-8', $p_sAction );
    if ( !defined( $p_sAction ) ){
        print $oCGI->redirect( "access-denied.pl?reason=110" );
        exit;
    }
    print "Content-type: text/html\n\n";
    print "p_sAction: #" . $p_sAction . "#<br>";
    print "1510 --- #$p_sAction# eq #$aText{'1510'}#<br><br>";

    Output:
    p_sAction: #Benutzer löschen#
    1530 --- #Benutzer löschen# eq #Benutzer löschen#

    my $p_sAction = $oCGI->param( "action" );
    $p_sAction = decode( 'UTF-8', $p_sAction );
    if ( !defined( $p_sAction ) ){
        print $oCGI->redirect( "access-denied.pl?reason=110" );
        exit;
    }
    print "Content-type: text/html\n\n";
    print "p_sAction: #" . $p_sAction . "#<br>";
    $aText{'1530'} = decode( 'UTF-8', $aText{'1510'} );
    print "1510 --- #$p_sAction# eq #$aText{'1510'}#<br><br>";

    Output:
    p_sAction: #Benutzer löschen#
    1530 --- #Benutzer löschen# eq #Benutzer löschen#


    Still nothing matches...
    Update: If I add
    if ( $p_sAction eq "Benutzer löschen" ){ print "HELLO WORLD 22222 - Benutzer löschen<br>"; }
    it works, so I'd say the problem has to be the encoding of $aText{'1510'}. Is that right?
    I store the text in the database like this: "Benutzer%20l%26ouml%3Bschen" (uriescaped_utf8). When I read it back, I URI-unescape it and then compare...
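
    To make the storage step concrete, here is a minimal sketch of what that stored value turns into when it is unescaped. It assumes the URI::Escape and HTML::Entities modules are installed; `$stored`, `$unescaped` and `$text` are illustrative names, and whether decoding entities at read time is the right fix for this application is an assumption:

```perl
use strict;
use warnings;
use URI::Escape    qw(uri_unescape);
use HTML::Entities qw(decode_entities);

# The stored value quoted above.
my $stored = 'Benutzer%20l%26ouml%3Bschen';

# %26 is "&" and %3B is ";", so unescaping yields an HTML entity,
# not the letter itself: "Benutzer l&ouml;schen".
my $unescaped = uri_unescape($stored);

# Turning the entity into a real character gives a string that can be
# compared with a decoded CGI parameter.
my $text = decode_entities($unescaped);

print $unescaped eq 'Benutzer l&ouml;schen' ? "entity form\n" : "other form\n";
print $text eq "Benutzer l\x{F6}schen"      ? "match\n"       : "no match\n";
```

    A browser renders `&ouml;` as ö, so both strings look identical in the HTML output even though `eq` sees different characters.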

    I read on Stack Overflow (http://stackoverflow.com/questions/17599103/perl-comparing-2-accentuated-strings-with-different-encodingone-being-read-from) that using Unicode::Normalize::NFD could help. For me it just makes things worse.