Re: Escaping double quotes in complete document
by haukex (Archbishop) on Jun 26, 2017 at 17:46 UTC
I haven't yet had much experience with CGI.pm and UTF-8 issues, so I can't comment much on that, except that Devel::Peek has been a useful tool for me a few times when dealing with UTF-8 issues, as it shows you what Perl thinks the string contains. Usually, if you get the encoding right at the points where data enters and leaves the script, Perl should handle Unicode just fine. Anyway,
the HTML textboxes see the double quote as an end marker for the textbox value
This sounds to me like you might be building your HTML by interpolation, as in print qq{<input type="text" name="foo" value="$val">};? If so, that is certainly the source of the problem and you should use one of the available APIs to write your HTML instead, as they will do the escaping for you. Back before CGI.pm was discouraged, one way to do it was with its HTML generation functions (which are now deprecated). So nowadays the following is not recommended for new scripts, but note how the attribute is properly escaped:
use CGI qw/:html :form/;
my $val = q{ "Hello" <world> & };
print textfield('foo',$val), "\n";
__END__
<input type="text" name="foo" value=" &quot;Hello&quot; &lt;world&gt; &amp; " />
Currently, CGI::HTML::Functions recommends HTML::Tiny, which I haven't yet had the chance to try, and of course there are frameworks like Template::Toolkit or even Mojolicious, although the latter is meant to replace everything that CGI.pm does.
print <<"EndOfText";
<html>
<!--Foo-->
<input type="text" name="mytext" id="mytext" value="$SOAPResult"/>
<!--Bar-->
</html>
EndOfText
The whole site is based on this system and I'm pretty sure my boss is going to kill me when I go to him saying "Yeah, we gotta change the whole thing... Gonna take about two weeks."
If I have to stay with my running code, I get that I have to manually escape every HTML entity by hand, right?
please don't facepalm on me over this
No, I understand, but this is a very old style of generating HTML. My very first attempts at CGI scripts from over 20 years ago probably looked like this :-) But the issues with double quotes would have existed the entire time, even without the Perl upgrade. I also agree with huck that something may have changed in the way the data gets handed to your script.
my boss is going to kill me
Well, if he needs further convincing, then tell him that HTML generation code like this exposes your customers to a Cross-site scripting (XSS) attack (longer explanation).
I get that I have to manually escape every HTML entity by hand, right?
I'm sorry to say yes. The minimal change needed to the code you showed is the following (using encode_entities). Keep in mind that it encodes $SOAPResult in place and the value then stays that way, so if you need the raw value for something else later you should encode a copy instead, e.g. encode_entities(my $copy=$SOAPResult);
use HTML::Entities qw/encode_entities/;
my $SOAPResult = q{ "Hello" <world> & };
encode_entities($SOAPResult);
print <<"EndOfText";
<input type="text" name="mytext" id="mytext" value="$SOAPResult"/>
EndOfText
__END__
<input type="text" name="mytext" id="mytext" value=" &quot;Hello&quot; &lt;world&gt; &amp; "/>
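A small sketch of the copy-based variant, assuming HTML::Entities is installed: when encode_entities is called in non-void context it returns an encoded copy and leaves its argument alone, so the raw value stays available for other uses. The variable names $raw and $safe are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw/encode_entities/;

# Hypothetical raw value, standing in for data returned by the SOAP call.
my $raw = q{ "Hello" <world> & };

# Called in non-void context, encode_entities returns an encoded copy
# and does not modify $raw.
my $safe = encode_entities($raw);

print qq{<input type="text" name="mytext" id="mytext" value="$safe"/>\n};
```

Only $safe goes into the HTML; $raw can still be logged, compared or sent back to the server unchanged.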
What you show above would never have escaped double quotes, no matter what perl version you used, so something else must have changed.
Was something keeping double quotes out of the database before that no longer checks? Was something encoding them when they were read from the database that is no longer doing so?
Hi again haukex and huck
I share the suspicion that something other than Perl has changed, but I have already dug fairly deep into this.
Java (which runs the JBoss server from which I get my input) hasn't been updated, because it is not a package available in my package manager, and neither are JBoss itself or the database host.
Apache has been updated, but as I understand it, it would have interpreted double quotes wrongly regardless of its version, right?
That leaves Perl and my changes in the scripts as possible culprits. As I said, the update messed up the page encoding, because none of the scripts were explicitly using UTF-8 and most of them started the HTML page encoded as ISO-8859-1.
I can confirm with 100% certainty that saving and loading UTF-8 characters and double quotes worked before the update.
However, I do not know whether it stopped working right after the update or only started to malfunction this way after I enabled the scripts to use UTF-8 (though I think these errors are not necessarily connected).
I appreciate your answers a lot and I will definitely propose updating our infrastructure according to your advice.
Re: Escaping double quotes in complete document
by thanos1983 (Parson) on Jun 26, 2017 at 13:02 UTC
Hello MeinName,
I am not familiar with the problem you are having, but I searched online and found the module HTML::Entities. Give it a try.
Example from documentation:
use HTML::Entities;
$a = "V&aring;re norske tegn b&oslash;r &#230res";
decode_entities($a);
encode_entities($a, "\200-\377");
Hope this helps, BR.
Seeking for Perl wisdom...on the process of learning...not there...yet!
Hi thanos,
thank you for your reply. Unfortunately it is not exactly what I was looking for. As I wrote, I came across HTML::Entities myself, but there's just too many scripts with waaay too many variables and arrays and whatnot to go along and decode/encode every single one by hand.
I am looking for a method to tell Perl "Go ahead and just encode everything you get so that HTML entities are correctly loaded." Kind of like
use open qw(:std :utf8);
is doing for UTF-8 encoding.
Best regards
MeinName
$ PERL_UNICODE=S perl script.pl
You can read more under Command Switches in perlrun.
Give it a try.
Hope this helps.
Re: Escaping double quotes in complete document
by holli (Abbot) on Jun 26, 2017 at 17:29 UTC
Hi holli,
for the output method I'm using, please refer to my reply to haukex.
My input comes from multiple Java functions running on a JBoss server that is connected to a database. I call them via SOAP::Lite and read the returned data objects.
I will look into changing those to automatically encode HTML entities. That may be a quicker and more thorough solution than sifting through every script I have.
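One way to do that in a single place, sketched under assumptions: a hypothetical helper encode_deep (not part of SOAP::Lite or HTML::Entities) that walks a plain structure of hashes and arrays and returns a copy with every string leaf run through encode_entities. It assumes the SOAP results have already been turned into plain Perl data and that each value is inserted into HTML exactly once, since encoding an already encoded value would turn &amp; into &amp;amp;.

```perl
use strict;
use warnings;
use HTML::Entities qw/encode_entities/;

# Hypothetical helper: returns a deep copy of $data in which every
# string leaf is HTML-encoded. Only plain hash refs, array refs and
# scalars are handled; other references are returned unchanged.
sub encode_deep {
    my ($data) = @_;
    my $type = ref $data;
    if ($type eq 'HASH') {
        my %copy;
        $copy{$_} = encode_deep($data->{$_}) for keys %$data;
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        my @copy;
        push @copy, encode_deep($_) for @$data;
        return \@copy;
    }
    return $data if $type;              # code refs, objects etc.: leave alone
    return $data unless defined $data;  # keep undef as undef
    return encode_entities($data);      # non-void context: returns a copy
}

# Made-up sample data standing in for a decoded SOAP response.
my $result = {
    title => q{He said "hi"},
    tags  => [ '<b>', 'a & b' ],
};
my $safe = encode_deep($result);
print qq{<input type="text" name="title" value="$safe->{title}"/>\n};
```

Because the helper returns a copy, $result keeps the raw values for anything that is not HTML output.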
Re: Escaping double quotes in complete document
by holli (Abbot) on Jun 27, 2017 at 14:32 UTC
This is the minimal code to use a templating engine named Template::Toolkit.
use Template;
my $tt = Template->new();
# $soap_result is assumed to hold the value fetched earlier
$tt->process( \q[<html>
<!--Foo-->
<input type="text" name="mytext" id="mytext" value="[% soap_result %]"/>
<!--Bar-->
</html>], { soap_result => $soap_result }) || die $tt->error(), "\n";
This protects you from Cross-Site-Scripting attacks and handles the double quote issue.
holli
You can lead your users to water, but alas, you cannot drown them.
use Template;
my $tt = Template->new();
my $soap = ' "foo" <bar> & ';
$tt->process(\<<END, {soap=>$soap}) || die $tt->error();
<html>
<input type="text" name="mytext" value="[% soap %]"/>
</html>
END
$tt->process(\<<END, {soap=>$soap}) || die $tt->error();
<html>
<input type="text" name="mytext" value="[% soap | html %]"/>
</html>
END
__END__
<html>
<input type="text" name="mytext" value=" "foo" <bar> & "/>
</html>
<html>
<input type="text" name="mytext" value=" &quot;foo&quot; &lt;bar&gt; &amp; "/>
</html>
Re: Escaping double quotes in complete document
by MeinName (Novice) on Jun 29, 2017 at 07:16 UTC
Thank you all for your ideas and help on this matter!
I talked with my boss and he talked with his boss, and it seems we will either revamp the entire site to use toolkits for generating HTML or change our infrastructure to get away from completely web-based appliances.
Again, thank you all for your time and help!