muba has asked for the wisdom of the Perl Monks concerning the following question:

A stupid and slightly long introduction
I wrote a pretty small online editor which allows me to easily change a couple of files on my webserver without needing to use FTP.
Or so I thought.
The editor consists of a small form with a textarea and a submit button (oh and besides, for a little feeling of fake security, there is a password box :) ).
This form gets submitted to a Perl script which gets the form information through use CGI 'param' (because param is the only function of the CGI module I use).

The stupid problem
But the data from the textarea does not equal the data I entered. HTML entities like &lt;, &nbsp;, &#92; are replaced by their actual characters.

A stupid solution
Of course I could write &amp;lt; in the editor (this would then render as &lt;, which is what I want), but then I would have to re-edit every HTML entity every time. As you can see, I don't quite feel like doing that.

The (stupid?) question
So my actual question(s): why does CGI eat them character entities? And how can I prevent it?
I could of course roll my own form parser, but that's, well, just stupid.

Re: CGI module seems to eat html entities!
by Jenda (Abbot) on Oct 03, 2005 at 14:00 UTC

    When are those eaten? Are you really sure they are eaten by CGI.pm? I kinda doubt it.

    If you enter &lt; into the textarea and submit it, what does the file contain? Download it via FTP and look at what was actually stored in the file! If you then go back to edit that file, what do you see in the textarea? Doesn't this eating happen a bit later than you thought? And do you escape the text for HTML while producing the form page? You should!

    Jenda
    XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.

      Ok. That was an eye-opener! CGI is not the one to blame: if I enter &lt;, it is stored exactly like that. When I then re-edit the file, the textarea shows a < character.
      Good. You tell me to escape the HTML text. So I tried (naive me!) quotemeta(), but that messes things up, since it escapes characters for regular expressions rather than for HTML.
      Alright, I'll go find a suitable HTML::(something) module.
      Thanks for pointing me in this direction!

        use HTML::Entities;

        Jenda
        XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.
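
The HTML::Entities suggestion above could be applied when the edit form is generated, roughly as in the following minimal sketch. It assumes HTML::Entities (from the HTML-Parser distribution on CPAN) is installed; the helper name textarea_html and the field name "content" are hypothetical and not taken from muba's script.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: build the textarea markup for the edit form.
# $file_contents is the raw text read from the .htm file on disk.
sub textarea_html {
    my ($file_contents) = @_;

    # Escape only the characters that are special in HTML. A stored
    # "&lt;" is sent to the browser as "&amp;lt;", so the browser
    # displays (and later resubmits) "&lt;" instead of "<".
    my $escaped = encode_entities($file_contents, '<>&"');

    return qq{<textarea name="content" rows="20" cols="80">$escaped</textarea>};
}

# Example: text that already contains entities and a raw tag.
my $stored = 'Use &lt;b&gt; for bold and <i> for italics';
print textarea_html($stored), "\n";
```

The second argument to encode_entities limits escaping to the listed characters; without it, the function also encodes control and high-bit characters, which may or may not be wanted depending on the page encoding. quotemeta(), by contrast, only backslash-escapes characters for use in regular expressions, which is why it garbled the text.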

Re: CGI module seems to eat html entities!
by marto (Cardinal) on Oct 03, 2005 at 14:02 UTC
    "The editor consists of a small form with a textarea and a submit button (oh and besides, for a little feeling of fake security, there is a password box :) )."

    To me this sounds like a bad idea. Are you saying that if someone should stumble upon this page you have set up, they could edit files on your webserver via a browser without having to provide any type of valid username/password?

    If so, I would rethink that before worrying about anything else.

    Martin
      Oh no worries. The editor only allows files whose names end in ".htm". If "..", "/", or a null character occurs in the name of the file requested for editing, it denies access. The password is hardcoded in the .pl file (whose name doesn't end in .htm, so it can't be loaded in the editor), and the submitted password is compared to it.

      What I meant by 'little feeling of fake security' is that not just anyone can easily edit the files, but with a little brute-forcing I think the password is easily cracked :)
      But anyway, it is a temporary solution and I will delete the editor when the site update is finished.
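
The file-name rules muba describes (names must end in ".htm"; "..", "/", and null characters are rejected) might look something like the sketch below. This is an illustration of those stated rules only, not muba's actual code; the function name is_editable_name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical check mirroring the rules described in the post.
# Returns 1 if the requested name may be edited, 0 otherwise.
sub is_editable_name {
    my ($name) = @_;

    return 0 unless defined $name;

    # Deny "..", "/", and null characters anywhere in the name.
    return 0 if $name =~ m{\.\.|/|\0};

    # Require the name to end in ".htm". \z anchors at the very end of
    # the string, so a trailing newline does not slip through.
    return 0 unless $name =~ /\.htm\z/;

    return 1;
}

for my $candidate ('index.htm', '../secret.htm', 'editor.pl') {
    printf "%-15s %s\n", $candidate,
        is_editable_name($candidate) ? 'allowed' : 'denied';
}
```

A stricter alternative would be to accept only names matching a whitelist pattern such as letters, digits, hyphens, and underscores followed by ".htm", since blacklists of dangerous sequences are easy to get subtly wrong.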
        It doesn't really sound like a great idea: HTML pages can contain various scripting languages that could do some harm to visitors to your site, or some terrorist organization could just take over pages for communications, etc.

        It's your decision, I guess, if it's your web server. And at least you're aware that it's not real security.