Stenyj has asked for the wisdom of the Perl Monks concerning the following question:

I have a stupid problem that I just can't seem to get my arms around, so I'm hoping someone here with a lot more experience (or even just a little would probably do the trick ;-p) can give me a little direction.

I have a script that allows users to modify HTML pages through a web form. The content is displayed in a <textarea> field of an HTML form, and they can then submit the changes.

I'm encountering a problem when the HTML content to be edited contains a </textarea> tag. When the form attempts to fill the HTML content into the textarea and that closing tag is printed, it closes the textarea that the HTML content is supposed to fill, and the rest of the HTML content is rendered on the page as if it were being viewed.

If that was clearly explained, please let me know & I'll try to be more clear.

Anyway, one way I've found around this is by adding <!-- at the beginning and --> at the end of the content, since this tells the browser not to treat it as displayable content and thus prints the entire content in the <textarea>.

So, basically I have this:

my $content;
open (FILE,"file.html");
while (<FILE>) { $content .= $_; }
close(FILE);

which grabs the html file to be edited

$content = "<!--\n" . $content . "\n-->";

which allows it to be displayed properly into the textarea field, even if the HTML content contains a </textarea> tag.
then once it's submitted:

my $content = param('content');
$content =~ s/<!--\n//;
$content =~ s/\n-->//;

which grabs the content that the user is submitting, and then attempts to remove the content that was added automatically to prevent the problem mentioned earlier.

open (FILE,">file.html");
print FILE $content;
close(FILE);

which saves the modified file.

Now, first I'll acknowledge that this is probably a stupid way of attempting to fix this problem. If anyone has any other suggestions, I'm definitely open to them.
If there are no other suggestions, could anyone tell me why the substitutions are not working? They seem to be ignored, and the added strings end up being written to the file (when I actually want them removed before the file is written).

Any help would be GREATLY appreciated.

Thx all,
Stenyj

Replies are listed 'Best First'.
Re: escaping <!-- and --> in substitution...or other help plz
by ikegami (Patriarch) on Apr 03, 2005 at 05:36 UTC

    Doesn't the following work?

    $content =~ s/&/&amp;/g;
    $content =~ s/</&lt;/g;
    $content =~ s/>/&gt;/g;  # optional.
    print("<textarea>$content</textarea>");

    To save it:

    open(FILE, "> file.html");
    print FILE $cgi->param('content');
    close(FILE);
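
A minimal sketch of the escaping step above, wrapped in a helper so the ordering is explicit. The name escape_for_textarea and the sample content are assumptions made for illustration; they do not appear in the thread. The point it shows is that '&' has to be replaced first, and that nothing needs to be undone on save, because the browser submits the decoded text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): escape the three characters
# that matter inside a <textarea>. '&' must be handled first, otherwise
# the '&' introduced by '&lt;' and '&gt;' would be escaped a second time.
sub escape_for_textarea {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Sample content containing the troublesome closing tag.
my $content = "<p>Tom & Jerry</p>\n<textarea>inner</textarea>\n";
my $escaped = escape_for_textarea($content);

# The closing tag is now inert text, so it cannot end the outer textarea.
print "<textarea name=\"content\">$escaped</textarea>\n";
```

Because the browser turns &lt; back into < before submitting the form, the saved file should contain the original markup without any extra substitution.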
Re: escaping <!-- and --> in substitution...or other help plz
by tlm (Prior) on Apr 03, 2005 at 02:39 UTC

    If that was clearly explained, please let me know & I'll try to be more clear.

    That sounds like a runaway process. :-)

    The biggest problem I see right off the bat with your approach is that it rules out the possibility of the user entering comments in the HTML text they are submitting.

    Anyway, have you considered standard HTML escaping? There are probably many modules to do this, but I use HTML::Entities.

    But whether HTML escaping is the way to go is only a naive guess on my part, since (thankfully) I have never had to code such an application.

    the lowliest monk
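
For reference, a short sketch of what the HTML::Entities route might look like, assuming the module is installed from CPAN. The sample string is made up for illustration. It also shows that real comments in the user's HTML survive, which the comment-wrapping approach cannot guarantee.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# Sample content with both a genuine comment and a closing textarea tag.
my $content = "<!-- a real comment -->\n</textarea>\n";

# Restrict encoding to the characters that are unsafe inside the textarea.
my $escaped = encode_entities($content, '<>&"');
print $escaped;

# Decoding is only needed if you escaped the text yourself and want it back;
# a browser does this on its own before submitting the form.
my $round_trip = decode_entities($escaped);
print "round trip ok\n" if $round_trip eq $content;
```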

Re: escaping <!-- and --> in substitution...or other help plz
by chas (Priest) on Apr 03, 2005 at 02:09 UTC
    I don't see why your substitutions aren't being made; it seems to work for me in a simple example.
    As far as other methods: I've faced exactly that sort of problem, and what I've usually done is to make some kind of substitution, e.g. replacing </textarea> with #%#%#%textarea#%#%#% so that the term isn't interpreted by the browser, and then substituting back before writing out the result. Actually, I've mostly done this in inputs of type "hidden" so it isn't visible to the user, but occasionally I've done it in visible areas (e.g. in email headers where diamond brackets occur and I want to see the information in an HTML page, I might replace diamond brackets with square brackets and later substitute back.)
    I know this sounds lame brained, but sometimes I couldn't find any other easy way around the problem, and this type of solution always seemed to work with no problems (for years!) I'm sure someone will point out that if the text really contained #%#%#%textarea#%#%#% then the substitution I mentioned will break, but I guess I can live with that possibility.
    Your solution seems to be something of the same type, but don't you need to worry about possibly removing comment markers that you don't want removed? By the way, I think that <!-- is supposed to be followed by a space to be proper HTML, and similarly the ending marker should be preceded by a space.
    chas
    (Update: The method I describe works if the users know that the strange constructions shouldn't be altered. If a large class of users can do the editing, it's probably not a great idea for them to be visible. Usually, I arrange things so the strange stuff isn't actually in the textareas, possibly by splitting things into several textareas. Of course, it often isn't a good idea to let arbitrary users edit HTML tags at all, since they may not do it properly; restrict access to just text blocks.)
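
The placeholder approach described above could be sketched roughly as follows. The helper names hide_closing_tag and restore_closing_tag are assumptions for illustration, and the sketch shares the weakness already mentioned: content that really contains the placeholder string will be altered on the way back.

```perl
use strict;
use warnings;

# The placeholder string quoted in the reply above.
my $PLACEHOLDER = '#%#%#%textarea#%#%#%';

# Hypothetical helper: swap the closing tag for the placeholder before display.
sub hide_closing_tag {
    my ($text) = @_;
    $text =~ s{</textarea>}{$PLACEHOLDER}gi;
    return $text;
}

# Hypothetical helper: swap the placeholder back before saving.
# \Q...\E quotes the placeholder so its characters are matched literally.
sub restore_closing_tag {
    my ($text) = @_;
    $text =~ s{\Q$PLACEHOLDER\E}{</textarea>}g;
    return $text;
}

my $original = "<form><textarea>x</textarea></form>";
my $shown    = hide_closing_tag($original);
my $saved    = restore_closing_tag($shown);

print "$shown\n";
print "restored\n" if $saved eq $original;
```

Note that the match is case-insensitive but the restore always writes a lowercase tag, so </TEXTAREA> would come back as </textarea>.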
Re: escaping <!-- and --> in substitution...or other help plz
by ktross (Deacon) on Apr 03, 2005 at 02:08 UTC

    Many characters have to be escaped in regexes...

    try changing:

    $content =~ s/<!--\n//;

    to

    $content =~ s/\<\!\-\-\n//;

    and

    $content =~ s/\n-->//;

    with

    $content =~ s/\n\-\-\>//;

    Update: I put the code I provided in code tags. I apologize for the low quality of this post. I had just started working on a reply when my fiancée tossed my laptop aside and ravaged me. (Kind of like the story of St Kevin but without the nettles, or the nunnery.) How un-monk-like! I might take another stab at this later.

      (Some of your code isn't displaying properly due to lack of code tags; I had to look at the source to see it.)
      Actually, when I tried the substitutions, they seemed to work fine without any escaping.
      chas
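
A small sketch that checks the point made above (the escaped and unescaped patterns behave the same) and illustrates one possible cause that the thread does not confirm: browsers commonly submit textarea content with CRLF line endings, in which case "<!--\n" no longer matches. The helper strip_wrapper and the sample strings are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the wrapper added before display, tolerating
# either LF or CRLF line endings, and only at the very start and end so
# that comments inside the user's HTML are left alone.
sub strip_wrapper {
    my ($text) = @_;
    $text =~ s/\A<!--\r?\n//;
    $text =~ s/\r?\n-->\z//;
    return $text;
}

my $lf   = "<!--\n<p>hi</p>\n-->";        # as built by the script
my $crlf = "<!--\r\n<p>hi</p>\r\n-->";    # as a form submission may look

# The escaped and unescaped patterns remove the same text.
(my $plain   = $lf) =~ s/<!--\n//;
(my $escaped = $lf) =~ s/\<\!\-\-\n//;
print "same result\n" if $plain eq $escaped;

# The original pattern does not match the opening marker in the CRLF variant.
(my $unchanged = $crlf) =~ s/<!--\n//;
print "opening marker left in place\n" if $unchanged eq $crlf;

# The tolerant version handles both forms.
print strip_wrapper($crlf), "\n";
```

If line endings turn out not to be the cause, dumping the submitted value byte by byte before the substitution would show what the pattern is actually being matched against.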
Re: escaping <!-- and --> in substitution...or other help plz
by kprasanna_79 (Hermit) on Apr 04, 2005 at 06:57 UTC
    Hi,
    This problem is common in web applications, especially when we are handling raw HTML. To escape HTML tags inside textarea and text box fields, we usually write something like this:
    <textarea ESCAPE=HTML>some text</textarea> <input type="text" ESCAPE=HTML>some text</input>
    I am not sure this will help you solve your problem, but it is what we usually do to escape HTML in web applications.
    Kindly excuse me if I am wrong anywhere...
    I want to share the idea..
    --prasanna.k
Re: escaping <!-- and --> in substitution...or other help plz
by trammell (Priest) on Apr 03, 2005 at 22:40 UTC
    I'm pretty sure I got this from one of merlyn's columns:
    $foo =~ s/['<&>"]/"&#".ord($&).";"/ge;
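
To see what that one-liner produces, here is a small sketch of the same substitution wrapped in a function. The name numeric_escape is an assumption for illustration, and a capture group is used in place of $&, which older Perls penalize for performance.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution above: replace each of the
# five characters with its numeric character reference, e.g. '<' becomes
# '&#60;'. The /e flag evaluates the replacement as Perl code.
sub numeric_escape {
    my ($text) = @_;
    $text =~ s/(['<&>"])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print numeric_escape('</textarea>'), "\n";   # prints &#60;/textarea&#62;
```

Since all five characters are replaced in a single pass, there is no ordering issue with '&' as there is with chained substitutions.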