http://qs1969.pair.com?node_id=286553

einerwitzen has asked for the wisdom of the Perl Monks concerning the following question:

I'd explain the situation, but I bet there is a simple answer, so I'll simply ask it... I have a script that reads the HTML of one page and displays it (the HTML) in a text box, kind of like a view-source right in the page. Only sometimes the HTML is garbled: it won't show the '>' or '<', and bits like '&nbsp;' show up as the space rather than the code... Maybe this is an HTML question, but since it is in a script I'd like to find a Perl command. Thanks for whatever anyone can offer!!!

Replies are listed 'Best First'.
Re: View The HTML ???
by Aristotle (Chancellor) on Aug 26, 2003 at 02:03 UTC
    You need to escape characters that have special meaning in HTML. CGI.pm has escapeHTML and then there's HTML::Entities.

    Makeshifts last the longest.

Re: View The HTML ???
by asarih (Hermit) on Aug 26, 2003 at 02:07 UTC
    What "text box"? And are you processing the text (converting > to &gt;, etc.) before calling Perl's print?

    I guess CPAN is a start.

      Yeah, right now I just have a whole list of filters:
      @HTMLUN2 = map { s/</&lt/; $_ } @HTMLUN1;
      @HTML = map { s/>/&gt/; $_ } @HTMLUN2;
      print <<HTML;
      <textarea>@{[join("",@HTML)]}</textarea>
      HTML

      I am looking for the command that does all the stuff for me, so I don't need to map { everything.

        Update: I'm a klutz.. don't forget to see asarih's reply below.

        FWIW, you need to add s/&/&amp;/g; to your substitutions, and run it before the others, for a minimal escaping solution.

        Also, you must terminate your entities with a semicolon - it's &lt; and not &lt. (Well, the latter is acceptable in some cases under SGML, but don't go there unless you like headaches.)
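For anyone who wants to see the minimal escaping in one place, here is a rough sketch in plain Perl. The function name escape_html_minimal and the sample @HTML contents are made up for illustration; this is not what CGI.pm or HTML::Entities do internally, and those modules remain the better choice in real code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm or HTML::Entities): escapes the
# characters that have special meaning in HTML text and quoted attributes.
sub escape_html_minimal {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or the & in &lt; and &gt; gets escaped again
    $text =~ s/</&lt;/g;     # note the terminating semicolon on every entity
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Sample input standing in for the lines read from the page (assumed data).
my @HTML = ("<p>Fish &amp; chips&nbsp;</p>\n", "<a href=\"x\">link</a>\n");

print '<textarea>', escape_html_minimal(join '', @HTML), '</textarea>', "\n";
```

The /g flag matters as much as the ordering: without it only the first match in each string is replaced.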

        A few style suggestions on your code - try something like this:

        s/&/&amp;/g for @HTML;   # first, so the & in &lt; and &gt; is not escaped again
        s/</&lt;/g for @HTML;
        s/>/&gt;/g for @HTML;
        print '<textarea>', join("", @HTML), '</textarea>';
        Much easier on the eyes, no? Anyway, as I've already said:
        use CGI qw(:standard);
        print textarea('', escapeHTML(join '', @HTML));

        Makeshifts last the longest.

Re: View The HTML ???
by einerwitzen (Sexton) on Aug 26, 2003 at 03:08 UTC
    Absolutely Wonderful, thanks a ton ...

    open(PAGE, '<', $file) or die "Cannot open $file: $!";
    @HTML = <PAGE>;
    close(PAGE);
    print <<PART1;
    <textarea style="{border: solid 1 #000033; width: 900; height: 550;}" name="text">
    PART1
    use CGI qw(:standard);
    print escapeHTML(join '', @HTML);
    print <<PART2;
    </textarea>
    PART2
      Well, FWIW, you shouldn't be importing a million functions using :standard if you're only using one of them. Are you using CGI.pm anyway to parse parameters?

      Makeshifts last the longest.

      You really should look into Templating Solutions. Here are two for your amusement:
      1. Template:
      2. HTML::Template:
      Feel free to ask questions about these modules, they are designed to make your life easier - but you don't appreciate that up front. ;)

      jeffa
