in reply to Plain Text To HTML

Let me clarify.

I use a form to input information to a program to display a webpage. One part of that form is a text-area into which one can type input, or simply paste a Word document. I need a routine within the main program which will read the input from the text-area (call it $Description) and convert it to HTML for display in the webpage as directed. The routine I am presently using will do all of this except it does not indent the items in a bulleted or numbered list. Here is the present routine:

my $newrecord = "";
my $fn = $Description;
$fn =~ s/-RET-/<br>/g;
$fn =~ s/\n/<br>/g;
$fn =~ s/_/((22&%&%22))/g;
$newrecord .= "$fn";
$Description = $newrecord;

Suggestions as to what is missing will be greatly appreciated.

Re^2: Plain Text To HTML
by LanX (Saint) on Sep 19, 2024 at 13:06 UTC
    > simply paste a Word document.

    What does that mean?

    Others have already pointed you to SSCCE; it's a good way to "clarify" input, code and expected output.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      > I use a form to input information to a program to display a webpage. One part of that form is a text-area into which one can type input, or simply paste a Word document.

      This suggests to me that the idea is not to read a .docx file but that a user pastes content (with or without formatting?) into a webpage, which is backed by a Perl CGI script.

        > that a user pastes content

        Or he drags and drops a file, which is then uploaded.

        But yes, I also expect copy and paste.

        > (with or without formatting?)

        And that's exactly the point: in what way is pasted text still "Word"?¹

        I've already seen ...

        • plain text
        • plain text with "markup"
        • RTF
        • HTML
        • various abominations of the MS universe (OLE, etc)

        ... copied out of Word.

        It also depends on the OS, the intermediate Clipboard, browser and attributes of the receiving textarea² used in the form.

        The OP is keeping us guessing, instead of just showing us the exact text he gets inside his CGI.
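One way to see the exact text arriving in the CGI is to dump it with control characters made visible. A minimal sketch, assuming the text sits in $Description as in the original post; show_raw is a hypothetical helper name, and the escape format used for non-ASCII characters may differ between Data::Dumper versions:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper (not from the thread): return a dump of the string
# in which control characters appear as escapes such as \n and \t.
sub show_raw {
    my ($text) = @_;
    local $Data::Dumper::Useqq = 1;   # double-quoted output with escapes
    local $Data::Dumper::Terse = 1;   # omit the "$VAR1 = " prefix
    return Dumper($text);
}

# Assumed usage inside the CGI script: write the dump to the error log.
# print STDERR show_raw($Description);
```

Posting that dump would show whether the bullets arrive as characters, tabs, or some other markup, and whether each item is on its own line.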

        Cheers Rolf

        ¹) or, to quote what the OP wrote twice, a "Word document"

        ²) "newer" browsers allow WYSIWYG editing

Re^2: Plain Text To HTML
by Milti (Beadle) on Sep 19, 2024 at 14:31 UTC

    More clarification.

    The form input is sent to a program on a web server to be processed and displayed as part of a webpage that the site user will see. The form input might simply be typed into the 'text area', or a document might be copied and pasted into it. Either way, the final product for display will require some HTML formatting. Anyone who is authorized can post info for display on our website, and the method must be simple: either type info into the form or copy and paste something into it. It is unlikely that the input will contain any HTML formatting. Consequently, the program that accepts the input must format it before it is displayed.

    Hope this makes clear what it is that I am trying to do.

      It would aid your cause greatly if you were to show sample input as pasted into the form (say 3 lines max) and the equivalent desired HTML of that input once it has been transformed. At the moment everyone is left to guess what it is that you actually want to happen during this transformation.

      When you have a moment, perhaps a read of How to ask better questions using Test::More and sample data will be of help.
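For illustration, a question posed as a test might look like the sketch below. It wraps two of the substitutions from the original post in a sub (present_routine is a hypothetical name) and feeds it a made-up two-line input; the sample data is an assumption, not the OP's actual text:

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Two of the substitutions from the original post, wrapped in a sub so
# they can be tested. The underscore substitution is left out here.
sub present_routine {
    my ($fn) = @_;
    $fn =~ s/-RET-/<br>/g;
    $fn =~ s/\n/<br>/g;
    return $fn;
}

my $input    = "First line\nSecond line";    # hypothetical sample input
my $expected = 'First line<br>Second line';  # the HTML wanted for it

is( present_routine($input), $expected, 'newlines become <br>' );
```

Replacing $input and $expected with the real pasted text and the real desired HTML would give everyone something concrete to run.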


      🦛

        Here's a sample input. It is a copy and paste of a Word document:

        This is a test posting. • Hello there! • How are you? • Very well I hope! This is the end of the posting.

        As you can see, as entered here the bullets are not indented as they were in the copied document. What I am looking for is a routine that will read the input and display it as this HTML would:

        <p>This is a test posting.<ul><li>Hello there!</li><li>How are you?</li><li>Very well I hope!</li></ul>This is the end of the posting.</p>
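A routine along these lines might be sketched as follows. It rests on several assumptions that the thread does not confirm: each list item arrives on its own line, bullets arrive as the character U+2022 (or as "*" or "-") followed by whitespace, numbered items look like "1." or "1)", and the text has already been decoded to Perl characters. The names text_to_html and escape_html are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical routine: wrap runs of bullet lines in <ul>, runs of
# numbered lines in <ol>, and join consecutive plain lines with <br>.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # normalise line endings

    my $html       = '';
    my $open       = '';      # list currently open: '', 'ul' or 'ol'
    my $prev_plain = 0;       # true if the previous line was plain text

    for my $line (split /\n/, $text) {
        my ($type, $body) =
              $line =~ /^\s*[\x{2022}*\-]\s+(.*)/ ? ('ul', $1)
            : $line =~ /^\s*\d+[.)]\s+(.*)/       ? ('ol', $1)
            :                                       ('',   $line);

        if ($type ne $open) {
            $html .= "</$open>" if $open;   # close the previous list
            $html .= "<$type>"  if $type;   # open the new one
            $open = $type;
        }
        elsif (!$type && $prev_plain) {
            $html .= '<br>';                # plain line after plain line
        }

        $html .= $type
            ? '<li>' . escape_html($body) . '</li>'
            : escape_html($body);
        $prev_plain = !$type;
    }
    $html .= "</$open>" if $open;           # close a list left open at the end

    return "<p>$html</p>";
}

# Assumed usage in place of the present routine:
# $Description = text_to_html($Description);
```

The sketch reproduces the requested markup, including the list inside the paragraph. HTML does not allow a <ul> inside a <p>, so browsers end the paragraph implicitly at the list; emitting separate paragraphs around the list would be a stricter alternative. If the pasted text turns out to use a different bullet character or to arrive on a single line, the patterns would need adjusting.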