John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a website using Catalyst, and it's mostly static content, but I'm using Catalyst and TT::Alloy to keep headers and footers reusable, and things like that.

One feature is a small blurb on the front page that the customer will change every so often, perhaps once a month. A form supplies new content, which is saved in a config file that my template pulls in; I got that working without any trouble.

The problem is that one of these items is not a single line but a larger "article", and the current content contains paragraphs and a bulleted list. So, I need to allow full HTML codes to be entered and passed through to the template.

Meanwhile, I'm serving XHTML if the browser accepts it, so a syntax error will break the page, not just "do something" like most of the web pages out there (including this one!).

Now this is a seldom-used Admin feature, not a BBS with this as a core feature. So I don't want to put too much work in it, except to reuse! I'm wondering how I might allow users to enter rich content but still ensure that I serve only well-formed XHTML.

Ideas include (1) validating the resulting page before committing the changes; (2) using a different sort of markup and transforming it into XHTML. I think that's why the phpBB syntax was invented, in fact. Are there any Perl modules available that will help me do what I'm looking for?

Thanks,
—John

Replies are listed 'Best First'.
Re: User Input for Web Content
by Corion (Patriarch) on Apr 11, 2011 at 13:22 UTC

    You could decide to transform the user blurb/article to conforming XHTML. For "entering" HTML, I would give the (non-technical) users some editor that ensures well-formedness, like FCKeditor. Then you can just check that all tags are either paired or in a list of known unpaired tags.
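The "paired or known unpaired" check described above could look roughly like this. It is a minimal sketch in Python for illustration only, not the API of any Perl module; the names `VOID_TAGS`, `TagBalanceChecker`, and `unbalanced_tags` are made up for the example, and the list of unpaired tags is an assumption, not a complete one.

```python
from html.parser import HTMLParser

# Assumption: an illustrative, incomplete list of "known unpaired" tags.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records anything that does not pair up."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # Known unpaired tags never need a closing tag, so they are not tracked.
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed element such as <br />: nothing to pair.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            self.errors.append(f"unexpected </{tag}>")
        elif not self.stack or self.stack[-1] != tag:
            self.errors.append(f"mismatched </{tag}>")
        else:
            self.stack.pop()


def unbalanced_tags(fragment):
    """Return a list of problems; an empty list means every tag paired up."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    return checker.errors + [f"unclosed <{tag}>" for tag in checker.stack]
```

This only checks tag pairing. It says nothing about attribute quoting or entities, so it is not a full well-formedness test.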

Re: User Input for Web Content
by ww (Archbishop) on Apr 11, 2011 at 16:39 UTC
    Search, desperately if necessary, for some alternative.

    Allowing users to enter the full range of HTML and passing it on to other elements on the server side is an invitation to:

    • Horribly ill-formatted posts (that will break your XHTML anyway)
    • Frustration and anger on your part
          or
    • Exploitable vulnerabilities (validation notwithstanding), since your spec says "...I need to allow full HTML codes to be entered and passed through to the template."

Re: User Input for Web Content
by locked_user sundialsvc4 (Abbot) on Apr 11, 2011 at 15:45 UTC

    If you “need to allow full (X)HTML codes to be entered,” well then, that’s that ... you will have to go with validation. Otherwise, the BB-syntax is quite handy, and there are modules, e.g. HTML::BBCode, that can do it easily. Basically, give ’em what they prefer.
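For the validation route, one simple approach is to run the submitted fragment through an XML parser and reject it if the parser complains. A minimal sketch, in Python for illustration only (the function name `well_formedness_error` and the wrapping `<div>` are assumptions of this example):

```python
import xml.etree.ElementTree as ET


def well_formedness_error(fragment):
    """Return None if the fragment parses as XML, else the parser's message."""
    # An XML document needs exactly one root element, so wrap the fragment.
    wrapped = f"<div>{fragment}</div>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

Note that this checks well-formedness only, not validity against the XHTML DTD, and that named entities such as `&nbsp;` are rejected because no DTD is loaded. The parser's message can be shown to the user so they can fix the draft.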

    A workflow that has done well for me is to allow writers to enter whatever “drafts” they want. The new content is stored as they supplied it, but it is marked as a draft: it hasn’t been validated yet, and it isn’t visible yet. You provide a button that will validate the content upon request, and you also do this before publishing the content on the site. Disk space being very cheap these days, I usually keep every draft, and use “soft deletes” (an is_deleted column ...), which creates a very forgiving and easy-to-use system. A simple logging table keeps track of what the users actually did, and lets them undo those things. All of these things are very easy to do and they are really helpful ... even for a rarely-updated page.
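The draft workflow above might be sketched like this, again in Python purely for illustration. Only the `is_deleted` column comes from the description; the table layout, the `is_published` column, and all function names are assumptions of the sketch, and the logging table is left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) content table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS content (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               body TEXT NOT NULL,
               is_published INTEGER NOT NULL DEFAULT 0,
               is_deleted INTEGER NOT NULL DEFAULT 0)"""
    )
    return db


def save_draft(db, body):
    """Store the text exactly as supplied; it stays an invisible draft."""
    cur = db.execute("INSERT INTO content (body) VALUES (?)", (body,))
    db.commit()
    return cur.lastrowid


def is_well_formed(body):
    """Well-formedness check: does the wrapped fragment parse as XML?"""
    try:
        ET.fromstring(f"<div>{body}</div>")
        return True
    except ET.ParseError:
        return False


def publish(db, draft_id):
    """Validate first; only a well-formed, non-deleted draft goes live."""
    row = db.execute(
        "SELECT body FROM content WHERE id = ? AND is_deleted = 0", (draft_id,)
    ).fetchone()
    if row is None or not is_well_formed(row[0]):
        return False
    db.execute("UPDATE content SET is_published = 1 WHERE id = ?", (draft_id,))
    db.commit()
    return True


def soft_delete(db, draft_id):
    """Hide the row without losing it, so the action can be undone later."""
    db.execute("UPDATE content SET is_deleted = 1 WHERE id = ?", (draft_id,))
    db.commit()


def current_content(db):
    """Return the newest published, non-deleted body, or None."""
    row = db.execute(
        "SELECT body FROM content WHERE is_published = 1 AND is_deleted = 0 "
        "ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

Because nothing is ever overwritten, a broken draft never reaches the template: the page keeps showing the last row that passed validation.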

Re: User Input for Web Content
by roboticus (Chancellor) on Apr 11, 2011 at 19:34 UTC

    John M. Dlugosz:

    If you want to give the users some ability to format the text, one of the wiki text modules (such as Text::Markup::Mediawiki) may be a solution. That way, the module will generate the HTML, rather than the users. The HTML would then be generated properly (I would hope!), and the user would get a simple mechanism to format their posts.
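To show why generating the HTML yourself sidesteps the well-formedness problem, here is a toy converter. It is a Python sketch for illustration and is not how Text::Markup::Mediawiki or any other module works; the syntax (blank-line-separated paragraphs, `* ` for bullets) and the name `markup_to_xhtml` are assumptions of the example.

```python
from html import escape


def markup_to_xhtml(text):
    """Convert blank-line-separated blocks into <p> and <ul> elements."""
    blocks = []
    for chunk in text.strip().split("\n\n"):
        lines = [ln.strip() for ln in chunk.splitlines() if ln.strip()]
        if not lines:
            continue
        if all(ln.startswith("* ") for ln in lines):
            # Every line is a bullet: emit one list with one item per line.
            items = "".join(f"<li>{escape(ln[2:])}</li>" for ln in lines)
            blocks.append(f"<ul>{items}</ul>")
        else:
            # Anything else is a paragraph; user text is always escaped.
            blocks.append(f"<p>{escape(' '.join(lines))}</p>")
    return "\n".join(blocks)
```

Since every tag is written by the converter and every piece of user text is escaped, the output cannot contain an unclosed or user-supplied tag.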

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.