lwicks has asked for the wisdom of the Perl Monks concerning the following question:

Knowledgeable ones,

I am building a simple web app using DBD::AnyData and CGI.pm. Basically, I want to store HTML pages in a CSV file.

To maintain the pages I use a web form (textarea box) to type in the HTML, then it pastes into the CSV data table.

At this stage I have no data validation of the user input. So things like commas mess things up completely.

What I am wanting to know is:
Are there any modules, templates, tutorials, or suggestions for how to parse the user input? Basically I don't want to reinvent the wheel here. Can anyone throw some good ideas my way, please?

Cheers

Kia Kaha, Kia Toa, Kia Manawanui!
Be Strong, Be Brave, Be persevering!

Re: Parsing user input into CSV table
by b10m (Vicar) on Apr 14, 2004 at 10:06 UTC

    If all you want to store are these HTML pages, why dump them in CSV files? It's bound to mess up ;-) IMHO, you'd better just dump the HTML pages in separate files. That would probably be faster too.

    I'm not sure what you mean by "how to parse the user input", since you've already stated you're using the CGI module. Do you want to strip certain tags? Validate the HTML before you dump it to your hard disk?

    --
    b10m

    All code is usually tested, but rarely trusted.
      Basically the HTML is a small part of a larger whole.
      The idea of storing in separate files might work, but ideally would like to keep everything in a single storage format.

      The CSV format is because this is for use on a bog standard webserver, minus a SQL server and such like. The use here is that HTML is pasted/typed into a textarea box on a webform, (along with some other info) and is stored in the table.
      The table is then queried to produce a list of pages; when the user clicks a link, they are shown a web page built from the stored HTML.

      Kind of a CMS, mainly a "get the design away from the infrastructure" kind of thing. The people who will be typing the data will start by writing plain text, which would then have HTML tags added afterwards.

      Commas are the killer at the moment, obviously :-)

      Kia Kaha, Kia Toa, Kia Manawanui!
      Be Strong, Be Brave, Be persevering!
Re: Parsing user input into CSV table
by William G. Davis (Friar) on Apr 14, 2004 at 11:15 UTC

    Why on earth would you want to store HTML pages in a CSV file? You don't plan on serving them up from this one CSV file, do you?

    Well, I'd advise you to not use CSV and instead use DBI with a real relational database backend like MySQL or PostgreSQL... But I'm guessing you can't, because if you could, you'd already be doing that instead, right? :)

    Ok, well, to mess around with CSV you'd want to use either Text::CSV (with an optional C backend) or DBI with DBD::CSV. Either one of them can handle any needed comma escaping for you. If you only have a handful of these pages, storing them separately on disk really is not a bad idea. If you have a lot of these pages, another better, faster, easier solution than CSV is storing the stuff in some big hash where each key is, say, a page name or section name, and each value is the corresponding HTML code. Then you serialize it, storing the hash itself to disk so that it can be quickly loaded in the future, using DB_File (my favorite) or some other DBM module (like MLDBM).
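
The "big hash serialized to disk" idea can be sketched roughly as follows. Note that this swaps in the core Storable module rather than DB_File or MLDBM, purely so the example runs without extra installs; the page names, the contents, and the temporary file are all made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Hypothetical page store: page name => HTML text.
my %pages = (
    home  => "<h1>Welcome</h1>\n<p>Commas, quotes and newlines are all fine here.</p>",
    about => "<p>Line one\nLine two</p>",
);

# A temporary file stands in for the real data file (assumed location).
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Serialize the whole hash to disk in one call.
store(\%pages, $file);

# Later (for example, in another CGI request), load it back as a hash reference.
my $loaded = retrieve($file);
print $loaded->{about}, "\n";
```

Because the values are stored as opaque strings, commas and newlines need no escaping at all. If several CGI processes may write at once, Storable's lock_store and lock_retrieve are worth a look.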

      Hmmm....

      Really all I want to do is store plain text, the writing of the authors. HTML is just for formatting, as it will be presented as a web application using IE or a good browser. ;-)

      Each "Page" is roughly the length of a page of typed text, if that. The big problem is not the HTML but the commas as in the following example:

      This is a example of the problem, this kills it!

      This is entered by hand into a textarea box and kills things. I could just write some regex to swap commas for appropriate code, but was/am hoping to find a better written, better tested solution. I.e., as I said earlier, I don't want to reinvent the wheel here.

      Thanks for the comments,
      LANCE

      Kia Kaha, Kia Toa, Kia Manawanui!
      Be Strong, Be Brave, Be persevering!

        Well, if you really want plain text, then Text::CSV will work fine. It appears to handle comma escaping fine. DBD::CSV is probably overkill, as you don't seem to need SQL. I'd still at least consider DBM, though.
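
For reference, a minimal sketch of letting Text::CSV do the escaping, assuming the module is installed from CPAN. The record layout (page name, title, body) is invented for illustration; `binary => 1` is set so that fields may contain embedded newlines as well as commas and quotes.

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 permits embedded newlines and non-ASCII characters in fields.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot create Text::CSV object";

# Hypothetical record: page name, title, body text from the textarea.
my @record = ('about', 'About us', "Hello, world.\nSecond line, with \"quotes\".");

# combine() builds one CSV line, quoting and escaping fields as needed.
$csv->combine(@record) or die "combine failed";
my $line = $csv->string;

# parse() reverses the process and fields() returns the original values.
$csv->parse($line) or die "parse failed";
my @back = $csv->fields;

print "round trip ok\n" if $back[2] eq $record[2];
```

Whether the rest of the storage layer copes with a quoted field that spans several physical lines is a separate question, so it is worth testing that end to end before relying on it.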

        Further investigation shows that the problem is in fact not commas or HTML tags etc., but rather newlines.

        So what's the "help the newbie" suggestion for stripping the newline characters from the scalar?

        Lance

        Kia Kaha, Kia Toa, Kia Manawanui!
        Be Strong, Be Brave, Be persevering!
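
One simple way to strip the newlines from a scalar is a substitution, sketched below. Browsers normally submit textarea line breaks as CRLF, so the pattern covers CRLF as well as a lone CR or LF. The helper name `flatten_newlines` and the choice of a single space as the replacement are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each line break (CRLF, lone CR or lone LF)
# into a single space so the text fits on one CSV line.
sub flatten_newlines {
    my ($text) = @_;
    $text =~ s/\r\n|\r|\n/ /g;
    return $text;
}

my $input = "First line,\r\nsecond line\nthird line";
print flatten_newlines($input), "\n";   # prints: First line, second line third line
```

If the line breaks should survive on the displayed page, replacing them with `<br>` instead of a space is one alternative, at the cost of putting markup into the stored text.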