Hello Monks,

This (long-ish) question is mainly about design. I have a simple "database": basically just one table, with about 30 fields and a few hundred records. I would like to extract this data and post it to the web via CGI.

One limitation of the server I am using is that I cannot install a real database -- I am stuck with plain-text. That shouldn't be a problem. Parsing this file on a per-request basis will be easy, especially given the low-traffic nature of the site.

What is the problem? Two things, actually. Let me specify first that I am now, and likely always will be, the only maintainer for this site. I control both the file format and the CGI parsing code. Now, my two concerns:

  1. What format do I use for the text-file?
  2. How do I ensure that meta-characters in the text-file are properly inserted into the HTML?
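To make the second concern concrete, here is a minimal sketch of the kind of output-time escaping I have in mind. The `escape_html` helper is hypothetical (hand-rolled, not taken from any module), and it assumes the values are stored raw in the text file and only escaped when they are written into the page:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the characters that are special in HTML
# text and attribute values with their entities.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    # A single pass, so an '&' produced by one replacement is never re-escaped.
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Example: a raw field value wrapped in a table cell.
my $value = q{Tom & Jerry <"quoted">};
print '<td>', escape_html($value), "</td>\n";
```

With something like this the data file itself could stay free of entities, since escaping would happen in exactly one place.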

I had initially thought that XML was the solution, but I have run into problems with both of the above points (I'm not posting code because I want to ascertain whether I should stick with XML or try something else instead).

Basically, my problems thus far have been: XML is slow, XML introduces bloat, as best I can see I need to escape everything myself before inserting it into the XML, and I have *no* desire to start entering records using &xx; format for all my /\"'& and whatever other characters.

So, I am looking for suggestions on how to proceed. Am I struggling with the XML approach because I'm a novice with it, or because I chose poor tools (XML::Parser and XML::Writer)? I have a lot of DBI experience, so I was surprised at having to handle my own escaping. Are there other ways around this?

One final thing: my current alternative is just a text-file with each line containing the field-name followed by value, and a record delimiter like '-' x 50, e.g.:

    Field1: DATA
    Field2: MOREDATA
    #other fields#
    -------------------------------
    Field1: DATA
    Field2: MOREDATA
    #other fields#

This wouldn't allow easy stream-based parsing (e.g. no while (my @row = $dbh->fetchrow_array()) nice-ness), but otherwise seems refreshingly simple compared to XML. Any comments on that?
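That said, here is a rough sketch of how the format above might be read one record at a time, by setting `$/` to the delimiter line. This is untested against my real data. The `next_record` helper, the field names, and the sample values are all made up for illustration, and it assumes each field fits on one line and the delimiter is exactly '-' x 50 on a line of its own:

```perl
use strict;
use warnings;

# Record delimiter: a line of 50 dashes (assumption, per the description above).
my $DELIM = ('-' x 50) . "\n";

# Hypothetical helper: read one record from the filehandle and return it
# as a hash reference, or undef at end of file.
sub next_record {
    my ($fh) = @_;
    local $/ = $DELIM;                   # read up to the next delimiter line
    while (defined(my $chunk = <$fh>)) {
        chomp $chunk;                    # strips the trailing delimiter, if present
        my %record;
        for my $line (split /\n/, $chunk) {
            # Split on the first colon only, so values may contain colons.
            # Lines that do not look like "Name: value" are skipped.
            my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
            $record{$name} = $value;
        }
        return \%record if %record;      # ignore empty chunks
    }
    return;
}

# Made-up sample data, read through an in-memory filehandle.
my $data = join $DELIM,
    "Name: Widget\nNote: 5 < 6 & more\n",
    "Name: Gadget\nNote: ratio 1:2\n";

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
while (my $rec = next_record($fh)) {
    print "$rec->{Name} => $rec->{Note}\n";
}
close $fh;
```

The loop at the bottom reads much like the `fetchrow_array` idiom, except that each row comes back as a hash reference keyed by field name.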


In reply to XML "Database" --> HTML by Anonymous Monk
