Oh wise monks, I beg your favour! I have been charged with converting a flat-file version of Jane Austen's Emma (a typical online text), which has a rather ad hoc markup scheme, into HTML. Paragraphs are separated by blank lines, headings have no special markup, and words that should be in italics are surrounded by underscore characters instead. I have to write a script that converts the file to HTML by putting each paragraph of text into a paragraph element, putting the headings at the start into HTML heading elements, converting the words enclosed by underscores to HTML italic elements, and adding tags such as html at the start and end. I realise that without seeing the exact file you cannot give me specific code, but some guidance as to the general process would be most helpful. Your help would be greatly appreciated and no doubt rewarded in future lives.
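
The steps described above can be sketched roughly as follows. This is a minimal illustration in Python rather than a definitive solution, and the same regular expressions carry over to other scripting languages. The function name `convert` and the `heading_count` parameter are hypothetical, and the sketch assumes that the first few blank-line-separated blocks of the file are the headings, since the file itself has not been shown.

```python
import html
import re


def convert(text, heading_count=2):
    """Turn blank-line-separated plain text into a minimal HTML page.

    heading_count is an assumption: how many leading blocks
    (title, author, ...) should be treated as headings.
    """
    # Normalise Windows line endings so blank lines are detected reliably.
    text = text.replace("\r\n", "\n")
    # Split into blocks wherever one or more blank lines occur.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]

    body = []
    for i, block in enumerate(blocks):
        # Escape &, < and > first so the source text cannot break the markup.
        escaped = html.escape(block, quote=False)
        # _word_ becomes <i>word</i>; non-greedy, and may span a line break.
        escaped = re.sub(r"_(.+?)_", r"<i>\1</i>", escaped, flags=re.S)
        if i < heading_count:
            tag = "h1" if i == 0 else "h2"
        else:
            tag = "p"
        body.append(f"<{tag}>{escaped}</{tag}>")

    return "<html>\n<body>\n" + "\n".join(body) + "\n</body>\n</html>\n"
```

To use it, read the whole file into a string, pass it to `convert`, and write the result out. If the real file marks its headings differently (for example, chapter lines scattered through the text), the `heading_count` test would need to be replaced with a pattern match on those lines.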

In reply to Replacement and conversion of flat file text documents by imhotep
