Oh wise monks, I beg your favour! I have been charged with converting a flat-file version of Jane Austen's Emma (a typical online text), which has a rather ad hoc markup scheme, into HTML. Paragraphs are separated by blank lines, headings have no special markup, and words that should be in italics are surrounded by underscore characters instead.
I have to write a script that will convert the file to HTML by putting each paragraph of text into a paragraph element, putting the headings at the start into HTML heading elements, converting the words enclosed by underscores to HTML italic elements, and adding tags such as html at the start and end.
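To make the question concrete, here is a rough Python sketch of the process as I currently picture it. It is only a sketch under assumptions: the function name convert and the parameter heading_count are my own inventions, and the guess that the first two blocks of the file are headings may well be wrong for the real text. It also assumes that an italic run never crosses a paragraph boundary.

```python
import html
import re

# A word or phrase wrapped in underscores, e.g. _very_
ITALIC = re.compile(r"_([^_]+)_")


def convert(text, heading_count=2):
    """Convert blank-line-separated plain text to a minimal HTML page.

    heading_count is an assumption: how many leading blocks are headings.
    """
    # Split on blank lines (lines that are empty or whitespace only).
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    parts = ["<html>", "<body>"]
    for i, block in enumerate(blocks):
        # Join wrapped lines, then escape &, < and > so that the tags
        # added below are the only markup in the output.
        body = html.escape(" ".join(block.split()), quote=False)
        # Turn _word_ into <i>word</i>.
        body = ITALIC.sub(r"<i>\1</i>", body)
        tag = "h1" if i < heading_count else "p"
        parts.append(f"<{tag}>{body}</{tag}>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    sample = "Emma\n\nby Jane Austen\n\nShe was _very_ happy\n& content.\n"
    print(convert(sample))
```

Escaping happens before the italic substitution so that any ampersands or angle brackets in the novel's text do not end up as broken markup. Whether the headings can really be picked out by position alone is the part I am least sure of.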
I realise that without seeing the exact file you cannot give me specific code, but some guidance as to the process in general would be most helpful.
Your help would be most gratefully appreciated and no doubt rewarded in future lives.