To be honest, I would suggest starting with something tried and tested like txt2html, which has seemingly been around forever. You may need to make some adjustments to the output, but it generally seems to do the right thing.
/J\
| [reply] |
This has been done a lot, so there are tools out there. I have done something similar to practice regexes (you can generally mark up consistent text in about six steps if you are happy to do only one thing at a time). But if you don't need to work hard, run the text through a tool, then run the HTML through something like tidy, and you should be good to go.
| [reply] |
Generally we (the community, not the Royal 'we') like to see some effort put into a post. You've outlined your problem, but you haven't mentioned any basic research or written about a programmatic approach to the problem.
If you'd put up some basic pseudo-code, that would have helped your node be more useful to readers. Please keep this in mind for your future posts.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
| [reply] |
Thanks for your comments! I was in a bit of a hurry when I made the post (at work!), but the bigger factor in my admittedly poor post is my lack of programming knowledge. I have since had more time to look at it, and this is what I make of it.
So firstly, I must place the headings in HTML heading elements ("h2" etc.). Could I use a pattern match? For example:
if $line is one or more blank lines, followed by a single line with letter characters in it (the heading usually comes in this form), followed by one or more blank lines, then save that line in a variable (say $saved) and print "<h2>$saved</h2>".
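One way to sketch that heading test in Perl is below. It is only a rough illustration, not a finished converter: the name mark_headings is made up for this example, and it assumes the start and end of the text count as "blank". Note that it would also turn any one-line paragraph into a heading, so the rule may need tightening for real input.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap lone lines in <h2> tags.
# A line counts as a heading when it contains at least one letter and
# the lines directly before and after it are blank (assumption: the
# start and end of the text also count as blank).
sub mark_headings {
    my @lines = @_;
    my @out;
    for my $i (0 .. $#lines) {
        my $line       = $lines[$i];
        my $prev_blank = $i == 0       || $lines[$i - 1] !~ /\S/;
        my $next_blank = $i == $#lines || $lines[$i + 1] !~ /\S/;
        if ($line =~ /[A-Za-z]/ && $prev_blank && $next_blank) {
            (my $saved = $line) =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @out, "<h2>$saved</h2>";
        }
        else {
            push @out, $line;                         # leave everything else alone
        }
    }
    return @out;
}

my @text = ('', 'Chapter One', '', 'Some body text', 'that runs on.', '');
print "$_\n" for mark_headings(@text);
```

Working on a list of lines rather than one big string keeps the "look at the previous and next line" check simple.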
The same goes for words that are supposed to be in italics (marked in this text with underscore characters). I could again use a pattern match, but with
s/_/(html italics element)/g; the problem is that the match would convert every underscore into the opening italic element and never produce the closing element. Does that make sense? Basically, after the first underscore was changed, all the text after it would be in italics.
Also, how would I place the whole text inside html, head, body elements etc.? I just can't figure out how to get both the opening and the closing elements!
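Both questions come down to emitting an opening and a closing element together. A minimal sketch, with made-up names (mark_italics, wrap_page): for italics, match the pair of underscores in one go and capture the text between them; for the page, print the opening tags, then the converted text, then the closing tags. It assumes underscores only ever appear as italic markers (a name like foo_bar_baz would be mangled) and it does no HTML escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: turn _text_ into <i>text</i>.
# The pattern matches an underscore, captures everything up to the next
# underscore, and replaces the whole pair at once, so every opening tag
# gets its closing tag.
sub mark_italics {
    my ($text) = @_;
    $text =~ s{_([^_]+)_}{<i>$1</i>}g;
    return $text;
}

# Hypothetical helper: put already-converted text inside a bare page
# skeleton. The here-document holds the opening and closing elements,
# with the title and body interpolated in between.
sub wrap_page {
    my ($title, $body) = @_;
    return <<"END_HTML";
<html>
<head>
<title>$title</title>
</head>
<body>
$body
</body>
</html>
END_HTML
}

print wrap_page('My Document', mark_italics('Some _emphasised_ text.'));
```

The key point for the italics is that one substitution handles one pair, rather than one underscore at a time.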
I know I have not put the problem very clearly, but I hope you can get an idea of what I have to do.
I am going to try to work on some pseudo-code to make it a bit clearer!
Thanks!
| [reply] |
If you're new to programming, then you have a slightly longer journey .. but if I were going to distill programming into a little recipe, it would be something like this:
- Figure out what you're trying to do. When in doubt, leave some stuff out .. you can add it in later.
- Make a list of the various steps you're going to need to do. Just write it out on a piece of paper. Review that. Think about it. Drink a cup of coffee. Review it again.
- Get the framework going .. which means, translate some of your 'list' into code, and get something really simple running.
- Gradually add the elements on your list, testing as you go.
- Resist the temptation to throw something together that 'mostly' works. Test as rigorously as possible.
- Once you have a version that works, save it somewhere. I like to use rcs because it's dead simple .. but you can just copy your source code file to a file with the name 'foo.1' or 'foo.2005-04-18'. Once you've done that, you have the freedom to get hacking again, without any worry that you'll 'improve' the original and break it in some mysterious way .. without any way to go back to the version that worked.
The three features you've listed are
- Put heading elements like '3 -- Skippy has a Picnic' in 'h2' tags;
- Wrap text like _this is italicized_ with italic tags; and
- Wrap big blocks of text in paragraph tags, making one big HTML page from your input.
Is this right?
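If it is, the paragraph step might look something like the sketch below. The name mark_paragraphs is invented for this example, and it assumes that blocks are separated by one or more blank lines and that any headings have already been wrapped in 'h2' tags, so they get skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each block of text in <p> tags.
# Blocks are runs of lines separated by one or more blank lines.
sub mark_paragraphs {
    my ($text) = @_;
    $text =~ s/\A\s+|\s+\z//g;              # drop leading/trailing whitespace
    my @blocks = split /\n\s*\n/, $text;    # split on blank-line gaps
    for my $block (@blocks) {
        next if $block =~ /\A<h2>/;         # assumption: headings are already marked up
        $block = "<p>$block</p>";           # $block aliases the array element
    }
    return join "\n\n", @blocks;
}

my $sample = "<h2>3 -- Skippy has a Picnic</h2>\n\n"
           . "First paragraph,\nstill first.\n\n\n"
           . "Second paragraph.\n";
print mark_paragraphs($sample), "\n";
```

Doing this as its own small step fits the 'one thing at a time' approach: get it working on a short sample, save that version, then move on to the next feature.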
I'll check back in a bit.
Alex / talexb / Toronto
| [reply] [d/l] |