Re: HTML <=> Text convertion

by dragonchild (Archbishop)
on Dec 10, 2003 at 14:04 UTC

in reply to HTML <=> Text convertion

This problem sounds like it suffers from a "Lack of Specification". You indicate that you want to convert back and forth between plaintext and HTML. However, there's a reason there are two formats: they do different things. I ran into this when attempting to design a Document::Template that would handle PDF, Excel, and other formats. PDF and Excel are sufficiently different that a single template for both makes no sense, and HTML is even worse. A better question would be "How can I convert an HTML table into fixed-width columns and back again?" This is an easily solvable problem. (I could have a solution in 20 minutes and under 250 characters ... golf, anyone?)
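
For what it's worth, here is a rough, un-golfed sketch of that narrower table problem. It is written in Python rather than Perl purely for illustration, and it assumes a single, un-nested table, cell text that never contains "|", and a " | " column separator. The names TableParser, html_to_fixed and fixed_to_html are made up for this example. Note that the round trip is lossy (a th comes back as a td), which is rather the point about the two formats doing different things.

```python
from html import escape
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of a simple, un-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")      # start a new, empty cell
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data     # text outside cells is ignored


def html_to_fixed(html):
    """Render the table as padded columns joined by ' | ' (an assumed layout)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    rows = [[cell.strip() for cell in row] for row in parser.rows]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]  # pad short rows
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    lines = []
    for row in rows:
        padded = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(" | ".join(padded).rstrip())
    return "\n".join(lines)


def fixed_to_html(text):
    """Rebuild a plain table; every cell becomes a td, so th markup is lost."""
    out = ["<table>"]
    for line in text.splitlines():
        cells = [escape(cell.strip()) for cell in line.split("|")]
        out.append("<tr>" + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Feeding the output of fixed_to_html back into html_to_fixed should reproduce the same text layout, as long as the assumptions above hold.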

Now, mentioning PDF brings up another idea: there are HTML => PDF converters and PDF => plaintext converters. There are also plaintext => PDF converters, but no PDF => HTML converters (that I'm aware of). A big problem with converting from XXX => HTML is that HTML is a non-deterministic format. I find it easier to consider HTML a "hinting format" instead of a "defining format" (like PDF). Browsers, to be compliant, are free to implement the markup however they want, so long as they implement something. (This is how you can have HTML-x compliant browsers for the blind.)

Re: Re: HTML <=> Text convertion
on Dec 11, 2003 at 21:49 UTC
    This problem sounds like it suffers from a "Lack of Specification".

    Thank you! After thinking a bit more about my problem specification, I realized that I do not need an HTML to text conversion. All I need is a text version and a flag that indicates whether any HTML formatting should be applied (with the HTML::FromText module). Excellent!

    I have also updated the original post, just in case I am ever tempted to make the same mistake again. :)

