Re: HTML <=> Text convertion

by dragonchild (Archbishop)
on Dec 10, 2003 at 14:04 UTC


in reply to HTML <=> Text convertion

This problem sounds like it suffers from a "Lack of Specification". You indicate that you want to convert back and forth between plaintext and HTML. However, there's a reason why there are two formats: they do different things. I ran into this when attempting to design a Document::Template that would handle PDF, Excel, and other formats. PDF and Excel are sufficiently different that a single template for both makes no sense, and HTML is even worse. A better question would be "How can I convert an HTML table into fixed-width columns and back again?" This is an easily solvable problem. (I could have a solution in 20 minutes and under 250 characters ... golf, anyone?)
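To make the narrower question concrete, here is a minimal sketch of the table <=> fixed-width round trip. It is written in Python with only the standard library, not golfed, and not the solution hinted at above. The names TableRows, html_to_fixed, fixed_to_html, and the gap parameter are made up for illustration. It assumes a single, non-nested table with no colspan/rowspan, and it assumes the column widths are kept alongside the text so the columns can be split again.

```python
from html import escape
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of a simple, non-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell strings per <tr>
        self.in_cell = False  # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self.in_cell:
            self.rows[-1][-1] += data


def html_to_fixed(html, gap=2):
    """Return (text, widths): fixed-width text plus the column widths used."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = [[cell.strip() for cell in row] for row in parser.rows]
    if not rows:
        return "", []
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]  # pad short rows
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    lines = [
        "".join(cell.ljust(w + gap) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines), widths


def fixed_to_html(text, widths, gap=2):
    """Rebuild a plain <table> by slicing each line at the known widths."""
    out = ["<table>"]
    for line in text.splitlines():
        cells, pos = [], 0
        for w in widths:
            cells.append(line[pos:pos + w].strip())
            pos += w + gap
        out.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Even this small case loses information: th cells come back as td, and attributes and inline markup are dropped. That is the "different formats do different things" point in miniature.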

Now, mentioning PDF brings up another idea - there are HTML => PDF converters and PDF => plaintext converters. There are also plaintext => PDF converters, but no PDF => HTML converters (that I'm aware of). A big problem with converting XXX => HTML is that HTML is a non-deterministic format. I find it easier to consider HTML a "hinting format" instead of a "defining format" (like PDF). Browsers, to be compliant, are free to implement whatever they want, so long as they implement something. (This is how you have HTML-x compliant browsers for the blind.)

------
We are the carpenters and bricklayers of the Information Age.

Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Replies are listed 'Best First'.
Re: Re: HTML <=> Text convertion
by TVSET (Chaplain) on Dec 11, 2003 at 21:49 UTC
    This problem sounds like it suffers from a "Lack of Specification".

    Thank you! After thinking a bit more about my problem specification, I realized that I do not need an HTML to text conversion. All I need is a text version and a flag that indicates whether any HTML formatting should be done (with the HTML::FromText module). Excellent!

    I have also updated the original post, just in case I am ever tempted to make the same mistake again. :)
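
The "text version plus a flag" design described in this reply can be sketched roughly as follows. This is a Python stand-in using only the standard library; it is not the HTML::FromText API, and the render function and its as_html flag are hypothetical names. It assumes blank lines separate paragraphs and single newlines become line breaks.

```python
from html import escape


def render(text, as_html=False):
    """Return the stored text as-is, or a simple HTML rendering of it."""
    if not as_html:
        return text
    # Blank lines separate paragraphs; single newlines become <br>.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(
        "<p>" + escape(p).replace("\n", "<br>\n") + "</p>" for p in paragraphs
    )
```

Because the plaintext stays the single source of truth, nothing ever has to be converted back from HTML.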
